Information Management (IM)
IM1. Information models and systems [core]
IM2. Database systems [core]
IM3. Data modeling [core]
IM4. Relational databases [elective]
IM5. Database query languages [elective]
IM6. Relational database design [elective]
IM7. Transaction processing [elective]
IM8. Distributed databases [elective]
IM9. Physical database design [elective]
IM10. Data mining [elective]
IM11. Information storage and retrieval [elective]
IM12. Hypertext and hypermedia [elective]
IM13. Multimedia information and systems [elective]
IM14. Digital libraries [elective]
Information Management (IM) plays a critical role in almost all areas where computers are used. This area includes the capture, digitization, representation, organization, transformation, and presentation of information; algorithms for efficient and effective access and updating of stored information; data modeling and abstraction; and physical file storage techniques. It also encompasses information security, privacy, integrity, and protection in a shared environment. Students need to be able to develop conceptual and physical data models, determine which IM methods and techniques are appropriate for a given problem, and select and implement an appropriate IM solution that reflects all suitable constraints, including scalability and usability.
IM1. Information models and systems [core]
Minimum core coverage time: 3 hours
Topics:
- History and motivation for information systems
- Information storage and retrieval (IS&R)
- Information management applications
- Information capture and representation
- Analysis and indexing
- Search, retrieval, linking, navigation
- Information privacy, integrity, security, and preservation
- Scalability, efficiency, and effectiveness
Learning objectives:
- Compare and contrast information with data and knowledge.
- Summarize the evolution of information systems from early visions up through modern offerings, distinguishing their respective capabilities and future potential.
- Critique/defend a small- to medium-size information application with regard to how well it satisfies real user information needs.
- Describe several technical solutions to the problems related to information privacy, integrity, security, and preservation.
- Explain measures of efficiency (throughput, response time) and effectiveness (recall, precision).
- Describe approaches to ensure that information systems can scale from the individual to the global.
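As an illustration of the effectiveness measures named in the objectives above, the following is a minimal Python sketch that computes precision and recall for a single query. The function name and the document identifiers are hypothetical and chosen only for this example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document ids: the system returned 1-4, but only 2, 4, 5 are relevant.
p, r = precision_recall(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5])
print(p, r)
```

Two of the four retrieved documents are relevant, and two of the three relevant documents were found, so the measures differ even though the number of hits is the same.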
IM2. Database systems [core]
Minimum core coverage time: 3 hours
Topics:
- History and motivation for database systems
- Components of database systems
- DBMS functions
- Database architecture and data independence
- Use of a database query language
Learning objectives:
- Explain the characteristics that distinguish the database approach from the traditional approach of programming with data files.
- Cite the basic goals, functions, models, components, applications, and social impact of database systems.
- Describe the components of a database system and give examples of their use.
- Identify major DBMS functions and describe their role in a database system.
- Explain the concept of data independence and its importance in a database system.
- Use a query language to elicit information from a database.
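The last objective above can be illustrated with a small sketch that uses SQL through Python's standard sqlite3 module and an in-memory database. The table, columns, and rows are assumptions made up for this example.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO student (id, name, major) VALUES (?, ?, ?)",
    [(1, "Ada", "CS"), (2, "Ben", "Math"), (3, "Cy", "CS")],
)

# The query states WHAT is wanted (names of CS majors), not HOW to find it.
cs_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student WHERE major = ? ORDER BY name", ("CS",)
    )
]
conn.close()
print(cs_names)
```

The program never names a file layout or access path, which is one way to see data independence in practice.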
IM3. Data modeling [core]
Minimum core coverage time: 4 hours
Topics:
- Data modeling
- Conceptual models (including entity-relationship and UML)
- Object-oriented model
- Relational data model
Learning objectives:
- Categorize data models based on the types of concepts that they provide to describe the database structure -- that is, conceptual data model, physical data model, and representational data model.
- Describe the modeling concepts and notation of the entity-relationship model and UML, including their use in data modeling.
- Describe the main concepts of the OO model such as object identity, type constructors, encapsulation, inheritance, polymorphism, and versioning.
- Define the fundamental terminology used in the relational data model.
- Describe the basic principles of the relational data model.
- Illustrate the modeling concepts and notation of the relational data model.
IM4. Relational databases [elective]
Topics:
- Mapping conceptual schema to a relational schema
- Entity and referential integrity
- Relational algebra and relational calculus
Learning objectives:
- Prepare a relational schema from a conceptual model developed using the entity-relationship model.
- Explain and demonstrate the concepts of entity integrity constraint and referential integrity constraint (including definition of the concept of a foreign key).
- Demonstrate use of the relational algebra operations from mathematical set theory (union, intersection, difference, and Cartesian product) and the relational algebra operations developed specifically for relational databases (select, project, join, and division).
- Demonstrate queries in the relational algebra.
- Demonstrate queries in the tuple relational calculus.
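A few of the relational algebra operations listed above can be sketched in Python, assuming a relation is represented as a list of dictionaries that map attribute names to values. The helper names and the sample relations are hypothetical; this is an illustration, not a full algebra implementation.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicate tuples."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def natural_join(left, right):
    """Natural join: combine tuples that agree on all shared attributes."""
    result = []
    for l in left:
        for r in right:
            shared = set(l) & set(r)
            if all(l[a] == r[a] for a in shared):
                result.append({**l, **r})
    return result


employee = [
    {"emp": "Ada", "dept": 10},
    {"emp": "Ben", "dept": 20},
    {"emp": "Cy", "dept": 10},
]
department = [
    {"dept": 10, "dname": "Research"},
    {"dept": 20, "dname": "Sales"},
]

# Names of employees in the Research department:
# project_emp( select_dname='Research'( employee JOIN department ) )
research_staff = project(
    select(natural_join(employee, department), lambda r: r["dname"] == "Research"),
    ["emp"],
)
print(research_staff)
```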
IM5. Database query languages [elective]
Topics:
- Overview of database languages
- SQL (data definition, query formulation, update sublanguage, constraints, integrity)
- Query optimization
- QBE and 4th-generation environments
- Embedding non-procedural queries in a procedural language
- Introduction to Object Query Language
Learning objectives:
- Create a relational database schema in SQL that incorporates key, entity integrity, and referential integrity constraints.
- Demonstrate data definition in SQL and retrieving information from a database using the SQL SELECT statement.
- Evaluate a set of query processing strategies and select the optimal strategy.
- Create a non-procedural query by filling in templates of relations to construct an example of the desired query result.
- Embed object-oriented queries into a stand-alone language such as C++ or Java (e.g., SELECT Col.Method() FROM Object).
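The first objective above, a schema with key and referential integrity constraints, might look like the following sketch. It assumes SQLite accessed through Python's sqlite3 module; the table and column names are invented for the example. SQLite enforces foreign keys only when the foreign_keys pragma is enabled on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript(
    """
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,          -- key / entity integrity
        name    TEXT NOT NULL UNIQUE
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL
                REFERENCES department(dept_id) -- referential integrity
    );
    """
)

conn.execute("INSERT INTO department VALUES (1, 'Research')")
conn.execute("INSERT INTO employee VALUES (1, 'Ada', 1)")

# An employee that points at a nonexistent department should be rejected.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Ben', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(rejected, employee_count)
```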
IM6. Relational database design [elective]
Topics:
- Database design
- Functional dependency
- Normal forms (1NF, 2NF, 3NF, BCNF)
- Multivalued dependency (4NF)
- Join dependency (PJNF, 5NF)
- Representation theory
Learning objectives:
- Determine the functional dependencies among two or more attributes that form a subset of a relation's attributes.
- Describe what is meant by 1NF, 2NF, 3NF, and BCNF.
- Identify whether a relation is in 1NF, 2NF, 3NF, or BCNF.
- Normalize a 1NF relation into a set of 3NF (or BCNF) relations and denormalize a relational schema.
- Explain the impact of normalization on the efficiency of database operations, especially query optimization.
- Describe what a multivalued dependency is and what type of constraint it specifies.
- Explain why 4NF is useful in schema design.
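Reasoning about functional dependencies and normal forms often relies on the closure of an attribute set. The sketch below assumes each functional dependency is given as a pair of Python sets (left side, right side); the relation R(A, B, C, D) and its dependencies are hypothetical.

```python
def closure(attributes, fds):
    """Return every attribute functionally determined by `attributes`.

    `fds` is a list of (lhs, rhs) pairs of sets, read as lhs -> rhs.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Hypothetical relation R(A, B, C, D) with A -> B and B -> C.
all_attributes = {"A", "B", "C", "D"}
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]

closure_a = closure({"A"}, fds)          # A alone does not determine D
closure_ad = closure({"A", "D"}, fds)    # A and D together determine everything
is_superkey = closure_ad == all_attributes
print(sorted(closure_a), is_superkey)
```

Here B -> C holds while B is not a superkey, which is the kind of dependency that a BCNF decomposition would remove.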
IM7. Transaction processing [elective]
Topics:
- Transactions
- Failure and recovery
- Concurrency control
Learning objectives:
- Create a transaction by embedding SQL into an application program.
- Explain the concept of implicit commits.
- Describe the issues specific to efficient transaction execution.
- Explain when and why rollback is needed and how logging assures proper rollback.
- Explain the effect of different isolation levels on the concurrency control mechanisms.
- Choose the proper isolation level for implementing a specified transaction protocol.
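The role of rollback can be illustrated with a short sketch of SQL embedded in an application program, again assuming SQLite through Python's sqlite3 module. The account table, the CHECK constraint, and the transfer function are assumptions made for this example, and isolation levels are not demonstrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic unit of work."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)        # succeeds and is committed
failed = transfer(conn, 1, 2, 500)   # debit violates the CHECK; credit is undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit has already been applied when the debit is rejected, so without rollback the database would be left showing money that was never withdrawn.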
IM8. Distributed databases [elective]
Topics:
- Distributed data storage
- Distributed query processing
- Distributed transaction model
- Concurrency control
- Homogeneous and heterogeneous solutions
- Client-server
Learning objectives:
- Explain the techniques used for data fragmentation, replication, and allocation during the distributed database design process.
- Evaluate simple strategies for executing a distributed query to select the strategy that minimizes the amount of data transfer.
- Explain how the two-phase commit protocol is used to deal with committing a transaction that accesses databases stored on multiple nodes.
- Describe distributed concurrency control based on the distinguished copy techniques and the voting method.
- Describe the three levels of software in the client-server model.
IM9. Physical database design [elective]
Topics:
- Storage and file structure
- Indexed files
- Hashed files
- Signature files
- B-trees
- Files with dense index
- Files with variable length records
- Database efficiency and tuning
Learning objectives:
- Explain the concepts of records, record types, and files, as well as the different techniques for placing file records on disk.
- Give examples of the application of primary, secondary, and clustering indexes.
- Distinguish between a nondense index and a dense index.
- Implement dynamic multilevel indexes using B-trees.
- Explain the theory and application of internal and external hashing techniques.
- Use hashing to facilitate dynamic file expansion.
- Describe the relationships among hashing, compression, and efficient database searches.
- Evaluate costs and benefits of various hashing schemes.
- Explain how physical database design affects database transaction efficiency.
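The distinction between a dense and a nondense (sparse) index can be sketched as follows, assuming a file of records sorted by key and grouped into fixed-size blocks. The data and function names are hypothetical, and in-memory lists stand in for disk blocks.

```python
from bisect import bisect_right

# A sorted file stored as three blocks of (key, value) records.
blocks = [
    [(5, "e"), (8, "h")],
    [(12, "l"), (15, "o")],
    [(21, "u"), (30, "z")],
]

# Dense index: one entry per record, pointing at (block number, slot).
dense = {
    key: (b, s)
    for b, block in enumerate(blocks)
    for s, (key, _) in enumerate(block)
}

# Nondense index: one entry per block, holding that block's first key.
sparse_keys = [block[0][0] for block in blocks]


def dense_lookup(key):
    pos = dense.get(key)
    if pos is None:
        return None
    b, s = pos
    return blocks[b][s][1]


def sparse_lookup(key):
    # Find the last block whose first key is <= key, then scan that block.
    b = bisect_right(sparse_keys, key) - 1
    if b < 0:
        return None
    for k, value in blocks[b]:
        if k == key:
            return value
    return None


print(len(dense), len(sparse_keys), dense_lookup(15), sparse_lookup(15))
```

The nondense index is smaller but depends on the file being ordered by the indexed key, and each lookup ends with a scan inside one block.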
IM10. Data mining [elective]
Topics:
- The usefulness of data mining
- Associative and sequential patterns
- Data clustering
- Market basket analysis
- Data cleaning
- Data visualization
Learning objectives:
- Compare and contrast different conceptions of data mining as evidenced in both research and application.
- Explain the role of finding associations in commercial market basket data.
- Characterize the kinds of patterns that can be discovered by association rule mining.
- Describe how to extend a relational system to find patterns using association rules.
- Evaluate methodological issues underlying the effective application of data mining.
- Identify and characterize sources of noise, redundancy, and outliers in presented data.
- Identify mechanisms (on-line aggregation, anytime behavior, interactive visualization) to close the loop in the data mining process.
- Describe why the various close-the-loop processes improve the effectiveness of data mining.
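Association rules over market basket data are commonly characterized by support and confidence. The sketch below computes both for a handful of made-up baskets; the function names and data are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

s = support(baskets, {"bread", "milk"})        # 2 of 4 baskets
c = confidence(baskets, {"bread"}, {"milk"})   # 2 of the 3 bread baskets
print(s, c)
```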
IM11. Information storage and retrieval [elective]
Topics:
- Characters, strings, coding, text
- Documents, electronic publishing, markup, and markup languages
- Tries, inverted files, PAT trees, signature files, indexing
- Morphological analysis, stemming, phrases, stop lists
- Term frequency distributions, uncertainty, fuzziness, weighting
- Vector space, probabilistic, logical, and advanced models
- Information needs, relevance, evaluation, effectiveness
- Thesauri, ontologies, classification and categorization, metadata
- Bibliographic information, bibliometrics, citations
- Routing and (community) filtering
- Search and search strategy, information seeking behavior, user modeling, feedback
- Information summarization and visualization
- Integration of citation, keyword, classification scheme, and other terms
- Protocols and systems (including Z39.50, OPACs, WWW engines, research systems)
Learning objectives:
- Explain basic information storage and retrieval concepts.
- Describe what issues are specific to efficient information retrieval.
- Give applications of alternative search strategies and explain why the particular search strategy is appropriate for the application.
- Perform Internet-based research.
- Design and implement a small- to medium-size information storage and retrieval system.
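A very small retrieval system of the kind named in the last objective can be built around an inverted file. The sketch below assumes whitespace tokenization, lowercasing, and Boolean AND queries, with no stemming or stop list; the documents and function names are hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (Boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical document collection.
docs = {
    1: "relational database design",
    2: "database query languages",
    3: "digital library design",
}
index = build_index(docs)

both_terms = search(index, "database design")
one_term = search(index, "Design")
no_match = search(index, "multimedia")
print(both_terms, one_term, no_match)
```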
IM12. Hypertext and hypermedia [elective]
Topics:
- Hypertext models (early history, web, Dexter, Amsterdam, HyTime)
- Link services, engines, and (distributed) hypertext architectures
- Nodes, composites, and anchors
- Dimensions, units, locations, spans
- Browsing, navigation, views, zooming
- Automatic link generation
- Presentation, transformations, synchronization
- Authoring, reading, and annotation
- Protocols and systems (including web, HTTP)
Learning objectives:
- Summarize the evolution of hypertext and hypermedia models from early versions up through current offerings, distinguishing their respective capabilities and limitations.
- Explain basic hypertext and hypermedia concepts.
- Demonstrate a fundamental understanding of information presentation, transformation, and synchronization.
- Compare and contrast hypermedia delivery based on protocols and systems used.
- Design and implement web-enabled information retrieval applications using appropriate authoring tools.
IM13. Multimedia information and systems [elective]
Topics:
- Devices, device drivers, control signals and protocols, DSPs
- Applications, media editors, authoring systems, and authoring
- Streams/structures, capture/represent/transform, spaces/domains, compression/coding
- Content-based analysis, indexing, and retrieval of audio, images, and video
- Presentation, rendering, synchronization, multi-modal integration/interfaces
- Real-time delivery, quality of service, audio/video conferencing, video-on-demand
Learning objectives:
- Describe the media and supporting devices commonly associated with multimedia information and systems.
- Explain basic multimedia presentation concepts.
- Demonstrate the use of content-based information analysis in a multimedia information system.
- Critique multimedia presentations in terms of their appropriate use of audio, video, graphics, color, and other information presentation concepts.
- Implement a multimedia application using a commercial authoring system.
IM14. Digital libraries [elective]
Topics:
- Digitization, storage, and interchange
- Digital objects, composites, and packages
- Metadata, cataloging, author submission
- Naming, repositories, archives
- Spaces (conceptual, geographical, 2/3D, VR)
- Architectures (agents, buses, wrappers/mediators), interoperability
- Services (searching, linking, browsing, and so forth)
- Intellectual property rights management, privacy, protection (watermarking)
- Archiving and preservation, integrity
Learning objectives:
- Explain the underlying technical concepts in building a digital library.
- Describe the basic service requirements for searching, linking, and browsing.
- Critique scenarios involving appropriate and inappropriate use of a digital library, and determine the social, legal, and economic consequences for each scenario.
- Describe some of the technical solutions to the problems related to archiving and preserving information in a digital library.
- Design and implement a small digital library.