Thursday, March 8, 2007

Characteristics of Datawarehousing - Time Variant

In order to discover trends in business, analysts need large amounts of data. This is very much in contrast to online transaction processing (OLTP) systems, where performance requirements demand that historical data be moved to an archive. Adata warehouse’s focus on change over time is what is meant by the term time variant.

Characteristics of Datawarehousing - Nonvolatile

This characteristic of datawarehousing is very interesting. Normally when there is any change in data , those changes are made in the Operational data base. However in case of datawarehouse the snapshot of the table is made on an interval basis ( daily ) and gets inserted into the datawarehouse. There is no need to change the data present in DW.

Characteristics of Datawarehousing - Integrated

Integration is closely related to subject orientation. Data warehouses must put data from heterogeneous sources having different formats into a single or multiple warehouses having a consistent format. When all issues relating to the conflicts like naming conventions and datatypes are resolved, they are said to be integrated.

Characteristics of Datawarehousing - Subject Oriented

Data warehouses are designed to help you analyze data - so datawarehouses can be modeled according to the way the data needs to be analysed. Eg : When a bank wants to analyse the customers and their accounts inorder to provide discounts or promotions , they will have to build a data warehouse around the accounts . This ability to define a data warehouse by subjectarea makes the data warehouse subject oriented.

Characteristics of Datawarehousing :

A common way of introducing data warehousing is to refer to the characteristics ofa data warehouse as set forth by William Inmon:
1) Subject Oriented
2) Integrated
3) Nonvolatile
4) Time Variant

What is slowly changing dimension. What kind of scd used in your project?

Dimension attribute values may change constantly over the time. (Say for example customer dimension has customer_id,name, and address) customer address may change over time.
How will you handle this situation?
There are 3 types,
1) To overwrite the existing record,
2) To create additional new record at the time of change with the new attribute values.
3) To create new field to keep new values in the original dimension table

What is difference between data mart and data warehouse?

A data mart designed for a particular line of business, such as sales, marketing, or finance.

Where as data warehouse is enterprise-wide/organizational

The data flow of data warehouse depending on the approach.