Information Management :: Historical Data
Continuing the topic of information management; and how we should think about it as a developer; we basically need to consider two things that impacts our implementation approach.
- Maintaining historical data
- Making available the aggregated data
In this post; I will discuss the first one; and will use an example of the telephone company’s billing system to introduce few things that I have learned. The TELCO provides the telephone services; and it offers different packages; that get changed over the time. The TELCO has customers that opt for different packages over the time. The customers use the services; and each call being made by the customer is billed as per their current package.
Things look simple; but there are few twists; if you need to maintain the historical data. You need to have records of different things; that you will decide after completing “Measurement Plan” phase with business analyst and management. In our example; few might be like what packages company offered; what were their dates; the revisions in them and their exact dates. The package history of each customer; when they changed to which package. The association of each call with the package of the user he was having at that time. In short; you need to accommodate lot of things almost every where to keep track of the historical data. You need a system with which you can tell at the given date/time what were the package and its description and how user used it. In routine; we usually don’t bother such things and tries the famous data normalization approaches with which most of the time; we are not able to provide such historical information.
This is clear that we need to de-normalize our data design to accommodate historical data. The historical data can be divided into two categories (Its my understanding; I might be wrong)
Slowly Changing Data, e-g “Package Switching” of the customer is slowly changing data; when was the last time you change your cell phone package with your company? Months ago? We need to add new tables to store such information; lets call them *History tables; e-g “UserPackageHistory”
Rapidly Changing Data, e-g the voucher type being used to recharge the account against which the given calls are being made. For such data; we might tends to add additional field in the our core tables; for this example; it might be “VoucherTypeIndex” field in the “Calls”
An interesting thing to note here is that it might have decided to have information about what exact voucher is used; may be for auditing purposes or may be to analyze which vouchers are selling more through which reseller. You might decide to have VoucherIndex and VoucherTypeIndex both in your Calls table; and its clear violation of the normalization; but it’s a norm in such scenarios!