It sounds to me like they want two databases (they probably need to be on 
two separate servers) and that your logging application may need to pull 
double duty. You are being asked to keep an OLTP database in sync with an 
OLAP database in real time. That means that you probably need to commit 
changes to both at the same time OR somehow queue up a list of updates for 
a third-party process (any process that is NOT your application) to come 
back and make the batch changes you need.
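The "queue up a list of updates" approach can be sketched like this: the
logging application writes a change record in the same transaction as
the OLTP insert, and the third-party process drains that queue later.
This is only a minimal illustration; the table and column names
(`calls`, `pending_updates`, etc.) are assumptions, not from the
original post, and it requires a transactional engine such as InnoDB:

```sql
-- Hypothetical schema; names are illustrative only.
START TRANSACTION;
  -- The normal OLTP write.
  INSERT INTO calls (customer_id, subject, taken_at)
       VALUES (1001, 'billing question', NOW());
  -- Queue the change for the batch process in the SAME transaction,
  -- so the queue can never disagree with the OLTP data.
  INSERT INTO pending_updates (call_id, queued_at)
       VALUES (LAST_INSERT_ID(), NOW());
COMMIT;
```

The batch process then reads and deletes from `pending_updates` on its
own schedule, without ever touching the OLTP tables directly.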

I had a similar situation once: thousands of ad clicks per minute had to
be both logged (OLTP) and aggregated (OLAP) for billing. The solution we
used was to build a "raw" log table (with only one index) and hit that
table once every minute or so with a separate application (not the
logging application) that first took a snapshot of the records it was
going to process, copied them into a long-term log, and aggregated them
into the OLAP tables. Then the raw log was purged of the processed
records to keep it small. We used multiple parallel processes to
aggregate the raw results. What we ended up doing was running 4
aggregating processes, with each process working only its section of our
client list. That way no two threads could collide on processing raw
records from the same client.
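The snapshot/copy/aggregate/purge cycle described above might look
something like the following. This is a hedged sketch, not our actual
code; the tables (`raw_log`, `click_log`, `click_summary`), the
`batch_id` marker column, and the batch number 42 are all hypothetical:

```sql
-- 1) Snapshot: mark the rows this run will process, so rows inserted
--    after this point are left alone for the next run.
UPDATE raw_log SET batch_id = 42 WHERE batch_id IS NULL;

-- 2) Copy the snapshot into the long-term log.
INSERT INTO click_log (client_id, ad_id, clicked_at)
  SELECT client_id, ad_id, clicked_at
    FROM raw_log WHERE batch_id = 42;

-- 3) Aggregate the snapshot into the OLAP table.
INSERT INTO click_summary (client_id, click_day, clicks)
  SELECT client_id, DATE(clicked_at), COUNT(*)
    FROM raw_log WHERE batch_id = 42
   GROUP BY client_id, DATE(clicked_at)
  ON DUPLICATE KEY UPDATE clicks = clicks + VALUES(clicks);

-- 4) Purge the processed rows to keep raw_log small.
DELETE FROM raw_log WHERE batch_id = 42;
```

To parallelize it the way we did, each worker could run the same cycle
restricted to its slice of clients, e.g. adding
`AND MOD(client_id, 4) = worker_number` to every statement, so no two
workers ever touch the same client's raw records.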

I know it sounds rather complex, but it was able to keep up with almost
250,000 click-throughs per day.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Michael Haggerty <[EMAIL PROTECTED]> wrote on 11/09/2004 07:06:18 PM:

> I am working on a data warehousing solution involving
> mysql and have a question about best practices. We are
> standardized on mysql 4.1, and this is for a rather
> picky client.
> 
> We have a relational transaction database that stores
> the results of customer calls and a dimensional
> reporting database used as a data mart by several
> applications. Each night, we run a process that
> aggregates the number of calls, the subjects of each
> call, and various other data to populate the reporting
> database. We would like to move to a real time
> solution, and are struggling with the best way to
> implement it.
> 
> What we are considering is a solution where we mirror
> the transactional database and repopulate key tables
> in the reporting database every minute or few minutes.
> I am loath to do this, mainly because it would add to
> our server load and could possibly lead to 'dirty
> reads' (i.e. where one table in the reporting database
> is populated with fresh data but others are not). At
> the same time, the client is demanding we implement
> something.
> 
> Does anyone have any war stories or suggestions for
> how to accomplish this?
> 
> Thank You,
> M
> 
> -- 
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
> 
