On Thu, Apr 3, 2008 at 2:28 PM, Dre <[EMAIL PROTECTED]> wrote: > 1) Several sources seem to suggest MyISAM is a good choice for data > warehousing, but due to my lack of experience in a transaction-less world, > this makes me a little nervous.
MyISAM has the advantage of very fast loading. It's much faster to load than InnoDB. However, InnoDB had better query performance on the warehouse data I used. > How do you handle data inconsistency > problems when ETL jobs fail? Usually, with a warehouse system the answer is that you rebuild it. I have built an incrementally loading ETL system, but I was under the impression that is not the norm. > I > can see doing something like manually cleaning out the necessary tables > before you re-run, but that seems a bit messy to me. Anyone figure out a > better approach? I think it's pretty common to use a Y-loading approach, with one empty schema and then do an atomic RENAME at the end to swap the new tables into place. When I used MyISAM tables, I did it that way. > 2) Isn't the lack of bitmap indexes a problem in the warehouse? Most FKs in > the fact tables will be low cardinality columns; queries that didn't use > date would be very slow on large fact tables (MS SQL had this problem). Has > anyone run into this with MySQL? You can bundle up low-cardinality columns by using a "junk dimension." It made a big difference for me. You make a table with all of the combinations of the low-cardinality fields and assign a key to each combination. There are some good presentation on data warehousing available from the website for the MySQL Users Conference. - Perrin -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]