Hello all. We periodically scan HBase tables to aggregate statistics and store the results in MySQL.
We have 3 kinds of CP (a kind of data source), and each CP has one Channel table and one Article table. (Channel : Article is a 1:N relation.) The table schemas differ slightly between CPs, so to aggregate we have to apply different logic to each CP, joining Channel and Article.

I've thought of a workflow like the one below, but I wonder whether it makes sense.

1. Run a single process that initializes MySQL: creating tables, deleting rows, etc.
2. Run 3 M/R jobs simultaneously to aggregate statistics for each CP, and insert rows per Channel into MySQL.
3. Run a single process that finalizes the whole aggregation: running an aggregation query against MySQL to insert new rows into MySQL, rolling tables, etc.

Steps 1, 2, and 3 must definitely run in sequence.

Any help is really appreciated!

Thanks.

Regards.
Jungtaek Lim (HeartSaVioR)
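
P.S. To make the ordering I have in mind concrete, here is a rough sketch in Python. It is only an illustration, not our actual code: `run_pipeline` and the placeholder callables are hypothetical names, and in practice each CP job would submit a real M/R job and the other two steps would talk to MySQL. The point is just that step 1 runs alone, the three jobs of step 2 run in parallel, and step 3 starts only once all of them have succeeded.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(initialize, cp_jobs, finalize):
    """Run initialize, then all cp_jobs in parallel, then finalize."""
    # Step 1: a single process prepares MySQL (create tables, delete rows).
    initialize()

    # Step 2: start one aggregation job per CP and wait for all of them.
    with ThreadPoolExecutor(max_workers=len(cp_jobs)) as pool:
        futures = [pool.submit(job) for job in cp_jobs]
        # result() re-raises any exception from a job, so a failed job
        # prevents step 3 from running.
        results = [future.result() for future in futures]

    # Step 3: reached only after every CP job has finished successfully.
    finalize(results)
    return results


if __name__ == "__main__":
    # Placeholder steps that only record the order in which they ran.
    log = []

    def make_job(name):
        def job():
            log.append(name)
            return name
        return job

    run_pipeline(
        initialize=lambda: log.append("init"),
        cp_jobs=[make_job(name) for name in ("cp1", "cp2", "cp3")],
        finalize=lambda results: log.append("finalize"),
    )
    print(log)
```

The three CP entries may appear in any order in the log, but "init" always comes first and "finalize" always comes last.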