Re: HIVE SparkSQL

2015-03-18 Thread
Hi:

   I need to compute statistics on player events in the game, such as:
   How many players stay in game scene 1 (Save the Princess from a Dragon)
   How much money they have paid in the last 5 minutes
   How many players pay to get through this scene more easily
   Their age distribution and gender distribution
   How many players have not logged in to the game for 5 days
after they go through this game scene
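
To make the last question concrete, here is a minimal Python sketch, not our actual implementation. It assumes "not logged in for 5 days" means no login in the 5 days after the player cleared the scene; the function name and the two input dictionaries are hypothetical.

```python
from datetime import datetime, timedelta

def lapsed_players(scene_cleared, logins, days=5):
    """Return players with no login in the `days` days after clearing the scene.

    scene_cleared: dict of player -> datetime the scene was cleared (hypothetical input)
    logins:        dict of player -> list of login datetimes (hypothetical input)
    """
    gap = timedelta(days=days)
    lapsed = set()
    for player, cleared_at in scene_cleared.items():
        # A player "came back" if any login falls inside (cleared_at, cleared_at + gap].
        came_back = any(
            cleared_at < t <= cleared_at + gap
            for t in logins.get(player, [])
        )
        if not came_back:
            lapsed.add(player)
    return lapsed
```

In a warehouse this would become a join between a scene-completion table and a login table, but the window condition stays the same.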

   The log files are pre-formatted and can be loaded into MySQL directly:

 RoleLevelUp|1426251269733|5503232ae4b00f39751f1012|2015-03-14 02:22:46|192.168.1.16|1048630|220|0|2|57|1993|
 RoleLevelUp|1426251269734|5503232ae4b00f39751f1012|2015-03-14 02:22:52|192.168.1.16|1048630|水奈坤|0|0|3|67|1999|
 RoleLevelUp|1426251269735|550329f9e4b00f39751f101d|2015-03-14 02:24:57|192.168.1.137|1048631|z12|0|0|41|0|380|
 RoleLevelUp|1426251269736|5503232ae4b00f39751f1012|2015-03-14 02:39:01|192.168.1.16|1048630|水奈坤|0|0|15|0|2968
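
For anyone who wants to read these records outside MySQL, here is a minimal Python sketch of a parser. It assumes each record sits on one line (any wrapping above comes from the mail client). The field names `record_id` and `source_id` are guesses from the sample values, not a documented schema, and everything after the IP address is kept as an unnamed list.

```python
from datetime import datetime

def parse_line(line):
    """Split one pipe-delimited log line into a dict.

    Only the first five fields get names, and those names are guesses;
    the remaining fields are returned untouched in "extra".
    """
    parts = line.strip().split("|")
    # A trailing "|" leaves an empty last element; drop it.
    if parts and parts[-1] == "":
        parts.pop()
    return {
        "event": parts[0],
        "record_id": int(parts[1]),   # guessed name
        "source_id": parts[2],        # guessed name
        "time": datetime.strptime(parts[3], "%Y-%m-%d %H:%M:%S"),
        "ip": parts[4],
        "extra": parts[5:],
    }

sample = ("RoleLevelUp|1426251269733|5503232ae4b00f39751f1012|"
          "2015-03-14 02:22:46|192.168.1.16|1048630|220|0|2|57|1993|")
record = parse_line(sample)
```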


   MySQL can no longer satisfy our analysis needs, so we want to use other
technologies to rebuild all of the statistics systems.

 Thanks

Best Regards

Yours
 Meng


HIVE SparkSQL

2015-03-17 Thread
Hi:

   I need to migrate a log analysis system from MySQL plus a C++ real-time
computing framework to the Hadoop ecosystem.

   I want to build a data warehouse, but I don't know which one is the right
choice. Cassandra? Hive? Or just Spark SQL?

There are few benchmarks for these systems.

My scenario is as follows:

Every 5 seconds, Flume transfers a log file from each IDC. The log
file is pre-formatted so that MySQL can load it directly. There are many IDCs,
and they shut down or reconnect to Flume at random.

Every online IDC must receive an analysis of its logs every 5 minutes.
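
As a rough illustration of the 5-minute requirement, here is a minimal Python sketch that groups events into fixed 5-minute windows per IDC. The `(idc, event_name, timestamp)` tuple shape and the function names are hypothetical; a real deployment would run the same grouping inside whichever engine is chosen.

```python
from collections import Counter
from datetime import datetime

WINDOW_SECONDS = 5 * 60

def window_start(ts):
    """Round a datetime down to the start of its 5-minute window."""
    seconds = ts.minute * 60 + ts.second
    floored = seconds - seconds % WINDOW_SECONDS
    return ts.replace(minute=floored // 60, second=0, microsecond=0)

def count_by_window(events):
    """Count events per (idc, window start, event name).

    events: iterable of (idc, event_name, datetime) tuples (hypothetical shape).
    """
    counts = Counter()
    for idc, name, ts in events:
        counts[(idc, window_start(ts), name)] += 1
    return counts
```

Because each key carries the IDC, an IDC that disconnects simply produces no rows for the windows it missed, and the other IDCs are unaffected.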

Any suggestions?

Thanks
Yours
Meng