Select some data from Hive (SparkSQL) directly using NodeJS

2015-08-25 Thread Phakin Cheangkrachange
Hi,

I'm wondering whether there is any way to get some sample data (10-20
rows) out of Spark's Hive tables using Node.js.

Submitting a Spark job just to show 20 rows of data on a web page is not a
good fit for me.

I've set up the Spark Thrift Server as described in the Spark documentation.
The server works, because I can use *beeline* to connect and query data. Is
there any Node.js package that can connect to this server and run queries?

Best Regards,
Phakin Cheangkrachange
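
Whichever client ends up talking to the Thrift Server, the sampling
statement itself is small. The following is a minimal sketch of building it
safely for a web page, not a recommendation of any particular package. The
helper name buildSampleQuery, the 20-row cap, and the identifier rule are
assumptions made for illustration.

```typescript
// Hypothetical helper; its name and the identifier rule are assumptions.
// Only plain unquoted identifiers are accepted, so input coming from a
// web page cannot add extra SQL to the statement.
const PLAIN_IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;
const MAX_SAMPLE_ROWS = 20;

function buildSampleQuery(
  table: string,
  rows: number = MAX_SAMPLE_ROWS,
  database?: string
): string {
  const names = database === undefined ? [table] : [database, table];
  for (const name of names) {
    if (!PLAIN_IDENTIFIER.test(name)) {
      throw new Error("unsupported identifier: " + name);
    }
  }
  if (!isFinite(rows) || Math.floor(rows) !== rows || rows < 1) {
    throw new Error("rows must be a positive integer");
  }
  // Cap the number of rows a preview request can ask for.
  const limit = Math.min(rows, MAX_SAMPLE_ROWS);
  return "SELECT * FROM " + names.join(".") + " LIMIT " + limit;
}
```

The resulting string is the same kind of statement that can be typed into
beeline, so it can be checked there before any Node.js client is wired up.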


Re: HIVE SparkSQL

2015-03-18 Thread 宫勐
Hi:

   I need to compute statistics on player events in the game, such as:
   How many players are in game scene 1 (Save the Princess from a Dragon)
   How much money they have paid in the last 5 minutes
   How many players pay to get through this scene more easily
   Their age distribution
   Their gender distribution
   How many players have not logged in to the game for 5 days after they
went through this scene

   The log file has been pre-formatted and can be loaded into MySQL directly:

 RoleLevelUp|1426251269733|5503232ae4b00f39751f1012|2015-03-14 02:22:46|192.168.1.16|1048630|220|0|2|57|1993|
 RoleLevelUp|1426251269734|5503232ae4b00f39751f1012|2015-03-14 02:22:52|192.168.1.16|1048630|水奈坤|0|0|3|67|1999|
 RoleLevelUp|1426251269735|550329f9e4b00f39751f101d|2015-03-14 02:24:57|192.168.1.137|1048631|z12|0|0|41|0|380|
 RoleLevelUp|1426251269736|5503232ae4b00f39751f1012|2015-03-14 02:39:01|192.168.1.16|1048630|水奈坤|0|0|15|0|2968
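
As a rough illustration of how pipe-delimited records like those above
could be read before being aggregated, here is a minimal sketch. The thread
does not name the fields, so the field names below (sequence, playerId,
rest) are guesses made for illustration only, and everything after the IP
address is kept as an uninterpreted list.

```typescript
// Field names are assumptions; the thread does not document the layout.
interface GameEvent {
  event: string;    // first field, e.g. "RoleLevelUp"
  sequence: number; // second field; its meaning is not stated in the thread
  playerId: string; // third field; assumed here to identify a player
  loggedAt: string; // "YYYY-MM-DD HH:mm:ss", kept exactly as written
  ip: string;
  rest: string[];   // remaining fields, left uninterpreted
}

function parseLogLine(line: string): GameEvent {
  const fields = line.trim().split("|");
  // Most sample records end with "|", which leaves one empty trailing field.
  if (fields.length > 0 && fields[fields.length - 1] === "") {
    fields.pop();
  }
  if (fields.length < 5) {
    throw new Error("expected at least 5 fields, got " + fields.length);
  }
  const [event = "", rawSequence = "", playerId = "", loggedAt = "", ip = "", ...rest] = fields;
  if (!/^\d+$/.test(rawSequence)) {
    throw new Error("second field is not numeric: " + rawSequence);
  }
  return { event, sequence: Number(rawSequence), playerId, loggedAt, ip, rest };
}

// Counts distinct values of the assumed player-id field.
function countDistinctPlayers(events: GameEvent[]): number {
  const seen: Record<string, boolean> = {};
  for (const e of events) {
    seen[e.playerId] = true;
  }
  return Object.keys(seen).length;
}
```

The same per-record structure would feed any of the candidate systems; only
the aggregation step (here a simple distinct count) would move into SQL.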


   MySQL can no longer satisfy our analysis needs, so we want to use another
technology to rebuild all of the statistics systems.

 Thanks

Best Regards

Yours
 Meng


Re: HIVE SparkSQL

2015-03-18 Thread Jörn Franke
Hello,

Depending on your needs, a search technology such as SolrCloud or
Elasticsearch may make more sense. If you go for the Cassandra solution, you
can use the Lucene text indexer...
I am not sure whether Hive or SparkSQL is well suited to text. However, if
you do not need text search, then feel free to go for them.
What kind of statistics or aggregates do you want to get out of your logs?

Best regards
On 18 March 2015 04:29, 宫勐 shadowinl...@gmail.com wrote:

 Hi:

 I need to migrate a log analysis system from MySQL plus a C++ real-time
 computing framework to the Hadoop ecosystem.

 I want to build a data warehouse, but I don't know which one is the
 right choice. Cassandra? Hive? Or just SparkSQL?

 There are few benchmarks for these systems.

 My scenario is as follows:

 Every 5 seconds, Flume transfers a log file from an IDC. The log
 file is pre-formatted for loading into MySQL. There are many IDCs, and
 they may disconnect from or reconnect to Flume at random.

 Every online IDC must receive an analysis of its logs every 5 minutes.

 Any suggestions?

 Thanks
 Yours
 Meng



HIVE SparkSQL

2015-03-17 Thread 宫勐
Hi:

   I need to migrate a log analysis system from MySQL plus a C++ real-time
computing framework to the Hadoop ecosystem.

   I want to build a data warehouse, but I don't know which one is the right
choice. Cassandra? Hive? Or just SparkSQL?

There are few benchmarks for these systems.

My scenario is as follows:

Every 5 seconds, Flume transfers a log file from an IDC. The log
file is pre-formatted for loading into MySQL. There are many IDCs, and
they may disconnect from or reconnect to Flume at random.

Every online IDC must receive an analysis of its logs every 5 minutes.

Any suggestions?

Thanks
Yours
Meng
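
Whatever storage engine is chosen, the requirement that every online IDC
gets an analysis every 5 minutes amounts to grouping records into
five-minute tumbling windows per IDC. The following is a minimal sketch of
that grouping, not a proposal for the actual pipeline. It assumes
timestamps of the form "2015-03-14 02:22:46" and treats them as UTC; the
idc field and the function names are hypothetical.

```typescript
const WINDOW_MS = 5 * 60 * 1000;

interface LogRecord {
  idc: string;      // hypothetical IDC identifier attached by the collector
  loggedAt: string; // "YYYY-MM-DD HH:mm:ss", interpreted as UTC (assumption)
}

// Returns the start of the five-minute window containing the timestamp,
// in milliseconds since the Unix epoch.
function windowStart(loggedAt: string): number {
  const m = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$/.exec(loggedAt);
  if (m === null) {
    throw new Error("unrecognised timestamp: " + loggedAt);
  }
  const [year = 0, month = 1, day = 1, hour = 0, minute = 0, second = 0] =
    m.slice(1).map(Number);
  const ms = Date.UTC(year, month - 1, day, hour, minute, second);
  return Math.floor(ms / WINDOW_MS) * WINDOW_MS;
}

// Counts records per (IDC, window) pair, keyed as "idc@windowStartISO".
function countPerWindow(records: LogRecord[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const r of records) {
    const key = r.idc + "@" + new Date(windowStart(r.loggedAt)).toISOString();
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}
```

Because each window is keyed by IDC, an IDC that disconnects simply
produces no key for that window, and one that reconnects starts filling
the current window again.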