Hello,

Depending on your needs, a search technology such as SolrCloud or ElasticSearch may make more sense. If you go for the Cassandra solution, you can use the Lucene text indexer... I am not sure whether Hive or SparkSQL are very suitable for text. However, if you do not need text search, then feel free to go for them. What kind of statistics / aggregates do you want to get out of your logs?
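
To make the question concrete: if all you need is something like per-IDC counts over 5-minute windows, the aggregation itself is simple whichever store you pick. Here is a minimal sketch in plain Python. The record layout (idc_id, epoch_seconds, level) and the names `window_start` and `aggregate` are assumptions for illustration only, not something taken from your description or from any of the systems mentioned.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 5 * 60  # the 5-minute reporting interval from the scenario


def window_start(epoch_seconds):
    """Round a timestamp down to the start of its 5-minute window."""
    return epoch_seconds - (epoch_seconds % WINDOW_SECONDS)


def aggregate(records):
    """Count log records per (IDC, window start) and per level.

    `records` is an iterable of (idc_id, epoch_seconds, level) tuples.
    This layout is hypothetical; adapt it to the real log format.
    """
    counts = defaultdict(Counter)
    for idc_id, ts, level in records:
        counts[(idc_id, window_start(ts))][level] += 1
    return counts


# Hypothetical sample data: three records from idc-1, one from idc-2.
sample = [
    ("idc-1", 10, "ERROR"),
    ("idc-1", 299, "INFO"),
    ("idc-1", 300, "INFO"),
    ("idc-2", 20, "ERROR"),
]
result = aggregate(sample)
for (idc_id, start), per_level in sorted(result.items()):
    print(idc_id, start, dict(per_level))
```

If your aggregates look like this, a SQL-style engine should be enough; if you also need free-text search over the log lines, that is where a search engine becomes interesting.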
Best regards

On 18 March 2015 04:29, "宫勐" <shadowinl...@gmail.com> wrote:

> Hi:
>
> I need to migrate a log analysis system from MySQL + some C++ real-time
> computing framework to the Hadoop ecosystem.
>
> I want to build a data warehouse, but I don't know which one is the
> right choice. Cassandra? Hive? Or just SparkSQL?
>
> There are few benchmarks for these systems.
>
> My scenario is as below:
>
> Every 5 seconds, Flume will transfer a log file from an IDC. The log
> file is pre-formatted to suit the MySQL load event. There are many IDCs,
> and they will disconnect from or reconnect to Flume at random.
>
> Every online IDC must receive an analysis of its logs every 5 minutes.
>
> Any suggestions?
>
> Thanks
> Yours
> Meng