Hi All,
I am currently using Elasticsearch for a logging system.
My first approach is to put every log entry into ES, with the index rolling over
by date.
For real-time stats, I will use aggregations.
For other statistics, I will use Spark (or Hive, Shark, whatever) on the ES data
(thanks to ElasticSearch-Hadoop
It depends on various factors. Do you put all the data under one index, or
do you use one index per day/month/hour? What type of script are you running,
and what performance degradation do you see? If it's easier, feel free to reach
out on IRC. I'll be traveling this week but will be back next week.
Cheers
On Oct 12,
Hi Costin Leau,
Currently I just put all the data in one index (INDEX_NAME_DATE).
In my benchmark, I run just two operations: count and count distinct on a field.
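
For reference, here is a rough sketch of how those two operations could be expressed as a single Elasticsearch aggregation request body. The helper name `build_stats_query` and the field name `user_id` are hypothetical, used only for illustration; the actual benchmark may have been run differently (for example through Spark). Note that the `cardinality` aggregation returns an approximate distinct count.

```python
import json


def build_stats_query(field):
    """Build a search body that counts values and distinct values of `field`.

    `field` is a placeholder; substitute a real field from the log mapping.
    """
    return {
        "size": 0,  # return no hits, only aggregation results
        "aggs": {
            # number of values present for the field
            "total_values": {"value_count": {"field": field}},
            # approximate number of distinct values for the field
            "distinct_values": {"cardinality": {"field": field}},
        },
    }


if __name__ == "__main__":
    # "user_id" is an assumed field name, not taken from the thread.
    print(json.dumps(build_stats_query("user_id"), indent=2))
```

The resulting body would be sent to the `_search` endpoint of the index in question.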
P.S. Thanks for your fast response. I would be really happy to talk to you on IRC
(just let me know the time).
On Sunday, October 12, 2014 8:02:57 PM