Re: [hadoop] newbie question

Christian Dahlqvist Sun, 03 May 2015 12:15:27 -0700

Hi,

I am sure Hadoop can help you calculate this, but you may be able to do it 
more efficiently in Elasticsearch itself. If, as you mentioned, you create a 
user-centric index alongside the event-centric one you already have, you can 
store the list of all events belonging to each user there. A simple query can 
then identify the users that have all the required events, and you only need 
to process those users to verify that the events occurred in the correct 
order. This is likely to scale and perform much better than the current 
approach, and is usually referred to as entity-centric indexing [1].
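
To make the two steps concrete, here is a rough sketch in Python. It is 
only an illustration under assumptions: the field names (events.type, 
timestamp, user_id) and the event names are made up, and the exact query 
syntax depends on your Elasticsearch version. The first function builds a 
query body that matches users having every required event type; the second 
checks, on the client side, that the events occurred in the required order.

```python
def build_required_events_query(required_types):
    """Query body matching user documents that contain every required type.

    Assumes each user document has an 'events' array of objects with a
    'type' field (hypothetical mapping).
    """
    return {
        "query": {
            "bool": {
                "filter": [{"term": {"events.type": t}} for t in required_types]
            }
        }
    }


def has_events_in_order(events, required_types):
    """Return True if required_types occur in this order for one user.

    Other events may occur in between; only the relative order matters.
    """
    ordered = sorted(events, key=lambda e: e["timestamp"])
    remaining = iter(e["type"] for e in ordered)
    # 'r in remaining' consumes the iterator up to the first match, so each
    # required type must be found after the previous one.
    return all(r in remaining for r in required_types)


# Example user documents, as they might come back from the query above.
required = ["signup", "login", "purchase"]
candidates = [
    {"user_id": "u1", "events": [
        {"type": "signup", "timestamp": 1},
        {"type": "purchase", "timestamp": 3},
        {"type": "login", "timestamp": 2},
    ]},
    {"user_id": "u2", "events": [
        {"type": "login", "timestamp": 1},
        {"type": "signup", "timestamp": 2},
        {"type": "purchase", "timestamp": 3},
    ]},
]
matching = [u["user_id"] for u in candidates
            if has_events_in_order(u["events"], required)]
```

The query does the cheap filtering inside Elasticsearch, so the order check 
only has to run over the users that already have all the required events.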


As updating the user-centric index for every inserted event can be 
expensive, a common approach is to run a batch job that periodically 
retrieves all new events, aggregates them per user and updates the user 
index. The user index will then not be completely up to date at all times, 
but because the processing work is spread out over time, queries can be 
much more efficient.
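
A batch job along these lines could look roughly like the sketch below. It 
is plain Python working on in-memory lists and dicts so that the logic is 
easy to follow; the function names, the field names and the idea of tracking 
a last-run timestamp are my assumptions, and in a real setup you would read 
the new events from the event index and write the user documents back with 
bulk updates.

```python
from collections import defaultdict


def aggregate_new_events(events, last_run_ts):
    """Group events newer than last_run_ts by user id."""
    per_user = defaultdict(list)
    for e in events:
        if e["timestamp"] > last_run_ts:
            per_user[e["user_id"]].append(
                {"type": e["type"], "timestamp": e["timestamp"]}
            )
    return dict(per_user)


def apply_to_user_index(user_index, per_user):
    """Merge the aggregated events into the user documents.

    user_index is a dict keyed by user id, standing in for the
    user-centric index.
    """
    for user_id, new_events in per_user.items():
        doc = user_index.setdefault(user_id, {"user_id": user_id, "events": []})
        doc["events"].extend(new_events)
        doc["events"].sort(key=lambda e: e["timestamp"])
    return user_index


# One run of the batch job over a small event log.
event_log = [
    {"user_id": "u1", "type": "signup", "timestamp": 10},
    {"user_id": "u1", "type": "login", "timestamp": 25},
    {"user_id": "u2", "type": "signup", "timestamp": 30},
]
user_index = {
    "u1": {"user_id": "u1", "events": [{"type": "signup", "timestamp": 10}]}
}
new_per_user = aggregate_new_events(event_log, last_run_ts=20)
apply_to_user_index(user_index, new_per_user)
```

Each run touches every affected user document once, no matter how many 
events that user generated since the previous run.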

[1] https://www.elastic.co/videos/entity-centric-indexing-london-meetup-sep-2014

Best regards,

Christian


On Sunday, 3 May 2015 10:21:35 UTC+1, Lior Goldemberg wrote:
>
> hi,
>
> i have a few basic questions about es-hadoop,
> and i would really appreciate your kind help
>
> 1. if i currently have an ES cluster, do i have any motivation to add a 
> hadoop layer?
>
> 2. is the idea of ES-hadoop that hadoop will be the data store, and ES 
> the search engine on top of it?
>
> 3. can logstash write to hadoop?
>
> 4. when i run queries against ES, do they go to HDFS in real time?
>
> thanks a lot!
>

