Please find my comments below:
>> From what you said above that means that you can not run ES queries on
data in Hadoop over something like a 6 month time range without it having
to pull in all that data and index it first.   - *CORRECT*. ES queries
can run only on data that is indexed in ES.

>>And I am assuming that the opposite is all correct that Hadoop can not
run jobs on data in ES without it first pulling in that data to its storage
first. - *NOT CORRECT*
The thing is that you can run MR jobs against data stored in ES (via
EsInputFormat).
So you can do some really cool stuff: read (and write) data from ES and
then use the power of MR to process or analyze the data however you want.

In the most common case, a Hadoop MR job does the following:
1. Job config: input, output, input format, output format, etc.
2. Mapper: process each "line" of the input (stored on HDFS) and eventually
"emit" key/value pairs to the Reducer
3. Reducer: process all values for one key and eventually emit again to
the output (on HDFS)
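
To make steps 2 and 3 concrete, here is a minimal sketch in plain Python. It is not the Hadoop API: it only simulates map, group-by-key and reduce in memory, and the word-count logic and the names mapper, reducer and run_job are just assumptions for illustration.

```python
from collections import defaultdict


def mapper(line):
    """Step 2: process one input record and emit (key, value) pairs."""
    for word in line.split():
        yield word.lower(), 1


def reducer(key, values):
    """Step 3: process all values collected for one key and emit a result."""
    yield key, sum(values)


def run_job(lines):
    """Simulate map -> group by key (the shuffle) -> reduce, all in memory."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)

    output = {}
    for key in sorted(groups):
        for out_key, out_value in reducer(key, groups[key]):
            output[out_key] = out_value
    return output


if __name__ == "__main__":
    print(run_job(["ES stores data", "Hadoop stores data"]))
```

In a real job, Hadoop does the grouping between the two phases and the input and output live on HDFS (or in ES) instead of in Python lists.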

With es-hadoop you can set the job input to be read from ES (so only step 1
changes), and all the other steps can stay the same.

Here are some typical scenarios:
1. Read (via an ES query) from ES
1.1 Process the data in an MR job
1.2 Store the output to HDFS [OR store the output to ES again (ES indexing
operation)]

2. Run an MR job against data stored on HDFS
2.1 Process the data
2.2 Store the output to ES (ES indexing)
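
A rough sketch of how the job configuration (step 1) could differ between the two scenarios, written as plain Python dictionaries rather than real Hadoop code. The property names are the ones I know from es-hadoop and Hadoop 2, so please check them against the version you run. The host, index name, query and HDFS paths in the example are made up.

```python
# Input/output format classes shipped with es-hadoop for MapReduce.
ES_INPUT_FORMAT = "org.elasticsearch.hadoop.mr.EsInputFormat"
ES_OUTPUT_FORMAT = "org.elasticsearch.hadoop.mr.EsOutputFormat"


def es_to_hdfs(es_nodes, es_resource, es_query, hdfs_output_dir):
    """Scenario 1: read from ES via a query, write the job output to HDFS."""
    return {
        "es.nodes": es_nodes,        # where the ES cluster is reachable
        "es.resource": es_resource,  # index/type to read from
        "es.query": es_query,        # only matching documents reach the mapper
        "mapreduce.job.inputformat.class": ES_INPUT_FORMAT,
        "mapreduce.output.fileoutputformat.outputdir": hdfs_output_dir,
    }


def hdfs_to_es(es_nodes, es_resource, hdfs_input_dir):
    """Scenario 2: read from HDFS, index the job output into ES."""
    return {
        "es.nodes": es_nodes,
        "es.resource": es_resource,  # index/type to write to
        "mapreduce.input.fileinputformat.inputdir": hdfs_input_dir,
        "mapreduce.job.outputformat.class": ES_OUTPUT_FORMAT,
    }


if __name__ == "__main__":
    # Hypothetical host, index, query and paths, for illustration only.
    for conf in (
        es_to_hdfs("localhost:9200", "logs/event", "?q=error", "/tmp/out"),
        hdfs_to_es("localhost:9200", "logs/summary", "/data/raw"),
    ):
        for key, value in sorted(conf.items()):
            print(f"{key}={value}")
        print()
```

For the "store output to ES again" variant of 1.2, you would keep the ES input settings and use the ES output format instead of an HDFS output directory.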

Cheers
Georgi




2014-06-09 13:47 GMT+02:00 ES USER <es.user.2...@gmail.com>:

> Thanks.  So just one final question.  From what you said above that means
> that you can not run ES queries on data in Hadoop over something like a 6
> month time range without it having to pull in all that data and index it
> first.  And I am assuming that the opposite is all correct that Hadoop can
> not run jobs on data in ES without it first pulling in that data to its
> storage first.
>
>
>
>
>
> On Friday, June 6, 2014 5:03:03 PM UTC-4, Costin Leau wrote:
>
>> ES stores data in its own internal format, which typically resides
>> locally. What you are stating is partially correct - with the connector you
>> would move/copy data between Hadoop and ES since, in order for ES to work
>> with data, it needs to actually index it (that is, to see it).
>> So you would use es-hadoop to index data from Hadoop in ES or/and query
>> ES directly from Hadoop.
>>
>>
>> On Fri, Jun 6, 2014 at 9:29 PM, ES USER <es.use...@gmail.com> wrote:
>>
>>> I guess the problem I am having wrapping my head around is exactly where
>>> the data is residing and in what format.
>>>
>>> If I understand Georgi's email above, it is that you can run map
>>> reduce jobs against data stored in local ES by utilizing es-hadoop
>>> and you can also run ES queries against data in Hadoop utilizing es-hadoop.
>>>
>>>
>>>   Is that correct?
>>>
>>>
>>>
>>>
>>> On Friday, June 6, 2014 12:39:44 PM UTC-4, Costin Leau wrote:
>>>
>>>> Adding to what Georgi wrote, es-hadoop does not create the shards for
>>>> you - that's up to you or index templates (which I highly recommend).
>>>> However es-hadoop is aware of the target shards and will use them to
>>>> parallelize the reads/writes (such as one task per shard).
>>>>
>>>>
>>>> On Fri, Jun 6, 2014 at 2:45 PM, Georgi Ivanov <georgi....@gmail.com>
>>>> wrote:
>>>>
>>>>> and I don't think this is in any way related to the number of shards and nodes
>>>>>
>>>>>
>>>>> On Thursday, June 5, 2014 7:41:34 PM UTC+2, ES USER wrote:
>>>>>
>>>>>> Try as I might, and I have read all the stuff I can find on ES'
>>>>>> website about this, I understand somewhat how the integration works but
>>>>>> not the actual nuts and bolts of it.
>>>>>>
>>>>>> For example:
>>>>>>
>>>>>> Is Hadoop just storing the files that would normally be stored in the
>>>>>> local filesystem for the ES indexes or is it storing the data that would
>>>>>> normally be in those indexes and just accessed through es-hadoop?
>>>>>>
>>>>>> If it is the latter, how do you go about determining what to set for
>>>>>> the number of nodes and shards?
>>>>>>
>>>>>>
>>>>>> If anyone has any information on this or even better yet a place to
>>>>>> point me to that has better references so that I can research this on my
>>>>>> own it would be much appreciated.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>  --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "elasticsearch" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to elasticsearc...@googlegroups.com.
>>>>> To view this discussion on the web visit https://groups.google.com/d/
>>>>> msgid/elasticsearch/90662a91-1557-4f61-86a2-bd2e620aec6f%40goo
>>>>> glegroups.com
>>>>> <https://groups.google.com/d/msgid/elasticsearch/90662a91-1557-4f61-86a2-bd2e620aec6f%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>
