Re: Setting id of document with elasticsearch-hadoop that is not in source document

2014-07-13 Thread James Campbell
Thanks for suggesting this option. I would also very much like an exclude
option for the fields that I currently have to include only to set the
_id, _type, and index, since they end up as unnecessary fields in _source.


On Fri, Jul 11, 2014 at 4:39 PM, Costin Leau wrote:

> Hi,
>
> I've opened up issue #230 to address your use case. Rather than offering a
> dedicated field for the ID, I opted to introduce an "include", "exclude"
> option to select (or remove) certain fields from a document before being
> saved to es. This will basically allow documents to be filtered and thus
> exclude the 'metadata' or fields that are not needed in ES directly through
> es-hadoop.
>
> Cheers,
>
>
> On Fri, Jul 11, 2014 at 9:36 PM, Brian Thomas wrote:
>
>> I was just curious whether there was a way of doing this without adding
>> the field; I can add it if necessary.
>>
>> As an alternative, what if, in addition to es.mapping.id, there were
>> another property, like es.mapping.id.exclude, that keeps the id field out
>> of the source document?  In elasticsearch, you can create and update
>> documents without having to include the id in the source document, so I
>> think it would make sense to be able to do that with elasticsearch-hadoop
>> also.
>>
>> On Thursday, July 10, 2014 5:49:18 PM UTC-4, Costin Leau wrote:
>>
>>> You need to specify the id of the document you want to update somehow.
>>> Since in es-hadoop things are batch focused, each
>>> doc needs its own id specified somehow hence the use of 'es.mapping.id'
>>> to indicate its value.
>>> Is there a reason why this approach does not work for you - any
>>> alternatives that you thought of?
>>>
>>> Cheers,
>>>
>>> On 7/7/14 10:48 PM, Brian Thomas wrote:
>>> > I am trying to update an elasticsearch index using elasticsearch-hadoop.
>>> > I am aware of the *es.mapping.id* configuration, where you can specify
>>> > the field in the document to use as an id, but in my case the source
>>> > document does not have the id (I used elasticsearch's autogenerated id
>>> > when indexing the document).  Is it possible to specify the id to update
>>> > without having to add a new field to the MapWritable object?
>>>
>>> --
>>> Costin
>>>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CA%2BAQu3yWpd36vnCxHbUi83GEGue2WjYQ%2B%2Bj_7xRWrsnSEeCvBg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Setting id of document with elasticsearch-hadoop that is not in source document

2014-07-11 Thread Costin Leau
Hi,

I've opened up issue #230 to address your use case. Rather than offering a
dedicated field for the ID, I opted to introduce an "include", "exclude"
option to select (or remove) certain fields from a document before being
saved to es. This will basically allow documents to be filtered and thus
exclude the 'metadata' or fields that are not needed in ES directly through
es-hadoop.
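
A minimal sketch of the filtering idea described above, not es-hadoop's
implementation: the thread does not show the code or the final option names
from issue #230, so the helpers `filter_fields` and `split_id_and_source`
and the field name `id` are hypothetical. It only illustrates taking the id
from a field while keeping that field out of the stored source.

```python
def filter_fields(document, include=None, exclude=None):
    """Return a copy of `document` with only the top-level fields that pass.

    A field is kept when it is in `include` (if an include list is given)
    and not in `exclude` (if an exclude list is given).
    """
    kept = {}
    for name, value in document.items():
        if include and name not in include:
            continue
        if exclude and name in exclude:
            continue
        kept[name] = value
    return kept


def split_id_and_source(document, id_field="id"):
    """Read the id from `id_field`, then drop that field from the source."""
    source = filter_fields(document, exclude=[id_field])
    return document[id_field], source


doc = {"id": "a1", "title": "hello", "views": 3}
doc_id, source = split_id_and_source(doc)
print(doc_id, source)
```

The input document is left unmodified, so the same record can still be used
to look up other metadata (for example a type or index name) before it is
filtered.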

Cheers,


On Fri, Jul 11, 2014 at 9:36 PM, Brian Thomas wrote:

> I was just curious whether there was a way of doing this without adding
> the field; I can add it if necessary.
>
> As an alternative, what if, in addition to es.mapping.id, there were
> another property, like es.mapping.id.exclude, that keeps the id field out
> of the source document?  In elasticsearch, you can create and update
> documents without having to include the id in the source document, so I
> think it would make sense to be able to do that with elasticsearch-hadoop
> also.
>
> On Thursday, July 10, 2014 5:49:18 PM UTC-4, Costin Leau wrote:
>
>> You need to specify the id of the document you want to update somehow.
>> Since in es-hadoop things are batch focused, each
>> doc needs its own id specified somehow hence the use of 'es.mapping.id'
>> to indicate its value.
>> Is there a reason why this approach does not work for you - any
>> alternatives that you thought of?
>>
>> Cheers,
>>
>> On 7/7/14 10:48 PM, Brian Thomas wrote:
>> > I am trying to update an elasticsearch index using elasticsearch-hadoop.
>> > I am aware of the *es.mapping.id* configuration, where you can specify
>> > the field in the document to use as an id, but in my case the source
>> > document does not have the id (I used elasticsearch's autogenerated id
>> > when indexing the document).  Is it possible to specify the id to update
>> > without having to add a new field to the MapWritable object?
>>
>> --
>> Costin
>>



Re: Setting id of document with elasticsearch-hadoop that is not in source document

2014-07-11 Thread Brian Thomas
I was just curious whether there was a way of doing this without adding
the field; I can add it if necessary.

As an alternative, what if, in addition to es.mapping.id, there were
another property, like es.mapping.id.exclude, that keeps the id field out
of the source document?  In elasticsearch, you can create and update
documents without having to include the id in the source document, so I
think it would make sense to be able to do that with elasticsearch-hadoop
also.

On Thursday, July 10, 2014 5:49:18 PM UTC-4, Costin Leau wrote:
>
> You need to specify the id of the document you want to update somehow. 
> Since in es-hadoop things are batch focused, each 
> doc needs its own id specified somehow hence the use of 'es.mapping.id' 
> to indicate its value. 
> Is there a reason why this approach does not work for you - any 
> alternatives that you thought of? 
>
> Cheers, 
>
> On 7/7/14 10:48 PM, Brian Thomas wrote: 
> > I am trying to update an elasticsearch index using elasticsearch-hadoop.
> > I am aware of the *es.mapping.id* configuration, where you can specify
> > the field in the document to use as an id, but in my case the source
> > document does not have the id (I used elasticsearch's autogenerated id
> > when indexing the document).  Is it possible to specify the id to update
> > without having to add a new field to the MapWritable object?
>
> -- 
> Costin 
>



Re: Setting id of document with elasticsearch-hadoop that is not in source document

2014-07-11 Thread Brian Thomas
I was just curious whether there was a way of doing this without adding
the field; I can add it if necessary.

As an alternative, what if, in addition to es.mapping.id, there were
another property, like es.mapping.id.include.in.src, where you could
specify whether the id field actually gets included in the source
document?  In elasticsearch, you can create and update documents without
having to include the id in the source document, so I think it would make
sense to be able to do that with elasticsearch-hadoop also.

On Thursday, July 10, 2014 5:49:18 PM UTC-4, Costin Leau wrote:
>
> You need to specify the id of the document you want to update somehow. 
> Since in es-hadoop things are batch focused, each 
> doc needs its own id specified somehow hence the use of 'es.mapping.id' 
> to indicate its value. 
> Is there a reason why this approach does not work for you - any 
> alternatives that you thought of? 
>
> Cheers, 
>
> On 7/7/14 10:48 PM, Brian Thomas wrote: 
> > I am trying to update an elasticsearch index using elasticsearch-hadoop.
> > I am aware of the *es.mapping.id* configuration, where you can specify
> > the field in the document to use as an id, but in my case the source
> > document does not have the id (I used elasticsearch's autogenerated id
> > when indexing the document).  Is it possible to specify the id to update
> > without having to add a new field to the MapWritable object?
>
> -- 
> Costin 
>



Re: Setting id of document with elasticsearch-hadoop that is not in source document

2014-07-10 Thread Costin Leau
You need to specify the id of the document you want to update somehow. Since es-hadoop is batch focused, each
doc needs its own id, hence the use of 'es.mapping.id' to indicate its value.
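
A minimal sketch of why every document in a batch needs its own id, assuming
the updates are sent in the shape of the Elasticsearch bulk API (one action
line naming the target `_id`, followed by one payload line). The helper
`bulk_update_lines` and the field name `id` are hypothetical; es-hadoop's
own internals are not shown in this thread.

```python
import json


def bulk_update_lines(records, id_field="id"):
    """Build bulk-API lines for partial updates, one pair per record.

    Each record is a dict that carries its own id under `id_field`,
    which plays the role that es.mapping.id plays in the configuration.
    """
    lines = []
    for record in records:
        if id_field not in record:
            raise KeyError(f"record has no {id_field!r} field: {record}")
        # Action line: names the document to update.
        lines.append(json.dumps({"update": {"_id": record[id_field]}}))
        # Payload line: the partial document to merge into the stored one.
        lines.append(json.dumps({"doc": record}))
    return lines


for line in bulk_update_lines([{"id": "a1", "views": 3},
                               {"id": "b2", "views": 7}]):
    print(line)
```

Note that the payload line here still contains the `id` field, so it would
also end up in the stored source document, which is exactly the side effect
the original question is trying to avoid.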

Is there a reason why this approach does not work for you - any alternatives 
that you thought of?

Cheers,

On 7/7/14 10:48 PM, Brian Thomas wrote:

> I am trying to update an elasticsearch index using elasticsearch-hadoop.
> I am aware of the *es.mapping.id* configuration, where you can specify
> the field in the document to use as an id, but in my case the source
> document does not have the id (I used elasticsearch's autogenerated id
> when indexing the document).  Is it possible to specify the id to update
> without having to add a new field to the MapWritable object?




--
Costin



Setting id of document with elasticsearch-hadoop that is not in source document

2014-07-07 Thread Brian Thomas
I am trying to update an elasticsearch index using elasticsearch-hadoop.  I
am aware of the *es.mapping.id* configuration, where you can specify the
field in the document to use as an id, but in my case the source document
does not have the id (I used elasticsearch's autogenerated id when indexing
the document).  Is it possible to specify the id to update without having
to add a new field to the MapWritable object?

