Re: The python client avoids the _parent field when it does reindex
Hi,

this is indeed a problem with elasticsearch; sorry it took me so long to realize this. The search API (and therefore the scan helper, which reindex uses) doesn't return the value of _parent by default. I created an issue in elasticsearch to change that (0).

The workaround is to use the scan and bulk helpers directly (look at how reindex is implemented, copy-paste and modify): add fields=['_source', '_parent'] to the scan call and create your own version of the expand_action callback that extracts the _parent value from the fields dictionary of each hit.

This is very far from ideal, so I also created an issue for elasticsearch-py (1) to enable doing this (maybe even by default) and to describe the issue in the docs.

Thanks for reporting this and sorry for the oversight,
Honza

0 - https://github.com/elasticsearch/elasticsearch/issues/8068
1 - https://github.com/elasticsearch/elasticsearch-py/issues/140

On Mon, Oct 13, 2014 at 9:36 AM, Costya Regev wrote:
> I think that the mapping is fine.
> Do you see anything wrong here?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CABfdDiotzp4sPf5UDDBmD2wbpN92wAw_CUCo2o0ptK7-Vk7G8w%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
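The workaround described in the message above could be sketched roughly as follows. This is only a sketch under assumptions, not the library's own code: lift_parent and reindex_with_parent are hypothetical names, and it assumes an elasticsearch-py 1.x client where scan forwards fields=... to the search API and each hit then carries the requested metadata under a 'fields' key. Instead of a custom expand_action callback, this variant moves _parent onto the action before it reaches bulk, relying on the default action expansion looking for _parent (as mentioned earlier in the thread).

```python
def lift_parent(hit, target_index):
    """Turn one search hit into a bulk action for target_index, keeping _parent.

    Assumption: the hit was fetched with fields=['_source', '_parent'], so the
    parent id (if any) sits under hit['fields']['_parent'].
    """
    action = dict(hit)  # shallow copy so the original hit is left untouched
    action['_index'] = target_index
    fields = action.pop('fields', None) or {}
    parent = fields.get('_parent')
    # Be tolerant of the value arriving wrapped in a list.
    if isinstance(parent, (list, tuple)):
        parent = parent[0] if parent else None
    if parent is not None:
        action['_parent'] = parent
    return action


def reindex_with_parent(client, source_index, target_index,
                        chunk_size=500, scroll='5m'):
    """Hypothetical replacement for helpers.reindex that carries _parent over."""
    # Imported here so lift_parent can be used without the library installed.
    from elasticsearch.helpers import bulk, scan

    hits = scan(client, index=source_index, scroll=scroll,
                fields=['_source', '_parent'])
    actions = (lift_parent(hit, target_index) for hit in hits)
    return bulk(client, actions, chunk_size=chunk_size, stats_only=True)
```

With the indices from the original report, the call would look like reindex_with_parent(es, '2014_03', '2014_03_new'). The target index still needs its _parent mapping in place before the documents are written.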
Re: The python client avoids the _parent field when it does reindex
Hi Kral,

I think that the mapping is fine. Here it is:

    "mappings":{
        "account":{
            "properties":{
                "name":{
                    "type":"string",
                    "index":"not_analyzed"
                }

                ... ANOTHER ACCOUNT FIELDS ...

            },
            "_routing":{
                "required":true,
                "path":"name"
            }
        },
        "event":{
            "properties":{

                ... SOME EVENT FIELDS ...

                "account":{
                    "type":"object",
                    "properties":{

                        ... SOME MORE FIELDS ...

                        "name":{
                            "type":"string",
                            "index":"no"
                        }
                    }
                }
            },
            "_parent":{
                "type":"account"
            },
            "_timestamp":{
                "enabled":true,
                "store":true
            },
            "_routing":{
                "required":true,
                "path":"account.name"
            }
        }
    }

Do you see anything wrong here?
Thx

On Sunday, October 12, 2014 9:46:37 PM UTC+3, Honza Král wrote:
> Hi Costya,
>
> the code actually looks for parent and should move it around. Are you
> sure you have your mappings set up correctly for the new index to
> include the parent/child relationship?
>
> Thanks
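One way to double-check the mapping question raised in this exchange is to compare the _parent declarations of the source and target indices. The sketch below is only illustrative: parent_types and parent_mismatches are hypothetical helpers, and they expect the per-type "mappings" dictionaries in the shape shown in the message above.

```python
def parent_types(mappings):
    """Map each child type to the parent type named in its _parent setting."""
    return {
        type_name: mapping['_parent']['type']
        for type_name, mapping in mappings.items()
        if '_parent' in mapping
    }


def parent_mismatches(source_mappings, target_mappings):
    """Return {child_type: (parent_in_source, parent_in_target)} for every
    child type whose _parent declaration is missing or different in the target.
    """
    source = parent_types(source_mappings)
    target = parent_types(target_mappings)
    return {
        child: (parent, target.get(child))
        for child, parent in source.items()
        if target.get(child) != parent
    }
```

An empty result means the new index declares the same parent/child relationships as the old one. The input dictionaries could come from es.indices.get_mapping(index=...), assuming the response nests them under the index name and a "mappings" key.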
Re: The python client avoids the _parent field when it does reindex
Hi Costya,

the code actually looks for parent and should move it around. Are you sure you have your mappings set up correctly for the new index to include the parent/child relationship?

Thanks

On Sun, Oct 12, 2014 at 5:30 PM, Costya Regev wrote:
> Hi,
>
> When we run the following code:
>
>     from elasticsearch import Elasticsearch
>     from elasticsearch.helpers import reindex
>
>
>     if __name__ == "__main__":
>         es = Elasticsearch()
>         reindex(es, source_index='2014_03', target_index='2014_03_new',
>                 chunk_size=500, scroll='5m')
>
> We get the 2014_03_new index with all of the fields inserted correctly
> except the _parent field. The _parent field, which most of the documents
> in the original index have, is consistently missing from all of the
> documents in the new index.
>
> It looks like a bug in the client's code.
>
> We would appreciate your help.
>
> Regards,
> Costya, Totango Metrics.