Re: better places to store es.nodes and es.port in ES Hive integration?

2014-09-16 Thread Jinyuan Zhou
I have confirmed with both elasticsearch hive and elasticsearch mr: if both of the situations below happen, EsOutFormat produces an invalid header for bulk indexing. 1. es.resource contains data to be extracted from the document 2. es.mapping.id is set to one of the fields in the document I looked at the code

Re: better places to store es.nodes and es.port in ES Hive integration?

2014-09-16 Thread Costin Leau
Please upgrade to version 2.0.1. On 9/17/14 1:18 AM, Jinyuan Zhou wrote: I have confirmed with both elasticsearch hive and elasticsearch mr: if both of the situations below happen, EsOutFormat produces an invalid header for bulk indexing. 1. es.resource contains data to be extracted from the document 2.

Re: better places to store es.nodes and es.port in ES Hive integration?

2014-06-17 Thread Costin Leau
Most likely some of your data contains invalid entries, which results in an invalid JSON payload being sent to ES. Check your ID values and/or keep an eye on issue #217, which aims to provide more human-friendly messages for the user. Cheers.

Re: better places to store es.nodes and es.port in ES Hive integration?

2014-06-17 Thread Jinyuan Zhou
I will check the values. However, the problem occurs only when I use both es.mapping.id and the 'dynamic/multi resource writes' feature. Used separately, they are fine. Jinyuan (Jack) Zhou On Tue, Jun 17, 2014 at 6:25 AM, Costin Leau costin.l...@gmail.com wrote: Most likely some of your data

Re: better places to store es.nodes and es.port in ES Hive integration?

2014-06-16 Thread Jinyuan Zhou
Just sharing a solution I learned on the Hive side. The hive cli has an -i option that takes a file of hive commands to initialize the session, so I can put a list of set commands as well as add jar ... commands in one file, say init.hive, then run the cli like this: hive -i init.hive -f myscript.hql. Note table

Re: better places to store es.nodes and es.port in ES Hive integration?

2014-06-16 Thread Costin Leau
Thanks for sharing - can you also give an example of the table initialization in init.hive vs myscript.hql? Cheers! On 6/16/14 11:19 PM, Jinyuan Zhou wrote: Just sharing a solution I learned on the Hive side. The hive cli has an -i option that takes a file of hive commands to initialize the session.
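
A minimal sketch of the init.hive / myscript.hql split discussed in this thread, not the poster's actual files. The jar path, host name, port, table name, columns, and index name are all hypothetical placeholders. It also assumes that es.* values set at session level are picked up by the ES storage handler, so that only the table-specific es.resource has to stay in TBLPROPERTIES.

```shell
#!/bin/sh
# Sketch only: jar path, host, port, table and index names are hypothetical.

# Session-wide settings shared by every script that targets the same ES cluster.
cat > init.hive <<'EOF'
-- hypothetical jar location
ADD JAR /path/to/elasticsearch-hadoop.jar;
-- assumption: session-level es.* settings reach the ES storage handler
SET es.nodes=es-host.example.com;
SET es.port=9200;
EOF

# Per-dataset script: only the table-specific es.resource remains here.
cat > myscript.hql <<'EOF'
CREATE EXTERNAL TABLE IF NOT EXISTS artists_es (id BIGINT, name STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES ('es.resource' = 'radio/artists');
EOF

# Run the script with the shared session settings, only where the Hive CLI exists.
if command -v hive >/dev/null 2>&1; then
  hive -i init.hive -f myscript.hql
fi
```

With this layout, ten scripts indexing into the same cluster would share one init.hive instead of each repeating es.nodes and es.port in its own TBLPROPERTIES.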

Re: better places to store es.nodes and es.port in ES Hive integration?

2014-06-15 Thread Costin Leau
Could you please raise an issue with some type of example? Due to the way Hadoop (and Hive) works, things tend to be tricky in terms of configuring a job. The configuration needs to be created before a job is submitted which in practice means dynamic configurations are basically impossible

Re: better places to store es.nodes and es.port in ES Hive integration?

2014-06-15 Thread Jinyuan Zhou
Thanks Costin, I am aiming at modifying the existing hadoop cluster and hive installation and also modularizing some common es.* properties in a separate common place. I know the first goal can be achieved with the hive cli --auxpath option and the hive table's TBLPROPERTIES. For the second goal, I

better places to store es.nodes and es.port in ES Hive integration?

2014-06-13 Thread Jinyuan Zhou
Hi, I am playing with the elasticsearch and hive integration. The documentation says to set configuration like es.nodes and es.port in TBLPROPERTIES. It works, but it can lead to a lot of redundant code. If I have ten data sets to index to the same es cluster, I would have to repeat this information ten