I have confirmed with both elasticsearch Hive and elasticsearch MR: if both of
the situations below occur, EsOutputFormat produces an invalid header for bulk
indexing.
1. es.resource contains data to be extracted from the document
2. es.mapping.id is set to one of the fields in the document
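For reference, the combination of the two conditions listed above might look like the following Hive table definition. This is only a sketch: the file name, table name, columns, and index name are hypothetical, and the `{media_type}` pattern assumes the connector's dynamic/multi-resource write syntax.

```shell
# Write a hypothetical Hive script that combines both settings.
# Table, column, and index names are invented for illustration.
cat > repro.hql <<'EOF'
-- Condition 1: es.resource takes part of its value from a document field.
-- Condition 2: es.mapping.id names a document field to use as the ID.
CREATE EXTERNAL TABLE media_es (id STRING, media_type STRING, title STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.resource'   = 'my-collection/{media_type}',
  'es.mapping.id' = 'id'
);
EOF
```

Dropping either of the two properties from a script like this gives the "used separately" cases to compare against.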
I looked at the code
Please upgrade to version 2.0.1
On 9/17/14 1:18 AM, Jinyuan Zhou wrote:
I have confirmed with both elasticsearch Hive and elasticsearch MR: if both of
the situations below occur, EsOutputFormat
produces an invalid header for bulk indexing.
1. es.resource contains data to be extracted from the document
2.
Most likely some of your data contains invalid entries, which results in
an invalid JSON payload being sent to ES.
Check your ID values and/or keep an eye on issue #217 which aims to provide
more human-friendly messages for the user.
Cheers.
I will check the values. However, the problem occurs only when I use both
es.mapping.id and the 'dynamic/multi-resource writes' feature. Used separately,
they are fine.
Jinyuan (Jack) Zhou
On Tue, Jun 17, 2014 at 6:25 AM, Costin Leau costin.l...@gmail.com wrote:
Most likely some of your data
Just sharing a solution I learned on the Hive side.
The hive CLI has an -i option that takes a file of Hive commands to initialize
the session,
so I can put a list of set commands as well as add jar ... commands in one
file, say init.hive,
then run the CLI like this: hive -i init.hive -f myscript.hql. Note table
Thanks for sharing - can you also give an example of the table initialization
in init.hive vs myscript.hql?
Cheers!
On 6/16/14 11:19 PM, Jinyuan Zhou wrote:
Just sharing a solution I learned on the Hive side.
The hive CLI has an -i option that takes a file of Hive commands to initialize the
session.
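The approach described above can be sketched as follows. The file contents are assumptions: the jar path, host name, and port are placeholders, and whether `es.*` values set with `SET` are picked up for a given table may depend on the connector version, so treat this as a starting point rather than a verified setup.

```shell
# Shared session setup kept in one file (placeholder jar path and cluster address).
cat > init.hive <<'EOF'
ADD JAR /path/to/elasticsearch-hadoop.jar;
SET es.nodes=es-host.example.com;
SET es.port=9200;
EOF

# -i names the initialization file, -f the script to execute.
# The command is printed rather than executed so the snippet runs without Hive installed.
HIVE_CMD="hive -i init.hive -f myscript.hql"
echo "$HIVE_CMD"
```

Keeping the shared commands in one initialization file means each script passed with -f only needs its own table definitions and queries.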
Could you please raise an issue with some type of example? Due to the way
Hadoop (and Hive) works, things tend to be tricky in terms of configuring a job.
The configuration needs to be created before a job is submitted, which in
practice means dynamic configurations are basically impossible
Thanks Costin,
I am aiming at modifying the existing Hadoop cluster and Hive installation
and also modularizing some common es.* properties in a separate common
place. I know the first goal can be achieved with the hive CLI --auxpath
option and the Hive table's TBLPROPERTIES. For the second goal, I
Hi,
I am playing with the elasticsearch and Hive integration. The documentation
says
to set configuration like es.nodes and es.port in TBLPROPERTIES. It works.
But it can cause a lot of redundant code. If I have ten data sets to index to
the same ES cluster,
I would have to repeat this information ten