It worked when I added the Scala 2.10 dependency to the project:
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>2.10.1</version>
</dependency>
On Sat, Jul 4, 2015 at 10:31 AM, Deepak Subhramanian wrote:
> I am getting an error while using es-hadoop with Spark on Scala 2.11. It
> works with the spark-shell binaries I have with Scala 2.10.
I am getting an error while using es-hadoop with Spark on Scala 2.11. It
works with the spark-shell binaries I have with Scala 2.10.
This line generates the error:
val esRDD = sc.esRDD("realtimeanalytics/events")
This works:
finaloutput.toJSON.saveJsonToEs("realtimeanalytics/events")
Any inputs will be appreciated.
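For anyone hitting the same thing, a minimal sketch of the read path, assuming the elasticsearch-spark artifact built for your Scala version is on the classpath (the app name and node address below are placeholders). The esRDD and saveJsonToEs methods are implicits pulled in by the org.elasticsearch.spark._ import, so a missing or mismatched import is another common reason this fails to compile:

  import org.apache.spark.{SparkConf, SparkContext}
  import org.elasticsearch.spark._ // adds sc.esRDD and rdd.saveJsonToEs

  val conf = new SparkConf()
    .setAppName("es-read")              // placeholder
    .set("es.nodes", "localhost:9200")  // placeholder
  val sc = new SparkContext(conf)
  // RDD of (document id, document as a Map)
  val esRDD = sc.esRDD("realtimeanalytics/events")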
Are there any ways to tune the performance of Elasticsearch + Spark SQL?
Environment (everything is running on the same box):
Elasticsearch 1.4.4
elasticsearch-hadoop 2.1.0.BUILD-SNAPSHOT
Spark 1.3.0
CURL:
curl -XPOST "http://localhost:9200/summary/intervals/_search" -d '
{
"query" : {
"filtered" [...]
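On the tuning question, a sketch of the knobs to try first (the values are illustrative, not from this thread): es.scroll.size controls how many hits each scroll request pulls back, and the Spark SQL integration in es-hadoop 2.1 exposes the index as a DataFrame via esDF:

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.SQLContext
  import org.elasticsearch.spark.sql._ // adds sqlContext.esDF

  val conf = new SparkConf()
    .setAppName("es-sql")               // placeholder
    .set("es.nodes", "localhost:9200")
    .set("es.scroll.size", "1000")      // default is 50; larger batches mean fewer round trips
  val sc = new SparkContext(conf)
  val sqlContext = new SQLContext(sc)
  val intervals = sqlContext.esDF("summary/intervals")
  intervals.registerTempTable("intervals")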
Thanks. We got it working by adding the jar to the Hive config rather than
via "add jar".
-ra
On Wednesday, 29 April 2015 at 00:11:47 UTC+2, Costin Leau wrote:
>
> Hi,
>
> It seems you are running into a classpath problem. The class mentioned in
> the exception [...]
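For reference, a sketch of what "adding the jar to the Hive config" can look like, via hive.aux.jars.path in hive-site.xml (the path is a placeholder, and this is one common way to register the jar permanently, not necessarily exactly what -ra did):

  <property>
    <name>hive.aux.jars.path</name>
    <value>/path/to/elasticsearch-hadoop-hive-2.1.0.Beta4.jar</value>
  </property>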
Hi,
It seems you are running into a classpath problem. The class mentioned in the exception
(org/elasticsearch/hadoop/serialization/dto/Node) is part of the elasticsearch-hadoop-hive-XXX.jar - you can verify
this yourself.
The fact that it is not found at runtime suggests that a different jar is being picked up.
Hi!
I've followed the various guides to get going with the
elasticsearch-hadoop integration in Hive, but I run into an issue:
> add jar hdfs://host:9000//lib/elasticsearch-hadoop-hive-2.1.0.Beta4.jar;
INFO : converting to local hdfs://host:9000//lib/elasticsearch-hadoop-hive-2.1[...]
Based on your cryptic message I would guess the issue is likely that the jar you are building is incorrect, as its
manifest is invalid. The Spark jar is most likely signed, and repackaging extra content into it breaks the signature.
See
http://www.elastic.co/guide/en/elasticsearch/hadoop/master/troubleshooting.html#help
*Hello*
*When I add the elasticsearch-hadoop jar, I get this error:*
Spark assembly has been built with Hive, including Datanucleus jars on
classpath
Exception in thread "main" java.lang.SecurityException: Invalid signature
file digest for Manifest main attributes
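The usual (if blunt) fix, assuming the fat jar is assembled with a recent sbt-assembly - which this thread doesn't say - is to discard the signature files that no longer match the repackaged contents:

  // build.sbt - drop META-INF signature data when building the assembly
  assemblyMergeStrategy in assembly := {
    case PathList("META-INF", xs @ _*) => MergeStrategy.discard
    case _                             => MergeStrategy.first
  }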
I'm having an issue very similar to this; I'm not sure exactly what you did
to get the array contents. I've made a new post here:
https://groups.google.com/forum/#!topic/elasticsearch/MpOqKthgqtA
--
Paul Chua
Data Scientist
317-979-5643
Hi,
Hadoop means a lot of things as it has a lot of components. I'm sorry to hear the resources you read don't give you
enough answers.
The 'definition' of Elasticsearch Hadoop is given in the documentation preface
[1], which I quote below:
"
Elasticsearch for Apache Hadoop is an open-source, stand-alone, self-contained, small library that allows Hadoop jobs to interact with Elasticsearch.
"
Hi,
Even after going through so many resources and reading about es-hadoop, I am
unable to clarify some of my doubts, like:
How do I run elasticsearch data nodes on my hadoop data nodes?
Can I install an elasticsearch cluster and store indexes on Hadoop HDFS?
If yes, then how?
Will I have to ke[...]
A question for Costin: I enjoyed the talk at Spark Summit East on
spark-elasticsearch integration in Spark 1.3 (sparkContext.esRDD and
rdd.saveToEs APIs). Will these APIs eventually be available for the pyspark
context/rdd? Cheers, JH
You're close:
elasticsearch-hadoop snapshot (aka dev, aka master) works on Spark 1.2, 1.1
and 1.0, both core and SQL
elasticsearch-hadoop Beta3 (not snapshot) works on Spark 1.1 and Spark 1.0,
both core and SQL
elasticsearch-hadoop Beta2 (not snapshot) works on Spark 1.0 (core and SQL)
The su[...]
Thank you for the summary - you are confirming (as a sanity check for
myself):
elasticsearch-hadoop Beta3 (not snapshot) on Spark core 1.1 only
elasticsearch-hadoop-Beta3-SNAPSHOT with Spark core 1.1, 1.2 and 1.3 -- as
long as I don't use Spark SQL when using 1.2 and 1.3
Costin - I am a[...]
... as long as the versions are in order, the
same should apply for es-hadoop as well, since it relies only on Spark (and
Scala of course).
On Tue, Mar 17, 2015 at 10:43 PM, Jeff Steinmetz
<jeffrey.steinm...@gmail.com> wrote:
> There are plenty of spark / akka / scala / elasticsearch-hadoop
> dependencies to ke[...]
There are plenty of spark / akka / scala / elasticsearch-hadoop
dependencies to keep track of.
Is it true that elasticsearch-hadoop needs to be compiled for a specific
Spark version to run correctly on the cluster? I'm also trying to keep
track of the akka version and scala version, i.e.[...]
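A sketch of pinning these together in sbt (every version below is illustrative - the point is that the Scala suffix of the elasticsearch-spark artifact must match scalaVersion, and spark-core is provided by the cluster):

  // build.sbt
  scalaVersion := "2.10.4"
  libraryDependencies ++= Seq(
    "org.apache.spark"  %% "spark-core"          % "1.3.0" % "provided",
    "org.elasticsearch" %% "elasticsearch-spark" % "2.1.0.Beta3"
  )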
... JSON form incorrect.
Any reason why you need to pass the array of maps as a script parameter and
not use primitives instead (you can use Hive column mapping to extract the
ones you need)?
On Thu, Mar 12, 2015 at 11:56 PM, Chen Wang wrote:
> Folks,
> I am using elasticsearch-hadoop-hive-2.1.[...]
Folks,
I am using elasticsearch-hadoop-hive-2.1.0.Beta3.jar.
I defined the external table as:
CREATE EXTERNAL TABLE IF NOT EXISTS ${staging_table}(
customer_id STRING,
store_purchase ARRAY<MAP<STRING,STRING>>)
ROW FORMAT SERDE 'org.elasticsearch.hadoop.hive.EsSerDe'
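For comparison, the shape the es-hadoop docs use for Hive tables goes through the storage handler rather than the SerDe alone; a sketch, with the table and index/type names assumed:

  CREATE EXTERNAL TABLE staging (
    customer_id STRING,
    store_purchase ARRAY<MAP<STRING,STRING>>)
  STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
  TBLPROPERTIES('es.resource' = 'customers/purchases');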
Hi Julien,
Yes, probably that's the only possible workaround for this. What I am
planning to do is calculate the name of the index prior to writing, add a
field named "indexname" to my JSON, and then use
JavaEsSpark.saveJsonToEs(jrd, "index_{indexname}/type");
Thanks for the reply.
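In Scala terms the same idea is a one-liner once each JSON document carries the precomputed field (a sketch; the "indexname" field is from the plan above, jsonRdd and everything else are assumptions):

  import org.elasticsearch.spark._

  // each String in jsonRdd is a JSON doc containing an "indexname" field;
  // the {indexname} pattern routes every doc to its own index
  jsonRdd.saveJsonToEs("index_{indexname}/type")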
Hi Abhishek,
I'll probably add a previous step to compute the day for each line (like
yyyy-MM-dd).
And just do
JavaEsSpark.saveJsonToEs(jrd, "index_{date}/type");
If the range is needed I'll probably do the same: add a field with
{ time - time % 86400 }_{ time + 86400 - time % 86400 } and just
in[...]
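Spelled out, that bucketing arithmetic just floors the Unix-seconds timestamp to its day window (a sketch of the field computation; 86400 = seconds per day):

  // t is a Unix timestamp in seconds
  val start  = t - t % 86400L        // midnight at the start of t's day
  val end    = start + 86400L        // midnight at the end of t's day
  val bucket = s"${start}_${end}"    // e.g. used as the index-name field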
Hi Julien,
I am trying to achieve something similar. In my case, my JSON contains a
field "time" in Unix time, and I want to partition my indexes by this
field. That is, if JSON1 contains 1422904680 in "time" and JSON2
contains 1422991080 in "time", then I want to create indexes which are
pa[...]
>
> Example of data (~700 million lines for ~90 days):
>
> 2014-01-01,05,06,ici
> 2014-01-04,05,06,la
>
> The first one has to be sent to my-index-2014-01-01/my-type and the other
> to my-index-2014-01-04/my-type
> I would like to do it without having to launch 90 saveJsonToES (us[...]
... and the other to
my-index-2014-01-04/my-type.
I would like to do it without having to launch 90 saveJsonToES calls (using the
elasticsearch-hadoop spark API).
Is it clearer?
It seems that the dynamic index could work for me. I'll try that right away.
Thanks again
Julien
2015-01-19 16:18 GMT+01[...]
[1]
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/configuration.html#cfg-multi-writes
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/spark.html#spark-write-dyn
[3] https://github.com/elasticsearch/elasticsearch-hadoop/issues/358
On 1/19/15 4:50 PM, J
... a complex workflow using Spark (Parsing, Cleaning, Machine
Learning).
At the end of the workflow I want to send aggregated results to
elasticsearch so my portal could query data.
There will be two types of processing: streaming and the possibility to
relaunch the workflow on all available data.
Right now I use elasticsearch-hadoop, and particularly the spark part, to
send documents to elasticsearch with the saveJsonToEs(myindex, mytype)
method.
The target is to have an index per day, using the proper template that we
build.
AFAIK you could not add [...]
For the record, what Spark and es-hadoop version are you using?
For each shard in your index, es-hadoop creates one Spark task which gets informed of the whereabouts of the underlying
shard.
So in your case, you would end up with 20 tasks/workers, one per shard,
streaming data back to the master.
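A quick way to see that shard-to-task mapping from the Spark side (a sketch; the index name is a placeholder):

  import org.elasticsearch.spark._

  val rdd = sc.esRDD("my-index/my-type")
  // one partition (and hence one task) per shard of the index
  println(rdd.partitions.length)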
I'm trying to get a Spark job running that pulls several million documents
from an Elasticsearch cluster for some analytics that cannot be done via
aggregations. It was my understanding that es-hadoop maintained data
locality when the Spark cluster was running alongside the Elasticsearch
cluster.
If you need some more info, let me know.
A dummy input file was placed in src/test/resources/input/input.txt
for the test to read.
I tested this with a gradle project (within my existing one), the
elasticsearch-hadoop 2.0.2 dependency, and Java 7.
You can see the exception being thrown in the console when running it.
[1]
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/troubleshooting.html
On 12/15/14 6:03 PM, Kamil Dziublinski wrote:
Hi,
I had only one jar on the classpath and none in the hadoop cluster.
I had different types of values in my MapWritable though. It turns out this was
the problem.
So I changed everything to be Text and it started working.
Is this intended behaviour?
Cheers,
Kamil.
On Friday, December 12, 2014 8:37:03 PM UTC+1, Costin Leau wrote:
>
> Hi,
>
> This error is typically tied to a classpath issue - make sure you have
> only one elasticsearch-hadoop jar
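For anyone else landing here, a sketch of a uniformly-typed document map on the write side (the field names are made up; the point is that keys are Text and values are concrete Writable types):

  import org.apache.hadoop.io.{IntWritable, MapWritable, Text}

  val doc = new MapWritable()
  doc.put(new Text("user"), new Text("kamil"))
  doc.put(new Text("visits"), new IntWritable(3))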
Thanks, I managed to fix this issue by `export
HADOOP_CLASSPATH=/path/to/my/elasticsearch-hadoop-2.0.2.jar`. I don't know
why, but it works. I had already configured that using -libjars; I don't
know why Hadoop needs me to specify it again using that global variable.
Another questi[...]
[1]
http://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html#Example%3A+WordCount+v2.0
[2]
http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
On 12/14/14 3:32 PM, CAI Longqi wrote:
Hello, I’m using elasticsearch-hadoop-2.0.2.jar, and I've hit this pr[...]
Hello, I’m using elasticsearch-hadoop-2.0.2.jar, and I've hit this problem:
Exception in thread "main" java.lang.NoClassDefFoundError:
org/elasticsearch/hadoop/mr/EsOutputFormat
at com.clqb.app.ElasticSearch.run(ElasticSearch.java:46)
at org.apache.hadoop.util.ToolRunner.run(ToolRunn[...]
Hi,
This error is typically tied to a classpath issue - make sure you have only one elasticsearch-hadoop jar version in your
classpath and on the Hadoop cluster.
On 12/12/14 5:56 PM, Kamil Dziublinski wrote:
Hi guys,
I am trying to run a MR job that reads from HDFS and stores into
Hi guys,
I am trying to run a MR job that reads from HDFS and stores into an
Elasticsearch cluster.
I am getting the following error:
Error:
org.elasticsearch.hadoop.serialization.EsHadoopSerializationException:
Cannot handle type [class org.apache.hadoop.io.MapWritable], instance
[org.apache.hadoop[...]
I have actually gone through the API and I get the big picture now.
I appreciate your help. Thanks! :)
On Wednesday, 3 December 2014 at 16:50:33 UTC+1, Costin Leau wrote:
>
> I'm not sure what you are expecting since the results are as expected. See
> the javadocs [1] for ArrayWritable.
> toStri
I'm not sure what you are expecting since the results are as expected. See the
javadocs [1] for ArrayWritable.
toStrings() returns a String[] while get() returns a Writable[]. In other words you get an array of Strings and Writables, and
neither implements toString natively.
To get the actual content yo[...]
OK, actually toStrings() returns the String[] array that has the contents, and
that solved my problem.
Thanks again Costin! :)
On Wednesday, 3 December 2014 at 16:29:38 UTC+1, Elias Abou Haydar wrote:
>
> I've tried to call toStrings()
> I got this :
> title : [Ljava.lang.String;@35112ff7
>
> wi
I've tried to call toStrings().
I got this:
title : [Ljava.lang.String;@35112ff7
With get(), I'm getting this:
title : [Lorg.apache.hadoop.io.Writable;@666f5678
On Wednesday, 3 December 2014 at 16:21:40 UTC+1, Costin Leau wrote:
>
> You're getting back an array ([Samsung EF-C])
I've already tried that. It doesn't work... :/
On Wednesday, 3 December 2014 at 16:21:40 UTC+1, Costin Leau wrote:
>
> You're getting back an array ([Samsung EF-C]) - a Writable wrapper
> around org.hadoop.io.ArrayWritable (to actually
> allow it to be serialized).
> So call toStrings() or ge
You're getting back an array ([Samsung EF-C]) - a Writable wrapper around org.apache.hadoop.io.ArrayWritable (to actually
allow it to be serialized).
So call toStrings() or get() to get its content.
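Put together, the unwrapping looks like this (a sketch; values stands for the MapWritable of one hit, as in the rest of this thread):

  import org.apache.hadoop.io.{ArrayWritable, Text}

  // multi-valued fields come back wrapped; toStrings() flattens to String[]
  val titles = values.get(new Text("title")).asInstanceOf[ArrayWritable].toStrings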
On 12/3/14 3:30 PM, Elias Abou Haydar wrote:
I've tried that. It returns a org.elasticsearch.hadoo
I've tried that. It returns an
org.elasticsearch.hadoop.mr.WritableArrayWritable object. How can I get my
field content out of that?
On Wednesday, 3 December 2014 at 14:10:24 UTC+1, Costin Leau wrote:
>
> That's because your MapWritables doesn't use Strings as keys but rather
> org.apache.hadoop.i
That's because your MapWritable doesn't use Strings as keys but rather
org.apache.hadoop.io.Text.
In other words, you can see the data is in the map, however you cannot retrieve it since you are using the wrong key (try
inspecting the map object types).
Try values.get(new Text("title"))
On 12
That works fine for me, thank you! But I also wanted to be able to build
an object from the MapWritable values in the mapper.
Consider values as a MapWritable object.
When I try to get a specific value, from values.get("title") for example,
the returned value is null but the field exists in th[...]
Simply specify the fields that you are interested in, in the query and you are
good to go.
[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html
On 12/2/14 12:52 PM, Elias Abou Haydar wrote:
I'm trying to write a mapreduce job where I can query e
I'm trying to write a mapreduce job where I can query elasticsearch so it
returns only specific fields to me. Is there any way to do that?
My mapping contains about 30 fields and I will need just 4 of them
("_id","title","description","category").
The way I was doing it is to process each answer t[...]
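Following Costin's suggestion above, a sketch of what that can look like in the job configuration (conf being the Hadoop Configuration; the resource name is a placeholder and the 1.x 'fields' syntax is assumed):

  // restrict each returned hit to the needed fields
  conf.set("es.resource", "catalog/products")
  conf.set("es.query",
    """{"fields": ["title", "description", "category"], "query": {"match_all": {}}}""")
  // the _id arrives as the key of each (key, value) pair, so it needn't be requested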
Hi Telax,
Even though I don't set the number of reduce tasks, Hadoop takes care of
starting reducers if needed. In my case one reducer is running. The issue here
is that the custom reducer defined as part of the Job configuration is not invoked by Hadoop.
Thanks,
Sarath
Hi,
Doesn't look like you've set the number of reduce tasks in your job config.
i.e.
'job.setNumReduceTasks(10);'
Hi Costin,
Thanks for the response. You are right - the Map/Reduce integration relies on
the Input/OutputFormat. Even after removing EsOutputFormat my custom
reducer is not invoked. It must be some issue with the hadoop configuration.
Thanks,
Sarath
... types, and the job fails silently
after invoking the context.write method. In fact, you can just remove the EsOutputFormat and see whether it makes any
difference (it shouldn't).
On 11/1/14 7:20 AM, Sarath wrote:
Hi All,
Does the Elasticsearch Hadoop WRITE operation not use our cust[...]
Hi,
I am planning to start using elasticsearch for tweet analytics on a company
product. Is it feasible to index and store data directly into HDFS
via elasticsearch, instead of storing raw data and fetching that raw data
back into elasticsearch? And can we communicate and do some comp[...]
Hi All,
Does the Elasticsearch Hadoop WRITE operation not use our custom reducer?
I tried with the following code and observed that our custom reducer is not
invoked.
job.setOutputFormatClass(EsOutputFormat.class);
job.setMapOutputKeyClass(NullWritable.class);
Is there an easy way to rename the fields on an index?
I have a field named "searchTerm" that I use for some event tracking. But
the elasticsearch-hadoop library assumes all elasticsearch fields are
lowercase and is converting all field names to lower case. When hadoop
tries to re
Thanks,
Jinyuan (Jack) Zhou
On Tue, Oct 14, 2014 at 1:36 PM, Costin Leau wrote:
> You need the appropriate hadoop jar on your classpath otherwise
> es-hadoop repository plugin cannot connect to HDFS. In the repo,
> you'll find two versions with vanilla hadoop1 and hadoop2 - however if
> you are
You need the appropriate hadoop jar on your classpath, otherwise the
es-hadoop repository plugin cannot connect to HDFS. In the repo,
you'll find two versions, with vanilla hadoop1 and hadoop2 - however, if
you are using a certain distro, for best compatibility you should use
that distro's client jars.
Plea[...]
My ES cluster nodes and Hadoop nodes are not collocated. The light version does
not work for me without putting in the correct versions of the hadoop-related
jars. Right now I don't want to create my own jar as Brent did, and I don't
want to install hadoop or copy jars onto the ES nodes either. Right now I[...]
Hi everyone,
Elasticsearch Hadoop 2.0.2 and 2.1 Beta2, featuring Apache Storm integration
and Apache Spark SQL, have been released.
You can read all about them here [1].
Feedback is welcome!
Cheers,
http://www.elasticsearch.org/blog/elasticsearch-hadoop-2-0-2-and-2-1-beta2/
--
Costin
... whether there's load building up or if anything unusual happens.
Since it's unclear what the issue might be, take baby steps [1] and start with minimal load (smaller bulk size + fewer
tasks), see whether there are any issues, and keep on going.
[1]
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/troubleshooting.html
... 1m, it means things are not
going well at all.
On 10/3/14 6:09 PM, Zach Cox wrote:
> Is there anything else we could try here to debug elasticsearch-hadoop
> being unable to write to Elasticsearch? We're
> still seeing the same number of these fails durin[...]
Is there anything else we could try here to debug elasticsearch-hadoop
being unable to write to Elasticsearch? We're still seeing the same number
of these fails during the nightly batch runs even after switching to
2.0.2.BUILD-SNAPSHOT,
and I don't see any additional lines from org.elasticsearch.hadoop.rest
Hi Costin - by "bulk size/entries number" are you referring to the
es.batch.size.bytes and es.batch.size.entries config values described here?
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/configuration.html#configuration-serialization
It looks like the only ela
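For reference, those are indeed the settings that control bulk request size; a sketch of dialing them down in the job configuration (the values here are illustrative - the defaults are 1mb and 1000):

  // smaller bulk requests put less instantaneous load on the ES cluster
  conf.set("es.batch.size.bytes", "500kb")
  conf.set("es.batch.size.entries", "500")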
What's your bulk size/entries number?
On 10/1/14 2:15 PM, Zach Cox wrote:
Hi Costin - we updated our dependencies to use elasticsearch-hadoop
2.0.2.BUILD-SNAPSHOT, but that didn't seem to change
anything. We're still seeing the same task failures while trying to write to
Elastic
Hi Costin - we updated our dependencies to use elasticsearch-hadoop
2.0.2.BUILD-SNAPSHOT, but that didn't seem to change anything. We're still
seeing the same task failures while trying to write to Elasticsearch. The
only difference in the logs is that now I don't [...]
Can you please try the 2.0.2.BUILD-SNAPSHOT? I think you might be running into issue #256 which was fixed some time ago
and will be part of the upcoming
2.0.2, 2.1 Beta2.
Cheers,
On 9/30/14 6:43 PM, Zach Cox wrote:
Hi Costin:
elasticsearch-hadoop 2.0.0
cascading 2.5.4
scalding 0.10.0
Thanks
Hi Costin:
elasticsearch-hadoop 2.0.0
cascading 2.5.4
scalding 0.10.0
Thanks,
Zach
On Tuesday, September 30, 2014 10:25:10 AM UTC-5, Costin Leau wrote:
>
> What version of es-hadoop/es/cascading are you using?
>
> On 9/30/14 6:16 PM, Zach Cox wrote:
> > Hi - we're havi
... like this:
https://gist.githubusercontent.com/zcox/3d6cf4329d49ca03271b/raw/57c46a5e4c9ea04d5c4209414d6f847492d16c0d/gistfile1.txt
Seems like elasticsearch-hadoop tries talking to an ES node, it times out,
tries the next one, it times out, etc., until all nodes in the cluster are
exhausted, and then it gives up.
As far as I can tell, the ES cluster is healthy while this is occurring.
Many map tasks are succeeding -
(http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/mapreduce.html).
My question is: is it possible to
parameterize the "my-collection"?
Thanks,
Jack
writing to dynamic/multi-resources
I saw the hadoop documentation regarding setting up the index for a mapreduce job
with EsOutputFormat. Below is the part regarding dynamically deciding the
document type
(http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/mapreduce.html).
My question is: is it possible to parameterize the[...]
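The dynamic/multi-resource section referenced above covers this: the write resource accepts {field} patterns. A sketch (the media-type field and doc type are illustrative):

  // each document's own media-type field picks the target resource at write time
  conf.set("es.resource.write", "my-collection-{media-type}/doc")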
(Much delayed) thank you Costin.
Indeed, on Ubuntu, changing ES_CLASSPATH to include hadoop and hadoop/lib
directories in /etc/default/elasticsearch (and exporting it in
/etc/init.d/elasticsearch) and installing light plugin version did work.
On Thursday, 14 August 2014 20:59:39 UTC, Costin Le
Hi,
The hdfs repository relies on vanilla Hadoop 2.2 since that's the official stable version of Yarn. Since you are using a
different
Hadoop version, use the 'light' version as explained in the docs - this contains only the repository-hdfs, without the
Hadoop dependency
(since you already hav
Hi everyone,
Elasticsearch Hadoop 2.0.1 and 2.1 Beta1, featuring native Apache Spark
integration, have been released.
You can read all about them here [1].
Feedback is welcome!
Cheers,
[1] http://www.elasticsearch.org/blog/es-hadoop-2-0-1-and-2-1-beta1/
--
Costin
I'm trying to get the es-hadoop repository plugin working on our hadoop
2.0.0-cdh4.6.0 distribution, and it seems like I'm quite lost.
I installed the plugin's -hadoop2 version on the machines in our hadoop cluster
(which also run our stage elasticsearch nodes).
When attempting to create a repository o[...]
may not be found. See JobConf(Class) or JobConf#setJar(String).
14/07/28 11:58:23 WARN mr.EsOutputFormat: Speculative execution enabled for
reducer - consider disabling it to prevent data corruption
14/07/28 11:58:23 INFO util.Version: Elasticsearch Hadoop v2.0.0
[eb4487f75f]
14/07/28 11:58:23 INFO mr.E
Have you looked at the docs?
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/mapreduce.html
On Fri, Jul 25, 2014 at 11:04 PM, M_20 wrote:
> Hi Guys,
>
> Could you please give me a java sample code of mapper and reducer in
> Elasticsearch-hadoop?
> I'd appreciate it.
Hi Guys,
Could you please give me a java sample code of mapper and reducer in
Elasticsearch-hadoop?
I'd appreciate it.
Thanks
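Since the docs were pointed to rather than pasted, here is a bare-bones sketch of the write side (Hadoop 2 APIs assumed; the node address, index name, and field are placeholders, and a map-only job is shown as the simplest case):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.MapWritable;
  import org.apache.hadoop.io.NullWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.elasticsearch.hadoop.mr.EsOutputFormat;

  public class EsWriteJob {

      // Map-only job: every input line becomes one Elasticsearch document.
      public static class LineMapper
              extends Mapper<Object, Text, NullWritable, MapWritable> {
          @Override
          protected void map(Object key, Text value, Context context)
                  throws java.io.IOException, InterruptedException {
              MapWritable doc = new MapWritable();
              doc.put(new Text("line"), new Text(value.toString())); // placeholder field
              context.write(NullWritable.get(), doc);
          }
      }

      public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();
          conf.set("es.nodes", "localhost:9200"); // placeholder
          conf.set("es.resource", "demo/lines"); // placeholder index/type
          // es-hadoop warns that speculative execution can corrupt writes
          conf.setBoolean("mapreduce.map.speculative", false);
          conf.setBoolean("mapreduce.reduce.speculative", false);

          Job job = Job.getInstance(conf, "es-write");
          job.setJarByClass(EsWriteJob.class);
          job.setMapperClass(LineMapper.class);
          job.setOutputFormatClass(EsOutputFormat.class);
          job.setMapOutputKeyClass(NullWritable.class);
          job.setMapOutputValueClass(MapWritable.class);
          job.setNumReduceTasks(0);
          FileInputFormat.addInputPath(job, new Path(args[0]));
          System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
  }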
I've read through much of the documentation for es-hadoop, but I might be
coming away with some misunderstandings.
The setup docs for elasticsearch for apache hadoop (es-hadoop) uses the
word *interact* which is a bit vague.
Elasticsearch for Apache Hadoop is an open-source, stand-alone,
> self-contained, small library that allows Hadoop jobs to interact with Elasticsearch.
Hi,
Issue #231 which I believe you have raised, has been fixed in 2.x - can you please try the latest 2.0.1.BUILD-SNAPSHOT
and report back?
Thanks!
On 7/15/14 9:32 AM, János Háber wrote:
Hi guys,
I'm writing a spark application where I want to use ES with Hadoop. I have a lot
of documents in[...]
Hi guys,
I'm writing a spark application where I want to use ES with Hadoop. I have a
lot of documents in ES, and now I want to aggregate, but I can't.
My documents have different fields, which means some have a "twitter" field
with values, some have "facebook", etc.
When I try to read the data from ES I g[...]
Hi Costin, thank you for your reply.
My issue actually came down to the ordering of my matches. I had a
'match:*' as the first dynamic template which disabled norms. Although this
template didn't explicitly define a type for any matched field it would
automatically set the 'date' field to a string
... gets included in the source
document. In elasticsearch, you can create and update documents without
having to include the id in the source document, so I think it would make
sense to be able to do that with elasticsearch-hadoop also.
On Thursday, July 10, 2014 5:49:18 PM UTC-4, Costin Leau wrote:
> You need to specify the id of the document you want to update somehow.
> Since in[...]
Make sure the template does match. This might not be always obvious however it's easy to test out. First, check your
template and after defining the template, send a request with a sample payload to see whether the doc gets properly
created. A common mistake is defining the template after the ind
... alternatives that you thought of?
Cheers,
On 7/7/14 10:48 PM, Brian Thomas wrote:
I am trying to update an elasticsearch index using elasticsearch-hadoop. I am
aware of the *es.mapping.id*
configuration where you can specify that field in the document to use as an id,
but in my case the source doc[...]
I am trying to update an elasticsearch index using elasticsearch-hadoop. I
am aware of the *es.mapping.id* configuration where you can specify that
field in the document to use as an id, but in my case the source document
does not have the id (I used elasticsearch's autogenerated id
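One direction worth checking (an assumption on my part - it needs a connector version that has es.mapping.exclude, not necessarily the one in this thread): carry the id in a field, point es.mapping.id at it, and exclude it from the stored source:

  conf.set("es.write.operation", "update")  // update existing documents
  conf.set("es.mapping.id", "docId")        // placeholder field carrying the ES id
  conf.set("es.mapping.exclude", "docId")   // keep that field out of the indexed source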
On Thu, Jul 3, 2014 at 8:58 PM, James Campbell wrote:
>
>> I would like to update an existing document that has an array from
>> elasticsearch-hadoop.
>>
>> I notice that I can do that from curl directly, for example:
>>
>> PUT arraydemo/temp/1
>> {
>> [...]