Hi,

I'm trying out the Elasticsearch support for Spark (elasticsearch-hadoop 2.1.0.Beta3).

Below is my code; the query is passed as a query DSL (JSON) string:


import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._

object TryES {
  val sparkConf = new SparkConf().setAppName("Campaigns")
  sparkConf.set("es.nodes", "<es_cluster>:9200")
  sparkConf.set("es.nodes.discovery", "false")
  val sc = new SparkContext(sparkConf)

  def main(args: Array[String]) {
    val query = """"{
   "query": {
      ...
   }
}
"""
    val campaigns = sc.esRDD("<resource>", query)
    campaigns.count()
  }
}
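
For reference, the other query form I understand esRDD to accept (per the elasticsearch-hadoop docs) is a URI-style "?q=..." string. A minimal sketch of what I mean, with a placeholder field/value rather than my real query:

// Same SparkContext (sc) as above; the query is passed as a URI
// query string instead of an inline JSON DSL body.
val uriQuery = "?q=somefield:somevalue"  // placeholder field/value
val campaignsByUri = sc.esRDD("<resource>", uriQuery)
campaignsByUri.count()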


However, when I submit this (using spark-1.1.0-bin-hadoop2.4), I get the exception below.
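The submit command is approximately the following (the class and jar names here are placeholders for my actual build):

spark-submit --class TryES \
  --jars elasticsearch-hadoop-2.1.0.Beta3.jar \
  tryes.jar

And the exception: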

14/12/03 14:55:27 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/12/03 14:55:27 INFO scheduler.DAGScheduler: Failed to run count at TryES.scala:<...>
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 1 times, most recent failure: Lost task 1.0 in stage 0.0 (TID 1, localhost): org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot open stream for resource "{
   "query": {
       ...
   }
}


Is the query DSL supported with esRDD, or am I missing something more fundamental?

Huge thanks,
ian