Hi, I'm trying the Elasticsearch support for Spark (2.1.0.Beta3).
In the following I provide the query (as query DSL):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.elasticsearch.spark._

    object TryES {
      val sparkConf = new SparkConf().setAppName("Campaigns")
      sparkConf.set("es.nodes", "<es_cluster>:9200")
      sparkConf.set("es.nodes.discovery", "false")
      val sc = new SparkContext(sparkConf)

      def main(args: Array[String]) {
        val query = """"{ "query": { ... } } """
        val campaigns = sc.esRDD("<resource>", query)
        campaigns.count()
      }
    }

However, when I submit this (using spark-1.1.0-bin-hadoop2.4), I get the following exception:

    14/12/03 14:55:27 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
    14/12/03 14:55:27 INFO scheduler.DAGScheduler: Failed to run count at TryES.scala:<...>
    Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 1 times, most recent failure: Lost task 1.0 in stage 0.0 (TID 1, localhost): org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot open stream for resource "{ "query": { ... } }

Is the query DSL supported with esRDD, or am I missing something more fundamental?

Huge thanks,
ian
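For reference, a minimal sketch of how I understand the call is meant to look. Note the exception says ES-Hadoop tried to "open stream for resource" for a string beginning with a literal double quote: in my snippet the string literal opens with four quotes (`""""{`), so the triple-quoted string's first character is `"`, not `{`, and ES-Hadoop appears to fall back to treating the argument as a resource path instead of query DSL. The sketch below assumes that extra quote is the culprit; `match_all` is just a placeholder query, and `<es_cluster>`/`<resource>` stand in for my actual cluster address and index/type:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.elasticsearch.spark._

    object TryESFixed {
      def main(args: Array[String]): Unit = {
        val sparkConf = new SparkConf().setAppName("Campaigns")
        sparkConf.set("es.nodes", "<es_cluster>:9200")
        sparkConf.set("es.nodes.discovery", "false")
        val sc = new SparkContext(sparkConf)

        // Plain triple-quoted string: the first character is '{',
        // so the argument should be parsed as inline query DSL
        // rather than opened as a resource path.
        val query = """{ "query": { "match_all": {} } }"""

        val campaigns = sc.esRDD("<resource>", query)
        println(campaigns.count())
      }
    }

Does that reading of the error look right?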