Re: Providing query DSL to Elasticsearch for Spark (2.1.0.Beta3)

2014-12-18 Thread Ian Wilkinson
Quick follow-up: this works sweetly with spark-1.1.1-bin-hadoop2.4.
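
For anyone following along, the working shape is roughly the sketch below. The index/type ("campaigns/campaign") and the match_all query are illustrative stand-ins, not my real ones. As I read the es-hadoop docs, a query string starting with "?" is taken as a URI query, one starting with "{" as inline query DSL, and anything else as a path to a resource holding the query, which is where "Cannot open stream for resource" comes from.

import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._

object TryES {
  def main(args: Array[String]) {
    val sparkConf = new SparkConf().setAppName("Campaigns")
    sparkConf.set("es.nodes", "es_cluster:9200")
    sparkConf.set("es.nodes.discovery", "false")
    val sc = new SparkContext(sparkConf)

    // Inline query DSL: the string must start with '{' to be detected as DSL.
    val query = """{"query":{"match_all":{}}}"""

    // "campaigns/campaign" is an illustrative index/type.
    val campaigns = sc.esRDD("campaigns/campaign", query)
    println(campaigns.count())
  }
}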


 On Dec 3, 2014, at 3:31 PM, Ian Wilkinson ia...@me.com wrote:
 
 Hi,
 
 I'm trying the Elasticsearch support for Spark (2.1.0.Beta3).
 
 In the following I provide the query (as query DSL):
 
 
 import org.apache.spark.{SparkConf, SparkContext}
 import org.elasticsearch.spark._
 
 object TryES {
   val sparkConf = new SparkConf().setAppName("Campaigns")
   sparkConf.set("es.nodes", "es_cluster:9200")
   sparkConf.set("es.nodes.discovery", "false")
   val sc = new SparkContext(sparkConf)
 
   def main(args: Array[String]) {
     val query = """{
       "query": {
         ...
       }
     }"""
 
     val campaigns = sc.esRDD(resource, query)
     campaigns.count()
   }
 }
 
 
 However, when I submit this (using spark-1.1.0-bin-hadoop2.4),
 I get the following exception:
 
 14/12/03 14:55:27 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, 
 whose tasks have all completed, from pool
 14/12/03 14:55:27 INFO scheduler.DAGScheduler: Failed to run count at 
 TryES.scala:...
 Exception in thread "main" org.apache.spark.SparkException: Job aborted due 
 to stage failure: Task 1 in stage 0.0 failed 1 times, most recent failure: 
 Lost task 1.0 in stage 0.0 (TID 1, localhost): 
 org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot open stream 
 for resource {
   "query": {
   ...
   }
 }
 
 
 Is the query DSL supported with esRDD, or am I missing something
 more fundamental?
 
 Huge thanks,
 ian
 


-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Providing query DSL to Elasticsearch for Spark (2.1.0.Beta3)

2014-12-03 Thread Ian Wilkinson
Hi,

I'm trying the Elasticsearch support for Spark (2.1.0.Beta3).

In the following I provide the query (as query DSL):


import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._

object TryES {
  val sparkConf = new SparkConf().setAppName("Campaigns")
  sparkConf.set("es.nodes", "es_cluster:9200")
  sparkConf.set("es.nodes.discovery", "false")
  val sc = new SparkContext(sparkConf)

  def main(args: Array[String]) {
    val query = """{
      "query": {
        ...
      }
    }"""

    val campaigns = sc.esRDD(resource, query)
    campaigns.count()
  }
}


However, when I submit this (using spark-1.1.0-bin-hadoop2.4),
I get the following exception:

14/12/03 14:55:27 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose 
tasks have all completed, from pool
14/12/03 14:55:27 INFO scheduler.DAGScheduler: Failed to run count at 
TryES.scala:...
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to 
stage failure: Task 1 in stage 0.0 failed 1 times, most recent failure: Lost 
task 1.0 in stage 0.0 (TID 1, localhost): 
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot open stream 
for resource {
   "query": {
   ...
   }
}


Is the query DSL supported with esRDD, or am I missing something
more fundamental?
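
For reference, my reading of the es-hadoop docs is that the query can also be supplied through the configuration rather than as an esRDD argument; a sketch (the index/type is illustrative):

import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._

object TryESConf {
  def main(args: Array[String]) {
    val sparkConf = new SparkConf().setAppName("Campaigns")
    sparkConf.set("es.nodes", "es_cluster:9200")
    sparkConf.set("es.nodes.discovery", "false")
    // Both the target index/type and the query come from the configuration,
    // so the no-argument esRDD() can pick them up.
    sparkConf.set("es.resource", "campaigns/campaign")
    sparkConf.set("es.query", """{"query":{"match_all":{}}}""")
    val sc = new SparkContext(sparkConf)

    val campaigns = sc.esRDD()
    println(campaigns.count())
  }
}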

Huge thanks,
ian
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org