Hi community ,
As shown in other answers online , Spark does not support the nesting of DataFrames , but what are the options? I have the following scenario : dataFrame1 = List of Cities dataFrame2 = Created after searching in ElasticSearch for each city in dataFrame1 I've tried : val cities = sc.parallelize(Seq("New York")).toDF() cities.foreach(r => { val companyName = r.getString(0) println(companyName) val dfs = sqlContext.esDF("cities/docs", "?q=" + companyName) //returns a DataFrame consisting of all the cities matching the entry in cities }) Which triggers the expected null pointer exception java.lang.NullPointerException at org.elasticsearch.spark.sql.EsSparkSQL$.esDF(EsSparkSQL.scala:53) at org.elasticsearch.spark.sql.EsSparkSQL$.esDF(EsSparkSQL.scala:51) at org.elasticsearch.spark.sql.package$SQLContextFunctions.esDF(package.scala:3 7) at Main$$anonfun$main$1.apply(Main.scala:43) at Main$$anonfun$main$1.apply(Main.scala:39) at scala.collection.Iterator$class.foreach(Iterator.scala:742) at scala.collection.AbstractIterator.foreach(Iterator.scala:1194) at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scal a:921) at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scal a:921) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:206 7) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:206 7) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:109) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:11 49) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:6 24) at java.lang.Thread.run(Thread.java:748) 2018-12-28 02:01:00 ERROR TaskSetManager:70 - Task 7 in stage 0.0 failed 1 times; aborting job Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 0.0 failed 1 times, most recent failure: Lost task 7.0 in stage 0.0 (TID 7, localhost, executor driver): java.lang.NullPointerException What options do I have? Thank you.