Spark does not allow executor code to use the `SparkSession`.
But I think you can move all the JSON files into one directory and then run:

```
spark.read.json("/path/to/jsonFileDir")
```
But if you also want the filename at the same time, you can use
```
spark.sparkContext.wholeTextFiles("/path/to/jsonFileDir")...
```
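
If what you need is each record tagged with its source file, another route (available since Spark 1.6) is the `input_file_name()` SQL function. A minimal sketch, assuming the same `spark` session and directory path as above:

```
import org.apache.spark.sql.functions.input_file_name

// Read every JSON file in the directory and tag each row
// with the path of the file it came from.
val dfWithName = spark.read
  .json("/path/to/jsonFileDir")
  .withColumn("fileName", input_file_name())

dfWithName.show(truncate = false)
```

Unlike `wholeTextFiles`, this keeps the normal DataFrame reader (schema inference, column pruning) and avoids loading each file as a single string.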

On Thu, Sep 21, 2017 at 9:18 PM, Riccardo Ferrari <ferra...@gmail.com>
wrote:

> Depends on your use-case however broadcasting
> <https://spark.apache.org/docs/2.2.0/rdd-programming-guide.html#broadcast-variables>
> could be a better option.
>
> On Thu, Sep 21, 2017 at 2:03 PM, Chackravarthy Esakkimuthu <
> chaku.mi...@gmail.com> wrote:
>
>> Hi,
>>
>> I want to know how to pass the SparkSession from the driver to the executors.
>>
>> I have a spark program (batch job) which does following,
>>
>> #################
>>
>> val spark = SparkSession.builder().appName("SampleJob")
>>   .config("spark.master", "local").getOrCreate()
>>
>> val df = ... // a DataFrame holding a list of HDFS file names
>>
>> df.foreach { fileName =>
>>
>>       *spark.read.json(fileName)*
>>
>>       ...... some logic here....
>> }
>>
>> #################
>>
>>
>> *spark.read.json(fileName) --- this fails because it runs on an executor.
>> When I put it outside the foreach, i.e. in the driver, it works.*
>>
>> I am trying to use spark (the SparkSession) in an executor, but it is not
>> visible outside the driver. I still want to read HDFS files inside the
>> foreach. How do I do it?
>>
>> Can someone help with this?
>>
>> Thanks,
>> Chackra
>>
>
>
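
The broadcast suggestion quoted above can be sketched like this: read the (small) files once on the driver, then broadcast the parsed result so executors use the broadcast value instead of the SparkSession. The path and the column names `"key"` / `"value"` are illustrative assumptions, not from the original thread:

```
// Driver side: read a small lookup file once and collect it locally.
// Path and column names here are hypothetical.
val lookup: Map[String, String] = spark.read.json("/path/to/lookup.json")
  .collect()
  .map(row => row.getAs[String]("key") -> row.getAs[String]("value"))
  .toMap

// Broadcast the plain Scala map; Spark ships it to each executor once.
val lookupBc = spark.sparkContext.broadcast(lookup)

// Executor side: use the broadcast value, never the SparkSession.
df.foreach { row =>
  val hit = lookupBc.value.get(row.getString(0))
  // ... some logic here ...
}
```

This only works when the broadcast data comfortably fits in driver and executor memory; for large files, the `spark.read.json` on a whole directory (as in the first reply) is the better fit.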
