like this line to load it directly:
val jsonSchema = sqlContext.read.json("path/to/schema").schema
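A minimal sketch of the idea above, assuming a spark-shell style `sqlContext` already in scope and placeholder paths: infer the schema once from a small sample file, then pass it to the reader so the full 1 TB read skips the inference pass entirely.

```scala
// "path/to/schema" and "path/to/full/data" are placeholders; adjust to your data.
// Read a small file that shares the data's structure and keep only its schema.
val jsonSchema = sqlContext.read.json("path/to/schema").schema

// Supply the known schema up front; Spark then skips the schema-inference
// scan and goes straight to reading the records.
val people = sqlContext.read.schema(jsonSchema).json("path/to/full/data")
```

The same `DataFrameReader.schema(...)` call also works with a `StructType` built by hand, if you would rather declare the schema in code than keep a sample file around.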
Thanks,
Ewan
From: Gavin Yue [mailto:yue.yuany...@gmail.com]
Sent: 06 January 2016 07:14
To: user <user@spark.apache.org>
Subject: How to accelerate reading json file?
Hi all
I want to ask: how exactly does reading a >1 TB file differ on a standalone
cluster vs. a YARN or Mesos cluster?
On Wednesday 6 January 2016, Gavin Yue wrote:
> I am trying to read json files following the example:
>
> val path =
Hi,
You can try this:
sqlContext.read.format("json").option("samplingRatio","0.1").load("path")
If it still takes too long, feel free to experiment with the samplingRatio.
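To spell out the suggestion above (path is a placeholder, and this assumes a spark-shell style `sqlContext` in scope): samplingRatio tells the JSON source to infer the schema from only a fraction of the input records instead of scanning everything, trading inference accuracy for speed.

```scala
// samplingRatio in (0.0, 1.0]: the fraction of records scanned during
// schema inference. 0.1 means roughly 10% of the input is examined.
val df = sqlContext.read
  .format("json")
  .option("samplingRatio", "0.1")
  .load("path/to/json")
```

One caveat worth keeping in mind: a field that only appears in records outside the sample can be missed from the inferred schema, so lower ratios are a gamble if your JSON is irregular.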
Thanks,
Vishnu
On Wed, Jan 6, 2016 at 12:43 PM, Gavin Yue wrote:
I am trying to read json files following the example:
val path = "examples/src/main/resources/jsonfile"
val people = sqlContext.read.json(path)
I have 1 TB of files in the path. It took 1.2 hours to read them and infer
the schema.
But I already know the schema. Could I make this