Looking at SQLContext.scala (in the master branch), jsonFile() returns a DataFrame directly:

  def jsonFile(path: String, samplingRatio: Double): DataFrame =
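For reference, a minimal sketch of the three-line approach ayan describes below, assuming Spark 1.3+ and a JSON file at a hypothetical path (/data/people.json) that exists on every worker node; the table and column names are made up for illustration:

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.SQLContext

  val conf = new SparkConf().setAppName("JsonSqlExample")
  val sc = new SparkContext(conf)
  val sqlContext = new SQLContext(sc)

  // jsonFile() infers the schema and returns a DataFrame directly
  val df = sqlContext.jsonFile("/data/people.json")

  // Register the DataFrame as a temp table so it can be queried with SQL
  df.registerTempTable("people")

  // The query is executed in a distributed fashion across the cluster
  val results = sqlContext.sql("SELECT name, age FROM people WHERE age > 21")
  results.show()

Since the DataFrame already comes back from jsonFile(), there is no separate RDD-to-DataFrame conversion step needed.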
FYI

On Sun, May 3, 2015 at 2:14 AM, ayan guha <guha.a...@gmail.com> wrote:

> Yes, it is possible. You need to use the jsonFile method on the SQLContext
> and then create a DataFrame from the RDD. Then register it as a table. It
> should be 3 lines of code, thanks to Spark.
>
> You may also want to watch a few YouTube videos, especially those on
> unifying pipelines.
>
> On 3 May 2015 19:02, "Jai" <jai4l...@gmail.com> wrote:
>
>> Hi,
>>
>> I am a noob to Spark and related technology.
>>
>> I have JSON stored at the same location on all worker nodes of the Spark
>> cluster. I am looking to load this JSON data set on these nodes and run
>> SQL queries against it, like distributed SQL.
>>
>> Is it possible to achieve this?
>>
>> Right now, the master submits the task to one node only.
>>
>> Thanks and regards,
>> Mrityunjay