RE: Dataframe nested schema inference from Json without type conflicts

2015-10-23 Thread Ewan Leith
al message-- From: Reynold Xin Date: Thu, 1 Oct 2015 22:12 To: Ewan Leith; Cc: dev@spark.apache.org; Subject: Re: Dataframe nested schema inference from Json without type conflicts You can pass the schema into json directly, can't you? On Thu

Re: Dataframe nested schema inference from Json without type conflicts

2015-10-05 Thread Yin Huai
> -- Original message-- From: Yin Huai Date: Thu, 1 Oct 2015 23:54 To: Ewan Leith; Cc: r...@databricks.com; dev@spark.apache.org; Subject: Re: Dataframe nested schema inference from Json without type conflicts

Re: Dataframe nested schema inference from Json without type conflicts

2015-10-05 Thread Ewan Leith
Thanks Yin, I'll put together a JIRA and a PR tomorrow. Ewan -- Original message-- From: Yin Huai Date: Mon, 5 Oct 2015 17:39 To: Ewan Leith; Cc: dev@spark.apache.org; Subject: Re: Dataframe nested schema inference from Json without type conflicts Hello Ewan, Adding a JSON

RE: Dataframe nested schema inference from Json without type conflicts

2015-10-05 Thread Ewan Leith
he.org>; Subject: Re: Dataframe nested schema inference from Json without type conflicts Hi Ewan, For your use case, you only need the schema inference to pick up the structure of your data (basically you want Spark SQL to infer the type of complex values like arrays and structs but keep
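
The snippet above is cut off after "but keep". Assuming it continues along the lines of leaving primitive values untyped (for example, as strings), the idea can be sketched in plain Python. This is a toy illustration, not Spark SQL's implementation; `infer_structure` and the sample records are hypothetical.

```python
import json

def infer_structure(value):
    """Infer only the shape of a JSON value (hypothetical helper).

    Structs (objects) and arrays are preserved; every primitive,
    including null, is reported as "string" (an assumption of this sketch).
    """
    if isinstance(value, dict):
        return {key: infer_structure(v) for key, v in value.items()}
    if isinstance(value, list):
        # Use the first element as the representative; an empty array
        # defaults to string elements in this sketch.
        element = infer_structure(value[0]) if value else "string"
        return [element]
    return "string"

# Two records whose primitive types disagree but whose structure matches.
rec_a = json.loads('{"user": {"id": 1, "tags": ["a"]}, "score": 3}')
rec_b = json.loads('{"user": {"id": 2.5, "tags": ["b", "c"]}, "score": "n/a"}')

shape_a = infer_structure(rec_a)
shape_b = infer_structure(rec_b)
```

Because only the nesting is inferred, both records yield the same shape even though `id` and `score` carry different primitive types, so batches would not disagree on primitive types.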

Dataframe nested schema inference from Json without type conflicts

2015-10-01 Thread Ewan Leith
Hi all, We really like the ability to infer a schema from JSON contained in an RDD, but when we're using Spark Streaming on small batches of data, we sometimes find that Spark infers a more specific type than it should use, for example if the JSON in that small batch only contains integer
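
To make the problem described above concrete, here is a toy sketch in plain Python of how per-batch inference can disagree about a field's type. It is not Spark's inference code; `infer_type`, `merge_types`, `infer_batch_schema`, the type names, and the sample batches are all hypothetical, and the sketch only handles flat records.

```python
import json

def infer_type(value):
    """Toy type inference for one flat JSON value (hypothetical helper)."""
    if isinstance(value, bool):   # bool must be checked before int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    return "string"               # everything else is treated as a string here

def merge_types(a, b):
    """Widen two inferred types seen for the same field within one batch."""
    if a == b:
        return a
    if {a, b} == {"long", "double"}:
        return "double"
    return "string"

def infer_batch_schema(lines):
    """Infer a field -> type mapping from one batch of JSON lines."""
    schema = {}
    for line in lines:
        for key, value in json.loads(line).items():
            inferred = infer_type(value)
            schema[key] = merge_types(schema[key], inferred) if key in schema else inferred
    return schema

# Two small batches of the same logical stream.
batch_1 = ['{"id": 1, "price": 10}', '{"id": 2, "price": 12}']
batch_2 = ['{"id": 3, "price": 10.5}']

schema_1 = infer_batch_schema(batch_1)  # price only ever appears as an integer
schema_2 = infer_batch_schema(batch_2)  # price appears as a decimal
```

The first batch happens to contain only integer prices, so the two batches end up with different types for `price`, which is the kind of conflict the message describes.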

Re: Dataframe nested schema inference from Json without type conflicts

2015-10-01 Thread Reynold Xin
You can pass the schema into json directly, can't you? On Thu, Oct 1, 2015 at 10:33 AM, Ewan Leith wrote: > Hi all, We really like the ability to infer a schema from JSON contained in an RDD, but when we’re using Spark Streaming on small batches of data, we

Re: Dataframe nested schema inference from Json without type conflicts

2015-10-01 Thread Yin Huai
Date: Thu, 1 Oct 2015 22:12 To: Ewan Leith; Cc: dev@spark.apache.org; Subject: Re: Dataframe nested schema inference from Json without type conflicts > You can pass the schema into json directly, can't you? On Thu, Oct 1, 2015 at 10:33 AM

Re: Dataframe nested schema inference from Json without type conflicts

2015-10-01 Thread Ewan Leith
probably have to adopt if we can't come up with a way to keep the inference working. Thanks, Ewan -- Original message-- From: Reynold Xin Date: Thu, 1 Oct 2015 22:12 To: Ewan Leith; Cc: dev@spark.apache.org; Subject: Re: Dataframe nested schema inference from Json without type

Re: Dataframe nested schema inference from Json without type conflicts

2015-10-01 Thread Ewan Leith
Exactly, that's a much better way to put it. Thanks, Ewan -- Original message-- From: Yin Huai Date: Thu, 1 Oct 2015 23:54 To: Ewan Leith; Cc: r...@databricks.com; dev@spark.apache.org; Subject: Re: Dataframe nested schema inference from Json without type conflicts Hi Ewan