Thanks for the tip. I realized that, and I ended up using explode as you said.

This is my attempt:

    val res = (df.explode("rows", "r") { l: WrappedArray[ArrayBuffer[String]] => l.toList })
      .select("r")
      .map { m => m.getList[Row](0) }
    val u = res.map { m => Row.fromSeq(m.toSeq) }
    val df1 = df.sqlContext.createDataFrame(u, getScheme(df))

It works OK, but it throws an invalid cast to Integer if the schema has any IntegerType column. It looks like a spark-csv bug, but I can work around it anyway.

Thanks for the help.

On Thu, Jan 28, 2016 at 7:43 PM, Mohammed Guller <moham...@glassbeam.com> wrote:

> You don’t need Hive for that. The DataFrame class has a method named
> explode, which provides the same functionality.
>
> Here is an example from the Spark API documentation:
>
> df.explode("words", "word"){words: String => words.split(" ")}
>
> The first argument to the explode method is the name of the input column
> and the second argument is the name of the output column.
>
> Mohammed
>
> Author: Big Data Analytics with Spark
> <http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>
>
> *From:* Andrés Ivaldi [mailto:iaiva...@gmail.com]
> *Sent:* Wednesday, January 27, 2016 7:17 PM
> *To:* Cheng, Hao
> *Cc:* Sahil Sareen; Al Pivonka; user
> *Subject:* Re: JSON to SQL
>
> I'm using DataFrames, reading the JSON exactly as you say, and I can get
> the schema from there. Reading the documentation, I realized that it is
> possible to create a structure dynamically, so by applying some
> transformations to the DataFrame plus the new structure I'll be able to
> save the JSON to my RDBMS.
>
> For the flatten approach, you mentioned LateralView. Do I need a Hive DB for
> that, or just the Spark HiveContext? I saw some examples and that is
> exactly what I need. Can you explain it a little more?
>
> Thanks
>
> On Wed, Jan 27, 2016 at 10:29 PM, Cheng, Hao <hao.ch...@intel.com> wrote:
>
> Have you ever tried the DataFrame API, like
> sqlContext.read.json("/path/to/file.json")? Spark SQL will automatically
> infer the type/schema for you.
>
> And lateral view will help with the flattening issues,
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView,
> as well as the “a.b[0].c” format of expression.
>
> *From:* Andrés Ivaldi [mailto:iaiva...@gmail.com]
> *Sent:* Thursday, January 28, 2016 3:39 AM
> *To:* Sahil Sareen
> *Cc:* Al Pivonka; user
> *Subject:* Re: JSON to SQL
>
> I'm really brand new to Scala, but if I'm defining a case class, isn't that
> because I already know the JSON's structure beforehand?
>
> If I'm able to define a case class dynamically from the JSON structure, then
> even with Spark I will be able to extract the data.
>
> On Wed, Jan 27, 2016 at 4:01 PM, Sahil Sareen <sareen...@gmail.com> wrote:
>
> Isn't this just about defining a case class and using
> parse(json).extract[CaseClassName] using Jackson?
>
> -Sahil
>
> On Wed, Jan 27, 2016 at 11:08 PM, Andrés Ivaldi <iaiva...@gmail.com>
> wrote:
>
> We don't have domain objects. It's a service like a pipeline: data is read
> from the source and saved in a relational database.
>
> I can read the structure from DataFrames and do some transformations. I
> would prefer to do it with Spark to be consistent with the process.
>
> On Wed, Jan 27, 2016 at 12:25 PM, Al Pivonka <alpivo...@gmail.com> wrote:
>
> Are you using a relational database?
>
> If so, why not use a NoSQL DB, then pull from it into your relational one?
>
> Or utilize a library that understands JSON structure, like Jackson, to
> obtain the data from the JSON structure, then persist the domain objects?
>
> On Wed, Jan 27, 2016 at 9:45 AM, Andrés Ivaldi <iaiva...@gmail.com> wrote:
>
> Sure,
>
> The job is like an ETL, but without an interface, so I decide the rules for
> how the JSON will be saved into a SQL table.
>
> I need to flatten the hierarchies where possible; lists should be
> flattened too. Nested objects won't be processed for now.
>
> {"a":1,"b":[2,3],"c":"Field", "d":[4,5,6,7,8] }
> {"a":11,"b":[22,33],"c":"Field1", "d":[44,55,66,77,88] }
> {"a":111,"b":[222,333],"c":"Field2", "d":[444,555,666,777,888] }
>
> I would like something like this in my SQL table:
>
> a    b        c       d
> 1    2,3      Field   4,5,6,7,8
> 11   22,33    Field1  44,55,66,77,88
> 111  222,333  Field2  444,555,666,777,888
>
> Right now this is what I need.
>
> I will later add more intelligence, like detection of lists or nested
> objects, and create relations in other tables.
>
> On Wed, Jan 27, 2016 at 11:25 AM, Al Pivonka <alpivo...@gmail.com> wrote:
>
> More detail is needed.
>
> Can you provide some context for the use case?
>
> On Wed, Jan 27, 2016 at 8:33 AM, Andrés Ivaldi <iaiva...@gmail.com> wrote:
>
> Hello, I'm trying to save a JSON file into a SQL table.
>
> If I try to do this directly, an IllegalArgumentException is raised. I
> suppose this is because JSON has a hierarchical structure, is that
> correct?
>
> If that is the problem, how can I flatten the JSON structure? The JSON
> structure to be processed is unknown, so I need to do it
> programmatically.
>
> regards
>
> --
> Ing. Ivaldi Andres
>
> --
> Those who say it can't be done, are usually interrupted by those doing it.

--
Ing. Ivaldi Andres
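
For reference, the flattening asked for in the first message of this thread (array columns written as comma-separated values, everything else left alone) might be sketched as below. This is only an illustration, not the code used above: it assumes a Spark 1.5+ DataFrame API, the names FlattenSketch, isSimple and flattenArrays are made up for the example, and the sample records are adapted from the question with "c" written as valid JSON. Nested objects and arrays of objects are passed through untouched, so they would still need separate handling before a JDBC write.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Column, DataFrame, SQLContext}
import org.apache.spark.sql.functions.{col, concat_ws}
import org.apache.spark.sql.types._

object FlattenSketch {
  // Element types this sketch is willing to cast to string (an assumption; extend as needed).
  private def isSimple(t: DataType): Boolean = t match {
    case StringType | BooleanType | IntegerType | LongType | DoubleType => true
    case _ => false
  }

  // Replace every array-of-simple-values column with a comma-separated string column.
  // The schema is read at runtime, so the JSON structure need not be known in advance.
  def flattenArrays(df: DataFrame): DataFrame = {
    val cols: Seq[Column] = df.schema.fields.toSeq.map { f =>
      f.dataType match {
        case ArrayType(elem, _) if isSimple(elem) =>
          // concat_ws expects strings, so cast the array elements to string first.
          concat_ws(",", col(f.name).cast(ArrayType(StringType))).as(f.name)
        case _ =>
          col(f.name) // scalars and nested objects are left as they are
      }
    }
    df.select(cols: _*)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("flatten-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Two records adapted from the question.
    val json = sc.parallelize(Seq(
      """{"a":1,"b":[2,3],"c":"Field","d":[4,5,6,7,8]}""",
      """{"a":11,"b":[22,33],"c":"Field1","d":[44,55,66,77,88]}"""
    ))

    val df = sqlContext.read.json(json) // schema is inferred from the data
    flattenArrays(df).show()
    sc.stop()
  }
}
```

Compared with explode, which produces one output row per array element, this keeps one row per JSON record, which is the shape of the table in the question.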