RE: Flattening XML in a DataFrame

2016-08-21 Thread srikanth.jella
Hi Hyukjin, I have created the issue below: https://github.com/databricks/spark-xml/issues/155

Re: Flattening XML in a DataFrame

2016-08-16 Thread Hyukjin Kwon
> Cc: 'user @spark' <user@spark.apache.org>
> Subject: Re: Flattening XML in a DataFrame
>
> Hi Hyukjin Kwon,
>
> Thank you for reply.
>
> There are several types of XML documents with different schema which needs
> to be parsed and tag names do not kno

RE: Flattening XML in a DataFrame

2016-08-16 Thread Sreekanth Jella
Hi Experts, Please suggest. Thanks in advance. Thanks, Sreekanth
From: Sreekanth Jella [mailto:srikanth.je...@gmail.com]
Sent: Sunday, August 14, 2016 11:46 AM
To: 'Hyukjin Kwon' <gurwls...@gmail.com>
Cc: 'user @spark' <user@spark.apache.org>
Subject: Re: Fl

Re: Flattening XML in a DataFrame

2016-08-14 Thread Sreekanth Jella
Hi Hyukjin Kwon, Thank you for the reply. There are several types of XML documents with different schemas that need to be parsed, and the tag names are not known in advance. All we know is the XSD for the given XML. Is it possible to get the same results even when we do not know the XML tags like
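One way to approach the question above, flattening without knowing the tag names, is to drive the projection from the schema that spark-xml infers rather than from hard-coded column names. The following is only a minimal sketch under stated assumptions, not something proposed in this thread: the object name `FlattenSketch`, the helper `quoted`, and the `parent_child` naming of flattened columns are all hypothetical choices made for illustration. It expands struct columns into top-level columns and explodes array columns one at a time until no nested types remain.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{col, explode}
import org.apache.spark.sql.types.{ArrayType, StructField, StructType}

import scala.annotation.tailrec

// Hypothetical helper object; not part of Spark or spark-xml.
object FlattenSketch {

  // Backtick-quote a name so that dots inside it are not parsed as field access.
  private def quoted(name: String): String = s"`$name`"

  // Repeatedly rewrites the first struct or array column until none are left.
  @tailrec
  def flatten(df: DataFrame): DataFrame = {
    val fields = df.schema.fields.toSeq
    val firstNested = fields.find { f =>
      f.dataType match {
        case _: StructType | _: ArrayType => true
        case _ => false
      }
    }
    firstNested match {
      // Struct: replace the column with one top-level column per child field.
      // The "parent_child" naming is an assumption and can collide with
      // existing column names.
      case Some(StructField(name, struct: StructType, _, _)) =>
        val cols: Seq[Column] = fields.flatMap { f =>
          if (f.name == name)
            struct.fields.toSeq.map { c =>
              col(s"${quoted(name)}.${quoted(c.name)}").as(s"${name}_${c.name}")
            }
          else Seq(col(quoted(f.name)))
        }
        flatten(df.select(cols: _*))

      // Array: explode into one row per element, keeping the column name.
      // explode drops rows whose array is null or empty.
      case Some(StructField(name, _: ArrayType, _, _)) =>
        val cols: Seq[Column] = fields.map { f =>
          if (f.name == name) explode(col(quoted(name))).as(name)
          else col(quoted(f.name))
        }
        flatten(df.select(cols: _*))

      // No struct or array columns remain (map columns are left untouched).
      case _ => df
    }
  }
}
```

Assuming the same reader options as in the earlier reply, it could be called as `FlattenSketch.flatten(sqlContext.read.format("com.databricks.spark.xml").option("rowTag", "emp").load("/tmp/sample.xml"))`. Note that exploding several independent arrays in the same row multiplies the row count, so the result may need to be split per repeating element for documents with many sibling lists.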

Re: Flattening XML in a DataFrame

2016-08-12 Thread Hyukjin Kwon
Hi Sreekanth, Assuming you are using Spark 1.x, I believe this code below:
    sqlContext.read.format("com.databricks.spark.xml").option("rowTag", "emp").load("/tmp/sample.xml")
      .selectExpr("manager.id", "manager.name", "explode(manager.subordinates.clerk) as clerk")
      .selectExpr("id", "name",

Flattening XML in a DataFrame

2016-08-12 Thread Sreekanth Jella
Hi Folks, I am trying to flatten a variety of XML documents using DataFrames. I'm using the spark-xml package, which automatically infers the schema and creates a DataFrame. I do not want to hard-code any column names in the DataFrame, as I have a lot of varieties of XML documents and each might be lot more