Re: sort descending with multiple columns
Yes, thank you.

Thanks,
Sreekanth

On Nov 18, 2016 6:33 AM, "Stuart White" <stuart.whi...@gmail.com> wrote:

> Is this what you're looking for?
>
> // col is defined in org.apache.spark.sql.functions
> import org.apache.spark.sql.functions.col
>
> val df = Seq(
>   (1, "A"),
>   (1, "B"),
>   (1, "C"),
>   (2, "D"),
>   (3, "E")
> ).toDF("foo", "bar")
>
> val colList = Seq("foo", "bar")
> // Map every column name to a descending sort expression
> df.sort(colList.map(col(_).desc): _*).show
>
> +---+---+
> |foo|bar|
> +---+---+
> |  3|  E|
> |  2|  D|
> |  1|  C|
> |  1|  B|
> |  1|  A|
> +---+---+
>
> On Fri, Nov 18, 2016 at 1:15 AM, Sreekanth Jella <jsreekan...@gmail.com> wrote:
> > Hi,
> >
> > I'm trying to sort by multiple columns, and the column names are dynamic.
> >
> > df.sort(colList.head, colList.tail: _*)
> >
> > But I'm not sure how to sort in descending order for all columns. I tried
> > this, but it applies only to the first column:
> >
> > df.sort(df.col(colList.head).desc)
> >
> > How can I pass all column names (or some) with descending order?
> >
> > Thanks,
> > Sreekanth
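
For the "(or some)" part of the question, the same mapping idea can carry a direction per column. The following is only a sketch under assumptions: it assumes Spark 2.x with a local SparkSession, and `sortSpec` (column name paired with a descending flag) is a hypothetical structure introduced for illustration, not something from the thread.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.col

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("sort-sketch").getOrCreate()
import spark.implicits._

val df = Seq((1, "A"), (1, "B"), (1, "C"), (2, "D"), (3, "E")).toDF("foo", "bar")

// Hypothetical input: each column name paired with a flag, true = descending.
val sortSpec: Seq[(String, Boolean)] = Seq("foo" -> true, "bar" -> false)

// Turn each (name, flag) pair into a Column sort expression.
val sortCols: Seq[Column] = sortSpec.map {
  case (name, true)  => col(name).desc
  case (name, false) => col(name).asc
}

// Pass the expressions as varargs, as in the colList.map(col(_).desc) answer.
val sortedRows = df.sort(sortCols: _*).as[(Int, String)].collect().toSeq
```

Here `foo` is sorted descending and `bar` ascending, so the three rows with `foo = 1` come last, in the order A, B, C.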
Spark Metrics monitoring using Graphite
Hi All,

I am trying to retrieve Spark metrics using the Graphite Exporter. By default it exposes the application ID, but per our requirements we need the application name.

Sample GraphiteExporter data:

block_manager{application="local-1477496809940",executor_id="driver",instance="127.0.0.1:9108",job="spark_graphite_exp",qty="remainingMem_MB",type="memory"}

In the entry above, "application" defaults to the application ID. How do I configure it to report the application name instead of the ID?

Thanks,
Sreekanth.
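
One possible direction, offered as a sketch and not as an answer given in the thread: Spark 2.1 and later read a `spark.metrics.namespace` property that replaces the application ID as the leading component of metric names. The example below assumes such a version; `my_spark_app` is a hypothetical name, and whether the exporter then shows it under the `application` label depends on the exporter's own mapping configuration.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical application name, used here only for illustration.
val appName = "my_spark_app"

val spark = SparkSession.builder()
  .master("local[1]")
  .appName(appName)
  // Assumption: Spark 2.1 or later; earlier versions do not read this property.
  .config("spark.metrics.namespace", appName)
  .getOrCreate()

// Read the setting back to confirm what the metrics system will see.
val namespace = spark.conf.get("spark.metrics.namespace")
```

The Graphite sink settings in metrics.properties are unchanged by this; only the prefix of the reported metric names differs.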
RE: Flattening XML in a DataFrame
Hi Experts,

Please suggest. Thanks in advance.

Thanks,
Sreekanth

From: Sreekanth Jella [mailto:srikanth.je...@gmail.com]
Sent: Sunday, August 14, 2016 11:46 AM
To: 'Hyukjin Kwon' <gurwls...@gmail.com>
Cc: 'user @spark' <user@spark.apache.org>
Subject: Re: Flattening XML in a DataFrame

Hi Hyukjin Kwon,

Thank you for the reply. There are several types of XML documents with different schemas that need to be parsed, and the tag names are not known in advance. All we know is the XSD for the given XML.

Is it possible to get the same results even when we do not know the XML tags, such as manager.id and manager.name, or is it possible to read the tag names from the XSD and use them?

Thanks,
Sreekanth

On Aug 12, 2016 9:58 PM, "Hyukjin Kwon" <gurwls...@gmail.com> wrote:

Hi Sreekanth,

Assuming you are using Spark 1.x, I believe this code below:

sqlContext.read.format("com.databricks.spark.xml")
  .option("rowTag", "emp")
  .load("/tmp/sample.xml")
  .selectExpr("manager.id", "manager.name", "explode(manager.subordinates.clerk) as clerk")
  .selectExpr("id", "name", "clerk.cid", "clerk.cname")
  .show()

would print the results below as you want:

+---+----+---+-----+
| id|name|cid|cname|
+---+----+---+-----+
|  1| foo|  1|  foo|
|  1| foo|  1|  foo|
+---+----+---+-----+

I hope this is helpful.

Thanks!

2016-08-13 9:33 GMT+09:00 Sreekanth Jella <srikanth.je...@gmail.com>:

Hi Folks,

I am trying to flatten a variety of XMLs using DataFrames. I'm using the spark-xml package, which automatically infers my schema and creates a DataFrame.

I do not want to hard-code any column names in the DataFrame, as I have many varieties of XML documents and each may have deeper levels of child nodes. I simply want to flatten any type of XML and then write the output data to a Hive table. Can you please give some expert advice on this?

Example XML and expected output are given below.

Sample XML:

1 foo 1 foo 1 foo

Expected output:

id, name, clerk.cid, clerk.cname
1, foo, 2, cname2
1, foo, 3, cname3

Thanks,
Sreekanth Jella
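
One way to flatten without hard-coding tag names, sketched here as an assumption and not as a confirmed answer from the thread, is to walk the inferred schema: expand each struct column into its fields and explode each array column, repeating until only leaf columns remain. The sketch assumes Spark 2.2 or later. The `flatten` helper and the underscore-joined column names are hypothetical choices, and a small JSON document stands in for what spark-xml might infer for the manager/subordinates/clerk structure used above.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}
import org.apache.spark.sql.types.{ArrayType, StructType}

// Hypothetical helper: handles one struct or array column per pass,
// then recurses until the schema contains only leaf columns.
def flatten(df: DataFrame): DataFrame = {
  val fields = df.schema.fields
  val nested = fields.find { f =>
    f.dataType.isInstanceOf[StructType] || f.dataType.isInstanceOf[ArrayType]
  }
  nested match {
    case None => df
    case Some(target) =>
      val cols: Seq[Column] = fields.toSeq.flatMap { f =>
        if (f.name != target.name) Seq(col(s"`${f.name}`"))
        else f.dataType match {
          // Struct: promote each child field to a top-level column.
          case st: StructType =>
            st.fieldNames.toSeq.map(n => col(s"`${f.name}`.`$n`").as(s"${f.name}_$n"))
          // Array: one output row per element (rows with null or empty arrays are dropped).
          case _ => Seq(explode(col(s"`${f.name}`")).as(f.name))
        }
      }
      flatten(df.select(cols: _*))
  }
}

// Assumption: a local session; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("flatten-sketch").getOrCreate()
import spark.implicits._

// Stand-in data shaped like the thread's example, not real spark-xml output.
val json = """{"manager":{"id":1,"name":"foo","subordinates":{"clerk":[{"cid":2,"cname":"cname2"},{"cid":3,"cname":"cname3"}]}}}"""
val nestedDf = spark.read.json(Seq(json).toDS)

val flat = flatten(nestedDf)
```

With this input, `flat` has one row per clerk and the columns `manager_id`, `manager_name`, `manager_subordinates_clerk_cid` and `manager_subordinates_clerk_cname`. Map-typed columns are not handled, and exploding several sibling arrays multiplies rows, so real documents may need more care before the result is written to a Hive table.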