Re: Spark 2.0.0 - Apply schema on few columns of dataset

2016-08-10 Thread Aseem Bansal
ail.com]
> *Sent:* 08 August 2016 07:37
> *To:* Ewan Leith <ewan.le...@realitymine.com>
> *Cc:* user <user@spark.apache.org>
> *Subject:* Re: Spark 2.0.0 - Apply schema on few columns of dataset
>
> Hi Ewan
>
> The .as function takes a single

Re: Spark 2.0.0 - Apply schema on few columns of dataset

2016-08-08 Thread Aseem Bansal
Hi Ewan

The .as function takes a single encoder, a single string, or a single Symbol. I have more than 10 columns, so I cannot use the tuple functions. Passing them using brackets does not work.

On Mon, Aug 8, 2016 at 11:26 AM, Ewan Leith wrote:
> Looking at the
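For readers hitting the same limit of the tuple encoders, one possible workaround is a bean encoder, since `.as()` still receives a single `Encoder`. The following is only a sketch under assumptions: the `Person` class, its three fields, and the `readPeople` helper are hypothetical names chosen for illustration (the column names are borrowed from the sample CSV elsewhere in this thread), and a real bean would carry one property per selected column.

```java
import java.io.Serializable;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;

public class PersonBeanExample {

    // Hypothetical JavaBean: one String property per selected column.
    // Without an explicit schema, the CSV reader yields string columns.
    public static class Person implements Serializable {
        private String name;
        private String city;
        private String country;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }

        public String getCity() { return city; }
        public void setCity(String city) { this.city = city; }

        public String getCountry() { return country; }
        public void setCountry(String country) { this.country = country; }
    }

    // Reads the whole CSV, keeps three columns, and maps them onto the bean.
    // Columns are matched to bean properties by name.
    static Dataset<Person> readPeople(SparkSession spark, String path) {
        return spark.read()
                .option("header", "true")
                .csv(path)
                .select("name", "city", "country")
                .as(Encoders.bean(Person.class));
    }
}
```

The bean can grow to as many properties as there are selected columns, which is the case the tuple encoders do not cover.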

Re: Spark 2.0.0 - Apply schema on few columns of dataset

2016-08-07 Thread Ewan Leith
Looking at the encoders api documentation at http://spark.apache.org/docs/latest/api/java/

== Java ==
Encoders are specified by calling static methods on Encoders.

  List<String> data = Arrays.asList("abc", "abc", "xyz");

Re: Spark 2.0.0 - Apply schema on few columns of dataset

2016-08-07 Thread Aseem Bansal
Hi All

Has anyone done this with the Java API?

On Fri, Aug 5, 2016 at 5:36 PM, Aseem Bansal wrote:
> I need to use few columns out of a csv. But as there is no option to read
> few columns out of csv so
> 1. I am reading the whole CSV using SparkSession.csv()
> 2.

Re: Spark 2.0.0 - Apply schema on few columns of dataset

2016-08-05 Thread Jacek Laskowski
Hi Aseem,

Ah, so I can't help you in this area. I've never worked with Spark using Java (and honestly don't want to if I don't have to).

Regards,
Jacek Laskowski

https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at

Re: Spark 2.0.0 - Apply schema on few columns of dataset

2016-08-05 Thread Aseem Bansal
Yes. This is what I am after. But I have to use the Java API. And using the Java API I was not able to get the .as() function working.

On Fri, Aug 5, 2016 at 7:09 PM, Jacek Laskowski wrote:
> Hi,
>
> I don't understand where the issue is...
>
> ➜ spark git:(master) ✗ cat

Re: Spark 2.0.0 - Apply schema on few columns of dataset

2016-08-05 Thread Jacek Laskowski
Hi,

I don't understand where the issue is...

➜ spark git:(master) ✗ cat csv-logs/people-1.csv
name,city,country,age,alive
Jacek,Warszawa,Polska,42,true

val df = spark.read.option("header", true).csv("csv-logs/people-1.csv")
val nameCityPairs = df.select('name, 'city).as[(String, String)]
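Since the original question concerns the Java API, here is a rough Java counterpart of the Scala snippet above. It is a sketch, not a confirmed answer from the thread: the class name `NameCityPairs` and the helper `select` are made up for illustration, and it assumes Spark 2.x on the classpath. In Java the encoder has to be passed explicitly, here built with `Encoders.tuple`.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import scala.Tuple2;

public class NameCityPairs {

    // Reads a CSV with a header row and returns (name, city) pairs.
    static Dataset<Tuple2<String, String>> select(SparkSession spark, String path) {
        Dataset<Row> df = spark.read()
                .option("header", "true")
                .csv(path);

        // Java has no implicit encoders, so the tuple encoder is built by hand.
        return df.select("name", "city")
                .as(Encoders.tuple(Encoders.STRING(), Encoders.STRING()));
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("name-city-pairs")
                .master("local[*]")
                .getOrCreate();

        select(spark, "csv-logs/people-1.csv").show();

        spark.stop();
    }
}
```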

Spark 2.0.0 - Apply schema on few columns of dataset

2016-08-05 Thread Aseem Bansal
I need to use only a few columns out of a CSV. Since there is no option to read only a few columns from a CSV, I am:

1. reading the whole CSV using SparkSession.csv()
2. selecting a few of the columns using DataFrame.select()
3. applying a schema using the .as() function of Dataset.

I tried to extend