JavaRDD<Person> prdd = sc.textFile("c:\\fls\\people.txt").map(
    new Function<String, Person>() {
        public Person call(String line) throws Exception {
            String[] parts = line.split(",");
            Person person = new Person();
            person.setName(parts[0]);
            person.setAge(Integer.parseInt(parts[1].trim()));
            person.setSal(Integer.parseInt(parts[2].trim()));
            return person;
        }
    });
RDD<Person> personRDD = prdd.rdd();
Dataset<Person> dss = sqlContext.createDataset(personRDD, Encoders.bean(Person.class));
GroupedDataset<Row, Person> dq = dss.groupBy(new Column("name"));

I have to calculate the sum of age and the sum of salary, grouped by name, on this Dataset. How should I query the Dataset to do that? I tried using GroupedDataset but don't know how to proceed with it, and I cannot find much help on using the Dataset API. Please help.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Using-Aggregate-and-group-by-on-spark-Dataset-api-tp26824.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
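
To pin down the result asked for above, here is a minimal plain-Java sketch of the intended aggregation. It does not use Spark at all; it only shows what "sum of age and salary grouped by name" should produce. The class GroupSumSketch, its nested Person stand-in, and the sample rows are hypothetical and are not taken from people.txt or from the real Person bean.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupSumSketch {

    // Hypothetical stand-in for the Person bean in the post (name, age, sal).
    static final class Person {
        final String name;
        final int age;
        final int sal;

        Person(String name, int age, int sal) {
            this.name = name;
            this.age = age;
            this.sal = sal;
        }
    }

    // Parses one "name,age,sal" line the same way the map() function in the post does.
    static Person parse(String line) {
        String[] parts = line.split(",");
        return new Person(parts[0],
                Integer.parseInt(parts[1].trim()),
                Integer.parseInt(parts[2].trim()));
    }

    // Groups by name and returns name -> {sum of age, sum of sal}.
    static Map<String, long[]> sumByName(List<Person> people) {
        Map<String, long[]> totals = new TreeMap<>();
        for (Person p : people) {
            long[] t = totals.computeIfAbsent(p.name, k -> new long[2]);
            t[0] += p.age;
            t[1] += p.sal;
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample rows, used only for illustration.
        List<Person> people = Arrays.asList(
                parse("Ann, 30, 1000"),
                parse("Bob, 25, 2000"),
                parse("Ann, 40, 1500"));

        for (Map.Entry<String, long[]> e : sumByName(people).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue()[0] + " " + e.getValue()[1]);
        }
        // prints:
        // Ann 70 2500
        // Bob 25 2000
    }
}
```

The question is how to get the same per-name totals out of the Dataset or GroupedDataset API instead of collecting the rows and summing them by hand like this.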