Hi Team,

I am new to Spark. My requirement: I have a huge list that is converted to a Spark Dataset. I need to operate on this Dataset, store the computed values in another object/Dataset, and keep that in memory for further processing.

Approach I tried: the list is retrieved from a third party in a loop. I convert this list to a Dataset and, using a function, try to iterate over it and store the results in another Dataset.

Problem I am facing: I am not able to see any data in the newly computed Dataset.

Kindly help me sort out this issue, and please let me know if there is a better approach.

Sample Code:
------------------------
class Person implements Serializable {
    private static final long serialVersionUID = 1L;
    private String name;
    private PersonId id;
    // getters and setters
}

class PersonId {
    private int deptId;
    // getters and setters
}

class PersonDetails implements Serializable {
    private static final long serialVersionUID = 1L;
    private int deptId;
    private BigDecimal sal;
    private String name;
    // getters and setters
}

In another class I have the template code below:

List<PersonDetails> personDtlsList = new ArrayList<>();
final Encoder<Person> encoder = Encoders.bean(Person.class);
final Encoder<PersonDetails> personDtlsEncoder = Encoders.bean(PersonDetails.class);

// here I call a third-party interface and get the person information as a list
List<Person> personList = getPersonInformation(/* passing few parameters */);

Dataset<Person> personDS = sqlContext.createDataset(personList, encoder);
Dataset<PersonDetails> personDtlsDS = sqlContext.createDataset(personDtlsList, personDtlsEncoder);

JavaRDD<PersonDetails> personDtlsRDD = personDS.toDF().toJavaRDD().map(new Function<Row, PersonDetails>() {
    private static final long serialVersionUID = 2L;

    @Override
    public PersonDetails call(Row row) throws Exception {
        PersonDetails personDetails = new PersonDetails();
        // setters for personDetails - name, sal and others
        personDetails.setName(row.getString(0));
        personDetails.setSal(new BigDecimal(10000));
        personDtlsDS.union(sqlContext.createDataset(new ArrayList<PersonDetails>() {{ add(personDetails); }}, personDtlsEncoder));
        return personDetails;
    }
});

personDtlsDS.count();

Regards,
Sunitha.