Hi Sunitha,

Make the class that contains the common function you are calling implement java.io.Serializable.

Thank you,
Naresh
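[Editor's note] Naresh's fix can be sketched without Spark at all. Spark ships task closures to executors via Java serialization, so any object the closure references (including the class holding the common function) must implement java.io.Serializable. The class name `CommonService` below is invented for illustration; the round-trip simulates what Spark does when it serializes a task.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableHelperDemo {

    // Hypothetical stand-in for the class holding the "common function".
    // The fix: implement Serializable so Spark can ship it to executors.
    static class CommonService implements Serializable {
        private static final long serialVersionUID = 1L;

        String process(long id) {
            return "processed-" + id;
        }
    }

    public static void main(String[] args) throws Exception {
        CommonService svc = new CommonService();

        // Round-trip through Java serialization, the same mechanism Spark
        // uses when it serializes a task closure for an executor. Without
        // `implements Serializable` this would throw NotSerializableException,
        // which surfaces in Spark as "Task not serializable".
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new ObjectOutputStream(bos).writeObject(svc);
        CommonService copy = (CommonService) new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();

        System.out.println(copy.process(42L)); // prints processed-42
    }
}
```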
On Wed, Dec 20, 2017 at 9:58 PM, Sunitha Chennareddy <chennareddysuni...@gmail.com> wrote:

Hi,

Thank you all.

Here is my requirement: I have a DataFrame which contains a list of rows retrieved from an Oracle table. I need to iterate over the DataFrame, fetch each record, and call a common function, passing a few parameters.

The issue I am facing is that I am not able to call the common function:

    JavaRDD<Person> personRDD = person_dataframe.toJavaRDD().map(new Function<Row, Person>() {
        @Override
        public Person call(Row row) throws Exception {
            Person person = new Person();
            person.setId(row.getDecimal(0).longValue());
            person.setName(row.getString(1));

            personLst.add(person);
            return person;
        }
    });

    personRDD.foreach(new VoidFunction<Person>() {
        private static final long serialVersionUID = 1111111111111123456L;

        @Override
        public void call(Person person) throws Exception {
            System.out.println(person.getId());
            // Here I tried to call the common function ************
        }
    });

I am able to print the data in the foreach loop; however, if I try to call the common function it gives me the error below.

Error message: org.apache.spark.SparkException: Task not serializable

I kindly request you to share some ideas (sample code / a link to refer to) on how to call a common function/interface method, passing values from each record of the DataFrame.

Regards,
Sunitha

On Tue, Dec 19, 2017 at 1:20 PM, Weichen Xu <weichen...@databricks.com> wrote:

Hi Sunitha,

In the mapper function you cannot update outer variables such as `personLst.add(person)`; that won't work, which is why you got an empty list.

You can use `rdd.collect()` to get a local list of `Person` objects first; then you can safely iterate over the local list and do any updates you want.

Thanks.
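[Editor's note] Weichen's point can be demonstrated without Spark: when Spark ships a task, it Java-serializes the closure, so the executor mutates a deserialized copy of `personLst`, never the driver's list. The sketch below (class names invented for illustration) simulates that round-trip and shows why the outer list stays empty.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class ClosureCopyDemo {

    // Stand-in for a Spark task that captures an outer list, like the
    // personLst.add(person) call inside the map() function in the question.
    static class AddTask implements Serializable, Runnable {
        final List<String> captured;

        AddTask(List<String> captured) {
            this.captured = captured;
        }

        @Override
        public void run() {
            captured.add("person-1"); // mutates whichever list THIS copy holds
        }
    }

    // Serialize and deserialize, as Spark does when shipping a task.
    static Object roundTrip(Object o) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new ObjectOutputStream(bos).writeObject(o);
        return new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();
    }

    public static void main(String[] args) throws Exception {
        List<String> driverList = new ArrayList<>();
        AddTask task = new AddTask(driverList);

        // The "executor" receives and runs a deserialized copy of the closure.
        AddTask executorCopy = (AddTask) roundTrip(task);
        executorCopy.run();

        System.out.println(driverList.size());            // prints 0
        System.out.println(executorCopy.captured.size()); // prints 1
    }
}
```

The copy's list gains an element while the driver's list stays empty, which is exactly the `personLst size == 0` symptom in the original question; `collect()` avoids this by bringing the data back to the driver before iterating.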
On Tue, Dec 19, 2017 at 2:16 PM, Sunitha Chennareddy <chennareddysuni...@gmail.com> wrote:

Hi Deepak,

I am able to map a row to the Person class; the issue is that I want to call another method. I tried converting to a list, and it is not working without using collect.

Regards,
Sunitha

On Tuesday, December 19, 2017, Deepak Sharma <deepakmc...@gmail.com> wrote:

I am not sure about Java, but in Scala it would be something like:

    df.rdd.map{ x => MyClass(x.getString(0), .....) }

HTH

--Deepak

On Dec 19, 2017 09:25, "Sunitha Chennareddy" <chennareddysuni...@gmail.com> wrote:

Hi All,

I am new to Spark. I want to convert a DataFrame to List<JavaClass> without using collect().

The main requirement is that I need to iterate through the rows of the DataFrame and call another function, passing a column value from each row (person.getId()).

Here is the snippet I have tried; kindly help me resolve the issue. personLst is returning 0:

    List<Person> personLst = new ArrayList<Person>();
    JavaRDD<Person> personRDD = person_dataframe.toJavaRDD().map(new Function<Row, Person>() {
        public Person call(Row row) throws Exception {
            Person person = new Person();
            person.setId(row.getDecimal(0).longValue());
            person.setName(row.getString(1));

            personLst.add(person);
            // here I tried to call another function but control never passed
            return person;
        }
    });
    logger.info("personLst size == " + personLst.size());
    logger.info("personRDD count === " + personRDD.count());

    // output is
    // personLst size == 0
    // personRDD count === 3