Hi Sunitha,

Make the class that contains the common function you are calling implement
Serializable.
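
For example, here is a rough sketch (CommonFunctions and its process
method are hypothetical stand-ins for your own class):

// The class holding the common function must implement Serializable
// so Spark can ship it to the executors along with the foreach closure.
public class CommonFunctions implements java.io.Serializable {
    private static final long serialVersionUID = 1L;

    public void process(long id, String name) {
        // ... your common logic here ...
    }
}

// On the driver:
final CommonFunctions common = new CommonFunctions();

personRDD.foreach(new VoidFunction<Person>() {
    private static final long serialVersionUID = 1L;

    @Override
    public void call(Person person) throws Exception {
        common.process(person.getId(), person.getName());
    }
});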


Thank you,
Naresh

On Wed, Dec 20, 2017 at 9:58 PM Sunitha Chennareddy <
chennareddysuni...@gmail.com> wrote:

> Hi,
>
> Thank You All..
>
> Here is my requirement: I have a DataFrame that contains rows retrieved
> from an Oracle table.
> I need to iterate over the DataFrame, fetch each record, and call a common
> function, passing a few parameters.
>
> The issue I am facing is that I am not able to call the common function:
>
> JavaRDD<Person> personRDD = person_dataframe.toJavaRDD().map(
>     new Function<Row, Person>() {
>       @Override
>       public Person call(Row row) throws Exception {
>         Person person = new Person();
>         person.setId(row.getDecimal(0).longValue());
>         person.setName(row.getString(1));
>         personLst.add(person);
>         return person;
>       }
>     });
>
> personRDD.foreach(new VoidFunction<Person>() {
>   private static final long serialVersionUID = 1111111111111123456L;
>
>   @Override
>   public void call(Person person) throws Exception {
>     System.out.println(person.getId());
>     // Here I tried to call the common function
>   }
> });
>
> I am able to print the data in the foreach loop; however, when I try to call
> the common function it gives me the error below.
> Error Message:  org.apache.spark.SparkException: Task not serializable
>
> I kindly request you to share some ideas (sample code / a link to refer to)
> on how to call a common function / interface method, passing values from each
> record of the DataFrame.
>
> Regards,
> Sunitha
>
>
> On Tue, Dec 19, 2017 at 1:20 PM, Weichen Xu <weichen...@databricks.com>
> wrote:
>
>> Hi Sunitha,
>>
>> In the mapper function you cannot update outer variables such as via
>> `personLst.add(person)`: the closure is serialized and the add runs on a
>> copy of the list at the executors, which is why you got an empty list.
>>
>> You can use `rdd.collect()` to get a local list of `Person` objects
>> first, then you can safely iterate on the local list and do any update you
>> want.
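>>
>> For example, a minimal sketch (commonFunction is a hypothetical
>> placeholder for your own method):
>>
>> // collect() materializes the RDD as a local List on the driver, so
>> // everything below runs locally and nothing needs to be serializable.
>> List<Person> personLst = personRDD.collect();
>> for (Person person : personLst) {
>>     commonFunction(person.getId());  // hypothetical common function
>> }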
>>
>> Thanks.
>>
>> On Tue, Dec 19, 2017 at 2:16 PM, Sunitha Chennareddy <
>> chennareddysuni...@gmail.com> wrote:
>>
>>> Hi Deepak,
>>>
>>> I am able to map a Row to the Person class; the issue is that I want to
>>> call another method.
>>> I tried converting to a list, and it does not work without using collect.
>>>
>>> Regards
>>> Sunitha
>>> On Tuesday, December 19, 2017, Deepak Sharma <deepakmc...@gmail.com>
>>> wrote:
>>>
>>>> I am not sure about Java, but in Scala it would be something like
>>>> df.rdd.map{ x => MyClass(x.getString(0),.....)}
>>>>
>>>> HTH
>>>>
>>>> --Deepak
>>>>
>>>> On Dec 19, 2017 09:25, "Sunitha Chennareddy" <
>>>> chennareddysuni...@gmail.com> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> I am new to Spark, and I want to convert a DataFrame to List<JavaClass>
>>>> without using collect().
>>>>
>>>> The main requirement is that I need to iterate through the rows of the
>>>> DataFrame and call another function, passing a column value from each row
>>>> (person.getId()).
>>>>
>>>> Here is the snippet I have tried; kindly help me resolve the issue.
>>>> personLst ends up with size 0:
>>>>
>>>> List<Person> personLst = new ArrayList<Person>();
>>>> JavaRDD<Person> personRDD = person_dataframe.toJavaRDD().map(
>>>>     new Function<Row, Person>() {
>>>>       public Person call(Row row) throws Exception {
>>>>         Person person = new Person();
>>>>         person.setId(row.getDecimal(0).longValue());
>>>>         person.setName(row.getString(1));
>>>>         personLst.add(person);
>>>>         // here I tried to call another function, but control never passed
>>>>         return person;
>>>>       }
>>>>     });
>>>> logger.info("personLst size == " + personLst.size());
>>>> logger.info("personRDD count === " + personRDD.count());
>>>>
>>>> // output:
>>>> personLst size == 0
>>>> personRDD count === 3
>>>>
>>>>
>>>>
>>
>
