rdd count is throwing null pointer exception

2015-08-17 Thread Priya Ch
Hi All,

 Thank you very much for the detailed explanation.

I have a scenario like this:
I have an RDD of ticket records and another RDD of booking records. For each
ticket record, I need to check whether any link exists in the booking table.

val ticketCachedRdd = ticketRdd.cache

ticketRdd.foreach { ticket =>
  // this function queries the booking table and retrieves the booking rows
  val bookingRecords = queryOnBookingTable(date, flightNumber, flightCarrier)
  println(ticketCachedRdd.count) // this is throwing a NullPointerException
}

Is there something wrong with the count? I am trying to use the count of the
cached RDD while looping through the actual RDD. What is wrong with this?

Thanks,
Padma Ch


Re: rdd count is throwing null pointer exception

2015-08-17 Thread Preetam
The error could be caused by the missing parentheses after the word cache:
ticketRdd.cache()





Re: rdd count is throwing null pointer exception

2015-08-17 Thread Priya Ch
It looks like this is caused by SPARK-5063:
RDD transformations and actions can only be invoked by the driver, not
inside of other transformations; for example, rdd1.map(x =>
rdd2.values.count() * x) is invalid because the values transformation and
count action cannot be performed inside of the rdd1.map transformation. For
more information, see SPARK-5063.
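
As a minimal sketch of the rule quoted above, assuming a local SparkContext and two small made-up RDDs (the names Spark5063Sketch, scaleByCount, rdd1 and rdd2 are illustrative and not from the original post), the nested call can be avoided by running the action on the driver first and letting the closure capture only the resulting plain value:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object Spark5063Sketch {
  // Hypothetical helper: multiplies every element of rdd1 by the size of rdd2.
  def scaleByCount(sc: SparkContext): Array[Long] = {
    val rdd1 = sc.parallelize(Seq(1L, 2L, 3L))
    val rdd2 = sc.parallelize(Seq(("a", 10), ("b", 20)))

    // Invalid: rdd1.map(x => rdd2.values.count() * x)
    // The closure would reference rdd2 and call an action from inside a task.

    // Run the action on the driver; n is an ordinary Long.
    val n = rdd2.values.count()

    // The closure now captures only n, which is serializable and safe to ship.
    rdd1.map(x => n * x).collect()
  }

  def main(args: Array[String]): Unit = {
    // local[2] is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("spark-5063-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try println(scaleByCount(sc).mkString(", "))
    finally sc.stop()
  }
}
```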



Re: rdd count is throwing null pointer exception

2015-08-24 Thread Akhil Das
Move your count operation outside the foreach and use a broadcast to access
it inside the foreach.
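
A rough sketch of that suggestion, assuming a local SparkContext and a stand-in ticket RDD of plain strings (BroadcastCountSketch, labelTickets and the sample ticket IDs are hypothetical; the real ticket type and queryOnBookingTable are not shown in the thread):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastCountSketch {
  // Hypothetical helper: labels each ticket with the total ticket count.
  def labelTickets(sc: SparkContext, tickets: Seq[String]): Array[String] = {
    val ticketRdd = sc.parallelize(tickets).cache()

    // The count action runs once, on the driver, outside any closure.
    val ticketCount: Long = ticketRdd.count()

    // Broadcast the plain value so tasks read it without touching the RDD.
    val ticketCountBc = sc.broadcast(ticketCount)

    // Inside the closure only the broadcast value is used, never ticketRdd.
    // The same ticketCountBc.value read works inside foreach as well.
    ticketRdd.map(ticket => s"$ticket of ${ticketCountBc.value}").collect()
  }

  def main(args: Array[String]): Unit = {
    // local[2] is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("broadcast-count-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try labelTickets(sc, Seq("T1", "T2", "T3")).foreach(println)
    finally sc.stop()
  }
}
```

For a single Long, letting the closure capture ticketCount directly would also work; a broadcast mainly pays off when the shared value is large.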