Re: rdd count is throwing null pointer exception

2015-08-24 Thread Akhil Das
Move your count operation outside the foreach, so that it runs on the driver,
and use a broadcast variable to access the result inside the foreach.
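
A minimal sketch of that suggestion, not a definitive implementation. The local master, the application name, and the placeholder string records are assumptions for illustration; the thread does not show the real ticket record type or queryOnBookingTable, so they are left out.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CountOutsideForeach {
  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for illustration only.
    val conf = new SparkConf().setAppName("count-outside-foreach").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Placeholder data; in the thread, ticketRdd holds ticket records.
    val ticketRdd = sc.parallelize(Seq("ticket-1", "ticket-2", "ticket-3")).cache()

    // Run the action on the driver, before entering foreach.
    val ticketCount: Long = ticketRdd.count()

    // Ship the plain Long to the executors as a broadcast variable.
    val ticketCountBc = sc.broadcast(ticketCount)

    ticketRdd.foreach { ticket =>
      // Only the broadcast value is read here; no RDD is referenced in the closure.
      println(s"$ticket of ${ticketCountBc.value} tickets")
    }

    sc.stop()
  }
}
```

For a single Long, capturing ticketCount directly in the closure would likely work as well; the broadcast is used here because that is what the reply proposes.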





rdd count is throwing null pointer exception

2015-08-17 Thread Priya Ch
Hi All,

Thank you very much for the detailed explanation.

I have a scenario like this:
I have an RDD of ticket records and another RDD of booking records. For each
ticket record, I need to check whether any link exists in the booking table.

val ticketCachedRdd = ticketRdd.cache

ticketRdd.foreach { ticket =>
  // this function queries the booking table and retrieves the booking rows
  val bookingRecords = queryOnBookingTable(date, flightNumber, flightCarrier)
  println(ticketCachedRdd.count) // this is throwing a NullPointerException
}

Is there something wrong with the count? I am trying to use the count of the
cached RDD while looping through the actual RDD. What is wrong with this?

Thanks,
Padma Ch


Re: rdd count is throwing null pointer exception

2015-08-17 Thread Preetam
The error could be because of the missing parentheses after the word cache:
ticketRdd.cache()





Re: rdd count is throwing null pointer exception

2015-08-17 Thread Priya Ch
Looks like this is because of SPARK-5063:
RDD transformations and actions can only be invoked by the driver, not
inside of other transformations; for example, rdd1.map(x =>
rdd2.values.count() * x) is invalid because the values transformation and
count action cannot be performed inside of the rdd1.map transformation. For
more information, see SPARK-5063.
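
The example from that message can be sketched as follows, with the invalid form left as a comment and a driver-side alternative next to it. This is only an illustration: the local master, the application name, and the sample contents of rdd1 and rdd2 are assumptions, not taken from the thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object Spark5063Example {
  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for illustration only.
    val sc = new SparkContext(
      new SparkConf().setAppName("spark-5063-example").setMaster("local[*]"))

    // Hypothetical sample data.
    val rdd1 = sc.parallelize(Seq(1L, 2L, 3L))
    val rdd2 = sc.parallelize(Seq("a" -> 10, "b" -> 20))

    // Invalid: rdd2 is used inside a transformation of rdd1 (see SPARK-5063).
    // val bad = rdd1.map(x => rdd2.values.count() * x)

    // Valid: run the action on the driver first, then use the plain value.
    val n: Long = rdd2.values.count()
    val scaled = rdd1.map(x => n * x)

    println(scaled.collect().mkString(", "))
    sc.stop()
  }
}
```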
