Hi All,

I have a table named customer (customer_id, event, country, ...) in a
PostgreSQL database. The table has more than 100 million rows.

I want to know the number of events from each country. To do that I am
running a groupBy in Spark, as follows:

val dataframe1 = sqlContext.load("jdbc", Map(
  "url" -> "jdbc:postgresql://localhost/customerlogs?user=postgres&password=postgres",
  "dbtable" -> "customer"))

dataframe1.groupBy("country").count().show()

The code above seems to pull the complete customer table before doing the
groupBy. Because of that, it throws the following error:

16/01/11 12:49:04 WARN HeartbeatReceiver: Removing executor 0 with no
recent heartbeats: 170758 ms exceeds timeout 120000 ms
16/01/11 12:49:04 ERROR TaskSchedulerImpl: Lost executor 0 on 10.2.12.59:
Executor heartbeat timed out after 170758 ms
16/01/11 12:49:04 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
10.2.12.59): ExecutorLostFailure (executor 0 exited caused by one of the
running tasks) Reason: Executor heartbeat timed out after 170758 ms

I am using Spark 1.6.0.

Is there any way I can solve this?
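
In case it helps frame an answer, two workarounds I have come across are
pushing the aggregation down to PostgreSQL (so only the per-country counts
cross the wire) and partitioning the JDBC read so no single task has to
stream all 100 million rows. A rough sketch of both, not yet tested on my
data (I am assuming customer_id is a numeric column whose min/max are
roughly 1 and 100000000):

// Option 1: push the groupBy down to PostgreSQL; only the small
// aggregated result is fetched. The subquery alias ("AS t") is required
// for a derived table in PostgreSQL.
val counts = sqlContext.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://localhost/customerlogs?user=postgres&password=postgres")
  .option("dbtable",
    "(SELECT country, count(*) AS cnt FROM customer GROUP BY country) AS t")
  .load()
counts.show()

// Option 2: split the scan into parallel range queries so no single
// executor reads the whole table. ASSUMPTION: customer_id is numeric and
// 1 / 100000000 are rough lower/upper bounds for it.
val partitioned = sqlContext.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://localhost/customerlogs?user=postgres&password=postgres")
  .option("dbtable", "customer")
  .option("partitionColumn", "customer_id")
  .option("lowerBound", "1")
  .option("upperBound", "100000000")
  .option("numPartitions", "20")
  .load()
partitioned.groupBy("country").count().show()

Would either of these be the right approach, or is there a better way?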

Thanks,
Rajeshwar Gaini.
