Hi All,

I have a table named customer (customer_id, event, country, ...) in a PostgreSQL database. The table has more than 100 million rows.
I want to know the number of events from each country. To achieve that I am doing a groupBy using Spark, as follows:

    val dataframe1 = sqlContext.load("jdbc", Map(
      "url" -> "jdbc:postgresql://localhost/customerlogs?user=postgres&password=postgres",
      "dbtable" -> "customer"))
    dataframe1.groupBy("country").count().show()

The code above seems to fetch the complete customer table before doing the groupBy, and because of that it throws the following error:

    16/01/11 12:49:04 WARN HeartbeatReceiver: Removing executor 0 with no recent heartbeats: 170758 ms exceeds timeout 120000 ms
    16/01/11 12:49:04 ERROR TaskSchedulerImpl: Lost executor 0 on 10.2.12.59: Executor heartbeat timed out after 170758 ms
    16/01/11 12:49:04 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 10.2.12.59): ExecutorLostFailure (executor 0 exited caused by one of the running tasks) Reason: Executor heartbeat timed out after 170758 ms

I am using Spark 1.6.0. Is there any way I can solve this?

Thanks,
Rajeshwar Gaini.
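P.S. One thing I was considering is splitting the JDBC read into partitions, so Spark issues several range queries in parallel instead of pulling the whole table through a single task. Below is a minimal sketch of what I think that would look like in 1.6, assuming customer_id is a numeric column; the lowerBound/upperBound values (1 and 100000000) and numPartitions (20) are made-up placeholders, not values from my actual setup:

    import java.util.Properties

    val props = new Properties()
    props.setProperty("user", "postgres")
    props.setProperty("password", "postgres")

    // Partitioned JDBC read: Spark generates one query per partition,
    // each with a WHERE range on customer_id, instead of one full scan.
    val dataframe1 = sqlContext.read.jdbc(
      "jdbc:postgresql://localhost/customerlogs",
      "customer",
      "customer_id",   // partition column (assumed numeric)
      1L,              // lowerBound (placeholder)
      100000000L,      // upperBound (placeholder)
      20,              // numPartitions (placeholder)
      props)

    dataframe1.groupBy("country").count().show()

I also wonder whether I could instead push the aggregation down to PostgreSQL by passing a subquery as the table, e.g. "dbtable" -> "(select country, count(*) as cnt from customer group by country) as tmp", so that only the aggregated rows cross the network. Would either of these be the right way to go?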