There is no problem with the SQL read. When I do the following, it works fine.
val dataframe1 = sqlContext.load("jdbc", Map(
  "url" -> "jdbc:postgresql://localhost/customerlogs?user=postgres&password=postgres",
  "dbtable" -> "customer"))

dataframe1.filter("country = 'BA'").show()

On Mon, Jan 11, 2016 at 1:41 PM, Xingchi Wang <regrec...@gmail.com> wrote:

> The error happened at "Lost task 0.0 in stage 0.0". I think it is not a
> "groupBy" problem; it is an issue with the SQL read of the "customer"
> table. Please check the JDBC link and whether the data is loaded
> successfully.
>
> Thanks
> Xingchi
>
> 2016-01-11 15:43 GMT+08:00 Gaini Rajeshwar <raja.rajeshwar2...@gmail.com>:
>
>> Hi All,
>>
>> I have a table named customer (customer_id, event, country, ...) in a
>> PostgreSQL database. This table has more than 100 million rows.
>>
>> I want to know the number of events from each country. To achieve that,
>> I am doing a groupBy using Spark as follows.
>>
>> val dataframe1 = sqlContext.load("jdbc", Map(
>>   "url" -> "jdbc:postgresql://localhost/customerlogs?user=postgres&password=postgres",
>>   "dbtable" -> "customer"))
>>
>> dataframe1.groupBy("country").count().show()
>>
>> The above code seems to fetch the complete customer table before doing
>> the groupBy. Because of that, it throws the following error:
>>
>> 16/01/11 12:49:04 WARN HeartbeatReceiver: Removing executor 0 with no
>> recent heartbeats: 170758 ms exceeds timeout 120000 ms
>> 16/01/11 12:49:04 ERROR TaskSchedulerImpl: Lost executor 0 on 10.2.12.59:
>> Executor heartbeat timed out after 170758 ms
>> 16/01/11 12:49:04 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID
>> 0, 10.2.12.59): ExecutorLostFailure (executor 0 exited caused by one of the
>> running tasks) Reason: Executor heartbeat timed out after 170758 ms
>>
>> I am using Spark 1.6.0.
>>
>> Is there any way I can solve this?
>>
>> Thanks,
>> Rajeshwar Gaini.
>>
>
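As a follow-up note on the groupBy timeout: a JDBC load without partitioning options is read as a single partition, so one task scans the whole 100M-row table and the executor can miss heartbeats. A sketch of two ways around this, using the Spark 1.6 DataFrameReader.jdbc API against the same database (the partition column customer_id, the bounds 1..100000000, and the partition count 16 are assumptions here, not values from the thread; adjust them to the real min/max ids):

```scala
import java.util.Properties

val props = new Properties()
props.setProperty("user", "postgres")
props.setProperty("password", "postgres")

// Option 1: partitioned JDBC read. Spark issues 16 parallel queries,
// each covering a slice of the assumed customer_id range, instead of
// one task scanning the entire table.
val customers = sqlContext.read.jdbc(
  "jdbc:postgresql://localhost/customerlogs",
  "customer",
  "customer_id",  // assumed numeric, roughly uniformly distributed column
  1L,             // assumed minimum id
  100000000L,     // assumed maximum id
  16,             // number of partitions
  props)

customers.groupBy("country").count().show()

// Option 2: push the aggregation down to PostgreSQL via a subquery in
// place of the table name, so only the per-country counts cross the wire.
val counts = sqlContext.read.jdbc(
  "jdbc:postgresql://localhost/customerlogs",
  "(select country, count(*) as cnt from customer group by country) as t",
  props)

counts.show()
```

Independently of either approach, raising spark.network.timeout can paper over the heartbeat error, but it does not fix the underlying single-partition scan.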