Hi everyone. I'm experiencing an issue when I try to fetch data from SQL Server. This is my context:

- Ubuntu 14.04 LTS
- Apache Spark 1.4.0
- SQL Server 2008
- Scala 2.10.5
- Sbt 0.13.11
I'm trying to fetch data from a table in SQL Server 2008 that has 85,000,000 records, but I only need around 200,000 of them. This is my code:

    val df = sqlContext.read.jdbc(
      "anUrl",
      "aTableName",
      Array(s"timestamp >= '2016-06-21T00:00:00'", s"timestamp < '2016-06-22T00:00:00'"),
      new Properties)

If I do this, it works without any trouble:

    df.take(5).foreach(println)

But if I do this, the application hangs:

    println(df.count()) // this should return 200,000

I've gone to http://localhost:4040/ to check what Spark is doing. When I open the job details, it shows that it is running the count method, and this is the detail:

    org.apache.spark.sql.DataFrame.count(DataFrame.scala:1269)
    SkipOverPlaysInWeek$.main(SkipOverPlaysInWeek.scala:88)

Thanks,
Gastón
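P.S. In case it's relevant: I'm aware there is also a partitioned overload of `jdbc` that splits the read into several parallel queries over a numeric column. A minimal sketch of what I understand that would look like (the column name `id`, the bounds, and the partition count are placeholders; I'm not sure my table even has a suitable numeric column):

    import java.util.Properties

    // Partitioned read: Spark issues numPartitions separate queries, each
    // covering a slice of [lowerBound, upperBound] on the given numeric column,
    // instead of pulling the whole table through a single connection.
    val dfPartitioned = sqlContext.read.jdbc(
      "anUrl",       // same JDBC URL as above (placeholder)
      "aTableName",  // same table name as above (placeholder)
      "id",          // numeric column to partition on (assumed to exist)
      0L,            // lowerBound of the partition column
      85000000L,     // upperBound (roughly the table size, as an assumption)
      8,             // numPartitions
      new Properties)

Would using this overload make `count()` behave differently than the predicates version I'm using now?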