Oh, I see. That's the total time of executing a query in Spark. Then the difference is reasonable, considering Spark has much more work to do, e.g., launching tasks in executors.
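
One way to check this is to time several consecutive runs of the same query instead of a single run. Below is a minimal sketch using only the JDK; the class name QueryTiming, the helper timeRuns, and the placeholder workload are all hypothetical and are not taken from the linked DirectQueryTest.java. The Runnable stands in for whatever is being measured, e.g. the DataFrame query or the plain JDBC call.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical timing helper, not part of Spark or of the linked demo code.
class QueryTiming {

    /** Runs the task `runs` times and returns each run's wall-clock time in milliseconds. */
    static long[] timeRuns(Runnable task, int runs) {
        long[] elapsedMs = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsedMs[i] = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): replace the body with the
        // Spark DataFrame query or the direct JDBC query being compared.
        Runnable query = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        };

        long[] times = timeRuns(query, 5);
        // Report the first run separately: it includes any one-time setup cost.
        System.out.println("first run: " + times[0] + " ms");
        for (int i = 1; i < times.length; i++) {
            System.out.println("run " + (i + 1) + ": " + times[i] + " ms");
        }
    }
}
```

If the later runs are much faster than the first one, most of the gap is one-time startup cost. Whatever remains on the later runs is per-query overhead, such as launching tasks in executors, and it would still be there for a small table.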

Best Regards,
Shixiong Zhu

2015-07-26 16:16 GMT+08:00 Louis Hust <louis.h...@gmail.com>:

> Look at the given url:
>
> Code can be found at:
>
> https://github.com/louishust/sparkDemo/blob/master/src/main/java/DirectQueryTest.java
>
> 2015-07-26 16:14 GMT+08:00 Shixiong Zhu <zsxw...@gmail.com>:
>
>> Could you clarify how you measure the Spark time cost? Is it the total
>> time of running the query? If so, it's possible because the overhead of
>> Spark dominates for small queries.
>>
>> Best Regards,
>> Shixiong Zhu
>>
>> 2015-07-26 15:56 GMT+08:00 Jerrick Hoang <jerrickho...@gmail.com>:
>>
>>> how big is the dataset? how complicated is the query?
>>>
>>> On Sun, Jul 26, 2015 at 12:47 AM Louis Hust <louis.h...@gmail.com>
>>> wrote:
>>>
>>>> Hi, all,
>>>>
>>>> I am using spark DataFrame to fetch small table from MySQL,
>>>> and i found it cost so much than directly access MySQL Using JDBC.
>>>>
>>>> Time cost for Spark is about 2033ms, and direct access at about 16ms.
>>>>
>>>> Code can be found at:
>>>>
>>>> https://github.com/louishust/sparkDemo/blob/master/src/main/java/DirectQueryTest.java
>>>>
>>>> So If my configuration for spark is wrong? How to optimise Spark to
>>>> achieve the similar performance like direct access?
>>>>
>>>> Any idea will be appreciated!