Re: Spark is much slower than direct access MySQL

Shixiong Zhu Sun, 26 Jul 2015 01:24:54 -0700

Oh, I see. That's the total time of executing a query in Spark. Then the
difference is reasonable, considering Spark has much more work to do, e.g.,
launching tasks in executors.
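
As a rough sketch of how that fixed overhead could be separated from the
query itself, one could time the first execution apart from repeated ones.
The class and method names below (ColdWarmTimer, measure, Timing) are
hypothetical and not taken from the linked repository, and the workload in
main is only a stand-in for the Spark or JDBC query:

```java
import java.util.function.Supplier;

public class ColdWarmTimer {

    /** Holds the first-run time and the average time of later runs, in milliseconds. */
    public static final class Timing {
        public final double coldMillis;
        public final double warmMillis;

        Timing(double coldMillis, double warmMillis) {
            this.coldMillis = coldMillis;
            this.warmMillis = warmMillis;
        }
    }

    /**
     * Runs the action once and times it (cold), then runs it warmRuns more
     * times and reports the average time per run (warm).
     */
    public static <T> Timing measure(Supplier<T> action, int warmRuns) {
        if (warmRuns < 1) {
            throw new IllegalArgumentException("warmRuns must be >= 1");
        }

        // First execution: includes any one-time setup cost.
        long coldStart = System.nanoTime();
        action.get();
        double cold = (System.nanoTime() - coldStart) / 1e6;

        // Later executions: averaged to smooth out noise.
        long warmStart = System.nanoTime();
        for (int i = 0; i < warmRuns; i++) {
            action.get();
        }
        double warm = (System.nanoTime() - warmStart) / 1e6 / warmRuns;

        return new Timing(cold, warm);
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption): replace the lambda body with the
        // DataFrame query or the direct JDBC query being compared.
        Timing t = measure(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            return sum;
        }, 10);

        System.out.printf("cold: %.3f ms, warm average: %.3f ms%n",
                t.coldMillis, t.warmMillis);
    }
}
```

If most of the gap is one-time setup, the warm average should come in well
below the cold run. If every run still pays a per-query scheduling cost, the
gap against direct JDBC will persist for small queries.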


Best Regards,
Shixiong Zhu

2015-07-26 16:16 GMT+08:00 Louis Hust <louis.h...@gmail.com>:

> Look at the given url:
>
> Code can be found at:
>
>
> https://github.com/louishust/sparkDemo/blob/master/src/main/java/DirectQueryTest.java
>
> 2015-07-26 16:14 GMT+08:00 Shixiong Zhu <zsxw...@gmail.com>:
>
>> Could you clarify how you measure the Spark time cost? Is it the total
>> time of running the query? If so, it's possible because the overhead of
>> Spark dominates for small queries.
>>
>> Best Regards,
>> Shixiong Zhu
>>
>> 2015-07-26 15:56 GMT+08:00 Jerrick Hoang <jerrickho...@gmail.com>:
>>
>>> How big is the dataset? How complicated is the query?
>>>
>>> On Sun, Jul 26, 2015 at 12:47 AM Louis Hust <louis.h...@gmail.com>
>>> wrote:
>>>
>>>> Hi, all,
>>>>
>>>> I am using a Spark DataFrame to fetch a small table from MySQL,
>>>> and I found it costs much more than accessing MySQL directly using JDBC.
>>>>
>>>> The time cost for Spark is about 2033 ms, while direct access takes
>>>> about 16 ms.
>>>>
>>>> Code can be found at:
>>>>
>>>>
>>>> https://github.com/louishust/sparkDemo/blob/master/src/main/java/DirectQueryTest.java
>>>>
>>>> So is my Spark configuration wrong? How can I optimise Spark to
>>>> achieve performance similar to direct access?
>>>>
>>>> Any ideas would be appreciated!
>>>>
>>>>
>>
>
