BTW, 1 minute vs. 2 hours seems quite odd. Can you provide more information on 
the ETL work?

From: Cheng, Hao [mailto:hao.ch...@intel.com]
Sent: Thursday, November 5, 2015 12:56 PM
To: gen tang; dev@spark.apache.org
Subject: RE: dataframe slow down with tungsten turn on

1.5 has critical performance issues and bugs; you’d better try 1.5.1 or the 
1.5.2 RC.

From: gen tang [mailto:gen.tan...@gmail.com]
Sent: Thursday, November 5, 2015 12:43 PM
To: dev@spark.apache.org
Subject: Fwd: dataframe slow down with tungsten turn on

Hi,

In fact, I tested the same code on Spark 1.5 with Tungsten turned off. The 
result is about the same as with Tungsten turned on.
So the problem does not seem to be Tungsten; Spark 1.5 is simply slower than 
Spark 1.4 for this job.
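
For anyone who wants to repeat the on/off comparison, here is a minimal sketch. It assumes Spark 1.5, where the SQL option `spark.sql.tungsten.enabled` controls Tungsten, and an existing SQLContext or HiveContext. The helper name `timed_run` and the `job` callable are hypothetical, not taken from the original code.

```python
import time


def timed_run(sql_context, tungsten_enabled, job):
    """Set the Tungsten flag, run job(), and return elapsed wall-clock seconds.

    sql_context: an existing SQLContext / HiveContext (assumed, Spark 1.5).
    job: a zero-argument callable that triggers the ETL work (hypothetical).
    """
    # Spark 1.5 SQL option; the value is passed as a string.
    flag = "true" if tungsten_enabled else "false"
    sql_context.setConf("spark.sql.tungsten.enabled", flag)

    start = time.time()
    job()  # must end in an action (e.g. a write), otherwise nothing executes
    return time.time() - start
```

Calling it once with `tungsten_enabled=True` and once with `False` on the same job gives two timings that can be compared directly.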

Any idea why this happens?
Thanks a lot in advance.

Cheers
Gen


---------- Forwarded message ----------
From: gen tang <gen.tan...@gmail.com>
Date: Wed, Nov 4, 2015 at 3:54 PM
Subject: dataframe slow down with tungsten turn on
To: u...@spark.apache.org

Hi sparkers,

I am using DataFrames to do some large ETL jobs.
More precisely, I create a DataFrame from a Hive table, apply some operations, 
and then save the result as JSON.
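
To make the shape of such a job concrete, here is a rough sketch of this kind of pipeline. It is not the original code: the function name `run_etl`, the table name, and the output path are hypothetical, and `dropDuplicates()` only stands in for the unspecified operations. It assumes a HiveContext is passed in as `sql_context`.

```python
def run_etl(sql_context, table_name, output_path):
    """Read a Hive table, apply a placeholder transformation, write JSON.

    sql_context: an existing HiveContext (assumed).
    table_name, output_path: hypothetical values supplied by the caller.
    """
    # DataFrame backed by the Hive table.
    df = sql_context.table(table_name)

    # Placeholder for "some operations"; the real ones are not given here.
    transformed = df.dropDuplicates()

    # DataFrameWriter.json is available from Spark 1.4 onwards.
    transformed.write.json(output_path)
    return transformed
```

With a HiveContext named `sqlContext`, this would be called as `run_etl(sqlContext, "some_table", "/some/output/path")`.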

With spark-1.4.1, the whole process is quite fast, about 1 minute. 
However, when I run the same code with spark-1.5.1 (with Tungsten turned on), 
it takes about 2 hours to finish the same job.

I checked the task details; almost all the time is spent on computation.
Any idea about why this happens?

Thanks a lot in advance for your help.

Cheers
Gen

