Re: dataframe slow down with tungsten turn on
Yes, the same code, the same result. In fact, the code has been running for more than a month. Before 1.5.0, the performance was consistently about the same, so I suspect that it is caused by Tungsten.

Gen

On Wed, Nov 4, 2015 at 4:05 PM, Rick Moritz wrote:

> Something to check (just in case):
> Are you getting identical results each time?
>
> On Wed, Nov 4, 2015 at 8:54 AM, gen tang wrote:
>
>> Hi sparkers,
>>
>> I am using DataFrames to do some large ETL jobs.
>> More precisely, I create a DataFrame from a Hive table and do some
>> operations. Then I save it as JSON.
>>
>> When I used spark-1.4.1, the whole process was quite fast, about 1 minute.
>> However, when I use the same code with spark-1.5.1 (with Tungsten turned on),
>> it takes about 2 hours to finish the same job.
>>
>> I checked the details of the tasks; almost all the time is consumed by
>> computation.
>>
>> Any idea why this happens?
>>
>> Thanks a lot in advance for your help.
>>
>> Cheers
>> Gen
Re: dataframe slow down with tungsten turn on
Something to check (just in case):
Are you getting identical results each time?

On Wed, Nov 4, 2015 at 8:54 AM, gen tang wrote:

> Hi sparkers,
>
> I am using DataFrames to do some large ETL jobs.
> More precisely, I create a DataFrame from a Hive table and do some operations.
> Then I save it as JSON.
>
> When I used spark-1.4.1, the whole process was quite fast, about 1 minute.
> However, when I use the same code with spark-1.5.1 (with Tungsten turned on),
> it takes about 2 hours to finish the same job.
>
> I checked the details of the tasks; almost all the time is consumed by
> computation.
>
> Any idea why this happens?
>
> Thanks a lot in advance for your help.
>
> Cheers
> Gen
dataframe slow down with tungsten turn on
Hi sparkers,

I am using DataFrames to do some large ETL jobs. More precisely, I create a DataFrame from a Hive table and do some operations. Then I save it as JSON.

When I used spark-1.4.1, the whole process was quite fast, about 1 minute. However, when I use the same code with spark-1.5.1 (with Tungsten turned on), it takes about 2 hours to finish the same job.

I checked the details of the tasks; almost all the time is consumed by computation.

Any idea why this happens?

Thanks a lot in advance for your help.

Cheers
Gen