Hi Sparkers, I am using DataFrames to run some large ETL jobs. More precisely, I create a DataFrame from a Hive table, apply some operations to it, and then save the result as JSON.
When I used Spark 1.4.1, the whole process was quite fast, about 1 minute. However, when I run the same code with Spark 1.5.1 (with Tungsten turned on), it takes about 2 hours to finish the same job. I checked the task details, and almost all of the time is spent on computation. Any idea why this happens? Thanks a lot in advance for your help.

Cheers,
Gen