Re: Performance with Insert overwrite into Hive Table.

2016-05-04 Thread Bijay Kumar Pathak
Thanks Ted. This looks like the issue, since I am running it in EMR and the Hive version is 1.0.0.
Thanks,
Bijay
On Wed, May 4, 2016 at 10:29 AM, Ted Yu wrote:
> Looks like you were hitting HIVE-11940
>
> On Wed, May 4, 2016 at 10:02 AM, Bijay Kumar Pathak

Re: Performance with Insert overwrite into Hive Table.

2016-05-04 Thread Ted Yu
Looks like you were hitting HIVE-11940
On Wed, May 4, 2016 at 10:02 AM, Bijay Kumar Pathak wrote:
> Hello,
>
> I am writing a DataFrame of more than 60 GB into a partitioned Hive table using
> hiveContext in Parquet format. The Spark insert overwrite job completes in
> a reasonable

Performance with Insert overwrite into Hive Table.

2016-05-04 Thread Bijay Kumar Pathak
Hello, I am writing a DataFrame of more than 60 GB into a partitioned Hive table using hiveContext in Parquet format. The Spark insert overwrite job completes in a reasonable amount of time, around 20 minutes. But the job then takes a huge amount of time, more than 2 hours, to copy data from .hivestaging
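
For readers following the thread, here is a minimal sketch of the kind of write described above, assuming the Spark 1.x Python API (HiveContext). The helper names, the temporary table name, and the use of dynamic partitioning are assumptions made for illustration and are not taken from the original message. The sketch only shows the insert overwrite itself and does not address the slow copy step.

```python
def build_insert_overwrite(target_table, source_table, partition_cols):
    """Build an INSERT OVERWRITE statement that uses dynamic partitioning.

    Hypothetical helper: all table and column names are supplied by the caller.
    """
    if not partition_cols:
        raise ValueError("at least one partition column is required")
    partition_spec = ", ".join(partition_cols)
    return (
        "INSERT OVERWRITE TABLE {t} PARTITION ({p}) SELECT * FROM {s}"
    ).format(t=target_table, p=partition_spec, s=source_table)


def write_partitioned(hive_context, df, target_table, partition_cols,
                      temp_name="staging_df"):
    """Overwrite a partitioned Hive table with the rows of a DataFrame.

    Assumes `hive_context` is a Spark 1.x HiveContext and that the partition
    columns are the last columns of `df`, in the order given, because
    SELECT * maps columns to the target table by position.
    """
    # Dynamic partitioning has to be enabled for PARTITION (col, ...) without values.
    hive_context.setConf("hive.exec.dynamic.partition", "true")
    hive_context.setConf("hive.exec.dynamic.partition.mode", "nonstrict")
    # Expose the DataFrame to SQL under a temporary name (Spark 1.x API).
    df.registerTempTable(temp_name)
    hive_context.sql(build_insert_overwrite(target_table, temp_name, partition_cols))
```

A call might look like `write_partitioned(hiveContext, df, "mydb.events", ["year", "month"])`, where `mydb.events` and the partition columns are placeholders.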