Make sure you don't have two master instances running on the same machine. This can happen if you tried to stop the cluster while the job was still running, the stop did not complete, and you then ran start-all again. You would end up with two master instances running, and the former one would still have your data computed/cached somewhere in memory.
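One way to check for this is to look for more than one process whose command line mentions org.apache.spark.deploy.master.Master, the main class of the standalone Master. Here is a minimal Python sketch, assuming a Linux machine where `ps -eo pid,args` is available. The helper names find_master_pids and running_master_pids are made up for illustration and are not part of Spark.

```python
import os
import subprocess

# Main class of the Spark standalone Master daemon.
MASTER_CLASS = "org.apache.spark.deploy.master.Master"


def find_master_pids(ps_output):
    """Return the PIDs of processes in `ps -eo pid,args` output whose
    command line mentions the standalone Master class."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)  # -> [pid, command line]
        if len(parts) == 2 and parts[0].isdigit() and MASTER_CLASS in parts[1]:
            pids.append(int(parts[0]))
    return pids


def running_master_pids():
    """Run ps (assumption: Linux-style ps) and list Master PIDs,
    skipping this script's own process."""
    out = subprocess.run(
        ["ps", "-eo", "pid,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [pid for pid in find_master_pids(out) if pid != os.getpid()]


if __name__ == "__main__":
    pids = running_master_pids()
    if len(pids) > 1:
        print("More than one Master is running:", pids)
    else:
        print("Master PIDs found:", pids)
```

If this reports more than one PID, the leftover Master would need to be stopped before starting the cluster again.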

Thanks
Best Regards

On Mon, Mar 9, 2015 at 11:45 AM, Dai, Kevin <yun...@ebay.com> wrote:

> Hi, guys
>
> I encounter a strange problem as follows:
>
> I joined two tables(which are both parquet files) and then did the
> groupby. The groupby took 19 hours to finish.
>
> However, when I kill this job twice in the groupby stage. The third try
> will su
>
> But after I killed this job and run it again. It succeeded and finished in
> 15mins.
>
> What’s wrong with it?
>
> Best Regards,
> Kevin.