Re: Why is shuffle write size so large when joining Dataset with nested structure?

2016-11-27 Thread Zhuo Tao
Hi Takeshi,

Thank you for your comment. I changed it to RDD and it's a lot better.

Zhuo

On Fri, Nov 25, 2016 at 7:04 PM, Takeshi Yamamuro wrote:
> Hi,
>
> I think this is just the overhead to represent nested elements as internal
> rows on-runtime
> (e.g., it consumes

Re: spark-submit hangs forever after all tasks finish(spark 2.0.0 stable version on yarn)

2016-07-31 Thread Zhuo Tao
Yarn client

On Sunday, July 31, 2016, Pradeep wrote:
> Hi,
>
> Are you running on yarn-client or cluster mode?
>
> Pradeep
>
> > On Jul 30, 2016, at 7:34 PM, taozhuo wrote:
> >
> > below is the error messages that seem run infinitely: