They should be identical. Can you paste the detailed explain output?

On Thursday, March 10, 2016, FangFang Chen <lulynn_2015_sp...@163.com> wrote:
> Hi,
>
> Based on my testing, the memory cost is very different for:
> 1. sql("select * from ...").groupby.agg
> 2. sql("select ... From ... Groupby ...")
>
> For a table partition larger than 500 GB, #2 runs fine, while #1 fails with
> an OutOfMemory error. I am using the same Spark configuration for both.
> Could somebody explain why this happens?
>
> Sent from NetEase Mail Master <http://u.163.com/signature>
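
For reference, a minimal sketch of how the two plans could be printed side by side for comparison. It assumes Spark 2.x or later (SparkSession); on 1.x the same calls go through sqlContext and registerTempTable instead. The table name "events" and the columns "key" and "value" are hypothetical stand-ins for the real query, which was elided in the original message.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

object ComparePlans {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would use the cluster config.
    val spark = SparkSession.builder()
      .appName("compare-plans")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for the real table (names are assumptions).
    Seq(("a", 1L), ("a", 2L), ("b", 3L))
      .toDF("key", "value")
      .createOrReplaceTempView("events")

    // Variant 1: select everything in SQL, then aggregate with the DataFrame API.
    val viaDataFrame = spark.sql("SELECT * FROM events")
      .groupBy("key")
      .agg(sum("value"))

    // Variant 2: express the aggregation entirely in SQL.
    val viaSql = spark.sql("SELECT key, sum(value) FROM events GROUP BY key")

    // explain(true) prints the parsed, analyzed, optimized, and physical plans.
    viaDataFrame.explain(true)
    viaSql.explain(true)

    spark.stop()
  }
}
```

If the two physical plans differ (for example in which columns are read from the source), that difference is the likely place to look for the extra memory use.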