es of “vine” (which is StringType) both from
> data and res2, and res2 is missing a lot of values:
>
> val t1 = res2.select("vine").distinct.collect
>
> scala> t1.size
> res10: Int = 617
>
> val t_real = data.select("vine").di
Never mind my last email. res2 is filtered, so my test does not make sense. The
issue is not reproduced there. I have the problem somewhere else.
From: Ellafi, Saif A.
Sent: Thursday, October 22, 2015 12:57 PM
To: 'Xiao Li'
Cc: user
Subject: RE: Spark groupby and agg inconsistent and mi
lect
scala> t_real.size
res9: Int = 639
From: Xiao Li [mailto:gatorsm...@gmail.com]
Sent: Thursday, October 22, 2015 12:45 PM
To: Ellafi, Saif A.
Cc: user
Subject: Re: Spark groupby and agg inconsistent and missing data
Hi, Saif,
Could you post your code here? It might help others reproduce the errors
and give you a correct answer.
Thanks,
Xiao Li
2015-10-22 8:27 GMT-07:00 :
> Hello everyone,
>
> I am doing some analytics experiments under a 4 server stand-alone cluster
> in a spark shell, mostly involving a
Hello everyone,
I am running some analytics experiments on a 4-server standalone cluster in a
Spark shell, mostly involving a huge database with groupBy and aggregations.
I am picking 6 groupBy columns and returning various aggregated results in a
DataFrame. GroupBy fields are of two types, m