Hi Shan,
If you happen to have a lot of repeated data (that is, many rows sharing
the same values in the most general grouping), you might get some speedup
from a little pre-aggregation. The following code should produce the same
results as the example in your first post:
FROM (
  SELECT a, b, c, count(*) AS cnt
  FROM X
  GROUP BY a, b, c
)
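
As a rough sketch of how a pre-aggregated subquery like this could feed
several group-bys in one multi-insert statement: the alias pre and the
target tables out_a and out_ab are made up for illustration and are not
taken from your post, so substitute your own tables and groupings.

```sql
-- Hypothetical example: out_a and out_ab are placeholder target tables.
FROM (
  -- Collapse duplicate (a, b, c) rows once, keeping their count.
  SELECT a, b, c, count(*) AS cnt
  FROM X
  GROUP BY a, b, c
) pre
-- Each coarser group-by sums the partial counts instead of
-- counting the raw rows again.
INSERT OVERWRITE TABLE out_a
  SELECT a, sum(cnt) GROUP BY a
INSERT OVERWRITE TABLE out_ab
  SELECT a, b, sum(cnt) GROUP BY a, b;
```

Plain counts roll up as sum(cnt). Other aggregates would likely need
their own partial columns in the subquery.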
On 6/4/12, shan s wrote:
Thanks for the explanation, Jan.
If I understand correctly, the input is read only once and preprocessed
in some form, and this intermediate data is then used for the subsequent
group-bys.
I'm not sure this single step will help in my scenario, since the
group-bys vary across vast entities.
If
> On Fri, Jun 1, 2012 at 5:25 PM, shan s wrote:
>
>> I am using Multi-GroupBy-Insert. I was expecting a single map-reduce job
>> that would combine the group-bys.
>> However, it is scheduling n jobs, where n is the number of group-bys.
>> Could you please explain this behaviour?
>>
>>
>
No, it wi
Anyone?
Thanks.
On Fri, Jun 1, 2012 at 5:25 PM, shan s wrote:
> I am using Multi-GroupBy-Insert. I was expecting a single map-reduce job
> that would combine the group-bys.
> However, it is scheduling n jobs, where n is the number of group-bys.
> Could you please explain this behaviour?
>
> Fr