> Note that there are a few aggregates that cannot be distributed (e.g. median).
You can distribute/parallelize based on the group-by key, but yes, in general
these so-called Holistic Aggregates (i.e. aggregates that must look at the
entire partition) can’t be distributed within a key without a
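To make the point about holistic aggregates concrete, here is a minimal sketch in plain Python (not Acero code). It assumes rows are `(key, value)` pairs and uses hypothetical helpers `partition_by_key`, `local_group_median` and `distributed_group_median`. Routing every row of a key to one worker keeps the median exact, whereas splitting one key's values and combining the partial medians does not.

```python
import zlib
from collections import defaultdict
from statistics import median


def partition_by_key(rows, num_workers):
    """Send each (key, value) row to the worker selected by the key's hash."""
    workers = [[] for _ in range(num_workers)]
    for key, value in rows:
        workers[zlib.crc32(key.encode("utf-8")) % num_workers].append((key, value))
    return workers


def local_group_median(rows):
    """Exact per-key median; valid because a worker holds all rows of its keys."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {key: median(values) for key, values in groups.items()}


def distributed_group_median(rows, num_workers):
    """Union of the per-worker results; the key sets of the workers are disjoint."""
    result = {}
    for worker_rows in partition_by_key(rows, num_workers):
        result.update(local_group_median(worker_rows))
    return result


rows = [("a", 1), ("b", 10), ("a", 2), ("a", 100), ("b", 20), ("a", 3), ("a", 4)]
by_key = distributed_group_median(rows, num_workers=3)

# Counter-example: split the values of key "a" across two workers and
# combine the partial medians. The result differs from the true median.
a_values = [value for key, value in rows if key == "a"]
naive = median([median(a_values[:2]), median(a_values[2:])])
```

The partitioning step is the only part that needs communication between machines; the median itself is always computed over a complete group.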
aggregate kernels already maintain such intermediate state
> internally, I wonder if it is possible to have some APIs in aggregate
> kernels to retrieve this intermediate state to enable such use cases.
> Thanks.
>
>
>
> Jiangtao
>
>
>
> *From: *Sasha Krassovsk
7, 2023 at 2:21 PM
To: user@arrow.apache.org
Subject: Re: [C++][Acero] can Acero support distributed computation?
Yes, what you’ve said is correct for Mean. But my earlier point was that there
should be only a few such special cases. A simple case would be, e.g., Max,
where Aggregate outputs Max
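The difference between Mean and Max in a pre-aggregation/post-aggregation split can be sketched in plain Python. This is not the Acero kernel API; every function name below is hypothetical. The assumption is that Mean has to ship a (sum, count) pair as its intermediate state, while for Max the partial result already is the intermediate state.

```python
from functools import reduce


# Mean: the intermediate state is (sum, count), not the mean itself.
def mean_partial(values):
    return (sum(values), len(values))


def mean_merge(left, right):
    return (left[0] + right[0], left[1] + right[1])


def mean_finalize(state):
    total, count = state
    return total / count


# Max: the partial output can be merged directly with another max.
def max_partial(values):
    return max(values)


def max_merge(left, right):
    return max(left, right)


# One chunk per machine (made-up data).
chunks = [[1, 2, 3], [10], [4, 6]]

# Pre-aggregation on each machine, post-aggregation on the coordinator.
global_mean = mean_finalize(reduce(mean_merge, (mean_partial(c) for c in chunks)))
global_max = reduce(max_merge, (max_partial(c) for c in chunks))

# Averaging the per-machine means weights every chunk equally and is wrong
# as soon as the chunks differ in size.
mean_of_means = sum(sum(c) / len(c) for c in chunks) / len(chunks)
```

With these chunks the merged state is (26, 6), so the mean comes from 26 / 6, whereas the mean of means averages 2, 10 and 5.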
s how to implement Pre-Aggregation and Post-Aggregation using Acero.
>
> Best,
> Jiangtao
>
>
> From: Sasha Krassovsky
> Date: Friday, July 7, 2023 at 1:25 PM
> To: user@arrow.apache.org
> Subject: Re: [C++][Acero] can Acero support distributed computation?
>
>
and Post-Aggregation using Acero.
Best,
Jiangtao
From: Sasha Krassovsky
Date: Friday, July 7, 2023 at 1:25 PM
To: user@arrow.apache.org
Subject: Re: [C++][Acero] can Acero support distributed computation?
Can you clarify what you mean by “data flow”? Each machine will be executing
the same
about data flow of “compute” and “merge” on different nodes?
>
>
> Best,
> Jiangtao
>
> From: Sasha Krassovsky
> Date: Friday, July 7, 2023 at 11:07 AM
> To: user@arrow.apache.org
> Subject: Re: [C++][Acero] can Acero support distributed computation?
>
> Distribut
be appreciated.
Thanks,
Jiangtao
From: Sasha Krassovsky
Date: Friday, July 7, 2023 at 10:12 AM
To: user@arrow.apache.org
Subject: Re: [C++][Acero] can Acero support distributed computation?
Hi Jiangtao,
Acero doesn’t support any distributed computation on its own. However, to get
some simple
thod on aggregation kernel? Any other tips would be appreciated.
>
> Thanks,
> Jiangtao
>
> From: Sasha Krassovsky
> Date: Friday, July 7, 2023 at 10:12 AM
> To: user@arrow.apache.org
> Subject: Re: [C++][Acero] can Acero support distributed computation?
>
> Hi
Krassovsky
Date: Friday, July 7, 2023 at 10:12 AM
To: user@arrow.apache.org
Subject: Re: [C++][Acero] can Acero support distributed computation?
Hi Jiangtao,
Acero doesn’t support any distributed computation on its own. However, to get
some simple distributed computation going it would
Hi Jiangtao,
Acero doesn’t support any distributed computation on its own. However, to get
some simple distributed computation going it would be sufficient to add a
Shuffle node. For example for Aggregation, the Shuffle would assign a range of
hashes to each node, and then each node would
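The shuffle idea can be sketched in plain Python as follows. Acero does not ship a Shuffle node, so this is only an illustration under assumptions: `owner_of`, `shuffle` and `local_sum` are hypothetical names, CRC32 stands in for whatever hash a real implementation would use, and the final step (each node aggregating the keys it owns) is my reading of where the truncated sentence above is heading.

```python
import zlib
from collections import defaultdict

HASH_SPACE = 2 ** 32  # crc32 returns values in [0, 2**32)


def owner_of(key, num_nodes):
    """Map a key to the node that owns the contiguous hash range containing it."""
    return zlib.crc32(key.encode("utf-8")) * num_nodes // HASH_SPACE


def shuffle(local_batches, num_nodes):
    """Forward every row from its source node to the node owning its key."""
    inboxes = [[] for _ in range(num_nodes)]
    for batch in local_batches:
        for key, value in batch:
            inboxes[owner_of(key, num_nodes)].append((key, value))
    return inboxes


def local_sum(rows):
    """Ordinary single-node group-by sum over the rows a node received."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return dict(totals)


# Rows as they initially sit on three nodes (made-up data).
local_batches = [
    [("x", 1), ("y", 2)],
    [("x", 3), ("z", 4)],
    [("y", 5), ("x", 6)],
]

inboxes = shuffle(local_batches, num_nodes=3)
per_node = [local_sum(rows) for rows in inboxes]

# Each key lives on exactly one node, so the global result is a plain union.
combined = {}
for partial in per_node:
    combined.update(partial)
```

Because the hash ranges are disjoint, no merge step is needed after the shuffle; the cost is moving every row across the network once.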
Hi there,
I’ve recently been learning the Acero streaming execution engine, and I’m
wondering whether Acero supports distributed computing.
I have read the code for the aggregation node and kernels; the aggregation kernels
seem to hide the details of the intermediate aggregation state. If I use multiple
nodes with Acero execution