Hi Fabian,
My GroupReduce function sums one column of the input rows of each group.
My key fields are an array of multiple types, in this case string and long.
The result that I'm posting is just a sample of the output dataset.
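
For readers following the thread, here is a minimal plain-Java sketch of the kind of per-group sum described above. It does not use Flink; the `Row` class, its field names, and `sumByKey` are hypothetical and only stand in for the real input type and GroupReduceFunction. The composite key is built with `Arrays.asList` so that two keys with the same string and long compare as equal by value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupSumSketch {
    // Hypothetical row layout: a String key part, a long key part, and the column to sum.
    static final class Row {
        final String name;
        final long id;
        final double value;

        Row(String name, long id, double value) {
            this.name = name;
            this.id = id;
            this.value = value;
        }
    }

    // Sums the value column for every distinct (name, id) pair.
    static Map<List<Object>, Double> sumByKey(List<Row> rows) {
        Map<List<Object>, Double> sums = new LinkedHashMap<>();
        for (Row r : rows) {
            // A List has value-based equals/hashCode; a raw Object[] would compare by identity.
            List<Object> key = Arrays.<Object>asList(r.name, r.id);
            sums.merge(key, r.value, Double::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("a", 1L, 1.0));
        rows.add(new Row("a", 1L, 2.0));
        rows.add(new Row("b", 2L, 5.0));
        System.out.println(sumByKey(rows));
    }
}
```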
Thank you in advance!
Anissa
On Thu, Aug 22, 2019 at 11:24, Fabian
istic by design to enable efficient partial
> results without network and disk IO.
> reduceGroup is deterministic given a deterministic key extractor and
> deterministic GroupReduceFunction.
>
> Hope this helps,
> Fabian
>
> On Tue, Aug 20, 2019 at 14:21, anissa wrote
Hi,
I used the combineGroup function to reduce groups of a very large dataset.
With the parallelism set to 1 I get different results than with the
parallelism set to 8, and the correct results are those obtained with a
parallelism of 1.
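
A plain-Java sketch (no Flink dependency) may help show why a combine step alone can give different output at different parallelism. It assumes, purely for illustration, that rows are spread round-robin over the partitions and that each partition sums its own rows without exchanging data; `partialSums` and `finalSums` are hypothetical names, not Flink API. With one partition the partial sums are already the totals; with several, the same key can appear in more than one partial result until a final reduce merges them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CombineSketch {
    // Combine step: each partition sums its own rows per key, with no data exchange.
    // Round-robin assignment (i % parallelism) is an assumption made for this sketch.
    static List<Map<String, Long>> partialSums(String[] keys, long[] values, int parallelism) {
        List<Map<String, Long>> partitions = new ArrayList<>();
        for (int p = 0; p < parallelism; p++) {
            partitions.add(new LinkedHashMap<>());
        }
        for (int i = 0; i < keys.length; i++) {
            partitions.get(i % parallelism).merge(keys[i], values[i], Long::sum);
        }
        return partitions;
    }

    // Final reduce step: merge the partial sums of all partitions into one total per key.
    static Map<String, Long> finalSums(List<Map<String, Long>> partials) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Map<String, Long> partial : partials) {
            for (Map.Entry<String, Long> e : partial.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "a", "a", "b"};
        long[] values = {1, 2, 3, 4};
        System.out.println(partialSums(keys, values, 1));               // [{a=6, b=4}]
        System.out.println(partialSums(keys, values, 2));               // [{a=4}, {a=2, b=4}]
        System.out.println(finalSums(partialSums(keys, values, 2)));    // {a=6, b=4}
    }
}
```

In this sketch the partial results differ between one and two partitions, while the merged totals are the same in both cases.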
I also used the Table API to group the dataset and select sum
n) is one
> option that I’ve used for cases like this.
>
> See MapPartitionFunction
> <https://ci.apache.org/projects/flink/flink-docs-release-1.7/api/java/org/apache/flink/api/common/functions/MapPartitionFunction.html>
> in
> the JavaDocs.
>
> — Ken
>
>
> O
-- Forwarded message -
From: Piotr Nowojski
Date: Thu, Mar 21, 2019 at 14:09
Subject: Re: StochasticOutlierSelection
To: anissa moussaoui, user <user@flink.apache.org>
(Adding back user mailing list)
Hi Anissa,
Thank you for coming back with the results. I hope this mi
Hello,
I created an anomaly detection process with a flatMap. I need to know
when each job ends, at the level of the flatMap, so that I can flush a
buffer to the output collector.
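
One way to picture the requirement is a function that receives the whole partition as an iterable, so that the end of the loop marks the end of the input. This is the same shape as Flink's `MapPartitionFunction.mapPartition(Iterable<T> values, Collector<O> out)`, but the sketch below is plain Java: a `Consumer` stands in for the collector, and `processPartition` and `BATCH_SIZE` are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

final class FlushAtEndSketch {
    // Hypothetical batch size, chosen only for illustration.
    static final int BATCH_SIZE = 3;

    // Buffers incoming values, emits full batches, and flushes the remainder at end of input.
    static void processPartition(Iterable<Integer> values, Consumer<List<Integer>> out) {
        List<Integer> buffer = new ArrayList<>();
        for (Integer v : values) {
            buffer.add(v);
            if (buffer.size() == BATCH_SIZE) {
                out.accept(new ArrayList<>(buffer));
                buffer.clear();
            }
        }
        // The loop ending means the partition is exhausted: flush whatever is left.
        if (!buffer.isEmpty()) {
            out.accept(new ArrayList<>(buffer));
        }
    }

    public static void main(String[] args) {
        List<List<Integer>> emitted = new ArrayList<>();
        processPartition(Arrays.asList(1, 2, 3, 4, 5), emitted::add);
        System.out.println(emitted); // [[1, 2, 3], [4, 5]]
    }
}
```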
I saw that it is possible to get the status of a job by using ExecutionEnvironment,
but I don't know how I can impleme
advance !
Best,
Anissa MOUSSAOUI