Hi Raghu,Is this the right assumption that if results are not aggregated we may
have inconsistency in what the final result may look like?What would be the
best aggregation approach to guarantee consistency? Even if there is perf.
cost.Thanks
From: Raghu Angadi <[email protected]>
To: [email protected]; amir bahmanyari <[email protected]>
Sent: Friday, July 29, 2016 2:32 PM
Subject: Re: Any data gets cached?
I suggest you try aggregating using Beam primitives (GroupByKey, count etc),
see if it produced in consistent results.
On Fri, Jul 29, 2016 at 12:09 PM, amir bahmanyari <[email protected]> wrote:
Sorry colleagues,I am having a moving target in my Beam app with FlinkRunner in
a Flink Cluster of 2 nodes.Every run produces a different result while we know
what the result MUST be: its an expected fixed number.I checked and see Kafka
is NOT sending any extra records.My first suspect was Redis thread-UNsafe
hashmap objects.I replaced them with Java ConcurrentHashMaps. Wow! it worked
for the very first time every PERFECT.I then re-run exact same thing expecting
the exact same previous result.But its different. Run it again. Another
different wrong result. A different result each time.No code change nothing
different.
I am wondering if some previous data/record gets cached by
Beam/FlinkRunner/KafkaIO invocation etc. somewhere.Sorry for the long email. Am
losing my mind catching this moving target :))Appreciate your kind feedback on
this.Cheers+have a great weekend.Amir-