Hmm, just found that there is no JoinHint that would allow what I described
above.
Broadcasting one input and using the other one to build a hash table is
usually not a good thing to do, because the broadcast side should be much
smaller than the other one...
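
To illustrate the point about sizes, here is a minimal plain-Java sketch of a hash join in which the small (broadcast) input is also the build side. It is not Flink code and does not use any JoinHint; the class BroadcastHashJoinSketch, the Row type, and the hashJoin method are hypothetical names made up for this example. Each parallel worker would receive a full copy of the small input, build a hash table from it, and probe that table with its local partition of the large input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Flink.
class BroadcastHashJoinSketch {

    // A simple keyed record.
    static final class Row {
        final int key;
        final String value;

        Row(int key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Builds a hash table from buildSide, then streams probeSide through it.
    // Memory use grows with buildSide, so buildSide should be the small input.
    static List<String> hashJoin(List<Row> buildSide, List<Row> probeSide) {
        Map<Integer, List<Row>> table = new HashMap<>();
        for (Row b : buildSide) {
            table.computeIfAbsent(b.key, k -> new ArrayList<>()).add(b);
        }
        List<String> out = new ArrayList<>();
        for (Row p : probeSide) {
            List<Row> matches = table.getOrDefault(p.key, Collections.<Row>emptyList());
            for (Row b : matches) {
                out.add(b.value + "|" + p.value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The small input stands in for the broadcast side.
        List<Row> small = Arrays.asList(new Row(1, "a"), new Row(2, "b"));
        // The large input stands in for one local partition of the other side.
        List<Row> large = Arrays.asList(
                new Row(1, "x"), new Row(1, "y"), new Row(2, "z"), new Row(3, "w"));
        System.out.println(hashJoin(small, large)); // prints [a|x, a|y, b|z]
    }
}
```

Swapping the arguments returns the same pairs in a different order, but the hash table then holds the large input, which is the situation the paragraph above warns about.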
2014-10-31 21:56 GMT+01:00 Fabian H
Hi Viktor,
welcome to the dev mailing list! :-)
I agree that Flink's aggregations should be improved in various aspects:
- support more aggregation functions. Currently only MIN, MAX, SUM are
supported. Adding COUNT and AVG would be nice!
- support for multiple aggregations per field
- support fo
Just had another idea.
The group-wise crossing that you are doing is actually a self-join on the
grouping key.
The system currently has no special strategy to deal with self-joins. That
means both inputs of the join (which are identical) are treated as two
individual inputs. If you force a broadcast
Hi everybody,
First, I want to introduce myself to the community. I am a PhD student who
wants to work with and improve Flink.
Second, I thought I would start by improving aggregations. My first goal
is to simplify the computation of a field average. Basically, I want to turn
this plan:
Kostas Tzoumas created FLINK-1200:
-
Summary: Add count() aggregate function to Java and Scala APIs
Key: FLINK-1200
URL: https://issues.apache.org/jira/browse/FLINK-1200
Project: Flink
Issue T