Hi Wilson,
the OuterJoinMapFunction can be done, just as Chesnay said.
Initialise it with a JoinFunction object in the constructor and call its
join() function in the map() call.
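A minimal sketch of that wrapper idea, in case it helps. To keep it compilable without Flink on the classpath, `JoinFn` and `MapFn` are hypothetical stand-ins for Flink's JoinFunction and MapFunction, and a plain `Map.Entry` stands in for the tuple that carries the two sides. None of these names come from this thread or from Flink.

```java
import java.util.Map;

// Hypothetical stand-in for Flink's JoinFunction (assumption, not the real interface).
interface JoinFn<L, R, O> {
    O join(L left, R right);
}

// Hypothetical stand-in for Flink's MapFunction (assumption, not the real interface).
interface MapFn<I, O> {
    O map(I value);
}

// Wrapper: receives the join function via the constructor and delegates to it in map().
class OuterJoinMapFunction<L, R, O> implements MapFn<Map.Entry<L, R>, O> {
    private final JoinFn<L, R, O> joinFunc;

    OuterJoinMapFunction(JoinFn<L, R, O> joinFunc) {
        this.joinFunc = joinFunc;
    }

    @Override
    public O map(Map.Entry<L, R> pair) {
        // Either side may be null when the row had no match on the other input.
        return joinFunc.join(pair.getKey(), pair.getValue());
    }
}
```

With this shape, the user-supplied join function has to handle a null on either side itself; the wrapper only forwards the pair.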
You can also pass the JoinFunction directly to the CoGroupFunction and call
it in cogroup() instead of building the Tup
Hey Wilson,
the MapFunction should act as a wrapper for the join function. Create a
class extending RichMapFunction and pass the JoinFunction via the
constructor. Then delegate the open/close calls to it, with the map
function looking something like this:
map(Tuple2<...> tuple) {
return joinFunc
Hi Fabian,
Your response is very helpful! To make sure I understand it
correctly, here is my pseudo-code first:
class OuterJoinCoGroupFunction implements CoGroupFunction, Tuple2, Double>{
@Override
public void coGroup(Iterable > iVals,
Iterable > dVals, Collector out
That's a good point.
You can implement an outer join using the available runtime. This way you
do not need to touch the optimizer and runtime but only the API layer.
This basically means adding syntactic sugar to the available API. The API
will translate the outer join into a CoGroup which builds
Hi Wilson!
You can start by mocking an outer join operator using a special CoGroup
function. If one of the two sides for a group is empty, you have the case
where you need to append null values. Otherwise, you build the Cartesian
product within the group.
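Roughly, the per-group logic described above could look like the sketch below. It is plain Java and does not use Flink's CoGroupFunction or Collector; the class name `OuterJoinCoGroupSketch`, the method `coGroup`, and the use of `BiFunction` as the join function are assumptions made only for illustration. It handles a single key group and returns the results as a list instead of emitting them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical helper (not a Flink class): full outer join logic for ONE key group.
class OuterJoinCoGroupSketch {
    static <L, R, O> List<O> coGroup(Iterable<L> left,
                                     Iterable<R> right,
                                     BiFunction<L, R, O> joinFunc) {
        // Materialise both sides, since an Iterable may only be traversable once.
        List<L> ls = new ArrayList<>();
        left.forEach(ls::add);
        List<R> rs = new ArrayList<>();
        right.forEach(rs::add);

        List<O> out = new ArrayList<>();
        if (ls.isEmpty()) {
            // No left match: pair every right element with null.
            for (R r : rs) out.add(joinFunc.apply(null, r));
        } else if (rs.isEmpty()) {
            // No right match: pair every left element with null.
            for (L l : ls) out.add(joinFunc.apply(l, null));
        } else {
            // Both sides present: Cartesian product within the group.
            for (L l : ls) {
                for (R r : rs) out.add(joinFunc.apply(l, r));
            }
        }
        return out;
    }
}
```

Buffering both sides keeps the sketch simple, but it holds the whole group in memory, so it is only meant to show the control flow.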
For a proper through-the-stack implementa
Hi,
I am trying to pick up the outer join operator. However, as Fabian mentioned to
me, this task would require touching many different components of the
system, so it would be a challenging job for me. Therefore I would need some help :-)
I might need to walk through some features like Compiler