I've logged an issue for an alternative implementation of semi-joins as https://issues.apache.org/jira/browse/OPTIQ-379.
I believe the amount of memory required (see the issue) should be a good cost measure for the current enumerable implementation (no spill to disk, no materializations, etc.).

Vladimir
