Hello, I have a job that's a series of Joins, GroupBys, and Aggs, and it's bottlenecked in one of the joins. That join has ~300 million rows on the left and ~200 million rows on the right, all with unique keys. I'm seeing this in the plan for the bottlenecked join:
Join(joinType=[InnerJoin], where=[(user_id = id0)], select=[id, group_id, user_id, uuid, owner, id0, deleted_at], leftInputSpec=[HasUniqueKey], rightInputSpec=[JoinKeyContainsUniqueKey])

The join condition is basically (left.user_id === right.id), so `id0` must be right.id here.

My first question is: what is the difference between leftInputSpec=[HasUniqueKey] and rightInputSpec=[JoinKeyContainsUniqueKey]? Is the left side hashing on its primary key `id` instead of the join key, which would perform poorly? Does anything else about this plan stand out?

Thanks!

-- 
Rex Fenley | Software Engineer - Mobile and Backend
Remind.com <https://www.remind.com/> | BLOG <http://blog.remind.com/> | FOLLOW US <https://twitter.com/remindhq> | LIKE US <https://www.facebook.com/remindhq>
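
P.S. To make the question concrete, here is how I am currently reading the two spec values, as a small Python sketch. This is only my guess from the names and is not taken from the Flink source. The function `input_spec` is hypothetical, and I am assuming the left table's only unique key is `id` and the right table's is `id0`.

```python
from typing import Iterable, List


def input_spec(unique_keys: Iterable[Iterable[str]], join_key: Iterable[str]) -> str:
    """Guess the spec label for one join input.

    ASSUMPTION: this mirrors what the names in the plan suggest,
    not necessarily what the planner actually does.
    """
    join_fields = frozenset(join_key)
    # Ignore empty key definitions so they cannot trivially match.
    keys: List[frozenset] = [frozenset(k) for k in unique_keys if k]

    if not keys:
        return "NoUniqueKey"
    # Some unique key is fully covered by the join key fields.
    if any(k <= join_fields for k in keys):
        return "JoinKeyContainsUniqueKey"
    # The input has a unique key, but the join key does not cover it.
    return "HasUniqueKey"


# Left side: unique on `id`, joined on `user_id`.
left_spec = input_spec([["id"]], ["user_id"])
# Right side: unique on `id0`, joined on `id0`.
right_spec = input_spec([["id0"]], ["id0"])

print(left_spec, right_spec)
```

If this reading is right, the spec only describes how each side's unique key relates to the join key, and says nothing by itself about which field is used for hashing. That is the part I would like confirmed.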