Question about SparkSQL joins on non-equality conditions

gen tang Sun, 09 Aug 2015 06:09:08 -0700

Hi,

I have a possibly naive question about how Spark SQL implements joins on
non-equality conditions, for instance a join condition of the form
condition1 OR condition2.


In fact, Hive doesn't support such joins, as it is very difficult to express
such conditions as a map/reduce job. Spark SQL, however, does support them,
so I would like to know how Spark implements them.

Since I observe that such joins run very slowly, I guess that Spark implements
them by applying a filter on top of a cartesian product. Is that true?
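
To make the guess concrete, here is a minimal sketch in plain Python of what
I mean by a filter on top of a cartesian product. It is not Spark's code and
says nothing about what Spark actually does; the function name, the sample
rows, and the predicate are all made up for illustration.

```python
from itertools import product


def cartesian_filter_join(left, right, predicate):
    """Enumerate every (left_row, right_row) pair and keep the matching ones."""
    return [
        (lrow, rrow)
        for lrow, rrow in product(left, right)
        if predicate(lrow, rrow)
    ]


# Hypothetical sample rows of the form (id, value).
left = [(1, 10), (2, 20), (3, 30)]
right = [(1, 99), (4, 20), (5, 50)]


def either_matches(lrow, rrow):
    # condition1 OR condition2: the ids match, or the values match.
    return lrow[0] == rrow[0] or lrow[1] == rrow[1]


result = cartesian_filter_join(left, right, either_matches)

# Number of pairs the predicate is evaluated on: every combination of rows.
pairs_examined = len(left) * len(right)
```

With this approach the predicate is tested on all len(left) * len(right)
pairs, which would explain the slowness on large tables if my guess is right.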

Thanks in advance for your help.

Cheers
Gen
