On Sat, 21 Mar 2015 19:01:16 +0100 "Mario M. Westphal" <mw at mwlabs.de> wrote:
> For now I have increased the threshold for IN clauses (instead of > TEMP tables) and use WHERE IN (SELECT ? from TEMP) instead of a JOIN. Because the two are conceptually different, it's not surprising they run differently. IN is an existence test: does the value appear in the set? JOIN is a product: produce every matching combination. For some queries, they reduce to the same thing. For example, FROM A JOIN B ON A.b = B.b is the same as IN if B.b 1. is unique, and 2. no other columns from B are used. The query optimizer has to be sophisticated enough to recognize those conditions, which is unlikely in the case of a temporary table. There is also the question of linear vs binary searches. When you supply a list of constants to IN, most if not all DBMSs search the list sequentially. When IN (or EXISTS) is supplied from an indexed column, the search is often binary. For a small number of elements, there's no distinction. For 1000 elements, it's 2 orders of magnitude: 1000 hops versus 10. --jkl