Re: [sqlite] Feature request: merge joins (preferably through a hint)
On 7 Nov 2017, at 7:59am, Davor Josipovicwrote: > What sqlite does now is for each "a" it searches through the index for "x". If an ideal index already exists, accessing the correct records will be fast. If one does not exist, how would you expect a merge join to be any faster ? There are specific cases where a merge join is faster than using JOIN … ORDER BY. For that to happen, both source tables must already have indexes ideally suited to the merge join, and the rows which you’re going to want returned must be a very large proportion of both source tables, probably the whole tables. Also, SQLite has to be aware of those facts, it cannot simply assume them. Except for the above cases, existing formats will be just as fast, and can be far faster, especially in cases where the rows wanted do not represent most of the rows of the existing tables. Merge joins also represent a problem where you have to compare the two available rows. There’s no good way to know what the programmer means by this, especially in cases involving many columns and collation methods. Assumptions have to be made and whatever the development team picks is sure to annoy some users. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Feature request: merge joins (preferably through a hint)
> You are thinking that perhaps queries such as the following might > be faster using a merge: > > SELECT * FROM tab1 JOIN tab2 ON tab1.a=tab2.x; > > I disagree. I don't see any reason to disagree. Merge join will definitely be faster if the data is already sorted. See the reference: https://en.wikipedia.org/wiki/Sort-merge_join. It is a linear time operation. What sqlite does now is for each "a" it searches through the index for "x". This search operation is logarithmic time. If there are many records in tab1, then this stacks and becomes quasilinear time. I experience this constantly with sqlite data wrangling and tab1 and 2 in the millions. sqlite's nested loops are very fast, but the joins _could_ be made much faster with merge joins in such situations. I just wish I had this hint... ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Feature request: merge joins (preferably through a hint)
On 5 Nov 2017, at 11:04am, Richard Hippwrote: > SQLite does do a merge in some cases, though not for what you would > traditionally call a join. For example, SQLite will do a merge to > combine the two halves of this query: > >SELECT a,b,c FROM tab1 UNION SELECT x,y,z FROM tab2 ORDER BY 1,2,3; In case it’s not clear from the above, SQL processes the UNION before the ORDER BY clause. If "merge join" didn’t already have a definition, we could use the term for that. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Feature request: merge joins (preferably through a hint)
On 11/5/17, Davor Josipovicwrote: > Merge joins could be an incredible optimization in some cases for large > queries and would make sqlite much faster in such cases. SQLite does do a merge in some cases, though not for what you would traditionally call a join. For example, SQLite will do a merge to combine the two halves of this query: SELECT a,b,c FROM tab1 UNION SELECT x,y,z FROM tab2 ORDER BY 1,2,3; You are thinking that perhaps queries such as the following might be faster using a merge: SELECT * FROM tab1 JOIN tab2 ON tab1.a=tab2.x; I disagree. In order to do this as a merge, we'd need indexes on both tab1.a and tab2.x. (In order for the merge to be practical, and not require an arbitrary amount of auxiliary storage, both indexes would need to be UNIQUE.) But if you already either one of those two indexes, then the nested-loop join will already be blazing fast. It is difficult to see how switching to a merge join would make it go any faster. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Feature request: merge joins (preferably through a hint)
Are there any plans to implement merge joins in sqlite? As far as I am aware, only nested loops are currently supported. Merge joins could be an incredible optimization in some cases for large queries and would make sqlite much faster in such cases. Personally, I would like to have this option rather as a sql HINT, than as an optimizer option, since the optimizer now, small and efficient as it is, does a great job. The merge join HINT could be used to greatly optimize specific queries - i.e. queries where poking the index many many times on an already ordered set is inefficient. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users