Re: [sqlite] Please help me optimize this LEFT JOIN query.
On Sep 5, 2013, at 9:45 PM, Yuzem wrote: > It is incredibly fast but it gives me the wrong result: Yeah… you forgot the 'where' clause in the subquery… you need to restrict it on the genre ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Please help me optimize this LEFT JOIN query.
Thanks Petite Abeille, I translated your code to this: SELECT genres.genres, ( SELECT substr(group_concat(name,' '),1,60) FROM ( SELECTname FROM movies JOIN genres ONmovies.movies = genres.movies ORDER BY movies.name LIMIT 6 ) ) AS sample FROM genres GROUP BY genres ORDER BY genres.genres; time: 0.028s It is incredibly fast but it gives me the wrong result: Action|American Graffiti American Graffiti American Graffiti Mulhol Adventure|American Graffiti American Graffiti American Graffiti Mulhol Animation|American Graffiti American Graffiti American Graffiti Mulhol Biography|American Graffiti American Graffiti American Graffiti Mulhol Etc... Am I doing something wrong? -- View this message in context: http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71035.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Please help me optimize this LEFT JOIN query.
On Sep 5, 2013, at 11:27 PM, Yuzem wrote: > Any clue on why LEFT JOIN is so slow when used with genres but not with > larger tables? Sure. But your conclusion is a most likely a red herring. The crux of the matter is that inner and outer join have a wildly different semantic. And therefore execution plan. That's all. Seems all very reasonable from an optimizer point of view. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Please help me optimize this LEFT JOIN query.
Petite Abeille-2 wrote > Sure. But your conclusion is a most likely a red herring. The crux of the > matter is that inner and outer join have a wildly different semantic. And > therefore execution plan. That's all. Seems all very reasonable from an > optimizer point of view. But I have no conclusion, I acknowledge that inner and outer join are different but I don't know why LEFT JOIN works very fast on larger tables and very slow on smaller tables at least in those cases. I would like to have a conclusion on that matter. -- View this message in context: http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71042.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Please help me optimize this LEFT JOIN query.
On Sep 5, 2013, at 10:28 PM, Yuzem wrote: > Ok, wonderful, now it is working correctly but how do I select multiple > columns from table movies? > Should I add another sub query? Nope. You have now changed the problem definition, so scalars will not be a good fit. Blindly copy & paste them will not get you anywhere. The key to success in your case is to access the movie_genre table only once, as selectively as possible. So, if, for a given genre you want 4 movies, you could try something along these lines: selectgenre.code, movie.title from ( selectmovie_genre.movie_id, movie_genre.genre_id from movie_genre where movie_genre.genre_id = 30 order by movie_genre.id limit 4 ) asmovie_genre join movie onmovie.id = movie_genre.movie_id join genre ongenre.id = movie_genre.genre_id Which gives you 4 rows: Western|"26 Men" (1957) Western|"A Man Called Shenandoah" (1965) Western|"ABC Weekend Specials" (1977) {The Winged Colt (#1.5)} Western|"Action in the Afternoon" (1953) ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Please help me optimize this LEFT JOIN query.
Thanks but 4 rows is not what I am looking for. I found a solution, concatenation: SELECT genres.genres, ( SELECT group_concat(movie,' ') FROM ( SELECTmovies.movies||','||name movie FROM movies JOIN genres AS movie_genres ONmovies.movies = movie_genres.movies WHERE movie_genres.genres = genres.genres ORDER BY movies.name LIMIT 3 ) ) FROM genres GROUP BY genres ORDER BY genres.genres; Any clue on why LEFT JOIN is so slow when used with genres but not with larger tables? -- View this message in context: http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71040.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Please help me optimize this LEFT JOIN query.
Ok, wonderful, now it is working correctly but how do I select multiple columns from table movies? Should I add another sub query? Example: SELECT genres.genres, ( SELECT substr(group_concat(name,' '),1,60) FROM ( SELECTname FROM movies JOIN genres AS movie_genres ONmovies.movies = movie_genres.movies WHERE movie_genres.genres = genres.genres ORDER BY movies.name LIMIT 6 ) ), ( SELECT substr(group_concat(movies,' '),1,60) FROM ( SELECTname FROM movies JOIN genres AS movie_genres ONmovies.movies = movie_genres.movies WHERE movie_genres.genres = genres.genres ORDER BY movies.name LIMIT 6 ) ) FROM genres GROUP BY genres ORDER BY genres.genres; If I add another sub query it takes almost twice the time than using only one sub query. It is still faster than before. Petite Abeille-2 wrote > Yeah… the inner join vs out join will produce a significantly different > access plan. Yes but if that's the case don't you think that the difference in time is a bit much? And why there is almost no difference between JOIN and LEFT JOIN when using tasks and people which are larger tables? -- View this message in context: http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71038.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Please help me optimize this LEFT JOIN query.
On Sep 5, 2013, at 8:56 PM, Yuzem wrote: > SELECT genres FROM genres LEFT JOIN movies ON genres.movies = movies.movies > GROUP BY genres ORDER BY genres; > time: 2.475s > > SELECT genres FROM genres JOIN movies ON genres.movies = movies.movies GROUP > BY genres ORDER BY genres; > time: 0.035s Yeah… the inner join vs out join will produce a significantly different access plan. For example, using a different schema: (1) inner join 0|0|1|SCAN TABLE movie_genre USING COVERING INDEX sqlite_autoindex_movie_genre_1 0|1|0|SEARCH TABLE genre USING INTEGER PRIMARY KEY (rowid=?) (2) outer join 0|0|0|SCAN TABLE genre USING COVERING INDEX sqlite_autoindex_genre_1 0|1|1|SCAN TABLE movie_genre USING COVERING INDEX sqlite_autoindex_movie_genre_1 In this case, the first plan is much better in terms of throughput as movie_genre is much larger table. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Please help me optimize this LEFT JOIN query.
I did some testing and found some strange results. SELECT genres FROM genres LEFT JOIN movies ON genres.movies = movies.movies GROUP BY genres ORDER BY genres; time: 2.475s SELECT genres FROM genres JOIN movies ON genres.movies = movies.movies GROUP BY genres ORDER BY genres; time: 0.035s SELECT tasks FROM tasks LEFT JOIN people ON tasks.people = people.people GROUP BY tasks ORDER BY tasks" time: 0.164s SELECT tasks FROM tasks JOIN people ON tasks.people = people.people GROUP BY tasks ORDER BY tasks" time: 0.163s The strange thing is that tasks and people are much larger tables than genres and movies: SELECT count() from genres; 3998 SELECT count() from movies; 1529 SELECT count() from tasks; 24964 SELECT count() from people; 19626 -- View this message in context: http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71031.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Please help me optimize this LEFT JOIN query.
I did an ANALYZE but I'm getting the same results. I do have indexes: CREATE TABLE movies ( movies UNIQUE, name, icon_modified ); CREATE TABLE genres ( genres, movies, UNIQUE(genres,movies) ); people has an index on people (people UNIQUE) and tasks is a view: CREATE VIEW tasks AS SELECT 'creators' tasks,movies,people,'' details FROM creators UNION ALL SELECT 'directors' tasks,movies,people,'' details FROM directors UNION ALL SELECT 'writers' tasks,movies,people,'' details FROM writers UNION ALL SELECT 'actors'tasks,movies,people,characters details FROM actors; The tables creators, directors, etc, have similar indexes as genres: CREATE TABLE directors ( directors, people, UNIQUE(directors,people) ); -- View this message in context: http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71033.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Please help me optimize this LEFT JOIN query.
On 5 Sep 2013, at 7:56pm, Yuzem wrote: > I did some testing and found some strange results. Please do an ANALYZE and try the same things again. Also, do you have any indexes on those tables (apart from the primary keys, of course) ? Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Please help me optimize this LEFT JOIN query.
On Sep 4, 2013, at 4:21 PM, Yuzem wrote: > I want to construct genres icons and each icon must display 4 movies. Assuming this is IMDB… what about a scalar subquery? For example, assuming a slightly different schema from yours: selectgenre.code as genre, ( select group_concat( title ) from ( selectmovie.title as title from movie join movie_genre onmovie_genre.movie_id = movie.id where movie_genre.genre_id = genre.id order by movie.title limit 4 ) ) as sample from genre where genre.code = 'Western' order by genre.code > Western|"26 Men" (1957),"A Man Called Shenandoah" (1965),"ABC Weekend > Specials" (1977) {The Winged Colt (#1.5)},"Action in the Afternoon" (1953) CPU Time: user 0.083246 sys 0.000443 This is for: select count( * ) from genre; 30 select count( * ) from movie; 2545331 select count( * ) from movie_genre; 1545196 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Please help me optimize this LEFT JOIN query.
Thanks Igor but now it is taking even more time: 3.139s Any other idea? -- View this message in context: http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71003.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Please help me optimize this LEFT JOIN query.
On 9/4/2013 10:21 AM, Yuzem wrote: SELECT genres name,count(genres.movies) count,substr(group_concat(grouped.movies,' '),1,40) src,substr(group_concat(grouped.icon_modified,' '),1,80) icon_modified FROM genres LEFT JOIN (SELECT movies,icon_modified FROM movies WHERE icon_modified != '') grouped on grouped.movies = genres.movies GROUP BY genres ORDER BY name" Try FROM genres LEFT JOIN movies grouped ON (genres.movies = movies.movies AND icon_modified != '') The use of nested SELECT likely prevents SQLite from using the index on movies(movies). -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Please help me optimize this LEFT JOIN query.
I have 2 tables: CREATE TABLE movies ( movies UNIQUE, name, icon_modified ); CREATE TABLE genres ( genres, movies, ); I want to construct genres icons and each icon must display 4 movies. I need to get: genres from genres and movies,icon_modified from movies. This is the query I have: SELECT genres name,count(genres.movies) count,substr(group_concat(grouped.movies,' '),1,40) src,substr(group_concat(grouped.icon_modified,' '),1,80) icon_modified FROM genres LEFT JOIN (SELECT movies,icon_modified FROM movies WHERE icon_modified != '') grouped on grouped.movies = genres.movies GROUP BY genres ORDER BY name" The query works correctly but it is very very slow, it takes 2.474s on my machine, if I run the same query but using JOIN instead of LEFT JOIN it only takes 0.039s Is there any way of optimizing these query? Thanks in advance. -- View this message in context: http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users