Thanks Stephen, good point, I was just after some general results, I do take your point about caching etc. However it is logical to apply UNION ALL if appropriate in preference to UNION which in this context is a bit lazy. I'm a bit annoyed I didn't pick up on it myself, but thankful to Igor for reminding me of the difference in the two statements.
On 16 October 2014 11:14, Stephen Chrzanowski <pontia...@gmail.com> wrote: > Careful with the timing. You may be looking at OS memory caching the > result set instead of pulling from the drive. For best bets, either re-run > both queries several times, ditch the longest and shortest times, then take > the mean or average times and do the comparison that way. > > On Wed, Oct 15, 2014 at 7:54 PM, Michael Falconer < > michael.j.falco...@gmail.com> wrote: > > > Igor, > > > > Nice one. A quick test using 10 lookup trees and the same search > criteria: > > *Showing rows 0 - 29 (30 total, Query took 0.4838 sec)* > > > > Now with UNION ALL replacing UNION: > > *Showing rows 0 - 29 (30 total, Query took 0.2050 sec)* > > > > You weren't kidding about cheaper were you? LOL > > > > > > On 16 October 2014 10:30, Michael Falconer <michael.j.falco...@gmail.com > > > > wrote: > > > > > Yes, I agree with the general sentiment. It is not exactly the design I > > > would have chosen either, but it is what exists. Design change is > > occurring > > > but for the moment we are stuck with the current schema. > > > > > > If you cannot change the schemata to be more suitable, then your > > >> demonstrated queries are very near as good as it gets. I can think of > > other > > >> ways to achieve the same, but nothing that would be more efficient to > a > > >> tangible level. Maybe someone else here can come up with something a > bit > > >> more tight. > > >> > > > You are preaching to the converted I'm afraid, but I appreciate the > > points > > > you made. The quoted paragraph probably answers most of my questions > > > really, except for this from Igor: > > > > > > Unless you do expect duplicates and need to eliminate them, use UNION > ALL > > >> - it's much cheaper (this is assuming you insist on keeping multiple > > >> tables). > > >> > > > Cheaper because it drops the operation to remove the duplicate records > > > from the result set? I'm assuming. I'll give that a try. > > > > > > On 16 October 2014 10:05, RSmith <rsm...@rsweb.co.za> wrote: > > > > > >> > > >> On 2014/10/16 00:05, Michael Falconer wrote: > > >> > > >>> Hi all, > > >>> > > >>> first off I must start with an apology. I know I'm sort of doing the > > >>> wrong//... > > >>> > > >> > > >> No need to apologise, this flies quite close to the central theme. > > >> Whether you are using SQLite or any other SQL RDBMS, this is horrible > DB > > >> design and it is so for precisely the reasons you are asking the > > question. > > >> Tables should not be dynamic: The simple rule of thumb being - if you > > >> cannot tell (or at least accurately predict) before-hand exactly how > > many > > >> tables will be in the final system, then you are doing it wrong. Why > > not > > >> have one set of tables and in stead of preceding each of them with a > > >> <prefix>, simply add a column that can host the <prefix> as a simple > > value, > > >> which will immediately make your life very much easier and get the SQL > > >> engine to do the work you are now trying to compensate for in a > > lop-sided > > >> manual kind of way. > > >> > > >> I can only imagine the amount of code you invested into your system to > > >> track and deal with these dynamic sets of tables, so I know changing > it > > >> will seem like a rather large undertaking, but really it will mostly > > >> involve removing loads of code to end up with a small set of simple > > queries > > >> that does all the work for you. > > >> > > >> The only reason I can imagine this sort of breakdown useful is if your > > >> tree tables are all really really huge, like Gigabytes, and so there > > might > > >> be some efficiency to be gained from splitting it up, but I doubt this > > is > > >> the case. > > >> > > >> Just imagine the ease of those search queries you demonstrated... one > > >> query, no UNIONs, a single set of joins and an extra where-clause > > check... > > >> Add to that increased efficiency at finding results (run one query in > > stead > > >> of 50). Easy-mode. As a proof-of-concept, just make an SQLite DB, one > > set > > >> of tables as per your list, add column(s) for prefixing as needed, > > import > > >> all the tables from some large set of them (specifying additionally > the > > >> column for the prefix) and ten run search queries on them. I am > willing > > to > > >> bet on the fact it will be much quicker in addition to being much > > simpler. > > >> > > >> If you cannot change the schemata to be more suitable, then your > > >> demonstrated queries are very near as good as it gets. I can think of > > other > > >> ways to achieve the same, but nothing that would be more efficient to > a > > >> tangible level. Maybe someone else here can come up with something a > bit > > >> more tight. > > >> > > >> > > >> thing here as this question is NOT related to sqlite. It is a general > > SQL > > >>> question but I ask it here because I have great respect for the > answers > > >>> and > > >>> discussions I have seen on this forum over many years. I rarely post > > >>> myself > > >>> as there are always several contributors who beat me to the answer > and > > >>> often their response is far better than mine would have been. I'm > not a > > >>> code leper, I don't need the actual SQL just the method really, > though > > a > > >>> short code example would be well received for illustration. > > >>> > > >>> It's about a search performed on multiple tables. However the > structure > > >>> of > > >>> this database is somewhat unorthodox. It contains genealogical data > and > > >>> this is clustered into trees. Each tree has 8 tables, there can be as > > >>> many > > >>> as 100 trees but most installations of the associated application > > >>> software > > >>> contain between 5 - 50 trees. These 8 tables contain a family trees > > data > > >>> and are named: > > >>> <prefix>addresses > > >>> <prefix>connections > > >>> <prefix>events > > >>> <prefix>family > > >>> <prefix>person > > >>> <prefix>repositories > > >>> <prefix>sources > > >>> <prefix>texts > > >>> > > >>> The <prefix> changes and is unique for each tree. > > >>> > > >>> There are extensive search options offered to users as well as simple > > >>> name > > >>> searching and it is one of the best features of the app. It works > > pretty > > >>> well, so it ain't broke in any sense, we just wonder if there is a > > better > > >>> way to perform this search in SQL. Is there a general technique which > > is > > >>> superior either in speed, efficiency or load bearing contexts? > > >>> > > >>> I am sure you can see one of the pitfalls here is the exponential > > growth > > >>> of > > >>> such a search query as both total number of trees and indeed user > > search > > >>> criteria increase. For each criteria component, and there are quite a > > >>> few, > > >>> the appropriate tables must be queried and results joined. > > >>> > > >>> Searches return records of individuals meeting the entered search > > >>> criteria > > >>> so the query focuses on the person table as it's anchor, performs > > >>> searches > > >>> on required other tables in the tree and joins the results to the > > person > > >>> data. The results from each tree search are then UNION'ed to provide > > the > > >>> result set. Here is a contrived example of the SQL query code which > > >>> should > > >>> make things clearer. > > >>> > > >>> The user enters simple search criteria - any person with a last name > > >>> containing 'mac' and a first name containing the character 'a'. This > is > > >>> the > > >>> resulting query (generated by php code), which searches a small 4 > > family > > >>> tree installation. > > >>> > > >>> (SELECT humo1_person.*, event_kind, event_event, address_place, > > >>> address_zip > > >>> FROM humo1_person > > >>> LEFT JOIN humo1_events ON pers_gedcomnumber=event_person_id > > >>> LEFT JOIN humo1_addresses ON pers_gedcomnumber=address_person_id > > >>> WHERE pers_lastname LIKE '%mac%' > > >>> AND (pers_firstname LIKE '%a%' > > >>> OR (event_kind='name' AND event_event LIKE '%a%') > > >>> ) > > >>> GROUP BY pers_gedcomnumber > > >>> ) > > >>> UNION > > >>> (SELECT humo2_person.*, event_kind, event_event, address_place, > > >>> address_zip > > >>> FROM humo2_person LEFT JOIN humo2_events ON > > >>> pers_gedcomnumber=event_person_id > > >>> LEFT JOIN humo2_addresses ON pers_gedcomnumber=address_ > > >>> person_id > > >>> WHERE pers_lastname LIKE '%mac%' > > >>> AND (pers_firstname LIKE '%a%' > > >>> OR (event_kind='name' AND event_event LIKE '%a%') > ) > > >>> GROUP BY pers_gedcomnumber > > >>> ) > > >>> UNION > > >>> (SELECT humo3_person.*, event_kind, event_event, address_place, > > >>> address_zip > > >>> FROM humo3_person LEFT JOIN humo3_events ON > > >>> pers_gedcomnumber=event_person_id > > >>> LEFT JOIN humo3_addresses ON pers_gedcomnumber=address_ > > >>> person_id > > >>> WHERE pers_lastname LIKE '%mac%' > > >>> AND (pers_firstname LIKE '%a%' > > >>> OR (event_kind='name' AND event_event LIKE '%a%') > > >>> ) > > >>> GROUP BY pers_gedcomnumber > > >>> ) > > >>> UNION > > >>> (SELECT humo4_person.*, event_kind, event_event, address_place, > > >>> address_zip > > >>> FROM humo4_person > > >>> LEFT JOIN humo4_events ON pers_gedcomnumber=event_person_id > > >>> LEFT JOIN humo4_addresses ON pers_gedcomnumber=address_ > > >>> person_id > > >>> WHERE pers_lastname LIKE '%mac%' > > >>> AND (pers_firstname LIKE '%a%' > > >>> OR (event_kind='name' AND event_event LIKE '%a%') > > >>> ) > > >>> GROUP BY pers_gedcomnumber > > >>> ) > > >>> ORDER BY pers_lastname ASC , pers_firstname ASC LIMIT 0,30 > > >>> > > >>> Any thoughts? Suggestions? Missiles? Good approach, bad approach, or > > >>> completely off the grid? I do use sqlite quite a bit, but not on this > > >>> particular project. > > >>> > > >>> > > >>> > > >> _______________________________________________ > > >> sqlite-users mailing list > > >> sqlite-users@sqlite.org > > >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > >> > > > > > > > > > > > > -- > > > Regards, > > > Michael.j.Falconer. > > > > > > > > > > > -- > > Regards, > > Michael.j.Falconer. > > _______________________________________________ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- Regards, Michael.j.Falconer. _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users