Re: [sqlite] Help speed up query

Keith Medcalf Tue, 16 Apr 2013 08:02:01 -0700

Interesting ...

The query is simply descending a tree and I should think it would be fast.


Object_Types
 Objects
  DTC_Statuses
  Object_TextTables
  DisplayFormats
  Object_Transactions
   Equations
   Equation_Types

Which would need the following indexes for traversal.  For column retrieval, 
you can cover the data items required by appending them to the end of each 
index if required to completely remove the necessity of accessing the base 
table b-tree ...

Object_Types(Object_Type, Object_Type_ID)
Objects(Object_Type_ID, Object_ID, DisplayFormat_ID)
DTC_Statuses(Object_ID)
Object_TextTables(Object_ID)
DisplayFormats(DisplayFormat_ID)
Object_Transactions(Object_ID, Equation_ID, Equation_Type_ID)
Equations(Equation_ID)
Equation_Types(Equation_Type_ID)

It should be fast and the optimizer has little choice on how to perform the 
query because you only have a single non-join constraint and it would be placed 
at the top of the tree (in the outermost loop).

---
()  ascii ribbon campaign against html e-mail
/\  www.asciiribbon.org

> -----Original Message-----
> From: [email protected] [mailto:sqlite-users-
> [email protected]] On Behalf Of [email protected]
> Sent: Tuesday, 16 April, 2013 07:33
> To: [email protected]
> Subject: Re: [sqlite] Help speed up query
> 
> 
>  removing the parenthesises (what is the plural?) made no difference to
> the query plan which is:
> 0|0|6|SEARCH TABLE Object_Types USING COVERING INDEX
> sqlite_autoindex_Object_Types_1 (Object_Type=?) (~1 rows)
> 0|1|1|SCAN TABLE DTC_Statuses (~1000000 rows)
> 0|2|0|SEARCH TABLE Objects USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)
> 0|3|7|SEARCH TABLE DisplayFormats USING INTEGER PRIMARY KEY (rowid=?) (~1
> rows)
> 0|4|2|SEARCH TABLE Object_TextTables USING INDEX Object_TextTables_1_index
> (Object_ID=?) (~10 rows)
> 0|5|3|SEARCH TABLE Object_Transactions USING INDEX
> Object_Transactions_2_index (Object_ID=?) (~10 rows)
> 0|6|4|SEARCH TABLE Equations USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)
> 0|7|5|SEARCH TABLE Equation_Types USING INTEGER PRIMARY KEY (rowid=?) (~1
> rows)
> 
> which changed after analyze to:
> 0|0|6|SEARCH TABLE Object_Types USING COVERING INDEX
> sqlite_autoindex_Object_Types_1 (Object_Type=?) (~1 rows)
> 0|1|7|SCAN TABLE DisplayFormats (~7 rows)
> 0|2|5|SCAN TABLE Equation_Types (~10 rows)
> 0|3|3|SEARCH TABLE Object_Transactions USING AUTOMATIC COVERING INDEX
> (Equation_Type_ID=?) (~5 rows)
> 0|4|4|SEARCH TABLE Equations USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)
> 0|5|0|SEARCH TABLE Objects USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)
> 0|6|1|SEARCH TABLE DTC_Statuses USING INDEX DTC_Statuses_1_index
> (Object_ID=?) (~1 rows)
> 0|7|2|SEARCH TABLE Object_TextTables USING INDEX Object_TextTables_1_index
> (Object_ID=?) (~4 rows)
> which seems much better.  But the query now takes longer.
> 
> It was about 380 ms.  Now it is taking about 610 ms.
> These times include the prepare and stepping through the results, but not
> retrieving any field contents.
> Both the prepare and stepping takes longer after analyze is executed.
> 
> The time for the same query via DAO/MS Access takes about 150 ms.
> 
> Here is the sqlite_stat1 table after analyze:
> Objects|Objects_5_index|5495 1
> Objects|Objects_4_index|5495 393
> Objects|Objects_3_index|5495 1374
> Objects|Objects_2_index|5495 5495
> Objects|Objects_1_index|5495 41
> Objects|sqlite_autoindex_Objects_1|5495 1
> Dimensions||1
> Header_Stream_Values|Header_Stream_Values_2_index|3 1
> Header_Stream_Values|Header_Stream_Values_1_index|3 3
> Category_Types||8
> Object_TextTables|Object_TextTables_2_index|15718 1965
> Object_TextTables|Object_TextTables_1_index|15718 4
> Addresses||256
> Object_Types|sqlite_autoindex_Object_Types_1|5 1
> SwitchKeys|SwitchKeys_1_index|1 1
> Equation_Types||10
> Graph_Types||4
> Equations||43
> Header||1
> Object_Equations|Object_Equations_3_index|2 2
> Object_Equations|Object_Equations_2_index|2 1
> Object_Equations|Object_Equations_1_index|2 2
> Nodes|Nodes_2_index|1 1
> Node_Headers|Node_Headers_2_index|4 4
> Node_Headers|Node_Headers_1_index|4 1
> DTC_Statuses|DTC_Statuses_1_index|5234 1
> Header_Stream|sqlite_autoindex_Header_Stream_1|3 1
> TextTables|TextTables_1_index|21 1
> Byte_Types||4
> Transaction_Types|sqlite_autoindex_Transaction_Types_1|8 1
> Request_Parameters||5
> Limits||14
> Object_Categories|Object_Categories_2_index|1 1
> Object_Categories|Object_Categories_1_index|1 1
> DisplayFormats||7
> Message_Filters||5
> Object_Transactions|Object_Transactions_3_index|5747 40
> Object_Transactions|Object_Transactions_2_index|5747 2
> Object_Transactions|Object_Transactions_1_index|5747 160
> Units|sqlite_autoindex_Units_1|45 1
> SConfig|sqlite_autoindex_SConfig_1|2 1
> Service_Types|sqlite_autoindex_Service_Types_1|5 1
> Config||1
> Services|Services_3_index|12 6
> Services|Services_2_index|12 2
> Services|Services_1_index|12 12
> Services|sqlite_autoindex_Services_1|12 1
> Categories|Categories_3_index|2 2
> Categories|Categories_2_index|2 1
> Strings|sqlite_autoindex_Strings_1|5465 1
> Transactions|Transactions_3_index|268 54
> Transactions|Transactions_2_index|268 39
> Transactions|Transactions_1_index|268 2
> Transactions|sqlite_autoindex_Transactions_1|268 1
> 
> I'm not sure how breaking down the joins would give us good info since I'm
> not likely to do it in the same order as SQLite would.
> 
> Vance
> 
> 
> 
> 
> on Apr 16, 2013, Keith Medcalf <[email protected]> wrote:
> >
> >
> >Have you tried removing all the useless brackets that Micro$oft adds to
> the query?
> > I don't know if the parser strips them out automagically or if they are
> used as
> >join order hints.  You are better off letting the optimizer optimize
> unless you really
> >really want to force a specific order.  While some SQL databases ignore
> such tomfoolery,
> >others do not.  I do not know off-hand whether SQLite ignores tomfoolery
> or not.
> > I just habitually remove them.
> >
> >SELECT Objects.Object_ID,
> >       Object_TextTables.Lower_Limit,
> >       Object_TextTables.Upper_Limit,
> >       Object_TextTables.TextTable_ID,
> >       DTC_Statuses.DTC_Status_ID,
> >       DTC_Statuses.Env_Data,
> >       DTC_Statuses.Env_Offset,
> >       DTC_Statuses.DTCs_Follow,
> >       DTC_Statuses.Upper_Limit,
> >       DTC_Statuses.Lower_Limit,
> >       DTC_Statuses.Physical_Value,
> >       Object_Transactions.Element_Offset,
> >       Equation_Types.Equation_Type,
> >       Equations.BitMask,
> >       Equations.ByteSpecific,
> >       Equations.ObjectSpecific,
> >       Equations.InternalEncoder,
> >       Equations.Equation,
> >       Objects.Object_Size,
> >       Objects.Signed,
> >       Objects.Visible,
> >       Object_Types.Object_Type,
> >       DisplayFormats.DisplayFormat
> >  FROM Objects INNER JOIN DTC_Statuses ON
> Objects.Object_ID=DTC_Statuses.Object_ID
> >
> >               INNER JOIN Object_TextTables ON
> Objects.Object_ID=Object_TextTables.Object_ID
> >
> >               INNER JOIN Object_Transactions ON
> Objects.Object_ID=Object_Transactions.Object_ID
> >
> >               INNER JOIN Equations ON
> Object_Transactions.Equation_ID=Equations.Equation_ID
> >
> >               INNER JOIN Equation_Types ON
> Object_Transactions.Equation_Type_ID=Equation_Types.Equation_Type_ID
> >
> >               INNER JOIN Object_Types ON
> Objects.Object_Type_ID=Object_Types.Object_Type_ID
> >               INNER JOIN DisplayFormats ON
> Objects.DisplayFormat_ID=DisplayFormats.DisplayFormat_ID
> > WHERE Object_Types.Object_Type='DTC';
> >
> >I presume that the optimizer will then actually devolve this into the
> optimizable
> >form.  Some SQL parsers do, and some do not.  Personally, except for
> extremely trivially
> >joins (or where JOIN syntax must be used for OUTER joins) I always use
> the less semantically
> >loaded form of speech thus, to ensure that the parsing order does not
> influence the
> >optimizer.  Technically, if a different query plan is obtained when using
> the M$
> >bracketed form, the above form, and the below form, the optimizer is
> using the brackets
> >and JOIN syntax ordering to provide hints -- which may not result in the
> optimum
> >query plan unless you know what you are doing (and what the optimizer in
> question
> >will do when parsing the input semantics)
> >
> >SELECT Objects.Object_ID,
> >       Object_TextTables.Lower_Limit,
> >       Object_TextTables.Upper_Limit,
> >       Object_TextTables.TextTable_ID,
> >       DTC_Statuses.DTC_Status_ID,
> >       DTC_Statuses.Env_Data,
> >       DTC_Statuses.Env_Offset,
> >       DTC_Statuses.DTCs_Follow,
> >       DTC_Statuses.Upper_Limit,
> >       DTC_Statuses.Lower_Limit,
> >       DTC_Statuses.Physical_Value,
> >       Object_Transactions.Element_Offset,
> >       Equation_Types.Equation_Type,
> >       Equations.BitMask,
> >       Equations.ByteSpecific,
> >       Equations.ObjectSpecific,
> >       Equations.InternalEncoder,
> >       Equations.Equation,
> >       Objects.Object_Size,
> >       Objects.Signed,
> >       Objects.Visible,
> >       Object_Types.Object_Type,
> >       DisplayFormats.DisplayFormat
> >  FROM Objects,
> >       DTC_Statuses, Object_TextTables,
> >       Object_Transactions,
> >       Equations,
> >       Equation_Types,
> >       Object_Types,
> >       DisplayFormats,
> > WHERE Objects.Object_ID=DTC_Statuses.Object_ID
> >   AND Objects.Object_ID=Object_TextTables.Object_ID
> >   AND Objects.Object_ID=Object_Transactions.Object_ID
> >   AND Object_Transactions.Equation_ID=Equations.Equation_ID
> >   AND
> Object_Transactions.Equation_Type_ID=Equation_Types.Equation_Type_ID
> >   AND Objects.Object_Type_ID=Object_Types.Object_Type_ID
> >   AND Objects.DisplayFormat_ID=DisplayFormats.DisplayFormat_ID
> >   AND Object_Types.Object_Type='DTC';
> >
> >And then properly handle transitive equalities so that it can generate
> the best query
> >plan, assuming that every column mentioned as a conditional column is
> included in
> >an appropriate index (appropriately set as unique, if it is unique), and
> that tables
> >which require more than one column to be indexed have *all* the
> referenced columns
> >included in the index.  If a table has two mentioned columns you should
> create both
> >indexes (mutatis mutandis for more than two columns)).  Ensure unique
> columns are
> >actually constrained unique in the table definition (or by a manually
> created unique
> >index, which is the same thing).
> >
> >Then do an analyze and look at the generated output from EXPLAIN QUERY
> PLAN which
> >will tell you the optimum join order so you can eliminate unneeded
> indexes and extend
> >the indexes used to be covering indexes if you really want the maximum
> performance.
> >
> >Are you using a version of SQLite3 that supports transitive conditions
> (what version
> >are you using)?  If not, you should specify them in the query, eg:
> >
> >WHERE A=B and B=C
> >
> >Should be stated as
> >
> >WHERE A=B and B=C and A=C
> >
> >Assuming that Object_Types.Object_Type='DTC' selects one row, that should
> likely be
> >selected as the outer query loop since the optimizer should be seeking to
> minimize
> >selection and using indexes to SEARCH inner tables using a minimum
> selection product
> >at each step (that is what the row count in the EXPLAIN QUERY PLAN
> indicates -- it
> >indicates selectivity of at that step, not the number of rows in the
> table -- a SCAN
> >of a table containing a billion rows with the annotation (~1 ROW) means
> that after
> >scanning the all the rows, the constraints to that point are expected to
> match ~1
> >row out of the billion scanned [needless to say this is very slow] -- a
> SEARCH resulting
> >in (~1 row) means that the optimizer expects to be able to use an index
> to select
> >one row from the table (lightening fast) -- (~1000000 rows) means that
> there is no
> >information to determine how many rows will be selected and this step
> might or does
> >devolve to a cross join which may result in a huge number of candidates
> goi
> > ng into the next loop).
> >
> >---
> >()  ascii ribbon campaign against html e-mail
> >/\  www.asciiribbon.org
> >
> >> -----Original Message-----
> >> From: [email protected] [mailto:sqlite-users-
> >> [email protected]] On Behalf Of [email protected]
> >> Sent: Monday, 15 April, 2013 15:20
> >> To: [email protected]
> >> Subject: Re: [sqlite] Help speed up query
> >>
> >> SQLite was much slower on many queries AFTER analyze was executed.
> >>
> >> here are the referenced tables, # of rows, primary key, and additional
> >> indexes:
> >> Objects = 5495 rows, Primary key=Object_ID, other indexes on
> Address_ID,
> >> Dimension_ID, DisplayFormat_ID, Limit_ID, and Object_ID;
> >> DTC_Statuses = 5234 rows, Primary key=DTC_Status_ID, other index on
> >> Object_ID;
> >> Object_TextTables = 15718 rows, other indexes on Object_ID and
> >> TextTable_ID;
> >> Object_Transactions = 5747 rows, Primary key=Object_Transaction_ID,
> other
> >> indexes on Equation_ID, Object_ID, Transaction_ID;
> >> Equations = 43 rows, Primary key=Equation_ID;
> >> Equation_Types = 10 rows, Primary key=Equation_Type_ID;
> >> Object_Types = 5 rows, Primary key=Object_Type_ID;
> >> DisplayFormats = 7 rows,  Primary key=DisplayFormat_ID.
> >>
> >> I'm not sure where it is getting 1000000 rows!
> >>
> >> Vance
> >>
> >> on Apr 15, 2013, Simon Slavin <[email protected]> wrote:
> >> >
> >> >
> >> >On 15 Apr 2013, at 9:47pm, [email protected] wrote:
> >> >
> >> >> The integrity check came back with OK.
> >> >
> >> >Good.
> >> >
> >> >> My software was not locked up totally.  It did finally come back.
> For
> >> some reason
> >> >executing analyze slow other queries way down.
> >> >
> >> >I am surprised by anyone reporting that ANALYZE has substancially
> slowed
> >> down any
> >> >operation.  Please excuse me because I have trouble understanding
> English
> >> sometimes.
> >> > Can you verify for us:
> >> >
> >> >A) The slow-down you're talking about.  Are you saying that other
> >> operations ran slowly
> >> >while you were doing ANALYZE, or that the SELECTs ran more slowly
> after
> >> the ANALYZE
> >> >than they did before it ?
> >> >
> >> >B) Reading the results of the EXPLAIN QUERY PLAN after the ANALYZE,
> you
> >> should see
> >> >approximate row counts like
> >> >
> >> >> 0|5|3|SCAN TABLE DisplayFormats (~7 rows)
> >> >
> >> >This means that the table DisplayFormats has about 7 rows.  If you
> take
> >> this as meaning
> >> >that it has between 1 and 70 rows, are these rowcounts correct as far
> as
> >> you know,
> >> >or has it completely miscounted a table which actually has thousands
> of
> >> rows as having
> >> >just 1 row ?
> >> >
> >> >C) Are these EXPLAIN QUERY PLANs you're posting all for the same
> SELECT
> >> statement
> >> >or are you switching from one to another ?
> >> >
> >> >Also I see in your latest EQP
> >> >
> >> >> 0|1|1|SCAN TABLE DTC_Statuses (~1000000 rows)
> >> >
> >> >Obviously, scanning a million rows is something you want to avoid.
> Have
> >> you created
> >> >an appropriate index ?
> >> >
> >> >Simon
> >> >_______________________________________________
> >> >sqlite-users mailing list
> >> >[email protected]
> >> >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >> >
> >> _______________________________________________
> >> sqlite-users mailing list
> >> [email protected]
> >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
> >
> >
> >_______________________________________________
> >sqlite-users mailing list
> >[email protected]
> >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Help speed up query

Reply via email to