Re: excessive time spent in "statistics" status

Shawn Green Tue, 04 Sep 2007 12:03:36 -0700

Lucio Chiappetti wrote:

On Tue, 4 Sep 2007, Lucio Chiappetti wrote:
I'll do some experimenting and report back.
In lack of better ways of doing a tie-break, I've done the followingtests (with the linemode client), checking both the results of a query andthe total time spent. I tested 16 different combinations of arbitraryWHERE conditions and correlation with an arbitrary external tables, usingmy "maximal" G (the one with 26 members).
For each I tested 3 cases (total 16*3=48) :

 a) the query on the "virtual" table correlated with the external
    (the virtual is my G left join t1 ... left join tn). This involves
the lowest number of joins (29) and is of course the fastest (andalso the easiest to get access to "hidden" columns by name, but
    is not what I intend our users use because the way to select columns
is clumsy (they should use a VIEW on the 26 members instead ...normally with no correlation on external that will be equally fast),but represents my REFERENCE, i.e. I checked the results of the othertest to be the same as this.
 b) the query on the VIEW V with "members also" (which implies a redundant
    join of V with G left join t1 ... left join tn) and correlated with
    an external table. Doubles the joins (59). I used here
SELECT STRAIGHT_JOIN.
 c) the same query of (b) but with a normal select, and preceded by
    setting optimizer_search_depth=0 (auto).
<snip>

In your b) test, did you use the SELECT STRAIGHT_JOIN as your outerSELECT statement or within the CREATE VIEW statement>? If you only triedit one way, you could try it the other, too.

[*] actually a larger minority (I hope it remains such) of our "public"
users accesses the DB once to "take away" all data without performing anyselection :-(


It saddens me to see people abuse your hard work in this way.

My inclination therefore would be to prefer optimizer_search_depth=0
to SELECT STRAIGHT_JOIN *if* it can be easily arranged in our tomcatenvironment, because of the marginal increase in speed despite theoccasional worsening. But the two look almost equivalent.

Another option would be to build your view dynamically based on thecriteria that a user selects through a web-based interface.

One other way to approach this project would be to normalize yourrelationships and have one skinny but very tall table similar to


CREATE TABLE map_table (
  object1_id int
  , object1_type int
  , object2_id int
  , object2_type int
  , confidence tinyint
)

where confidence would be a whole number from 0 to 100. You would needonly one of these to replace each G table you are generating now.

However, that would be a major difference in how you currently use yourdata and I would not suggest this for a near-term solution.

--
Shawn Green, Support Engineer
MySQL Inc., USA, www.mysql.com
Office: Blountville, TN
    __  ___     ___ ____  __
   /  |/  /_ __/ __/ __ \/ /
  / /|_/ / // /\ \/ /_/ / /__
 /_/  /_/\_, /___/\___\_\___/
        <___/
 Join the Quality Contribution Program Today!
 http://dev.mysql.com/qualitycontribution.html

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Re: excessive time spent in "statistics" status

Reply via email to