does this occur with a raw mysqlconnect script ? if so, how about a MySQLdb
version of the same thing ? start looking at the DBAPI.
On Jul 25, 2012, at 3:45 AM, Warwick Prince wrote:
I have taken the SQL entered into the MySQL Workbench (see below) and run
that same SQL directly as a string in engine.execute(bla). Times are still
way different; See below.
Workbench times are bottom right.
PastedGraphic-6.png
Now a raw query running on the same server in the console of PyCharm IDE.
It's over 3 minutes..
PastedGraphic-7.png
I removed the Date range where clause, and the results were almost line-ball.
225 seconds on Workbench, and 228 in engine.execute(). That would tend to
indicate that the bad query is somehow not using the date index and is
looking through every row?? Given that the final results are actually
correct, it must be recognising the date values in the select OK, so I don't
get it! (I added a USE INDEX later to the same query and no change).
What next?
On 25/07/2012, at 2:02 PM, Michael Bayer wrote:
One difference that exists here is the usage of bound parameters, which I'm
assuming you're not using when you run the query in the console. What
happens if you run a test script with straight mysqlconnector ? Try
with and without the bound parameters.
On Jul 24, 2012, at 11:50 PM, Warwick Prince wrote:
Hi All
This is addressed to anyone who may be able to shed some light on this
strange behaviour;
I'm running MySQL on Windows 2003 Server (Sorry) and have a table that has
~2M rows of sales data in it. Knowing the type of BI queries I would want
to be doing, I have added indexes where I feel they would help. I tested
the queries I wanted to perform manually by directly entering the SQL into
the MySQL console. Results were fine. (Tested by putting raw SQL into
MySQL Workbench on the same server as the database)
I then simply moved the logic into my code, and created very simple queries
on tables using the table.select() syntax, and adding a few basic group_by
and 'where' additions. All works perfectly, and I get exactly the same
results as my manual query BUT... the manual query returns the results in
3 seconds, and the programatic query returns the results in 200+ seconds!!
This is the round trip time from .execute() 'til the next line of code
being hit. i.e. I have fetched nothing yet.
This is what I have done;
I added an audit line in my code to show the final SQL that was being
emitted, then cut and paste that into MySQL Workbench and executed it. 3
seconds was the result, with exactly the same actual rows being returned.
The Python code is running on the same server as the MySQL Console.
I have created the same query manually on the Python console, using the
sqlalchemy constructs like query=table.select() etc and run it there - 180+
seconds again - it's I've basically eliminated by code.
I'm using 'mysqlconnector' which I have had no problems in the past, and
it's not like I'm hammering it with 1000's of queries - just 1 query that
will return around 12 rows (having processed several 1000 and then GROUPed
them). I can see the query sitting there in Workbench, taking minutes to
complete.
I've tried adding index hints just in case.. Made no difference, as it
should have been using that index anyway..
When creating the SQLAlchemy test query, (and indeed in my real code) I
used all defaults for engine, metadata etc with no additional settings at
all.
Finally, to add insult to injury, I can execute the identical manual query
DURING the slow queries execution, and still get it back 3 seconds..
whilst the slow one clunks along and spits out the same result minutes
later.
In the SQL below, invoice and customer are VARCHAR columns, whilst date is
DATE and TotalEx is FLOAT
I have restarted the MySQL Service (several times)
It's *like* the sqlalchemy query is not allowed to use any of the indexes,
and is trawling through all the data.. Are there permissions or something
on using an index that I don't know about? (Both queries are being
executed by the same user - root)
It is EXACTLY the same query in all ways, and in fact is an extremely
simple query - as below;
SELECT sum(`sales_data`.`TotalEx`) AS totalSales, date, invoice, customer
FROM sales_data USE INDEX(idx_date)
WHERE date =2011-12-01 AND date =2012-07-31
GROUP BY customer, invoice
Any clues? My project just fell in a hole!
Thanks an advance for any pointers.
Warwick
--
You received this message because you are subscribed to the Google Groups
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at
http://groups.google.com/group/sqlalchemy?hl=en.
--
You received this message