Avoiding table scans...

2014-07-24 Thread Chris Knipe
Hi All,

I have a couple of *huge* tables, they're still busy populating, but once
done I suspect it will hold well over 3 billion records (and that's more
than likely the start of the problem).

The mysql server is a highly optimized, powerful server with some 128GB ram,
data + binlogs on RAID10 SSDs and is performing incredibly well with some 3K
inserts/second whilst still doing random selects/updates in between.

How can I go about not having the following:

mysql SELECT MIN(ArtNumber) AS ArtNumber, MessageID FROM
78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  '2118806';
+---+---
+
| ArtNumber | MessageID
|
+---+---
+
|   2118807 | part26of79.GfYzwhqz$ORUpNi3tjsW@camelsystem-powerpost.local
|
+---+---
+
1 row in set (22.78 sec)

mysql EXPLAIN SELECT MIN(ArtNumber) AS ArtNumber, MessageID FROM
78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  '2118806';
++-+--+---+-
--+-+-+--+--+-+
| id | select_type | table| type  |
possible_keys | key | key_len | ref  | rows | Extra   |
++-+--+---+-
--+-+-+--+--+-+
|  1 | SIMPLE  | 78168ea0a9b3b513a1f2d39b559b406e | range | PRIMARY
| PRIMARY | 8   | NULL | 31515172 | Using where |
++-+--+---+-
--+-+-+--+--+-+
1 row in set (0.19 sec)

mysql explain SELECT MAX(ArtNumber) AS ArtNumber, MessageID FROM
78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  '2118809';
++-+--+---+-
--+-+-+--+--+-+
| id | select_type | table| type  |
possible_keys | key | key_len | ref  | rows | Extra   |
++-+--+---+-
--+-+-+--+--+-+
|  1 | SIMPLE  | 78168ea0a9b3b513a1f2d39b559b406e | range | PRIMARY
| PRIMARY | 8   | NULL |3 | Using where |
++-+--+---+-
--+-+-+--+--+-+
1 row in set (0.17 sec)

mysql SHOW CREATE TABLE 78168ea0a9b3b513a1f2d39b559b406e;
+--+





---+
| Table| Create Table
|
+--+





---+
| 78168ea0a9b3b513a1f2d39b559b406e | CREATE TABLE
`78168ea0a9b3b513a1f2d39b559b406e` (
  `ArtNumber` bigint(20) unsigned NOT NULL,
  `MessageID` varchar(255) NOT NULL,
  `Date` int(10) unsigned NOT NULL,
  `Bytes` bigint(20) unsigned NOT NULL,
  `Lines` bigint(20) unsigned NOT NULL,
  `From` varchar(255) NOT NULL,
  `Subject` tinytext NOT NULL,
  PRIMARY KEY (`ArtNumber`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='' |
+--+





---+
1 row in set (0.70 sec)

I realise that this is an enormous amount of data - especially once fully
populated and we reach the over 3 billion records in the table.  Is the only
course of action here to re-look at how the data is stored?  I suppose it
can't get any better than querying directly against the PRIMARY key, can it?

Many thanks,
Chris.



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:

Re: Avoiding table scans...

2014-07-24 Thread Johan De Meersman
- Original Message -
 From: Chris Knipe sav...@savage.za.org
 To: mysql@lists.mysql.com
 Sent: Thursday, 24 July, 2014 11:17:50 AM
 Subject: Avoiding table scans...
 
 mysql SELECT MIN(ArtNumber) AS ArtNumber, MessageID FROM
 78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  '2118806';

You're putting quotes around ArtNumber in your where clause, where it really is 
a bigint. Thus, you're forcing implicity conversion in the parser, instead of 
simply doing an index lookup.


-- 
Unhappiness is discouraged and will be corrected with kitten pictures.

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Avoiding table scans...

2014-07-24 Thread Chris Knipe
On Thu, Jul 24, 2014 at 11:47 AM, Johan De Meersman vegiv...@tuxera.be wrote:
 - Original Message -
 From: Chris Knipe sav...@savage.za.org
 To: mysql@lists.mysql.com
 Sent: Thursday, 24 July, 2014 11:17:50 AM
 Subject: Avoiding table scans...

 mysql SELECT MIN(ArtNumber) AS ArtNumber, MessageID FROM
 78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  '2118806';

 You're putting quotes around ArtNumber in your where clause, where it really 
 is a bigint. Thus, you're forcing implicity conversion in the parser, instead 
 of simply doing an index lookup.


mysql SELECT MIN(ArtNumber) AS ArtNumber, MessageID FROM
78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  2118806;
+---+---+
| ArtNumber | MessageID |
+---+---+
|   2118807 | part26of79.GfYzwhqz$ORUpNi3tjsW@camelsystem-powerpost.local |
+---+---+
1 row in set (19.37 sec)

mysql SELECT MIN(ArtNumber) AS ArtNumber, MessageID FROM
78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  '2118806';
+---+---+
| ArtNumber | MessageID |
+---+---+
|   2118807 | part26of79.GfYzwhqz$ORUpNi3tjsW@camelsystem-powerpost.local |
+---+---+
1 row in set (19.43 sec)

mysql EXPLAIN SELECT MIN(ArtNumber) AS ArtNumber, MessageID FROM
78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  2118806;
++-+--+---+---+-+-+--+--+-+
| id | select_type | table| type  |
possible_keys | key | key_len | ref  | rows | Extra   |
++-+--+---+---+-+-+--+--+-+
|  1 | SIMPLE  | 78168ea0a9b3b513a1f2d39b559b406e | range |
PRIMARY   | PRIMARY | 8   | NULL | 31868953 | Using where |
++-+--+---+---+-+-+--+--+-+
1 row in set (0.18 sec)


Partitioning should help from what I am reading currently... Just not
sure about a few things just yet.



-- 

Regards,
Chris Knipe

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Avoiding table scans...

2014-07-24 Thread Corrado Pandiani
Il 24/07/14 11:57, Chris Knipe ha scritto:
 On Thu, Jul 24, 2014 at 11:47 AM, Johan De Meersman vegiv...@tuxera.be 
 wrote:
 - Original Message -
 From: Chris Knipe sav...@savage.za.org
 To: mysql@lists.mysql.com
 Sent: Thursday, 24 July, 2014 11:17:50 AM
 Subject: Avoiding table scans...

 mysql SELECT MIN(ArtNumber) AS ArtNumber, MessageID FROM
 78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  '2118806';
 You're putting quotes around ArtNumber in your where clause, where it really 
 is a bigint. Thus, you're forcing implicity conversion in the parser, 
 instead of simply doing an index lookup.

 mysql SELECT MIN(ArtNumber) AS ArtNumber, MessageID FROM
 78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  2118806;
 +---+---+
 | ArtNumber | MessageID |
 +---+---+
 |   2118807 | part26of79.GfYzwhqz$ORUpNi3tjsW@camelsystem-powerpost.local |
 +---+---+
 1 row in set (19.37 sec)

 mysql SELECT MIN(ArtNumber) AS ArtNumber, MessageID FROM
 78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  '2118806';
 +---+---+
 | ArtNumber | MessageID |
 +---+---+
 |   2118807 | part26of79.GfYzwhqz$ORUpNi3tjsW@camelsystem-powerpost.local |
 +---+---+
 1 row in set (19.43 sec)

 mysql EXPLAIN SELECT MIN(ArtNumber) AS ArtNumber, MessageID FROM
 78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  2118806;
 ++-+--+---+---+-+-+--+--+-+
 | id | select_type | table| type  |
 possible_keys | key | key_len | ref  | rows | Extra   |
 ++-+--+---+---+-+-+--+--+-+
 |  1 | SIMPLE  | 78168ea0a9b3b513a1f2d39b559b406e | range |
 PRIMARY   | PRIMARY | 8   | NULL | 31868953 | Using where |
 ++-+--+---+---+-+-+--+--+-+
 1 row in set (0.18 sec)


 Partitioning should help from what I am reading currently... Just not
 sure about a few things just yet.



Try this

SELECT ArtNumber, MessageID FROM 78168ea0a9b3b513a1f2d39b559b406e WHERE
ArtNumber=(SELECT MIN(ArtNumber) FROM 78168ea0a9b3b513a1f2d39b559b406e
WHERE ArtNumber2118806)


-- 
INTER
CORRADO PANDIANI
Web Project Manager / DBA
F.C. Internazionale Milano S.p.A.
C.so Vittorio Emanuele II, 9
20122 Milano, Italy
TEL +39 027715 551
www.inter.it

IMPORTANT: The contents of this email and any attachments are
confidential. They are intended for the named recipient(s) only. If you
have received this email by mistake, please notify the sender
immediately and do not disclose the contents to anyone or make copies
thereof.



Re: Avoiding table scans...

2014-07-24 Thread Chris Knipe


 Try this

 SELECT ArtNumber, MessageID FROM 78168ea0a9b3b513a1f2d39b559b406e WHERE
 ArtNumber=(SELECT MIN(ArtNumber) FROM 78168ea0a9b3b513a1f2d39b559b406e
 WHERE ArtNumber2118806)



+---+---+
| ArtNumber | MessageID
|
+---+---+
|   2118807 | part26of79.GfYzwhqz$ORUpNi3tjsW@camelsystem-powerpost.local
|
+---+---+
1 row in set (0.81 sec)

If this wasn't a public mailing list I would have said something very
strange now containing the word love!

Can you perhaps just elaborate a bit as to *why* this is working so much
better?  The reason I ask is that I've only given one query as a example.
 There's quite a few slightly different queries like this, and I would just
like to understand the mythology behind in order to be able to change the
other queries as well...

MANY thanks for the assistance :-)

--
Chris.


RE: Avoiding table scans...

2014-07-24 Thread Jesper Wisborg Krogh
Hi Chris,

 -Original Message-
 From: Chris Knipe [mailto:sav...@savage.za.org]
 Sent: Thursday, 24 July 2014 19:18
 To: mysql@lists.mysql.com
 Subject: Avoiding table scans...
 
 mysql SELECT MIN(ArtNumber) AS ArtNumber, MessageID FROM
 78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  '2118806';
 +---+---
 +---+
 +
 | ArtNumber | MessageID
 |
 +---+---
 +---+
 +
 |   2118807 |
 | part26of79.GfYzwhqz$ORUpNi3tjsW@camelsystem-powerpost.local
 |
 +---+---
 +---+
 +
 1 row in set (22.78 sec)

In addition to being slow, the query is probably not what you want. What the
query does is finding the minimum ArtNumber greater than 2118806,
then is free to choose any MessageID among those rows matching the WHERE
clause.

This is also why the query has to examine so many rows.

If you look at the query with MAX(ArtNumber) ... ArtNumber  ..., then it is
more likely that you'll see an unexpected result. Using your table
definition and inserting random rows:

mysql  SELECT MAX(ArtNumber) AS ArtNumber, MessageID FROM
78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  28806;
+---+---+
| ArtNumber | MessageID |
+---+---+
| 28805 | sutlers   |
+---+---+
1 row in set (0.12 sec)

mysql SELECT ArtNumber, MessageID FROM 78168ea0a9b3b513a1f2d39b559b406e
WHERE ArtNumber = 28805;
+---++
| ArtNumber | MessageID  |
+---++
| 28805 | pearl-bordered |
+---++
1 row in set (0.00 sec)

So the MessageID returned for your original query is not the one
corresponding to the ArtNumber.


If you set sql_mode to include ONLY_FULL_GROUP_BY, you can also see that the
query is invalid:

mysql SET sql_mode = 'ONLY_FULL_GROUP_BY';
Query OK, 0 rows affected (0.00 sec)

mysql SELECT MIN(ArtNumber) AS ArtNumber, MessageID FROM
78168ea0a9b3b513a1f2d39b559b406e WHERE ArtNumber  28806;
ERROR 1140 (42000): Mixing of GROUP columns (MIN(),MAX(),COUNT(),...) with
no GROUP columns is illegal if there is no GROUP BY clause


One rewrite is the one suggested by Corrado - first find the ArtNumber, then
retrieve the corresponding row.
An alternative is to use ORDER BY ArtNumber ASC|DESC LIMIT 1, i.e.:

mysql SELECT ArtNumber, MessageID FROM 78168ea0a9b3b513a1f2d39b559b406e
WHERE ArtNumber  28806 ORDER BY ArtNumber ASC LIMIT 1;
+---+-+

| ArtNumber | MessageID   |
+---+-+
| 28807 | groundworks |
+---+-+
1 row in set (0.00 sec)

mysql SELECT ArtNumber, MessageID FROM 78168ea0a9b3b513a1f2d39b559b406e
WHERE ArtNumber  28806 ORDER BY ArtNumber DESC LIMIT 1;
+---++
| ArtNumber | MessageID  |
+---++
| 28805 | pearl-bordered |
+---++
1 row in set (0.00 sec)


Best regards,
Jesper Krogh
MySQL Support 



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



MySQL Connector/Python 2.0.0 Alpha has been released

2014-07-24 Thread Sowmya Dass


Dear MySQL users,

MySQL Connector/Python 2.0.0-alpha is the first alpha version of
2.0 release series of the pure Python database driver for MySQL.
It is intended to introduce users to the new features.

This release is not feature complete but it should be stable
enough for users to understand the new features and how we
expect them to work.

As is the case with all non-GA releases, it should not be used
in any production environment.

MySQL Connector/Python version 2.0.0-alpha is compatible with MySQL
Server versions 5.5 and greater. Python 2.6 and greater as well as
Python 3.3 and greater are supported. Python 2.4, 2.5 and 3.1, 3.2
are not supported.

MySQL Connector/Python 2.0.0-alpha is available for download from:
http://dev.mysql.com/downloads/connector/python/#downloads

The ChangeLog file included in the distribution contains a brief summary
of changes in MySQL Connector/Python 2.0.0. For a more complete list of
changes, see below or online at:
http://dev.mysql.com/doc/relnotes/connector-python/en/

=

Changes in MySQL Connector/Python 2.0.0 (2014-07-24, Alpha)

   Functionality Added or Changed

 * Incompatible Change: Previous series of Connector/Python had
   separate Python 2 and Python 3 code bases. For
   Connector/Python 2.0, the source tree has been reorganized to
   have a single code base, for easier maintenance, testing, and
   distribution.
   This reorganization results in an incompatible change in
   behavior: With the use of raw cursors, the returned values
   is of the bytearray type. This is necessary for having both
   Python 2 and 3 return the same data. Consider the following
   example:
   import mysql.connector

   cnx = mysql.connector.connect(raw=True)
   cursor = cnx.cursor()
   cursor.execute('SELECT 1')
   print(cursor.fetchall())
   In Connector/Python 1.x, the output is:

  + Using Python 2: [('1',)]

  + Using Python 3: [(b'1',)]
   In Connector/Python 2.0, for both Python versions, the output
   is: [(bytearray(b'1'),)]
   To get the same value as in Connector/Python 1.x, do this:

  + Using Python 2: str(bytearray(b'1'))

  + Using Python 3: bytes((bytearray(b'1'))

 * Important Change: Previously, to enable use of LOAD DATA LOCAL
   INFILE, clients had to explicitly set the
   ClientFlag.LOCAL_FILES flag. This flag is now enabled by
   default. To disable it, the allow_local_infile option for
   connect()can be set to False.

 * For a stored procedure that produces multiple result sets, it
   is now possible possible to execute the procedure and process
   its results by executing a CALL statement. Execute the
   statement using execute() with a multi=True argument, and use
   the returned iterator to process each result in turn. (Bug
   #73291, Bug #19207922)

 * The packaging modules and supporting files have been removed
   from the main repository and from the source packages for
   Connector/Python. They are still available in the
   Connector/Python 1.x series.

 * The mysql.connector.cursor module supports four new cursor
   classes:

  + The MySQLCursorDict cursor class returns each row as a
dictionary. The keys for each dictionary object are the
column names of the MySQL result.
cursor = cnx.cursor(dictionary=True)

  + The MySQLCursorBufferedDict cursor class is like
MySQLCursorDict, but fetches the entire result set after
executing the query and buffers the rows.
cursor = cnx.cursor(dictionary=True, buffered=True)

  + The MySQLCursorNamedTuple cursor class returns each row
as a named tuple. Each column is accessible through an
attribute of the tuple-like object.
cursor = cnx.cursor(named_tuple=True)

  + The MySQLCursorBufferedNamedTuple cursor class is like
MySQLCursorNamedTuple, but fetches the entire result set
after executing the query and buffers the rows.
cursor = cnx.cursor(named_tuple=True, buffered=True)
For more information, see cursor.MySQLCursor Subclasses
(http://dev.mysql.com/doc/connector-python/en/connector-python
-api-cursor-subclasses.html).

 * Connector/Python now supports option files using two new
   options for connect():

  + option_files: Which option files to read. The value can
be a file path name (a string) or a sequence of path name
strings. By default, Connector/Python reads no option
files, so this argument must be given explicitly to cause
option files to be read. Files are read in the order
specified.

  + option_groups: Which groups to read from option files, if
option