Correct me if I'm wrong. You're wanting to get all records that have an open_date or a close_date between two times.
If that's correct, you might be able to get an index_merge by doing a query like: WHERE ((starting time)<=open_dt<= (ending time)) OR ((starting time)<=close_dt<=(ending time)) and creating two indexes (one on 'open_dt' and the other on 'close_dt') http://dev.mysql.com/doc/refman/5.1/en/index-merge-optimization.html Regards, Derek On Sep 8, 2011, at 2:50 PM, Brandon Phelps wrote: > Andy, > > The queries take minutes to run. MySQL is 5.1.54 and it's running on Ubuntu > server 11.04. Unfortunately the machine only has 2GB of RAM but no other > major daemons are running on the machine. We are running RAID 1 (mirroring) > with 1TB drives. The tables in question here are all MyISAM. When running > with the LIMIT 10 my EXPLAIN is: > > +----+-------------+-------+--------+---------------------+---------------------+---------+--------------------+----------+-----------------------------+ > | id | select_type | table | type | possible_keys | key > | key_len | ref | rows | Extra | > +----+-------------+-------+--------+---------------------+---------------------+---------+--------------------+----------+-----------------------------+ > | 1 | SIMPLE | sc | range | ndx_open_close_rcvd | > ndx_open_close_rcvd | 8 | NULL | 32393316 | Using where; > Using filesort | > | 1 | SIMPLE | spm | eq_ref | PRIMARY | PRIMARY > | 2 | syslog.sc.src_port | 1 | | > | 1 | SIMPLE | dpm | eq_ref | PRIMARY | PRIMARY > | 2 | syslog.sc.dst_port | 1 | | > +----+-------------+-------+--------+---------------------+---------------------+---------+--------------------+----------+-----------------------------+ > > When I remove the LIMIT 10 I get: > > ----+-------------+-------+--------+---------------------+---------+---------+--------------------+----------+-----------------------------+ > | id | select_type | table | type | possible_keys | key | key_len > | ref | rows | Extra | > +----+-------------+-------+--------+---------------------+---------+---------+--------------------+----------+-----------------------------+ > | 1 | SIMPLE | sc | ALL | ndx_open_close_rcvd | NULL | NULL > | NULL | 32393330 | Using where; Using filesort | > | 1 | SIMPLE | spm | eq_ref | PRIMARY | PRIMARY | 2 > | syslog.sc.src_port | 1 | | > | 1 | SIMPLE | dpm | eq_ref | PRIMARY | PRIMARY | 2 > | syslog.sc.dst_port | 1 | | > +----+-------------+-------+--------+---------------------+---------+---------+--------------------+----------+-----------------------------+ > > Thanks for all your help thus far. > > On 09/08/2011 02:38 PM, Andrew Moore wrote: >> I don't think I saw any query timings in the emails (maybe I missed them). >> >> What version of MySQL are you currently using? >> What does the explain look like when your remove the limit 10? >> Is your server tuned for MyISAM or InnoDB? >> What kind of disk setup is in use? >> How much memory is in your machine? >> >> >> On Thu, Sep 8, 2011 at 7:27 PM, Brandon Phelps<bphe...@gls.com> wrote: >> >>> Thanks for the reply Andy. Unfortunately the users will be selecting >>> varying date ranges and new data is constantly coming in, so I am not sure >>> how I could archive/cache the necessary data that would be any more >>> efficient than simply using the database directly. >>> >>> >>> >>> On 09/08/2011 02:16 PM, Andrew Moore wrote: >>> >>>> Thinking outside the query, is there any archiving that could happen to >>>> make >>>> your large tables kinder in the range scan? >>>> >>>> Andy >>>> >>>> On Thu, Sep 8, 2011 at 7:03 PM, Brandon Phelps<bphe...@gls.com> wrote: >>>> >>>> On 09/01/2011 01:32 PM, Brandon Phelps wrote: >>>>> >>>>> On 09/01/2011 12:47 PM, Shawn Green (MySQL) wrote: >>>>>> >>>>>> On 9/1/2011 09:42, Brandon Phelps wrote: >>>>>>> >>>>>>> On 09/01/2011 04:59 AM, Jochem van Dieten wrote: >>>>>>>> >>>>>>>>> ... >>>>>>>>>> >>>>>>>>> >>>>>>>> WHERE >>>>>>>>>> (open_dt>= '2011-08-30 00:00:00' OR close_dt>= '2011-08-30 >>>>>>>>>> >>>>>>>>> 00:00:00') >>>>>>>> >>>>>>>>> AND (open_dt<= '2011-08-30 12:36:53' OR close_dt<= '2011-08-30 >>>>>>>>>> >>>>>>>>> 12:36:53') >>>>>>>> >>>>>>>> In that case your logic here simplifies to: >>>>>>>>> WHERE >>>>>>>>> open_dt>= '2011-08-30 00:00:00' >>>>>>>>> AND >>>>>>>>> close_dt<= '2011-08-30 12:36:53' >>>>>>>>> >>>>>>>> >>>>>>>> Now add an index over open_dt and close_dt and see what happens. >>>>>>>>> >>>>>>>> >>>>>>>> Jochem >>>>>>>>> >>>>>>>> >>>>>>>> Jochem, >>>>>>>> >>>>>>>> I can't really use your WHERE logic because I also need to retrieve >>>>>>>> results where the open_dt time is out of the range specified. For >>>>>>>> example, a very large file download might span multiple days so given >>>>>>>> your logic if the connection was started 2 days ago and I want to pull >>>>>>>> 1 >>>>>>>> days worth of connections, I would miss that entry. Basically I want >>>>>>>> to >>>>>>>> SELECT all of the records that were opened OR closed during the >>>>>>>> specified time period, ie. if any activity happened between my start >>>>>>>> and >>>>>>>> end dates, I need to see that record. >>>>>>>> >>>>>>>> Any other ideas? >>>>>>>> >>>>>>>> >>>>>>>> I believe Jochem was on the right track but he got his dates >>>>>>> reversed. >>>>>>> >>>>>>> Let's try a little ASCII art to show the situation. I will setup a >>>>>>> query >>>>>>> window with two markers (s) and (e). Events will be marked by |----| >>>>>>> markers >>>>>>> showing their durations. >>>>>>> >>>>>>> a) (s) (e) >>>>>>> b) |---| >>>>>>> c) |---| >>>>>>> d) |---| >>>>>>> e) |--------------------| >>>>>>> f) |---| >>>>>>> g) |---| >>>>>>> >>>>>>> To describe these situations: >>>>>>> a) is the window for which you want to query (s) is the starting time >>>>>>> and >>>>>>> (e) is the ending time for the date range you are interested in. >>>>>>> b) the event starts and stops before your window exists. It won't be >>>>>>> part >>>>>>> of your results. >>>>>>> c) the event starts before the window but ends within the window - >>>>>>> include this >>>>>>> d) the event starts and ends within the window - include this >>>>>>> e) the event starts before the window and ends after the window - >>>>>>> include >>>>>>> this >>>>>>> f) the event starts inside the window but ends beyond the window - >>>>>>> include this. >>>>>>> g) the event starts and ends beyond the window - exclude this. >>>>>>> >>>>>>> In order to get every event in the range of c-f, here is what you need >>>>>>> for a WHERE clause >>>>>>> >>>>>>> WHERE start<= (ending time) and end>= (starting time) >>>>>>> >>>>>>> Try that and let us know the results. >>>>>>> >>>>>>> >>>>>> Thanks Jochem and Shawn, however the following two queries result in the >>>>>> exact same EXPLAIN output: (I hope the tables don't wrap too early for >>>>>> you) >>>>>> >>>>>> Old method: >>>>>> SELECT >>>>>> sc.open_dt, >>>>>> sc.close_dt, >>>>>> sc.protocol, >>>>>> INET_NTOA(sc.src_address) AS src_address, >>>>>> sc.src_port, >>>>>> INET_NTOA(sc.dst_address) AS dst_address, >>>>>> sc.dst_port, >>>>>> sc.sent, >>>>>> sc.rcvd, >>>>>> spm.desc AS src_port_desc, >>>>>> dpm.desc AS dst_port_desc >>>>>> FROM firewall_connections AS sc >>>>>> LEFT JOIN port_mappings AS spm ON spm.port = sc.src_port >>>>>> LEFT JOIN port_mappings AS dpm ON dpm.port = sc.dst_port >>>>>> WHERE >>>>>> (open_dt>= '2011-08-31 09:53:31' OR close_dt>= '2011-08-31 09:53:31') >>>>>> AND (open_dt<= '2011-09-01 09:53:31' OR close_dt<= '2011-09-01 >>>>>> 09:53:31') >>>>>> ORDER BY rcvd DESC >>>>>> LIMIT 0, 10; >>>>>> >>>>>> New method with BTREE index on open_dt, close_dt (index name is >>>>>> ndx_open_close_dt): >>>>>> SELECT >>>>>> sc.open_dt, >>>>>> sc.close_dt, >>>>>> sc.protocol, >>>>>> INET_NTOA(sc.src_address) AS src_address, >>>>>> sc.src_port, >>>>>> INET_NTOA(sc.dst_address) AS dst_address, >>>>>> sc.dst_port, >>>>>> sc.sent, >>>>>> sc.rcvd, >>>>>> spm.desc AS src_port_desc, >>>>>> dpm.desc AS dst_port_desc >>>>>> FROM firewall_connections AS sc >>>>>> LEFT JOIN port_mappings AS spm ON spm.port = sc.src_port >>>>>> LEFT JOIN port_mappings AS dpm ON dpm.port = sc.dst_port >>>>>> WHERE >>>>>> open_dt<= '2011-09-01 09:53:31' AND close_dt>= '2011-08-31 09:53:31' >>>>>> ORDER BY rcvd DESC >>>>>> LIMIT 0, 10; >>>>>> >>>>>> EXPLAIN output for old method: >>>>>> +----+-------------+-------+--****------+---------------------**--** >>>>>> ----+----------+---------+----****----------------+------+----** >>>>>> --**-------+ >>>>>> | id | select_type | table | type | possible_keys | key | key_len | ref >>>>>> | >>>>>> rows | Extra | >>>>>> +----+-------------+-------+--****------+---------------------**--** >>>>>> ----+----------+---------+----****----------------+------+----** >>>>>> --**-------+ >>>>>> | 1 | SIMPLE | sc | index | open_dt,ndx_open_close_dt | ndx_rcvd | 4 | >>>>>> NULL | 10 | Using where | >>>>>> | 1 | SIMPLE | spm | eq_ref | PRIMARY | PRIMARY | 2 | syslog.sc.src_port >>>>>> | >>>>>> 1 | | >>>>>> | 1 | SIMPLE | dpm | eq_ref | PRIMARY | PRIMARY | 2 | syslog.sc.dst_port >>>>>> | >>>>>> 1 | | >>>>>> +----+-------------+-------+--****------+---------------------**--** >>>>>> ----+----------+---------+----****----------------+------+----** >>>>>> --**-------+ >>>>>> >>>>>> EXPLAIN output for new method with new index: >>>>>> +----+-------------+-------+--****------+---------------------**--** >>>>>> ----+----------+---------+----****----------------+------+----** >>>>>> --**-------+ >>>>>> | id | select_type | table | type | possible_keys | key | key_len | ref >>>>>> | >>>>>> rows | Extra | >>>>>> +----+-------------+-------+--****------+---------------------**--** >>>>>> ----+----------+---------+----****----------------+------+----** >>>>>> --**-------+ >>>>>> | 1 | SIMPLE | sc | index | open_dt,ndx_open_close_dt | ndx_rcvd | 4 | >>>>>> NULL | 10 | Using where | >>>>>> | 1 | SIMPLE | spm | eq_ref | PRIMARY | PRIMARY | 2 | syslog.sc.src_port >>>>>> | >>>>>> 1 | | >>>>>> | 1 | SIMPLE | dpm | eq_ref | PRIMARY | PRIMARY | 2 | syslog.sc.dst_port >>>>>> | >>>>>> 1 | | >>>>>> +----+-------------+-------+--****------+---------------------**--** >>>>>> ----+----------+---------+----****----------------+------+----** >>>>>> --**-------+ >>>>>> >>>>>> SHOW INDEX: >>>>>> +----------------------+------****------+-------------------+-**--** >>>>>> -----------+-------------+----****-------+-------------+------**--** >>>>>> --+--------+------+-----------****-+---------+ >>>>>> | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation >>>>>> | >>>>>> Cardinality | Sub_part | Packed | Null | Index_type | Comment | >>>>>> +----------------------+------****------+-------------------+-**--** >>>>>> -----------+-------------+----****-------+-------------+------**--** >>>>>> --+--------+------+-----------****-+---------+ >>>>>> | firewall_connections | 1 | ndx_open_close_dt | 1 | open_dt | A | >>>>>> 1342691 >>>>>> | NULL | NULL | | BTREE | | >>>>>> | firewall_connections | 1 | ndx_open_close_dt | 2 | close_dt | A | >>>>>> 6377783 | NULL | NULL | | BTREE | | >>>>>> +----------------------+------****------+-------------------+-**--** >>>>>> -----------+-------------+----****-------+-------------+------**--** >>>>>> --+--------+------+-----------****-+---------+ >>>>>> >>>>>> >>>>>> Although right now the queries do seem to be executing much faster, >>>>>> although I'm not quite sure why. And I'm not sure why the new >>>>>> ndx_open_close_dt isn't being used either. >>>>>> >>>>>> -Brandon >>>>>> >>>>>> >>>>>> I am still having a big issue with my query as seen above. The table >>>>> is up >>>>> to around 32 million records at the moment and either of the two SELECT >>>>> queries above take a very long time to run. Is there anything at all I >>>>> can >>>>> do to speed things up? It seems that changing the format of the WHERE >>>>> clause did not help at all, as the EXPLAIN output is exactly the same for >>>>> both version. I also tried adding an index on (open_dt, close_dt, rcvd) >>>>> but >>>>> that index does not get used. >>>>> >>>>> Any other ideas? >>>>> >>>>> Thanks in advance, >>>>> >>>>> >>>>> Brandon >>>>> >>>>> -- >>>>> MySQL General Mailing List >>>>> For list archives: http://lists.mysql.com/mysql >>>>> To unsubscribe: >>>>> http://lists.mysql.com/mysql?****<http://lists.mysql.com/mysql?**> >>>>> unsub=eroomy...@gmail.com<http**://lists.mysql.com/mysql?** >>>>> unsub=eroomy...@gmail.com<http://lists.mysql.com/mysql?unsub=eroomy...@gmail.com> >>>>>> >>>>> >>>>> >>>>> >>>> >>> -- >>> MySQL General Mailing List >>> For list archives: http://lists.mysql.com/mysql >>> To unsubscribe: http://lists.mysql.com/mysql?** >>> unsub=eroomy...@gmail.com<http://lists.mysql.com/mysql?unsub=eroomy...@gmail.com> >>> >>> >> > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe: http://lists.mysql.com/mysql?unsub=de...@orange-pants.com > -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org