Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-08 Thread Hans Kristian Rosbach
> >>>CREATE TABLE headers ( > >>> id SERIAL PRIMARY KEY, > >>> name TEXT NOT NULL > >>>); > >>>CREATE UNIQUE INDEX headers_name_udx ON headers(name); > >>> > >>>CREATE TABLE header_msg_map ( > >>> id SERIAL8 PRIMARY KEY, > >>> physmessage_id INT NOT NULL, > >>> header_id INT NOT NULL, > >

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-08 Thread Hans Kristian Rosbach
On Mon, 2004-11-08 at 16:22, Matthew T. O'Connor wrote: > Hans Kristian Rosbach wrote: > > >On Fri, 2004-11-05 at 21:32, Sean Chittenden wrote: > > > > > >>CREATE TABLE headers ( > >> id SERIAL PRIMARY KEY, > >> name TEXT NOT NULL > >>); > >>CREATE UNIQUE INDEX headers_name_udx ON headers(na

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-08 Thread Matthew T. O'Connor
Hans Kristian Rosbach wrote: On Fri, 2004-11-05 at 21:32, Sean Chittenden wrote: CREATE TABLE headers ( id SERIAL PRIMARY KEY, name TEXT NOT NULL ); CREATE UNIQUE INDEX headers_name_udx ON headers(name); CREATE TABLE header_msg_map ( id SERIAL8 PRIMARY KEY, physmessage_id INT NOT NU

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-08 Thread Hans Kristian Rosbach
On Fri, 2004-11-05 at 21:32, Sean Chittenden wrote: > > Couldn't this be solved by having a headers table like this? > > > > Create table headers( > > hid serial primary key, > > physmessage_id int not null, > > header varchar not null, > > value varchar not null, > > ); > > create

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-08 Thread Hans Kristian Rosbach
On Fri, 2004-11-05 at 16:57, Mikhail Ramendik wrote: > Hans Kristian Rosbach wrote: > > > Couldn't this be solved by having a headers table like this? > > > > Create table headers( > > hid serial primary key, > > physmessage_id int not null, > > header varchar not null, > > value

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Thomas Mueller
Hi Aaron, > > Aaron: more returned rows make a real difference if the imapd and > > database are on different hosts. > > True, but how much of a difference? The REGEXP I wrote took almost 10 > minutes to do what my LIKE did in 30 seconds. Unless your two hosts are > separated by 300 feet of 802.1

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Sean Chittenden
Couldn't this be solved by having a headers table like this? Create table headers( hid serial primary key, physmessage_id int not null, header varchar not null, value varchar not null, ); create index headers_physid_header on headers(physmessage_id,header); This w

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Aaron Stone
Magnus Sundberg <[EMAIL PROTECTED]> said: > Jesse Norell wrote: >> Along these lines, and admittedly maybe not a great idea, would it seem >> desirable to add a config item that sets a "mode" for dbmail that either >> does as much work itsself vs. using the database features, or vice-versa? >> M

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Aaron Stone
Thomas Mueller <[EMAIL PROTECTED]> said: > Aaron: more returned rows make a real difference if the imapd and > database are on different hosts. True, but how much of a difference? The REGEXP I wrote took almost 10 minutes to do what my LIKE did in 30 seconds. Unless your two hosts are separated b

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Mikhail Ramendik
Hans Kristian Rosbach wrote: Couldn't this be solved by having a headers table like this? Create table headers( hid serial primary key, physmessage_id int not null, header varchar not null, value varchar not null, ); This and similar solutions were discussed an

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Magnus Sundberg
Jesse Norell wrote: SELECT DISTINCT(physmessage_id) FROM dbmail_messageblks WHERE messageblk LIKE '%Aaron%'; 19066 rows in set (12.81 sec) So it takes about 35% of the time, but returned 190% of the rows. We'll have to combine this time trial with the header parser to see which adds more t

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Jesse Norell
> > SELECT DISTINCT(physmessage_id) FROM dbmail_messageblks WHERE > > messageblk LIKE '%Aaron%'; > > 19066 rows in set (12.81 sec) > > So it takes about 35% of the time, but returned 190% of the rows. We'll > have to combine this time trial with the header parser to see which adds > more time

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Hans Kristian Rosbach
> > X-OfflineIMAP-901701146-4c6f63616c4d69726a616d-494e424f58: > > 1086726519-0790956581151 > > Searches for p.e. X-Spam-Status are quite common too. > > Aaron: more returned rows make a real difference if the imapd and > database are on different hosts. > > The point I don't see in this discu

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Thomas Mueller
Hi Paul, > >This might be important if you're searching for uncommon headers. Of > >course, uncommon headers probably also mean that they're uncommonly > >searched for... > > Don't count on it. Offlineimap is a common client for syncing > imap-instances. It uses it's own header-type where both k

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Paul J Stevens
Aaron Stone wrote: This might be important if you're searching for uncommon headers. Of course, uncommon headers probably also mean that they're uncommonly searched for... Don't count on it. Offlineimap is a common client for syncing imap-instances. It uses it's own header-type where both ke

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Aaron Stone
Mikhail Ramendik <[EMAIL PROTECTED]> said: > Aaron Stone wrote: >> mysql> SELECT DISTINCT(physmessage_id) FROM dbmail_messageblks WHERE >> messageblk LIKE '%From:%Aaron%'; >> >> 11089 rows in set (31.17 sec) [cpu hovered around 50%] > > Could you pelase also benchmark > > SELECT DISTINCT(physme

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Mikhail Ramendik
I wrote: Could you please also benchmark SELECT DISTINCT(physmessage_id) FROM dbmail_messageblks WHERE messageblk LIKE '%Aaron%'; Tried it myself. No dbig dofference. so it's better to keep the header name to kill off some false positives. BTW, the idea really seems fruitful. The query run

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Mikhail Ramendik
Aaron Stone wrote: I'm thinking we should go with LIKE. This is MySQL 4.0.21 on Gentoo Linux on a Pentium III 866. mysql> SELECT DISTINCT(physmessage_id) FROM dbmail_messageblks WHERE messageblk LIKE '%From:%Aaron%'; 11089 rows in set (31.17 sec) [cpu hovered around 50%] Could you pelase als

Re: [Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Matthew T. O'Connor
Aaron Stone wrote: I'm thinking we should go with LIKE. This is MySQL 4.0.21 on Gentoo Linux on a Pentium III 866. mysql> SELECT DISTINCT(physmessage_id) FROM dbmail_messageblks WHERE messageblk LIKE '%From:%Aaron%'; 11089 rows in set (31.17 sec) [cpu hovered around 50%] mysql> SELECT DISTINC

[Dbmail-dev] Quick benchmark of LIKE vs. REGEXP in MySQL

2004-11-05 Thread Aaron Stone
I'm thinking we should go with LIKE. This is MySQL 4.0.21 on Gentoo Linux on a Pentium III 866. mysql> SELECT DISTINCT(physmessage_id) FROM dbmail_messageblks WHERE messageblk LIKE '%From:%Aaron%'; 11089 rows in set (31.17 sec) [cpu hovered around 50%] mysql> SELECT DISTINCT(physmessage_id) FROM