Re: help with big table search

2002-02-25 Thread Jaime Teng

Hi,

It appears fulltext has the potential for my particular problem.
However, after altering my table to contain FULLTEXT, it did not
help much:

#1:
SELECT * FROM eventlog WHERE detail like '%keyword%'
ORDER BY id DESC LIMIT 20;

#2:
SELECT * FROM eventlog WHERE MATCH(description) 
AGAINST ('keyword') ORDER BY id DESC LIMIT 20;

I concluded that the FULLTEXT search was fast ONLY when the 
table is relatively small. I made a small copy of my table 
of around 20K records and it was *VERY* fast.

However for my particular case when the table has 1.7million 
records occupying 200MB, FULLTEXT search normally would take 
around 30~60 seconds. Whereas a 'LIKE' search could give a result 
in 1 second at best and 2~5 minutes at worst. BUT most of my search 
falls under the less than 10 seconds. SO I am fairly satisfied 
using LIKE than having to constantly wait 30 seconds all the time.

In other words, (my experience of) FULLTEXT's performance is 
currently the same (if not better than) as my effort in making 
a crossref table as explained below. Actually FULLTEXT *is*
technically the same as what I did but only better.

And in my case, I have no alternatives other than either I:
* dont use any fancy indexes as it did not helped much, or
* since I have control of what keywords I want/need, I make
  a smaller crossref table. I could not do this with FULLTEXT
  as i do not have control of what FULLTEXT would do.

I think the second option is workable.

PS. I did not know there was FULLTEXT; I was using and old PDF
manual.

Jaime


At 06:42 PM 2/24/02 -0800, Jeff Kilbride wrote:
I'm about to try a full text index in a very similar situation, which has
the potential to grow fairly big. I'd also be interested in hearing how
MySQL's full text index works for your large dataset.

Thanks,
--jeff

- Original Message -
From: Luke Muszkiewicz [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, February 24, 2002 6:30 PM
Subject: RE: help with big table search


 Jaime:

 Have you tried creating a FULLTEXT index? If so, I'd like to hear about
 how well it works on such a large database.

 http://www.mysql.com/doc/F/u/Fulltext_Search.html

 Best of luck.

 -luke

 Luke Muszkiewicz
 Pure Development, LLC
 http://puredev.com

  Hi,
 
  I have a table and currently has about 1.6 million entries.
  It is a table of events with date/time and description of the event.
 
  mysql describe eventlog;
  +-+--+--+-+-+-
  ---+
  | Field   | Type | Null | Key | Default |
  Extra  |
  +-+--+--+-+-+-
  ---+
  | id  | int(10) unsigned |  | PRI | NULL|
  auto_increment |
  | timestamp   | int(10) unsigned |  | MUL | 0   |
 |
  | description | char(100)|  | | |
 |
  +-+--+--+-+-+-
  ---+
 
  My task is to search this table for any particular word that may occur
  anywhere in the description field:
  ie.
 
  SELECT * FROM eventlog
  WHERE description LIKE '%pattern%'
  ORDER BY id DESC LIMIT 20;
 
  Considering that I am using LIKE instead of = as a search
  option, making
  description into an index would not do any good.
 
  This search works well ONLY if the items to search are relatively near
  the top of the table *AND* most importantly, there are at least '20'
  matching items available on the table. *IF* the table contains only
  19 matches or less, then the SELECT will search through the whole
  table and I may have to wait some 2~3 minutes to get the result.
 
 
  Then I started using cross-reference table.
 
  mysql describe crossref;
  +---+--+--+-+-+---+
  | Field | Type | Null | Key | Default | Extra |
  +---+--+--+-+-+---+
  | word  | char(15) |  | MUL | |   |
  | id| int(10) unsigned |  | MUL | 0   |   |
  +---+--+--+-+-+---+
  word is an independent index,
  id is an independent index
 
  For every entry into the eventlog table, I broken down each word from
  description and inserted them into the crossref table. This way,
  whenever I want to find the word HELLO, all i need to do
  is:
  SELECT id FROM crossref WHERE word = 'HELLO' ORDER BY id DESC
  limit 20;
  and then use the results into another search:
  SELECT * from eventlog where id in (previous result);
 
  However still, for whatever reason, this search isnt working well.
  Though the search time is better than before, it still takes about
  30~60 seconds for an answer. (sometimes fairly fast 5 seconds).
  SOMETIMES, searching through this crossref were actually slower.
 
  I'd like to know how you people come up with a very good table and
  search. Eventlog is currently 200MB in size.
 
  Jaime

RE: help with big table search

2002-02-24 Thread Daniel Rosher

Jamie,

I think your approach of a cross-reference table is a good start. This is
similar to creating a stemming index. Perhaps you might like to look into,
for example, some perl Modules for stemming (like Linga::Stem) to further
reduce your data space.

Perhaps since there may be a large resultset from the first select, and then
feeding this information back to the server, and the second search also
returning a large resultset, perhaps you'd be better off trying to obtain
the results in one select statement --- what about something like:

select eventlog.*
from eventlog left join crossref on (eventlog.id=crossref.id and word =
'HELLO')
ORDER BY eventlog.id DESC LIMIT 20;

?

Regards,
Dan

 -Original Message-
 From: Jaime Teng [mailto:[EMAIL PROTECTED]]
 Sent: Monday, 25 February 2002 2:34 p.m.
 To: [EMAIL PROTECTED]
 Subject: help with big table search


 Hi,

 I have a table and currently has about 1.6 million entries.
 It is a table of events with date/time and description of the event.

 mysql describe eventlog;
 +-+--+--+-+-++
 | Field   | Type | Null | Key | Default | Extra  |
 +-+--+--+-+-++
 | id  | int(10) unsigned |  | PRI | NULL| auto_increment |
 | timestamp   | int(10) unsigned |  | MUL | 0   ||
 | description | char(100)|  | | ||
 +-+--+--+-+-++

 My task is to search this table for any particular word that may occur
 anywhere in the description field:
 ie.

 SELECT * FROM eventlog
 WHERE description LIKE '%pattern%'
 ORDER BY id DESC LIMIT 20;

 Considering that I am using LIKE instead of = as a search option, making
 description into an index would not do any good.

 This search works well ONLY if the items to search are relatively near
 the top of the table *AND* most importantly, there are at least '20'
 matching items available on the table. *IF* the table contains only
 19 matches or less, then the SELECT will search through the whole
 table and I may have to wait some 2~3 minutes to get the result.


 Then I started using cross-reference table.

 mysql describe crossref;
 +---+--+--+-+-+---+
 | Field | Type | Null | Key | Default | Extra |
 +---+--+--+-+-+---+
 | word  | char(15) |  | MUL | |   |
 | id| int(10) unsigned |  | MUL | 0   |   |
 +---+--+--+-+-+---+
 word is an independent index,
 id is an independent index

 For every entry into the eventlog table, I broken down each word from
 description and inserted them into the crossref table. This way,
 whenever I want to find the word HELLO, all i need to do
 is:
 SELECT id FROM crossref WHERE word = 'HELLO' ORDER BY id DESC limit 20;
 and then use the results into another search:
 SELECT * from eventlog where id in (previous result);

 However still, for whatever reason, this search isnt working well.
 Though the search time is better than before, it still takes about
 30~60 seconds for an answer. (sometimes fairly fast 5 seconds).
 SOMETIMES, searching through this crossref were actually slower.

 I'd like to know how you people come up with a very good table and
 search. Eventlog is currently 200MB in size.

 Jaime


 -
 Before posting, please check:
http://www.mysql.com/manual.php   (the manual)
http://lists.mysql.com/   (the list archive)

 To request this thread, e-mail [EMAIL PROTECTED]
 To unsubscribe, e-mail
 [EMAIL PROTECTED]
 Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




RE: help with big table search

2002-02-24 Thread Luke Muszkiewicz

Jaime:

Have you tried creating a FULLTEXT index? If so, I'd like to hear about
how well it works on such a large database.

http://www.mysql.com/doc/F/u/Fulltext_Search.html

Best of luck.

-luke

Luke Muszkiewicz
Pure Development, LLC
http://puredev.com

 Hi,

 I have a table and currently has about 1.6 million entries.
 It is a table of events with date/time and description of the event.

 mysql describe eventlog;
 +-+--+--+-+-+-
 ---+
 | Field   | Type | Null | Key | Default |
 Extra  |
 +-+--+--+-+-+-
 ---+
 | id  | int(10) unsigned |  | PRI | NULL|
 auto_increment |
 | timestamp   | int(10) unsigned |  | MUL | 0   |
|
 | description | char(100)|  | | |
|
 +-+--+--+-+-+-
 ---+

 My task is to search this table for any particular word that may occur
 anywhere in the description field:
 ie.

 SELECT * FROM eventlog
 WHERE description LIKE '%pattern%'
 ORDER BY id DESC LIMIT 20;

 Considering that I am using LIKE instead of = as a search
 option, making
 description into an index would not do any good.

 This search works well ONLY if the items to search are relatively near
 the top of the table *AND* most importantly, there are at least '20'
 matching items available on the table. *IF* the table contains only
 19 matches or less, then the SELECT will search through the whole
 table and I may have to wait some 2~3 minutes to get the result.


 Then I started using cross-reference table.

 mysql describe crossref;
 +---+--+--+-+-+---+
 | Field | Type | Null | Key | Default | Extra |
 +---+--+--+-+-+---+
 | word  | char(15) |  | MUL | |   |
 | id| int(10) unsigned |  | MUL | 0   |   |
 +---+--+--+-+-+---+
 word is an independent index,
 id is an independent index

 For every entry into the eventlog table, I broken down each word from
 description and inserted them into the crossref table. This way,
 whenever I want to find the word HELLO, all i need to do
 is:
 SELECT id FROM crossref WHERE word = 'HELLO' ORDER BY id DESC
 limit 20;
 and then use the results into another search:
 SELECT * from eventlog where id in (previous result);

 However still, for whatever reason, this search isnt working well.
 Though the search time is better than before, it still takes about
 30~60 seconds for an answer. (sometimes fairly fast 5 seconds).
 SOMETIMES, searching through this crossref were actually slower.

 I'd like to know how you people come up with a very good table and
 search. Eventlog is currently 200MB in size.

 Jaime


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: help with big table search

2002-02-24 Thread Jeff Kilbride

I'm about to try a full text index in a very similar situation, which has
the potential to grow fairly big. I'd also be interested in hearing how
MySQL's full text index works for your large dataset.

Thanks,
--jeff

- Original Message -
From: Luke Muszkiewicz [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, February 24, 2002 6:30 PM
Subject: RE: help with big table search


 Jaime:

 Have you tried creating a FULLTEXT index? If so, I'd like to hear about
 how well it works on such a large database.

 http://www.mysql.com/doc/F/u/Fulltext_Search.html

 Best of luck.

 -luke

 Luke Muszkiewicz
 Pure Development, LLC
 http://puredev.com

  Hi,
 
  I have a table and currently has about 1.6 million entries.
  It is a table of events with date/time and description of the event.
 
  mysql describe eventlog;
  +-+--+--+-+-+-
  ---+
  | Field   | Type | Null | Key | Default |
  Extra  |
  +-+--+--+-+-+-
  ---+
  | id  | int(10) unsigned |  | PRI | NULL|
  auto_increment |
  | timestamp   | int(10) unsigned |  | MUL | 0   |
 |
  | description | char(100)|  | | |
 |
  +-+--+--+-+-+-
  ---+
 
  My task is to search this table for any particular word that may occur
  anywhere in the description field:
  ie.
 
  SELECT * FROM eventlog
  WHERE description LIKE '%pattern%'
  ORDER BY id DESC LIMIT 20;
 
  Considering that I am using LIKE instead of = as a search
  option, making
  description into an index would not do any good.
 
  This search works well ONLY if the items to search are relatively near
  the top of the table *AND* most importantly, there are at least '20'
  matching items available on the table. *IF* the table contains only
  19 matches or less, then the SELECT will search through the whole
  table and I may have to wait some 2~3 minutes to get the result.
 
 
  Then I started using cross-reference table.
 
  mysql describe crossref;
  +---+--+--+-+-+---+
  | Field | Type | Null | Key | Default | Extra |
  +---+--+--+-+-+---+
  | word  | char(15) |  | MUL | |   |
  | id| int(10) unsigned |  | MUL | 0   |   |
  +---+--+--+-+-+---+
  word is an independent index,
  id is an independent index
 
  For every entry into the eventlog table, I broken down each word from
  description and inserted them into the crossref table. This way,
  whenever I want to find the word HELLO, all i need to do
  is:
  SELECT id FROM crossref WHERE word = 'HELLO' ORDER BY id DESC
  limit 20;
  and then use the results into another search:
  SELECT * from eventlog where id in (previous result);
 
  However still, for whatever reason, this search isnt working well.
  Though the search time is better than before, it still takes about
  30~60 seconds for an answer. (sometimes fairly fast 5 seconds).
  SOMETIMES, searching through this crossref were actually slower.
 
  I'd like to know how you people come up with a very good table and
  search. Eventlog is currently 200MB in size.
 
  Jaime


 -
 Before posting, please check:
http://www.mysql.com/manual.php   (the manual)
http://lists.mysql.com/   (the list archive)

 To request this thread, e-mail [EMAIL PROTECTED]
 To unsubscribe, e-mail
[EMAIL PROTECTED]
 Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php