Union Performance Question

2005-05-11 Thread Dan Salzer
Hi everyone,

I have a question regarding the performance of UNION queries:

I need to do a full-text search against a large number of rows. Is it
faster to have one table with 10,000,000 text rows and perform one
full-text search. Or, am I better off having 10 smaller more managable
tables and performing UNION'ed full-text searches against them.

A merge table seems like it would be the ideal solution, but the
limitation on full-text indexing prohibits it.

Thanks!

-Dan

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: Union Performance Question

2005-05-11 Thread Dathan Pattishall
Use 10 smaller tables and perform a union. It's faster to look up in
smaller tables then larger ones-generally. Additionally more of the
key_buffer can be used for tables with the most hits over the tables
with less hits, making the lookup sustain speed over time.

 

 -Original Message-
 From: Dan Salzer [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, May 11, 2005 11:46 AM
 To: mysql@lists.mysql.com
 Subject: Union Performance Question
 
 Hi everyone,
 
 I have a question regarding the performance of UNION queries:
 
 I need to do a full-text search against a large number of 
 rows. Is it faster to have one table with 10,000,000 text 
 rows and perform one full-text search. Or, am I better off 
 having 10 smaller more managable tables and performing 
 UNION'ed full-text searches against them.
 
 A merge table seems like it would be the ideal solution, but 
 the limitation on full-text indexing prohibits it.
 
 Thanks!
 
 -Dan
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
 http://lists.mysql.com/[EMAIL PROTECTED]
 
 

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Union Performance Question

2005-05-11 Thread Eric Bergen
Use one big table. A merge table will run the same query over all 10 
tables. The key buffer is filled from the top down so if you have a key 
buffer that looks like this:
   a
  /  \
 /\
   b  c
  /  \/  \
de fg

Almost all queries for that index will be able to use the buffer for 'a' 
and 'b'. It's not until they get things very narrowed down that you have 
to actually hit the index files for specific leaves. Creating ten small 
tables creates ten duplicate 'a' and 'b' sections which polutes the key 
buffer.

-Eric
Dathan Pattishall wrote:
Use 10 smaller tables and perform a union. It's faster to look up in
smaller tables then larger ones-generally. Additionally more of the
key_buffer can be used for tables with the most hits over the tables
with less hits, making the lookup sustain speed over time.

 

-Original Message-
From: Dan Salzer [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, May 11, 2005 11:46 AM
To: mysql@lists.mysql.com
Subject: Union Performance Question

Hi everyone,
I have a question regarding the performance of UNION queries:
I need to do a full-text search against a large number of 
rows. Is it faster to have one table with 10,000,000 text 
rows and perform one full-text search. Or, am I better off 
having 10 smaller more managable tables and performing 
UNION'ed full-text searches against them.

A merge table seems like it would be the ideal solution, but 
the limitation on full-text indexing prohibits it.

Thanks!
-Dan
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]

   

 


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


RE: Union Performance Question

2005-05-11 Thread Dathan Pattishall

 
 Use one big table. A merge table will run the same query over 
 all 10 tables. The key buffer is filled from the top down so 

He is using a fulltext index he can't use merge tables.

If he where to UNION across the tables being used assuming he uses the
tables that only have the data he would get a good boost in performance.
The performance comes from a smaller in time lock on the table and only
data that is being accessed the most would stay in the buffer.

He has to pick a good hash such as date or country code for splitting
tables on. 

To explain why let's look at the key buffer structure.

Having multiple tables do not pollute the buffer this is why:

mySQL pulls OS MYI file blocks then puts the frequently used blocks in
the buffer. The index file itself stores the data in a B-Tree+ (or
R-Tree) so a smaller index file means more of it can exist in the
keybuffer. Since an index file is associated with a table the most
frequently used tables' indexes will have more of the key buffer-that
assumed: if the application hashes the table correctly then a boost in
performance can be gained by using more memory for index blocks that
have more pertinent data then not.


If your using INNODB (no FULLTEXT INDEX option) use 1 big table.








  


 if you have a key buffer that looks like this:
 a
/  \
   /\
 b  c
/  \/  \
  de fg
 



 
 Almost all queries for that index will be able to use the 
 buffer for 'a' 
 and 'b'. It's not until they get things very narrowed down 
 that you have to actually hit the index files for specific 
 leaves. Creating ten small tables creates ten duplicate 'a' 
 and 'b' sections which polutes the key buffer.
 
 -Eric
 
 Dathan Pattishall wrote:
 
 Use 10 smaller tables and perform a union. It's faster to look up in 
 smaller tables then larger ones-generally. Additionally more of the 
 key_buffer can be used for tables with the most hits over the tables 
 with less hits, making the lookup sustain speed over time.
 
  
 
   
 
 -Original Message-
 From: Dan Salzer [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, May 11, 2005 11:46 AM
 To: mysql@lists.mysql.com
 Subject: Union Performance Question
 
 Hi everyone,
 
 I have a question regarding the performance of UNION queries:
 
 I need to do a full-text search against a large number of 
 rows. Is it 
 faster to have one table with 10,000,000 text rows and perform one 
 full-text search. Or, am I better off having 10 smaller 
 more managable 
 tables and performing UNION'ed full-text searches against them.
 
 A merge table seems like it would be the ideal solution, but the 
 limitation on full-text indexing prohibits it.
 
 Thanks!
 
 -Dan
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
 http://lists.mysql.com/[EMAIL PROTECTED]
 
 
 
 
 
   
 
 
 

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Union Performance Question

2005-05-11 Thread Dan Salzer
Most helpfull! Thanks!

This brings me to my next question I have a table that looks like this:

CREATE TABLE `Article_Search` (
 `ArticleID` int(11) NOT NULL default '0',
 `Content` text NOT NULL,
 PRIMARY KEY (`ArticleID`),
 FULLTEXT KEY `Content` (`Content`)
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1

 This table has several million rows, but I only want to search a subset of 
the table. IE:
 SELECT * FROM Article_Search WHERE MATCH(Content) AGAINST('rubber 
duckies' IN BOOLEAN MODE) AND ArticleID IN (100, 23, 223, 98, 4018, 1452, 
91)
 The reason I'm specifying a set of ArticleIDs is that I know any hits are 
going to be within those articles. So the presence of the IN() clause is 
purely there for performance. However, an explain on this Statement shows 
that it is using the Full-Text index. Is mysql text-searching the entire 
table under the hood, or does it use the PK to reduce the dataset before the 
text-search.
 Thanks again!
 -Dan

On 5/11/05, Dathan Pattishall [EMAIL PROTECTED] wrote:
 
 
  Use one big table. A merge table will run the same query over
  all 10 tables. The key buffer is filled from the top down so
 
 He is using a fulltext index he can't use merge tables.
 
 If he where to UNION across the tables being used assuming he uses the
 tables that only have the data he would get a good boost in performance.
 The performance comes from a smaller in time lock on the table and only
 data that is being accessed the most would stay in the buffer.
 
 He has to pick a good hash such as date or country code for splitting
 tables on.
 
 To explain why let's look at the key buffer structure.
 
 Having multiple tables do not pollute the buffer this is why:
 
 mySQL pulls OS MYI file blocks then puts the frequently used blocks in
 the buffer. The index file itself stores the data in a B-Tree+ (or
 R-Tree) so a smaller index file means more of it can exist in the
 keybuffer. Since an index file is associated with a table the most
 frequently used tables' indexes will have more of the key buffer-that
 assumed: if the application hashes the table correctly then a boost in
 performance can be gained by using more memory for index blocks that
 have more pertinent data then not.
 
 If your using INNODB (no FULLTEXT INDEX option) use 1 big table.
 
 
  if you have a key buffer that looks like this:
  a
  / \
  / \
  b c
  / \ / \
  d e f g
 
 
 
  Almost all queries for that index will be able to use the
  buffer for 'a'
  and 'b'. It's not until they get things very narrowed down
  that you have to actually hit the index files for specific
  leaves. Creating ten small tables creates ten duplicate 'a'
  and 'b' sections which polutes the key buffer.
 
  -Eric
 
  Dathan Pattishall wrote:
 
  Use 10 smaller tables and perform a union. It's faster to look up in
  smaller tables then larger ones-generally. Additionally more of the
  key_buffer can be used for tables with the most hits over the tables
  with less hits, making the lookup sustain speed over time.
  
  
  
  
  
  -Original Message-
  From: Dan Salzer [mailto:[EMAIL PROTECTED]
  Sent: Wednesday, May 11, 2005 11:46 AM
  To: mysql@lists.mysql.com
  Subject: Union Performance Question
  
  Hi everyone,
  
  I have a question regarding the performance of UNION queries:
  
  I need to do a full-text search against a large number of
  rows. Is it
  faster to have one table with 10,000,000 text rows and perform one
  full-text search. Or, am I better off having 10 smaller
  more managable
  tables and performing UNION'ed full-text searches against them.
  
  A merge table seems like it would be the ideal solution, but the
  limitation on full-text indexing prohibits it.
  
  Thanks!
  
  -Dan
  
  --
  MySQL General Mailing List
  For list archives: http://lists.mysql.com/mysql
  To unsubscribe:
  http://lists.mysql.com/[EMAIL PROTECTED]