Re: RE: [others] Re: Views with functions and performance issues
The hit with a join on indexed columns is negligible. Relational databases live for joins - they eat them for breakfast! Seriously, as long as it's indexed in both tables, it'll be super-speedy. Dan On 9/22/06, Christopher Brooks <[EMAIL PROTECTED]> wrote: Hi, thanks for the comments, > If I'm understanding right - the view contains an additional > column that is an MD5 hash of some or all of the data in the > base table, right? Close. It's got all of the data in the base table except for the colum that's being hashed - we show the hashed version of that instead. Like a password table, where the base table has the plain text passwords and usernames, and the view only shows the hashed passwords and usernames (though it's not for passwords). > set up second table, with two columns. One is id number to > correspond to id number in original table, second is to hold > Md5 hash. Createn index on the MD5 column for sure, possibly > a two column index. I'm not sure whether it would be better > to create it as (id, md5_col) or (md5_col, id); I'd test it > each way I guess. > > Set up insert/update/delete triggers on the first table to > add/update/delete records in the second table with ID and the > MD5 hash. > > Then re-create your view to show you all the columns from the > first table plus the MD5 column from the second table, > joining on the id column. This sounds good, and this way I don't have to change the first table much at all (as it's a replicated table and my luck with replication only lately seems to have gotten good). What kind of performance hit will I be taking because of the join in the view (e.g. every query to the view is going to have to have to do the join, yes?). Or is this neglegable as long as the hash and pointer of the second table are both indexed... Chris -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: [others] Re: Views with functions and performance issues
Hi, thanks for the comments, > If I'm understanding right - the view contains an additional > column that is an MD5 hash of some or all of the data in the > base table, right? Close. It's got all of the data in the base table except for the colum that's being hashed - we show the hashed version of that instead. Like a password table, where the base table has the plain text passwords and usernames, and the view only shows the hashed passwords and usernames (though it's not for passwords). > set up second table, with two columns. One is id number to > correspond to id number in original table, second is to hold > Md5 hash. Createn index on the MD5 column for sure, possibly > a two column index. I'm not sure whether it would be better > to create it as (id, md5_col) or (md5_col, id); I'd test it > each way I guess. > > Set up insert/update/delete triggers on the first table to > add/update/delete records in the second table with ID and the > MD5 hash. > > Then re-create your view to show you all the columns from the > first table plus the MD5 column from the second table, > joining on the id column. This sounds good, and this way I don't have to change the first table much at all (as it's a replicated table and my luck with replication only lately seems to have gotten good). What kind of performance hit will I be taking because of the join in the view (e.g. every query to the view is going to have to have to do the join, yes?). Or is this neglegable as long as the hash and pointer of the second table are both indexed... Chris -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Views with functions and performance issues
If I'm understanding right - the view contains an additional column that is an MD5 hash of some or all of the data in the base table, right? Yes, I would expect that to be very very slow. When selecting, your database engine has tro calculate 700K MD5 hashes. Slow. When selecting a subset it has to do that plus what amounts to a table scan of those 700K MD5 hashes. Very slow. Wonder if you could do this: set up second table, with two columns. One is id number to correspond to id number in original table, second is to hold Md5 hash. Createn index on the MD5 column for sure, possibly a two column index. I'm not sure whether it would be better to create it as (id, md5_col) or (md5_col, id); I'd test it each way I guess. Set up insert/update/delete triggers on the first table to add/update/delete records in the second table with ID and the MD5 hash. Then re-create your view to show you all the columns from the first table plus the MD5 column from the second table, joining on the id column. I think this should end up being pretty fast since the hashes will only be calculated when the data changes, not all at once for every select, and an index will speed things up tremendously when selecting a subset. HTH, Dan On 9/22/06, Christopher Brooks <[EMAIL PROTECTED]> wrote: Hi, I've got a view of a base table that is 100% identical to that base table except for one column, which is a projection of the base table after its MD5 hashed. The table is largish (~700,000 rows) and is growing quickly. Queries on the base table are nice and fast, but on the hashed view are painfully slow - in particular when restricting the queries on the view based on the hashed colum. I assume this is because there is no index for the hashed colum. Is there a good way I can deal with this, without changing the base table? TIA, running MySQL 5 on windows, can change to a later build of MySQL (as long as I can replicate from 5) if there is some funky stuff in there that will do what I need... Chris -- Christopher A. Brooks Research Officer, ARIES Group University of Saskatchewan Email: [EMAIL PROTECTED] Mail: Christopher Brooks Department of Computer Science University of Saskatchewan 57 Campus Drive Saskatoon, Saskatchewan, S7N 5A9 Canada -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Views with functions and performance issues
Hi, I've got a view of a base table that is 100% identical to that base table except for one column, which is a projection of the base table after its MD5 hashed. The table is largish (~700,000 rows) and is growing quickly. Queries on the base table are nice and fast, but on the hashed view are painfully slow - in particular when restricting the queries on the view based on the hashed colum. I assume this is because there is no index for the hashed colum. Is there a good way I can deal with this, without changing the base table? TIA, running MySQL 5 on windows, can change to a later build of MySQL (as long as I can replicate from 5) if there is some funky stuff in there that will do what I need... Chris -- Christopher A. Brooks Research Officer, ARIES Group University of Saskatchewan Email: [EMAIL PROTECTED] Mail: Christopher Brooks Department of Computer Science University of Saskatchewan 57 Campus Drive Saskatoon, Saskatchewan, S7N 5A9 Canada -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]