Re: dynamically splitting a table through one full table scan
Hi Kabel,

Yes, I did, but it won't do the job for us. I didn't explain the whole use case: we are dealing with a 50-billion-row table which we want to split into 1-million-row tables, and then dynamically break each of these into smaller pieces in order to speed up n^2 near-neighbor joins. If we partitioned the 1-million-row table from the very beginning, we would end up with an unmanageable number of tables/files.

thanks,
Jacek

> Have you looked into MySQL partitioning? If you're using version 5.1, it
> might really help. Just partition the big table on chunk ID.
>
> http://dev.mysql.com/doc/refman/5.1/en/partitioning.html
>
> kabel

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: dynamically splitting a table through one full table scan
Jacek Becla wrote:
> Hi,
>
> Is there a way to dynamically split a big table into n smaller tables by
> doing a single scan of the table being split? Here are more details:
>
> * Suppose I have a million-row MyISAM table X with a relatively small
>   number of columns. It has a column "chunkId" with values between 1
>   and 100.
> * I need to split this table into 100 in-memory tables; essentially I
>   need to do:
>
>     INSERT INTO X001 SELECT * FROM X WHERE chunkId=1;
>     INSERT INTO X002 SELECT * FROM X WHERE chunkId=2;
>     INSERT INTO X003 SELECT * FROM X WHERE chunkId=3;
>
>   and so on.
>
> Based on tests, each of these individual INSERTs costs ~50% of a full
> table scan of X, even with a clustered index on chunkId. The cost is
> totally dominated by the "SELECT * FROM X WHERE chunkId=" part. Since we
> need to do such a split many times, this performance is not acceptable.
> It feels like it should be possible to do the split in a single scan.
> Any suggestions?
>
> thanks,
> Jacek

Jacek,

Have you looked into MySQL partitioning? If you're using version 5.1, it might really help. Just partition the big table on chunk ID.

http://dev.mysql.com/doc/refman/5.1/en/partitioning.html

kabel
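For reference, LIST partitioning on chunkId (available from MySQL 5.1) would look roughly like the sketch below. The column list is hypothetical (only chunkId comes from the thread), so treat this as an illustration of the syntax rather than the poster's actual schema:

```sql
-- Hypothetical sketch: partition X by chunkId so that queries restricted
-- to one chunk only touch that chunk's partition.
CREATE TABLE X (
    id      INT NOT NULL,   -- placeholder column, not from the thread
    chunkId INT NOT NULL
    -- ... remaining columns ...
)
PARTITION BY LIST (chunkId) (
    PARTITION p001 VALUES IN (1),
    PARTITION p002 VALUES IN (2),
    PARTITION p003 VALUES IN (3)
    -- ... one partition per chunkId, up to 100 ...
);
```

With this in place, a query with `WHERE chunkId=1` can be pruned to partition p001 instead of scanning the whole table.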
Re: dynamically splitting a table through one full table scan
Why not write a simple script that scans every record and inserts each one into the proper table, one at a time? In PHP, for example:

$query  = "SELECT * \n";
$query .= "FROM `X` \n";
$result = mysql_query($query);
while (($row = mysql_fetch_array($result, MYSQL_ASSOC))) {
    $Values = "";
    foreach ($row as $TmpVar => $TmpValue) {
        $$TmpVar  = $TmpValue;            // e.g. sets $chunkId for this row
        $TmpValue = mysql_real_escape_string($TmpValue);
        $Values  .= "'$TmpValue',";
    }
    $Values = substr($Values, 0, -1);     // remove the last comma
    $Table  = sprintf("X%03d", $chunkId); // X001 ... X100
    $query  = "INSERT INTO `$Table` \n";
    $query .= "VALUES ($Values) \n";
    $Iresult = mysql_query($query);
}

Knowing all of the columns would allow you to hard-code the INSERT query's VALUES clause and avoid the foreach loop, which could make it slightly faster. You would still need to use mysql_real_escape_string() on any columns that contain something other than numbers.

Jacek Becla wrote:
> Hi,
>
> Is there a way to dynamically split a big table into n smaller tables by
> doing a single scan of the table being split? Here are more details:
>
> * Suppose I have a million-row MyISAM table X with a relatively small
>   number of columns. It has a column "chunkId" with values between 1
>   and 100.
> * I need to split this table into 100 in-memory tables; essentially I
>   need to do:
>
>     INSERT INTO X001 SELECT * FROM X WHERE chunkId=1;
>     INSERT INTO X002 SELECT * FROM X WHERE chunkId=2;
>     INSERT INTO X003 SELECT * FROM X WHERE chunkId=3;
>
>   and so on.
>
> Based on tests, each of these individual INSERTs costs ~50% of a full
> table scan of X, even with a clustered index on chunkId. The cost is
> totally dominated by the "SELECT * FROM X WHERE chunkId=" part. Since we
> need to do such a split many times, this performance is not acceptable.
> It feels like it should be possible to do the split in a single scan.
> Any suggestions?
>
> thanks,
> Jacek
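The per-row INSERT approach above spends most of its time on round trips; during the single scan you can instead buffer rows per chunk and flush multi-row INSERTs. A minimal sketch of that bucketing logic in Python (table/column names taken from the thread; database calls are replaced by returned statement strings, and value quoting/escaping is omitted, purely for illustration):

```python
from collections import defaultdict

def split_single_scan(rows, batch_size=1000):
    """One pass over the rows of X: bucket each row by its chunkId and
    emit a multi-row INSERT for tables X001..X100 whenever a bucket
    reaches batch_size, then flush the remainders at the end."""
    buffers = defaultdict(list)   # chunkId -> pending rows
    statements = []               # INSERTs that would be executed

    def flush(chunk_id):
        # NOTE: real code must quote/escape values; skipped in this sketch.
        values = ", ".join(
            "(" + ", ".join(str(v) for v in row) + ")"
            for row in buffers[chunk_id]
        )
        statements.append("INSERT INTO X%03d VALUES %s" % (chunk_id, values))
        buffers[chunk_id].clear()

    for row in rows:              # the single full scan
        chunk_id = row[0]         # assume chunkId is the first column
        buffers[chunk_id].append(row)
        if len(buffers[chunk_id]) >= batch_size:
            flush(chunk_id)

    for chunk_id, pending in buffers.items():  # flush partial batches
        if pending:
            flush(chunk_id)
    return statements
```

This reads X exactly once and issues roughly one INSERT per batch_size rows per chunk, instead of one SELECT per chunk or one INSERT per row.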
dynamically splitting a table through one full table scan
Hi,

Is there a way to dynamically split a big table into n smaller tables by doing a single scan of the table being split? Here are more details:

* Suppose I have a million-row MyISAM table X with a relatively small
  number of columns. It has a column "chunkId" with values between 1 and 100.
* I need to split this table into 100 in-memory tables; essentially I need
  to do:

    INSERT INTO X001 SELECT * FROM X WHERE chunkId=1;
    INSERT INTO X002 SELECT * FROM X WHERE chunkId=2;
    INSERT INTO X003 SELECT * FROM X WHERE chunkId=3;

  and so on.

Based on tests, each of these individual INSERTs costs ~50% of a full table scan of X, even with a clustered index on chunkId. The cost is totally dominated by the "SELECT * FROM X WHERE chunkId=" part. Since we need to do such a split many times, this performance is not acceptable. It feels like it should be possible to do the split in a single scan. Any suggestions?

thanks,
Jacek