SQL question.... Trying to improve upon my PHP solution.
I have a table of members, about 13,000 rows. Each night I need to shuffle the table. I have a small int column called random_position. Currently I am creating a position list (based on the count of the members), shuffle it, then while iterating through the members assigning them a a position. $time_start = microtime_float(); $member = new DataObjects_Member(); $number_of_rows = $member-find(); $positions = array(); for ($i = 1; $i = $number_of_rows; $i++) { $positions[] = $i; } shuffle($positions); while ($member-fetch()) { // choose a position from the bottom $position = array_pop($positions); // set this member with that position $member-setRandom_position($position); $member-update(); } Of course by doing it this way I'm running 13,000 SELECT calls and 13,000 UPDATE calls which can be processor intensive. As of now I'm timing it at 225 seconds but this machine is faster than server. Is there a simpler / faster SQL query I could use? Thanks! ~ Mike - Mike Zornek Web Designer, Media Developer, Programmer and Geek Personal site: http://MikeZornek.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: SQL question.... Trying to improve upon my PHP solution.
At 11:08 AM 11/29/2004, you wrote: I have a table of members, about 13,000 rows. Each night I need to shuffle the table. I have a small int column called random_position. Currently I am creating a position list (based on the count of the members), shuffle it, then while iterating through the members assigning them a a position. $time_start = microtime_float(); $member = new DataObjects_Member(); $number_of_rows = $member-find(); $positions = array(); for ($i = 1; $i = $number_of_rows; $i++) { $positions[] = $i; } shuffle($positions); while ($member-fetch()) { // choose a position from the bottom $position = array_pop($positions); // set this member with that position $member-setRandom_position($position); $member-update(); } Of course by doing it this way I'm running 13,000 SELECT calls and 13,000 UPDATE calls which can be processor intensive. As of now I'm timing it at 225 seconds but this machine is faster than server. Is there a simpler / faster SQL query I could use? Thanks! ~ Mike - Mike Zornek Web Designer, Media Developer, Programmer and Geek Personal site: http://MikeZornek.com Mike, Your solution is way too complicated (it makes my head hurt).g Try this: set @n=0; update tmp set rnd = @n := @n + 1 order by RAND() Mike -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: SQL question.... Trying to improve upon my PHP solution.
On 11/29/04 12:27 PM, mos [EMAIL PROTECTED] wrote: Mike, Your solution is way too complicated (it makes my head hurt).g Try this: set @n=0; update tmp set rnd = @n := @n + 1 order by RAND() Mike I'll give this a shot. Follow-up question: I've had a lot of trouble with RAND() on my MySQL 3 box not being all that random. This time I'm using 4.0.17-log. Is it going to be any better? ~ Mike - Mike Zornek Web Designer, Media Developer, Programmer and Geek Personal site: http://MikeZornek.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: SQL question.... Trying to improve upon my PHP solution.
- Original Message - From: Mike Zornek [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Monday, November 29, 2004 12:08 PM Subject: SQL question Trying to improve upon my PHP solution. I have a table of members, about 13,000 rows. Each night I need to shuffle the table. I have a small int column called random_position. Currently I am creating a position list (based on the count of the members), shuffle it, then while iterating through the members assigning them a a position. $time_start = microtime_float(); $member = new DataObjects_Member(); $number_of_rows = $member-find(); $positions = array(); for ($i = 1; $i = $number_of_rows; $i++) { $positions[] = $i; } shuffle($positions); while ($member-fetch()) { // choose a position from the bottom $position = array_pop($positions); // set this member with that position $member-setRandom_position($position); $member-update(); } Of course by doing it this way I'm running 13,000 SELECT calls and 13,000 UPDATE calls which can be processor intensive. As of now I'm timing it at 225 seconds but this machine is faster than server. Is there a simpler / faster SQL query I could use? I don't think this is a question about SQL at all; I think you already know how to write the SQL to select, insert, update or delete rows. I think that what you really want to know is if there is a more efficient way to shuffle your rows and that is more a question of choosing a good programming algorithm and plugging in the SQL you already know. I have to admit I'm curious about why you need to do this. In many years of database work, I've never seen the need to shuffle a table before. By the way, could you clarify what you mean by shuffle? Do you mean that you want to completely randomize every row and put it in some new, randomly-chosen slot within the table? Or are you only moving the bottom row to the top and pushing each of the others down one slot? I suspect that you might be able to avoid this shuffle entirely and simply choose rows at random based on their primary key without inserting, updating, or deleting anything. You probably only have to show the rows in a random sequence without actually physically moving them within the table. In that case, simply get a list of the primary key values for every row of the table and assign them to an array; then use a random number generator to select keys from the array at random. Of course, this still gives you the likelihood of choosing some records twice or multiple times and some not at all; you'd need to make the algorithm recognize when it has grabbed a given key already so that it ignores it if it is chosen a second time. If PHP has a data structure analagous to Java's vector, it would be a lot better choice: you simply put each primary key in its own slot of the vector, select each key at random via the random number generator, and after you've selected the row that corresponds to that primary key DELETE THE KEY FROM THE VECTOR. That ensures that no row is selected twice and every row gets selected at some point. Rhino -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: SQL question.... Trying to improve upon my PHP solution.
On 11/29/04 1:26 PM, Rhino [EMAIL PROTECTED] wrote: I don't think this is a question about SQL at all; I think you already know how to write the SQL to select, insert, update or delete rows. I think that what you really want to know is if there is a more efficient way to shuffle your rows and that is more a question of choosing a good programming algorithm and plugging in the SQL you already know. I have to admit I'm curious about why you need to do this. In many years of database work, I've never seen the need to shuffle a table before. By the way, could you clarify what you mean by shuffle? Do you mean that you want to completely randomize every row and put it in some new, randomly-chosen slot within the table? Or are you only moving the bottom row to the top and pushing each of the others down one slot? The basic problem is this. We have a search engine that lets photo buyers search our db for photographers. Rather than sort on name the results have always by default been randomized to help level the playing field. And yes this works but it causing frustration to users as when you bookmark the results, or refresh it for various reasons the order is always new. Our idea is to nightly order the members at random and when showing results. I call this shuffling only to make the metaphor. I really mean setting randomized values in a extra column which will be sorted when the user asks for random results. Thanks for you help. ~ Mike - Mike Zornek Web Designer, Media Developer, Programmer and Geek Personal site: http://MikeZornek.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: SQL question.... Trying to improve upon my PHP solution.
At 11:53 AM 11/29/2004, you wrote: On 11/29/04 12:27 PM, mos [EMAIL PROTECTED] wrote: Mike, Your solution is way too complicated (it makes my head hurt).g Try this: set @n=0; update tmp set rnd = @n := @n + 1 order by RAND() Mike I'll give this a shot. Follow-up question: I've had a lot of trouble with RAND() on my MySQL 3 box not being all that random. This time I'm using 4.0.17-log. Is it going to be any better? ~ Mike Mike, According to http://dev.mysql.com/doc/mysql/en/News-4.0.1.html they've improved Rand in version 4.0.1 Changed RAND() initialization so that RAND(N) and RAND(N+1) are more distinct. and in 4.0.10 they Fixed initialization of the random seed for newly created threads to give a better rand() distribution from the first call. Mike -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]