SQL question.... Trying to improve upon my PHP solution.

2004-11-29 Thread Mike Zornek
I have a table of members, about 13,000 rows.

Each night I need to shuffle the table. I have a small int column called
random_position. Currently I am creating a position list (based on the count
of the members), shuffle it, then while iterating through the members
assigning them a a position.

$time_start = microtime_float();

$member = new DataObjects_Member();
$number_of_rows = $member-find();

$positions = array();
for ($i = 1; $i = $number_of_rows; $i++) {
$positions[] = $i;
}

shuffle($positions);

while ($member-fetch()) {

// choose a position from the bottom
$position = array_pop($positions);

// set this member with that position
$member-setRandom_position($position);
$member-update();
}

Of course by doing it this way I'm running 13,000 SELECT calls and 13,000
UPDATE calls which can be processor intensive. As of now I'm timing it at
225 seconds but this machine is faster than server.

Is there a simpler / faster SQL query I could use?

Thanks!

~ Mike
-
Mike Zornek
Web Designer, Media Developer, Programmer and Geek
Personal site: http://MikeZornek.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: SQL question.... Trying to improve upon my PHP solution.

2004-11-29 Thread mos
At 11:08 AM 11/29/2004, you wrote:
I have a table of members, about 13,000 rows.
Each night I need to shuffle the table. I have a small int column called
random_position. Currently I am creating a position list (based on the count
of the members), shuffle it, then while iterating through the members
assigning them a a position.
$time_start = microtime_float();
$member = new DataObjects_Member();
$number_of_rows = $member-find();
$positions = array();
for ($i = 1; $i = $number_of_rows; $i++) {
$positions[] = $i;
}
shuffle($positions);
while ($member-fetch()) {
// choose a position from the bottom
$position = array_pop($positions);
// set this member with that position
$member-setRandom_position($position);
$member-update();
}
Of course by doing it this way I'm running 13,000 SELECT calls and 13,000
UPDATE calls which can be processor intensive. As of now I'm timing it at
225 seconds but this machine is faster than server.
Is there a simpler / faster SQL query I could use?
Thanks!
~ Mike
-
Mike Zornek
Web Designer, Media Developer, Programmer and Geek
Personal site: http://MikeZornek.com

Mike,
Your solution is way too complicated (it makes my head hurt).g 
Try this:

set @n=0;
update tmp set rnd = @n := @n + 1 order by RAND()
Mike
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: SQL question.... Trying to improve upon my PHP solution.

2004-11-29 Thread Mike Zornek
On 11/29/04 12:27 PM, mos [EMAIL PROTECTED] wrote:

 Mike,
Your solution is way too complicated (it makes my head hurt).g
 Try this:
 
 set @n=0;
 update tmp set rnd = @n := @n + 1 order by RAND()
 
 Mike

I'll give this a shot. Follow-up question:

I've had a lot of trouble with RAND() on my MySQL 3 box not being all that
random. This time I'm using 4.0.17-log. Is it going to be any better?

~ Mike
-
Mike Zornek
Web Designer, Media Developer, Programmer and Geek
Personal site: http://MikeZornek.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: SQL question.... Trying to improve upon my PHP solution.

2004-11-29 Thread Rhino

- Original Message - 
From: Mike Zornek [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Monday, November 29, 2004 12:08 PM
Subject: SQL question Trying to improve upon my PHP solution.


 I have a table of members, about 13,000 rows.

 Each night I need to shuffle the table. I have a small int column called
 random_position. Currently I am creating a position list (based on the
count
 of the members), shuffle it, then while iterating through the members
 assigning them a a position.

 $time_start = microtime_float();

 $member = new DataObjects_Member();
 $number_of_rows = $member-find();

 $positions = array();
 for ($i = 1; $i = $number_of_rows; $i++) {
 $positions[] = $i;
 }

 shuffle($positions);

 while ($member-fetch()) {

 // choose a position from the bottom
 $position = array_pop($positions);

 // set this member with that position
 $member-setRandom_position($position);
 $member-update();
 }

 Of course by doing it this way I'm running 13,000 SELECT calls and 13,000
 UPDATE calls which can be processor intensive. As of now I'm timing it at
 225 seconds but this machine is faster than server.

 Is there a simpler / faster SQL query I could use?

I don't think this is a question about SQL at all; I think you already know
how to write the SQL to select, insert, update or delete rows.

I think that what you really want to know is if there is a more efficient
way to shuffle your rows and that is more a question of choosing a good
programming algorithm and plugging in the SQL you already know.

I have to admit I'm curious about why you need to do this. In many years of
database work, I've never seen the need to shuffle a table before. By the
way, could you clarify what you mean by shuffle? Do you mean that you want
to completely randomize every row and put it in some new, randomly-chosen
slot within the table? Or are you only moving the bottom row to the top
and pushing each of the others down one slot?

I suspect that you might be able to avoid this shuffle entirely and simply
choose rows at random based on their primary key without inserting,
updating, or deleting anything. You probably only have to show the rows in a
random sequence without actually physically moving them within the table. In
that case, simply get a list of the primary key values for every row of the
table and assign them to an array; then use a random number generator to
select keys from the array at random. Of course, this still gives you the
likelihood of choosing some records twice or multiple times and some not at
all; you'd need to make the algorithm recognize when it has grabbed a given
key already so that it ignores it if it is chosen a second time.

If PHP has a data structure analagous to Java's vector, it would be a lot
better choice: you simply put each primary key in its own slot of the
vector, select each key at random via the random number generator, and after
you've selected the row that corresponds to that primary key DELETE THE KEY
FROM THE VECTOR. That ensures that no row is selected twice and every row
gets selected at some point.

Rhino


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: SQL question.... Trying to improve upon my PHP solution.

2004-11-29 Thread Mike Zornek
On 11/29/04 1:26 PM, Rhino [EMAIL PROTECTED] wrote:

 I don't think this is a question about SQL at all; I think you already know
 how to write the SQL to select, insert, update or delete rows.
 
 I think that what you really want to know is if there is a more efficient
 way to shuffle your rows and that is more a question of choosing a good
 programming algorithm and plugging in the SQL you already know.
 
 I have to admit I'm curious about why you need to do this. In many years of
 database work, I've never seen the need to shuffle a table before. By the
 way, could you clarify what you mean by shuffle? Do you mean that you want
 to completely randomize every row and put it in some new, randomly-chosen
 slot within the table? Or are you only moving the bottom row to the top
 and pushing each of the others down one slot?

The basic problem is this. We have a search engine that lets photo buyers
search our db for photographers. Rather than sort on name the results have
always by default been randomized to help level the playing field. And yes
this works but it causing frustration to users as when you bookmark the
results, or refresh it for various reasons the order is always new.

Our idea is to nightly order the members at random and when showing results.
I call this shuffling only to make the metaphor. I really mean setting
randomized values in a extra column which will be sorted when the user asks
for random results.

Thanks for you help.

~ Mike
-
Mike Zornek
Web Designer, Media Developer, Programmer and Geek
Personal site: http://MikeZornek.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: SQL question.... Trying to improve upon my PHP solution.

2004-11-29 Thread mos
At 11:53 AM 11/29/2004, you wrote:
On 11/29/04 12:27 PM, mos [EMAIL PROTECTED] wrote:
 Mike,
Your solution is way too complicated (it makes my head hurt).g
 Try this:

 set @n=0;
 update tmp set rnd = @n := @n + 1 order by RAND()

 Mike
I'll give this a shot. Follow-up question:
I've had a lot of trouble with RAND() on my MySQL 3 box not being all that
random. This time I'm using 4.0.17-log. Is it going to be any better?
~ Mike
Mike,
According to http://dev.mysql.com/doc/mysql/en/News-4.0.1.html they've 
improved Rand in version 4.0.1 Changed RAND() initialization so that 
RAND(N) and RAND(N+1) are more distinct.  and in 4.0.10 they Fixed 
initialization of the random seed for newly created threads to give a 
better rand() distribution from the first call. 

Mike  

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]