Re: Probability Selects

2010-01-11 Thread Baron Schwartz
Matt,

On Thu, Jan 7, 2010 at 3:10 PM, Matt Neimeyer m...@neimeyer.org wrote:
 What's the best way to select names at random from this but still take
 into account frequency of use?

Here's the link I usually send clients:
http://jan.kneschke.de/projects/mysql/order-by-rand/

-- 
Baron Schwartz
Percona Inc: Services and Support for MySQL
http://www.percona.com/

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Probability Selects

2010-01-11 Thread Johan De Meersman
I think what he's trying to accomplish is not truly random, but to use the
probability that's indicated in the second field of the table:

1, Aaron, 0.240
 3, Abe, 0.006
 13, Adrian, 0.069


So there would be a probability of 0.240 that the call returns Aaron, a
probability of 0.006 that it returns Abe, and so on.

I've no clue how to do this in SQL, though, save for the utter stupidity of
creating a table that repeats each name (or UID, saves memory/disk)
1000*probability times, and do a random select on that (or 100* or 10*
depending on how precise you need it). I'd not recommend it, though - it's
gonna be a mess and a huge performance drain.

I suspect this would be better done in code, but I've been out of coding (or
statistics) for too long to give pointers there.


On Mon, Jan 11, 2010 at 3:31 PM, Baron Schwartz ba...@xaprb.com wrote:

 Matt,

 On Thu, Jan 7, 2010 at 3:10 PM, Matt Neimeyer m...@neimeyer.org wrote:
  What's the best way to select names at random from this but still take
  into account frequency of use?

 Here's the link I usually send clients:
 http://jan.kneschke.de/projects/mysql/order-by-rand/

 --
 Baron Schwartz
 Percona Inc: Services and Support for MySQL
 http://www.percona.com/

 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/mysql?unsub=vegiv...@tuxera.be




-- 
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel


Probability Selects

2010-01-07 Thread Matt Neimeyer
I've tried Googling till my brain is fried and I'm obviously missing
something because I'm not finding anything useful.

I'm trying to select names at random from a table that contains the
name and the frequency at which it is actually used in society. The
table is defined as follows:

CREATE TABLE `MaleNames` (
 `Name_ID` int(11) NOT NULL auto_increment,
 `Name` char(50) default NULL,
 `Frequency` decimal(5,3) default NULL,
 PRIMARY KEY  (`Name_ID`)
)

Some examples:

1, Aaron, 0.240
3, Abe, 0.006
13, Adrian, 0.069

What's the best way to select names at random from this but still take
into account frequency of use?

Thanks in advance!

Matt

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



RE: Probability Selects

2010-01-07 Thread Daevid Vincent
http://www.greggdev.com/web/articles.php?id=6 

 -Original Message-
 From: Matt Neimeyer [mailto:m...@neimeyer.org] 
 Sent: Thursday, January 07, 2010 12:10 PM
 To: mysql@lists.mysql.com
 Subject: Probability Selects
 
 I've tried Googling till my brain is fried and I'm obviously missing
 something because I'm not finding anything useful.
 
 I'm trying to select names at random from a table that contains the
 name and the frequency at which it is actually used in society. The
 table is defined as follows:
 
 CREATE TABLE `MaleNames` (
  `Name_ID` int(11) NOT NULL auto_increment,
  `Name` char(50) default NULL,
  `Frequency` decimal(5,3) default NULL,
  PRIMARY KEY  (`Name_ID`)
 )
 
 Some examples:
 
 1, Aaron, 0.240
 3, Abe, 0.006
 13, Adrian, 0.069
 
 What's the best way to select names at random from this but still take
 into account frequency of use?
 
 Thanks in advance!
 
 Matt
 
 -- 
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
 http://lists.mysql.com/mysql?unsub=dae...@daevid.com
 


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Probability Selects

2010-01-07 Thread Tom Worster
On 1/7/10 3:10 PM, Matt Neimeyer m...@neimeyer.org wrote:

 I'm trying to select names at random from a table that contains the
 name and the frequency at which it is actually used in society. The
 table is defined as follows:
 
 CREATE TABLE `MaleNames` (
  `Name_ID` int(11) NOT NULL auto_increment,
  `Name` char(50) default NULL,
  `Frequency` decimal(5,3) default NULL,
  PRIMARY KEY  (`Name_ID`)
 )
 
 Some examples:
 
 1, Aaron, 0.240
 3, Abe, 0.006
 13, Adrian, 0.069
 
 What's the best way to select names at random from this but still take
 into account frequency of use?

after reading the source, i'd avoid using MySQL's RAND(). you can probably
easily get better quality pseudorandom number in your app's environment.

to get a random row, LIMIT 1, r, where offset r is a random number between
0 and (tablelength - 1), should work.

if you must use RAND(), FLOOR(tablelength*RAND()) will work for r.

if you want a random row from a constrained subset of rows based on
frequency, e.g. WHERE Frequency  0.001, you can include that condition in
the query and repeat it until you get a non-empty response.

but if the constrained subset is only a small fraction of the table's rows
then it might be more efficient to compute its size (with COUNT() in another
query or a subquery) and use that in generating the offset r.



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org