Re: Probability Selects
Matt, On Thu, Jan 7, 2010 at 3:10 PM, Matt Neimeyer m...@neimeyer.org wrote: What's the best way to select names at random from this but still take into account frequency of use? Here's the link I usually send clients: http://jan.kneschke.de/projects/mysql/order-by-rand/ -- Baron Schwartz Percona Inc: Services and Support for MySQL http://www.percona.com/ -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Probability Selects
I think what he's trying to accomplish is not truly random, but to use the probability that's indicated in the second field of the table: 1, Aaron, 0.240 3, Abe, 0.006 13, Adrian, 0.069 So there would be a probability of 0.240 that the call returns Aaron, a probability of 0.006 that it returns Abe, and so on. I've no clue how to do this in SQL, though, save for the utter stupidity of creating a table that repeats each name (or UID, saves memory/disk) 1000*probability times, and do a random select on that (or 100* or 10* depending on how precise you need it). I'd not recommend it, though - it's gonna be a mess and a huge performance drain. I suspect this would be better done in code, but I've been out of coding (or statistics) for too long to give pointers there. On Mon, Jan 11, 2010 at 3:31 PM, Baron Schwartz ba...@xaprb.com wrote: Matt, On Thu, Jan 7, 2010 at 3:10 PM, Matt Neimeyer m...@neimeyer.org wrote: What's the best way to select names at random from this but still take into account frequency of use? Here's the link I usually send clients: http://jan.kneschke.de/projects/mysql/order-by-rand/ -- Baron Schwartz Percona Inc: Services and Support for MySQL http://www.percona.com/ -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=vegiv...@tuxera.be -- Bier met grenadyn Is als mosterd by den wyn Sy die't drinkt, is eene kwezel Hy die't drinkt, is ras een ezel
Probability Selects
I've tried Googling till my brain is fried and I'm obviously missing something because I'm not finding anything useful. I'm trying to select names at random from a table that contains the name and the frequency at which it is actually used in society. The table is defined as follows: CREATE TABLE `MaleNames` ( `Name_ID` int(11) NOT NULL auto_increment, `Name` char(50) default NULL, `Frequency` decimal(5,3) default NULL, PRIMARY KEY (`Name_ID`) ) Some examples: 1, Aaron, 0.240 3, Abe, 0.006 13, Adrian, 0.069 What's the best way to select names at random from this but still take into account frequency of use? Thanks in advance! Matt -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
RE: Probability Selects
http://www.greggdev.com/web/articles.php?id=6 -Original Message- From: Matt Neimeyer [mailto:m...@neimeyer.org] Sent: Thursday, January 07, 2010 12:10 PM To: mysql@lists.mysql.com Subject: Probability Selects I've tried Googling till my brain is fried and I'm obviously missing something because I'm not finding anything useful. I'm trying to select names at random from a table that contains the name and the frequency at which it is actually used in society. The table is defined as follows: CREATE TABLE `MaleNames` ( `Name_ID` int(11) NOT NULL auto_increment, `Name` char(50) default NULL, `Frequency` decimal(5,3) default NULL, PRIMARY KEY (`Name_ID`) ) Some examples: 1, Aaron, 0.240 3, Abe, 0.006 13, Adrian, 0.069 What's the best way to select names at random from this but still take into account frequency of use? Thanks in advance! Matt -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=dae...@daevid.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Probability Selects
On 1/7/10 3:10 PM, Matt Neimeyer m...@neimeyer.org wrote: I'm trying to select names at random from a table that contains the name and the frequency at which it is actually used in society. The table is defined as follows: CREATE TABLE `MaleNames` ( `Name_ID` int(11) NOT NULL auto_increment, `Name` char(50) default NULL, `Frequency` decimal(5,3) default NULL, PRIMARY KEY (`Name_ID`) ) Some examples: 1, Aaron, 0.240 3, Abe, 0.006 13, Adrian, 0.069 What's the best way to select names at random from this but still take into account frequency of use? after reading the source, i'd avoid using MySQL's RAND(). you can probably easily get better quality pseudorandom number in your app's environment. to get a random row, LIMIT 1, r, where offset r is a random number between 0 and (tablelength - 1), should work. if you must use RAND(), FLOOR(tablelength*RAND()) will work for r. if you want a random row from a constrained subset of rows based on frequency, e.g. WHERE Frequency 0.001, you can include that condition in the query and repeat it until you get a non-empty response. but if the constrained subset is only a small fraction of the table's rows then it might be more efficient to compute its size (with COUNT() in another query or a subquery) and use that in generating the offset r. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org