On 5/8/08, Barbara Weinberg <[EMAIL PROTECTED]> wrote:
> The 40000 is the subset that my "where clause" cuts it down to. Not so
> relevant perhaps/
> Where is your perl script. I would love to see it. Many
> thanks!
>

I sent it to the list 3 hours ago. Check the thread/archives. But,
here goes again --

In that case, take out the 40000 records and then shuffle them in memory.

use DBI;
use List::Util qw(shuffle);

my $sth = $dbh->prepare(qq{
   SELECT phn
   FROM demographics
   WHERE sex='F' AND dob BETWEEN 19400401 AND 19450331
});

$sth->execute;
my @res;
while (my ($phn) = $sth->fetchrow_array) {
   push @res, $phn;
}

$sth->finish;
my @shuffled = shuffle @res;
my @final_res = $shuffled[0..499];

----
make sure you have indexes on sex and dob.



> B
>
> On Thu, May 8, 2008 at 10:44 AM, P Kishor <[EMAIL PROTECTED]> wrote:
>
> >
> >
> > On 5/8/08, Barbara Weinberg <[EMAIL PROTECTED]> wrote:
> > > This is my query:
> > >  select phn,random() as ran from demographics where sex='F' and dob >=
> > >  19400401 and dob <=19450331 order by ran limit 500
> > >  The original table has 7 million records and I am trying to select a
> subset
> > >  of about 40000 and sort them in random order. I am doing this to get a
> > >  random 500 records.
> >
> > I don't know what the 40000 is doing there, but I would first select
> > all the records, and then shuffle them in memory outside the db.
> >
> > As I showed in my Perl script, it will be way much faster to do so.
> >
> >
> >
> > >
> > >
> > >
> > >  On Thu, May 8, 2008 at 9:17 AM, John Stanton <[EMAIL PROTECTED]>
> wrote:
> > >
> > >  > Dennis Cote wrote:
> > >  > > Barbara Weinberg wrote:
> > >  > >> I was wondering whether anyone had tried sorting records in random
> > >  > order
> > >  > >> using sqlite3. I tried sorting by random() and randomblob() but it
> was
> > >  > very
> > >  > >> slow and chewed up lots of resources. Any suggestions?
> > >  > >
> > >  > > Can you provide any more details about what you are trying to
> > >  > accomplish?
> > >  > >
> > >  > > There have been several previous discussions about selecting random
> > >  > > records from a table. You may want to look for those in the
> archives. If
> > >  > > you really need  to sort the entire table then you are probably
> stuck
> > >  > > with the slow operation. If you only need a random subset of a
> table
> > >  > > then you may be able to use some of these ideas to speed things up.
> > >  > >
> > >  > > HTH
> > >  > > Dennis Cote
> > >  > >
> > >  >
> > >  > If you have sequential rowids and know the largest then you can use a
> PRNG
> > >  > and get a random rowid to lookup.  That will be optimally fast.
> > >  >
> >
> >
> >
> > --
> > Puneet Kishor http://punkish.eidesis.org/
> > Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
> > Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/
> >
> >
> >
> > _______________________________________________
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> >
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
>
>


-- 
Puneet Kishor http://punkish.eidesis.org/
Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to