On 5/8/08, Barbara Weinberg <[EMAIL PROTECTED]> wrote:
> The 40000 is the subset that my "where clause" cuts it down to. Not so
> relevant perhaps/
> Where is your perl script. I would love to see it. Many
> thanks!
>
I sent it to the list 3 hours ago. Check the thread/archives. But,
here goes again --
In that case, take out the 40000 records and then shuffle them in memory.
use DBI;
use List::Util qw(shuffle);
my $sth = $dbh->prepare(qq{
SELECT phn
FROM demographics
WHERE sex='F' AND dob BETWEEN 19400401 AND 19450331
});
$sth->execute;
my @res;
while (my ($phn) = $sth->fetchrow_array) {
push @res, $phn;
}
$sth->finish;
my @shuffled = shuffle @res;
my @final_res = $shuffled[0..499];
----
make sure you have indexes on sex and dob.
> B
>
> On Thu, May 8, 2008 at 10:44 AM, P Kishor <[EMAIL PROTECTED]> wrote:
>
> >
> >
> > On 5/8/08, Barbara Weinberg <[EMAIL PROTECTED]> wrote:
> > > This is my query:
> > > select phn,random() as ran from demographics where sex='F' and dob >=
> > > 19400401 and dob <=19450331 order by ran limit 500
> > > The original table has 7 million records and I am trying to select a
> subset
> > > of about 40000 and sort them in random order. I am doing this to get a
> > > random 500 records.
> >
> > I don't know what the 40000 is doing there, but I would first select
> > all the records, and then shuffle them in memory outside the db.
> >
> > As I showed in my Perl script, it will be way much faster to do so.
> >
> >
> >
> > >
> > >
> > >
> > > On Thu, May 8, 2008 at 9:17 AM, John Stanton <[EMAIL PROTECTED]>
> wrote:
> > >
> > > > Dennis Cote wrote:
> > > > > Barbara Weinberg wrote:
> > > > >> I was wondering whether anyone had tried sorting records in random
> > > > order
> > > > >> using sqlite3. I tried sorting by random() and randomblob() but it
> was
> > > > very
> > > > >> slow and chewed up lots of resources. Any suggestions?
> > > > >
> > > > > Can you provide any more details about what you are trying to
> > > > accomplish?
> > > > >
> > > > > There have been several previous discussions about selecting random
> > > > > records from a table. You may want to look for those in the
> archives. If
> > > > > you really need to sort the entire table then you are probably
> stuck
> > > > > with the slow operation. If you only need a random subset of a
> table
> > > > > then you may be able to use some of these ideas to speed things up.
> > > > >
> > > > > HTH
> > > > > Dennis Cote
> > > > >
> > > >
> > > > If you have sequential rowids and know the largest then you can use a
> PRNG
> > > > and get a random rowid to lookup. That will be optimally fast.
> > > >
> >
> >
> >
> > --
> > Puneet Kishor http://punkish.eidesis.org/
> > Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
> > Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/
> >
> >
> >
> > _______________________________________________
> > sqlite-users mailing list
> > [email protected]
> >
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
>
>
--
Puneet Kishor http://punkish.eidesis.org/
Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users