Re: Inplace shuffle function returns none

2016-10-19 Thread Sayth Renshaw
Hi Chris, I read this last night and thought I might have woken with a frightfully witty response. I didn't, however. Thanks :-) -- https://mail.python.org/mailman/listinfo/python-list

Re: Inplace shuffle function returns none

2016-10-19 Thread Chris Angelico
On Wed, Oct 19, 2016 at 6:01 PM, Sayth Renshaw wrote: > Ok i think i do understand it. I searched the python document for in-place > functions but couldn't find a specific reference. > > Is there a particular part in docs or blog that covers it? Or is it > fundamental to all so not explicitly tr

Re: Inplace shuffle function returns none

2016-10-19 Thread Ben Finney
Sayth Renshaw writes: > Ok i think i do understand it. I searched the python document for > in-place functions but couldn't find a specific reference. They aren't a separate kind of function. A function can do anything Python code can do; indeed, most Python programs do just about everything t
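
A small sketch of the convention under discussion, using only built-ins and the standard library. The variable names are illustrative. Callables that mutate their argument (list.sort, random.shuffle) return None, while those that build a new object (sorted) return it.

```python
import random

numbers = [3, 1, 2]

# sorted() builds and returns a new list; the original is left untouched.
ordered = sorted(numbers)

# list.sort() reorders the list in place and returns None.
result_of_sort = numbers.sort()

# random.shuffle() follows the same convention as list.sort().
result_of_shuffle = random.shuffle(numbers)

print(ordered, result_of_sort, result_of_shuffle)
```

Nothing in the language marks a function as "in place"; it is a documentation convention, so the reference entry for each function is the place to check.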

Re: Inplace shuffle function returns none

2016-10-19 Thread Peter Otten
> right way. > > You keep getting None because you do it the wrong way. Unfortunately you > aren't showing us your code, so we have no idea what you are doing wrong. > My guess is that you are doing something like this: > > > a = [1, 2, 3, 4, 5, 6, 7, 8] > b = random.s

Re: Inplace shuffle function returns none

2016-10-19 Thread Sayth Renshaw
OK, I think I do understand it. I searched the Python documentation for in-place functions but couldn't find a specific reference. Is there a particular part in docs or blog that covers it? Or is it fundamental to all so not explicitly treated in one particular page? Thanks Sayth

Re: Inplace shuffle function returns none

2016-10-18 Thread Steve D'Aprano
the wrong way. Unfortunately you aren't showing us your code, so we have no idea what you are doing wrong. My guess is that you are doing something like this: a = [1, 2, 3, 4, 5, 6, 7, 8] b = random.shuffle(a)[0:3] That's wrong -- shuffle() modifies the list you pass, and returns None. You cannot t

Re: Inplace shuffle function returns none

2016-10-18 Thread John Gordon
In <9d24f23c-b578-4029-ab80-f117599e2...@googlegroups.com> Sayth Renshaw writes: > So why can't i assign the result slice to a variable b? Because shuffle() modifies the list directly, and returns None. It does NOT return the shuffled list. -- John Gordon A i
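
To make the failure concrete, here is a short sketch (variable names are illustrative) of what happens when the return value of shuffle() is sliced, and where the shuffled values actually live.

```python
from random import shuffle

a = [1, 2, 3, 4, 5]

try:
    # shuffle() returns None, so there is nothing to slice here.
    b = shuffle(a)[:3]
except TypeError as exc:
    error_message = str(exc)

# The shuffled values are in `a` itself, so slice that instead.
first_three = a[:3]

print(error_message)
print(first_three)
```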

Re: Inplace shuffle function returns none

2016-10-18 Thread breamoreboy
done inplace *ALWAYS* returns None as a warning that the operation has been done inplace, so you're never going to get anything back. All you need do is change your original code as follows:-

    from random import shuffle
    a = [1,2,3,4,5]
    shuffle(a)
    b = a[:3]
    print(b)

Kindest regards.

Re: Inplace shuffle function returns none

2016-10-18 Thread Ian Kelly
On Tue, Oct 18, 2016 at 2:25 PM, Sayth Renshaw wrote: > So why can't i assign the result slice to a variable b? > > It just keeps getting none. Because shuffle returns none. If you want to keep both the original list and the shuffled list, then do something like: b = a[:] shuf
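
A minimal sketch of keeping both the original order and a shuffled version. The first option follows the copy-then-shuffle idea in the message above; the second, random.sample, is an alternative from the standard library that is not mentioned in the excerpt.

```python
import random

a = [1, 2, 3, 4, 5]

# Option 1: copy the list, then shuffle the copy in place.
b = a[:]
random.shuffle(b)

# Option 2: random.sample returns a new list in random order
# and never touches its input.
c = random.sample(a, len(a))

print(a, b, c)
```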

Re: Inplace shuffle function returns none

2016-10-18 Thread Sayth Renshaw
So why can't I assign the resulting slice to a variable b? It just keeps getting None. Sayth

Re: Inplace shuffle function returns none

2016-10-18 Thread John Gordon
In Sayth Renshaw writes: > If shuffle is an "in place" function and returns none how do i obtain > the values from it. The values are still in the original object -- variable "a" in your example. > from random import shuffle > a = [1,2,3,4,5] > b = shuff

Re: Inplace shuffle function returns none

2016-10-18 Thread Chris Angelico
On Wed, Oct 19, 2016 at 1:48 AM, Sayth Renshaw wrote: > If shuffle is an "in place" function and returns none how do i obtain the > values from it. > > from random import shuffle > > a = [1,2,3,4,5] > b = shuffle(a) > print(b[:3]) > > For example here

Inplace shuffle function returns none

2016-10-18 Thread Sayth Renshaw
If shuffle is an "in place" function and returns none how do i obtain the values from it. from random import shuffle a = [1,2,3,4,5] b = shuffle(a) print(b[:3]) For example here i just want to slice the first 3 numbers which should be shuffled. However you can't slice a noneTyp

__import__(name, fromlist=...), was Re: Shuffle

2014-09-15 Thread Peter Otten
Steven D'Aprano wrote: > A serious question -- what is the point of the fromlist argument to > __import__? It doesn't appear to actually do anything. > > > https://docs.python.org/3/library/functions.html#__import__ It may be for submodules: $ mkdir -p aaa/bbb $ tree . └── aaa └── bbb 2 d
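
A small sketch of the documented effect of fromlist, using os.path from the standard library rather than the aaa/bbb directory tree in the post. With an empty fromlist, __import__ returns the top-level module; with a non-empty one it returns the module named by the dotted path.

```python
import importlib
import os
import os.path

# Without fromlist, the object for the first component ("os") comes back.
top = __import__("os.path")

# With a non-empty fromlist, the module named by the full path comes back.
sub = __import__("os.path", fromlist=["join"])

# importlib.import_module is the usual way to get the named module directly.
via_importlib = importlib.import_module("os.path")

print(top.__name__, sub.__name__, via_importlib.__name__)
```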

Re: Shuffle

2014-09-15 Thread Steven D'Aprano
Dave Angel wrote: > Michael Torrie Wrote in message: >> You can do it two ways: >> Refer to it as random.shuffle() >> >> or >> >> from random import shuffle >> >> I tend to use the first method (random.shuffle). That way it prevents >
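
The two import styles being compared, as a short sketch. Both names refer to the same function object; the difference is only which name is bound in the importing module. The list `deck` is illustrative.

```python
import random                # binds the module name "random"
from random import shuffle   # binds the function name "shuffle"

deck = list(range(10))

random.shuffle(deck)   # qualified: makes the origin obvious at the call site
shuffle(deck)          # bare name: shorter, but can collide with local names

print(deck)
```

Either import is sufficient on its own; having both is harmless but redundant.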

Re: Shuffle

2014-09-15 Thread Dave Angel
Michael Torrie Wrote in message: > On 09/13/2014 05:47 PM, Seymore4Head wrote: >> Here is a screenshot of me trying Dave Briccetti's quiz program from >> the shell and it (the shuffle command) works. >> https://www.youtube.com/watch?v=VR-yNEpGk3g >> http://i.

Re: Shuffle

2014-09-13 Thread Seymore4Head
On Sat, 13 Sep 2014 19:32:55 -0600, Michael Torrie wrote: >On 09/13/2014 05:47 PM, Seymore4Head wrote: >> Here is a screenshot of me trying Dave Briccetti's quiz program from >> the shell and it (the shuffle command) works. >> https://www.youtube.com/watch?v=VR-yNEp

Re: Shuffle

2014-09-13 Thread Michael Torrie
On 09/13/2014 05:47 PM, Seymore4Head wrote: > Here is a screenshot of me trying Dave Briccetti's quiz program from > the shell and it (the shuffle command) works. > https://www.youtube.com/watch?v=VR-yNEpGk3g > http://i.imgur.com/vlpVa5i.jpg > > Two questions > If you

Re: Shuffle

2014-09-13 Thread Chris Angelico
On Sun, Sep 14, 2014 at 9:47 AM, Seymore4Head wrote: > Two questions > If you import random, do you need to "from random import shuffle"? > > Why does shuffle work from the command line and not when I add it to > this program? > > import random > import shuffl

Shuffle

2014-09-13 Thread Seymore4Head
Here is a screenshot of me trying Dave Briccetti's quiz program from the shell and it (the shuffle command) works. https://www.youtube.com/watch?v=VR-yNEpGk3g http://i.imgur.com/vlpVa5i.jpg Two questions If you import random, do you need to "from random import shuffle"? Why do

bytes1.shuffle( )

2008-05-09 Thread castironpi
random.shuffle( bytes1 ) if random == bytes: repunctuate( sentence ) else: random.shuffle( [ random ] ) sincerely exit, ()

Re: Does shuffle() produce uniform result ?

2007-09-10 Thread Lawrence D'Oliveiro
In message <[EMAIL PROTECTED]>, Steven D'Aprano wrote: > On Sun, 09 Sep 2007 18:53:32 +1200, Lawrence D'Oliveiro wrote: > >> In message <[EMAIL PROTECTED]>, Paul Rubin wrote: >> >>> Lawrence D'Oliveiro <[EMAIL PROTECTED]> writes: >>> Except that the NSA's reputation has taken a dent since t

Re: Does shuffle() produce uniform result ?

2007-09-09 Thread Paul Rubin
Lawrence D'Oliveiro <[EMAIL PROTECTED]> writes: > According to this , the family of > algorithms collectively described as "SHA-2" is by no means a definitive > successor to SHA-1. See : However, due to adva

Re: Does shuffle() produce uniform result ?

2007-09-09 Thread Lawrence D'Oliveiro
In message <[EMAIL PROTECTED]>, Paul Rubin wrote: > Lawrence D'Oliveiro <[EMAIL PROTECTED]> writes: >> > ... and it's to NSA's credit that SHA-1 held up for as long as it did. >> But they have no convincing proposal for a successor. That means the gap >> between the classified and non-classified s

Re: Does shuffle() produce uniform result ?

2007-09-09 Thread Paul Rubin
Lawrence D'Oliveiro <[EMAIL PROTECTED]> writes: > > ... and it's to NSA's credit that SHA-1 held up for as long as it did. > But they have no convincing proposal for a successor. That means the gap > between the classified and non-classified state of the art has shrunk down > to insignificance. Th

Re: Does shuffle() produce uniform result ?

2007-09-09 Thread Paul Rubin
Bryan Olson <[EMAIL PROTECTED]> writes: > I haven't kept up. Has anyone exhibited a SHA-1 collision? I don't think anyone has shown an actual collision, but apparently there is now a known way to find them in around 2**63 operations. I don't know if it parallelizes as well as a brute force attac

Re: Does shuffle() produce uniform result ?

2007-09-09 Thread Steven D'Aprano
On Sun, 09 Sep 2007 18:53:32 +1200, Lawrence D'Oliveiro wrote: > In message <[EMAIL PROTECTED]>, Paul Rubin wrote: > >> Lawrence D'Oliveiro <[EMAIL PROTECTED]> writes: >> >>> Except that the NSA's reputation has taken a dent since they failed to >>> anticipate the attacks on MD5 and SHA-1. >> >>

Re: Does shuffle() produce uniform result ?

2007-09-08 Thread Lawrence D'Oliveiro
In message <[EMAIL PROTECTED]>, Paul Rubin wrote: > Lawrence D'Oliveiro <[EMAIL PROTECTED]> writes: > >> Except that the NSA's reputation has taken a dent since they failed to >> anticipate the attacks on MD5 and SHA-1. > > NSA had nothing to do with MD5 ... Nevertheless, it was their job to ant

Re: Does shuffle() produce uniform result ?

2007-09-08 Thread Bryan Olson
Paul Rubin wrote: > Lawrence D'Oliveiro writes: >> Except that the NSA's reputation has taken a dent since they failed to >> anticipate the attacks on MD5 and SHA-1. > > NSA had nothing to do with MD5, and it's to NSA's credit that SHA-1 > held up for as long as it did. I haven't kept up. Has any

Re: Does shuffle() produce uniform result ?

2007-09-08 Thread Paul Rubin
Lawrence D'Oliveiro <[EMAIL PROTECTED]> writes: > Except that the NSA's reputation has taken a dent since they failed to > anticipate the attacks on MD5 and SHA-1. NSA had nothing to do with MD5, and it's to NSA's credit that SHA-1 held up for as long as it did.

Re: Does shuffle() produce uniform result ?

2007-09-08 Thread Lawrence D'Oliveiro
In message <[EMAIL PROTECTED]>, Steven D'Aprano wrote: > Any cryptographer worth his salt (pun intended) would be looking to close > that vulnerability BEFORE an attack is made public, and not just wait for > the attack to trickle down from the NSA to the script kiddies. Except that the NSA's rep

Re: Does shuffle() produce uniform result ?

2007-09-05 Thread Steven D'Aprano
On Tue, 04 Sep 2007 22:01:47 -0700, Paul Rubin wrote: > OK. /dev/random vs /dev/urandom is a perennial topic in sci.crypt and > there are endless long threads about it there, so I tried to give you > the short version, but will give a somewhat longer version here. Thank you. Your points are take

Re: Does shuffle() produce uniform result ?

2007-09-04 Thread Paul Rubin
Steven D'Aprano <[EMAIL PROTECTED]> writes: > > Right. The idea is that those attacks don't exist and therefore the > > output is computationally indistinguishable from random. > > It is a huge leap from what the man page says, that they don't exist in > the unclassified literature at the time t

Re: Does shuffle() produce uniform result ?

2007-09-04 Thread Raymond Hettinger
On Aug 24, 12:38 am, tooru honda <[EMAIL PROTECTED]> wrote: > 1. Does shuffle() produce uniform result ? If you're worried about this microscopic bias (on the order of 2**-53), then shuffle more than once. That will distribute the bias more evenly: def super_shuffle(sequence):

Re: Does shuffle() produce uniform result ?

2007-09-04 Thread Steven D'Aprano
On Mon, 03 Sep 2007 23:42:56 -0700, Paul Rubin wrote: > Antoon Pardon <[EMAIL PROTECTED]> writes: >> > No the idea is that once there's enough entropy in the pool to make >> > one encryption key (say 128 bits), the output of /dev/urandom is >> > computationally indistinguishable from random output

Re: Does shuffle() produce uniform result ?

2007-09-03 Thread Paul Rubin
Antoon Pardon <[EMAIL PROTECTED]> writes: > > No the idea is that once there's enough entropy in the pool to make > > one encryption key (say 128 bits), the output of /dev/urandom is > > computationally indistinguishable from random output no matter how > > much data you read from it. > > If you w

Re: Does shuffle() produce uniform result ?

2007-09-03 Thread Antoon Pardon
On 2007-09-03, Paul Rubin wrote: > Antoon Pardon <[EMAIL PROTECTED]> writes: >> If I understand correctly that you are using urandom as a random >> generator I wouldn't trust too much on this performance. Urandom >> uses the systemwide entropy-pool. If other programs need this pool >> too, your pe

Re: Does shuffle() produce uniform result ?

2007-09-03 Thread Paul Rubin
Antoon Pardon <[EMAIL PROTECTED]> writes: > If I understand correctly that you are using urandom as a random > generator I wouldn't trust too much on this performance. Urandom > uses the systemwide entropy-pool. If other programs need this pool > too, your performance can drop spectacularly. No th

Re: Does shuffle() produce uniform result ?

2007-09-03 Thread Antoon Pardon
On 2007-08-26, tooru honda <[EMAIL PROTECTED]> wrote: > By incorporating Alex's code, I got another performance boost of 20%. > It is mostly due to Alex's more efficient implementation of block random > than my own version. If I understand correctly that you are using urandom as a random genera

Re: Does shuffle() produce uniform result ?

2007-08-28 Thread tooru honda
Thanks to everyone who replied, (and special thanks to Alex Martelli,) I was able to accomplish what I originally asked for: a shuffle() which is both fast and without bias. It is the first time I try to optimize python code, and it is definitely a very rewarding experience. To bring closure

Re: Does shuffle() produce uniform result ?

2007-08-25 Thread Alex Martelli
tooru honda <[EMAIL PROTECTED]> wrote: ... > def rand2(): > while True: > randata = urandom(2*1024) > for i in xrange(0, 2*1024, 2): > yield int(hexlify(randata[i:i+2]),16)# integer > in [0,65535] another equivalent pos

Re: Does shuffle() produce uniform result ?

2007-08-25 Thread tooru honda
hN=stopN-startN left_over_N=65536%widthN upper_bound_N= 65535-left_over_N random_number=self.rand2_M() while random_number>upper_bound_N: random_number=self.rand2_M() r = random_number%widthN return startN+r
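
A minimal Python 3 sketch of the rejection step in the excerpt above, assuming 16-bit draws taken from os.urandom. The function name `unbiased_randrange` is made up for this illustration, and the poster's `rand2_M` helper is not reproduced. Draws above the largest multiple of the width are thrown away, so the final modulo cannot favour any residue.

```python
import os

def unbiased_randrange(start, stop):
    """Return a uniform integer in [start, stop) from 16-bit random draws."""
    width = stop - start
    if not 0 < width <= 65536:
        raise ValueError("width must be between 1 and 65536")
    # Values above upper_bound would make some residues more likely.
    left_over = 65536 % width
    upper_bound = 65535 - left_over
    while True:
        draw = int.from_bytes(os.urandom(2), "big")  # integer in [0, 65535]
        if draw <= upper_bound:
            return start + draw % width

print([unbiased_randrange(0, 52) for _ in range(5)])
```

The number of accepted values, 65536 - left_over, is an exact multiple of the width, which is what makes each result equally likely.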

Re: Does shuffle() produce uniform result ?

2007-08-25 Thread Alex Martelli
tooru honda <[EMAIL PROTECTED]> wrote: > At the end, I think it is worthwhile to implement my own shuffle and > random methods based on os.urandom. Not only does the resulting code > get rid of the minuscule bias, but the program also runs much faster.

Re: Does shuffle() produce uniform result ?

2007-08-25 Thread Dan Bishop
On Aug 24, 2:38 am, tooru honda <[EMAIL PROTECTED]> wrote: > Hi, > > I have read the source code of the built-in random module, random.py. > After also reading Wiki article on Knuth Shuffle algorithm, I wonder if > the shuffle method implemented in random.py produces resul

Re: Does shuffle() produce uniform result ?

2007-08-25 Thread tooru honda
At the end, I think it is worthwhile to implement my own shuffle and random methods based on os.urandom. Not only does the resulting code get rid of the minuscule bias, but the program also runs much faster. When using random.SystemRandom.shuffle, posix.open and posix.close from calling

Re: Does shuffle() produce uniform result ?

2007-08-24 Thread tooru honda
Hi, First of all, my thanks to all of you who replied. I am writing a gamble simulation to convince my friend that his "winning strategy" doesn't work. I use shuffle method from a random.SystemRandom instance to shuffle 8 decks of cards. As the number of cards is quite small (
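
A sketch of the setup described here: eight decks shuffled with a random.SystemRandom instance, which draws from the operating system's entropy source. The card encoding and the function name `new_shoe` are assumptions made for the example, not taken from the post.

```python
import random

RANKS = "A23456789TJQK"
SUITS = "SHDC"

def new_shoe(decks=8):
    """Build `decks` standard 52-card decks and shuffle them together."""
    shoe = [rank + suit
            for _ in range(decks)
            for suit in SUITS
            for rank in RANKS]
    # SystemRandom uses os.urandom under the hood instead of the
    # default Mersenne Twister generator.
    random.SystemRandom().shuffle(shoe)
    return shoe

shoe = new_shoe()
print(len(shoe), shoe[:5])
```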

Re: Does shuffle() produce uniform result ?

2007-08-24 Thread Mark Dickinson
On Aug 24, 9:30 pm, Mark Dickinson <[EMAIL PROTECTED]> wrote: > x = floor((n/2**53)*7) > > will produce 0, 1, 3 and 5 with probability (2**53//7+1)/2**53, and 2, > 4 and 6 with probability (2**53//7)/2**53. Oops---I lied; I forgot to take into account the rounding implicit in the (n/2**53)*7 mult

Re: Does shuffle() produce uniform result ?

2007-08-24 Thread Paul Rubin
tooru honda <[EMAIL PROTECTED]> writes: > The reasoning is as follows: Because the method random() only produces > finitely many possible results, we get modulo bias when the number of > possible results is not divisible by the size of the shuffled list. > > 1. Does shuf

Re: Does shuffle() produce uniform result ?

2007-08-24 Thread Mark Dickinson
On Aug 24, 8:54 am, Hrvoje Niksic <[EMAIL PROTECTED]> wrote: > tooru honda <[EMAIL PROTECTED]> writes: > > I have read the source code of the built-in random module, > > random.py. After also reading Wiki article on Knuth Shuffle > > algorithm, I wonder if

Re: Does shuffle() produce uniform result ?

2007-08-24 Thread Hrvoje Niksic
tooru honda <[EMAIL PROTECTED]> writes: > I have read the source code of the built-in random module, > random.py. After also reading Wiki article on Knuth Shuffle > algorithm, I wonder if the shuffle method implemented in random.py > produces results with modulo bias. It d

Re: Does shuffle() produce uniform result ?

2007-08-24 Thread Steve Holden
tooru honda wrote: > Hi, > > I have read the source code of the built-in random module, random.py. > After also reading Wiki article on Knuth Shuffle algorithm, I wonder if > the shuffle method implemented in random.py produces results with modulo > bias. > > The

Does shuffle() produce uniform result ?

2007-08-24 Thread tooru honda
Hi, I have read the source code of the built-in random module, random.py. After also reading Wiki article on Knuth Shuffle algorithm, I wonder if the shuffle method implemented in random.py produces results with modulo bias. The reasoning is as follows: Because the method random() only
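
The counting argument behind modulo bias can be seen with toy numbers. This sketch assumes a generator with only 8 equally likely outputs mapped onto 3 buckets; the constants are chosen for illustration and are far smaller than the 2**53 outputs of random().

```python
from collections import Counter

SOURCE_VALUES = 8   # pretend the generator yields 0..7, each equally likely
BUCKETS = 3         # size of the range we want to map onto

# 8 is not divisible by 3, so the buckets cannot all get the same share.
counts = Counter(value % BUCKETS for value in range(SOURCE_VALUES))

for bucket in range(BUCKETS):
    print(bucket, counts[bucket], counts[bucket] / SOURCE_VALUES)
```

Buckets 0 and 1 each receive three source values and bucket 2 only two. With 2**53 source values the same imbalance exists, but it is at most one part in roughly 2**53 per bucket.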

Re: Python module for the IPod shuffle ...

2007-02-01 Thread Analog Kid
-packages. what am i missing? thanks for your help. -ajay On 2/1/07, Simon Brunning <[EMAIL PROTECTED]> wrote: On 1/31/07, Analog Kid <[EMAIL PROTECTED]> wrote: > Hi all: > Im looking for a python module thatll let me do simple reads/writes from and > to an iPod shuf

Re: Python module for the IPod shuffle ...

2007-02-01 Thread Simon Brunning
On 1/31/07, Analog Kid <[EMAIL PROTECTED]> wrote: > Hi all: > Im looking for a python module thatll let me do simple reads/writes from and > to an iPod shuffle similar to iTunes ... I read about the gPod module ... > but Im not sure whether it will work in Windows ... Thi

Python module for the IPod shuffle ...

2007-01-31 Thread Analog Kid
Hi all: I'm looking for a python module that'll let me do simple reads/writes from and to an iPod shuffle similar to iTunes ... I read about the gPod module ... but I'm not sure whether it will work in Windows ... Any help is greatly appreciated. Thanks in advance ... -Ajay

Re: shuffle the lines of a large file

2005-03-12 Thread paul koelle
Joerg Schuster wrote: Thanks to all. This thread shows again that Python's best feature is comp.lang.python. from comp.lang import python ;) Paul

Re: shuffle the lines of a large file

2005-03-11 Thread Peter Otten
Simon Brunning wrote: > I couldn't resist. ;-) Me neither... > import random > > def randomLines(filename, lines=1): > selected_lines = list(None for line_no in xrange(lines)) > > for line_index, line in enumerate(open(filename)): > for selected_line_index in xrange(lines): >

Re: shuffle the lines of a large file

2005-03-11 Thread Simon Brunning
t'll only read several lines from the file, never do a > shuffle of the whole file content... Err, thing is, it *does* pick a random selection from the whole file, without holding the whole file in memory. (It does hold all the selected items in memory - I don't see any way to avoid th
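
One standard way to pick a uniform random selection from a stream while holding only the selected items in memory is reservoir sampling (Algorithm R). The sketch below is that textbook algorithm in Python 3, not the cookbook recipe or the code posted in this thread; the function name and parameters are illustrative.

```python
import random

def reservoir_sample(lines, k, rng=random):
    """Return k items chosen uniformly from an iterable of unknown length."""
    sample = []
    for index, line in enumerate(lines):
        if index < k:
            # Fill the reservoir with the first k items.
            sample.append(line)
        else:
            # Item number index+1 replaces a random slot with
            # probability k / (index + 1).
            slot = rng.randrange(index + 1)
            if slot < k:
                sample[slot] = line
    return sample

print(reservoir_sample(range(1000), 5))
```

Passing an open file object as `lines` reads it one line at a time, so memory use depends on k rather than on the file size. Each line can be selected at most once.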

Re: shuffle the lines of a large file

2005-03-10 Thread Heiko Wundram
On Tuesday 08 March 2005 15:55, Simon Brunning wrote: > Ah, but that's the clever bit; it *doesn't* store the whole list - > only the selected lines. But that means that it'll only read several lines from the file, never do a shuffle of the whole file content... When you&

Re: shuffle the lines of a large file

2005-03-10 Thread Simon Brunning
On Thu, 10 Mar 2005 14:37:25 +0100, Stefan Behnel <[EMAIL PROTECTED]> > There. Factor 10. That's what I call optimization... The simplest approach is even faster: C:\>python -m timeit -s "from itertools import repeat" "[None for i in range(1)]" 100 loops, best of 3: 2.53 msec per loop C:\>p

Re: shuffle the lines of a large file

2005-03-10 Thread Stefan Behnel
Simon Brunning wrote: On Tue, 8 Mar 2005 14:13:01 +, Simon Brunning wrote: selected_lines = list(None for line_no in xrange(lines)) Just a short note on this line. If lines is really large, it's much faster to use from itertools import repeat selected_lines = list(repeat(None, lines))
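
For comparison, three equivalent Python 3 spellings of "a list of n placeholders". Here `n` is an integer count, matching how `lines` is used in the quoted xrange(lines) call. The third form, list multiplication, is a common idiom added here for illustration; no timing claims are made.

```python
from itertools import repeat

n = 5  # number of placeholder slots (illustrative value)

by_generator = list(None for _ in range(n))   # form quoted from the earlier post
by_repeat = list(repeat(None, n))             # itertools.repeat with a count
by_multiplication = [None] * n                # list multiplication

print(by_generator == by_repeat == by_multiplication)
```

Relative speed can be checked on a given interpreter with `python -m timeit`, as done elsewhere in the thread.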

Re: shuffle the lines of a large file

2005-03-08 Thread Simon Brunning
On Tue, 8 Mar 2005 15:49:35 +0100, Heiko Wundram <[EMAIL PROTECTED]> wrote: > Problem being: if the file the OP is talking about really is 80GB in size, and > you consider a sentence to have 80 bytes on average (it's likely to have less > than that), that makes 10^9 sentences in the file. Now, mult

Re: shuffle the lines of a large file

2005-03-08 Thread Heiko Wundram
On Tuesday 08 March 2005 15:28, Simon Brunning wrote: > This has the advantage that every line had the same chance of being > picked regardless of its length. There is the chance that it'll pick > the same line more than once, though. Problem being: if the file the OP is talking about really is 80

Re: shuffle the lines of a large file

2005-03-08 Thread Simon Brunning
On Tue, 8 Mar 2005 14:13:01 +, Simon Brunning <[EMAIL PROTECTED]> wrote: > On 7 Mar 2005 06:38:49 -0800, gry@ll.mit.edu wrote: > > As far as I can tell, what you ultimately want is to be able to extract > > a random ("representative?") subset of sentences. > > If this is what's wanted, then p

Re: shuffle the lines of a large file

2005-03-08 Thread Simon Brunning
On 7 Mar 2005 06:38:49 -0800, gry@ll.mit.edu wrote: > As far as I can tell, what you ultimately want is to be able to extract > a random ("representative?") subset of sentences. If this is what's wanted, then perhaps some variation on this cookbook recipe might do the trick: http://aspn.activest

Re: shuffle the lines of a large file

2005-03-08 Thread Nick Craig-Wood
Raymond Hettinger <[EMAIL PROTECTED]> wrote: > >>> from random import random > >>> out = open('corpus.decorated', 'w') > >>> for line in open('corpus.uniq'): > print >> out, '%.14f %s' % (random(), line), > > >>> out.close() > > sort corpus.decorated | cut -c 18- > corpus.randomized Ve
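
The pipeline quoted above attaches a random key to every line, sorts by that key, and then cuts the key off again. A minimal in-memory Python 3 sketch of the same decorate, sort, undecorate idea follows; the function name is made up here, and for an 80G file the sort step would have to stay on disk, as the external `sort` command does.

```python
import random

def shuffle_by_random_keys(lines, rng=random):
    """Shuffle lines by sorting them on freshly drawn random keys."""
    # Decorate: pair every line with a random float in [0.0, 1.0).
    decorated = [(rng.random(), line) for line in lines]
    # Sort on the key only, so the line text never influences the order.
    decorated.sort(key=lambda pair: pair[0])
    # Undecorate: drop the keys again.
    return [line for _key, line in decorated]

print(shuffle_by_random_keys(["a\n", "b\n", "c\n", "d\n"]))
```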

Re: shuffle the lines of a large file

2005-03-07 Thread Raymond Hettinger
[Joerg Schuster] > I am looking for a method to "shuffle" the lines of a large file. > > I have a corpus of sorted and "uniqed" English sentences that has been > produced with (1): > > (1) sort corpus | uniq > corpus.uniq > > corpus.uniq is 80G large

Re: shuffle the lines of a large file

2005-03-07 Thread François Pinard
[Heiko Wundram] > Replying to oneself is bad, [...] Not necessarily. :-) -- François Pinard http://pinard.progiciels-bpi.ca

Re: shuffle the lines of a large file

2005-03-07 Thread François Pinard
[Joerg Schuster] > I am looking for a method to "shuffle" the lines of a large file. If speed and space are not a concern, I would be tempted to presume that this can be organised without too much difficulty. However, looking for speed handling a big file, while keeping equiproba

RE: shuffle the lines of a large file

2005-03-07 Thread Batista, Facundo
[Joerg Schuster] #- Thanks to all. This thread shows again that Python's best feature is #- comp.lang.python. QOTW! QOTW! . Facundo Bitácora De Vuelo: http://www.taniquetil.com.ar/plog PyAr - Python Argentina: http://pyar.decode.c

Re: shuffle the lines of a large file

2005-03-07 Thread Steven Bethard
Joerg Schuster wrote: Thanks to all. This thread shows again that Python's best feature is comp.lang.python. +1 QOTW STeVe

Re: shuffle the lines of a large file

2005-03-07 Thread Joerg Schuster
Thanks to all. This thread shows again that Python's best feature is comp.lang.python. Jörg

Re: shuffle the lines of a large file - filelist.py (0/1)

2005-03-07 Thread TZOTZIOY
On 7 Mar 2005 05:36:32 -0800, rumours say that "Joerg Schuster" <[EMAIL PROTECTED]> might have written: >Hello, > >I am looking for a method to "shuffle" the lines of a large file. [snip] >So, it would be very useful to do one of the following things: &g

Re: shuffle the lines of a large file

2005-03-07 Thread Warren Postma
Joerg Schuster wrote: Unfortunately, none of the machines that I may use has 80G RAM. So, using a dictionary will not help. Any ideas? Why don't you index the file? I would store the byte-offsets of the beginning of each line into an index file. Then you can generate a random number from 1 to Wh
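
A rough sketch of the indexing idea: record the byte offset at which each line starts, shuffle the offsets, then seek to each one in turn. The function names are hypothetical, and the index is kept in a Python list for brevity, whereas the message suggests storing it in an index file, which would matter at the scale discussed in this thread.

```python
import random

def build_line_index(path):
    """Return the byte offset of the start of every line in the file."""
    offsets = []
    position = 0
    with open(path, "rb") as handle:
        for line in handle:
            offsets.append(position)
            position += len(line)
    return offsets

def shuffled_lines(path, rng=random):
    """Yield the lines of the file (as bytes) in random order."""
    offsets = build_line_index(path)
    rng.shuffle(offsets)  # shuffle the small offsets, not the big lines
    with open(path, "rb") as handle:
        for offset in offsets:
            handle.seek(offset)
            yield handle.readline()
```

The file is opened in binary mode so that offsets are exact byte positions. Random seeks across a very large file are slow on spinning disks, so this trades memory for I/O time.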

Re: shuffle the lines of a large file

2005-03-07 Thread gry
As far as I can tell, what you ultimately want is to be able to extract a random ("representative?") subset of sentences. Given the huge size of data, I would suggest not randomizing the file, but randomizing accesses to the file. E.g. (sorry for off-the-cuff pseudo python): [adjust 8192 == 2**13

Re: shuffle the lines of a large file

2005-03-07 Thread Richard Brodie
"Joerg Schuster" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > I am looking for a method to "shuffle" the lines of a large file. Off the top of my head: decorate, randomize, undecorate. Prepend a suitable large random number or hash to each line

Re: shuffle the lines of a large file

2005-03-07 Thread Heiko Wundram
Replying to oneself is bad, but although the program works, I never intended to use a shelve to store the data. Better to use anydbm. So, just replace: import shelve by import anydbm and lineindex = shelve.open("test.idx") by lineindex = anydbm.open("test.idx","c") Keep the rest as is.

RE: shuffle the lines of a large file

2005-03-07 Thread Alex Stapleton
Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Alex Stapleton Sent: 07 March 2005 14:17 To: Joerg Schuster; python-list@python.org Subject: RE: shuffle the lines of a large file Not tested this, run it (or some derivation thereof) over the output to get increasing randomness

Re: shuffle the lines of a large file

2005-03-07 Thread Eddie Corns
"Joerg Schuster" <[EMAIL PROTECTED]> writes: >Hello, >I am looking for a method to "shuffle" the lines of a large file. >I have a corpus of sorted and "uniqed" English sentences that has been >produced with (1): >(1) sort corpus | uniq >

Re: shuffle the lines of a large file

2005-03-07 Thread Heiko Wundram
hex(curpos-lastpos)[2:-1]) lastpos = curpos i += 1 return i maxidx = makeIdx() # To shuffle the file, just shuffle the index. Problem being: there is no # random number generator which even remotely has the possibility of yielding # all p

RE: shuffle the lines of a large file

2005-03-07 Thread Alex Stapleton
Not tested this, run it (or some derivation thereof) over the output to get increasing randomness. You will want to keep max_buffered_lines as high as possible really I imagine. If shuffle() is too intensive you could iterate over the buffer several times randomly removing and printing lines
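
The code this message refers to is not included in the excerpt, so the following is only a guess at the general shape of a buffered shuffle: read up to max_buffered_lines lines, shuffle that buffer, emit it, and repeat. The function name and the use of itertools.islice are assumptions made for this sketch.

```python
import random
from itertools import islice

def chunked_shuffle(lines, max_buffered_lines, rng=random):
    """Yield lines shuffled within consecutive buffers of bounded size."""
    iterator = iter(lines)
    while True:
        # Pull at most max_buffered_lines items into memory.
        buffer = list(islice(iterator, max_buffered_lines))
        if not buffer:
            return
        rng.shuffle(buffer)
        yield from buffer

print(list(chunked_shuffle(range(10), 4)))
```

A single pass only mixes lines within one buffer, so a line never moves further than the buffer size. That is why a larger max_buffered_lines gives a result closer to a full shuffle.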

Re: shuffle the lines of a large file

2005-03-07 Thread Kent Johnson
Joerg Schuster wrote: Hello, I am looking for a method to "shuffle" the lines of a large file. I have a corpus of sorted and "uniqed" English sentences that has been produced with (1): (1) sort corpus | uniq > corpus.uniq corpus.uniq is 80G large. The fact that every senten

shuffle the lines of a large file

2005-03-07 Thread Joerg Schuster
Hello, I am looking for a method to "shuffle" the lines of a large file. I have a corpus of sorted and "uniqed" English sentences that has been produced with (1): (1) sort corpus | uniq > corpus.uniq corpus.uniq is 80G large. The fact that every sentence appears only on