Re: generate De Bruijn sequence memory and string vs lists

2014-01-24 Thread Greg Ewing
Vincent Davis wrote: True, the "all you want is a mapping" is not quite true. I actually plan to plot frequency (the number of times an observed subsequence overlaps a value in the De Bruijn sequence). The way the subsequences overlap is important to me and I don't see a way to go from base-k (or

Re: generate De Bruijn sequence memory and string vs lists

2014-01-24 Thread Vincent Davis
On Fri, Jan 24, 2014 at 2:23 AM, Peter Otten <__pete...@web.de> wrote: > Then, how do you think Python /knows/ that it has to repeat the code 10 > times on my "slow" and 100 times on your "fast" machine? It runs the bench > once, then 10, then 100, then 1000 times -- until there's a run that takes
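
A minimal sketch of the calibration Peter describes, assuming Python 3.6 or later, where the same logic is exposed as timeit.Timer.autorange(). The statement being timed (sorting a reversed list) is a placeholder and not the debruijn_compat code from the thread. The loop counts tried by newer Pythons differ slightly from the 1, 10, 100 progression quoted above, but the idea is the same: keep raising the loop count until one batch takes long enough to measure.

```python
import timeit

# Placeholder statement and setup; any cheap expression works here.
timer = timeit.Timer("sorted(data)", setup="data = list(range(1000, 0, -1))")

# autorange() raises the loop count until a single batch takes at
# least 0.2 seconds, then returns (loop_count, seconds_for_that_batch).
loops, elapsed = timer.autorange()

print("loops chosen:", loops)
print("seconds for the final batch:", elapsed)
print("msec per loop:", 1000.0 * elapsed / loops)
```

The calibration batches and the repeated measurement runs all count toward wall-clock time, which is one reason a stopwatch reading can be far larger than the reported time per loop.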

Re: generate De Bruijn sequence memory and string vs lists

2014-01-24 Thread Vincent Davis
On Fri, Jan 24, 2014 at 2:29 AM, Gregory Ewing wrote: > If all you want is a mapping between a sequence of > length n and compact representation of it, there's > a much simpler way: just convert it to a base-k > integer, where k is the size of the alphabet. > > The resulting integer won't be any l

Re: generate De Bruijn sequence memory and string vs lists

2014-01-24 Thread Gregory Ewing
Vincent Davis wrote: I plan to use the sequence as an index to count occurrences of sequences of length n. If all you want is a mapping between a sequence of length n and compact representation of it, there's a much simpler way: just convert it to a base-k integer, where k is the size of the al
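
The base-k idea can be sketched as follows. The function names encode_base_k and decode_base_k are made up for illustration and do not come from the thread; symbols are assumed to be integers in range(k).

```python
def encode_base_k(symbols, k):
    """Pack a sequence of symbols (each in range(k)) into one integer."""
    value = 0
    for s in symbols:
        if not 0 <= s < k:
            raise ValueError("symbol %r is outside range(%d)" % (s, k))
        # Shift the digits collected so far one place left, then add s.
        value = value * k + s
    return value


def decode_base_k(value, k, n):
    """Recover the n symbols that encode_base_k packed into value."""
    digits = []
    for _ in range(n):
        value, digit = divmod(value, k)
        digits.append(digit)
    # divmod yields the least significant digit first, so reverse.
    return digits[::-1]


print(encode_base_k([1, 0, 2, 3], 4))   # 75
print(decode_base_k(75, 4, 4))          # [1, 0, 2, 3]
```

Every sequence of length n over an alphabet of size k maps to a distinct integer in range(k ** n), so the result can index a list of counters directly.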

Re: generate De Bruijn sequence memory and string vs lists

2014-01-24 Thread Peter Otten
Vincent Davis wrote: > Excellent Peter! > I have a question, the times reported don't make sense to me, for example > $ python3 -m timeit -s 'from debruijn_compat import debruijn_bytes as d' > 'd(4, 8)' > 100 loops, best of 3: 10.2 msec per loop > This took ~4 secs (stop watch) which is much more

Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Dave Angel
Vincent Davis wrote in message: > I didn't really study the code, and the fact that there's a nested function could mess it up. But if it were a straightforward function with exactly one append, then replacing the append with a yield would produce the string one character at a time. -
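
A sketch of the generator idea, assuming Python 3.3 or later so that "yield from" can pass values up through the nested recursive function. It follows the structure of the recursive algorithm from the Wikipedia article cited in the thread; the name de_bruijn_iter is made up here.

```python
def de_bruijn_iter(k, n):
    """Lazily yield the symbols (ints in range(k)) of a De Bruijn sequence B(k, n)."""
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                # Emit the symbols instead of appending them to a list.
                yield from a[1:p + 1]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


print(list(de_bruijn_iter(2, 3)))   # [0, 0, 0, 1, 0, 1, 1, 1]
```

Only the working array a and the recursion stack (depth n) are held in memory, so the caller can consume the sequence symbol by symbol without ever storing all k ** n symbols.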

Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Vincent Davis
On Thu, Jan 23, 2014 at 3:15 PM, Peter Otten <__pete...@web.de> wrote: > $ python -m timeit -s 'from debruijn_compat import debruijn as d' 'd(4, 8)' > 10 loops, best of 3: 53.5 msec per loop > $ python -m timeit -s 'from debruijn_compat import debruijn_bytes as d' > 'd(4, 8)' > 10 loops, best of 3

Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Peter Otten
Vincent Davis wrote: > On Thu, Jan 23, 2014 at 2:36 PM, Mark Lawrence > wrote: > >> FTR string.maketrans is gone from Python 3.2+. Quoting from >> http://docs.python.org/dev/whatsnew/3.2.html#porting-to-python-3-2 "The >> previously deprecated string.maketrans() function has been removed in >> f

Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Vincent Davis
On Thu, Jan 23, 2014 at 2:36 PM, Mark Lawrence wrote: > FTR string.maketrans is gone from Python 3.2+. Quoting from > http://docs.python.org/dev/whatsnew/3.2.html#porting-to-python-3-2 "The > previously deprecated string.maketrans() function has been removed in favor > of the static methods bytes
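
For Python 3, the replacement mentioned in the quoted release note might look like the sketch below. It assumes the sequence is held as a bytes or bytearray object of small integers and that k is at most 10, so each symbol maps to a single ASCII digit; the helper name digits_to_text is made up for illustration.

```python
def digits_to_text(raw, k):
    """Translate byte values 0..k-1 into the characters '0'..str(k-1)."""
    if not 1 <= k <= 10:
        raise ValueError("this sketch only handles alphabets of size 1..10")
    # bytes.maketrans takes two bytes-like objects of equal length.
    table = bytes.maketrans(bytes(range(k)), b"0123456789"[:k])
    return raw.translate(table).decode("ascii")


print(digits_to_text(bytearray([0, 0, 0, 1, 0, 1, 1, 1]), 2))   # 00010111
```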

Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Mark Lawrence
On 23/01/2014 20:10, Peter Otten wrote: Vincent Davis wrote: On Thu, Jan 23, 2014 at 12:02 PM, Peter Otten <__pete...@web.de> wrote: I just noted that the first Python loop can be eliminated: Oops, I forgot to paste import string def chars(a, b): return "".join(map(chr, range(a, b)))

Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Peter Otten
Vincent Davis wrote: > On Thu, Jan 23, 2014 at 12:02 PM, Peter Otten <__pete...@web.de> wrote: >> >> I just noted that the first Python loop can be eliminated: Oops, I forgot to paste import string def chars(a, b): return "".join(map(chr, range(a, b))) _mapping = string.maketrans(chars(0, 1

Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Vincent Davis
On Thu, Jan 23, 2014 at 12:02 PM, Peter Otten <__pete...@web.de> wrote: > > I just noted that the first Python loop can be eliminated: > > def debruijn(k, n): > a = k * n * bytearray([0]) > sequence = bytearray() > extend = sequence.extend # factor out method lookup > def db(t, p):

Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Vincent Davis
On 1/23/14, 10:18 AM, Dave Angel wrote: > (something about your message seems to make it unquotable) Not sure why the message was not quotable. I sent it using gmail. On 1/23/14, 10:18 AM, Dave Angel wrote: > 64gig is 4^18, so you can forget about holding a string of size 4^50 I guess I will hav

Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Peter Otten
Peter Otten wrote: > You could change de_bruijn_1() to use `bytearray`s instead of `list`s: > > # Python 2 > def debruijn(k, n): > a = k * n * bytearray([0]) > sequence = bytearray() > append = sequence.append # factor out method lookup > def db(t, p,): > if t > n: >

Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Peter Otten
Vincent Davis wrote: > For reference, Wikipedia entry for De Bruijn sequence > http://en.wikipedia.org/wiki/De_Bruijn_sequence > > At the above link is a Python algorithm for generating De Bruijn > sequences. It works fine but outputs a list of integers [0, 0, 0, 1, 0, 1, > 1, 1] and I would pref

Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Vincent Davis
On Thu, Jan 23, 2014 at 10:18 AM, Dave Angel wrote: > If memory size is your issue, why not make the function a > generator, by replacing the append with a yield? > One more thought on the generator. I have an idea for how to use the generator but I still need 1, chunks of size n de_brujin(k

generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Vincent Davis
For reference, Wikipedia entry for De Bruijn sequence http://en.wikipedia.org/wiki/De_Bruijn_sequence At the above link is a Python algorithm for generating De Bruijn sequences. It works fine but outputs a list of integers [0, 0, 0, 1, 0, 1, 1, 1] and I would prefer a string '00010111'. This can b
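
One straightforward way to get from the list form to the string form is sketched below. This is only an illustration and not necessarily the approach the truncated message goes on to describe; the helper name as_string is made up, and it assumes an alphabet of at most 10 symbols so that each integer prints as one character.

```python
def as_string(symbols):
    """Join a list of small integers into a digit string."""
    return "".join(map(str, symbols))


print(as_string([0, 0, 0, 1, 0, 1, 1, 1]))   # 00010111
```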