Re: find sublist inside list

2009-05-05 Thread bearophileHUGS
John O'Hagan:
> li=['a', 'c', 'a', 'c', 'c', 'g', 'a', 'c']
> for i  in range(len(li)):
>     if li[i:i + 2] == ['a', 'c']:
>         li[i:i + 2] = ['6']

Oh well, I have done a mistake, it seems.
Another solution then:

>>> 'acaccgac'.replace("ac", chr(6))
'\x06\x06cg\x06'

Bye,
bearophile
--
http://mail.python.org/mailman/listinfo/python-list


Re: find sublist inside list

2009-05-05 Thread mzdude
On May 4, 4:54 pm, "Gabriel Genellina"  wrote:
> En Mon, 04 May 2009 15:12:41 -0300, mzdude  escribió:
>
> > substring isn't limited to 0..255
>  substring = "\0x%d\0x%d" % (257,257)
>  'acaccgac'.replace("ac", substring)
> > '\x00x257\x00x257\x00x257\x00x257cg\x00x257\x00x257'
>
> This isn't what you think it is. Look carefully:
>
> py> substring = "\0x%d\0x%d" % (257,257)
> py> len(substring)
> 10
> py> list(substring)
> ['\x00', 'x', '2', '5', '7', '\x00', 'x', '2', '5', '7']
>
> --
> Gabriel Genellina

OOPS. My bad. But I'm not going to give up.

l = ['a','b','c','a','c']
us = unicode("".join(l))
substr = unichr(257) + unichr(257)
us = us.replace(u'ac',substr)
print len(us)
print list(us)


output is
>>>
5
[u'a', u'b', u'c', u'\u0101', u'\u0101']
--
http://mail.python.org/mailman/listinfo/python-list


Re: find sublist inside list

2009-05-05 Thread Gerard Flanagan

Matthias Gallé wrote:

Hi.

My problem is to replace all occurrences of a sublist with a new element.

Example:
Given ['a','c','a','c','c','g','a','c'] I want to replace all 
occurrences of ['a','c'] by 6 (result [6,6,'c','g',6]).




For novelty value:

from itertools import izip

def replace2(data, pattern):
assert len(pattern) == 2
pattern = tuple(pattern)
icopy = iter(data)
icopy.next()
gen = izip(data, icopy)
while True:
item = gen.next()
if item == pattern:
yield '6'
gen.next()
else:
yield item[0]

# works if list ends with ['a', 'c']
data = ['g', 'a', 'c', 'a', 'c', 'a', 'a', 'a', 'g', 'a', 'c']
want = 'g66aaag6'
assert ''.join(replace2(data, ['a', 'c'])) == want

# otherwise you lose the last element of the tail
data = ['g', 'a', 'c', 'a', 'c', 'a', 'a', 'a', 'g', 'a', 'c', 'c', 'g']
want = 'g66aaag6cg'
get = 'g66aaag6c'
assert not ''.join(replace2(data, ['a', 'c'])) == want
assert ''.join(replace2(data, ['a', 'c'])) == get

# fix by adding the pattern to the end of the data as a sentinel

def replace2(data, pattern):
assert len(pattern) == 2
def _replace2(data, pattern):
pattern = tuple(pattern)
icopy = iter(data)
icopy.next()
gen = izip(data, icopy)
while True:
item = gen.next()
if item == pattern:
yield '6'
gen.next()
else:
yield item[0]
data = data + pattern
return list(_replace2(data, pattern))[:-1]

data = ['g', 'a', 'c', 'a', 'c', 'a', 'a', 'a', 'g', 'a', 'c']
want = 'g66aaag6'
assert ''.join(replace2(data, ['a', 'c'])) == want

data = ['g', 'a', 'c', 'a', 'c', 'a', 'a', 'a', 'g', 'a', 'c', 'c', 'g']
want = 'g66aaag6cg'
assert ''.join(replace2(data, ['a', 'c'])) == want

print 'done'

--
http://mail.python.org/mailman/listinfo/python-list


Re: find sublist inside list

2009-05-04 Thread Terry Reedy

Matthias Gallé wrote:

Hi.

My problem is to replace all occurrences of a sublist with a new element.

Example:
Given ['a','c','a','c','c','g','a','c'] I want to replace all 
occurrences of ['a','c'] by 6 (result [6,6,'c','g',6]).


If I do this with string ('acaccgac') I have the advantage of all the 
'find' functions, but perfomance is bad and some extra care must be 
taken if one element consist of more then one character (case of 11 for 
example)


So I really would like to work with lists straightforward, but I could 
not found anything to search a sublist inside a list.

Any propositions for a simple solution?


For a mutable homogenous array, consider the array module.
Any algorithm that applies to a sequence of chars can be adjusted to 
other sequences.  For the above case, remember than you can easily 
filter None out of a sequence.  IE, replace 'a','c' with 6, None and 
then filter when done.


--
http://mail.python.org/mailman/listinfo/python-list


Re: find sublist inside list

2009-05-04 Thread Gabriel Genellina

En Mon, 04 May 2009 15:12:41 -0300, mzdude  escribió:


substring isn't limited to 0..255

substring = "\0x%d\0x%d" % (257,257)
'acaccgac'.replace("ac", substring)

'\x00x257\x00x257\x00x257\x00x257cg\x00x257\x00x257'


This isn't what you think it is. Look carefully:

py> substring = "\0x%d\0x%d" % (257,257)
py> len(substring)
10
py> list(substring)
['\x00', 'x', '2', '5', '7', '\x00', 'x', '2', '5', '7']

--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list


Re: find sublist inside list

2009-05-04 Thread mzdude
On May 4, 9:01 am, Matthias Gallé  wrote:
> bearophileh...@lycos.com wrote:
> > John O'Hagan:
> >> li=['a', 'c', 'a', 'c', 'c', 'g', 'a', 'c']
> >> for i  in range(len(li)):
> >>     if li[i:i + 2] == ['a', 'c']:
> >>         li[i:i + 2] = ['6']
>
> > Oh well, I have done a mistake, it seems.
> > Another solution then:
>
>  'acaccgac'.replace("ac", chr(6))
> > '\x06\x06cg\x06'
>
> > Bye,
> > bearophile
>
> Thanks bearophile and John for your quick answers.
> Unfortunately, the int that can replace a sublist can be > 255, but
> John's answer looks simple and good enough for me. I will use it as a
> starting point.
>

substring isn't limited to 0..255
>>> substring = "\0x%d\0x%d" % (257,257)
>>> 'acaccgac'.replace("ac", substring)
'\x00x257\x00x257\x00x257\x00x257cg\x00x257\x00x257'
--
http://mail.python.org/mailman/listinfo/python-list


Re: find sublist inside list

2009-05-04 Thread John O'Hagan
On Mon, 4 May 2009, Francesco Guerrieri wrote:
> On Mon, May 4, 2009 at 3:01 PM, John O'Hagan  wrote:
> > On Mon, 4 May 2009, Matthias Gallé wrote:
> > > Hi.
> > >
> > > My problem is to replace all occurrences of a sublist with a new
> > > element.
> > >
> > > Example:
> > > Given ['a','c','a','c','c','g','a','c'] I want to replace all
> > > occurrences of ['a','c'] by 6 (result [6,6,'c','g',6]).
> >
> > li=['a', 'c', 'a', 'c', 'c', 'g', 'a', 'c']
> > for i  in range(len(li)):
> >if li[i:i + 2] == ['a', 'c']:
> >li[i:i + 2] = ['6']
> >
> > HTH,
> >
> > John
>
> Beware that you are mutating the list you are iterating over. That could
> lead to some strange bugs (for instance if you replaced the deleted items
> with a longer sequence, the range(len(li)) would still go up to the
> original lenght).
> It is better to modify a new list instead. Eg you could append to a new
> list.
[...]

Quite right, while it happens to work in this particular example, as you and 
MRAB point out, it's generally dangerous (and in fact this one silently and 
uselessly iterates over the last couple of indexes which no longer exist); a 
new list could be created like this:

index=0
newli=[]
while indexhttp://mail.python.org/mailman/listinfo/python-list


Re: find sublist inside list

2009-05-04 Thread Aahz
In article ,
=?ISO-8859-1?Q?Matthias_Gall=E9?=   wrote:
>
>My problem is to replace all occurrences of a sublist with a new element.
>
>Example:
>Given ['a','c','a','c','c','g','a','c'] I want to replace all 
>occurrences of ['a','c'] by 6 (result [6,6,'c','g',6]).

What's your goal?  After you do this once, you cannot repeat the
operation with a different sublist because you are not tracking the
source of the numbers.  You might look into standard compression
algorithms for information about how to accomplish this.
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan
--
http://mail.python.org/mailman/listinfo/python-list


Re: find sublist inside list

2009-05-04 Thread MRAB

Matthias Gallé wrote:

bearophileh...@lycos.com wrote:

John O'Hagan:

li=['a', 'c', 'a', 'c', 'c', 'g', 'a', 'c']
for i  in range(len(li)):
if li[i:i + 2] == ['a', 'c']:
li[i:i + 2] = ['6']


Oh well, I have done a mistake, it seems.
Another solution then:


'acaccgac'.replace("ac", chr(6))

'\x06\x06cg\x06'

Bye,
bearophile


Thanks bearophile and John for your quick answers.
Unfortunately, the int that can replace a sublist can be > 255, but 
John's answer looks simple and good enough for me. I will use it as a 
starting point.



John's solution changes the length of the list over which it's
iterating.

I'd suggest something more like:

li = ['a', 'c', 'a', 'c', 'c', 'g', 'a', 'c']
pos = 0
try:
while True:
pos = li.index('a', pos)
if li[pos : pos + 2] == ['a', 'c']:
li[pos : pos + 2] = [6]
pos += 1
except ValueError:
pass
--
http://mail.python.org/mailman/listinfo/python-list


Re: find sublist inside list

2009-05-04 Thread bearophileHUGS
Matthias Gallé:
>the int that can replace a sublist can be > 255,<

You didn't specify your integer ranges.
Probably there are many other solutions for your problem, but you have
to give more information. Like the typical array size, typical range
of the numbers, how much important is total memory used, how much
important is running speed, what kind of processing (or serialization/
output) you later have to do with such arrays, and so on.
Other solutions include using an array('H', []), and using 0-255 to
represent ASCII and numbers >255 <2^16 to represent the other numbers,
etc.
If speed is critical you can even think about creating a little
function with PyInline or D+Pyd, etc.

Bye,
bearophile
--
http://mail.python.org/mailman/listinfo/python-list


Re: find sublist inside list

2009-05-04 Thread Matthias Gallé

bearophileh...@lycos.com wrote:

John O'Hagan:

li=['a', 'c', 'a', 'c', 'c', 'g', 'a', 'c']
for i  in range(len(li)):
if li[i:i + 2] == ['a', 'c']:
li[i:i + 2] = ['6']


Oh well, I have done a mistake, it seems.
Another solution then:


'acaccgac'.replace("ac", chr(6))

'\x06\x06cg\x06'

Bye,
bearophile


Thanks bearophile and John for your quick answers.
Unfortunately, the int that can replace a sublist can be > 255, but 
John's answer looks simple and good enough for me. I will use it as a 
starting point.


Thank's again.

--
Matthias Gallé
Project Symbiose
Centre de Recherche INRIA Rennes - Bretagne Atlantique,
Campus de Beaulieu, 35042 Rennes cedex, France
tel: (33|0) 2 9984 7523
http://www.irisa.fr/symbiose/matthias_galle
--
http://mail.python.org/mailman/listinfo/python-list


Re: find sublist inside list

2009-05-04 Thread Francesco Guerrieri
On Mon, May 4, 2009 at 3:01 PM, John O'Hagan  wrote:

> On Mon, 4 May 2009, Matthias Gallé wrote:
> > Hi.
> >
> > My problem is to replace all occurrences of a sublist with a new element.
> >
> > Example:
> > Given ['a','c','a','c','c','g','a','c'] I want to replace all
> > occurrences of ['a','c'] by 6 (result [6,6,'c','g',6]).
> >
>
> li=['a', 'c', 'a', 'c', 'c', 'g', 'a', 'c']
> for i  in range(len(li)):
>if li[i:i + 2] == ['a', 'c']:
>li[i:i + 2] = ['6']
>
> HTH,
>
> John
>

Beware that you are mutating the list you are iterating over. That could
lead to some strange bugs (for instance if you replaced the deleted items
with a longer sequence, the range(len(li)) would still go up to the original
lenght).
It is better to modify a new list instead. Eg you could append to a new
list.

Francesco
--
http://mail.python.org/mailman/listinfo/python-list


Re: find sublist inside list

2009-05-04 Thread John O'Hagan
On Mon, 4 May 2009, Matthias Gallé wrote:
> Hi.
>
> My problem is to replace all occurrences of a sublist with a new element.
>
> Example:
> Given ['a','c','a','c','c','g','a','c'] I want to replace all
> occurrences of ['a','c'] by 6 (result [6,6,'c','g',6]).
>

li=['a', 'c', 'a', 'c', 'c', 'g', 'a', 'c'] 
for i  in range(len(li)):
if li[i:i + 2] == ['a', 'c']:
li[i:i + 2] = ['6']

HTH,

John




--
http://mail.python.org/mailman/listinfo/python-list


Re: find sublist inside list

2009-05-04 Thread bearophileHUGS
Matthias Gallé:
> My problem is to replace all occurrences of a sublist with a new element.
> Example:
> Given ['a','c','a','c','c','g','a','c'] I want to replace all
> occurrences of ['a','c'] by 6 (result [6,6,'c','g',6]).

There are several ways to solve this problem. Representing a string as
a list of "chars" (1-strings) is fine if you work with small strings,
but once they get longer you need too much memory and time.

This is a starting point:

>>> mapper = {"a":6, "c":6}
>>> data = 'acaccgac'
>>> [mapper.get(c, c) for c in data]
[6, 6, 6, 6, 6, 'g', 6, 6]

If you need higher performance, to represent few small numbers into a
well defined string like genomic data, you can use ASCII values of 6
and 11:

>>> from string import maketrans
>>> tab = maketrans("acgt", "".join([chr(6), chr(6), "gt"]))
>>> s.translate(tab)
'\x06\x06\x06\x06\x06g\x06\x06'

Later in processing it's easy to tell apart normal genome bases from
those small numbers.

Note that there is the array.array function too in the standard lib,
and elsewhere there is Numpy too.

There are several other possible solutions, but I stop here so you can
explain your purposes and constraints better.

Bye,
bearophile
--
http://mail.python.org/mailman/listinfo/python-list


find sublist inside list

2009-05-04 Thread Matthias Gallé

Hi.

My problem is to replace all occurrences of a sublist with a new element.

Example:
Given ['a','c','a','c','c','g','a','c'] I want to replace all 
occurrences of ['a','c'] by 6 (result [6,6,'c','g',6]).


If I do this with string ('acaccgac') I have the advantage of all the 
'find' functions, but perfomance is bad and some extra care must be 
taken if one element consist of more then one character (case of 11 for 
example)


So I really would like to work with lists straightforward, but I could 
not found anything to search a sublist inside a list.

Any propositions for a simple solution?

Thanks in advance,


--
Matthias Gallé
Project Symbiose
Centre de Recherche INRIA Rennes - Bretagne Atlantique,
Campus de Beaulieu, 35042 Rennes cedex, France
tel: (33|0) 2 9984 7523
http://www.irisa.fr/symbiose/matthias_galle
--
http://mail.python.org/mailman/listinfo/python-list