On 01/05/13 02:35, Sia wrote:
> I have strings such as:
> tA.-2AG.-2AG,-2ag
> or
> .+3ACG.+5CAACG.+3ACG.+3ACG
> The plus and minus signs are always followed by a number (say, i). I want
> python to find each single plus or minus, remove the sign, the number after it
> and remove i characters after that. So the two strings above become:
> tA..,
> and
> ...
With the same caveat Frank posted, that the second result should be
"...." (4 dots), here is a regex version. I don't know how its timing
compares:
import re

# A sign, the count i, then the run of characters up to the next sign.
r = re.compile(r"[-+](\d+)([^-+]*)")

def modify(m):
    # Keep only what follows the first i characters of the run.
    result = m.group(2)[int(m.group(1)):]
    return result

for test, expected in (
        ("tA.-2AG.-2AG,-2ag", "tA..,"),
        (".+3ACG.+5CAACG.+3ACG.+3ACG", "...."),
        ):
    s = r.sub(modify, test)
    print("%r -> %r (%r)" % (
        test, s, expected
        ))
    assert s == expected, "Nope"
(It passes the tests once the second expected value is changed to "....".)
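On the timing question, here is a rough sketch using the standard-library timeit module. It repeats the pattern and callback so that it runs on its own, and it also shows what the two groups capture for the first sample. The names strip_indels and run_all and the repetition count of 10000 are made up for this illustration; the elapsed time it prints will differ from machine to machine, so treat it as a way to compare candidates rather than as a result.

```python
import re
import timeit

# Same pattern as above: a sign, the count, then everything up to the next sign.
r = re.compile(r"[-+](\d+)([^-+]*)")

def modify(m):
    # Drop the first i characters of the run that follows the number.
    return m.group(2)[int(m.group(1)):]

def strip_indels(s):
    # Hypothetical wrapper name, used only for this sketch.
    return r.sub(modify, s)

samples = ["tA.-2AG.-2AG,-2ag", ".+3ACG.+5CAACG.+3ACG.+3ACG"]

# (count, run) pairs captured by the two groups for the first sample.
captures = r.findall(samples[0])
print(captures)

def run_all():
    for s in samples:
        strip_indels(s)

# 10000 passes over both samples; the count is an arbitrary choice.
elapsed = timeit.timeit(run_all, number=10000)
print("%.3f s for 10000 passes" % elapsed)
```

To compare against another approach, wrap it in a function with the same signature as strip_indels and time it with the same samples and the same number of passes.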
-tkc
--
http://mail.python.org/mailman/listinfo/python-list