Re: Regular Expression bug?

2023-03-02 Thread jose isaias cabrera
On Thu, Mar 2, 2023 at 9:56 PM Alan Bawden wrote: > > jose isaias cabrera writes: > >On Thu, Mar 2, 2023 at 2:38 PM Mats Wichmann wrote: > >This re is a bit different than the one I am used. So, I am trying to match >everything after 'pn=': > >import re >s = "pm=jose pn=2017"

Re: Regular Expression bug?

2023-03-02 Thread jose isaias cabrera
ject manager. pn=project name. I needed search() rather than match(). > > >>> s = "pn=jose pn=2017" > ... > >>> s0 = r0.match(s) > >>> s0 > > > > > -Original Message- > From: Python-list On > Behalf Of jose isaias cab

Re: Regular Expression bug?

2023-03-02 Thread jose isaias cabrera
On Thu, Mar 2, 2023 at 8:30 PM Cameron Simpson wrote: > > On 02Mar2023 20:06, jose isaias cabrera wrote: > >This re is a bit different than the one I am used. So, I am trying to > >match > >everything after 'pn=': > > > >import re > >s = "pm=jose pn=2017" > >m0 = r"pn=(.+)" > >r0 = re.compile(m0)

Re: Regular Expression bug?

2023-03-02 Thread Alan Bawden
jose isaias cabrera writes: On Thu, Mar 2, 2023 at 2:38 PM Mats Wichmann wrote: This re is a bit different than the one I am used. So, I am trying to match everything after 'pn=': import re s = "pm=jose pn=2017" m0 = r"pn=(.+)" r0 = re.compile(m0) s0 = r0.match(s) >>

Re: Regular Expression bug?

2023-03-02 Thread Cameron Simpson
On 02Mar2023 20:06, jose isaias cabrera wrote: This re is a bit different than the one I am used. So, I am trying to match everything after 'pn=': import re s = "pm=jose pn=2017" m0 = r"pn=(.+)" r0 = re.compile(m0) s0 = r0.match(s) `match()` matches at the start of the string. You want r0.se
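A minimal sketch of the match()/search() difference described in this reply, using the string and pattern from the quoted post. The variable names `anchored`, `found` and `value` are illustrative only.

```python
import re

s = "pm=jose pn=2017"
r0 = re.compile(r"pn=(.+)")

# match() only tries the pattern at position 0; s starts with "pm=", so it fails.
anchored = r0.match(s)

# search() scans forward and returns the first position where the pattern fits.
found = r0.search(s)
value = found.group(1) if found else None

print(anchored, value)
```

Checking the result for None before calling .group() avoids an AttributeError when nothing matches.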

RE: Regular Expression bug?

2023-03-02 Thread avi.e.gross
>>> s0 -Original Message- From: Python-list On Behalf Of jose isaias cabrera Sent: Thursday, March 2, 2023 8:07 PM To: Mats Wichmann Cc: python-list@python.org Subject: Re: Regular Expression bug? On Thu, Mar 2, 2023 at 2:38 PM Mats Wichmann wrote: > > On 3/2/23 12:28

Re: Regular Expression bug?

2023-03-02 Thread jose isaias cabrera
On Thu, Mar 2, 2023 at 2:38 PM Mats Wichmann wrote: > > On 3/2/23 12:28, Chris Angelico wrote: > > On Fri, 3 Mar 2023 at 06:24, jose isaias cabrera wrote: > >> > >> Greetings. > >> > >> For the RegExp Gurus, consider the following python3 code: > >> > >> import re > >> s = "pn=align upgrade sd=2

RE: Regular Expression bug?

2023-03-02 Thread avi.e.gross
On Behalf Of jose isaias cabrera Sent: Thursday, March 2, 2023 2:23 PM To: python-list@python.org Subject: Regular Expression bug? Greetings. For the RegExp Gurus, consider the following python3 code: import re s = "pn=align upgrade sd=2023-02-" ro = re.compile(r"pn=(.+) "

Re: Regular Expression bug?

2023-03-02 Thread jose isaias cabrera
n upgrade sd=2023-02-" > > ro = re.compile(r"pn=(.+) ") > > r0=ro.match(s) > > >>> print(r0.group(1)) > > align upgrade > > > > > > This is wrong. It should be 'align' because the group only goes up-to > > the space. Thought

Re: Regular Expression bug?

2023-03-02 Thread Mats Wichmann
On 3/2/23 12:28, Chris Angelico wrote: On Fri, 3 Mar 2023 at 06:24, jose isaias cabrera wrote: Greetings. For the RegExp Gurus, consider the following python3 code: import re s = "pn=align upgrade sd=2023-02-" ro = re.compile(r"pn=(.+) ") r0=ro.match(s) print(r0.group(1)) align upgrade T

Re: Regular Expression bug?

2023-03-02 Thread 2QdxY4RzWzUUiLuE
) > align upgrade > > > This is wrong. It should be 'align' because the group only goes up-to > the space. Thoughts? Thanks. The bug is in your regular expression; the plus modifier is greedy. If you want to match up to the first space, then you'll need somethin
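To illustrate the greediness point with the string from the original post: the sketch below compares the greedy pattern with two common alternatives. These are not necessarily what the truncated reply went on to suggest; the names `greedy`, `lazy` and `no_space` are made up for the example.

```python
import re

s = "pn=align upgrade sd=2023-02-"

# Greedy: .+ grabs as much as it can, then backs off to the LAST space.
greedy = re.match(r"pn=(.+) ", s).group(1)

# Lazy: .+? grows one character at a time and stops at the FIRST space.
lazy = re.match(r"pn=(.+?) ", s).group(1)

# Alternative: match non-space characters only, no trailing space needed.
no_space = re.match(r"pn=(\S+)", s).group(1)

print(repr(greedy), repr(lazy), repr(no_space))
```

The `\S+` form also works when the value is the last thing in the string and no space follows it.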

Re: Regular Expression bug?

2023-03-02 Thread Chris Angelico
On Fri, 3 Mar 2023 at 06:24, jose isaias cabrera wrote: > > Greetings. > > For the RegExp Gurus, consider the following python3 code: > > import re > s = "pn=align upgrade sd=2023-02-" > ro = re.compile(r"pn=(.+) ") > r0=ro.match(s) > >>> print(r0.group(1)) > align upgrade > > > This is wrong. I

Regular Expression bug?

2023-03-02 Thread jose isaias cabrera
Greetings. For the RegExp Gurus, consider the following python3 code: import re s = "pn=align upgrade sd=2023-02-" ro = re.compile(r"pn=(.+) ") r0=ro.match(s) >>> print(r0.group(1)) align upgrade This is wrong. It should be 'align' because the group only goes up-to the space. Thoughts? Thanks.

Re: python3, regular expression and bytes text

2019-10-14 Thread Eko palypse
Am Montag, 14. Oktober 2019 13:56:09 UTC+2 schrieb Chris Angelico: > > (My apologies for saying this in reply to an unrelated post, but I > also don't see those posts, so it's not easy to reply to them.) > > ChrisA Nothing to apologize and thank you for clarification, I was already checking my s

Re: python3, regular expression and bytes text

2019-10-14 Thread Chris Angelico
On Mon, Oct 14, 2019 at 10:41 PM Eko palypse wrote: > > Am Sonntag, 13. Oktober 2019 21:20:26 UTC+2 schrieb moi: > > [Do not know why I spent hours with this...] > > > > To answer you question. > > Yes, I confirm. > > It seems that as soon as one works with bytes and when > > a char is encoded in

Re: python3, regular expression and bytes text

2019-10-14 Thread Eko palypse
Am Sonntag, 13. Oktober 2019 21:20:26 UTC+2 schrieb moi: > [Do not know why I spent hours with this...] > > To answer you question. > Yes, I confirm. > It seems that as soon as one works with bytes and when > a char is encoded in more than 1 byte, "re" goes into > troubles. > First, sorry for a

Re: python3, regular expression and bytes text

2019-10-12 Thread Eko palypse
the current buffer to me. The problem is that the buffer can have all possible encodings. cp1251, cp1252, utf8, ucs-2 ... but scintilla informs me about which encoding is currently used. I wanted to realize a regular expression tester with Python3, and mark the text that has been matched by regular

Re: python3, regular expression and bytes text

2019-10-12 Thread MRAB
re.LOCALE are slow. It may be more efficient to decode text and use Unicode regular expression. Thank you, I guess I'm convinced to always decode everything (re pattern and text) to utf8 internally and then do the re search but then I would need to figure out the correct position, hmm -

Re: python3, regular expression and bytes text

2019-10-12 Thread Chris Angelico
On Sun, Oct 13, 2019 at 7:16 AM Richard Damon wrote: > > On 10/12/19 3:46 PM, Eko palypse wrote: > > Thank you very much for your answer. > > > >> You have to be able to match bytes, not strings. > > May I ask you to elaborate on this, sorry non-native English speaker. > > The buffer I receive is

Re: python3, regular expression and bytes text

2019-10-12 Thread MRAB
charsets. So even if you set the utf-8 locale, it would not help. Regular expressions with re.LOCALE are slow. It may be more efficient to decode text and use Unicode regular expression. +1 It's best to treat re.LOCALE as being for old legacy encodings that use/used 8 bits per character. Whe

Re: python3, regular expression and bytes text

2019-10-12 Thread Richard Damon
On 10/12/19 3:46 PM, Eko palypse wrote: > Thank you very much for your answer. > >> You have to be able to match bytes, not strings. > May I ask you to elaborate on this, sorry non-native English speaker. > The buffer I receive is a byte-like buffer. > >> I don't think you'll be able to 100% reliab

Re: python3, regular expression and bytes text

2019-10-12 Thread Chris Angelico
hen you're matching text (the normal way you use a regular expression), every element in the RE matches a character (or emptiness). For instance, the regular expression "^[bc]at$" has these elements: "^" matches emptiness at the start "[bc]" matches either "
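A small sketch of the "^[bc]at$" example from this message, run against a few sample words. The word list is an assumption added for illustration.

```python
import re

pat = re.compile(r"^[bc]at$")

# ^ and $ match emptiness at the start and end; [bc] matches exactly one
# character, which must be "b" or "c"; "a" and "t" match themselves.
words = ["bat", "cat", "rat", "cats"]
results = {w: bool(pat.search(w)) for w in words}

print(results)
```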

Re: python3, regular expression and bytes text

2019-10-12 Thread Eko palypse
ow. It may be more efficient to > decode text and use Unicode regular expression. Thank you, I guess I'm convinced to always decode everything (re pattern and text) to utf8 internally and then do the re search but then I would need to figure out the correct position, hmm - some ongoi
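One possible way to handle the position problem mentioned here, as a sketch only: decode, search with a str pattern, then convert the character offsets back to byte offsets by re-encoding the text before the match. The helper name `search_utf8` is hypothetical, and the approach assumes the buffer really is valid UTF-8.

```python
import re

def search_utf8(pattern, data):
    """Search UTF-8 bytes with a str pattern; return (byte_start, byte_end) or None."""
    text = data.decode("utf-8")
    m = re.search(pattern, text)
    if m is None:
        return None
    # Byte offset of the match = encoded length of everything before it.
    start = len(text[:m.start()].encode("utf-8"))
    end = start + len(m.group(0).encode("utf-8"))
    return start, end

# "\u00c4" is the A-umlaut from the test string in the thread (2 bytes in UTF-8).
utf8 = "\u00c4rger im Paradies".encode("utf-8")
print(search_utf8(r"\w+", utf8))
print(search_utf8("im", utf8))
```

Re-encoding the prefix costs time proportional to the match position, so for large buffers with many matches an incremental offset table might be preferable.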

Re: python3, regular expression and bytes text

2019-10-12 Thread Eko palypse
Thank you very much for your answer. > You have to be able to match bytes, not strings. May I ask you to elaborate on this, sorry non-native English speaker. The buffer I receive is a byte-like buffer. > I don't think you'll be able to 100% reliably match bytes in this way. > You're asking it to

Re: python3, regular expression and bytes text

2019-10-12 Thread Serhiy Storchaka
would not help. Regular expressions with re.LOCALE are slow. It may be more efficient to decode text and use Unicode regular expression. -- https://mail.python.org/mailman/listinfo/python-list

Re: python3, regular expression and bytes text

2019-10-12 Thread Chris Angelico
On Sun, Oct 13, 2019 at 5:11 AM Eko palypse wrote: > > What needs to be set in order to be able to use a re search within > utf8 encoded bytes? You have to be able to match bytes, not strings. > So how can I make it work with utf8 encoded text? > Note, decoding it to a string isn't preferred as

python3, regular expression and bytes text

2019-10-12 Thread Eko palypse
What needs to be set in order to be able to use a re search within utf8 encoded bytes? My test, being on a windows PC with cp1252 setup, looks like this import re import locale cp1252 = 'Ärger im Paradies'.encode('cp1252') utf8 = 'Ärger im Paradies'.encode('utf-8') print('cp1252:', cp1252) pr

"Regular Expression Objects" scanner method

2019-10-01 Thread ast
.7/library/re.html#regular-expression-objects nor with help import re lex = re.compile("foo") help(lex.scanner) Help on built-in function scanner: scanner(string, pos=0, endpos=2147483647) method of _sre.SRE_Pattern instance Does anybody know where to find a doc ? -- https://mail.python.

Conversion between basic regular expression and extended regular expression

2018-11-17 Thread Peng Yu
Hi, I'd like to use a program to convert between basic regular expression (BRE) and extended regular expression (ERE). (see man grep for the definition of BRE and ERE). Does Python have a module for this purpose? Thanks. -- Regards, Peng -- https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-29 Thread Karsten Hilbert
On Mon, Oct 29, 2018 at 05:16:11PM +, MRAB wrote: > > Logically it should not because > > > > >s'::15>>$ > > > > does not match > > > > ::\d*>>$ > > > > but I am not sure how to tell it that :-) > > > For something like that, I'd use parsing by recursive descent. > > It might be

Re: regular expression problem

2018-10-29 Thread MRAB
be either a length or a "from-until" > > - a length will be a positive integer (no bounds checking) > > - "from-until" is: a positive integer, a '-', and a positive integer (no sanity checking) > > - options needs to be able to contain nearly anythi

Re: regular expression problem

2018-10-29 Thread Karsten Hilbert
On Sun, Oct 28, 2018 at 11:57:48PM +0100, Brian Oney wrote: > On Sun, 2018-10-28 at 22:04 +0100, Karsten Hilbert wrote: > > [^<:] > > Would a simple regex work? This brought about the solution. However, not this way: > >>> import re > >>> t = '$$' > >>> re.findall('[^<>:$]+', t) > ['name', 'op

Re: regular expression problem

2018-10-29 Thread Karsten Hilbert
> Right, I am not trying to do that. I was, however, worried > that I need to make the expression not "trip over" fragments > of what might seem to constitute part of another placeholder. > > $<$::15>>$ > > Pass 1 might fill in to: > > $>$ > > and I was worried to make sure

Re: regular expression problem

2018-10-29 Thread Karsten Hilbert
On Mon, Oct 29, 2018 at 12:10:04AM +0100, Thomas Jollans wrote: > On 28/10/2018 22:04, Karsten Hilbert wrote: > > - options needs to be able to contain nearly anything, except '::' > > Including > and $ ? Unfortunately, it might. Even if I assume that earlier passes are "inside", and thusly "fil

Re: regular expression problem

2018-10-29 Thread Karsten Hilbert
ain '$' '<' '>' ':' > > > > - range can be either a length or a "from-until" > > > > - a length will be a positive integer (no bounds checking) > > > > - "from-until" is: a positive integer, a '

Re: regular expression problem

2018-10-28 Thread Thomas Jollans
On 28/10/2018 22:04, Karsten Hilbert wrote: > - options needs to be able to contain nearly anything, except '::' Including > and $ ? -- https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-28 Thread Thomas Jollans
On 28/10/2018 22:04, Karsten Hilbert wrote: > - options needs to be able to contain nearly anything, except '::' > > Is that sufficiently defined and helpful to design the regular expression ? so options isn't '.*', but more like '(:?[^:]+)*' (F

Re: regular expression problem

2018-10-28 Thread MRAB
<...$<...>$...>>>$ (lower=earlier parsing passes will be inside) - the internal structure is "name::options::range" $$ - name will *not* contain '$' '<' '>' ':' - range can be either a leng

Re: regular expression problem

2018-10-28 Thread Brian Oney via Python-list
On Sun, 2018-10-28 at 22:04 +0100, Karsten Hilbert wrote: > [^<:] Would a simple regex work? I mean: ~$ python Python 2.7.13 (default, Sep 26 2018, 18:42:22)  [GCC 6.3.0 20170516] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import re >>> t = '$$' >>> re.f

Re: regular expression problem

2018-10-28 Thread Karsten Hilbert
On Sun, Oct 28, 2018 at 10:04:39PM +0100, Karsten Hilbert wrote: > - options needs to be able to contain nearly anything, except '::' This seems to contradict the "nesting" requirement, but the nesting restriction "earlier parsing passes go inside" makes it possible. Karsten -- GPG 40BE 5B0E C

Re: regular expression problem

2018-10-28 Thread Karsten Hilbert
>$ - placeholders for different parsing passes must be nestable: $<<<...$<...>$...>>>$ (lower=earlier parsing passes will be inside) - the internal structure is "name::options::range" $$ - name will *not* contain 

Re: regular expression problem

2018-10-28 Thread Karsten Hilbert
Now that MRAB has shown me the follies of my ways I would like to learn how to properly write the regular expression I need. This part: > rx_works = '\$<[^<:]+?::.*?::\d*?>\$|\$<[^<:]+?::.*?::\d+-\d+>\$' > # it fails if switched around: >

Re: regular expression problem

2018-10-28 Thread MRAB
On 2018-10-28 18:51, Karsten Hilbert wrote: Dear list members, I cannot figure out why my regular expression does not work as I expect it to: #--- #!/usr/bin/python from __future__ import print_function import re as regex rx_works = '\$<[^<:]

regular expression problem

2018-10-28 Thread Karsten Hilbert
Dear list members, I cannot figure out why my regular expression does not work as I expect it to: #--- #!/usr/bin/python from __future__ import print_function import re as regex rx_works = '\$<[^<:]+?::.*?::\d*?>\$|\$<[^<:]+?::.*?::\d+-\d+>\$'

Re: Regular expression

2017-07-26 Thread Jussi Piitulainen
Kunal Jamdade writes: > There is a filename say:- 'first-324-True-rms-kjhg-Meterc639.html' . > > I want to extract the last 4 characters. I tried different regex. but > i am not getting it right. > > Can anyone suggest me how should i proceed.? os.path.splitext(name) # most likely; also: os.path
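A short sketch of the os.path.splitext() suggestion, next to plain slicing, using the filename from the question. Which result is wanted depends on what "the last 4 characters" was meant to cover; the variable names are illustrative.

```python
import os.path

name = "first-324-True-rms-kjhg-Meterc639.html"

# splitext() splits off the extension, including the leading dot.
stem, ext = os.path.splitext(name)

# Plain slicing gives the literal last four characters of the whole name.
last_four = name[-4:]

print(stem, ext, last_four)
```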

Re: Regular expression

2017-07-26 Thread Andre Müller
fname = 'first-324-True-rms-kjhg-Meterc639.html' # with string manipulation stem, suffix = fname.rsplit('.', 1) print(stem[-4:]) # oo-style with str manipulation import pathlib path = pathlib.Path(fname) print(path.stem[-4:]) -- https://mail.python.org/mailman/listinfo/python-list

Re: Regular expression

2017-07-26 Thread Peter Otten
Kunal Jamdade wrote: > There is a filename say:- 'first-324-True-rms-kjhg-Meterc639.html' . > > I want to extract the last 4 characters. I tried different regex. but i am > not getting it right. > > Can anyone suggest me how should i proceed.? You don't need

Re: Regular expression

2017-07-26 Thread Paul Barry
> On 26 July 2017 at 13:52, Kunal Jamdade wrote: > > There is a filename say:- 'first-324-True-rms-kjhg-Meterc639.html' . > > > > I want to extract the last 4 characters. I tried different regex. but i > am > > not getting it right. > > > > C

Re: Regular expression

2017-07-26 Thread Johann Spies
? What have you tried? Why do you need regular expression? >>> s = 'first-324-True-rms-kjhg-Meterc639.html' >>> s[-4:] 'html' Regards Johann -- Because experiencing your loyal love is better than life itself, my lips will praise you. (Psalm 63:3) -- https://mail.python.org/mailman/listinfo/python-list

Regular expression

2017-07-26 Thread Kunal Jamdade
There is a filename say:- 'first-324-True-rms-kjhg-Meterc639.html' . I want to extract the last 4 characters. I tried different regex. but i am not getting it right. Can anyone suggest me how should i proceed.? Regards, Kunal -- https://mail.python.org/mailman/listinfo/python-list

Re: Check for regular expression in a list

2017-05-26 Thread Peter Otten
ith('firefox') for i in process_iter()): > > Or use a regex match if the condition becomes more complex. Even then, > there is re.match to attemp a match at the start of the string, which > helps to keep the expression simple. If regular expression objects were to support eq
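The two approaches mentioned here can be sketched as follows. To keep the example self-contained it works on a plain list of names, a hypothetical stand-in for `[i.name() for i in process_iter()]`; the function names and the exact regex are assumptions.

```python
import re

def has_prefix(names):
    # any() stops at the first hit, so the whole list is not scanned needlessly.
    return any(n.startswith("firefox") for n in names)

# re.match anchors at the start; the trailing $ rejects longer, unrelated names.
FIREFOX_RE = re.compile(r"firefox(-esr)?$")

def matches_regex(names):
    return any(FIREFOX_RE.match(n) is not None for n in names)

sample = ["systemd", "bash", "firefox-esr", "sshd"]  # hypothetical process names
print(has_prefix(sample), matches_regex(sample))
```

The prefix test is looser: it would also accept a name such as "firefoxy", which the regex rejects.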

Re: Check for regular expression in a list

2017-05-26 Thread Cecil Westerhof
On Friday 26 May 2017 14:25 CEST, Jussi Piitulainen wrote: > Rustom Mody writes: > >> On Friday, May 26, 2017 at 5:02:55 PM UTC+5:30, Cecil Westerhof wrote: >>> To check if Firefox is running I use: >>> if not 'firefox' in [i.name() for i in list(process_iter())]: >>> >>> It probably could be mad

Re: Check for regular expression in a list

2017-05-26 Thread Rustom Mody
On Friday, May 26, 2017 at 5:55:32 PM UTC+5:30, Jussi Piitulainen wrote: > Rustom Mody writes: > > > On Friday, May 26, 2017 at 5:02:55 PM UTC+5:30, Cecil Westerhof wrote: > >> To check if Firefox is running I use: > >> if not 'firefox' in [i.name() for i in list(process_iter())]: > >> > >> I

Re: Check for regular expression in a list

2017-05-26 Thread Jussi Piitulainen
Jussi Piitulainen writes: > Or use a regex match if the condition becomes more complex. Even then, > there is re.match to attemp a match at the start of the string, which > helps to keep the expression simple. Soz: attemp' a match. -- https://mail.python.org/mailman/listinfo/python-list

Re: Check for regular expression in a list

2017-05-26 Thread Jussi Piitulainen
Rustom Mody writes: > On Friday, May 26, 2017 at 5:02:55 PM UTC+5:30, Cecil Westerhof wrote: >> To check if Firefox is running I use: >> if not 'firefox' in [i.name() for i in list(process_iter())]: >> >> It probably could be made more efficient, because it can stop when it >> finds the firs

Re: Check for regular expression in a list

2017-05-26 Thread Tim Chase
On 2017-05-26 13:29, Cecil Westerhof wrote: > To check if Firefox is running I use: > if not 'firefox' in [i.name() for i in list(process_iter())]: > > It probably could be made more efficient, because it can stop when > it finds the first instance. > > But know I switched to Debian and there

Re: Check for regular expression in a list

2017-05-26 Thread Rustom Mody
On Friday, May 26, 2017 at 5:02:55 PM UTC+5:30, Cecil Westerhof wrote: > To check if Firefox is running I use: > if not 'firefox' in [i.name() for i in list(process_iter())]: > > It probably could be made more efficient, because it can stop when it > finds the first instance. > > But know I s

Check for regular expression in a list

2017-05-26 Thread Cecil Westerhof
To check if Firefox is running I use: if not 'firefox' in [i.name() for i in list(process_iter())]: It probably could be made more efficient, because it can stop when it finds the first instance. But now I switched to Debian and there firefox is called firefox-esr. So I should use: re.se

Re: Regular expression query

2017-03-12 Thread Vlastimil Brom
elds have a double quoted > string as part of it (and that double quoted string can have commas). > Above string have only 6 fields. First is a, second is b and last is > f "5546,3434,345,34,34,5,34,543,7". > How can I split this string in its fields using regula

Re: Regular expression query

2017-03-12 Thread Tim Chase
part of it (and that double quoted string can have > commas). Above string have only 6 fields. First is a, second is > b and last is f "5546,3434,345,34,34,5,34,543,7". How can I > split this string in its fields using regular expression ? or even > if there is any ot

Re: Regular expression query

2017-03-12 Thread Jussi Piitulainen
ields have a double > quoted string as part of it (and that double quoted string can have > commas). Above string have only 6 fields. First is a, second is > b and last is f "5546,3434,345,34,34,5,34,543,7". How can I > split this string in its fields using regul

Re: Regular expression query

2017-03-12 Thread Larry Martell
t some of the fields have a double quoted > string as part of it (and that double quoted string can have commas). > Above string have only 6 fields. First is a, second is b and last is > f "5546,3434,345,34,34,5,34,543,7". > How can I split this string in its field

Regular expression query

2017-03-12 Thread rahulrasal
quoted string can have commas). Above string have only 6 fields. First is a, second is b and last is f "5546,3434,345,34,34,5,34,543,7". How can I split this string in its fields using regular expression ? or even if there is any other way to do this, please speak out. Thank
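A minimal sketch of one regex approach to this kind of split. The input line below is hypothetical (the original string is cut off in this snippet); it only keeps the stated properties: six fields, the first is a, the second is b, and the last is the quoted one shown above. Because the quoted part sits in the middle of a field rather than wrapping the whole field, the pattern treats a quoted run as a single unit inside a field.

```python
import re

# Hypothetical input; only the first, second and last fields come from the post.
line = 'a,b,c "1,2",d,e,f "5546,3434,345,34,34,5,34,543,7"'

# A field is a run of: any char that is not a comma or quote, OR a whole "..." chunk.
field_re = re.compile(r'(?:[^,"]|"[^"]*")+')
fields = field_re.findall(line)

print(len(fields), fields)
```

Note that findall() skips empty fields (two commas in a row), so this sketch is only suitable when every field is non-empty.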

Re: A regular expression question

2016-09-28 Thread Ben Finney
Since ‘n*’ matches zero or more ‘n’s, it matches zero adjacent to every actual character. It's non-greedy because it matches as few characters as will allow the match to succeed. > I think the repl argument should replaces every char in text and > outputs "". I hope that h

A regular expression question

2016-09-28 Thread Cpcp Cp
Look at this: >>> import re >>> text="asdfnbd]" >>> m=re.sub("n*?","?",text) >>> print m ?a?s?d?f?n?b?d?]? I don't understand the 'non-greedy' pattern. I think the repl argument should replace every char in text and output "". -- https://mail.python.org/mailman/listinfo/python-list

Re: Question about regular expression

2015-10-02 Thread Denis McMahon
On Wed, 30 Sep 2015 23:30:47 +, Denis McMahon wrote: > On Wed, 30 Sep 2015 11:34:04 -0700, massi_srb wrote: > >> firstly the description of my problem. I have a string in the following >> form: . > > The way I solved this was to: > > 1) replace all the punctuation in the string with spa

Re: Question about regular expression

2015-10-01 Thread gal kauffman
On Oct 2, 2015 12:35 AM, "Denis McMahon" wrote: > > On Thu, 01 Oct 2015 01:48:03 -0700, gal kauffman wrote: > > > items = s.replace(' (', '(').replace(', ',',').split() > > > > items_dict = dict() > > for item in items: > > if '(' not in item: > > item += '(0,0)' > > if ',' not in

Re: Question about regular expression

2015-10-01 Thread Denis McMahon
On Thu, 01 Oct 2015 15:53:38 +, Rob Gaddi wrote: > There's a quote for this. 'Some people, when confronted with a problem, > think “I know, I'll use regular expressions.” Now they have two > problems.' I actually used 2 regexes: wordpatt = re.compile('[a-zA-Z]+') numpatt = re.compile('[0-

Re: Question about regular expression

2015-10-01 Thread Denis McMahon
On Thu, 01 Oct 2015 01:48:03 -0700, gal kauffman wrote: > items = s.replace(' (', '(').replace(', ',',').split() > > items_dict = dict() > for item in items: > if '(' not in item: > item += '(0,0)' > if ',' not in item: > item = item.replace(')', ',0)') > > name, raw_

Re: Question about regular expression

2015-10-01 Thread Rob Gaddi
On Wed, 30 Sep 2015 11:34:04 -0700, massi_srb wrote: > Hi everyone, > > firstly the description of my problem. I have a string in the following > form: > > s = "name1 name2(1) name3 name4 (1, 4) name5(2) ..." > > that is a string made up of groups in the form 'name' (letters only) > plus possib

Re: Question about regular expression

2015-10-01 Thread gal kauffman
My example will give false positive if there is a space before a comma. Or anything else by the conventions in the original string. I tried to keep it as simple as I could. If you want to catch a wider range of values you can use *simple* regular expression to catch as much spaces as you want. On

Re: Question about regular expression

2015-10-01 Thread Tim Chase
On 2015-10-01 01:48, gal kauffman wrote: > items = s.replace(' (', '(').replace(', ',',').split() s = "name1 (1)" Your suggestion doesn't catch cases where more than one space can occur before the paren. -tkc -- https://mail.python.org/mailman/listinfo/python-list

Re: Question about regular expression

2015-10-01 Thread gal kauffman
items = s.replace(' (', '(').replace(', ',',').split() items_dict = dict() for item in items: if '(' not in item: item += '(0,0)' if ',' not in item: item = item.replace(')', ',0)') name, raw_data = item.split('(') data_tuple = tuple((int(v) for v in raw_data.repla

Re: Question about regular expression

2015-09-30 Thread Emile van Sebille
On 9/30/2015 12:20 PM, Tim Chase wrote: On 2015-09-30 11:34, massi_...@msn.com wrote: I guess this problem can be tackled with regular expressions, b ... However, if you *want* to do it with regular expressions, you can. It's ugly and might be fragile, but ##

Re: Question about regular expression

2015-09-30 Thread Denis McMahon
On Wed, 30 Sep 2015 11:34:04 -0700, massi_srb wrote: > firstly the description of my problem. I have a string in the following > form: . The way I solved this was to: 1) replace all the punctuation in the string with spaces 2) split the string on space 3) process each thing in the list to

Re: Question about regular expression

2015-09-30 Thread Tim Chase
On 2015-09-30 11:34, massi_...@msn.com wrote: > firstly the description of my problem. I have a string in the > following form: > > s = "name1 name2(1) name3 name4 (1, 4) name5(2) ..." > > that is a string made up of groups in the form 'name' (letters > only) plus possibly a tuple containing 1 or

Re: Question about regular expression

2015-09-30 Thread Joel Goldstick
On Wed, Sep 30, 2015 at 2:50 PM, Emile van Sebille wrote: > On 9/30/2015 11:34 AM, massi_...@msn.com wrote: > >> Hi everyone, >> >> firstly the description of my problem. I have a string in the following >> form: >> >> s = "name1 name2(1) name3 name4 (1, 4) name5(2) ..." >> >> that is a string ma

Re: Question about regular expression

2015-09-30 Thread Emile van Sebille
On 9/30/2015 11:34 AM, massi_...@msn.com wrote: Hi everyone, firstly the description of my problem. I have a string in the following form: s = "name1 name2(1) name3 name4 (1, 4) name5(2) ..." that is a string made up of groups in the form 'name' (letters only) plus possibly a tuple containing

Question about regular expression

2015-09-30 Thread massi_srb
Hi everyone, firstly the description of my problem. I have a string in the following form: s = "name1 name2(1) name3 name4 (1, 4) name5(2) ..." that is a string made up of groups in the form 'name' (letters only) plus possibly a tuple containing 1 or 2 integer values. Blanks can be placed betwe
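A minimal sketch of one way to parse such a string with a single regex. The desired output format is cut off in this snippet, so the sketch assumes a dict mapping each name to a 2-tuple with missing values filled with 0, the convention used in one of the replies in this thread. The pattern and variable names are illustrative.

```python
import re

s = "name1 name2(1) name3 name4 (1, 4) name5(2)"

# name, optional blanks, then an optional "(int)" or "(int, int)" group.
item_re = re.compile(r"(\w+)\s*(?:\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\))?")

result = {}
for m in item_re.finditer(s):
    name, first, second = m.groups()
    # Assumption: absent values default to 0.
    result[name] = (int(first or 0), int(second or 0))

print(result)
```

Because the blanks are handled inside the pattern, any number of spaces before the parenthesis or around the comma is accepted.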

Re: Regular expression and substitution, unexpected duplication

2015-08-19 Thread Laurent Pointal
MRAB wrote: > On 2015-08-18 22:42, Laurent Pointal wrote: >> Hello, >> ellipfind_re = re.compile(r"((?=\.\.\.)|…)", re.IGNORECASE|re.VERBOSE) >> ellipfind_re.sub(' ... ', >> "C'est un essai... avec différents caractères… pour voir.") > (?=...) is a lookahead; a non-capture group is (?:..
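To illustrate the lookahead point: `(?=...)` matches without consuming the dots, so the replacement is inserted and the original dots stay in place, which would explain the duplication. A sketch with the sentence from the post, comparing it with a non-capturing group; the names `lookahead_re`, `group_re`, `duplicated` and `fixed` are made up for the example.

```python
import re

text = "C'est un essai... avec diff\u00e9rents caract\u00e8res\u2026 pour voir."

# Lookahead: zero-width match in front of "...", the dots themselves remain.
lookahead_re = re.compile(r"(?=\.\.\.)|\u2026")
duplicated = lookahead_re.sub(" ... ", text)

# Non-capturing group: the dots (or the ellipsis character) are consumed.
group_re = re.compile(r"(?:\.\.\.|\u2026)")
fixed = group_re.sub(" ... ", text)

print(duplicated)
print(fixed)
```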

Re: Regular expression and substitution, unexpected duplication

2015-08-18 Thread MRAB
On 2015-08-18 22:42, Laurent Pointal wrote: Hello, I want to make a replacement in a string, to ensure that ellipsis are surrounded by spaces (this is not a typographycal problem, but a preparation for late text chunking). I tried with regular expressions and the SRE_Pattern.sub() method, but I

Regular expression and substitution, unexpected duplication

2015-08-18 Thread Laurent Pointal
Hello, I want to make a replacement in a string, to ensure that ellipses are surrounded by spaces (this is not a typographical problem, but a preparation for late text chunking). I tried with regular expressions and the SRE_Pattern.sub() method, but I have an unexpected duplication of the repl

Re: Regular Expression

2015-06-04 Thread Laura Creighton
In a message of Thu, 04 Jun 2015 06:36:29 -0700, Palpandi writes: >Hi All, > >This is the case. To split "string2" from "string1_string2" I am using >re.split('_', "string1_string2", 1) And you shouldn't be. The 3rd argument, 1 says stop after one match. >It is working fine for string "string1_

Re: Regular Expression

2015-06-04 Thread Peter Otten
Palpandi wrote: > This is the case. To split "string2" from "string1_string2" I am using > re.split('_', "string1_string2", 1)[1]. > > It is working fine for string "string1_string2" and output as "string2". > But actually the problem is that if a sting is "__string1_string2" and the > output is

Re: Regular Expression

2015-06-04 Thread Tim Chase
On 2015-06-04 06:36, Palpandi wrote: > This is the case. To split "string2" from "string1_string2" I am > using re.split('_', "string1_string2", 1)[1]. > > It is working fine for string "string1_string2" and output as > "string2". But actually the problem is that if a sting is > "__string1_string2

Re: Regular Expression

2015-06-04 Thread Steven D'Aprano
On Thu, 4 Jun 2015 11:36 pm, Palpandi wrote: > Hi All, > > This is the case. To split "string2" from "string1_string2" I am using > re.split('_', "string1_string2", 1)[1]. There is absolutely no need to use the nuclear-powered bulldozer of regular expressions to crack that tiny peanut. Strings
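A sketch of the string-method route hinted at here. The wanted result for the "__string1_string2" case is cut off in the snippets, so this assumes the goal is the part after the last underscore; the helper name `last_part` is hypothetical.

```python
import re

def last_part(s):
    # rpartition splits at the LAST "_" and returns (head, sep, tail).
    head, sep, tail = s.rpartition("_")
    return tail

# For comparison: splitting once at the FIRST "_" leaves the extra underscore behind.
first_split = re.split("_", "__string1_string2", 1)

print(last_part("string1_string2"), last_part("__string1_string2"), first_split)
```

If the part after the first non-leading underscore is wanted instead, stripping with `s.lstrip("_")` before splitting would be one option.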

Re: Regular Expression

2015-06-04 Thread Larry Martell
On Thu, Jun 4, 2015 at 9:36 AM, Palpandi wrote: > > Hi All, > > This is the case. To split "string2" from "string1_string2" I am using > re.split('_', "string1_string2", 1)[1]. > > It is working fine for string "string1_string2" and output as "string2". But > actually the problem is that if a sti

Regular Expression

2015-06-04 Thread Palpandi
Hi All, This is the case. To split "string2" from "string1_string2" I am using re.split('_', "string1_string2", 1)[1]. It is working fine for string "string1_string2" and output as "string2". But actually the problem is that if a string is "__string1_string2" and the output is "_string1_string2

Re: Help with Regular Expression

2015-05-19 Thread Tim Chase
On 2015-05-19 06:42, massi_...@msn.com wrote: > I succesfully wrote a regex in python in order to substitute all > the occurences in the form $"somechars" with another string. Here > it is: > > re.sub(ur"""(?u)(\$\"[^\"\\]*(?:\\.[^\"\\]*)*\")""", newstring, > string) The expression is a little mo

Help with Regular Expression

2015-05-19 Thread massi_srb
Hi everyone, I successfully wrote a regex in python in order to substitute all the occurrences in the form $"somechars" with another string. Here it is: re.sub(ur"""(?u)(\$\"[^\"\\]*(?:\\.[^\"\\]*)*\")""", newstring, string) Now I would need to exclude from the match all the strings in the form $"

Re: Regular Expression

2015-04-12 Thread Pippo
I fixed all! Thanks. This is the result: #C[Health] #P[Information] #ST[genetic information] #C[oral | (recorded in (any form | medium))] #C[Is created or received by] #A[health care provider | health plan | public health authority | employer | life insurer | school | university | or health car

Re: Regular Expression

2015-04-12 Thread Pippo
Sweet! Thanks all of you! I matched everything except these ones... trying to find the best way > whether #C[oral | (recorded in (any form | medium))], that #C[the past, present, or future physical | mental health | condition of an individual] | > #C[the past, present, or future payment fo

Re: Regular Expression

2015-04-12 Thread Pippo
> Put the print inside the "if"; you don't really care when result is None, and > anyway you can't access .group when it is None - it is not an 're.match" > object, because there was no match. Thanks Cameron, this worked. > > Once you're happy you should consider what happens when there is m

Re: Regular Expression

2015-04-12 Thread Cameron Simpson
On 12Apr2015 18:28, Pippo wrote: On Sunday, 12 April 2015 21:21:48 UTC-4, Cameron Simpson wrote: [...] Pippo, please take a moment to trim the less relevant verbiage from the quoted material; it makes replies easier to read because what is left can be taken to be more "on point". Thanks.

Re: Regular Expression

2015-04-12 Thread Pippo
On Sunday, 12 April 2015 21:21:48 UTC-4, Cameron Simpson wrote: > On 12Apr2015 17:55, Pippo wrote: > >On Sunday, 12 April 2015 20:46:19 UTC-4, Cameron Simpson wrote: > >> It looks like it should, unless you have mangled your regular expression. > [...] > >> Al

Re: Regular Expression

2015-04-12 Thread Cameron Simpson
On 12Apr2015 17:55, Pippo wrote: On Sunday, 12 April 2015 20:46:19 UTC-4, Cameron Simpson wrote: It looks like it should, unless you have mangled your regular expression. [...] Also note that you can print the regexp's .pattern attribute: print(constraint.pattern) as a check that

Re: Regular Expression

2015-04-12 Thread Pippo
means > >> what > >> you think. > >> > >> You're getting None because the regexp fails to match. > >> > >> >> Try printing each string you're trying to match using 'repr', i.e.: > >> >> print(r

Re: Regular Expression

2015-04-12 Thread MRAB
ok like they should match? > print(repr(content[j])) gives me the following: > >[None] >'#D{#C[Health] #P[Information] - \n' [...] >shouldn't it match "#C[Health]" in the first row? It looks like it should, unless you have mangled your regular expression. You ment

Re: Regular Expression

2015-04-12 Thread MRAB
On 2015-04-13 01:25, Pippo wrote: On Sunday, 12 April 2015 20:06:08 UTC-4, MRAB wrote: On 2015-04-13 00:47, Pippo wrote: > On Sunday, 12 April 2015 19:44:05 UTC-4, Pippo wrote: >> On Sunday, 12 April 2015 19:28:44 UTC-4, MRAB wrote: >> > On 2015-04-12 23:49, Pippo wrote: >> > > I have a text

Re: Regular Expression

2015-04-12 Thread Pippo
print(repr(content[j])) > >> > >> Do any look like they should match? > > print(repr(content[j])) gives me the following: > > > >[None] > >'#D{#C[Health] #P[Information] - \n' > [...] > >shouldn't it match "#C[H
