On 2/28/2023 12:57 PM, Jen Kris via Python-list wrote:
The code I sent is correct, and it runs here.  Maybe you received it with a 
carriage return removed, but on my copy after posting, it is correct:

import re

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
find_string = re.escape('abc_degree + 1')
for match in re.finditer(find_string, example):
    print(match.start(), match.end())
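
For anyone following along, here is a small sketch of why the re.escape() call matters in this case. Unescaped, the '+' in the search string acts as a regex quantifier, so the pattern no longer matches the literal text. The helper name find_spans is made up for illustration and is not from the thread.

```python
import re

def find_spans(pattern, text):
    """Return (start, end) offsets for every match of pattern in text."""
    return [(m.start(), m.end()) for m in re.finditer(pattern, text)]

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
target = 'abc_degree + 1'

# Unescaped: ' +' means "one or more spaces", so the literal '+' never matches.
unescaped_spans = find_spans(target, example)

# Escaped: every character in target is matched literally.
escaped_spans = find_spans(re.escape(target), example)

print(unescaped_spans)
print(escaped_spans)
```

The escaped version should report the same offsets quoted elsewhere in this thread, while the unescaped one finds nothing.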

One question:  several people have made suggestions other than regex (not your 
terser example with regex that you showed below).  Is there a reason why regex is not 
preferred to, for example, a list comp?  Performance?  Reliability?

"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."

- https://blog.codinghorror.com/regular-expressions-now-you-have-two-problems/

Of course, if you actually read the blog post in the link, there's more to it than that...


Feb 27, 2023, 18:16 by avi.e.gr...@gmail.com:

Jen,

Can you see what SOME OF US see as ASCII text? We can help you better if we get 
code that can be copied and run as-is.

What you sent is not terse. It is wrong. It will not run on any Python 
interpreter because you somehow lost a carriage return and indent.

This is what you sent:

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
find_string = re.escape('abc_degree + 1') for match in re.finditer(find_string, 
example):
  print(match.start(), match.end())

This is the code indented properly:

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
find_string = re.escape('abc_degree + 1')
for match in re.finditer(find_string, example):
  print(match.start(), match.end())

Of course I am sure you wrote and ran code more like the latter version but 
somewhere in your copy/paste process, ....

And, just for fun, since there is nothing wrong with your code, this minor 
change is terser:

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
for match in re.finditer(re.escape('abc_degree + 1'), example):
    print(match.start(), match.end())

4 18
26 40

But note that once you use regular expressions (though not in your case), a single pattern can 
match many different strings. For example, a pattern for a repeated word, in any case, matches 
both "and and" and "so so", and a pattern for words with several doubled letters matches the 
stereotypical "bookkeeper". In those cases you may want more than the offsets: you may also want 
to show the exact text that matched, or even some characters before and/or after for context.
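
As a rough sketch of that idea, the following reports each repeated word along with the matched text and a few characters of context on either side. The pattern, the sample sentence, the function name show_matches and the context width are all assumptions chosen for illustration.

```python
import re

# A word, some whitespace, then the same word again (compared ignoring case).
REPEATED_WORD = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)

def show_matches(pattern, text, context=5):
    """Return (start, end, matched_text, before, after) for each match."""
    results = []
    for m in pattern.finditer(text):
        before = text[max(0, m.start() - context):m.start()]
        after = text[m.end():m.end() + context]
        results.append((m.start(), m.end(), m.group(0), before, after))
    return results

sample = "this and and that So so it goes"
repeated = show_matches(REPEATED_WORD, sample)
for start, end, matched, before, after in repeated:
    print(start, end, repr(matched), repr(before), repr(after))

# Doubled letters: findall returns the captured letter of each pair.
doubled = re.findall(r"(\w)\1", "bookkeeper")
print(doubled)
```

Because the matched text varies from hit to hit, m.group(0) carries information here that the bare offsets do not.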


-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail....@python.org> On 
Behalf Of Jen Kris via Python-list
Sent: Monday, February 27, 2023 8:36 PM
To: Cameron Simpson <c...@cskk.id.au>
Cc: Python List <python-list@python.org>
Subject: Re: How to escape strings for re.finditer?


I haven't tested it either but it looks like it would work.  But for this case 
I prefer the relative simplicity of:

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
find_string = re.escape('abc_degree + 1') for match in re.finditer(find_string, 
example):
  print(match.start(), match.end())

4 18
26 40

I don't insist on terseness for its own sake, but it's cleaner this way.

Jen


Feb 27, 2023, 16:55 by c...@cskk.id.au:

On 28Feb2023 01:13, Jen Kris <jenk...@tutanota.com> wrote:

I went to the re module because the specified string may appear more than once 
in the string (in the code I'm writing).


Sure, but writing a `finditer` for plain `str` is pretty easy (untested):

pos = 0
while True:
    found = s.find(substring, pos)
    if found < 0:
        break
    start = found
    end = found + len(substring)
    # ... do whatever with start and end ...
    pos = end
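
One way to package that loop, as a sketch only, is a generator that yields the same (start, end) pairs that match.start() and match.end() give in the regex version. The name find_all and the choice to reject an empty search string are assumptions, not something from the thread.

```python
def find_all(s, substring):
    """Yield (start, end) for each non-overlapping occurrence of substring in s."""
    if not substring:
        # Assumption: treat an empty search string as an error, since the
        # loop below would otherwise never advance past position 0.
        raise ValueError("substring must not be empty")
    pos = 0
    while True:
        found = s.find(substring, pos)
        if found < 0:
            return
        end = found + len(substring)
        yield found, end
        pos = end  # continue searching after this occurrence

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
spans = list(find_all(example, 'abc_degree + 1'))
print(spans)
```

No escaping is needed here because str.find() always treats its argument as literal text.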

Many people go straight to the `re` module whenever they're looking for 
strings. It is often cryptic, error-prone overkill. Just something to keep in 
mind.

Cheers,
Cameron Simpson <c...@cskk.id.au>
--
https://mail.python.org/mailman/listinfo/python-list

