Re: How do you print a string after it's been searched for an RE?
On Thu, Jun 23, 2011 at 1:58 PM, John Salerno johnj...@gmail.com wrote:
> After I've run the re.search function on a string and no match was
> found, how can I access that string? When I try to print it directly,
> it's an empty string, I assume because it has been consumed. How do I
> prevent this?

This has nothing to do with regular expressions. It would appear that
page.read() is letting you read the response body multiple times in
2.x but not in 3.x, probably due to a change in buffering. Just store
the string in a variable and avoid calling page.read() multiple times.

--
http://mail.python.org/mailman/listinfo/python-list
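A minimal sketch of the "store the string in a variable" advice above. To keep it runnable without network access, io.BytesIO stands in for the object returned by urllib.request.urlopen(); that stand-in, the sample body text, and the helper name search_once are assumptions for illustration, not part of the original code.

```python
import io
import re

pattern = re.compile(r'[0-9]+')


def search_once(response):
    """Read the body exactly once and return (match_obj, text)."""
    text = response.read().decode()  # keep the decoded body in a variable
    return pattern.search(text), text


# Assumption: io.BytesIO plays the role of the HTTP response object here.
fake_page = io.BytesIO(b"Yes. Divide by two and keep going.")

match_obj, text = search_once(fake_page)
if match_obj:
    print(match_obj.group())
else:
    print(text)  # the body is still available because it was stored

# A second read() on the same stream has nothing left to return.
leftover = fake_page.read()
```

The same pattern should carry over to the loop in the original post: call page.read().decode() once per request, bind the result to a name, and use that name for both the search and the print.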
Re: How do you print a string after it's been searched for an RE?
On Jun 23, 3:47 pm, Ian Kelly ian.g.ke...@gmail.com wrote:
> On Thu, Jun 23, 2011 at 1:58 PM, John Salerno johnj...@gmail.com wrote:
>> After I've run the re.search function on a string and no match was
>> found, how can I access that string? When I try to print it directly,
>> it's an empty string, I assume because it has been consumed. How do I
>> prevent this?
>
> This has nothing to do with regular expressions. It would appear that
> page.read() is letting you read the response body multiple times in
> 2.x but not in 3.x, probably due to a change in buffering. Just store
> the string in a variable and avoid calling page.read() multiple times.

Thank you. That worked, and as a result I think my code will look
cleaner.
Re: How do you print a string after it's been searched for an RE?
There is also

    print(match_obj.string)

which gives you a copy of the string searched. See end of section
6.2.5. Match Objects

At 02:58 PM 6/23/2011, John Salerno wrote:
> After I've run the re.search function on a string and no match was
> found, how can I access that string? When I try to print it directly,
> it's an empty string, I assume because it has been consumed. How do I
> prevent this?
>
> It seems to work fine for this 2.x code:
>
>     import urllib.request
>     import re
>
>     next_nothing = '12345'
>     pc_url = 'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing='
>     pattern = re.compile(r'[0-9]+')
>
>     while True:
>         page = urllib.request.urlopen(pc_url + next_nothing)
>         match_obj = pattern.search(page.read().decode())
>         if match_obj:
>             next_nothing = match_obj.group()
>             print(next_nothing)
>         else:
>             print(page.read().decode())
>             break
>
> But when I try it with my own code (3.2), it won't print the text of
> the page:
>
>     import urllib.request
>     import re
>
>     next_nothing = '12345'
>     pc_url = 'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing='
>     pattern = re.compile(r'[0-9]+')
>
>     while True:
>         page = urllib.request.urlopen(pc_url + next_nothing)
>         match_obj = pattern.search(page.read().decode())
>         if match_obj:
>             next_nothing = match_obj.group()
>             print(next_nothing)
>         else:
>             print(page.read().decode())
>             break
>
> P.S. I plan to clean up my code, I know it's not great right now. But
> my immediate goal is to just figure out why the 2.x code can print
> text, but my own code can't print page, which are basically the same
> thing, unless something significant has changed with either the
> urllib.request module, or the way it's decoded, or something, or is it
> just an RE issue?
>
> Thanks.
Re: How do you print a string after it's been searched for an RE?
On Jun 23, 4:47 pm, Thomas L. Shinnick tshin...@prismnet.com wrote:
> There is also
>     print(match_obj.string)
> which gives you a copy of the string searched. See end of section
> 6.2.5. Match Objects

I tried that, but the only time I wanted the string printed was when
there *wasn't* a match, so the match object was a NoneType.
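A short sketch of the point made in this exchange: the .string attribute exists only on a successful match, because a failed search returns None. The sample strings and variable names below are made up for illustration.

```python
import re

pattern = re.compile(r'[0-9]+')

text_with_digits = 'and the next nothing is 44827'
text_without_digits = 'no digits here'

hit = pattern.search(text_with_digits)
miss = pattern.search(text_without_digits)

# On a match, .string is the string that was passed to search().
print(hit.string)
print(hit.group())

# On a failed search the result is None, so there is no .string to read;
# the searched text has to come from a variable that was kept separately.
if miss is None:
    print(text_without_digits)
```

This is why keeping the page text in its own variable is the more general fix: it works whether or not the pattern matches.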