Re: Using dictionary to hold regex patterns?
André wrote:
> (snip)
> you don't need to use pattern.items()...
>
> Here is something I use (straight cut-and-paste):
>
>     def parse_single_line(self, line):
>         '''Parses a given line to see if it matches a known pattern'''
>         for name in self.patterns:
>             result = self.patterns[name].match(line)

FWIW, this is more expensive than iterating over (key, value) tuples using dict.items(), since you have one extra call to dict.__getitem__ per entry.

>             if result is not None:
>                 return name, result.groups()
>         return None, line
>
> where self.patterns is something like
>
>     self.patterns = {
>         'pattern1': re.compile(...),
>         'pattern2': re.compile(...)
>     }
>
> The one potential problem with the method as I wrote it is that sometimes a more generic pattern gets matched first whereas a more specific pattern may be desired.

As usual when order matters, the solution is to use a list of (name, whatever) tuples instead of a dict. You can still build a dict from this list when needed (the dict initializer accepts a list of (name, object) as argument).

-- http://mail.python.org/mailman/listinfo/python-list
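The ordered-list approach described above might look roughly like the following sketch, written in Python 3 syntax. The pattern names ("float", "integer"), the regexes, and the free function `parse_single_line` are illustrative assumptions, not code from the thread.

```python
import re

# Hypothetical patterns: the more specific one is listed first on purpose,
# so that it wins over the more generic one.
PATTERNS = [
    ("float", re.compile(r"\d+\.\d+")),
    ("integer", re.compile(r"\d+")),
]

def parse_single_line(line, patterns=PATTERNS):
    """Return (name, groups) for the first pattern matching at the start of line."""
    for name, regex in patterns:  # each item unpacks directly, no dict lookup
        result = regex.match(line)
        if result is not None:
            return name, result.groups()
    return None, line

# The same list can still be turned into a dict when lookup by name is needed.
by_name = dict(PATTERNS)
```

Dicts keep insertion order as of Python 3.7, which was not the case when this thread was written; a list of tuples still states more directly that the order is part of the requirement.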
Re: Using dictionary to hold regex patterns?
John Machin wrote:
> No, "complicated" is more related to unused features. In the case of using an aeroplane to transport 3 passengers 10 km along the autobahn, you aren't using the radar, wheel-retractability, wings, pressurised cabin, etc. In your original notion of using a dict in your lexer, you weren't using the mapping functionality of a dict at all. In both cases you have perplexed bystanders asking "Why use a plane/dict when a car/list will do the job?".

Now the matter is getting clearer in my head.

Thanks and greetings,
Thomas

-- 
Just because many people are wrong doesn't make them right! (Coluche)
Re: Using dictionary to hold regex patterns?
Dennis Lee Bieber wrote:
>> Is [ ( name, regex ), ... ] really simpler than { name: regex, ... }? Intuitively, I would consider the dictionary to be the simpler structure.
>
> Why, when you aren't /using/ the name to retrieve the expression...

So as soon as I start retrieving a regex by its name, the dict will be the most suitable structure?

Greetings,
Thomas
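For contrast, here is a small sketch of the case raised in the question above, where a regex really is retrieved by its name. The names, regexes, and the helper `matches` are hypothetical and only serve the illustration.

```python
import re

# Hypothetical token names and regexes, for illustration only.
regex_by_name = {
    "number": re.compile(r"\d+"),
    "word": re.compile(r"[A-Za-z]+"),
}

def matches(name, text):
    """Look up one regex by its name and test it against the start of text."""
    regex = regex_by_name[name]  # the mapping feature that a plain list lacks
    return regex.match(text) is not None
```

Here the dict's mapping functionality is actually used, so the requirement and the data structure line up.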
Re: Using dictionary to hold regex patterns?
John Machin wrote:
> Rephrasing for clarity: Don't use a data structure that is more complicated than that indicated by your requirements.

Could you please define "complicated" in this context? In terms of characters to type and reading, the dict is surely simpler. But I suppose that under the hood, it is less work for Python to deal with a list of tuples than a dict?

> Judging which of two structures is simpler should not be independent of those requirements. I don't see a role for intuition in this process.

Maybe I should have said "upon first sight" / "judging from the outer appearance" instead of "intuition".

> Please see my belated response in your "My first Python program -- a lexer" thread.

(See my answer there.)

I think I should definitely read up a bit on the implementation details of those data structures in Python. (As it was suggested earlier in my lexer thread.)

Greetings,
Thomas
Re: Using dictionary to hold regex patterns?
On Nov 25, 4:38 am, Thomas Mlynarczyk wrote:
> John Machin wrote:
>> Rephrasing for clarity: Don't use a data structure that is more complicated than that indicated by your requirements.
>
> Could you please define "complicated" in this context? In terms of characters to type and reading, the dict is surely simpler. But I suppose that under the hood, it is less work for Python to deal with a list of tuples than a dict?

The two extra parentheses per item are a trivial cosmetic factor, and only when the data is hard-coded; they don't exist if the data is read from a file. In other words, they have nothing to do with "complicated". The amount of work done by Python under the hood is relevant only to a speed/memory requirement.

No, "complicated" is more related to unused features. In the case of using an aeroplane to transport 3 passengers 10 km along the autobahn, you aren't using the radar, wheel-retractability, wings, pressurised cabin, etc. In your original notion of using a dict in your lexer, you weren't using the mapping functionality of a dict at all. In both cases you have perplexed bystanders asking "Why use a plane/dict when a car/list will do the job?".

>> Judging which of two structures is simpler should not be independent of those requirements. I don't see a role for intuition in this process.
>
> Maybe I should have said "upon first sight" / "judging from the outer appearance" instead of "intuition".

I don't see a role for "upon first sight" or "judging from the outer appearance" either.
Re: Using dictionary to hold regex patterns?
John Machin wrote:
> On Nov 25, 4:38 am, Thomas Mlynarczyk wrote:
[...]
>>> Judging which of two structures is simpler should not be independent of those requirements. I don't see a role for intuition in this process.
>>
>> Maybe I should have said "upon first sight" / "judging from the outer appearance" instead of "intuition".
>
> I don't see a role for "upon first sight" or "judging from the outer appearance" either.

They are all potentially (inadequate) substitutes for the knowledge and experience you bring to the problem.

regards
 Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC      http://www.holdenweb.com/
Re: Using dictionary to hold regex patterns?
Gilles Ganault writes:
> Hello
>
> After downloading a web page, I need to search for several patterns, and if found, extract information and put them into a database.
>
> To avoid a bunch of "if m", I figured maybe I could use a dictionary to hold the patterns, and loop through it:
>
> ==
> pattern = {}
> pattern["pattern1"] = ".+?/td.+?(.+?)/td"

    pattern["pattern1"] = re.compile(".+?/td.+?(.+?)/td")

> for key,value in pattern.items():
>     response = "whatever/td.+?Blababla/td"
>
>     #AttributeError: 'str' object has no attribute 'search'
>     m = key.search(response)

    m = value.search(response)

>     if m:
>         print key + "#" + value
> ==
>
> Is there a way to use a dictionary this way, or am I stuck with copy/pasting blocks of "if m:"?

But there is no reason why you should use a dictionary; just use a list of key-value pairs:

    patterns = [
        ("pattern1", re.compile(".+?/td.+?(.+?)/td")),
        ("pattern2", re.compile("something else")),
    ]

    for name, pattern in patterns:
        ...

-- 
Arnaud
Re: Using dictionary to hold regex patterns?
Gilles Ganault wrote:
> Hello
>
> After downloading a web page, I need to search for several patterns, and if found, extract information and put them into a database.
>
> To avoid a bunch of "if m", I figured maybe I could use a dictionary to hold the patterns, and loop through it:

Good idea.

    import re

> pattern = {}
> pattern["pattern1"] = ".+?/td.+?(.+?)/td"

    ... = re.compile(...)

> for key,value in pattern.items():

    for name, regex in ...

>     response = "whatever/td.+?Blababla/td"
>
>     #AttributeError: 'str' object has no attribute 'search'

Correct, only compiled re patterns have search; better naming would make the error obvious.

>     m = key.search(response)

    m = regex.search(response)

>     if m:
>         print key + "#" + value

    print name + '#' + regex
Re: Using dictionary to hold regex patterns?
On Sun, 23 Nov 2008 17:55:48, Arnaud Delobelle wrote:
> But there is no reason why you should use a dictionary; just use a list of key-value pairs:
>
>     patterns = [
>         ("pattern1", re.compile(".+?/td.+?(.+?)/td")),

Thanks for the tip, but... I thought that lists could only use integer indexes, while text indexes had to use dictionaries. In which case do we need dictionaries, then?
Re: Using dictionary to hold regex patterns?
2008/11/23 Gilles Ganault
> Hello
>
> After downloading a web page, I need to search for several patterns, and if found, extract information and put them into a database.
>
> To avoid a bunch of "if m", I figured maybe I could use a dictionary to hold the patterns, and loop through it:
>
> ==
> pattern = {}
> pattern["pattern1"] = ".+?/td.+?(.+?)/td"
>
> for key,value in pattern.items():
>     response = "whatever/td.+?Blababla/td"
>
>     #AttributeError: 'str' object has no attribute 'search'
>     m = key.search(response)
>     if m:
>         print key + "#" + value
> ==
>
> Is there a way to use a dictionary this way, or am I stuck with copy/pasting blocks of "if m:"?
>
> Thank you.

I'm not quite sure whether I understand correctly what should be achieved, but it seems that you should do the searches on the dict values instead of the keys if you want to access the re patterns:

    m = re.search(re_pattern_value, text_to_search_in)
    if m:
        print key + "#" + m.group()
    ...

In case there could be multiple matches, findall or finditer would probably be more suitable than search.

But after all, regexes aren't very efficient for dealing with HTML unless you know quite exactly what structure you can expect; probably e.g. BeautifulSoup could be used.

hth,
Vlasta
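To illustrate the remark about multiple matches, here is a minimal sketch in Python 3 comparing search, finditer and findall. The pattern and the sample text are made up for the example and are not the ones from the original post.

```python
import re

# Hypothetical pattern and input text.
cell = re.compile(r"<td>(.+?)</td>")
response = "<td>first</td><td>second</td>"

# search() stops at the first match.
first = cell.search(response).group(1)

# finditer() yields one match object per non-overlapping match.
all_cells = [m.group(1) for m in cell.finditer(response)]

# findall() returns the captured group of each match directly.
same_cells = cell.findall(response)
```

finditer is the better fit when the match positions or several groups are needed; findall is enough when only the captured text matters.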
Re: Using dictionary to hold regex patterns?
On Sun, Nov 23, 2008 at 2:55 PM, Gilles Ganault wrote:
> On Sun, 23 Nov 2008 17:55:48, Arnaud Delobelle wrote:
>> But there is no reason why you should use a dictionary; just use a list of key-value pairs:
>>
>>     patterns = [
>>         ("pattern1", re.compile(".+?/td.+?(.+?)/td")),
>
> Thanks for the tip, but... I thought that lists could only use integer indexes, while text indexes had to use dictionaries. In which case do we need dictionaries, then?

Lists do use integer indexes. Since you never use the dict[key] syntax, you don't need key-value pairs like that. Instead, the example uses two-item tuples.

    >>> patterns = [("pattern1", re.compile(".+?/td.+?(.+?)/td")),
    ...             ("pattern2", re.compile("something else"))]
    >>> patterns[0]
    ('pattern1', <_sre.SRE_Pattern object at 0x3c7a0>)
    >>> for pattern, regex in patterns:
    ...     print pattern + ":" + str(regex)
    ...
    pattern1:<_sre.SRE_Pattern object at 0x3c7a0>
    pattern2:<_sre.SRE_Pattern object at 0x35860>
Re: Using dictionary to hold regex patterns?
On Nov 24, 6:55 am, Gilles Ganault wrote:
> On Sun, 23 Nov 2008 17:55:48, Arnaud Delobelle wrote:
>> But there is no reason why you should use a dictionary; just use a list of key-value pairs:
>>
>>     patterns = [
>>         ("pattern1", re.compile(".+?/td.+?(.+?)/td")),
>
> Thanks for the tip, but... I thought that lists could only use integer indexes, while text indexes had to use dictionaries. In which case do we need dictionaries, then?

You don't have a requirement for indexing -- neither a text index nor an integer index. Your requirement is met by a sequence of (name, regex) pairs. Yes, a list is a sequence, and a list has integer indexes, but this is irrelevant.

General tip: Don't use a data structure that is more complicated than what you need.
Re: Using dictionary to hold regex patterns?
On Nov 24, 5:36 am, Terry Reedy wrote:
> Gilles Ganault wrote:
>> Hello
>>
>> After downloading a web page, I need to search for several patterns, and if found, extract information and put them into a database.
>>
>> To avoid a bunch of "if m", I figured maybe I could use a dictionary to hold the patterns, and loop through it:
>
> Good idea.
>
>     import re
>
>> pattern = {}
>> pattern["pattern1"] = ".+?/td.+?(.+?)/td"
>
>     ... = re.compile(...)
>
>> for key,value in pattern.items():
>
>     for name, regex in ...
>
>>     response = "whatever/td.+?Blababla/td"
>>
>>     #AttributeError: 'str' object has no attribute 'search'
>
> Correct, only compiled re patterns have search; better naming would make the error obvious.
>
>>     m = key.search(response)
>
>     m = regex.search(response)
>
>>     if m:
>>         print key + "#" + value
>
>     print name + '#' + regex

Perhaps you meant:

    print key + "#" + regex.pattern
Re: Using dictionary to hold regex patterns?
John Machin wrote:
> General tip: Don't use a data structure that is more complicated than what you need.

Is [ ( name, regex ), ... ] really simpler than { name: regex, ... }? Intuitively, I would consider the dictionary to be the simpler structure.

Greetings,
Thomas
Re: Using dictionary to hold regex patterns?
On Nov 23, 1:40 pm, Gilles Ganault wrote:
> Hello
>
> After downloading a web page, I need to search for several patterns, and if found, extract information and put them into a database.
>
> To avoid a bunch of "if m", I figured maybe I could use a dictionary to hold the patterns, and loop through it:
>
> ==
> pattern = {}
> pattern["pattern1"] = ".+?/td.+?(.+?)/td"
>
> for key,value in pattern.items():
>     response = "whatever/td.+?Blababla/td"
>
>     #AttributeError: 'str' object has no attribute 'search'
>     m = key.search(response)
>     if m:
>         print key + "#" + value
> ==
>
> Is there a way to use a dictionary this way, or am I stuck with copy/pasting blocks of "if m:"?
>
> Thank you.

Yes it is possible and you don't need to use pattern.items()...

Here is something I use (straight cut-and-paste):

    def parse_single_line(self, line):
        '''Parses a given line to see if it matches a known pattern'''
        for name in self.patterns:
            result = self.patterns[name].match(line)
            if result is not None:
                return name, result.groups()
        return None, line

where self.patterns is something like

    self.patterns = {
        'pattern1': re.compile(...),
        'pattern2': re.compile(...)
    }

The one potential problem with the method as I wrote it is that sometimes a more generic pattern gets matched first whereas a more specific pattern may be desired.

André
Re: Using dictionary to hold regex patterns?
On Nov 24, 7:48 am, John Machin wrote:
> On Nov 24, 5:36 am, Terry Reedy wrote:
>>     print name + '#' + regex
>
> Perhaps you meant:
>
>     print key + "#" + regex.pattern

I definitely meant:

    print name + '#' + regex.pattern
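The correction above comes down to the difference between a compiled pattern object and its source string. A small sketch in Python 3 syntax, with a made-up name and regex:

```python
import re

name = "pattern1"  # hypothetical name
regex = re.compile(r"(\d+)")

# Concatenating a str with a compiled pattern object raises TypeError.
try:
    label = name + "#" + regex
except TypeError:
    label = None

# The .pattern attribute holds the string the regex was compiled from.
fixed = name + "#" + regex.pattern
```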
Re: Using dictionary to hold regex patterns?
On Nov 24, 7:49 am, Thomas Mlynarczyk wrote:
> John Machin wrote:
>> General tip: Don't use a data structure that is more complicated than what you need.
>
> Is [ ( name, regex ), ... ] really simpler than { name: regex, ... }? Intuitively, I would consider the dictionary to be the simpler structure.

Hi Thomas,

Rephrasing for clarity: Don't use a data structure that is more complicated than that indicated by your requirements.

Judging which of two structures is simpler should not be independent of those requirements. I don't see a role for intuition in this process.

Please see my belated response in your "My first Python program -- a lexer" thread.

Cheers,
John
Re: Using dictionary to hold regex patterns?
On Sun, 23 Nov 2008 17:55:48, Arnaud Delobelle wrote:
> But there is no reason why you should use a dictionary; just use a list of key-value pairs:

Thanks for the tip. I didn't know it was possible to use arrays to hold more than one value. Actually, it's a better solution, as key/value tuples in a dictionary aren't used in the order in which they're put in the dictionary, while arrays are.

For those interested:

    response = "dummy/tdblagood stuff/td"
    for name, pattern in patterns:
        m = pattern.search(response)
        if m:
            print m.group(1)
            break
        else:
            print "here"

Thanks guys.
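The loop above stops at the first pattern that matches. If the aim is to react once when no pattern matched at all, Python's for/else is one possible way to write it. The sketch below uses Python 3 syntax; the two patterns and the helper `first_capture` are assumptions for illustration, not the poster's code.

```python
import re

# Hypothetical patterns, tried in list order.
patterns = [
    ("pattern1", re.compile(r"<b>(.+?)</b>")),
    ("pattern2", re.compile(r"<td>(.+?)</td>")),
]

def first_capture(response):
    """Return group(1) of the first pattern that matches, or None."""
    for name, pattern in patterns:
        m = pattern.search(response)
        if m:
            found = m.group(1)
            break
    else:
        # Runs only when the loop finished without hitting break.
        found = None
    return found
```

An else attached to the if would instead run once for every pattern that fails to match, which is a different behaviour.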
Re: Using dictionary to hold regex patterns?
Gilles Ganault wrote:
> On Sun, 23 Nov 2008 17:55:48, Arnaud Delobelle wrote:
>> But there is no reason why you should use a dictionary; just use a list of key-value pairs:
>
> Thanks for the tip. I didn't know it was possible to use arrays to hold more than one value. Actually, it's a better solution, as key/value tuples in a dictionary aren't used in the order in which they're put in the dictionary, while arrays are.
[snip]

A list is an ordered collection of items. Each item can be anything: a string, an integer, a dictionary, a tuple, a list...
Re: Using dictionary to hold regex patterns?
On Sun, 23 Nov 2008 23:18:06, MRAB wrote:
> A list is an ordered collection of items. Each item can be anything: a string, an integer, a dictionary, a tuple, a list...

Yup, learned something new today. Naively, I thought a list was index=value, where value=a single piece of data. Works like a charm. Thanks.
Re: Using dictionary to hold regex patterns?
On Mon, 24 Nov 2008 00:46:42 +0100, Gilles Ganault wrote:
> On Sun, 23 Nov 2008 23:18:06, MRAB wrote:
>> A list is an ordered collection of items. Each item can be anything: a string, an integer, a dictionary, a tuple, a list...
>
> Yup, learned something new today. Naively, I thought a list was index=value, where value=a single piece of data.

Your thought was correct, each value is a single piece of data: *one* tuple.

Ciao,
Marc 'BlackJack' Rintsch
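The point that each list item is one tuple can be seen in a short sketch (Python 3 syntax; the names and regexes are placeholders):

```python
import re

patterns = [("pattern1", re.compile("a")), ("pattern2", re.compile("b"))]

# Indexing returns a single value: one two-item tuple.
item = patterns[0]

# Unpacking splits that one tuple into its two parts.
name, regex = item

# A for clause with two targets performs the same unpacking per item.
names = [n for n, _ in patterns]
```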