Re: pattern block expression matching
Thank you all for thoughtful excellent updates! Aldi -- https://mail.python.org/mailman/listinfo/python-list
Re: pattern block expression matching
On Sat, 21 Jul 2018 17:37:00 +0100, MRAB wrote:

> On 2018-07-21 15:20, aldi.kr...@gmail.com wrote:
>> Hi,
>> I have a long text, which tells me which files from a database were
>> downloaded and which ones failed. The pattern is as follows (at the end of
>> this post). Wrote a tiny program, but still is raw. I want to find term
>> "ERROR" and go 5 lines above and get the name with suffix XPT, in this first
>> case DRXIFF_F.XPT, but it changes in other cases to some other name with
>> suffix XPT. Thanks, Aldi
>>
>> # reading errors from a file txt
>> import re
>> with open('nohup.out', 'r') as fh:
>>     lines = fh.readlines()
>>     for line in lines:
>>         m1 = re.search("XPT", line)
>>         m2 = re.search('ERROR', line)
>>         if m1:
>>             print(line)
>>         if m2:
>>             print(line)
>>
> Firstly, you don't need regex for something as simple as checking for
> the presence of a string.
>
> Secondly, I think it's 4 lines above, not 5.
>
> 'enumerate' comes in useful here:
>
> with open('nohup.out', 'r') as fh:
>     lines = fh.readlines()
>     for i, line in enumerate(lines):
>         if 'ERROR' in line:
>             print(line)
>             print(lines[i - 4])

Where's awk when you need it?

import fileinput
for line in fileinput.input('nohup.out'):
    if 'XPT' in line:
        line_containing_filename = line
    if 'ERROR' in line:
        print(line_containing_filename)

I think Aldi's original approach is pretty good.
-- https://mail.python.org/mailman/listinfo/python-list
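A minimal sketch of the "remember the last XPT line, report it on ERROR" idea from this reply. The sample text is reconstructed from the log quoted in the thread, and the helper name failed_files is made up for illustration; a real run would iterate over the lines of nohup.out instead of a string.

```python
# Sample wget-style log, reconstructed from the thread (an assumption about
# the exact line layout); two requests, the first one fails.
SAMPLE_LOG = """\
--2018-07-14 21:26:45-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXIFF_F.XPT
Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-07-14 21:26:46 ERROR 404: Not Found.

--2018-07-14 21:26:47-- https://wwwn.cdc.gov/Nchs/Nhanes/1999-2000/DSII.XPT
Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 56060880 (53M) [application/octet-stream]
Saving to: 'DSII.XPT'
"""


def failed_files(lines):
    """Yield the .XPT file name of every request that ended in ERROR."""
    last_url = None
    for line in lines:
        # A request header line starts with "--" and ends with the URL.
        if line.startswith("--") and ".XPT" in line:
            last_url = line.split()[-1]
        elif "ERROR" in line and last_url is not None:
            yield last_url.rsplit("/", 1)[-1]
            last_url = None


failed = list(failed_files(SAMPLE_LOG.splitlines()))
print(failed)
```

Because only the most recent URL is kept, this does not depend on the error sitting exactly four lines below the request line.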
Re: pattern block expression matching
MRAB wrote:

> On 2018-07-21 15:20, aldi.kr...@gmail.com wrote:
>> Hi,
>> I have a long text, which tells me which files from a database were
>> downloaded and which ones failed. The pattern is as follows (at the end
>> of this post). Wrote a tiny program, but still is raw. I want to find
>> term "ERROR" and go 5 lines above and get the name with suffix XPT, in
>> this first case DRXIFF_F.XPT, but it changes in other cases to some other
>> name with suffix XPT. Thanks, Aldi
>>
>> # reading errors from a file txt
>> import re
>> with open('nohup.out', 'r') as fh:
>>     lines = fh.readlines()
>>     for line in lines:
>>         m1 = re.search("XPT", line)
>>         m2 = re.search('ERROR', line)
>>         if m1:
>>             print(line)
>>         if m2:
>>             print(line)
>>
> Firstly, you don't need regex for something as simple as checking for
> the presence of a string.
>
> Secondly, I think it's 4 lines above, not 5.
>
> 'enumerate' comes in useful here:
>
> with open('nohup.out', 'r') as fh:
>     lines = fh.readlines()
>     for i, line in enumerate(lines):
>         if 'ERROR' in line:
>             print(line)
>             print(lines[i - 4])

Here's an alternative that works when the file is huge, and reading it into
memory is impractical:

import itertools

def get_url(line):
    return line.rsplit(None, 1)[-1]

def pairs(lines, step=4):
    a, b = itertools.tee(lines)
    return zip(a, itertools.islice(b, step, None))

with open("nohup.out") as f:
    for s, t in pairs(f, 4):
        if "ERROR" in t:
            assert "XPT" in s
            print(get_url(s))

And here's yet another way that assumes that

(1) the groups are separated by empty lines
(2) the first line always contains the file name
(3) "ERROR" may occur in any of the lines that follow

def groups(lines):
    return (
        group for key, group in itertools.groupby(lines, key=str.isspace)
        if not key
    )

with open("nohup.out") as f:
    for group in groups(f):
        first = next(group)
        if any("ERROR" in line for line in group):
            assert "XPT" in first
            print(get_url(first))

>> --2018-07-14 21:26:45-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXIFF_F.XPT
>> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
>> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
>> HTTP request sent, awaiting response... 404 Not Found
>> 2018-07-14 21:26:46 ERROR 404: Not Found.
>>
>> --2018-07-14 21:26:46-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXTOT_F.XPT
>> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
>> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
>> HTTP request sent, awaiting response... 404 Not Found
>> 2018-07-14 21:26:46 ERROR 404: Not Found.
>>
>> --2018-07-14 21:26:46-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXFMT_F.XPT
>> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
>> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
>> HTTP request sent, awaiting response... 404 Not Found
>> 2018-07-14 21:26:46 ERROR 404: Not Found.
>>
>> --2018-07-14 21:26:46-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DSQ1_F.XPT
>> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
>> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
>> HTTP request sent, awaiting response... 404 Not Found
>> 2018-07-14 21:26:47 ERROR 404: Not Found.
>>
>> --2018-07-14 21:26:47-- https://wwwn.cdc.gov/Nchs/Nhanes/1999-2000/DSII.XPT
>> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
>> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
>> HTTP request sent, awaiting response... 200 OK
>> Length: 56060880 (53M) [application/octet-stream]
>> Saving to: ‘DSII.XPT’
>>
-- https://mail.python.org/mailman/listinfo/python-list
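To see what the tee/islice pairing from this reply produces, here is a small sketch on an in-memory list. The placeholder strings are made up; the point is only that each line is paired with the line `step` positions after it, without reading the whole input first.

```python
import itertools


def pairs(lines, step=4):
    # Duplicate the iterator, advance the second copy by `step` items,
    # then walk both in lockstep.
    a, b = itertools.tee(lines)
    return zip(a, itertools.islice(b, step, None))


# Placeholder lines standing in for one wget request block plus a blank line.
lines = ["url-line", "resolving", "connecting", "http 404", "ERROR 404", ""]
result = list(pairs(iter(lines), 4))
print(result)
```

itertools.tee only has to buffer the items between the two copies, so memory use is tied to `step`, not to the file size.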
Re: pattern block expression matching
On 2018-07-21 15:20, aldi.kr...@gmail.com wrote:
> Hi,
> I have a long text, which tells me which files from a database were
> downloaded and which ones failed. The pattern is as follows (at the end of
> this post). Wrote a tiny program, but still is raw. I want to find term
> "ERROR" and go 5 lines above and get the name with suffix XPT, in this first
> case DRXIFF_F.XPT, but it changes in other cases to some other name with
> suffix XPT. Thanks, Aldi
>
> # reading errors from a file txt
> import re
> with open('nohup.out', 'r') as fh:
>     lines = fh.readlines()
>     for line in lines:
>         m1 = re.search("XPT", line)
>         m2 = re.search('ERROR', line)
>         if m1:
>             print(line)
>         if m2:
>             print(line)
>
Firstly, you don't need regex for something as simple as checking for
the presence of a string.

Secondly, I think it's 4 lines above, not 5.

'enumerate' comes in useful here:

with open('nohup.out', 'r') as fh:
    lines = fh.readlines()
    for i, line in enumerate(lines):
        if 'ERROR' in line:
            print(line)
            print(lines[i - 4])

> --2018-07-14 21:26:45-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXIFF_F.XPT
> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2018-07-14 21:26:46 ERROR 404: Not Found.
>
> --2018-07-14 21:26:46-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXTOT_F.XPT
> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2018-07-14 21:26:46 ERROR 404: Not Found.
>
> --2018-07-14 21:26:46-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXFMT_F.XPT
> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2018-07-14 21:26:46 ERROR 404: Not Found.
>
> --2018-07-14 21:26:46-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DSQ1_F.XPT
> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2018-07-14 21:26:47 ERROR 404: Not Found.
>
> --2018-07-14 21:26:47-- https://wwwn.cdc.gov/Nchs/Nhanes/1999-2000/DSII.XPT
> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 56060880 (53M) [application/octet-stream]
> Saving to: ‘DSII.XPT’
-- https://mail.python.org/mailman/listinfo/python-list
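One detail worth noting about lines[i - 4]: if "ERROR" shows up within the first four lines, the index goes negative and silently wraps around to the end of the list. A hedged sketch with a guard follows; the function name names_before_errors and the sample list are illustrative only, and it assumes the line four above the error ends with the URL.

```python
def names_before_errors(lines, offset=4):
    """Return the file name found `offset` lines above each ERROR line."""
    found = []
    for i, line in enumerate(lines):
        # Skip early errors: a negative index would wrap to the list's end.
        if "ERROR" in line and i >= offset:
            url = lines[i - offset].split()[-1]
            found.append(url.rsplit("/", 1)[-1])
    return found


log = [
    "--2018-07-14 21:26:45-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXIFF_F.XPT",
    "Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39",
    "Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.",
    "HTTP request sent, awaiting response... 404 Not Found",
    "2018-07-14 21:26:46 ERROR 404: Not Found.",
]
names = names_before_errors(log)
print(names)
```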
Re: pattern
On 16Jun2018 11:59, Sharan Basappa wrote:
>This is so kind of you. Thanks for spending time to explain the code.
>It did help a lot. I did go back and brush up lists & dictionaries.
>At this point, I think, I need to go back and brush up Python from the start.
>So, I will do that first.

Sure, sounds good. But write code! It is not enough to read code and read about
code. You need to write code and modify code. Otherwise the skills don't
internalise well.

If you're running the code you asked about, one way to learn a lot about
something that looks obscure is simply to put in print() calls at various
places, eg:

print("iterate over training_data =", repr(training_data))
for pattern in training_data:
    # tokenize each word in the sentence
    print("pattern =", repr(pattern))
    w = nltk.word_tokenize(pattern['sentence'])
    print("w =", repr(w))
    # add to our words list
    words.extend(w)
    print("words =", repr(words))
# add to documents in our corpus
documents.append((w, pattern['class']))
print("documents =", repr(documents))

Note the use of repr(): it will print out the structure of lists and so forth,
very useful.

Just reviewing that loop, the logic does look a little weird to me. I think the
"documents.append" should be inside the loop because otherwise it only accrues
the _last_ "w" and "pattern".

Cheers,
Cameron Simpson
-- https://mail.python.org/mailman/listinfo/python-list
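A tiny illustration of why repr() is the useful choice in debugging prints like the ones above: the plain text form hides the difference between a number and a string, and hides trailing whitespace, while repr() shows both. The variable names are made up for the example.

```python
value_int = 1
value_str = "1 "  # a string with a trailing space

# str()-style formatting: both values look alike and the space is invisible.
plain = "{} {}".format(value_int, value_str)

# repr()-style formatting: quotes mark the string and expose the space.
debug = "{!r} {!r}".format(value_int, value_str)

print(plain)
print(debug)
```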
Re: pattern
Dear Cameron,

This is so kind of you. Thanks for spending time to explain the code.
It did help a lot. I did go back and brush up lists & dictionaries.
At this point, I think, I need to go back and brush up Python from the start.
So, I will do that first.

On Friday, 15 June 2018 09:12:22 UTC+5:30, Cameron Simpson wrote:
> On 14Jun2018 20:01, Sharan Basappa wrote:
> >> >Can anyone explain to me the purpose of "pattern" in the line below:
> >> >
> >> >documents.append((w, pattern['class']))
> >> >
> >> >documents is declared as a list as follows:
> >> >documents.append((w, pattern['class']))
> >>
> >> Not without a lot more context. Where did you find this code?
> >
> >I am sorry that partial info was not sufficient.
> >I am actually trying to implement my first text classification code and I am
> >referring to the below URL for that:
> >
> >https://machinelearnings.co/text-classification-using-neural-networks-f5cd7b8765c6
>
> Ah, ok. It helps to include some cut/paste of the relevant code, though the
> URL is a big help.
>
> The wider context of the code you recite looks like this:
>
> words = []
> classes = []
> documents = []
> ignore_words = ['?']
> # loop through each sentence in our training data
> for pattern in training_data:
>     # tokenize each word in the sentence
>     w = nltk.word_tokenize(pattern['sentence'])
>     # add to our words list
>     words.extend(w)
>     # add to documents in our corpus
>     documents.append((w, pattern['class']))
>
> and the training_data is defined like this:
>
> training_data = []
> training_data.append({"class":"greeting", "sentence":"how are you?"})
> training_data.append({"class":"greeting", "sentence":"how is your day?"})
> ... lots more ...
>
> So training data is a list of dicts, each dict holding a "class" and
> "sentence" key. The "for pattern in training_data" loop iterates over each
> item of the training_data. It calls nltk.word_tokenize on the "sentence"
> part of the training item, presumably getting a list of "word" strings.
> The documents list gets this tuple:
>
> (w, pattern['class'])
>
> added to it.
>
> In this way the documents list ends up with tuples of (words,
> classification), with the words coming from the sentence via nltk and the
> classification coming straight from the training item's "class" value.
>
> So at the end of the loop the documents array will look like:
>
> documents = [
>     ( ['how', 'are', 'you'], 'greeting' ),
>     ( ['how', 'is', 'your', 'day'], 'greeting' ),
> ]
>
> and so forth.
>
> Cheers,
> Cameron Simpson
-- https://mail.python.org/mailman/listinfo/python-list
Re: pattern
On 14Jun2018 20:01, Sharan Basappa wrote:
>> >Can anyone explain to me the purpose of "pattern" in the line below:
>> >
>> >documents.append((w, pattern['class']))
>> >
>> >documents is declared as a list as follows:
>> >documents.append((w, pattern['class']))
>>
>> Not without a lot more context. Where did you find this code?
>
>I am sorry that partial info was not sufficient.
>I am actually trying to implement my first text classification code and I am
>referring to the below URL for that:
>
>https://machinelearnings.co/text-classification-using-neural-networks-f5cd7b8765c6

Ah, ok. It helps to include some cut/paste of the relevant code, though the URL
is a big help.

The wider context of the code you recite looks like this:

words = []
classes = []
documents = []
ignore_words = ['?']
# loop through each sentence in our training data
for pattern in training_data:
    # tokenize each word in the sentence
    w = nltk.word_tokenize(pattern['sentence'])
    # add to our words list
    words.extend(w)
    # add to documents in our corpus
    documents.append((w, pattern['class']))

and the training_data is defined like this:

training_data = []
training_data.append({"class":"greeting", "sentence":"how are you?"})
training_data.append({"class":"greeting", "sentence":"how is your day?"})
... lots more ...

So training data is a list of dicts, each dict holding a "class" and "sentence"
key. The "for pattern in training_data" loop iterates over each item of the
training_data. It calls nltk.word_tokenize on the "sentence" part of the
training item, presumably getting a list of "word" strings. The documents list
gets this tuple:

(w, pattern['class'])

added to it.

In this way the documents list ends up with tuples of (words, classification),
with the words coming from the sentence via nltk and the classification coming
straight from the training item's "class" value.

So at the end of the loop the documents array will look like:

documents = [
    ( ['how', 'are', 'you'], 'greeting' ),
    ( ['how', 'is', 'your', 'day'], 'greeting' ),
]

and so forth.

Cheers,
Cameron Simpson
-- https://mail.python.org/mailman/listinfo/python-list
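For anyone who wants to run the loop described above without installing nltk, here is a rough sketch. simple_tokenize is a hypothetical stand-in for nltk.word_tokenize, not the real tokenizer (it simply keeps runs of word characters), and the append is placed inside the loop, which is what the explanation implies.

```python
import re


def simple_tokenize(sentence):
    # Hypothetical stand-in for nltk.word_tokenize: runs of word characters.
    return re.findall(r"\w+", sentence)


training_data = [
    {"class": "greeting", "sentence": "how are you?"},
    {"class": "greeting", "sentence": "how is your day?"},
]

words = []
documents = []
for pattern in training_data:
    # `pattern` is just the loop variable: one dict from training_data.
    w = simple_tokenize(pattern["sentence"])
    words.extend(w)                          # flat list of every word seen
    documents.append((w, pattern["class"]))  # one (words, class) tuple per item

print(documents)
```

So "pattern" carries no special meaning; any name would do for the dict taken from training_data on each pass.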
Re: pattern
> >Can anyone explain to me the purpose of "pattern" in the line below: > > > >documents.append((w, pattern['class'])) > > > >documents is declared as a list as follows: > >documents.append((w, pattern['class'])) > > Not without a lot more context. Where did you find this code? > > Cheers, I am sorry that partial info was not sufficient. I am actually trying to implement my first text classification code and I am referring to the below URL for that: https://machinelearnings.co/text-classification-using-neural-networks-f5cd7b8765c6 I hope this helps. -- https://mail.python.org/mailman/listinfo/python-list
Re: pattern
On 13Jun2018 19:51, Sharan Basappa wrote: Can anyone explain to me the purpose of "pattern" in the line below: documents.append((w, pattern['class'])) documents is declared as a list as follows: documents.append((w, pattern['class'])) Not without a lot more context. Where did you find this code? Cheers, Cameron Simpson -- https://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On 15/06/2013 22:03, Joshua Landau wrote:
> On 15 June 2013 11:18, Mark Lawrence wrote:
>> I tend to reach for string methods rather than an RE so will something like
>> this suit you?
>>
>> c:\Users\Mark\MyPython>type a.py
>> for s in ("In the ocean",
>>           "On the ocean",
>>           "By the ocean",
>>           "In this group",
>>           "In this group",
>>           "By the new group"):
>>     print(' '.join(s.split()[1:-1]))
>>
>> c:\Users\Mark\MyPython>a
>> the
>> the
>> the
>> this
>> this
>> the new
>
> Careful - " ".join(s.split()) != s
>
> Eg:
> >>> " ".join("s\ns".split())
> 's s'
>
> It's pedantry, but true.

I'm sorry but I haven't the faintest idea what you're talking about. I believe
the code I posted works for the OP's needs. If it doesn't please say so.

--
"Steve is going for the pink ball - and for those of you who are
watching in black and white, the pink is next to the green." Snooker
commentator 'Whispering' Ted Lowe.

Mark Lawrence
-- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On 15 June 2013 11:18, Mark Lawrence wrote:
> I tend to reach for string methods rather than an RE so will something like
> this suit you?
>
> c:\Users\Mark\MyPython>type a.py
> for s in ("In the ocean",
>           "On the ocean",
>           "By the ocean",
>           "In this group",
>           "In this group",
>           "By the new group"):
>     print(' '.join(s.split()[1:-1]))
>
> c:\Users\Mark\MyPython>a
> the
> the
> the
> this
> this
> the new

Careful - " ".join(s.split()) != s

Eg:
>>> " ".join("s\ns".split())
's s'

It's pedantry, but true.
-- http://mail.python.org/mailman/listinfo/python-list
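A short sketch of the point being made here: splitting and re-joining normalises every run of whitespace to a single space, so the middle part is not necessarily a verbatim slice of the original string. The second half shows one possible way (an illustration, not something proposed in the thread) to keep the original whitespace by trimming one word off each end.

```python
s = "By  the\nnew group"  # double space and a newline inside

# split()/join() collapses all whitespace runs to single spaces.
normalized_middle = " ".join(s.split()[1:-1])

# Trim the first and the last word instead; inner whitespace is untouched.
rest = s.split(None, 1)[1]             # drop the first word
verbatim_middle = rest.rsplit(None, 1)[0]  # drop the last word

print(repr(normalized_middle))
print(repr(verbatim_middle))
```

For the OP's sample strings, which use single spaces throughout, both variants give the same result.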
Re: Pattern Search Regular Expression
Oops...

On Saturday, June 15, 2013 12:47:18 PM UTC-6, ru...@yahoo.com wrote:
> Links to the Python reference documentation are useful for people
> just beginning with some aspect of Python; they are for people who
> already know Python and want to look up details.

That was supposed to be:

Links to the Python reference documentation are NOT useful for people
just beginning with some aspect of Python

and as long as I'm revising, I mean that as a general statement,
nothing wrong with a reference doc link accompanying a simpler
explanation or pointer thereto.
-- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On 6/15/2013 12:28 PM, subhabangal...@gmail.com wrote:
> Suppose I want a regular expression that matches both "Sent from my iPhone"
> and "Sent from my iPod". How do I write such an expression--is the problem,
> "Sent from my iPod"
> "Sent from my iPhone"
>
> which can be written as,
> re.compile("Sent from my (iPhone|iPod)")
>
> now if I want to slightly to extend it as,
>
> "Taken from my iPod"
> "Taken from my iPhone"
>
> I am looking how can I use or in the beginning pattern?
>
> and the third phase if the intermediate phrase,
>
> "from my" if also differs or changes.
>
> In a nutshell I want to extract a particular group of phrases,
> where, the beginning and end pattern may alter like,
>
> (i) either from beginning Pattern B1 to end Pattern E1,
> (ii) or from beginning Pattern B1 to end Pattern E2,
> (iii) or from beginning Pattern B2 to end Pattern E2,

The only hints I will add to those given is that you need a) pattern for
a word, and b) a way to 'anchor' the pattern to the beginning and ending
of the string so it will only match the first and last words.

This is a pretty good re practice problem, so go and practice and
experiment. Expect to fail 20 times and you should beat your expectation
;-). The interactive interpreter, or Idle with its F5 Run editor window,
makes experimenting easy and (for me) fun.

--
Terry Jan Reedy
-- http://mail.python.org/mailman/listinfo/python-list
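One possible outcome of following the two hints above (a word pattern, plus anchors at both ends), given only as a sketch. The names MIDDLE and middle_words are made up for the example, and \w+ is just one reasonable choice for "a word".

```python
import re

# One word, whitespace, the captured middle, whitespace, one word,
# anchored to the start (^) and end ($) of the string.
MIDDLE = re.compile(r"^\w+\s+(.*?)\s+\w+$")


def middle_words(sentence):
    """Return the text between the first and last word, or None."""
    match = MIDDLE.match(sentence)
    return match.group(1) if match else None


results = [middle_words(s) for s in ("In the ocean", "By the new group", "Hello")]
print(results)
```

Without the anchors the non-greedy group would be free to stop at the first place the rest of the pattern fits, so the capture could end before the last word.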
Re: Pattern Search Regular Expression
On Sunday, June 16, 2013 12:17:18 AM UTC+5:30, ru...@yahoo.com wrote: > On Saturday, June 15, 2013 11:54:28 AM UTC-6, subhaba...@gmail.com wrote: > > > > > Thank you for the answer. But I want to learn bit of interesting > > > regular expression forms where may I? > > > No Mark, thank you for your links but they were not sufficient. > > > > Links to the Python reference documentation are useful for people > > just beginning with some aspect of Python; they are for people who > > already know Python and want to look up details. So it's no > > surprise that you did not find them useful. > > > > > I am looking for more intriguing exercises, esp use of or in > > > the pattern search. > > > > Have you tried searching on Google for "regular expression tutorial"? > > It gives a lot of results. I've never tried any of them so I can't > > recommend any one specifically but maybe you can find something > > useful there? > > > > There is also a Python Howto on regular expressions at > > http://docs.python.org/3/howto/regex.html > > > > Also, maybe the book "Regular Expressions Cookbook" would > > be useful? It seems to have a lot of specific expressions > > for accomplishing various tasks and seems to be online for > > free at > > http://it-ebooks.info/read/920/ Dear Group, Thank you for the links. Yes, HOW-TO is good. The cook book should be good. Internet changes its contents so fast few days back there was a very good Regular Expression Tutorial by Alan Gauld or there were some mail discussions, I don't know where they are gone. There is one Gauld's tutorial but I think I read some think different. Regards, Subhabrata. -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On Saturday, June 15, 2013 11:54:28 AM UTC-6, subhaba...@gmail.com wrote: > Thank you for the answer. But I want to learn bit of interesting > regular expression forms where may I? > No Mark, thank you for your links but they were not sufficient. Links to the Python reference documentation are useful for people just beginning with some aspect of Python; they are for people who already know Python and want to look up details. So it's no surprise that you did not find them useful. > I am looking for more intriguing exercises, esp use of or in > the pattern search. Have you tried searching on Google for "regular expression tutorial"? It gives a lot of results. I've never tried any of them so I can't recommend any one specifically but maybe you can find something useful there? There is also a Python Howto on regular expressions at http://docs.python.org/3/howto/regex.html Also, maybe the book "Regular Expressions Cookbook" would be useful? It seems to have a lot of specific expressions for accomplishing various tasks and seems to be online for free at http://it-ebooks.info/read/920/ -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On Saturday, June 15, 2013 3:12:55 PM UTC+5:30, subhaba...@gmail.com wrote: > Dear Group, > > > > I am trying to search the following pattern in Python. > > > > I have following strings: > > > > (i)"In the ocean" > > (ii)"On the ocean" > > (iii) "By the ocean" > > (iv) "In this group" > > (v) "In this group" > > (vi) "By the new group" > >. > > > > I want to extract from the first word to the last word, > > where first word and last word are varying. > > > > I am looking to extract out: > > (i) the > > (ii) the > > (iii) the > > (iv) this > > (v) this > > (vi) the new > > . > > > > The problem may be handled by converting the string to list and then > > index of list. > > > > But I am thinking if I can use regular expression in Python. > > > > If any one of the esteemed members can help. > > > > Thanking you in Advance, > > > > Regards, > > Subhabrata Dear Group, Thank you for the answer. But I want to learn bit of interesting regular expression forms where may I? No Mark, thank you for your links but they were not sufficient. I am looking for more intriguing exercises, esp use of or in the pattern search. Regards, Subhabrata. -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On 06/15/2013 03:42 AM, subhabangal...@gmail.com wrote:
> Dear Group,
>
> I am trying to search the following pattern in Python.
>
> I have following strings:
>
> (i)"In the ocean"
> (ii)"On the ocean"
> (iii) "By the ocean"
> (iv) "In this group"
> (v) "In this group"
> (vi) "By the new group"
> .
>
> I want to extract from the first word to the last word,
> where first word and last word are varying.
>
> I am looking to extract out:
> (i) the
> (ii) the
> (iii) the
> (iv) this
> (v) this
> (vi) the new
> .
>
> The problem may be handled by converting the string to list and then
> index of list.
>
> But I am thinking if I can use regular expression in Python.

Since nobody here seems to want to answer your question
(or seems even able to read it), I'll try. Is something
like this what you want?

import re

texts = [
    '(i)"In the ocean"',
    '(ii)"On the ocean"',
    '(iii) "By the ocean"',
    '(iv) "In this group"',
    '(v) "In this group"',
    '(vi) "By the new group"']

pattern = re.compile(r'^\((.*)\)\s*"\S+\s*(.*)\s\S+"$')

for txt in texts:
    matchobj = pattern.search(txt)
    number, midtext = matchobj.group(1, 2)
    print("(%s) %s" % (number, midtext))
-- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On 15/06/2013 17:28, subhabangal...@gmail.com wrote: You've been pointed at several links, so what have you tried, and what, if anything, went wrong? Or do you simply not understand, in which case please say so and we'll help. I'm not trying to be awkward, it's simply known that you learn more if you try something yourself, rather than be spoon fed it. -- "Steve is going for the pink ball - and for those of you who are watching in black and white, the pink is next to the green." Snooker commentator 'Whispering' Ted Lowe. Mark Lawrence -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On Saturday, June 15, 2013 8:34:59 PM UTC+5:30, Mark Lawrence wrote: > On 15/06/2013 15:31, subhabangal...@gmail.com wrote: > > > > > > Dear Group, > > > > > > I know this solution but I want to have Regular Expression option. Just > > learning. > > > > > > Regards, > > > Subhabrata. > > > > > > > Start here http://docs.python.org/2/library/re.html > > > > Would you also please read and action this, > > http://wiki.python.org/moin/GoogleGroupsPython , thanks. > > > > -- > > "Steve is going for the pink ball - and for those of you who are > > watching in black and white, the pink is next to the green." Snooker > > commentator 'Whispering' Ted Lowe. > > > > Mark Lawrence Dear Group, Suppose I want a regular expression that matches both "Sent from my iPhone" and "Sent from my iPod". How do I write such an expression--is the problem, "Sent from my iPod" "Sent from my iPhone" which can be written as, re.compile("Sent from my (iPhone|iPod)") now if I want to slightly to extend it as, "Taken from my iPod" "Taken from my iPhone" I am looking how can I use or in the beginning pattern? and the third phase if the intermediate phrase, "from my" if also differs or changes. In a nutshell I want to extract a particular group of phrases, where, the beginning and end pattern may alter like, (i) either from beginning Pattern B1 to end Pattern E1, (ii) or from beginning Pattern B1 to end Pattern E2, (iii) or from beginning Pattern B2 to end Pattern E2, . Regards, Subhabrata. -- http://mail.python.org/mailman/listinfo/python-list
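A hedged sketch of how "or" can be used at the beginning as well as the end, with the middle phrase left free to vary. The group contents (Sent, Taken, iPhone, iPod) come from the question; the names SIGNATURE and parse are illustrative only.

```python
import re

# (B1|B2), then any middle text, then (E1|E2), anchored at both ends.
SIGNATURE = re.compile(r"^(Sent|Taken)\s+(.*?)\s+(iPhone|iPod)$")


def parse(text):
    """Return (beginning, middle, end) or None if the text does not fit."""
    match = SIGNATURE.match(text)
    return match.groups() if match else None


parsed = [parse(t) for t in ("Sent from my iPhone",
                             "Taken from my iPod",
                             "Sent from my Android")]
print(parsed)
```

Each alternation sits in its own group, so any beginning can combine with any end, and the second group returns whatever came between them.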
Re: Pattern Search Regular Expression
On 15/06/2013 15:31, subhabangal...@gmail.com wrote: Dear Group, I know this solution but I want to have Regular Expression option. Just learning. Regards, Subhabrata. Start here http://docs.python.org/2/library/re.html Would you also please read and action this, http://wiki.python.org/moin/GoogleGroupsPython , thanks. -- "Steve is going for the pink ball - and for those of you who are watching in black and white, the pink is next to the green." Snooker commentator 'Whispering' Ted Lowe. Mark Lawrence -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
subhabangal...@gmail.com wrote: >I know this solution but I want to have Regular Expression option. >Just learning. http://mattgemmell.com/2008/12/08/what-have-you-tried/ Just spell out what you want: A word at the beginning, followed by any text, followed by a word at the end. Now look up the basic regex metacharacters and try to come up with a solution (Hint: you will need groups) http://docs.python.org/3/howto/regex.html#regex-howto http://docs.python.org/3/library/re.html#regular-expression-syntax Bye, Andreas -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On Saturday, June 15, 2013 7:58:44 PM UTC+5:30, Mark Lawrence wrote: > On 15/06/2013 14:45, Denis McMahon wrote: > > > On Sat, 15 Jun 2013 13:41:21 +, Denis McMahon wrote: > > > > > >> first_and_last = [sentence.split()[i] for i in (0, -1)] middle = > > >> sentence.split()[1:-2] > > > > > > Bugger! That last is actually: > > > > > > sentence.split()[1:-1] > > > > > > It just looks like a two. > > > > > > > I've a very strong sense of deja vu having round the same loop what, two > > hours ago? Wondering out aloud the number of times a programmer has > > thought "That's easy, I don't need to test it". How are the mighty fallen. > > > > -- > > "Steve is going for the pink ball - and for those of you who are > > watching in black and white, the pink is next to the green." Snooker > > commentator 'Whispering' Ted Lowe. > > > > Mark Lawrence Dear Group, I know this solution but I want to have Regular Expression option. Just learning. Regards, Subhabrata. -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On 15/06/2013 14:45, Denis McMahon wrote: On Sat, 15 Jun 2013 13:41:21 +, Denis McMahon wrote: first_and_last = [sentence.split()[i] for i in (0, -1)] middle = sentence.split()[1:-2] Bugger! That last is actually: sentence.split()[1:-1] It just looks like a two. I've a very strong sense of deja vu having round the same loop what, two hours ago? Wondering out aloud the number of times a programmer has thought "That's easy, I don't need to test it". How are the mighty fallen. -- "Steve is going for the pink ball - and for those of you who are watching in black and white, the pink is next to the green." Snooker commentator 'Whispering' Ted Lowe. Mark Lawrence -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On Sat, 15 Jun 2013 13:41:21 +, Denis McMahon wrote:

> first_and_last = [sentence.split()[i] for i in (0, -1)]
> middle = sentence.split()[1:-2]

Bugger! That last is actually:

sentence.split()[1:-1]

It just looks like a two.

--
Denis McMahon, denismfmcma...@gmail.com
-- http://mail.python.org/mailman/listinfo/python-list
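The difference between the two slices is easy to check on the OP's longest sample. This is only a small sketch; the variable names are made up.

```python
sentence = "By the new group"
words = sentence.split()

too_short = words[1:-2]  # stops two from the end, so 'new' is lost
middle = words[1:-1]     # everything except the first and last word
first_and_last = [words[i] for i in (0, -1)]

print(too_short, middle, first_and_last)
```

A slice end of -1 already excludes the last element, which is why -2 drops one word too many.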
Re: Pattern Search Regular Expression
On Sat, 15 Jun 2013 11:55:34 +0100, Mark Lawrence wrote:

> >>> sentence = "By the new group"
> >>> words = sentence.split()
> >>> words[words[0],words[-1]]
> Traceback (most recent call last):
>   File "", line 1, in
> TypeError: list indices must be integers, not tuple
>
> So why would the OP want a TypeError? Or has caffeine deprivation
> affected your typing skills? :)

Yeah - that last:

words[words[0],words[-1]]

should probably have been:

first_and_last = [words[0], words[-1]]

or even:

first_and_last = (words[0], words[-1])

Or even:

first_and_last = [sentence.split()[i] for i in (0, -1)]
middle = sentence.split()[1:-2]

--
Denis McMahon, denismfmcma...@gmail.com
-- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On Jun 15, 3:55 pm, Mark Lawrence wrote:
> On 15/06/2013 11:24, Denis McMahon wrote:
> > On Sat, 15 Jun 2013 10:05:01 +, Steven D'Aprano wrote:
> >> On Sat, 15 Jun 2013 02:42:55 -0700, subhabangalore wrote:
> >>> Dear Group,
> >>> I am trying to search the following pattern in Python.
> >>> I have following strings:
> >>> (i)"In the ocean" (ii)"On the ocean" (iii) "By the ocean" (iv) "In
> >>> this group" (v) "In this group" (vi) "By the new group"
> >>> .
> >>> I want to extract from the first word to the last word, where first
> >>> word and last word are varying.
> >>> I am looking to extract out:
> >>> (i) the (ii) the (iii) the (iv) this (v) this (vi) the new
> >>> .
> >>> The problem may be handled by converting the string to list and then
> >>> index of list.
> >> No need for a regular expression.
> >> py> sentence = "By the new group"
> >> py> words = sentence.split()
> >> py> words[1:-1]
> >> ['the', 'new']
> >> Does that help?
> > I thought OP wanted:
> > words[words[0],words[-1]]
> > But that might be just my caffeine deprived misinterpretation of his
> > terminology.
> >>> sentence = "By the new group"
> >>> words = sentence.split()
> >>> words[words[0],words[-1]]
> Traceback (most recent call last):
>   File "", line 1, in
> TypeError: list indices must be integers, not tuple
>
> So why would the OP want a TypeError? Or has caffeine deprivation
> affected your typing skills? :)

:-)
I guess Denis meant (words[0], words[-1])

To the OP: You have the identity:

words == [words[0]] + words[1:-1] + [words[-1]]

So take your pick of what parts of the expression you want (and
discard what you don't want).
[The way you've used 'extract' is a bit ambiguous]
-- http://mail.python.org/mailman/listinfo/python-list
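The identity above can be checked directly. The sketch below also shows its one edge case: with a single word, the first and last element are the same item, so the three parts no longer add up to the original list. The helper name split_parts is made up for the example.

```python
def split_parts(sentence):
    """Return (first word, middle words, last word) of a sentence."""
    words = sentence.split()
    return words[0], words[1:-1], words[-1]


words = "By the new group".split()
first, middle, last = split_parts("By the new group")
identity_holds = words == [first] + middle + [last]

# With one word, first and last are the same element, so it is counted twice.
one = "Hello".split()
single_word_holds = one == [one[0]] + one[1:-1] + [one[-1]]

print(first, middle, last, identity_holds, single_word_holds)
```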
Re: Pattern Search Regular Expression
On 15/06/2013 11:24, Denis McMahon wrote: On Sat, 15 Jun 2013 10:05:01 +, Steven D'Aprano wrote: On Sat, 15 Jun 2013 02:42:55 -0700, subhabangalore wrote: Dear Group, I am trying to search the following pattern in Python. I have following strings: (i)"In the ocean" (ii)"On the ocean" (iii) "By the ocean" (iv) "In this group" (v) "In this group" (vi) "By the new group" . I want to extract from the first word to the last word, where first word and last word are varying. I am looking to extract out: (i) the (ii) the (iii) the (iv) this (v) this (vi) the new . The problem may be handled by converting the string to list and then index of list. No need for a regular expression. py> sentence = "By the new group" py> words = sentence.split() py> words[1:-1] ['the', 'new'] Does that help? I thought OP wanted: words[words[0],words[-1]] But that might be just my caffeine deprived misinterpretation of his terminology. >>> sentence = "By the new group" >>> words = sentence.split() >>> words[words[0],words[-1]] Traceback (most recent call last): File "", line 1, in TypeError: list indices must be integers, not tuple So why would the OP want a TypeError? Or has caffeine deprivation affected your typing skills? :) -- "Steve is going for the pink ball - and for those of you who are watching in black and white, the pink is next to the green." Snooker commentator 'Whispering' Ted Lowe. Mark Lawrence -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On Sat, 15 Jun 2013 10:05:01 +, Steven D'Aprano wrote: > On Sat, 15 Jun 2013 02:42:55 -0700, subhabangalore wrote: > >> Dear Group, >> >> I am trying to search the following pattern in Python. >> >> I have following strings: >> >> (i)"In the ocean" (ii)"On the ocean" (iii) "By the ocean" (iv) "In >> this group" (v) "In this group" (vi) "By the new group" >>. >> >> I want to extract from the first word to the last word, where first >> word and last word are varying. >> >> I am looking to extract out: >> (i) the (ii) the (iii) the (iv) this (v) this (vi) the new >> . >> >> The problem may be handled by converting the string to list and then >> index of list. > > No need for a regular expression. > > py> sentence = "By the new group" > py> words = sentence.split() > py> words[1:-1] > ['the', 'new'] > > Does that help? I thought OP wanted: words[words[0],words[-1]] But that might be just my caffeine deprived misinterpretation of his terminology. -- Denis McMahon, denismfmcma...@gmail.com -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On 15/06/2013 10:42, subhabangal...@gmail.com wrote: Dear Group, I am trying to search the following pattern in Python. I have following strings: (i)"In the ocean" (ii)"On the ocean" (iii) "By the ocean" (iv) "In this group" (v) "In this group" (vi) "By the new group" . I want to extract from the first word to the last word, where first word and last word are varying. I am looking to extract out: (i) the (ii) the (iii) the (iv) this (v) this (vi) the new . The problem may be handled by converting the string to list and then index of list. But I am thinking if I can use regular expression in Python. If any one of the esteemed members can help. Thanking you in Advance, Regards, Subhabrata I tend to reach for string methods rather than an RE so will something like this suit you? c:\Users\Mark\MyPython>type a.py for s in ("In the ocean", "On the ocean", "By the ocean", "In this group", "In this group", "By the new group"): print(' '.join(s.split()[1:-1])) c:\Users\Mark\MyPython>a the the the this this the new -- "Steve is going for the pink ball - and for those of you who are watching in black and white, the pink is next to the green." Snooker commentator 'Whispering' Ted Lowe. Mark Lawrence -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Search Regular Expression
On Sat, 15 Jun 2013 02:42:55 -0700, subhabangalore wrote: > Dear Group, > > I am trying to search the following pattern in Python. > > I have following strings: > > (i)"In the ocean" > (ii)"On the ocean" > (iii) "By the ocean" > (iv) "In this group" > (v) "In this group" > (vi) "By the new group" >. > > I want to extract from the first word to the last word, where first word > and last word are varying. > > I am looking to extract out: > (i) the > (ii) the > (iii) the > (iv) this > (v) this > (vi) the new > . > > The problem may be handled by converting the string to list and then > index of list. No need for a regular expression. py> sentence = "By the new group" py> words = sentence.split() py> words[1:-1] ['the', 'new'] Does that help? -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern-match & Replace - help required
On 2012-12-19 14:11, Alexander Blinne wrote: Am 19.12.2012 14:41, schrieb AT: Thanks a million Can you recommend a good online book/tutorial on regular expr. in python? http://docs.python.org/3/howto/regex.html Another good resource is: http://www.regular-expressions.info/ -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern-match & Replace - help required
Am 19.12.2012 14:41, schrieb AT: > Thanks a million > Can you recommend a good online book/tutorial on regular expr. in python? http://docs.python.org/3/howto/regex.html -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern-match & Replace - help required
On Wednesday, 19 December 2012 18:16:18 UTC+5, Peter Otten wrote: > AT wrote: > > > > > I am new to python and web2py framework. Need urgent help to match a > > > pattern in an string and replace the matched text. > > > > > > I've this string (basically an sql statement): > > > stmnt = 'SELECT taxpayer.id, > > > taxpayer.enc_name, > > > taxpayer.age, > > > taxpayer.occupation > > > FROM taxpayer WHERE (taxpayer.id IS NOT NULL);' > > > > > > The requirement is to replace it with this one: > > > r_stmnt = 'SELECT taxpayer.id, > > >decrypt(taxpayer.enc_name), > > >taxpayer.age, > > >taxpayer.occupation > > >FROM taxpayer WHERE (taxpayer.id IS NOT NULL);' > > > > > > Can somebody please help? > > > > > The pattern is '%s.enc_%s', and after matching this pattern want to change > > > it to 'decrypt(%s.enc_%s)' > > > > after = re.compile(r"(\w+[.]enc_\w+)").sub(r"decrypt(\1)", before) Thanks a million Can you recommend a good online book/tutorial on regular expr. in python? Regards -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern-match & Replace - help required
AT wrote: > I am new to python and web2py framework. Need urgent help to match a > pattern in an string and replace the matched text. > > I've this string (basically an sql statement): > stmnt = 'SELECT taxpayer.id, > taxpayer.enc_name, > taxpayer.age, > taxpayer.occupation > FROM taxpayer WHERE (taxpayer.id IS NOT NULL);' > > The requirement is to replace it with this one: > r_stmnt = 'SELECT taxpayer.id, >decrypt(taxpayer.enc_name), >taxpayer.age, >taxpayer.occupation >FROM taxpayer WHERE (taxpayer.id IS NOT NULL);' > > Can somebody please help? > The pattern is '%s.enc_%s', and after matching this pattern want to change > it to 'decrypt(%s.enc_%s)' after = re.compile(r"(\w+[.]enc_\w+)").sub(r"decrypt(\1)", before) -- http://mail.python.org/mailman/listinfo/python-list
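For completeness, here is the one-liner above wrapped into something runnable. It is a sketch, assuming Python 3 and a single-line copy of the statement; the function name wrap_encrypted_columns is hypothetical, not part of web2py or any library.

```python
import re

# Same pattern as in the post: a table name, a dot, then a column
# whose name starts with "enc_".
ENC_COLUMN = re.compile(r"(\w+[.]enc_\w+)")


def wrap_encrypted_columns(statement):
    """Wrap every table.enc_column reference in decrypt(...)."""
    return ENC_COLUMN.sub(r"decrypt(\1)", statement)


stmnt = (
    "SELECT taxpayer.id, taxpayer.enc_name, taxpayer.age, "
    "taxpayer.occupation FROM taxpayer WHERE (taxpayer.id IS NOT NULL);"
)
print(wrap_encrypted_columns(stmnt))
```

Note that running it twice on the same statement would wrap the column twice, so apply it only to statements that have not been rewritten yet.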
Re: Pattern-match & Replace - help required
On Wednesday, 19 December 2012 16:27:19 UTC+5, Thomas Bach wrote: > On Wed, Dec 19, 2012 at 02:42:26AM -0800, AT wrote: > > > Hi, > > > > > > I am new to python and web2py framework. Need urgent help to match a > > > pattern in an string and replace the matched text. > > > > > > > Well, what about str.replace then? > > > > >>> 'egg, ham, tomato'.replace('ham', 'spam, ham, spam') > > 'egg, spam, ham, spam, tomato' > > > > > > If the pattern you want to match is more complicated, have a look at > > the re module! > > > > Regards, > > Thomas. The pattern is '%s.enc_%s', and after matching this pattern want to change it to 'decrypt(%s.enc_%s)' Thanks -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern-match & Replace - help required
On Wed, Dec 19, 2012 at 02:42:26AM -0800, AT wrote: > Hi, > > I am new to python and web2py framework. Need urgent help to match a > pattern in an string and replace the matched text. > Well, what about str.replace then? >>> 'egg, ham, tomato'.replace('ham', 'spam, ham, spam') 'egg, spam, ham, spam, tomato' If the pattern you want to match is more complicated, have a look at the re module! Regards, Thomas. -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern-match & Replace - help required
On Wed, 19 Dec 2012 03:01:32 -0800, AT wrote:

> I just wanted to change taxpayer.enc_name in stmnt to
> decrypt(taxpayer.enc_name)
>
> hope it clarifies?

Maybe. Does this help?

lunch = "Bread, ham, cheese and tomato."
# replace ham with spam
offset = lunch.find('ham')
if offset != -1:
    lunch = lunch[:offset] + 'spam' + lunch[offset + len('ham'):]
print(lunch)

-- Steven -- http://mail.python.org/mailman/listinfo/python-list
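The find-and-slice idea above generalises to any fixed substring. A small sketch, assuming Python 3; replace_first is a made-up helper name, and for a literal substring it should behave like str.replace with a count of 1.

```python
def replace_first(text, old, new):
    """Replace the first occurrence of old in text, if there is one."""
    offset = text.find(old)
    if offset == -1:
        # Nothing to replace; hand the text back unchanged.
        return text
    return text[:offset] + new + text[offset + len(old):]


lunch = "Bread, ham, cheese and tomato."
print(replace_first(lunch, "ham", "spam"))
```

This only handles fixed text. Once the thing to match is a pattern such as "any column starting with enc_", the re module is the better tool.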
Re: Pattern-match & Replace - help required
On Wednesday, 19 December 2012 15:51:22 UTC+5, Steven D'Aprano wrote: > On Wed, 19 Dec 2012 02:42:26 -0800, AT wrote: > > > > > Hi, > > > > > > I am new to python and web2py framework. Need urgent help to match a > > > pattern in an string and replace the matched text. > > > > > > I've this string (basically an sql statement): > > > > > > stmnt = 'SELECT taxpayer.id, > > > taxpayer.enc_name, > > > taxpayer.age, > > > taxpayer.occupation > > > FROM taxpayer WHERE (taxpayer.id IS NOT NULL);' > > > > > > The requirement is to replace it with this one: > > > > > > r_stmnt = 'SELECT taxpayer.id, > > >decrypt(taxpayer.enc_name), > > >taxpayer.age, > > >taxpayer.occupation > > >FROM taxpayer WHERE (taxpayer.id IS NOT NULL);' > > > > > > Can somebody please help? > > > > Can you do this? > > > > stmnt = r_stmnt > > > > That should do what you are asking. > > > > If that doesn't solve your problem, you will need to explain your problem > > in more detail. > > > > > > > > -- > > Steven I just wanted to change taxpayer.enc_name in stmnt to decrypt(taxpayer.enc_name) hope it clarifies? thanks -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern-match & Replace - help required
On Wed, 19 Dec 2012 02:42:26 -0800, AT wrote: > Hi, > > I am new to python and web2py framework. Need urgent help to match a > pattern in an string and replace the matched text. > > I've this string (basically an sql statement): > > stmnt = 'SELECT taxpayer.id, > taxpayer.enc_name, > taxpayer.age, > taxpayer.occupation > FROM taxpayer WHERE (taxpayer.id IS NOT NULL);' > > The requirement is to replace it with this one: > > r_stmnt = 'SELECT taxpayer.id, >decrypt(taxpayer.enc_name), >taxpayer.age, >taxpayer.occupation >FROM taxpayer WHERE (taxpayer.id IS NOT NULL);' > > Can somebody please help? Can you do this? stmnt = r_stmnt That should do what you are asking. If that doesn't solve your problem, you will need to explain your problem in more detail. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern matching
On Feb 24, 2:11 am, monkeys paw wrote:
> if I have a string such as '01/12/2011' and i want
> to reformat it as '20110112', how do i pull out the components
> of the string and reformat them into a DDMM format?
>
> I have:
>
> import re
>
> test = re.compile('\d\d\/')
> f = open('test.html') # This file contains the html dates
> for line in f:
>     if test.search(line):
>         # I need to pull the date components here

I second using an HTML parser to extract the content of the TDs, but I would also go one step further when reformatting and do something such as:

>>> from time import strptime, strftime
>>> d = '01/12/2011'
>>> strftime('%Y%m%d', strptime(d, '%m/%d/%Y'))
'20110112'

That way you get some validation of the data, i.e., if you get '13/12/2011' you've probably got mixed data formats.

hth

Jon. -- http://mail.python.org/mailman/listinfo/python-list
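The validation point can be made concrete. This is a sketch only, assuming Python 3 and that the dates really are month/day/year; the function name reformat_date is hypothetical.

```python
from time import strftime, strptime


def reformat_date(text):
    """Turn 'MM/DD/YYYY' into 'YYYYMMDD', or return None if it does not parse."""
    try:
        parsed = strptime(text, '%m/%d/%Y')
    except ValueError:
        # e.g. month 13: probably a different date format slipped in.
        return None
    return strftime('%Y%m%d', parsed)


print(reformat_date('01/12/2011'))
print(reformat_date('13/12/2011'))
```

Returning None is just one choice; letting the ValueError propagate may be better if a bad date should stop the run.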
Re: pattern matching
On Feb 23, 9:11 pm, monkeys paw wrote: > if I have a string such as '01/12/2011' and i want > to reformat it as '20110112', how do i pull out the components > of the string and reformat them into a DDMM format? > > I have: > > import re > > test = re.compile('\d\d\/') > f = open('test.html') # This file contains the html dates > for line in f: > if test.search(line): > # I need to pull the date components here What you need are parentheses, which capture part of the text you're matching. Each set of parentheses creates a "group". To get to these groups, you need the match object which is returned by re.search. Group 0 is the entire match, group 1 is the contents of the first set of parentheses, and so forth. If the regex does not match, then re.search returns None. DATA FILE (test.html): David02/19/1967 Susan05/23/1948 Clare09/22/1952 BP08/27/1990 Roger12/19/1954 CODE: import re rx_test = re.compile(r'(\d{2})/(\d{2})/(\d{4})') f = open('test.html') for line in f: m = rx_test.search(line) if m: new_date = m.group(3) + m.group(1) + m.group(2) print "raw text: ",m.group(0) print "new date: ",new_date print OUTPUT: raw text: 02/19/1967 new date: 19670219 raw text: 05/23/1948 new date: 19480523 raw text: 09/22/1952 new date: 19520922 raw text: 08/27/1990 new date: 19900827 raw text: 12/19/1954 new date: 19541219 -- http://mail.python.org/mailman/listinfo/python-list
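The same approach in Python 3 syntax, as a rough sketch. The sample lines are plain text stand-ins rather than the contents of test.html, and the names DATE_RX and reformat_dates are made up for illustration.

```python
import re

# Three capturing groups: month, day, year.
DATE_RX = re.compile(r"(\d{2})/(\d{2})/(\d{4})")


def reformat_dates(lines):
    """Return (raw date, YYYYMMDD date) for every line containing a date."""
    results = []
    for line in lines:
        m = DATE_RX.search(line)
        if m:  # search() returns None when nothing matches
            month, day, year = m.groups()
            results.append((m.group(0), year + month + day))
    return results


sample = ["David 02/19/1967", "no date here", "Susan 05/23/1948"]
for raw, new in reformat_dates(sample):
    print("raw text:", raw)
    print("new date:", new)
```

m.groups() returns groups 1 to 3 as a tuple, which saves spelling out m.group(1), m.group(2) and m.group(3) one by one.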
Re: pattern matching
> if I have a string such as '01/12/2011' and i want
> to reformat it as '20110112', how do i pull out the components
> of the string and reformat them into a DDMM format?
>
> I have:
>
> import re
>
> test = re.compile('\d\d\/')
> f = open('test.html') # This file contains the html dates
> for line in f:
>     if test.search(line):
>         # I need to pull the date components here

I am no Python guru, but you could use BeautifulSoup to parse the HTML, as it's much easier. Some untested pseudocode is below; adapt it to your needs.

from BeautifulSoup import BeautifulSoup

# read html data or whatever source
html_data = open('/yourwebsite/page.html', 'r').read()

# Create the soup object from the HTML data
soup = BeautifulSoup(html_data)
# Find the proper tag, see the BeautifulSoup docs
someData = soup.find('td', {'name': 'someTable'})
# the value of the 3rd attribute of the tag, just an example
value = someData.attrs[2][1]
## end

Now that you have the date in some str format, the next thing is your date conversion. For this, refer to the dateutil parser: http://labix.org/python-dateutil

Hope it helps.

-- http://mail.python.org/mailman/listinfo/python-list
Re: pattern matching
In article , Chris Rebert wrote: > regex = compile("(\d\d)/(\d\d)/(\d{4})") I would probably write that as either r"(\d{2})/(\d{2})/(\d{4})" or (somewhat less likely) r"(\d\d)/(\d\d)/(\d\d\d\d)" Keeping to one consistent style makes it a little easier to read. Also, don't forget the leading `r` to get raw strings. I've long since given up trying to remember the exact rules of what needs to get escaped and what doesn't. If it's a regex, I just automatically make it a raw string. Also, don't overlook the re.VERBOSE flag. With it, you can write positively outrageous expressions which are still quite readable. With it, you could write this regex as: r" (\d{2}) / (\d{2}) / (\d{4}) " which takes up only slightly more space, but makes it a whole lot easier to scan by eye. I'm still going to stand by my previous statement, however. If you're trying to parse HTML, use an HTML parser. Using a regex like this is perfectly fine for parsing the CDATA text inside the HTML element, but pattern matching the HTML markup itself is madness. -- http://mail.python.org/mailman/listinfo/python-list
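To illustrate the re.VERBOSE suggestion, here is a small sketch, assuming Python 3. The name DATE and the month/day/year labels in the comments are assumptions based on the sample date in the thread.

```python
import re

# In verbose mode, whitespace in the pattern is ignored and "#" starts
# a comment, so each piece can sit on its own line.
DATE = re.compile(r"""
    (\d{2})   # month
    /
    (\d{2})   # day
    /
    (\d{4})   # year
""", re.VERBOSE)

match = DATE.search("due on 01/12/2011, please")
if match:
    month, day, year = match.groups()
    print(year + month + day)
```

If a literal space or "#" is ever needed inside a verbose pattern, it has to be escaped or placed in a character class.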
Re: pattern matching
On Wed, Feb 23, 2011 at 6:37 PM, Steven D'Aprano wrote:
> On Wed, 23 Feb 2011 21:11:53 -0500, monkeys paw wrote:
>> if I have a string such as '01/12/2011' and i want to reformat
>> it as '20110112', how do i pull out the components of the string and
>> reformat them into a DDMM format?
>
> data = '<td>01/12/2011</td>'
> # Throw away tags.
> data = data[4:-5]
> # Separate components (the input is month/day/year).
> month, day, year = data.split('/')
> # Recombine.
> print(year + month + day)
>
> No need for the sledgehammer of regexes for cracking this peanut.

Agreed. But "Just 'Cause"(tm), and in order to get in some regex practice:

from re import compile

regex = compile("(\d\d)/(\d\d)/(\d{4})")
for match in regex.finditer(data):
    month, day, year = match.groups()
    print(year + month + day)

Cheers, Chris -- http://blog.rebertia.com -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern matching
On Wed, 23 Feb 2011 21:11:53 -0500, monkeys paw wrote:

> if I have a string such as '01/12/2011' and i want to reformat
> it as '20110112', how do i pull out the components of the string and
> reformat them into a DDMM format?

data = '<td>01/12/2011</td>'
# Throw away tags.
data = data[4:-5]
# Separate components (the input is month/day/year).
month, day, year = data.split('/')
# Recombine.
print(year + month + day)

No need for the sledgehammer of regexes for cracking this peanut.

-- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern matching
In article , monkeys paw wrote: > if I have a string such as '01/12/2011' and i want > to reformat it as '20110112', how do i pull out the components > of the string and reformat them into a DDMM format? > > I have: > > import re > > test = re.compile('\d\d\/') > f = open('test.html') # This file contains the html dates > for line in f: > if test.search(line): > # I need to pull the date components here My first thought is that any attempt to parse HTML by using regex is doomed to failure. HTML is meant to be parsed by an HTML parser. Python gives you several to pick from; the best that I know of is the third-party lxml package (http://lxml.de/). My second thought is that my first thought was correct. -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern matching with multiple lists
On 07/16/2010 02:20 PM, Chad Kellerman wrote: Greetings, I have some code that I wrote and know there is a better way to write it. I wonder if anyone could point me in the right direction on making this 'cleaner'. I have two lists: liveHostList = [ app11, app12, web11, web12, host11 ] stageHostList = [ web21, web22, host21, app21, app22 ] I need to pair the elements in the list such that: app11 pairs with app21 app12 pairs with app22 web11 pairs with web21 web12 pairs with web22 host11pairs with host21 While I like MRAB's solution even better than mine[1], you can also use: liveHostList = ["app11", "app12", "web11", "web12", "host11"] stageHostList = ["web21", "web22", "host21", "app21", "app22"] def bits(s): return (s[:-2],s[-1]) for live, stage in zip( sorted(liveHostList, key=bits), sorted(stageHostList, key=bits), ): print "Match: ", live, stage -tkc [1] His solution is O(N), making one pass through each list, with O(1) lookups into the created dict during the 2nd loop, while mine is likely overwhelmed by the cost of the sorts...usually O(N log N) for most reasonable sorts. However, this doesn't likely matter much until your list-sizes are fairly large. -- http://mail.python.org/mailman/listinfo/python-list
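The sort-and-zip idea above, as a Python 3 sketch. It assumes, as the post does, that every live host has exactly one stage counterpart; pair_by_sorting is a made-up name, and the length check is an addition not present in the original.

```python
def bits(name):
    """Sort key: everything but the last two characters, plus the last one."""
    return (name[:-2], name[-1])


def pair_by_sorting(live_hosts, stage_hosts):
    """Pair hosts by sorting both lists on the same key and zipping them."""
    if len(live_hosts) != len(stage_hosts):
        # zip() would silently drop the unmatched tail otherwise.
        raise ValueError("lists must be the same length")
    return list(zip(sorted(live_hosts, key=bits),
                    sorted(stage_hosts, key=bits)))


live = ["app11", "app12", "web11", "web12", "host11"]
stage = ["web21", "web22", "host21", "app21", "app22"]
for live_host, stage_host in pair_by_sorting(live, stage):
    print("Match:", live_host, stage_host)
```

The key deliberately skips the second-to-last character, since that is the digit that differs between a live host and its stage twin.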
Re: pattern matching with multiple lists
Chad Kellerman wrote: Greetings, I have some code that I wrote and know there is a better way to write it. I wonder if anyone could point me in the right direction on making this 'cleaner'. I have two lists: liveHostList = [ app11, app12, web11, web12, host11 ] stageHostList = [ web21, web22, host21, app21, app22 ] I need to pair the elements in the list such that: app11 pairs with app21 app12 pairs with app22 web11 pairs with web21 web12 pairs with web22 host11pairs with host21 each time I get the list I don't know the order, and the lists will grow over time. (hosts will be added in pairs. app13 to liveHostList and app23 to stageHostList, etc) Anyways this is what I have. I think it can be written better with map, but not sure. Any help would be appreciated. import re for liveHost in liveHostlist: nameList = list(liveHost) clone= nameList[-1] di = nameList[-2] generic = liveHost[:-2] for stageHost in stageHostList: if re.match( generic + '.' + clone, stageHost ): print "Got a pair: " + stageHost + liveHost Thanks again for any suggestions, Chad So you recognise a pair by them having the same 'key', which is: name[ : -2] + name[-1 : ] Therefore you can put one of the lists into a dict and look up the name by its key: liveHostDict = dict((liveHost[ : -2] + liveHost[-1 : ], liveHost) for liveHost in liveHostList) for stageHost in stageHostList: key = stageHost[ : -2] + stageHost[-1 : ] liveHost = liveHostDict[key] print "Got a pair: %s %s" % (stageHost, liveHost) -- http://mail.python.org/mailman/listinfo/python-list
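The dict lookup described above in Python 3 form. This is a sketch; host_key and pair_hosts are hypothetical names, and skipping stage hosts that have no live counterpart is an assumption (the original would raise KeyError instead).

```python
def host_key(name):
    """Drop the second-to-last character, e.g. 'app11' and 'app21' -> 'app1'."""
    return name[:-2] + name[-1:]


def pair_hosts(live_hosts, stage_hosts):
    """Return (live, stage) pairs, in the order of stage_hosts."""
    live_by_key = {host_key(host): host for host in live_hosts}
    pairs = []
    for stage in stage_hosts:
        live = live_by_key.get(host_key(stage))
        if live is not None:  # ignore stage hosts with no live twin
            pairs.append((live, stage))
    return pairs


live_list = ["app11", "app12", "web11", "web12", "host11"]
stage_list = ["web21", "web22", "host21", "app21", "app22"]
for live, stage in pair_hosts(live_list, stage_list):
    print("Got a pair: %s %s" % (stage, live))
```

One pass builds the dict and one pass does the lookups, so the order of either input list does not matter.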
Re: Defining re pattern for matching list of numbers
On Fri, 06 Nov 2009 10:16:31 -0800, Chris Rebert wrote: > Your format seems so simple I have to ask why you're using regexes in > the first place. Raymond Hettinger has described some computing techniques as "code prions" -- programming advice or techniques which are sometimes useful but often actively harmful. http://www.mail-archive.com/python-list%40python.org/msg262651.html As useful as regexes are, I think they qualify as code prions too: people insist on using them in production code, even when a simple string method or function would do the job far more efficiently and readably. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: ask for a RE pattern to match TABLE in html
In article <[EMAIL PROTECTED]>, Jonathan Gardner <[EMAIL PROTECTED]> wrote: > On Jun 27, 10:32 am, "David C. Ullrich" <[EMAIL PROTECTED]> wrote: > > (ii) The regexes in languages like Python and Perl include > > features that are not part of the formal CS notion of > > "regular expression". Do they include something that > > does allow parsing nested delimiters properly? > > > > In perl, there are some pretty wild extensions to the regex syntax, > features that make it much more than a regular expression engine. > > Yes, it is possible to match parentheses and other nested structures > (such as HTML), and the regex to do so isn't incredibly difficult. > Note that Python doesn't support this extension. Huh. My evidently misinformed impression was that the regexes in P and P were essentially equivalent. (I hope nobody takes that as a complaint...) > See http://www.perl.com/pub/a/2003/08/21/perlcookbook.html -- David C. Ullrich -- http://mail.python.org/mailman/listinfo/python-list
Re: ask for a RE pattern to match TABLE in html
On Jun 27, 10:32 am, "David C. Ullrich" <[EMAIL PROTECTED]> wrote: > (ii) The regexes in languages like Python and Perl include > features that are not part of the formal CS notion of > "regular expression". Do they include something that > does allow parsing nested delimiters properly? > In perl, there are some pretty wild extensions to the regex syntax, features that make it much more than a regular expression engine. Yes, it is possible to match parentheses and other nested structures (such as HTML), and the regex to do so isn't incredibly difficult. Note that Python doesn't support this extension. See http://www.perl.com/pub/a/2003/08/21/perlcookbook.html -- http://mail.python.org/mailman/listinfo/python-list
Re: ask for a RE pattern to match TABLE in html
In article <[EMAIL PROTECTED]>, Dan <[EMAIL PROTECTED]> wrote: > On Jun 27, 1:32 pm, "David C. Ullrich" <[EMAIL PROTECTED]> wrote: > > In article > > <[EMAIL PROTECTED]>, > > Jonathan Gardner <[EMAIL PROTECTED]> wrote: > > > > > On Jun 26, 3:22 pm, MRAB <[EMAIL PROTECTED]> wrote: > > > > Try something like: > > > > > > re.compile(r'.*?', re.DOTALL) > > > > > So you would pick up strings like "foo > > td>"? I doubt that is what oyster wants. > > > > I asked a question recently - nobody answered, I think > > because they assumed it was just a rhetorical question: > > > > (i) It's true, isn't it, that it's impossible for the > > formal CS notion of "regular expression" to correctly > > parse nested open/close delimiters? > > Yes. For the proof, you want to look at the pumping lemma found in > your favorite Theory of Computation textbook. Ah, thanks. Don't have a favorite text, not having any at all. But wikipedia works - what I found at http://en.wikipedia.org/wiki/Pumping_lemma_for_regular_languages was pretty clear. (Yes, it's exactly that \1, \2 stuff that convinced me I really don't understand what one can do with a Python regex.) > > > > (ii) The regexes in languages like Python and Perl include > > features that are not part of the formal CS notion of > > "regular expression". Do they include something that > > does allow parsing nested delimiters properly? > > So, I think most of the extensions fall into syntactic sugar > (certainly all the character classes \b \s \w, etc). The ability to > look at input without consuming it is more than syntactic sugar, but > my intuition is that it could be pretty easily modeled by a > nondeterministic finite state machine, which is of equivalent power to > REs. The only thing I can really think of that is completely non- > regular is the \1 \2, etc syntax to match previously match strings > exactly. But since you can't to an arbitrary number of them, I don't > think its actually context free. 
(I'm not prepared to give a proof > either way). Needless to say that even if you could, it would be > highly impractical to match parentheses using those. > > So, yeah, to match arbitrary nested delimiters, you need a real > context free parser. > > > > > -- > > David C. Ullrich > > > -Dan -- David C. Ullrich -- http://mail.python.org/mailman/listinfo/python-list
Re: ask for a RE pattern to match TABLE in html
On Jun 27, 1:32 pm, "David C. Ullrich" <[EMAIL PROTECTED]> wrote: > In article > <[EMAIL PROTECTED]>, > Jonathan Gardner <[EMAIL PROTECTED]> wrote: > > > On Jun 26, 3:22 pm, MRAB <[EMAIL PROTECTED]> wrote: > > > Try something like: > > > > re.compile(r'.*?', re.DOTALL) > > > So you would pick up strings like "foo > td>"? I doubt that is what oyster wants. > > I asked a question recently - nobody answered, I think > because they assumed it was just a rhetorical question: > > (i) It's true, isn't it, that it's impossible for the > formal CS notion of "regular expression" to correctly > parse nested open/close delimiters? Yes. For the proof, you want to look at the pumping lemma found in your favorite Theory of Computation textbook. > > (ii) The regexes in languages like Python and Perl include > features that are not part of the formal CS notion of > "regular expression". Do they include something that > does allow parsing nested delimiters properly? So, I think most of the extensions fall into syntactic sugar (certainly all the character classes \b \s \w, etc). The ability to look at input without consuming it is more than syntactic sugar, but my intuition is that it could be pretty easily modeled by a nondeterministic finite state machine, which is of equivalent power to REs. The only thing I can really think of that is completely non- regular is the \1 \2, etc syntax to match previously match strings exactly. But since you can't to an arbitrary number of them, I don't think its actually context free. (I'm not prepared to give a proof either way). Needless to say that even if you could, it would be highly impractical to match parentheses using those. So, yeah, to match arbitrary nested delimiters, you need a real context free parser. > > -- > David C. Ullrich -Dan -- http://mail.python.org/mailman/listinfo/python-list
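A small illustration of the point about nesting, as a sketch assuming Python 3. It is not a parser, just a depth counter with a made-up name (is_balanced), shown next to a non-greedy regex that stops at the first closing delimiter.

```python
import re


def is_balanced(text, open_ch="(", close_ch=")"):
    """Check nesting with a counter, which a true regular expression cannot do."""
    depth = 0
    for ch in text:
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth < 0:  # a close with no matching open
                return False
    return depth == 0


text = "(a (b) c)"
# The non-greedy pattern ends at the first ")" and splits the nested group.
naive = re.search(r"\(.*?\)", text).group(0)
print(repr(naive), is_balanced(naive), is_balanced(text))
```

The counter needs unbounded memory in principle (the depth can grow without limit), which is exactly what a finite state machine lacks.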
Re: ask for a RE pattern to match TABLE in html
In article <[EMAIL PROTECTED]>, Jonathan Gardner <[EMAIL PROTECTED]> wrote: > On Jun 26, 3:22 pm, MRAB <[EMAIL PROTECTED]> wrote: > > Try something like: > > > > re.compile(r'.*?', re.DOTALL) > > So you would pick up strings like "foo td>"? I doubt that is what oyster wants. I asked a question recently - nobody answered, I think because they assumed it was just a rhetorical question: (i) It's true, isn't it, that it's impossible for the formal CS notion of "regular expression" to correctly parse nested open/close delimiters? (ii) The regexes in languages like Python and Perl include features that are not part of the formal CS notion of "regular expression". Do they include something that does allow parsing nested delimiters properly? -- David C. Ullrich -- http://mail.python.org/mailman/listinfo/python-list
Re: ask for a RE pattern to match TABLE in html
On Jun 26, 3:22 pm, MRAB <[EMAIL PROTECTED]> wrote: > Try something like: > > re.compile(r'.*?', re.DOTALL) So you would pick up strings like "foo"? I doubt that is what oyster wants. -- http://mail.python.org/mailman/listinfo/python-list
Re: ask for a RE pattern to match TABLE in html
On Jun 26, 11:07 am, Grant Edwards <[EMAIL PROTECTED]> wrote: > On 2008-06-26, Stefan Behnel <[EMAIL PROTECTED]> wrote: > > > > Why not use an HTML parser instead? > > > > Stating it differently: in order to correctly recognize HTML > tags, you must use an HTML parser. Trying to write an HTML > parser in a single RE is probably not practical. > s/practical/possible It isn't *possible* to grok HTML with regular expressions. Individual tags--yes. But not a full element where nesting is possible. At least not properly. Maybe we need some notes on the limits of regular expressions in the re documentation for people who haven't taken the computer science courses on parsing and grammars. Then we could explain the necessity of real parsers and grammars, at least in layman's terms. -- http://mail.python.org/mailman/listinfo/python-list
Re: ask for a RE pattern to match TABLE in html
On Jun 26, 7:26 pm, "David C. Ullrich" <[EMAIL PROTECTED]> wrote: > In article <[EMAIL PROTECTED]>, > Cédric Lucantis <[EMAIL PROTECTED]> wrote: > > > > > Le Thursday 26 June 2008 15:53:06 oyster, vous avez écrit : > > > that is, there is no TABLE tag between a TABLE, for example > > > something with out table tag > > > what is the RE pattern? thanks > > > > the following is not right > > > [^table]*? > > > The construct [abc] does not match a whole word but only one char, so > > [^table] means "any char which is not t, a, b, l or e". > > > Anyway the inside table word won't match your pattern, as there are '<' > > and '>' in it, and these chars have to be escaped when used as simple text. > > So this should work: > > > re.compile(r'.*') > > ^ this is to avoid matching a tag name starting with > > table > > (like ) > > Doesn't work - for example it matches '' > (and in fact if the html contains any number of tables it's going > to match the string starting at the start of the first table and > ending at the end of the last one.) > Try something like: re.compile(r'.*?', re.DOTALL) -- http://mail.python.org/mailman/listinfo/python-list
Re: ask for a RE pattern to match TABLE in html
In article <[EMAIL PROTECTED]>, Cédric Lucantis <[EMAIL PROTECTED]> wrote: > Le Thursday 26 June 2008 15:53:06 oyster, vous avez écrit : > > that is, there is no TABLE tag between a TABLE, for example > > something with out table tag > > what is the RE pattern? thanks > > > > the following is not right > > [^table]*? > > The construct [abc] does not match a whole word but only one char, so > [^table] means "any char which is not t, a, b, l or e". > > Anyway the inside table word won't match your pattern, as there are '<' > and '>' in it, and these chars have to be escaped when used as simple text. > So this should work: > > re.compile(r'.*') > ^ this is to avoid matching a tag name starting with > table > (like ) Doesn't work - for example it matches '' (and in fact if the html contains any number of tables it's going to match the string starting at the start of the first table and ending at the end of the last one.) -- David C. Ullrich -- http://mail.python.org/mailman/listinfo/python-list
Re: ask for a RE pattern to match TABLE in html
On 2008-06-26, Stefan Behnel <[EMAIL PROTECTED]> wrote:
> oyster wrote:
>> that is, there is no TABLE tag between a TABLE, for example
>> something with out table tag
>> what is the RE pattern? thanks
>>
>> the following is not right
>> [^table]*?
>
> Why not use an HTML parser instead?

Stating it differently: in order to correctly recognize HTML tags, you must use an HTML parser. Trying to write an HTML parser in a single RE is probably not practical.

-- Grant Edwards, grante at visi.com
Yow! I want another RE-WRITE on my CEASAR SALAD!!
-- http://mail.python.org/mailman/listinfo/python-list
Re: ask for a RE pattern to match TABLE in html
oyster wrote: > that is, there is no TABLE tag between a TABLE, for example > something with out table tag > what is the RE pattern? thanks > > the following is not right > [^table]*? Why not use an HTML parser instead? Try lxml.html. http://codespeak.net/lxml/ Stefan -- http://mail.python.org/mailman/listinfo/python-list
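A minimal sketch of the parser approach, assuming the goal is "find every table that has no table nested inside it". It uses the standard-library html.parser module rather than lxml so that it runs without extra installs; the class name `InnermostTableFinder` and the helper `innermost_tables` are made up for this example.

```python
from html.parser import HTMLParser


class InnermostTableFinder(HTMLParser):
    """Collect the text of every <table> that contains no nested <table>."""

    def __init__(self):
        super().__init__()
        self.stack = []      # one [text_parts, has_nested_table] entry per open table
        self.innermost = []  # text of tables without a nested table

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "TABLE" arrives as "table".
        if tag == "table":
            if self.stack:
                self.stack[-1][1] = True  # the enclosing table has a child table
            self.stack.append([[], False])

    def handle_endtag(self, tag):
        if tag == "table" and self.stack:
            parts, has_nested = self.stack.pop()
            if not has_nested:
                self.innermost.append("".join(parts).strip())

    def handle_data(self, data):
        # Text is credited to the closest enclosing table only.
        if self.stack:
            self.stack[-1][0].append(data)


def innermost_tables(html):
    finder = InnermostTableFinder()
    finder.feed(html)
    finder.close()
    return finder.innermost


sample = (
    "<table><tr><td>outer"
    "<table><tr><td>inner</td></tr></table>"
    "</td></tr></table>"
    "<TABLE>plain</TABLE>"
)
print(innermost_tables(sample))
```

Keeping a stack of open tables is what a single regular expression cannot do, which is the point made in the replies above.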
Re: ask for a RE pattern to match TABLE in html
Le Thursday 26 June 2008 15:53:06 oyster, vous avez écrit : > that is, there is no TABLE tag between a TABLE, for example > something with out table tag > what is the RE pattern? thanks > > the following is not right > [^table]*? The construct [abc] does not match a whole word but only one char, so [^table] means "any char which is not t, a, b, l or e". Anyway the inside table word won't match your pattern, as there are '<' and '>' in it, and these chars have to be escaped when used as simple text. So this should work: re.compile(r'.*') ^ this is to avoid matching a tag name starting with table (like ) -- Cédric Lucantis -- http://mail.python.org/mailman/listinfo/python-list
ask for a RE pattern to match TABLE in html
that is, there is no TABLE tag between a TABLE, for example something with out table tag what is the RE pattern? thanks the following is not right [^table]*? -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Matching Over Python Lists
> Fair enough. To help you understand the method I used, I'll give you > this hint. It's true that regex on works on strings. However, is there > any way to convert arbitrarily complex data structures to string > representations? You don't need to be an experienced Python user to > answer to this ;) As Paddy noted before, your solution has a problem: regexes can't match nested parentheses, so I think your method will fail on nested lists unless your actual inputs are much simpler than the general case. Eli -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Matching Over Python Lists
On Jun 19, 9:03 pm, John Machin <[EMAIL PROTECTED]> wrote: > On Jun 20, 10:45 am, Chris <[EMAIL PROTECTED]> wrote: > > > On Jun 17, 1:09 pm, [EMAIL PROTECTED] wrote: > > > > Kirk Strauser: > > > > > Hint: recursion. Your general algorithm will be something like: > > > > Another solution is to use a better (different) language, that has > > > built-in pattern matching, or allows to create one. > > > > Bye, > > > bearophile > > > Btw, Python's stdlib includes a regular expression library. I'm not > > sure if you're trolling or simply unaware of it, but I've found it > > quite adequate for most tasks. > > Kindly consider a third possibility: bearophile is an experienced > Python user, has not to my knowledge exhibited any troll-like > behaviour in the past, and given that you seem to be happy using the > re module not on strings but on lists of integers, may have been > wondering whether *you* were trolling or just plain confused but just > too polite to wonder out loud :-) Fair enough. To help you understand the method I used, I'll give you this hint. It's true that regex on works on strings. However, is there any way to convert arbitrarily complex data structures to string representations? You don't need to be an experienced Python user to answer to this ;) -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Matching Over Python Lists
On Jun 20, 1:45 am, Chris <[EMAIL PROTECTED]> wrote: > On Jun 17, 1:09 pm, [EMAIL PROTECTED] wrote: > > > Kirk Strauser: > > > > Hint: recursion. Your general algorithm will be something like: > > > Another solution is to use a better (different) language, that has > > built-in pattern matching, or allows to create one. > > > Bye, > > bearophile > > Btw, Python's stdlib includes a regular expression library. I'm not > sure if you're trolling or simply unaware of it, but I've found it > quite adequate for most tasks. bearophile was talking about matching lists and tuples, not matching strings. Python's regular expression module works with characters in strings, but the same approach can be applied to items in lists and tuples. -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Matching Over Python Lists
On Jun 20, 1:44 am, Chris <[EMAIL PROTECTED]> wrote: > Thanks for your help. Those weren't quite what I was looking for, but > I ended up figuring it out on my own. Turns out you can actually > search nested Python lists using simple regular expressions. Strange? How do you match nested '[' ... ']' brackets? - Paddy. -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Matching Over Python Lists
On Jun 20, 10:45 am, Chris <[EMAIL PROTECTED]> wrote: > On Jun 17, 1:09 pm, [EMAIL PROTECTED] wrote: > > > Kirk Strauser: > > > > Hint: recursion. Your general algorithm will be something like: > > > Another solution is to use a better (different) language, that has > > built-in pattern matching, or allows to create one. > > > Bye, > > bearophile > > Btw, Python's stdlib includes a regular expression library. I'm not > sure if you're trolling or simply unaware of it, but I've found it > quite adequate for most tasks. Kindly consider a third possibility: bearophile is an experienced Python user, has not to my knowledge exhibited any troll-like behaviour in the past, and given that you seem to be happy using the re module not on strings but on lists of integers, may have been wondering whether *you* were trolling or just plain confused but just too polite to wonder out loud :-) -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Matching Over Python Lists
On Jun 17, 1:09 pm, [EMAIL PROTECTED] wrote: > Kirk Strauser: > > > Hint: recursion. Your general algorithm will be something like: > > Another solution is to use a better (different) language, that has > built-in pattern matching, or allows to create one. > > Bye, > bearophile Btw, Python's stdlib includes a regular expression library. I'm not sure if you're trolling or simply unaware of it, but I've found it quite adequate for most tasks. -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Matching Over Python Lists
Thanks for your help. Those weren't quite what I was looking for, but I ended up figuring it out on my own. Turns out you can actually search nested Python lists using simple regular expressions. -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Matching Over Python Lists
Kirk Strauser: > Hint: recursion. Your general algorithm will be something like: Another solution is to use a better (different) language, that has built-in pattern matching, or allows to create one. Bye, bearophile -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Matching Over Python Lists
At 2008-06-17T05:55:52Z, Chris <[EMAIL PROTECTED]> writes: > Is anyone aware of any prior work done with searching or matching a > pattern over nested Python lists? I have this problem where I have a > list like: > > [1, 2, [1, 2, [1, 7], 9, 9], 10] > > and I'd like to search for the pattern [1, 2, ANY] so that is returns: > > [1, 2, [1, 2, [6, 7], 9, 9], 10] > [1, 2, [6, 7], 9, 9] Hint: recursion. Your general algorithm will be something like: def compare(list, function): if function(list): print list for item in list: if item is a list: compare(item, function) def check(list): if list starts with [1, 2] and length of the list > 2: return True else: return False -- Kirk Strauser The Day Companies -- http://mail.python.org/mailman/listinfo/python-list
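The hint above is pseudocode. A runnable sketch of the same recursive idea might look like the following; the names `find_matching` and `starts_with_1_2` are invented here, and it returns the matches instead of printing them. It assumes the data contains only lists and scalars.

```python
def find_matching(nested, predicate):
    """Return every list (at any depth) for which predicate(list) is true."""
    matches = []
    if predicate(nested):
        matches.append(nested)
    for item in nested:
        if isinstance(item, list):
            # Recurse into sub-lists; scalars are skipped.
            matches.extend(find_matching(item, predicate))
    return matches


def starts_with_1_2(lst):
    """True for lists that begin with 1, 2 and have at least one more item."""
    return len(lst) > 2 and lst[:2] == [1, 2]


data = [1, 2, [1, 2, [6, 7], 9, 9], 10]
for match in find_matching(data, starts_with_1_2):
    print(match)
```

Passing the predicate as a function keeps the traversal separate from the pattern, so other patterns only need a different predicate.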
Re: Tips Re Pattern Matching / REGEX
Hello, > I have a large text file (1GB or so) with structure similar to the > html example below. > > I have to extract content (text between div and tr tags) from this > file and put it into a spreadsheet or a database - given my limited > python knowledge I was going to try to do this with regex pattern > matching. > > Would someone be able to provide pointers regarding how do I approach > this? Any code samples would be greatly appreciated. The ultimate tool for handling HTML is http://www.crummy.com/software/BeautifulSoup/ where you can do stuff like: soup = BeautifulSoup(html) for div in soup("div", {"class" : "special"}): ... Not sure how fast it is though. There is also the htmllib module that comes with python, it might do the work as well and maybe a bit faster. If the file is valid HTML and you need some speed, have a look at xml.sax. HTH, -- Miki <[EMAIL PROTECTED]> http://pythonwise.blogspot.com -- http://mail.python.org/mailman/listinfo/python-list
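For a file of that size, one option is to feed the parser in chunks so the whole file never sits in memory. This is only a sketch using the standard-library html.parser; the sample markup in the original post was stripped by the archive, so the assumption here is that the wanted text sits inside top-level div elements. `DivTextCollector` and `extract_div_text` are hypothetical names.

```python
import io
from html.parser import HTMLParser


class DivTextCollector(HTMLParser):
    """Collect the whitespace-normalised text of each top-level <div>."""

    def __init__(self):
        super().__init__()
        self.depth = 0     # how many <div> elements are currently open
        self.current = []  # text pieces of the div being read
        self.items = []    # finished div texts

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                text = " ".join("".join(self.current).split())
                self.items.append(text)
                self.current = []

    def handle_data(self, data):
        if self.depth:
            self.current.append(data)


def extract_div_text(stream, chunk_size=64 * 1024):
    """Read the stream chunk by chunk and return the text of each div."""
    parser = DivTextCollector()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)  # the parser buffers incomplete tags between calls
    parser.close()
    return parser.items


sample = io.StringIO(
    "<div>Item1 <b>123</b></div><p>skip</p><div>Text1:\nWhat do I do</div>"
)
print(extract_div_text(sample, chunk_size=7))
```

For a real file, open it with `open(path, encoding=...)` and pass the file object as `stream`; each collected item could then be written to a csv file or database row as it is produced instead of being kept in a list.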
Tips Re Pattern Matching / REGEX
Hello Python Community, I have a large text file (1GB or so) with structure similar to the html example below. I have to extract content (text between div and tr tags) from this file and put it into a spreadsheet or a database - given my limited python knowledge I was going to try to do this with regex pattern matching. Would someone be able to provide pointers regarding how do I approach this? Any code samples would be greatly appreciated. Thanks. Sam \\ there are hundreds of thousands of items \\Item1 123 Text1: What do I do with these lines That span several rows? ... Foot \\Item2 First Line Can go here But the second line can go here ... Foot Can span Over several pages \\Item3 First Line Can go here But the second line can go here ... This can Span several rows -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern combinations
On Sep 17, 3:11 pm, "Shawn Milochik" <[EMAIL PROTECTED]> wrote: > On 9/17/07, dorje tarap <[EMAIL PROTECTED]> wrote: > > > > > Hi all, > > > Given some patterns such as "...t...s." I need to make all possible > > combinations given a separate list for each position. The length of the > > pattern is fixed to 9, so thankfully that reduces a bit of the complexity. > > > For example I have the following: > > > pos1 = ['a',' t'] > > pos2 = ['r', 's'] > > pos3 = ['n', 'f'] > > > So if the pattern contains a '.' character at position 1 it could be 'a' or > > 't'. For the pattern '.s.' (length of 3 as example) all combinations would > > be: > > > asn > > asf > > tsn > > tsf > > > Thanks > > -- > >http://mail.python.org/mailman/listinfo/python-list > > Sounds like homework to me. Checkout http://probstat.sf.net/ it will sort you out quick. -- http://mail.python.org/mailman/listinfo/python-list
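For the record, the standard library can also do this without an extra package. A minimal sketch using itertools.product, assuming one list of candidate letters per position and treating any character other than '.' as fixed; `expand_pattern` is a name made up for the example.

```python
from itertools import product


def expand_pattern(pattern, choices_by_position):
    """Expand every '.' in pattern using the choices for that position."""
    pools = [
        choices_by_position[i] if ch == "." else [ch]
        for i, ch in enumerate(pattern)
    ]
    # product() yields one tuple per combination, in left-to-right order.
    return ["".join(combo) for combo in product(*pools)]


choices = [["a", "t"], ["r", "s"], ["n", "f"]]
for word in expand_pattern(".s.", choices):
    print(word)
```

The same function works for the nine-character patterns in the question as long as `choices_by_position` has nine entries.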
Re: pattern combinations
On 9/17/07, dorje tarap <[EMAIL PROTECTED]> wrote: > Hi all, > > Given some patterns such as "...t...s." I need to make all possible > combinations given a separate list for each position. The length of the > pattern is fixed to 9, so thankfully that reduces a bit of the complexity. > > For example I have the following: > > pos1 = ['a',' t'] > pos2 = ['r', 's'] > pos3 = ['n', 'f'] > > So if the pattern contains a '.' character at position 1 it could be 'a' or > 't'. For the pattern '.s.' (length of 3 as example) all combinations would > be: > > asn > asf > tsn > tsf > > Thanks > -- > http://mail.python.org/mailman/listinfo/python-list > Sounds like homework to me. -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern for error checking easiest-first?
On 20 Aug, 18:01, [EMAIL PROTECTED] wrote:
> The problem is that code like this does error checking backwards. A
> call to NetworkedThing.changeMe will first do a slow error check and
> then a fast one. Obviously there are various ways to get around this -
> either have the subclass explicitly ask the superclass to error check
> first, or vice totally versa. Is there some accepted pattern/idiom for
> handling this issue?

What about this:

    class AbstractThing():
        def changeMe(self, blah):
            self.verify_blah(blah)
            self.blah = blah

        def verify_blah(self, blah):
            if blah < 1:
                raise MyException

    class NetworkedThing(AbstractThing):
        def verify_blah(self, blah):
            # cheap check first, slow network check second
            AbstractThing.verify_blah(self, blah)
            if blah > self.getUpperLimitOverTheNetworkSlowly():
                raise MyOtherException

That is, it's the verify step that is overridden/enhanced; the changeMe method stays the same.

--
Gabriel Genellina
-- http://mail.python.org/mailman/listinfo/python-list
Re: pattern match !
On Jul 11, 9:29 pm, Helmut Jarausch <[EMAIL PROTECTED]> wrote: > import re > P=re.compile(r'(\w+(?:[-.]\d+)+)-RHEL3-Linux\.RPM') > S="hpsmh-1.1.1.2-0-RHEL3-Linux.RPM" > PO= P.match(S) > if PO : >print PO.group(1) Isn't a regexp overkill here when this will do: head = filename[:filename.index('-RHEL3')] Of course if you need to make it more generic (as in Jay's solution below), re is the way to go. -- http://mail.python.org/mailman/listinfo/python-list
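A small sketch of the slicing idea, for comparison with the regex versions in this thread. It assumes the platform part always starts with "-RHEL"; the function name `strip_platform_suffix` is invented here, and str.find is used instead of str.index so that a name without the marker comes back unchanged rather than raising ValueError.

```python
def strip_platform_suffix(filename, marker="-RHEL"):
    """Return the part of filename before the first occurrence of marker."""
    idx = filename.find(marker)
    # find() returns -1 when the marker is absent; keep the name as it is then.
    return filename if idx == -1 else filename[:idx]


for name in ("hpsmh-1.1.1.2-0-RHEL3-Linux.RPM", "hpsmh-1.1.1.2-RHEL3-Linux.RPM"):
    print(strip_platform_suffix(name))
```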
Re: pattern match !
[EMAIL PROTECTED] wrote: >> A slightly more generic match in case your package names turn out to be less >> consistent than given in the test cases: >> >> #!/usr/bin/python >> >> import re >> pattern = re.compile(r'(\w+?-(\d+[\.-])+\d+?)-\D+.*RPM') >> pkgnames = ["hpsmh-1.1.1.2-0-RHEL3-Linux.RPM", >> "hpsmh-1.1.1.2-RHEL3-Linux.RPM"] >> for pkg in pkgnames: >> matchObj = pattern.search(pkg) >> if matchObj: >> print matchObj.group(1) >> >> Still assumes it will end in RPM (all caps), but if you add the flag "re.I" >> to the re.compile() call, it will match case-insensitive. >> >> Hope that helps, >> >> -Jay > > How about if i had something like 1-3 words in the application name: > websphere-pk543-1.1.4.2-1-RHEL3-i386.rpm (in this case are 2 words)? Try this instead then: #!/usr/bin/python import re pattern = re.compile(r'((\w+?-)+?(\d+[\.-])+\d+?)-\D+.*RPM', re.I) pkgnames = ["hpsmh-1.1.1.2-0-RHEL3-Linux.RPM", "hpsmh-1.1.1.2-RHEL3-Linux.RPM", "websphere-pk543-1.1.4.2-1-RHEL3-i386.rpm"] for pkg in pkgnames: matchObj = pattern.search(pkg) if matchObj: print matchObj.group(1) -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern match !
Helmut Jarausch wrote: > [EMAIL PROTECTED] wrote: >> Extract the application name with version from an RPM string like >> hpsmh-1.1.1.2-0-RHEL3-Linux.RPM, i require to extract hpsmh-1.1.1.2-0 >> from above string. Sometimes the RPM string may be hpsmh-1.1.1.2-RHEL3- >> Linux.RPM. >> > > Have a try with > > import re > P=re.compile(r'(\w+(?:[-.]\d+)+)-RHEL3-Linux\.RPM') > S="hpsmh-1.1.1.2-0-RHEL3-Linux.RPM" > PO= P.match(S) > if PO : >print PO.group(1) A slightly more generic match in case your package names turn out to be less consistent than given in the test cases: #!/usr/bin/python import re pattern = re.compile(r'(\w+?-(\d+[\.-])+\d+?)-\D+.*RPM') pkgnames = ["hpsmh-1.1.1.2-0-RHEL3-Linux.RPM", "hpsmh-1.1.1.2-RHEL3-Linux.RPM"] for pkg in pkgnames: matchObj = pattern.search(pkg) if matchObj: print matchObj.group(1) Still assumes it will end in RPM (all caps), but if you add the flag "re.I" to the re.compile() call, it will match case-insensitive. Hope that helps, -Jay -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern match !
[EMAIL PROTECTED] wrote: > Extract the application name with version from an RPM string like > hpsmh-1.1.1.2-0-RHEL3-Linux.RPM, i require to extract hpsmh-1.1.1.2-0 > from above string. Sometimes the RPM string may be hpsmh-1.1.1.2-RHEL3- > Linux.RPM. > Have a try with import re P=re.compile(r'(\w+(?:[-.]\d+)+)-RHEL3-Linux\.RPM') S="hpsmh-1.1.1.2-0-RHEL3-Linux.RPM" PO= P.match(S) if PO : print PO.group(1) -- Helmut Jarausch Lehrstuhl fuer Numerische Mathematik RWTH - Aachen University D 52056 Aachen, Germany -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern match !
On Jul 11, 1:40 pm, [EMAIL PROTECTED] wrote: > Extract the application name with version from an RPM string like > hpsmh-1.1.1.2-0-RHEL3-Linux.RPM, i require to extract hpsmh-1.1.1.2-0 > from above string. Sometimes the RPM string may be hpsmh-1.1.1.2-RHEL3- > Linux.RPM. Since list-like slicing and indexing work on strings, why not just slice the string, using .index to locate '-RHEL'? -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern match !
On Wed, 11 Jul 2007 03:40:06 +, hari.siri74 wrote: > Extract the application name with version from an RPM string like > hpsmh-1.1.1.2-0-RHEL3-Linux.RPM, i require to extract hpsmh-1.1.1.2-0 > from above string. Sometimes the RPM string may be hpsmh-1.1.1.2-RHEL3- > Linux.RPM. Thank you for sharing. The answer to your problem is here: http://tinyurl.com/anel -- Steven. -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Classification Frameworks?
Hello Evan, > What frameworks are there available for doing pattern classification? > ... Two Bayesian classifiers are SpamBayes (http://spambayes.sf.net) and Reverend Thomas (http://www.divmod.org/projects/reverend). IMO the latter will be easier to play with. > Also, as a sidenote, are there any texts that anyone can recommend to > me for learning more about this area? A good book about NLP is http://nlp.stanford.edu/fsnlp/ which has a chapter about text classification. http://www.cs.cmu.edu/~tom/mlbook.html has some good coverage of the subject as well. HTH. -- Miki Tebeka <[EMAIL PROTECTED]> http://pythonwise.blogspot.com -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Classification Frameworks?
On 6/12/07, Steven Bethard <[EMAIL PROTECTED]> wrote: > In fact, a wide variety of classifiers are used in text classification, > including Bayesian approaches, support vector machines, conditional > random fields, etc. > > > Are there any other frameworks I should be aware of? > > I have used (but not recently) Orange: > > http://www.ailab.si/orange > > I haven't used, but have been meaning to try, PyML: > > http://pyml.sourceforge.net/ > > A more recent addition (whose documentation needs work) is: > > http://montepython.sourceforge.net/ > > And here's a Summer of Code project to build an ML library: > > http://projects.scipy.org/scipy/scipy/wiki/MachineLearning > > These are all general-purpose machine learning frameworks. So they can > be applied to pretty much any classification problem (including the text > classification problems you're looking at). You just need to pick out a > set of relevant features to describe your data, and feed those features > along with your chosen labels to a machine learning algorithm. > > STeVe Thanks Steven (and Diez), the projects you pointed me to look like great places to start. -- Evan Klitzke <[EMAIL PROTECTED]> -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Classification Frameworks?
Evan Klitzke wrote: > What frameworks are there available for doing pattern classification? > I'm generally interested in the problem of mapping some sort of input > to one or more categories. For example, I want to be able to solve > problems like taking text and applying one or more tags to it like > "romance", "horror", "poetry", etc. This isn't really my research > specialty, but my understanding is that Bayesian classifiers are > generally used for problems like this. In fact, a wide variety of classifiers are used in text classification, including Bayesian approaches, support vector machines, conditional random fields, etc. > Are there any other frameworks I should be aware of? I have used (but not recently) Orange: http://www.ailab.si/orange I haven't used, but have been meaning to try, PyML: http://pyml.sourceforge.net/ A more recent addition (whose documentation needs work) is: http://montepython.sourceforge.net/ And here's a Summer of Code project to build an ML library: http://projects.scipy.org/scipy/scipy/wiki/MachineLearning These are all general-purpose machine learning frameworks. So they can be applied to pretty much any classification problem (including the text classification problems you're looking at). You just need to pick out a set of relevant features to describe your data, and feed those features along with your chosen labels to a machine learning algorithm. STeVe -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern Classification Frameworks?
Evan Klitzke wrote: > Hi all, > > What frameworks are there available for doing pattern classification? > I'm generally interested in the problem of mapping some sort of input > to one or more categories. For example, I want to be able to solve > problems like taking text and applying one or more tags to it like > "romance", "horror", "poetry", etc. This isn't really my research > specialty, but my understanding is that Bayesian classifiers are > generally used for problems like this. I've had CRM114 recommended to > me, but as far as I can tell there aren't any python bindings for > this. I've used the CRM114 classifier from Python. It wasn't too hard to come up with a simple wrapper that only needs the crm114 binary somewhere. The rest was dealt with in Python. So if CRM114 fits your needs functionality-wise, you should go for it. Diez -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern search
Hi Paul, Paul McGuire schrieb am 03/27/2007 07:19 PM: > On Mar 27, 3:13 pm, Fabian Braennstroem <[EMAIL PROTECTED]> wrote: >> Hi to all, >> >> Wojciech Mu?a schrieb am 03/27/2007 03:34 PM: >> >>> Fabian Braennstroem wrote: Now, I would like to improve it by searching for different 'real' patterns just like using 'ls' in bash. E.g. the entry 'car*.pdf' should select all pdf files with a beginning 'car'. Does anyone have an idea, how to do it? >>> Use module glob. >> Thanks for your help! glob works pretty good, except that I just >> deleted all my lastet pdf files :-( >> >> Greetings! >> Fabian > > Then I shudder to think what might have happened if you had used > re's! :) A different feature it had was to copy the whole home-partition (about 19G) into one of its own directories ... the strange thing: it just needed seconds to do that and I did not have the permission to all files and directories! It was pretty strange! Hopefully it was no security bug in python... Greetings! Fabian -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern search
Hi, Gabriel Genellina wrote on 03/27/2007 10:09 PM: > On Tue, 27 Mar 2007 18:42:15 -0300, Diez B. Roggisch <[EMAIL PROTECTED]> > wrote: > >> Paul McGuire wrote: >>> On Mar 27, 10:18 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: Fabian Braennstroem wrote: > while iter: > value = model.get_value(iter, 1) > if value.endswith("."+ pattern): [...] > > Now, I would like to improve it by searching for different 'real' > patterns just like using 'ls' in bash. E.g. the entry > 'car*.pdf' should select all pdf files with a beginning 'car'. > Does anyone have an idea, how to do it? > Use regular expressions. They are part of the module "re". And if you use them, ditch your code above, and make it just search for a pattern all the time. Because the above is just the case of *.ext > >>> The glob module is a more direct tool based on the OP's example. The >>> example he gives works directly with glob. To use re, you'd have to >>> convert to something like "car.*\.pdf", yes? > >> I'm aware of the glob-module. But it only works on files. I was under >> the impression that he already has a list of files he wants to filter >> instead of getting it fresh from the filesystem. > > In that case the best way would be to use the fnmatch module - it already > knows how to translate from car*.pdf into the right regexp. (The glob > module is like a combo os.listdir+fnmatch.filter) I already have a list, but 'glob' looked so easy ... maybe it is faster to use fnmatch. When I have time I will try it out... Thanks! Fabian -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern search
En Tue, 27 Mar 2007 18:42:15 -0300, Diez B. Roggisch <[EMAIL PROTECTED]> escribió: > Paul McGuire schrieb: >> On Mar 27, 10:18 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: >>> Fabian Braennstroem wrote: while iter: value = model.get_value(iter, 1) if value.endswith("."+ pattern): [...] Now, I would like to improve it by searching for different 'real' patterns just like using 'ls' in bash. E.g. the entry 'car*.pdf' should select all pdf files with a beginning 'car'. Does anyone have an idea, how to do it? >>> Use regular expressions. They are part of the module "re". And if you >>> use them, ditch your code above, and make it just search for a pattern >>> all the time. Because the above is just the case of >>> *.ext >> The glob module is a more direct tool based on the OP's example. The >> example he gives works directly with glob. To use re, you'd have to >> convert to something like "car.*\.pdf", yes? > I'm aware of the glob-module. But it only works on files. I was under > the impression that he already has a list of files he wants to filter > instead of getting it fresh from the filesystem. In that case the best way would be to use the fnmatch module - it already knows how to translate from car*.pdf into the right regexp. (The glob module is like a combo os.listdir+fnmatch.filter) -- Gabriel Genellina -- http://mail.python.org/mailman/listinfo/python-list
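A short sketch of the fnmatch suggestion, assuming the file names are already held in a Python list (as in the treeview case) rather than read from disk. `select_names` is a made-up helper; it uses fnmatch.fnmatchcase so the comparison is case-sensitive on every platform, whereas fnmatch.filter follows the operating system's case rules.

```python
import fnmatch


def select_names(names, pattern):
    """Return the names that match a shell-style pattern such as 'car*.pdf'."""
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


names = ["car_manual.pdf", "cart.txt", "bike.pdf", "car.pdf"]
print(select_names(names, "car*.pdf"))
print(select_names(names, "*.pdf"))
```

If a compiled regular expression is needed instead, fnmatch.translate(pattern) returns the equivalent regex source as a string.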
Re: pattern search
Paul McGuire schrieb: > On Mar 27, 10:18 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: >> Fabian Braennstroem wrote: >>> Hi, >>> I wrote a small gtk file manager, which works pretty well. Until >>> now, I am able to select different file (treeview entries) just by >>> extension (done with 'endswith'). See the little part below: >>> self.pathlist1=[ ] >>> self.patternlist=[ ] >>> while iter: >>> #print iter >>> value = model.get_value(iter, 1) >>> #if value is what I'm looking for: >>> if value.endswith("."+ pattern): >>> selection.select_iter(iter) >>> selection.select_path(n) >>> self.pathlist1.append(n) >>> self.patternlist.append(value) >>> iter = model.iter_next(iter) >>> #print value >>> n=n+1 >>> Now, I would like to improve it by searching for different 'real' >>> patterns just like using 'ls' in bash. E.g. the entry >>> 'car*.pdf' should select all pdf files with a beginning 'car'. >>> Does anyone have an idea, how to do it? >> Use regular expressions. They are part of the module "re". And if you use >> them, ditch your code above, and make it just search for a pattern all the >> time. Because the above is just the case of >> >> *.ext >> >> Diez- Hide quoted text - >> >> - Show quoted text - > > The glob module is a more direct tool based on the OP's example. The > example he gives works directly with glob. To use re, you'd have to > convert to something like "car.*\.pdf", yes? > > (Of course, re offers much more power than simple globbing. Not clear > how much more the OP was looking for.) I'm aware of the glob-module. But it only works on files. I was under the impression that he already has a list of files he wants to filter instead of getting it fresh from the filesystem. Diez -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern search
On Mar 27, 3:13 pm, Fabian Braennstroem <[EMAIL PROTECTED]> wrote: > Hi to all, > > Wojciech Mu?a schrieb am 03/27/2007 03:34 PM: > > > Fabian Braennstroem wrote: > >> Now, I would like to improve it by searching for different 'real' > >> patterns just like using 'ls' in bash. E.g. the entry > >> 'car*.pdf' should select all pdf files with a beginning 'car'. > >> Does anyone have an idea, how to do it? > > > Use module glob. > > Thanks for your help! glob works pretty good, except that I just > deleted all my lastet pdf files :-( > > Greetings! > Fabian Then I shudder to think what might have happened if you had used re's! :) -- Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern search
On Mar 27, 10:18 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: > Fabian Braennstroem wrote: > > Hi, > > > I wrote a small gtk file manager, which works pretty well. Until > > now, I am able to select different file (treeview entries) just by > > extension (done with 'endswith'). See the little part below: > > > self.pathlist1=[ ] > > self.patternlist=[ ] > > while iter: > > #print iter > > value = model.get_value(iter, 1) > > #if value is what I'm looking for: > > if value.endswith("."+ pattern): > > selection.select_iter(iter) > > selection.select_path(n) > > self.pathlist1.append(n) > > self.patternlist.append(value) > > iter = model.iter_next(iter) > > #print value > > n=n+1 > > > Now, I would like to improve it by searching for different 'real' > > patterns just like using 'ls' in bash. E.g. the entry > > 'car*.pdf' should select all pdf files with a beginning 'car'. > > Does anyone have an idea, how to do it? > > Use regular expressions. They are part of the module "re". And if you use > them, ditch your code above, and make it just search for a pattern all the > time. Because the above is just the case of > > *.ext > > Diez- Hide quoted text - > > - Show quoted text - The glob module is a more direct tool based on the OP's example. The example he gives works directly with glob. To use re, you'd have to convert to something like "car.*\.pdf", yes? (Of course, re offers much more power than simple globbing. Not clear how much more the OP was looking for.) -- Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern search
Hi to all, Wojciech Muła wrote on 03/27/2007 03:34 PM: > Fabian Braennstroem wrote: >> Now, I would like to improve it by searching for different 'real' >> patterns just like using 'ls' in bash. E.g. the entry >> 'car*.pdf' should select all pdf files with a beginning 'car'. >> Does anyone have an idea, how to do it? > > Use module glob. Thanks for your help! glob works pretty well, except that I just deleted all my latest PDF files :-( Greetings! Fabian -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern search
Fabian Braennstroem wrote: > Now, I would like to improve it by searching for different 'real' > patterns just like using 'ls' in bash. E.g. the entry > 'car*.pdf' should select all pdf files with a beginning 'car'. > Does anyone have an idea, how to do it? Use module glob. -- http://mail.python.org/mailman/listinfo/python-list
Re: pattern search
Fabian Braennstroem wrote: > Hi, > > I wrote a small gtk file manager, which works pretty well. Until > now, I am able to select different file (treeview entries) just by > extension (done with 'endswith'). See the little part below: > > self.pathlist1=[ ] > self.patternlist=[ ] > while iter: > #print iter > value = model.get_value(iter, 1) > #if value is what I'm looking for: > if value.endswith("."+ pattern): > selection.select_iter(iter) > selection.select_path(n) > self.pathlist1.append(n) > self.patternlist.append(value) > iter = model.iter_next(iter) > #print value > n=n+1 > > Now, I would like to improve it by searching for different 'real' > patterns just like using 'ls' in bash. E.g. the entry > 'car*.pdf' should select all pdf files with a beginning 'car'. > Does anyone have an idea, how to do it? Use regular expressions. They are part of the module "re". And if you use them, ditch your code above, and make it just search for a pattern all the time. Because the above is just the case of *.ext Diez -- http://mail.python.org/mailman/listinfo/python-list
Re: Pattern for foo tool <-> API <-> shell|GUI
On Sunday 25 March 2007 16:44, Steven Bethard wrote:
> Anastasios Hatzis wrote:
> > I'm working on a tool which is totally command-line based and consisting
> > of multiple scripts. The user can execute a Python script in the shell,
> > this script does some basic verification before delegating a call into my
> > tool's package and depending on some arguments and options provided in
> > the command-line, e.g.
> > $python generate.py myproject --force --verbose
> > the tool processes whatever necessary. There are multiple command
> > handlers available in this package which are responsible for different
> > tasks and depending of the script that has been executed one or more of
> > these command handlers are fired to do their work ;)
>
> Side note: you might find argparse (http://argparse.python-hosting.com/)
> makes this a bit easier if you have positional arguments or sub-commands::

Steve, thank you for the note. I didn't know argparse before. I have
multiple scripts because optparse puts all arguments and options into one
single help text, and the arguments and options are too specific for most
commands (so the help would be absolutely overloaded and useless for new
users). As far as I understand the page, argparse has a separate help page
for each sub-command.

> > And I don't think that this is very trivial (at least not for my
> > programming skill level). In the given example "generate.py" (above) the
> > following scenario is pretty likely:
> >
> > (1) User works with UML tool and clicks in some dialog a "generate"
> > button
> > (2) UML tool triggers this event an calls a magic generate()
> > method of my tool (via the API I provide for this purpose), like my
> > generate.py script would do same way
> > (3) Somewhen with-in this generate process my tool may need to get some
> > information from the user in order to continue (it is in the nature of
> > the features that I can't avoid this need of interaction in any case).
>
> So you're imagining an API something like::
>
>     def generate(name,
>                  force=False,
>                  verbose=False,
>                  handler=command_line_handler):
>         ...
>         choice = handler.prompt_user(question_text, user_choices)
>         ...
>
> where the command-line handler might look something like::
>
>     class CommandLineHandler(object):
>         ...
>         def prompt_user(self, question_text, user_choices):
>             while True:
>                 choice = raw_input(question_text)
>                 if choice in user_choices:
>                     return choice
>                 print 'invalid choice, choose from %s' % user_choices
>
> and the GUI client would implement the equivalent thing with dialogs?

Exactly. Now, looking at your example, I wonder whether this would work
with a GUI which is event-driven... I have to look into my wx GUI
prototype.

> That seems basically reasonable to me, though you should be clear in the
> documentation of generate() -- and any other methods that accept handler
> objects -- exactly what methods the handler must provide.
>
> You also may find that "prompt_user" is a bit too generic -- e.g. a file
> chooser dialog looks a lot different from a color chooser dialog -- so
> you may need to split this up into "prompt_user_file",
> "prompt_user_color", etc. so that handler's don't have to introspect the
> question text to know what to do...
>
> STeVe

Hey, right, good idea. I didn't think about the different task-specific
dialogs in most GUIs. But I see that usability will benefit from
differentiated "prompt" methods.

Anastasios
--
http://mail.python.org/mailman/listinfo/python-list
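On the point about separate help pages per sub-command, here is a minimal sketch using the argparse module as it ships in today's standard library (not necessarily identical to the 2007 release linked above). The program name "foo" and the sub-commands "generate" and "prefs" are made up for illustration; only the argparse calls themselves are real.

```python
import argparse


def build_parser():
    """Build a parser with two hypothetical sub-commands."""
    parser = argparse.ArgumentParser(prog="foo", description="Hypothetical foo tool.")
    subparsers = parser.add_subparsers(dest="command")

    # Each sub-command gets its own parser, and therefore its own help text.
    generate = subparsers.add_parser("generate", help="generate code for a project")
    generate.add_argument("name", help="project name")
    generate.add_argument("--force", action="store_true", help="overwrite existing output")
    generate.add_argument("--verbose", action="store_true", help="print progress details")

    prefs = subparsers.add_parser("prefs", help="modify user-specific preferences")
    prefs.add_argument("--editor", help="preferred editor")

    return parser, {"generate": generate, "prefs": prefs}


parser, commands = build_parser()

# Equivalent to running: foo generate myproject --force
args = parser.parse_args(["generate", "myproject", "--force"])

# The help text of one sub-command does not mention the other's options.
generate_help = commands["generate"].format_help()
prefs_help = commands["prefs"].format_help()
```

On the command line, "foo generate -h" would print only the options of "generate", while "foo -h" lists the available sub-commands.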
Re: Pattern for foo tool <-> API <-> shell|GUI
Anastasios Hatzis wrote:
> I'm working on a tool which is totally command-line based and consisting of
> multiple scripts. The user can execute a Python script in the shell, this
> script does some basic verification before delegating a call into my tool's
> package and depending on some arguments and options provided in the
> command-line, e.g.
> $python generate.py myproject --force --verbose
> the tool processes whatever necessary. There are multiple command handlers
> available in this package which are responsible for different tasks and
> depending of the script that has been executed one or more of these command
> handlers are fired to do their work ;)

Side note: you might find argparse (http://argparse.python-hosting.com/)
makes this a bit easier if you have positional arguments or sub-commands::

    >>> import argparse
    >>> parser = argparse.ArgumentParser()
    >>> parser.add_argument('name')
    >>> parser.add_argument('--force', action='store_true')
    >>> parser.add_argument('--verbose', action='store_true')
    >>> parser.parse_args(['my_project', '--force', '--verbose'])
    Namespace(force=True, name='my_project', verbose=True)

    >>> parser = argparse.ArgumentParser()
    >>> subparsers = parser.add_subparsers()
    >>> cmd1_parser = subparsers.add_parser('cmd1')
    >>> cmd1_parser.add_argument('--foo')
    >>> cmd2_parser = subparsers.add_parser('cmd2')
    >>> cmd2_parser.add_argument('bar')
    >>> parser.parse_args(['cmd1', '--foo', 'X'])
    Namespace(foo='X')
    >>> parser.parse_args(['cmd2', 'Y'])
    Namespace(bar='Y')

> And I don't think that this is very trivial (at least not for my programming
> skill level). In the given example "generate.py" (above) the following
> scenario is pretty likely:
>
> (1) User works with UML tool and clicks in some dialog a "generate" button
> (2) UML tool triggers this event an calls a magic generate() method of my
> tool
> (via the API I provide for this purpose), like my generate.py script would do
> same way
> (3) Somewhen with-in this generate process my tool may need to get some
> information from the user in order to continue (it is in the nature of the
> features that I can't avoid this need of interaction in any case).

So you're imagining an API something like::

    def generate(name,
                 force=False,
                 verbose=False,
                 handler=command_line_handler):
        ...
        choice = handler.prompt_user(question_text, user_choices)
        ...

where the command-line handler might look something like::

    class CommandLineHandler(object):
        ...
        def prompt_user(self, question_text, user_choices):
            while True:
                choice = raw_input(question_text)
                if choice in user_choices:
                    return choice
                print 'invalid choice, choose from %s' % user_choices

and the GUI client would implement the equivalent thing with dialogs?

That seems basically reasonable to me, though you should be clear in the
documentation of generate() -- and any other methods that accept handler
objects -- exactly what methods the handler must provide.

You also may find that "prompt_user" is a bit too generic -- e.g. a file
chooser dialog looks a lot different from a color chooser dialog -- so you
may need to split this up into "prompt_user_file", "prompt_user_color",
etc. so that handlers don't have to introspect the question text to know
what to do...

STeVe
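The two suggestions above (document exactly which methods a handler must provide, and split the generic prompt into task-specific ones) can be sketched roughly as follows. This is only an illustration in present-day Python 3, not the tool's actual API: the names Handler, prompt_choice, prompt_file and the toy generate() body are all assumptions.

```python
from abc import ABC, abstractmethod


class Handler(ABC):
    """Hypothetical interface: the methods generate() expects from a client."""

    @abstractmethod
    def prompt_choice(self, question_text, user_choices):
        """Ask a question and return one element of user_choices."""

    @abstractmethod
    def prompt_file(self, question_text):
        """Ask for a file and return its path as a string."""


class CommandLineHandler(Handler):
    """Shell client. read_line is injectable so the class can be tested."""

    def __init__(self, read_line=input):
        self._read_line = read_line

    def prompt_choice(self, question_text, user_choices):
        # Keep asking until the answer is one of the allowed choices.
        while True:
            choice = self._read_line(question_text)
            if choice in user_choices:
                return choice
            print("invalid choice, choose from %s" % ", ".join(user_choices))

    def prompt_file(self, question_text):
        return self._read_line(question_text)


def generate(name, handler, force=False):
    """Toy stand-in for the real command: asks before replacing output."""
    if not force:
        question = "Replace existing %s? (yes/no) " % name
        if handler.prompt_choice(question, ["yes", "no"]) == "no":
            return "cancelled"
    return "generated %s" % name
```

A GUI client would subclass Handler and open a dialog in each method. Because the interface is an abstract base class, a client that forgets one of the methods fails at instantiation time rather than in the middle of a generate() run.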
Re: Pattern for foo tool <-> API <-> shell|GUI
On Saturday 24 March 2007 18:55, [EMAIL PROTECTED] wrote: > On Mar 24, 10:31 am, Anastasios Hatzis <[EMAIL PROTECTED]> wrote: > > I'm looking for a pattern where different client implementations can use > > the same commands of some fictive tool ("foo") by accessing some kind of > > API. Actually I have the need for such pattern for my own tool > > (http://openswarm.sourceforge.net). I already started restructuring my > > code to separate the actual command implementations from the command-line > > scripts (which is optparser-based now) and have some ideas how to > > proceed. But probably there is already a good pattern used for > > Python-based tools. > > > > In the case that some of you are interested into this topic and my recent > > thoughts, you may want to have a look at the description below. Any > > comments are very much appreciated. Hopefully this list is a good place > > for discussing a pattern, otherwise I would be okay to move this to > > another place. Thank you. > > > > Here we go: > > The tool package itself provides several commands, although not important > > for the pattern itself, here some examples: modifying user-specific > > preferences, creating and changing project settings files, > > project-related > > code-generation, or combinations of such commands ... later also commands > > for transformation between several XML formats etc. The classes which > > implement these commands are currently in multiple modules, each having a > > class named CmdHandler. > > > > I have some Python scripts (each having a ScriptHandler classes), for use > > via command-line. Each ScriptHandler class is responsible to add all > > related command-line options and process those provided by the user > > (based on optparse library from Python standard lib). The script then > > calls the corresponding command and provide the verified options as > > parameters. 
> > > > Due to the nature of the tool under specific conditions the following > > results may come during command execution: > > * successful execution, no interaction > > * critical error, execution cancelled > > * user interaction needed (e.g. prompt user to approve replace existing > > directory (yes/no), prompt user to provide an alternative option) > > > > Command-line interactions work simply with raw_input(). > > > > So far this works. Nevertheless, there are some other aspects that could > > be improved, but this is another topic: The tool uses custom exceptions > > (e.g. for critical errors) and logging features (based on logging from > > Python standard lib). Currently no automated tests, but I have to add. > > > > For the next step I plan to support not only my own command-line scripts, > > but also a GUI to access the commands, as well as 3rd-party products > > (themselves command-line scripts or GUIs, such as foo plugins for any > > 3rd-party-tools). As far as I see, these clients need to implement a > > handler that: > > (1) Collecting all required parameters and optional parameters from a > > user (2) Provide these parameters for a particular call to command API > > (3) Provides some kind of hooks that are called back from the API on > > specific events, e.g. Question with user-choice; Information with > > user-input (4) Provide a logging handler object from the tool logging > > class or a sub-class of that in the case that a client-specific logging > > object should be triggered on each debug, message, warning etc. > > > > (1) is very client-specific, e.g. in a GUI via dialogs. > > > > (2) Each command provides a signature for all required/optional > > parameters. They are all verified from the command itself, although a > > client could do some verification at the first place. > > > > (3) Example use-case: a command needs to know if the user wants the > > command to proceed with a particular action, e.g. "Do you want to delete > > bar.txt?" 
with "Yes", "No" and "Cancel" choice. So the client's handler > > object (which is provided as first parameter to each command) implements > > client-specific features to show the user this question (e.g. pop-up > > dialog with question and three buttons), receive the user input (clicking > > one of the buttons) and pass this choice back to the foo API. > > Alternatively some kind of text information could be required, as in > > raw_input(), so actually this probably would be two different interaction > > features to be implemented. > > > > (4) The foo API also provides a logging class. The client needs to > > initialize such an object and provide it as member of the handler object > > provided to the API. I wonder if some clients may have own logging > > features and want to include all log messages from foo tool to the own > > logs. In this case a client could its own sub-class of the foo logging > > class and extending it with callbacks to its (client-)native logging > > object. > > > > What do you think about this? > > > > Best regards, > > Anastasios > > I think if you want to use a GUI, wxpy
Re: Pattern for foo tool <-> API <-> shell|GUI
On Mar 24, 10:31 am, Anastasios Hatzis <[EMAIL PROTECTED]> wrote:
> I'm looking for a pattern where different client implementations can use the
> same commands of some fictive tool ("foo") by accessing some kind of API.
> Actually I have the need for such pattern for my own tool
> (http://openswarm.sourceforge.net). I already started restructuring my code
> to separate the actual command implementations from the command-line scripts
> (which is optparser-based now) and have some ideas how to proceed. But
> probably there is already a good pattern used for Python-based tools.
>
> In the case that some of you are interested into this topic and my recent
> thoughts, you may want to have a look at the description below. Any comments
> are very much appreciated. Hopefully this list is a good place for discussing
> a pattern, otherwise I would be okay to move this to another place. Thank
> you.
>
> Here we go:
> The tool package itself provides several commands, although not important for
> the pattern itself, here some examples: modifying user-specific preferences,
> creating and changing project settings files, project-related
> code-generation, or combinations of such commands ... later also commands for
> transformation between several XML formats etc. The classes which implement
> these commands are currently in multiple modules, each having a class named
> CmdHandler.
>
> I have some Python scripts (each having a ScriptHandler classes), for use via
> command-line. Each ScriptHandler class is responsible to add all related
> command-line options and process those provided by the user (based on
> optparse library from Python standard lib). The script then calls the
> corresponding command and provide the verified options as parameters.
>
> Due to the nature of the tool under specific conditions the following results
> may come during command execution:
> * successful execution, no interaction
> * critical error, execution cancelled
> * user interaction needed (e.g. prompt user to approve replace existing
> directory (yes/no), prompt user to provide an alternative option)
>
> Command-line interactions work simply with raw_input().
>
> So far this works. Nevertheless, there are some other aspects that could be
> improved, but this is another topic: The tool uses custom exceptions (e.g.
> for critical errors) and logging features (based on logging from Python
> standard lib). Currently no automated tests, but I have to add.
>
> For the next step I plan to support not only my own command-line scripts, but
> also a GUI to access the commands, as well as 3rd-party products (themselves
> command-line scripts or GUIs, such as foo plugins for any 3rd-party-tools). As
> far as I see, these clients need to implement a handler that:
> (1) Collecting all required parameters and optional parameters from a user
> (2) Provide these parameters for a particular call to command API
> (3) Provides some kind of hooks that are called back from the API on specific
> events, e.g. Question with user-choice; Information with user-input
> (4) Provide a logging handler object from the tool logging
> class or a sub-class of that in the case that a client-specific logging object
> should be triggered on each debug, message, warning etc.
>
> (1) is very client-specific, e.g. in a GUI via dialogs.
>
> (2) Each command provides a signature for all required/optional parameters.
> They are all verified from the command itself, although a client could do
> some verification at the first place.
>
> (3) Example use-case: a command needs to know if the user wants the command to
> proceed with a particular action, e.g. "Do you want to delete bar.txt?"
> with "Yes", "No" and "Cancel" choice. So the client's handler object (which
> is provided as first parameter to each command) implements client-specific
> features to show the user this question (e.g. pop-up dialog with question and
> three buttons), receive the user input (clicking one of the buttons) and pass
> this choice back to the foo API. Alternatively some kind of text information
> could be required, as in raw_input(), so actually this probably would be two
> different interaction features to be implemented.
>
> (4) The foo API also provides a logging class. The client needs to initialize
> such an object and provide it as member of the handler object provided to the
> API. I wonder if some clients may have own logging features and want to
> include all log messages from foo tool to the own logs. In this case a client
> could its own sub-class of the foo logging class and extending it with
> callbacks to its (client-)native logging object.
>
> What do you think about this?
>
> Best regards,
> Anastasios

I think if you want to use a GUI, wxPython or Tkinter would work well for
you. wxPython has more widgets from the start, but is also more complex.
Tkinter is good for quick and dirty GUIs, but gets increasingly more
complicated to deal with the more complex the GUI has to be, in general.
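Regarding point (4) of the quoted post: since the tool already builds on the standard logging module, one possible alternative to subclassing a foo logging class is for the client to attach its own logging.Handler to the tool's logger. The sketch below assumes the tool logs under a logger named "foo" (a made-up name) and uses a plain list in place of the client's native log.

```python
import logging


class CallbackLogHandler(logging.Handler):
    """Forward each formatted log record to a client-supplied callable."""

    def __init__(self, callback, level=logging.NOTSET):
        super().__init__(level)
        self._callback = callback

    def emit(self, record):
        # self.format() applies the handler's formatter, or the default
        # "%(message)s" formatter when none has been set.
        self._callback(record.levelname, self.format(record))


# Assumption: the tool's modules log via logging.getLogger("foo").
tool_logger = logging.getLogger("foo")
tool_logger.setLevel(logging.DEBUG)
tool_logger.propagate = False  # keep this example out of the root logger

received = []  # stands in for the client's own log, e.g. a GUI log pane
tool_logger.addHandler(
    CallbackLogHandler(lambda level, text: received.append((level, text)))
)

# What the tool would do internally while running a command:
tool_logger.warning("directory %s already exists", "out/")
tool_logger.debug("generation finished")
```

With this arrangement the tool never needs to know about the client's logging at all; a GUI client would pass a callback that appends to a text widget instead of a list.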
Re: pattern matching
azrael wrote:
> can someone give me good links for pattern matching in images using
> python

There is a Python binding available for the OpenCV library, a collection of
state-of-the-art CV algorithms. And it comes with a free manual.

Diez