Re: pattern block expression matching

2018-07-22 Thread aldi . kraja
Thank you all for thoughtful excellent updates!

Aldi
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: pattern block expression matching

2018-07-21 Thread Dan Sommers
On Sat, 21 Jul 2018 17:37:00 +0100, MRAB wrote:

> On 2018-07-21 15:20, aldi.kr...@gmail.com wrote:
>> Hi,
>> I have a long text, which tells me which files from a database were 
>> downloaded and which ones failed. The pattern is as follows (at the end of 
>> this post). Wrote a tiny program, but still is raw. I want to find term 
>> "ERROR" and go 5 lines above and get the name with suffix XPT, in this first 
>> case DRXIFF_F.XPT, but it changes in other cases to some other name with 
>> suffix XPT. Thanks, Aldi
>> 
>> # reading errors from a file txt
>> import re
>> with open('nohup.out', 'r') as fh:
>>     lines = fh.readlines()
>>     for line in lines:
>>         m1 = re.search("XPT", line)
>>         m2 = re.search('ERROR', line)
>>         if m1:
>>             print(line)
>>         if m2:
>>             print(line)
>> 
> Firstly, you don't need regex for something as simple as checking for 
> the presence of a string.
> 
> Secondly, I think it's 4 lines above, not 5.
> 
> 'enumerate' comes in useful here:
> 
> with open('nohup.out', 'r') as fh:
>     lines = fh.readlines()
>     for i, line in enumerate(lines):
>         if 'ERROR' in line:
>             print(line)
>             print(lines[i - 4])

Where's awk when you need it?

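import fileinput
for line in fileinput.input('nohup.out'):
    # remember the most recent line that names an XPT file
    if 'XPT' in line:
        line_containing_filename = line
    # an ERROR line refers to the file name seen just before it
    if 'ERROR' in line:
        print(line_containing_filename)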

I think Aldi's original approach is pretty good.
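
For anyone who wants to try that single-pass idea without the real nohup.out, here is a rough sketch run against a few in-memory lines shaped like the log in this thread. The names failed_urls and SAMPLE are made up for the illustration, and taking the last whitespace-separated field as the URL is an assumption about the log format.

```python
SAMPLE = """\
--2018-07-14 21:26:45--  https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXIFF_F.XPT
Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
HTTP request sent, awaiting response... 404 Not Found
2018-07-14 21:26:46 ERROR 404: Not Found.

--2018-07-14 21:26:47--  https://wwwn.cdc.gov/Nchs/Nhanes/1999-2000/DSII.XPT
HTTP request sent, awaiting response... 200 OK
Saving to: 'DSII.XPT'
""".splitlines()


def failed_urls(lines):
    """Yield the URL from the most recent XPT line before each ERROR line."""
    last_xpt = None
    for line in lines:
        if "XPT" in line:
            last_xpt = line
        if "ERROR" in line and last_xpt is not None:
            # assumption: the URL is the last whitespace-separated field
            yield last_xpt.split()[-1]


print(list(failed_urls(SAMPLE)))
```

Because only one remembered line is kept, this does not depend on how many lines sit between the URL and the ERROR.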



Re: pattern block expression matching

2018-07-21 Thread Peter Otten
MRAB wrote:

> On 2018-07-21 15:20, aldi.kr...@gmail.com wrote:
>> Hi,
>> I have a long text, which tells me which files from a database were
>> downloaded and which ones failed. The pattern is as follows (at the end
>> of this post). Wrote a tiny program, but still is raw. I want to find
>> term "ERROR" and go 5 lines above and get the name with suffix XPT, in
>> this first case DRXIFF_F.XPT, but it changes in other cases to some other
>> name with suffix XPT. Thanks, Aldi
>> 
>> # reading errors from a file txt
>> import re
>> with open('nohup.out', 'r') as fh:
>>     lines = fh.readlines()
>>     for line in lines:
>>         m1 = re.search("XPT", line)
>>         m2 = re.search('ERROR', line)
>>         if m1:
>>             print(line)
>>         if m2:
>>             print(line)
>> 
> Firstly, you don't need regex for something as simple as checking for
> the presence of a string.
> 
> Secondly, I think it's 4 lines above, not 5.
> 
> 'enumerate' comes in useful here:
> 
> with open('nohup.out', 'r') as fh:
>     lines = fh.readlines()
>     for i, line in enumerate(lines):
>         if 'ERROR' in line:
>             print(line)
>             print(lines[i - 4])

Here's an alternative that works when the file is huge, and reading it into 
memory is impractical:

import itertools

def get_url(line):
    # the URL is the last whitespace-separated field on the line
    return line.rsplit(None, 1)[-1]

def pairs(lines, step=4):
    # pair every line with the line `step` positions after it
    a, b = itertools.tee(lines)
    return zip(a, itertools.islice(b, step, None))

with open("nohup.out") as f:
    for s, t in pairs(f, 4):
        if "ERROR" in t:
            assert "XPT" in s
            print(get_url(s))
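
To see what pairs() hands back without opening a file, here is a small sketch on a plain list. The sample strings are invented; any iterable of lines should behave the same way, and tee() only has to buffer about `step` lines at a time.

```python
import itertools


def pairs(lines, step=4):
    """Pair each line with the line `step` positions after it."""
    a, b = itertools.tee(lines)
    return zip(a, itertools.islice(b, step, None))


# Stand-in for a file object: made-up lines, one download attempt.
sample = [
    "url DRXIFF_F.XPT",
    "resolving",
    "connecting",
    "request sent",
    "ERROR 404",
]

# Keep the last field of the earlier line whenever the later line is an error.
hits = [s.split()[-1] for s, t in pairs(sample) if "ERROR" in t]
print(hits)
```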

And here's yet another way that assumes that 

(1) the groups are separated by empty lines
(2) the first line always contains the file name
(3) "ERROR" may occur in any of the lines that follow

def groups(lines):
    return (
        group
        for key, group in itertools.groupby(lines, key=str.isspace)
        if not key
    )

with open("nohup.out") as f:
    for group in groups(f):
        first = next(group)
        if any("ERROR" in line for line in group):
            assert "XPT" in first
            print(get_url(first))
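
The grouping idea can be tried on in-memory data too. This is only a sketch: the example.invalid URLs and the shortened lines are made up, and it assumes each line still carries its trailing newline the way lines read from a file do, so that blank lines are "\n" and str.isspace() is true for them.

```python
import itertools

sample = [
    "--ts--  https://example.invalid/A.XPT\n",
    "HTTP request sent, awaiting response... 404 Not Found\n",
    "ts ERROR 404: Not Found.\n",
    "\n",
    "--ts--  https://example.invalid/B.XPT\n",
    "HTTP request sent, awaiting response... 200 OK\n",
]

failed = []
for is_blank, group in itertools.groupby(sample, key=str.isspace):
    if is_blank:
        continue  # skip the runs of separator lines
    first = next(group)  # the first line of the block names the file
    if any("ERROR" in line for line in group):
        failed.append(first.split()[-1])

print(failed)
```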

 
>> --2018-07-14 21:26:45--  https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXIFF_F.XPT
>> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
>> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
>> HTTP request sent, awaiting response... 404 Not Found
>> 2018-07-14 21:26:46 ERROR 404: Not Found.
>> 
>> --2018-07-14 21:26:46--  https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXTOT_F.XPT
>> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
>> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
>> HTTP request sent, awaiting response... 404 Not Found
>> 2018-07-14 21:26:46 ERROR 404: Not Found.
>> 
>> --2018-07-14 21:26:46--  https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXFMT_F.XPT
>> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
>> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
>> HTTP request sent, awaiting response... 404 Not Found
>> 2018-07-14 21:26:46 ERROR 404: Not Found.
>> 
>> --2018-07-14 21:26:46--  https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DSQ1_F.XPT
>> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
>> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
>> HTTP request sent, awaiting response... 404 Not Found
>> 2018-07-14 21:26:47 ERROR 404: Not Found.
>> 
>> --2018-07-14 21:26:47--  https://wwwn.cdc.gov/Nchs/Nhanes/1999-2000/DSII.XPT
>> Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
>> Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
>> HTTP request sent, awaiting response... 200 OK
>> Length: 56060880 (53M) [application/octet-stream]
>> Saving to: ‘DSII.XPT’
>> 
> 




Re: pattern block expression matching

2018-07-21 Thread MRAB

On 2018-07-21 15:20, aldi.kr...@gmail.com wrote:

Hi,
I have a long text, which tells me which files from a database were downloaded and which 
ones failed. The pattern is as follows (at the end of this post). Wrote a tiny program, 
but still is raw. I want to find term "ERROR" and go 5 lines above and get the 
name with suffix XPT, in this first case DRXIFF_F.XPT, but it changes in other cases to 
some other name with suffix XPT. Thanks, Aldi

# reading errors from a file txt
import re
with open('nohup.out', 'r') as fh:
    lines = fh.readlines()
    for line in lines:
        m1 = re.search("XPT", line)
        m2 = re.search('ERROR', line)
        if m1:
            print(line)
        if m2:
            print(line)

Firstly, you don't need regex for something as simple as checking for 
the presence of a string.


Secondly, I think it's 4 lines above, not 5.

'enumerate' comes in useful here:

with open('nohup.out', 'r') as fh:
    lines = fh.readlines()
    for i, line in enumerate(lines):
        if 'ERROR' in line:
            print(line)
            print(lines[i - 4])
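
A variation on the snippet above that can be run as-is, using the first block of the log as in-memory data. It is a sketch only: OFFSET and names are made-up identifiers, the guard against small indexes is an addition (a negative index would silently wrap around to the end of the list), and trimming the URL down to its last path component is an assumption about what is wanted.

```python
log = [
    "--2018-07-14 21:26:45--  https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXIFF_F.XPT",
    "Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39",
    "Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.",
    "HTTP request sent, awaiting response... 404 Not Found",
    "2018-07-14 21:26:46 ERROR 404: Not Found.",
]

OFFSET = 4  # the URL line sits four lines above the ERROR line
names = []
for i, line in enumerate(log):
    # guard: i - OFFSET below zero would index from the end of the list
    if "ERROR" in line and i >= OFFSET:
        url = log[i - OFFSET].split()[-1]
        names.append(url.rsplit("/", 1)[-1])  # keep only the file name

print(names)
```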



--2018-07-14 21:26:45--  https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXIFF_F.XPT
Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-07-14 21:26:46 ERROR 404: Not Found.

--2018-07-14 21:26:46--  https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXTOT_F.XPT
Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-07-14 21:26:46 ERROR 404: Not Found.

--2018-07-14 21:26:46--  https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXFMT_F.XPT
Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-07-14 21:26:46 ERROR 404: Not Found.

--2018-07-14 21:26:46--  https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DSQ1_F.XPT
Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-07-14 21:26:47 ERROR 404: Not Found.

--2018-07-14 21:26:47--  https://wwwn.cdc.gov/Nchs/Nhanes/1999-2000/DSII.XPT
Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 56060880 (53M) [application/octet-stream]
Saving to: ‘DSII.XPT’





Re: pattern

2018-06-16 Thread Cameron Simpson

On 16Jun2018 11:59, Sharan Basappa  wrote:

This is so kind of you. Thanks for spending time to explain the code.
It did help a lot. I did go back and brush up lists & dictionaries.

At this point, I think, I need to go back and brush up Python from the start.
So, I will do that first.


Sure, sounds good.

But write code! It is not enough to read code and read about code. You need to 
write code and modify code. Otherwise the skills don't internalise well.


If you're running the code you asked about, one way to learn a lot about 
something that looks obscure is simply to put in print() calls at various 
places, eg:


  print("iterate over training_data =", repr(training_data))
  for pattern in training_data:
  # tokenize each word in the sentence
  print("pattern =", repr(pattern))
  w = nltk.word_tokenize(pattern['sentence'])
  print("w =", repr(w))
  # add to our words list
  words.extend(w)
  print("words =", repr(words))
  # add to documents in our corpus
  documents.append((w, pattern['class']))
  print("documents =", repr(documents))

Note the use of repr(): it will print out the structure of lists and so forth, 
very useful.
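
A tiny illustration of why repr() helps, with made-up values: a plain print() hides the difference between a string and a number, and hides trailing whitespace, while repr() shows both.

```python
value_as_text = "42"
value_as_int = 42
padded = "how are you? "

# Plain print: the string and the int look identical, the trailing space is invisible.
print("plain:", value_as_text, value_as_int, padded)

# repr(): quotes mark the string, and the trailing space is visible inside them.
print("repr: ", repr(value_as_text), repr(value_as_int), repr(padded))
```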


Just reviewing that loop, the logic does look a little weird to me. I think the 
"documents.append" should be inside the loop because otherwise it only accrues 
the _last_ "w" and "pattern".
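
To make that point concrete, here is a sketch comparing the two placements. It uses str.split() as a stand-in for nltk.word_tokenize so it runs without nltk, and the second training item (the "goodbye" class) is invented for the example.

```python
training_data = [
    {"class": "greeting", "sentence": "how are you?"},
    {"class": "goodbye", "sentence": "see you later"},  # made-up item
]

# append inside the loop: one entry per training item
inside = []
for pattern in training_data:
    w = pattern["sentence"].split()  # stand-in for nltk.word_tokenize
    inside.append((w, pattern["class"]))

# append after the loop: only the values left over from the final iteration
outside = []
for pattern in training_data:
    w = pattern["sentence"].split()
outside.append((w, pattern["class"]))

print(len(inside), len(outside))
print(outside)
```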


Cheers,
Cameron Simpson 


Re: pattern

2018-06-16 Thread Sharan Basappa
Dear Cameron,

This is so kind of you. Thanks for spending time to explain the code.
It did help a lot. I did go back and brush up lists & dictionaries.

At this point, I think, I need to go back and brush up Python from the start.
So, I will do that first.

On Friday, 15 June 2018 09:12:22 UTC+5:30, Cameron Simpson  wrote:
> On 14Jun2018 20:01, Sharan Basappa  wrote:
> >> >Can anyone explain to me the purpose of "pattern" in the line below:
> >> >
> >> >documents.append((w, pattern['class']))
> >> >
> >> >documents is declared as a list as follows:
> >> >documents.append((w, pattern['class']))
> >>
> >> Not without a lot more context. Where did you find this code?
> >
> >I am sorry that partial info was not sufficient.
> >I am actually trying to implement my first text classification code and I am 
> >referring to the below URL for that:
> >
> >https://machinelearnings.co/text-classification-using-neural-networks-f5cd7b8765c6
> 
> Ah, ok. It helps to include some cut/paste of the relevant code, though the 
> URL 
> is a big help.
> 
> The wider context of the code you recite looks like this:
> 
>   words = []
>   classes = []
>   documents = []
>   ignore_words = ['?']
>   # loop through each sentence in our training data
>   for pattern in training_data:
>       # tokenize each word in the sentence
>       w = nltk.word_tokenize(pattern['sentence'])
>       # add to our words list
>       words.extend(w)
>       # add to documents in our corpus
>       documents.append((w, pattern['class']))
> 
> and the training_data is defined like this:
> 
>   training_data = []
>   training_data.append({"class":"greeting", "sentence":"how are you?"})
>   training_data.append({"class":"greeting", "sentence":"how is your day?"})
>   ... lots more ...
> 
> So training data is a list of dicts, each dict holding a "class" and 
> "sentence" key. The "for pattern in training_data" loop iterates over each 
> item of the training_data. It calls nltk.word_tokenize on the "sentence" 
> part of the training item, presumably getting a list of "word" strings. 
> The documents list gets this tuple:
> 
>   (w, pattern['class'])
> 
> added to it.
> 
> In this way the documents list ends up with tuples of (words, 
> classification), with the words coming from the sentence via nltk and the 
> classification coming straight from the train item's "class" value.
> 
> So at the end of the loop the documents array will look like:
> 
>   documents = [
>     ( ['how', 'are', 'you'], 'greeting' ),
>     ( ['how', 'is', 'your', 'day'], 'greeting' ),
>   ]
> 
> and so forth.
> 
> Cheers,
> Cameron Simpson 



Re: pattern

2018-06-14 Thread Cameron Simpson

On 14Jun2018 20:01, Sharan Basappa  wrote:

>Can anyone explain to me the purpose of "pattern" in the line below:
>
>documents.append((w, pattern['class']))
>
>documents is declared as a list as follows:
>documents.append((w, pattern['class']))

Not without a lot more context. Where did you find this code?


I am sorry that partial info was not sufficient.
I am actually trying to implement my first text classification code and I am 
referring to the below URL for that:

https://machinelearnings.co/text-classification-using-neural-networks-f5cd7b8765c6


Ah, ok. It helps to include some cut/paste of the relevant code, though the URL 
is a big help.


The wider context of the code you recite looks like this:

 words = []
 classes = []
 documents = []
 ignore_words = ['?']
 # loop through each sentence in our training data
 for pattern in training_data:
     # tokenize each word in the sentence
     w = nltk.word_tokenize(pattern['sentence'])
     # add to our words list
     words.extend(w)
     # add to documents in our corpus
     documents.append((w, pattern['class']))

and the training_data is defined like this:

 training_data = []
 training_data.append({"class":"greeting", "sentence":"how are you?"})
 training_data.append({"class":"greeting", "sentence":"how is your day?"})
 ... lots more ...

So training data is a list of dicts, each dict holding a "class" and "sentence" 
key. The "for pattern in training_data" loop iterates over each item of the 
training_data. It calls nltk.word_tokenize on the "sentence" part of the 
training item, presumably getting a list of "word" strings. The documents list 
gets this tuple:


 (w, pattern['class'])

added to it.

In this way the documents list ends up with tuples of (words, classification), 
with the words coming from the sentence via nltk and the classification coming 
straight from the train item's "class" value.


So at the end of the loop the documents array will look like:

 documents = [
   ( ['how', 'are', 'you'], 'greeting' ),
   ( ['how', 'is', 'your', 'day'], 'greeting' ),
 ]

and so forth.
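
If nltk is not installed, the shape of these structures can still be explored with a stand-in tokenizer. The sketch below uses a hypothetical simple_tokenize() built on re.findall; it is not nltk's algorithm and the real tokenizer may split text differently. It keeps punctuation as separate tokens, which is presumably why the article carries an ignore_words list.

```python
import re


def simple_tokenize(sentence):
    """Hypothetical stand-in for nltk.word_tokenize: words and punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


training_data = [
    {"class": "greeting", "sentence": "how are you?"},
    {"class": "greeting", "sentence": "how is your day?"},
]

words = []
documents = []
for pattern in training_data:
    w = simple_tokenize(pattern["sentence"])
    words.extend(w)                          # flat list of every token
    documents.append((w, pattern["class"]))  # one (tokens, label) tuple per sentence

# each entry unpacks into its token list and its class label
for tokens, label in documents:
    print(label, tokens)
```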

Cheers,
Cameron Simpson 


Re: pattern

2018-06-14 Thread Sharan Basappa
> >Can anyone explain to me the purpose of "pattern" in the line below:
> >
> >documents.append((w, pattern['class']))
> >
> >documents is declared as a list as follows:
> >documents.append((w, pattern['class']))
> 
> Not without a lot more context. Where did you find this code?
> 
> Cheers,

I am sorry that partial info was not sufficient.
I am actually trying to implement my first text classification code and I am 
referring to the below URL for that:

https://machinelearnings.co/text-classification-using-neural-networks-f5cd7b8765c6

I hope this helps.


Re: pattern

2018-06-14 Thread Cameron Simpson

On 13Jun2018 19:51, Sharan Basappa  wrote:

Can anyone explain to me the purpose of "pattern" in the line below:

documents.append((w, pattern['class']))

documents is declared as a list as follows:
documents.append((w, pattern['class']))


Not without a lot more context. Where did you find this code?

Cheers,
Cameron Simpson 


Re: Pattern Search Regular Expression

2013-06-15 Thread Mark Lawrence

On 15/06/2013 22:03, Joshua Landau wrote:

On 15 June 2013 11:18, Mark Lawrence  wrote:

I tend to reach for string methods rather than an RE so will something like
this suit you?

c:\Users\Mark\MyPython>type a.py
for s in ("In the ocean",
   "On the ocean",
   "By the ocean",
   "In this group",
   "In this group",
   "By the new group"):
 print(' '.join(s.split()[1:-1]))


c:\Users\Mark\MyPython>a
the
the
the
this
this
the new


Careful - " ".join(s.split()) != s

Eg:

" ".join("s\ns".split())

's s'

It's pedantry, but true.



I'm sorry but I haven't the faintest idea what you're talking about.  I 
believe the code I posted works for the OP's needs.  If it doesn't 
please say so.


--
"Steve is going for the pink ball - and for those of you who are 
watching in black and white, the pink is next to the green." Snooker 
commentator 'Whispering' Ted Lowe.


Mark Lawrence

--
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Search Regular Expression

2013-06-15 Thread Joshua Landau
On 15 June 2013 11:18, Mark Lawrence  wrote:
> I tend to reach for string methods rather than an RE so will something like
> this suit you?
>
> c:\Users\Mark\MyPython>type a.py
> for s in ("In the ocean",
>           "On the ocean",
>           "By the ocean",
>           "In this group",
>           "In this group",
>           "By the new group"):
>     print(' '.join(s.split()[1:-1]))
>
>
> c:\Users\Mark\MyPython>a
> the
> the
> the
> this
> this
> the new

Careful - " ".join(s.split()) != s

Eg:
>>> " ".join("s\ns".split())
's s'

It's pedantry, but true.
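
A quick sketch of when the round trip does and does not give back the original string. The sample strings are made up; the only claim is that split() with no arguments splits on any run of whitespace and drops leading and trailing whitespace.

```python
samples = ["By the new group", "By  the\tnew\ngroup", "  padded  "]

for s in samples:
    normalized = " ".join(s.split())
    # repr() makes tabs, newlines and extra spaces visible
    print(repr(s), "->", repr(normalized), "| unchanged:", normalized == s)
```

So for single-spaced input like the OP's examples the difference does not arise, but it would for text with tabs, newlines or doubled spaces.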


Re: Pattern Search Regular Expression

2013-06-15 Thread rurpy
Oops...

On Saturday, June 15, 2013 12:47:18 PM UTC-6, ru...@yahoo.com wrote:
> Links to the Python reference documentation are useful for people
> just beginning with some aspect of Python; they are for people who
> already know Python and want to look up details.  

That was supposed to be:
 Links to the Python reference documentation are NOT useful for people
 just beginning with some aspect of Python

and as long as I'm revising, I mean that as a general statement, 
nothing wrong with a reference doc link accompanying a simpler 
explanation or pointer thereto.


Re: Pattern Search Regular Expression

2013-06-15 Thread Terry Reedy

On 6/15/2013 12:28 PM, subhabangal...@gmail.com wrote:


Suppose I want a regular expression that matches both "Sent from my iPhone" and 
"Sent from my iPod". How do I write such an expression--is the problem,
"Sent from my iPod"
"Sent from my iPhone"

which can be written as,
re.compile("Sent from my (iPhone|iPod)")

now if I want to slightly to extend it as,

"Taken from my iPod"
"Taken from my iPhone"

I am looking how can I use or in the beginning pattern?

and the third phase if the intermediate phrase,

"from my" if also differs or changes.

In a nutshell I want to extract a particular group of phrases,
where, the beginning and end pattern may alter like,

(i) either from beginning Pattern B1 to end Pattern E1,
(ii) or from beginning Pattern B1 to end Pattern E2,
(iii) or from beginning Pattern B2 to end Pattern E2,


The only hints I will add to those given is that you need a) pattern for 
a word, and b) a way to 'anchor' the pattern to the beginning and ending 
of the string so it will only match the first and last words.


This is a pretty good re practice problem, so go and practice and 
experiment.  Expect to fail 20 times and you should beat your 
expectation ;-). The interactive interpreter, or Idle with its F5 Run 
editor window, makes experimenting easy and (for me) fun.
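
For readers who have already had their twenty attempts, one possible reading of the two hints is sketched below. It assumes \w+ is an acceptable definition of "a word" and that the parts are separated by whitespace; other definitions are just as defensible.

```python
import re

# \w+ stands for "a word"; ^ and $ anchor the pattern to the whole string,
# so the first and last groups can only be the first and last words.
pattern = re.compile(r"^(\w+)\s+(.*?)\s+(\w+)$")

for s in ("In the ocean", "By the new group", "Sent from my iPhone"):
    m = pattern.match(s)
    if m:
        first, middle, last = m.groups()
        print(f"{first!r} | {middle!r} | {last!r}")
```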


--
Terry Jan Reedy



Re: Pattern Search Regular Expression

2013-06-15 Thread subhabangalore
On Sunday, June 16, 2013 12:17:18 AM UTC+5:30, ru...@yahoo.com wrote:
> On Saturday, June 15, 2013 11:54:28 AM UTC-6, subhaba...@gmail.com wrote:
> 
> > Thank you for the answer. But I want to learn bit of interesting
> > regular expression forms where may I? 
> > No Mark, thank you for your links but they were not sufficient.
> 
> Links to the Python reference documentation are useful for people
> just beginning with some aspect of Python; they are for people who
> already know Python and want to look up details.  So it's no
> surprise that you did not find them useful.
> 
> > I am looking for more intriguing exercises, esp use of or in
> > the pattern search. 
> 
> Have you tried searching on Google for "regular expression tutorial"?
> It gives a lot of results.  I've never tried any of them so I can't 
> recommend any one specifically but maybe you can find something 
> useful there?
> 
> There is also a Python Howto on regular expressions at
>   http://docs.python.org/3/howto/regex.html
> 
> Also, maybe the book "Regular Expressions Cookbook" would
> be useful?  It seems to have a lot of specific expressions
> for accomplishing various tasks and seems to be online for
> free at
>   http://it-ebooks.info/read/920/

Dear Group,

Thank you for the links. Yes, the HOWTO is good, and the cookbook should be 
good too. The Internet changes its contents so fast. A few days back there was 
a very good regular expression tutorial by Alan Gauld, or there were some mail 
discussions, and I don't know where they have gone. There is one tutorial by 
Gauld, but I think I read something different.

Regards,
Subhabrata.


Re: Pattern Search Regular Expression

2013-06-15 Thread rurpy
On Saturday, June 15, 2013 11:54:28 AM UTC-6, subhaba...@gmail.com wrote:

> Thank you for the answer. But I want to learn bit of interesting
> regular expression forms where may I? 
> No Mark, thank you for your links but they were not sufficient.

Links to the Python reference documentation are useful for people
just beginning with some aspect of Python; they are for people who
already know Python and want to look up details.  So it's no
surprise that you did not find them useful.

> I am looking for more intriguing exercises, esp use of or in
> the pattern search. 

Have you tried searching on Google for "regular expression tutorial"?
It gives a lot of results.  I've never tried any of them so I can't 
recommend any one specifically but maybe you can find something 
useful there?

There is also a Python Howto on regular expressions at
  http://docs.python.org/3/howto/regex.html

Also, maybe the book "Regular Expressions Cookbook" would
be useful?  It seems to have a lot of specific expressions
for accomplishing various tasks and seems to be online for
free at
  http://it-ebooks.info/read/920/


Re: Pattern Search Regular Expression

2013-06-15 Thread subhabangalore
On Saturday, June 15, 2013 3:12:55 PM UTC+5:30, subhaba...@gmail.com wrote:
> Dear Group,
> 
> I am trying to search the following pattern in Python.
> 
> I have following strings:
> 
>  (i)"In the ocean"
>  (ii)"On the ocean"
>  (iii) "By the ocean"
>  (iv) "In this group"
>  (v) "In this group"
>  (vi) "By the new group"
>.
> 
> I want to extract from the first word to the last word, 
> where first word and last word are varying.
> 
> I am looking to extract out:
>   (i) the
>   (ii) the 
>   (iii) the
>   (iv) this
>   (v) this
>   (vi) the new
>   .
> 
> The problem may be handled by converting the string to list and then 
> index of list. 
> 
> But I am thinking if I can use regular expression in Python.
> 
> If any one of the esteemed members can help.
> 
> Thanking you in Advance,
> 
> Regards,
> Subhabrata

Dear Group,

Thank you for the answer. But I want to learn bit of interesting regular 
expression forms where may I? No Mark, thank you for your links but they were 
not sufficient. I am looking for more intriguing exercises, esp use of or in 
the pattern search. 

Regards,
Subhabrata. 


Re: Pattern Search Regular Expression

2013-06-15 Thread rurpy
On 06/15/2013 03:42 AM, subhabangal...@gmail.com wrote:
> Dear Group,
> 
> I am trying to search the following pattern in Python.
> 
> I have following strings:
> 
>  (i)"In the ocean"
>  (ii)"On the ocean"
>  (iii) "By the ocean"
>  (iv) "In this group"
>  (v) "In this group"
>  (vi) "By the new group"
>.
> 
> I want to extract from the first word to the last word, 
> where first word and last word are varying.
> 
> I am looking to extract out:
>   (i) the
>   (ii) the 
>   (iii) the
>   (iv) this
>   (v) this
>   (vi) the new
>   .
> 
> The problem may be handled by converting the string to list and then 
> index of list. 
> 
> But I am thinking if I can use regular expression in Python.

Since nobody here seems to want to answer your question
(or seems even able to read it), I'll try.  Is something 
like this what you want?

import re

texts = [
    '(i)"In the ocean"',
    '(ii)"On the ocean"',
    '(iii) "By the ocean"',
    '(iv) "In this group"',
    '(v) "In this group"',
    '(vi) "By the new group"']

pattern = re.compile(r'^\((.*)\)\s*"\S+\s*(.*)\s\S+"$')
for txt in texts:
    matchobj = re.search(pattern, txt)
    number, midtext = matchobj.group(1, 2)
    print("(%s) %s" % (number, midtext))
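
In case the pattern is hard to read in one piece, here is the same expression spread out with re.VERBOSE and a comment per part. The comments are one reading of what each piece does, not the author's own annotation.

```python
import re

pattern = re.compile(r'''
    ^\((.*)\)   # group 1: the label between the parentheses
    \s*"        # optional spaces, then the opening quote
    \S+\s*      # the first word, which is discarded
    (.*)        # group 2: everything in the middle
    \s\S+       # one whitespace character and the last word, discarded
    "$          # the closing quote at the end of the string
''', re.VERBOSE)

m = pattern.search('(vi) "By the new group"')
print(m.group(1, 2))
```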




Re: Pattern Search Regular Expression

2013-06-15 Thread Mark Lawrence

On 15/06/2013 17:28, subhabangal...@gmail.com wrote:

You've been pointed at several links, so what have you tried, and what, 
if anything, went wrong?  Or do you simply not understand, in which case 
please say so and we'll help.  I'm not trying to be awkward, it's simply 
known that you learn more if you try something yourself, rather than be 
spoon fed it.


--
"Steve is going for the pink ball - and for those of you who are 
watching in black and white, the pink is next to the green." Snooker 
commentator 'Whispering' Ted Lowe.


Mark Lawrence



Re: Pattern Search Regular Expression

2013-06-15 Thread subhabangalore
On Saturday, June 15, 2013 8:34:59 PM UTC+5:30, Mark Lawrence wrote:
> On 15/06/2013 15:31, subhabangal...@gmail.com wrote:
> >
> > Dear Group,
> >
> > I know this solution but I want to have Regular Expression option. Just 
> > learning.
> >
> > Regards,
> > Subhabrata.
> >
> 
> Start here http://docs.python.org/2/library/re.html
> 
> Would you also please read and action this, 
> http://wiki.python.org/moin/GoogleGroupsPython , thanks.
> 
> -- 
> "Steve is going for the pink ball - and for those of you who are 
> watching in black and white, the pink is next to the green." Snooker 
> commentator 'Whispering' Ted Lowe.
> 
> Mark Lawrence

Dear Group,

Suppose I want a regular expression that matches both "Sent from my iPhone" and 
"Sent from my iPod". How do I write such an expression--is the problem, 
"Sent from my iPod"
"Sent from my iPhone"

which can be written as,
re.compile("Sent from my (iPhone|iPod)")

now if I want to slightly to extend it as,

"Taken from my iPod"
"Taken from my iPhone"

I am looking how can I use or in the beginning pattern?

and the third phase if the intermediate phrase,

"from my" if also differs or changes.

In a nutshell I want to extract a particular group of phrases,
where, the beginning and end pattern may alter like,

(i) either from beginning Pattern B1 to end Pattern E1,
(ii) or from beginning Pattern B1 to end Pattern E2,
(iii) or from beginning Pattern B2 to end Pattern E2,
.

Regards,
Subhabrata.
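
One way the question above might be approached is sketched here: an alternation for the beginning, another for the end, and a free-form group in between. The extra test strings ("Taken with an iPod", "Sent from my desk") are invented to show a changed middle phrase and a non-match.

```python
import re

# Alternation at the start, a free-form middle, alternation at the end.
pattern = re.compile(r"(Sent|Taken)\s+(.+?)\s+(iPhone|iPod)")

for text in (
    "Sent from my iPhone",
    "Taken from my iPod",
    "Taken with an iPod",
    "Sent from my desk",
):
    m = pattern.fullmatch(text)  # the whole string has to fit the pattern
    print(text, "->", m.groups() if m else None)
```

Each parenthesised part is a group, so the beginning, middle and end that actually matched can be read back from m.groups().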








Re: Pattern Search Regular Expression

2013-06-15 Thread Mark Lawrence

On 15/06/2013 15:31, subhabangal...@gmail.com wrote:


Dear Group,

I know this solution but I want to have Regular Expression option. Just 
learning.

Regards,
Subhabrata.



Start here http://docs.python.org/2/library/re.html

Would you also please read and action this, 
http://wiki.python.org/moin/GoogleGroupsPython , thanks.


--
"Steve is going for the pink ball - and for those of you who are 
watching in black and white, the pink is next to the green." Snooker 
commentator 'Whispering' Ted Lowe.


Mark Lawrence



Re: Pattern Search Regular Expression

2013-06-15 Thread Andreas Perstinger
subhabangal...@gmail.com wrote:
>I know this solution but I want to have Regular Expression option.
>Just learning.

http://mattgemmell.com/2008/12/08/what-have-you-tried/

Just spell out what you want:
A word at the beginning, followed by any text, followed by a word at
the end.
Now look up the basic regex metacharacters and try to come up with a
solution (Hint: you will need groups)

http://docs.python.org/3/howto/regex.html#regex-howto
http://docs.python.org/3/library/re.html#regular-expression-syntax

Bye, Andreas


Re: Pattern Search Regular Expression

2013-06-15 Thread subhabangalore
On Saturday, June 15, 2013 7:58:44 PM UTC+5:30, Mark Lawrence wrote:
> On 15/06/2013 14:45, Denis McMahon wrote:
> > On Sat, 15 Jun 2013 13:41:21 +, Denis McMahon wrote:
> >
> >> first_and_last = [sentence.split()[i] for i in (0, -1)]
> >> middle = sentence.split()[1:-2]
> >
> > Bugger! That last is actually:
> >
> > sentence.split()[1:-1]
> >
> > It just looks like a two.
> >
> 
> I've a very strong sense of deja vu having round the same loop what, two 
> hours ago?  Wondering out aloud the number of times a programmer has 
> thought "That's easy, I don't need to test it".  How are the mighty fallen.
> 
> -- 
> "Steve is going for the pink ball - and for those of you who are 
> watching in black and white, the pink is next to the green." Snooker 
> commentator 'Whispering' Ted Lowe.
> 
> Mark Lawrence

Dear Group,

I know this solution but I want to have Regular Expression option. Just 
learning.

Regards,
Subhabrata. 


Re: Pattern Search Regular Expression

2013-06-15 Thread Mark Lawrence

On 15/06/2013 14:45, Denis McMahon wrote:

On Sat, 15 Jun 2013 13:41:21 +, Denis McMahon wrote:


first_and_last = [sentence.split()[i] for i in (0, -1)]
middle = sentence.split()[1:-2]


Bugger! That last is actually:

sentence.split()[1:-1]

It just looks like a two.



I've a very strong sense of deja vu having round the same loop what, two 
hours ago?  Wondering out aloud the number of times a programmer has 
thought "That's easy, I don't need to test it".  How are the mighty fallen.


--
"Steve is going for the pink ball - and for those of you who are 
watching in black and white, the pink is next to the green." Snooker 
commentator 'Whispering' Ted Lowe.


Mark Lawrence



Re: Pattern Search Regular Expression

2013-06-15 Thread Denis McMahon
On Sat, 15 Jun 2013 13:41:21 +, Denis McMahon wrote:

> first_and_last = [sentence.split()[i] for i in (0, -1)]
> middle = sentence.split()[1:-2]

Bugger! That last is actually:

sentence.split()[1:-1]

It just looks like a two.
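
The difference between the two slices, spelled out on the example sentence from this thread. The variable name too_short is made up for the comparison.

```python
sentence = "By the new group"
words = sentence.split()

first_and_last = (words[0], words[-1])
middle = words[1:-1]     # stop index -1 leaves out only the last word
too_short = words[1:-2]  # stop index -2 also drops the word before it

print(first_and_last, middle, too_short)
```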

-- 
Denis McMahon, denismfmcma...@gmail.com


Re: Pattern Search Regular Expression

2013-06-15 Thread Denis McMahon
On Sat, 15 Jun 2013 11:55:34 +0100, Mark Lawrence wrote:

>  >>> sentence = "By the new group"
>  >>> words = sentence.split() 
>  >>> words[words[0],words[-1]]
> Traceback (most recent call last):
>File "", line 1, in 
> TypeError: list indices must be integers, not tuple
> 
> So why would the OP want a TypeError?  Or has caffeine deprivation
> affected your typing skills? :)

Yeah - that last:

words[words[0],words[-1]]

should probably have been:

first_and_last = [words[0], words[-1]]

or even:

first_and_last = (words[0], words[-1])

Or even:

first_and_last = [sentence.split()[i] for i in (0, -1)]
middle = sentence.split()[1:-2]

-- 
Denis McMahon, denismfmcma...@gmail.com


Re: Pattern Search Regular Expression

2013-06-15 Thread rusi
On Jun 15, 3:55 pm, Mark Lawrence  wrote:
> On 15/06/2013 11:24, Denis McMahon wrote:
>
> > On Sat, 15 Jun 2013 10:05:01 +, Steven D'Aprano wrote:
>
> >> On Sat, 15 Jun 2013 02:42:55 -0700, subhabangalore wrote:
>
> >>> Dear Group,
>
> >>> I am trying to search the following pattern in Python.
>
> >>> I have following strings:
>
> >>>   (i)"In the ocean" (ii)"On the ocean" (iii) "By the ocean" (iv) "In
> >>>   this group" (v) "In this group" (vi) "By the new group"
> >>>         .
>
> >>> I want to extract from the first word to the last word, where first
> >>> word and last word are varying.
>
> >>> I am looking to extract out:
> >>>    (i) the (ii) the (iii) the (iv) this (v) this (vi) the new
> >>>        .
>
> >>> The problem may be handled by converting the string to list and then
> >>> index of list.
>
> >> No need for a regular expression.
>
> >> py> sentence = "By the new group"
> >> py> words = sentence.split()
> >> py> words[1:-1]
> >> ['the', 'new']
>
> >> Does that help?
>
> > I thought OP wanted:
>
> > words[words[0],words[-1]]
>
> > But that might be just my caffeine deprived misinterpretation of his
> > terminology.
>
>  >>> sentence = "By the new group"
>  >>> words = sentence.split()
>  >>> words[words[0],words[-1]]
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> TypeError: list indices must be integers, not tuple
>
> So why would the OP want a TypeError?  Or has caffeine deprivation
> affected your typing skills? :)

:-)

I guess Denis meant (words[0], words[-1])

To the OP:
You have the identity:
words == [words[0]] + words[1:-1] + [words[-1]]

So take your pick of what parts of the expression you want (and
discard what you don't want).
[The way you've used 'extract' is a bit ambiguous]
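
A minimal sketch of that identity, using the sentence from earlier in the
thread. It assumes the sentence has at least two words; with a single word
the first and last parts are the same element, so the sum would repeat it.

```python
sentence = "By the new group"
words = sentence.split()

first = words[0]        # 'By'
middle = words[1:-1]    # ['the', 'new']
last = words[-1]        # 'group'

# The three parts reassemble into the original list.
assert words == [first] + middle + [last]

# Keep whichever part is wanted; here, the words in between.
print(' '.join(middle))
```

Picking `middle` gives the "the new" the OP listed; picking `(first, last)`
gives the other reading of "extract".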
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Search Regular Expression

2013-06-15 Thread Mark Lawrence

On 15/06/2013 11:24, Denis McMahon wrote:

On Sat, 15 Jun 2013 10:05:01 +, Steven D'Aprano wrote:


On Sat, 15 Jun 2013 02:42:55 -0700, subhabangalore wrote:


Dear Group,

I am trying to search the following pattern in Python.

I have following strings:

  (i)"In the ocean" (ii)"On the ocean" (iii) "By the ocean" (iv) "In
  this group" (v) "In this group" (vi) "By the new group"
.

I want to extract from the first word to the last word, where first
word and last word are varying.

I am looking to extract out:
   (i) the (ii) the (iii) the (iv) this (v) this (vi) the new
   .

The problem may be handled by converting the string to list and then
index of list.


No need for a regular expression.

py> sentence = "By the new group"
py> words = sentence.split()
py> words[1:-1]
['the', 'new']

Does that help?


I thought OP wanted:

words[words[0],words[-1]]

But that might be just my caffeine deprived misinterpretation of his
terminology.



>>> sentence = "By the new group"
>>> words = sentence.split()
>>> words[words[0],words[-1]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not tuple

So why would the OP want a TypeError?  Or has caffeine deprivation 
affected your typing skills? :)


--
"Steve is going for the pink ball - and for those of you who are 
watching in black and white, the pink is next to the green." Snooker 
commentator 'Whispering' Ted Lowe.


Mark Lawrence

--
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Search Regular Expression

2013-06-15 Thread Denis McMahon
On Sat, 15 Jun 2013 10:05:01 +, Steven D'Aprano wrote:

> On Sat, 15 Jun 2013 02:42:55 -0700, subhabangalore wrote:
> 
>> Dear Group,
>> 
>> I am trying to search the following pattern in Python.
>> 
>> I have following strings:
>> 
>>  (i)"In the ocean" (ii)"On the ocean" (iii) "By the ocean" (iv) "In
>>  this group" (v) "In this group" (vi) "By the new group"
>>.
>> 
>> I want to extract from the first word to the last word, where first
>> word and last word are varying.
>> 
>> I am looking to extract out:
>>   (i) the (ii) the (iii) the (iv) this (v) this (vi) the new
>>   .
>> 
>> The problem may be handled by converting the string to list and then
>> index of list.
> 
> No need for a regular expression.
> 
> py> sentence = "By the new group"
> py> words = sentence.split()
> py> words[1:-1]
> ['the', 'new']
> 
> Does that help?

I thought OP wanted:

words[words[0],words[-1]]

But that might be just my caffeine deprived misinterpretation of his 
terminology.

-- 
Denis McMahon, denismfmcma...@gmail.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Search Regular Expression

2013-06-15 Thread Mark Lawrence

On 15/06/2013 10:42, subhabangal...@gmail.com wrote:

Dear Group,

I am trying to search the following pattern in Python.

I have following strings:

  (i)"In the ocean"
  (ii)"On the ocean"
  (iii) "By the ocean"
  (iv) "In this group"
  (v) "In this group"
  (vi) "By the new group"
.

I want to extract from the first word to the last word,
where first word and last word are varying.

I am looking to extract out:
   (i) the
   (ii) the
   (iii) the
   (iv) this
   (v) this
   (vi) the new
   .

The problem may be handled by converting the string to list and then
index of list.

But I am thinking if I can use regular expression in Python.

If any one of the esteemed members can help.

Thanking you in Advance,

Regards,
Subhabrata



I tend to reach for string methods rather than an RE so will something 
like this suit you?


c:\Users\Mark\MyPython>type a.py
for s in ("In the ocean",
          "On the ocean",
          "By the ocean",
          "In this group",
          "In this group",
          "By the new group"):
    print(' '.join(s.split()[1:-1]))


c:\Users\Mark\MyPython>a
the
the
the
this
this
the new

--
"Steve is going for the pink ball - and for those of you who are 
watching in black and white, the pink is next to the green." Snooker 
commentator 'Whispering' Ted Lowe.


Mark Lawrence

--
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Search Regular Expression

2013-06-15 Thread Steven D'Aprano
On Sat, 15 Jun 2013 02:42:55 -0700, subhabangalore wrote:

> Dear Group,
> 
> I am trying to search the following pattern in Python.
> 
> I have following strings:
> 
>  (i)"In the ocean"
>  (ii)"On the ocean"
>  (iii) "By the ocean"
>  (iv) "In this group"
>  (v) "In this group"
>  (vi) "By the new group"
>.
> 
> I want to extract from the first word to the last word, where first word
> and last word are varying.
> 
> I am looking to extract out:
>   (i) the
>   (ii) the
>   (iii) the
>   (iv) this
>   (v) this
>   (vi) the new
>   .
> 
> The problem may be handled by converting the string to list and then
> index of list.

No need for a regular expression.


py> sentence = "By the new group"
py> words = sentence.split()
py> words[1:-1]
['the', 'new']

Does that help?
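
If a regular expression is still wanted, as the original question asks, a
rough equivalent might look like the sketch below. The helper name
middle_words is made up for illustration, and the pattern assumes words are
separated by whitespace only.

```python
import re

# first word, whitespace, (captured middle), whitespace, last word
pattern = re.compile(r'\s*\S+\s+(.*?)\s+\S+\s*')

def middle_words(sentence):
    """Return the text between the first and last word, or '' if there is none."""
    m = pattern.fullmatch(sentence)
    return m.group(1) if m else ''

for s in ("In the ocean", "In this group", "By the new group"):
    print(middle_words(s))
```

It does the same job as split() and a slice, only with more to get wrong,
which is rather the point made above.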



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern-match & Replace - help required

2012-12-19 Thread MRAB

On 2012-12-19 14:11, Alexander Blinne wrote:

Am 19.12.2012 14:41, schrieb AT:

Thanks a million
Can you recommend a good online book/tutorial on regular expr. in python?


http://docs.python.org/3/howto/regex.html


Another good resource is:

http://www.regular-expressions.info/

--
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern-match & Replace - help required

2012-12-19 Thread Alexander Blinne
Am 19.12.2012 14:41, schrieb AT:
> Thanks a million
> Can you recommend a good online book/tutorial on regular expr. in python?

http://docs.python.org/3/howto/regex.html
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern-match & Replace - help required

2012-12-19 Thread AT
On Wednesday, 19 December 2012 18:16:18 UTC+5, Peter Otten  wrote:
> AT wrote:
>
> > I am new to python and web2py framework. Need urgent help to match a
> > pattern in an string and replace the matched text.
> >
> > I've this string (basically an sql statement):
> > stmnt = 'SELECT  taxpayer.id,
> >  taxpayer.enc_name,
> >  taxpayer.age,
> >  taxpayer.occupation
> >  FROM taxpayer WHERE (taxpayer.id IS NOT NULL);'
> >
> > The requirement is to replace it with this one:
> > r_stmnt = 'SELECT  taxpayer.id,
> >  decrypt(taxpayer.enc_name),
> >  taxpayer.age,
> >  taxpayer.occupation
> >  FROM taxpayer WHERE (taxpayer.id IS NOT NULL);'
> >
> > Can somebody please help?
>
> > The pattern is '%s.enc_%s', and after matching this pattern want to change
> > it to 'decrypt(%s.enc_%s)'
>
> after = re.compile(r"(\w+[.]enc_\w+)").sub(r"decrypt(\1)", before)

Thanks a million
Can you recommend a good online book/tutorial on regular expr. in python?

Regards
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern-match & Replace - help required

2012-12-19 Thread Peter Otten
AT wrote:

> I am new to python and web2py framework. Need urgent help to match a
> pattern in an string and replace the matched text.
> 
> I've this string (basically an sql statement):
> stmnt = 'SELECT  taxpayer.id,
>  taxpayer.enc_name,
>  taxpayer.age,
>  taxpayer.occupation
>  FROM taxpayer WHERE (taxpayer.id IS NOT NULL);'
> 
> The requirement is to replace it with this one:
> r_stmnt = 'SELECT  taxpayer.id,
>decrypt(taxpayer.enc_name),
>taxpayer.age,
>taxpayer.occupation
>FROM taxpayer WHERE (taxpayer.id IS NOT NULL);'
> 
> Can somebody please help?

> The pattern is '%s.enc_%s', and after matching this pattern want to change
> it to 'decrypt(%s.enc_%s)'

after = re.compile(r"(\w+[.]enc_\w+)").sub(r"decrypt(\1)", before)
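
A sketch of that substitution applied to the statement from the question.
The triple-quoted string is an assumption made here so the multi-line SQL
is valid Python; the pattern and replacement are the ones given above.

```python
import re

before = """SELECT taxpayer.id,
       taxpayer.enc_name,
       taxpayer.age,
       taxpayer.occupation
FROM taxpayer WHERE (taxpayer.id IS NOT NULL);"""

# \w+[.]enc_\w+ matches "<table>.enc_<column>"; \1 re-inserts what matched.
after = re.compile(r"(\w+[.]enc_\w+)").sub(r"decrypt(\1)", before)
print(after)
```

Only taxpayer.enc_name is wrapped; columns without the enc_ prefix are left
alone.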

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern-match & Replace - help required

2012-12-19 Thread AT
On Wednesday, 19 December 2012 16:27:19 UTC+5, Thomas Bach  wrote:
> On Wed, Dec 19, 2012 at 02:42:26AM -0800, AT wrote:
>
> > Hi,
> >
> > I am new to python and web2py framework. Need urgent help to match a
> > pattern in an string and replace the matched text.
> >
>
> Well, what about str.replace then?
>
> >>> 'egg, ham, tomato'.replace('ham', 'spam, ham, spam')
> 'egg, spam, ham, spam, tomato'
>
> If the pattern you want to match is more complicated, have a look at
> the re module!
>
> Regards,
>   Thomas.


The pattern is '%s.enc_%s', and after matching this pattern want to change it 
to 'decrypt(%s.enc_%s)' 

Thanks
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern-match & Replace - help required

2012-12-19 Thread Thomas Bach
On Wed, Dec 19, 2012 at 02:42:26AM -0800, AT wrote:
> Hi,
> 
> I am new to python and web2py framework. Need urgent help to match a
> pattern in an string and replace the matched text.
> 

Well, what about str.replace then?

>>> 'egg, ham, tomato'.replace('ham', 'spam, ham, spam')
'egg, spam, ham, spam, tomato'


If the pattern you want to match is more complicated, have a look at
the re module!

Regards,
Thomas.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern-match & Replace - help required

2012-12-19 Thread Steven D'Aprano
On Wed, 19 Dec 2012 03:01:32 -0800, AT wrote:

> I just wanted to change taxpayer.enc_name in stmnt to
> decrypt(taxpayer.enc_name)
> 
> hope it clarifies?

Maybe. Does this help?

lunch = "Bread, ham, cheese and tomato."
# replace ham with spam
offset = lunch.find('ham')
if offset != -1:
    lunch = lunch[:offset] + 'spam' + lunch[offset + len('ham'):]
print(lunch)




-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern-match & Replace - help required

2012-12-19 Thread AT

On Wednesday, 19 December 2012 15:51:22 UTC+5, Steven D'Aprano  wrote:
> On Wed, 19 Dec 2012 02:42:26 -0800, AT wrote:
>
> > Hi,
> >
> > I am new to python and web2py framework. Need urgent help to match a
> > pattern in an string and replace the matched text.
> >
> > I've this string (basically an sql statement):
> >
> > stmnt = 'SELECT taxpayer.id,
> >  taxpayer.enc_name,
> >  taxpayer.age,
> >  taxpayer.occupation
> >  FROM taxpayer WHERE (taxpayer.id IS NOT NULL);'
> >
> > The requirement is to replace it with this one:
> >
> > r_stmnt = 'SELECT taxpayer.id,
> >  decrypt(taxpayer.enc_name),
> >  taxpayer.age,
> >  taxpayer.occupation
> >  FROM taxpayer WHERE (taxpayer.id IS NOT NULL);'
> >
> > Can somebody please help?
>
> Can you do this?
>
> stmnt = r_stmnt
>
> That should do what you are asking.
>
> If that doesn't solve your problem, you will need to explain your problem
> in more detail.
>
> --
> Steven


I just wanted to change taxpayer.enc_name in stmnt to decrypt(taxpayer.enc_name)

hope it clarifies?

thanks


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern-match & Replace - help required

2012-12-19 Thread Steven D'Aprano
On Wed, 19 Dec 2012 02:42:26 -0800, AT wrote:

> Hi,
> 
> I am new to python and web2py framework. Need urgent help to match a
> pattern in an string and replace the matched text.
> 
> I've this string (basically an sql statement): 
>
> stmnt = 'SELECT taxpayer.id,
>  taxpayer.enc_name,
>  taxpayer.age,
>  taxpayer.occupation
>  FROM taxpayer WHERE (taxpayer.id IS NOT NULL);'
> 
> The requirement is to replace it with this one: 
> 
> r_stmnt = 'SELECT taxpayer.id,
>decrypt(taxpayer.enc_name),
>taxpayer.age,
>taxpayer.occupation
>FROM taxpayer WHERE (taxpayer.id IS NOT NULL);'
> 
> Can somebody please help?

Can you do this?

stmnt = r_stmnt

That should do what you are asking.

If that doesn't solve your problem, you will need to explain your problem 
in more detail.



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern matching

2011-02-24 Thread Jon Clements
On Feb 24, 2:11 am, monkeys paw  wrote:
> if I have a string such as '01/12/2011' and i want
> to reformat it as '20110112', how do i pull out the components
> of the string and reformat them into a DDMM format?
>
> I have:
>
> import re
>
> test = re.compile('\d\d\/')
> f = open('test.html')  # This file contains the html dates
> for line in f:
>      if test.search(line):
>          # I need to pull the date components here

I second using an html parser to extract the content of the TD's, but I
would also go one step further reformatting and do something such as:

>>> from time import strptime, strftime
>>> d = '01/12/2011'
>>> strftime('%Y%m%d', strptime(d, '%m/%d/%Y'))
'20110112'

That way you get some validation about the data, ie, if you get
'13/12/2011' you've probably got mixed data formats.
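
A small sketch of that validation in practice. The function name reformat
is made up for illustration, and the input is assumed to be MM/DD/YYYY as
in the example above.

```python
from time import strptime, strftime

def reformat(date_text):
    """Convert 'MM/DD/YYYY' to 'YYYYMMDD'; return None if it does not parse."""
    try:
        return strftime('%Y%m%d', strptime(date_text, '%m/%d/%Y'))
    except ValueError:
        # strptime rejects impossible dates, e.g. a month of 13
        return None

print(reformat('01/12/2011'))  # 20110112
print(reformat('13/12/2011'))  # None
```

Returning None is just one choice; letting the ValueError propagate may be
better if bad data should stop the run.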


hth

Jon.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern matching

2011-02-24 Thread John S
On Feb 23, 9:11 pm, monkeys paw  wrote:
> if I have a string such as '01/12/2011' and i want
> to reformat it as '20110112', how do i pull out the components
> of the string and reformat them into a DDMM format?
>
> I have:
>
> import re
>
> test = re.compile('\d\d\/')
> f = open('test.html')  # This file contains the html dates
> for line in f:
>      if test.search(line):
>          # I need to pull the date components here
What you need are parentheses, which capture part of the text you're
matching. Each set of parentheses creates a "group". To get to these
groups, you need the match object which is returned by re.search.
Group 0 is the entire match, group 1 is the contents of the first set
of parentheses, and so forth. If the regex does not match, then
re.search returns None.


DATA FILE (test.html):

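<tr><td>David</td><td>02/19/1967</td></tr>
<tr><td>Susan</td><td>05/23/1948</td></tr>
<tr><td>Clare</td><td>09/22/1952</td></tr>
<tr><td>BP</td><td>08/27/1990</td></tr>
<tr><td>Roger</td><td>12/19/1954</td></tr>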



CODE:
import re
rx_test = re.compile(r'(\d{2})/(\d{2})/(\d{4})')

f = open('test.html')
for line in f:
    m = rx_test.search(line)
    if m:
        # group 3 is the year, groups 1 and 2 the month and day
        new_date = m.group(3) + m.group(1) + m.group(2)
        print("raw text: ", m.group(0))
        print("new date: ", new_date)
        print()

OUTPUT:
raw text:  02/19/1967
new date:  19670219

raw text:  05/23/1948
new date:  19480523

raw text:  09/22/1952
new date:  19520922

raw text:  08/27/1990
new date:  19900827

raw text:  12/19/1954
new date:  19541219



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern matching

2011-02-23 Thread Dr Vangel


if I have a string such as '01/12/2011' and i want
to reformat it as '20110112', how do i pull out the components
of the string and reformat them into a DDMM format?

I have:

import re

test = re.compile('\d\d\/')
f = open('test.html')  # This file contains the html dates
for line in f:
    if test.search(line):
        # I need to pull the date components here


I am no Python guru, but you could use BeautifulSoup to parse the HTML, as
it's much easier.


Some untested pseudocode below; adapt it to your needs.

from BeautifulSoup import BeautifulSoup

# read html data or whatever source
html_data = open('/yourwebsite/page.html', 'r').read()


# Create the soup object from the HTML data
soup = BeautifulSoup(html_data)
# Find the proper tag; see the BeautifulSoup docs
someData = soup.find('td', attrs={'name': 'someTable'})
# the value of the 3rd attribute of the tag, just an example
value = someData.attrs[2][1]


##end

Once you have the date in some str format, the next thing is your date
conversion. For this,

refer to dateutil parse: http://labix.org/python-dateutil

Hope it helps.







--
http://mail.python.org/mailman/listinfo/python-list


Re: pattern matching

2011-02-23 Thread Roy Smith
Chris Rebert wrote:

> regex = compile("(\d\d)/(\d\d)/(\d{4})")

I would probably write that as either

r"(\d{2})/(\d{2})/(\d{4})"

or (somewhat less likely)

r"(\d\d)/(\d\d)/(\d\d\d\d)"

Keeping to one consistent style makes it a little easier to read.  Also, 
don't forget the leading `r` to get raw strings.  I've long since given 
up trying to remember the exact rules of what needs to get escaped and 
what doesn't.  If it's a regex, I just automatically make it a raw 
string.

Also, don't overlook the re.VERBOSE flag.  With it, you can write 
positively outrageous expressions which are still quite readable.  With 
it, you could write this regex as:

r" (\d{2}) / (\d{2}) / (\d{4}) "

which takes up only slightly more space, but makes it a whole lot easier 
to scan by eye.
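
As a sketch of how far re.VERBOSE can go, the same pattern can be spread
over several lines with a comment on each part. The sample string and the
name date_rx are assumptions for illustration.

```python
import re

date_rx = re.compile(r"""
    (\d{2})   # first two-digit field
    /
    (\d{2})   # second two-digit field
    /
    (\d{4})   # four-digit year
    """, re.VERBOSE)

m = date_rx.search('<td>01/12/2011</td>')
print(m.groups())  # ('01', '12', '2011')
```

Whitespace and # comments inside the pattern are ignored under this flag,
so a literal space would have to be written as [ ] or escaped.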

I'm still going to stand by my previous statement, however.  If you're 
trying to parse HTML, use an HTML parser.  Using a regex like this is 
perfectly fine for parsing the CDATA text inside the HTML <td> element,
but pattern matching the HTML markup itself is madness.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern matching

2011-02-23 Thread Chris Rebert
On Wed, Feb 23, 2011 at 6:37 PM, Steven D'Aprano
 wrote:
> On Wed, 23 Feb 2011 21:11:53 -0500, monkeys paw wrote:
>> if I have a string such as '01/12/2011' and i want to reformat
>> it as '20110112', how do i pull out the components of the string and
>> reformat them into a DDMM format?
>
> data = '<td>01/12/2011</td>'
> # Throw away tags.
> data = data[4:-5]
> # Separate components.
> month, day, year = data.split('/')
> # Recombine.
> print(year + month + day)
>
>
> No need for the sledgehammer of regexes for cracking this peanut.

Agreed. But "Just 'Cause"(tm), and in order to get in some regex practice:

from re import compile
regex = compile("(\d\d)/(\d\d)/(\d{4})")
for match in regex.finditer(data):
    month, day, year = match.groups()
    print(year + month + day)

Cheers,
Chris
--
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern matching

2011-02-23 Thread Steven D'Aprano
On Wed, 23 Feb 2011 21:11:53 -0500, monkeys paw wrote:

> if I have a string such as '01/12/2011' and i want to reformat
> it as '20110112', how do i pull out the components of the string and
> reformat them into a DDMM format?

data = '<td>01/12/2011</td>'
# Throw away tags.
data = data[4:-5]
# Separate components.
month, day, year = data.split('/')
# Recombine.
print(year + month + day)


No need for the sledgehammer of regexes for cracking this peanut.



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern matching

2011-02-23 Thread Roy Smith
monkeys paw wrote:

> if I have a string such as '01/12/2011' and i want
> to reformat it as '20110112', how do i pull out the components
> of the string and reformat them into a DDMM format?
> 
> I have:
> 
> import re
> 
> test = re.compile('\d\d\/')
> f = open('test.html')  # This file contains the html dates
> for line in f:
>  if test.search(line):
>  # I need to pull the date components here

My first thought is that any attempt to parse HTML by using regex is 
doomed to failure.  HTML is meant to be parsed by an HTML parser.  
Python gives you several to pick from; the best that I know of is the 
third-party lxml package (http://lxml.de/).

My second thought is that my first thought was correct.
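
A rough sketch of the parser route. It uses the standard library's
html.parser rather than lxml, purely so that it runs without a third-party
install; the class name CellText and the sample markup are assumptions, and
nested tables are not handled.

```python
from html.parser import HTMLParser

class CellText(HTMLParser):
    """Collect the text found inside each <td> element."""

    def __init__(self):
        super().__init__()
        self.in_td = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == 'td':
            self.in_td = True
            self.cells.append('')

    def handle_endtag(self, tag):
        if tag == 'td':
            self.in_td = False

    def handle_data(self, data):
        # Only text that sits inside a <td> is kept.
        if self.in_td:
            self.cells[-1] += data

parser = CellText()
parser.feed('<table><tr><td>David</td><td>02/19/1967</td></tr></table>')
print(parser.cells)  # ['David', '02/19/1967']
```

Once the cell text is out of the markup, a regex or strptime on that text
alone is a reasonable next step.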
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern matching with multiple lists

2010-07-16 Thread Tim Chase

On 07/16/2010 02:20 PM, Chad Kellerman wrote:

Greetings,
  I have some code that I wrote and know there is a better way to
write it.  I  wonder if anyone could point me in the right direction
on making this 'cleaner'.

  I have two lists:   liveHostList  = [ app11, app12, web11, web12, host11 ]
                      stageHostList = [ web21, web22, host21, app21, app22 ]

  I need to pair the elements in the list such that:
     app11  pairs with app21
     app12  pairs with app22
     web11  pairs with web21
     web12  pairs with web22
     host11 pairs with host21


While I like MRAB's solution even better than mine[1], you can 
also use:


  liveHostList = ["app11", "app12", "web11", "web12", "host11"]
  stageHostList = ["web21", "web22", "host21", "app21", "app22"]

  def bits(s):
      # (generic name, clone digit), ignoring the live/stage digit
      return (s[:-2], s[-1])

  for live, stage in zip(
          sorted(liveHostList, key=bits),
          sorted(stageHostList, key=bits),
          ):
      print("Match: ", live, stage)

-tkc


[1] His solution is O(N), making one pass through each list, with 
O(1) lookups into the created dict during the 2nd loop, while 
mine is likely overwhelmed by the cost of the sorts...usually O(N 
log N) for most reasonable sorts.  However, this doesn't likely 
matter much until your list-sizes are fairly large.





--
http://mail.python.org/mailman/listinfo/python-list


Re: pattern matching with multiple lists

2010-07-16 Thread MRAB

Chad Kellerman wrote:

Greetings,
 I have some code that I wrote and know there is a better way to
write it.  I  wonder if anyone could point me in the right direction
on making this 'cleaner'.

 I have two lists:   liveHostList  = [ app11, app12, web11, web12, host11 ]
                     stageHostList = [ web21, web22, host21, app21, app22 ]

 I need to pair the elements in the list such that:
    app11  pairs with app21
    app12  pairs with app22
    web11  pairs with web21
    web12  pairs with web22
    host11 pairs with host21

each time I get the list I don't know the order, and the lists
will grow over time.  (hosts will be added in pairs.  app13 to
liveHostList and app23 to stageHostList, etc)


Anyways this is what I have.  I think it can be written better with
map, but not sure.  Any help would be appreciated.

import re
for liveHost in liveHostList:

    nameList = list(liveHost)
    clone    = nameList[-1]
    di       = nameList[-2]
    generic  = liveHost[:-2]

    for stageHost in stageHostList:
        if re.match( generic + '.' + clone, stageHost ):
            print("Got a pair: " + stageHost + liveHost)

Thanks again for any suggestions,
Chad


So you recognise a pair by them having the same 'key', which is:

name[ : -2] + name[-1 : ]


Therefore you can put one of the lists into a dict and look up the name
by its key:

liveHostDict = dict((liveHost[ : -2] + liveHost[-1 : ], liveHost)
                    for liveHost in liveHostList)


for stageHost in stageHostList:
    key = stageHost[ : -2] + stageHost[-1 : ]
    liveHost = liveHostDict[key]
    print("Got a pair: %s %s" % (stageHost, liveHost))
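
The lookup above raises KeyError if a stage host has no live partner. A
sketch of the same dict idea that tolerates that case follows; the helper
pair_key and the extra host "db21" are made up for illustration.

```python
liveHostList = ["app11", "app12", "web11", "web12", "host11"]
stageHostList = ["web21", "web22", "host21", "app21", "app22", "db21"]

def pair_key(name):
    # Drop the second-to-last character (the live/stage digit).
    return name[:-2] + name[-1:]

live_by_key = {pair_key(host): host for host in liveHostList}

pairs = []
unmatched = []
for stage in stageHostList:
    live = live_by_key.get(pair_key(stage))  # None when there is no partner
    if live is None:
        unmatched.append(stage)
    else:
        pairs.append((live, stage))

print(pairs)
print(unmatched)
```

It is still one pass over each list, so the order in which the hosts arrive
does not matter.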

--
http://mail.python.org/mailman/listinfo/python-list


Re: Defining re pattern for matching list of numbers

2009-11-07 Thread Steven D'Aprano
On Fri, 06 Nov 2009 10:16:31 -0800, Chris Rebert wrote:

> Your format seems so simple I have to ask why you're using regexes in
> the first place.

Raymond Hettinger has described some computing techniques as "code 
prions" -- programming advice or techniques which are sometimes useful 
but often actively harmful.

http://www.mail-archive.com/python-list%40python.org/msg262651.html

As useful as regexes are, I think they qualify as code prions too: people 
insist on using them in production code, even when a simple string method 
or function would do the job far more efficiently and readably.



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ask for a RE pattern to match TABLE in html

2008-07-01 Thread David C. Ullrich
In article 
<[EMAIL PROTECTED]>,
 Jonathan Gardner <[EMAIL PROTECTED]> wrote:

> On Jun 27, 10:32 am, "David C. Ullrich" <[EMAIL PROTECTED]> wrote:
> > (ii) The regexes in languages like Python and Perl include
> > features that are not part of the formal CS notion of
> > "regular expression". Do they include something that
> > does allow parsing nested delimiters properly?
> >
> 
> In perl, there are some pretty wild extensions to the regex syntax,
> features that make it much more than a regular expression engine.
> 
> Yes, it is possible to match parentheses and other nested structures
> (such as HTML), and the regex to do so isn't incredibly difficult.
> Note that Python doesn't support this extension.

Huh. My evidently misinformed impression was that the regexes
in P and P were essentially equivalent. (I hope nobody takes
that as a complaint...)

> See http://www.perl.com/pub/a/2003/08/21/perlcookbook.html

-- 
David C. Ullrich
--
http://mail.python.org/mailman/listinfo/python-list

Re: ask for a RE pattern to match TABLE in html

2008-06-30 Thread Jonathan Gardner
On Jun 27, 10:32 am, "David C. Ullrich" <[EMAIL PROTECTED]> wrote:
> (ii) The regexes in languages like Python and Perl include
> features that are not part of the formal CS notion of
> "regular expression". Do they include something that
> does allow parsing nested delimiters properly?
>

In perl, there are some pretty wild extensions to the regex syntax,
features that make it much more than a regular expression engine.

Yes, it is possible to match parentheses and other nested structures
(such as HTML), and the regex to do so isn't incredibly difficult.
Note that Python doesn't support this extension.

See http://www.perl.com/pub/a/2003/08/21/perlcookbook.html
--
http://mail.python.org/mailman/listinfo/python-list


Re: ask for a RE pattern to match TABLE in html

2008-06-30 Thread David C. Ullrich
In article 
<[EMAIL PROTECTED]>,
 Dan <[EMAIL PROTECTED]> wrote:

> On Jun 27, 1:32 pm, "David C. Ullrich" <[EMAIL PROTECTED]> wrote:
> > In article
> > <[EMAIL PROTECTED]>,
> >  Jonathan Gardner <[EMAIL PROTECTED]> wrote:
> >
> > > On Jun 26, 3:22 pm, MRAB <[EMAIL PROTECTED]> wrote:
> > > > Try something like:
> >
> > > > re.compile(r'.*?', re.DOTALL)
> >
> > > So you would pick up strings like "foo > > td>"? I doubt that is what oyster wants.
> >
> > I asked a question recently - nobody answered, I think
> > because they assumed it was just a rhetorical question:
> >
> > (i) It's true, isn't it, that it's impossible for the
> > formal CS notion of "regular expression" to correctly
> > parse nested open/close delimiters?
> 
> Yes. For the proof, you want to look at the pumping lemma found in
> your favorite Theory of Computation textbook.

Ah, thanks. Don't have a favorite text, not having any at all.
But wikipedia works - what I found at 

http://en.wikipedia.org/wiki/Pumping_lemma_for_regular_languages

was pretty clear. (Yes, it's exactly that \1, \2 stuff that
convinced me I really don't understand what one can do with
a Python regex.)
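
For what it's worth, a small sketch of what \1 does: it demands the exact
text the first group captured, which a finite automaton cannot do for words
of unbounded length. The name doubled and the sample strings are made up
for illustration.

```python
import re

# (\w+) captures a word; \1 then requires that same text again.
doubled = re.compile(r'\b(\w+) \1\b')

hit = doubled.search('it was the the best')
print(hit.group(1))               # the
print(doubled.search('ab ba'))    # None
```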

> >
> > (ii) The regexes in languages like Python and Perl include
> > features that are not part of the formal CS notion of
> > "regular expression". Do they include something that
> > does allow parsing nested delimiters properly?
> 
> So, I think most of the extensions fall into syntactic sugar
> (certainly all the character classes \b \s \w, etc). The ability to
> look at input without consuming it is more than syntactic sugar, but
> my intuition is that it could be pretty easily modeled by a
> nondeterministic finite state machine, which is of equivalent power to
> REs. The only thing I can really think of that is completely
> non-regular is the \1 \2, etc. syntax to match previously matched strings
> exactly. But since you can't do an arbitrary number of them, I don't
> think it's actually context free. (I'm not prepared to give a proof
> either way). Needless to say that even if you could, it would be
> highly impractical to match parentheses using those.
> 
> So, yeah, to match arbitrary nested delimiters, you need a real
> context free parser.
> 
> >
> > --
> > David C. Ullrich
> 
> 
> -Dan

-- 
David C. Ullrich
--
http://mail.python.org/mailman/listinfo/python-list


Re: ask for a RE pattern to match TABLE in html

2008-06-27 Thread Dan
On Jun 27, 1:32 pm, "David C. Ullrich" <[EMAIL PROTECTED]> wrote:
> In article
> <[EMAIL PROTECTED]>,
>  Jonathan Gardner <[EMAIL PROTECTED]> wrote:
>
> > On Jun 26, 3:22 pm, MRAB <[EMAIL PROTECTED]> wrote:
> > > Try something like:
>
> > > re.compile(r'.*?', re.DOTALL)
>
> > So you would pick up strings like "foo > td>"? I doubt that is what oyster wants.
>
> I asked a question recently - nobody answered, I think
> because they assumed it was just a rhetorical question:
>
> (i) It's true, isn't it, that it's impossible for the
> formal CS notion of "regular expression" to correctly
> parse nested open/close delimiters?

Yes. For the proof, you want to look at the pumping lemma found in
your favorite Theory of Computation textbook.

>
> (ii) The regexes in languages like Python and Perl include
> features that are not part of the formal CS notion of
> "regular expression". Do they include something that
> does allow parsing nested delimiters properly?

So, I think most of the extensions fall into syntactic sugar
(certainly all the character classes \b \s \w, etc). The ability to
look at input without consuming it is more than syntactic sugar, but
my intuition is that it could be pretty easily modeled by a
nondeterministic finite state machine, which is of equivalent power to
REs. The only thing I can really think of that is completely
non-regular is the \1 \2, etc. syntax to match previously matched strings
exactly. But since you can't do an arbitrary number of them, I don't
think it's actually context free. (I'm not prepared to give a proof
either way). Needless to say that even if you could, it would be
highly impractical to match parentheses using those.

So, yeah, to match arbitrary nested delimiters, you need a real
context free parser.
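
To make that concrete, here is a minimal sketch of checking nested
delimiters by hand. The function name balanced is made up, and the depth
counter stands in for the stack a real context-free parser would keep; that
unbounded count is exactly what a finite state machine lacks.

```python
def balanced(text, open_ch='(', close_ch=')'):
    """Return True if every open_ch has a matching, properly nested close_ch."""
    depth = 0
    for ch in text:
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth < 0:      # a closer with no opener before it
                return False
    return depth == 0

print(balanced('(a (b) (c (d)))'))  # True
print(balanced('(a (b)'))           # False
print(balanced(')('))               # False
```

With several kinds of delimiter, or with HTML tags, the counter becomes a
real stack of open names.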

>
> --
> David C. Ullrich


-Dan
--
http://mail.python.org/mailman/listinfo/python-list


Re: ask for a RE pattern to match TABLE in html

2008-06-27 Thread David C. Ullrich
In article 
<[EMAIL PROTECTED]>,
 Jonathan Gardner <[EMAIL PROTECTED]> wrote:

> On Jun 26, 3:22 pm, MRAB <[EMAIL PROTECTED]> wrote:
> > Try something like:
> >
> > re.compile(r'.*?', re.DOTALL)
> 
> So you would pick up strings like "foo td>"? I doubt that is what oyster wants.

I asked a question recently - nobody answered, I think
because they assumed it was just a rhetorical question:

(i) It's true, isn't it, that it's impossible for the
formal CS notion of "regular expression" to correctly
parse nested open/close delimiters?

(ii) The regexes in languages like Python and Perl include
features that are not part of the formal CS notion of
"regular expression". Do they include something that
does allow parsing nested delimiters properly?

-- 
David C. Ullrich
--
http://mail.python.org/mailman/listinfo/python-list

Re: ask for a RE pattern to match TABLE in html

2008-06-26 Thread Jonathan Gardner
On Jun 26, 3:22 pm, MRAB <[EMAIL PROTECTED]> wrote:
> Try something like:
>
> re.compile(r'.*?', re.DOTALL)

So you would pick up strings like "foo"? I doubt that is what oyster wants.
--
http://mail.python.org/mailman/listinfo/python-list


Re: ask for a RE pattern to match TABLE in html

2008-06-26 Thread Jonathan Gardner
On Jun 26, 11:07 am, Grant Edwards <[EMAIL PROTECTED]> wrote:
> On 2008-06-26, Stefan Behnel <[EMAIL PROTECTED]> wrote:
> >
> > Why not use an HTML parser instead?
> >
>
> Stating it differently: in order to correctly recognize HTML
> tags, you must use an HTML parser.  Trying to write an HTML
> parser in a single RE is probably not practical.
>

s/practical/possible

It isn't *possible* to grok HTML with regular expressions. Individual
tags--yes. But not a full element where nesting is possible. At least
not properly.

Maybe we need some notes on the limits of regular expressions in the
re documentation for people who haven't taken the computer science
courses on parsing and grammars. Then we could explain the necessity
of real parsers and grammars, at least in layman's terms.
--
http://mail.python.org/mailman/listinfo/python-list


Re: ask for a RE pattern to match TABLE in html

2008-06-26 Thread MRAB
On Jun 26, 7:26 pm, "David C. Ullrich" <[EMAIL PROTECTED]> wrote:
> In article <[EMAIL PROTECTED]>,
>  Cédric Lucantis <[EMAIL PROTECTED]> wrote:
>
>
>
> > Le Thursday 26 June 2008 15:53:06 oyster, vous avez écrit :
> > > that is, there is no TABLE tag between a TABLE, for example
> > > something with out table tag
> > > what is the RE pattern? thanks
>
> > > the following is not right
> > > [^table]*?
>
> > The construct [abc] does not match a whole word but only one char, so  
> > [^table] means "any char which is not t, a, b, l or e".
>
> > Anyway the inside table word won't match your pattern, as there are '<'
> > and '>' in it, and these chars have to be escaped when used as simple text.
> > So this should work:
>
> > re.compile(r'.*')
> >                     ^ this is to avoid matching a tag name starting with
> >                     table
> > (like )
>
> Doesn't work - for example it matches ''
> (and in fact if the html contains any number of tables it's going
> to match the string starting at the start of the first table and
> ending at the end of the last one.)
>
Try something like:

re.compile(r'.*?', re.DOTALL)
--
http://mail.python.org/mailman/listinfo/python-list


Re: ask for a RE pattern to match TABLE in html

2008-06-26 Thread David C. Ullrich
In article <[EMAIL PROTECTED]>,
 Cédric Lucantis <[EMAIL PROTECTED]> wrote:

> Le Thursday 26 June 2008 15:53:06 oyster, vous avez écrit :
> > that is, there is no TABLE tag between a TABLE, for example
> > something with out table tag
> > what is the RE pattern? thanks
> >
> > the following is not right
> > [^table]*?
> 
> The construct [abc] does not match a whole word but only one char, so  
> [^table] means "any char which is not t, a, b, l or e".
> 
> Anyway the inside table word won't match your pattern, as there are '<' 
> and '>' in it, and these chars have to be escaped when used as simple text.
> So this should work:
> 
> re.compile(r'.*')
> ^ this is to avoid matching a tag name starting with 
> table 
> (like )

Doesn't work - for example it matches ''
(and in fact if the html contains any number of tables it's going
to match the string starting at the start of the first table and
ending at the end of the last one.)

-- 
David C. Ullrich
--
http://mail.python.org/mailman/listinfo/python-list

Re: ask for a RE pattern to match TABLE in html

2008-06-26 Thread Grant Edwards
On 2008-06-26, Stefan Behnel <[EMAIL PROTECTED]> wrote:
> oyster wrote:
>> that is, there is no TABLE tag between a TABLE, for example
>> something with out table tag
>> what is the RE pattern? thanks
>> 
>> the following is not right
>> [^table]*?
>
> Why not use an HTML parser instead?

Stating it differently: in order to correctly recognize HTML
tags, you must use an HTML parser.  Trying to write an HTML
parser in a single RE is probably not practical.

-- 
Grant Edwards   grante Yow! I want another
  at   RE-WRITE on my CEASAR
   visi.comSALAD!!
--
http://mail.python.org/mailman/listinfo/python-list


Re: ask for a RE pattern to match TABLE in html

2008-06-26 Thread Stefan Behnel
oyster wrote:
> that is, there is no TABLE tag between a TABLE, for example
> something with out table tag
> what is the RE pattern? thanks
> 
> the following is not right
> [^table]*?

Why not use an HTML parser instead? Try lxml.html.

http://codespeak.net/lxml/
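
To make the parser suggestion concrete, here is a rough sketch that finds tables containing no nested table. It deliberately swaps in the standard library's html.parser (the Python 3 name of the module) instead of lxml so that it runs without extra installs; the class name InnermostTables and its attributes are invented for this example.

```python
from html.parser import HTMLParser


class InnermostTables(HTMLParser):
    """Collect the text of every <table> that contains no nested <table>.

    Illustrative class; the name and the `tables` attribute are assumptions.
    """

    def __init__(self):
        super().__init__()
        self.stack = []    # one [text_chunks, has_nested_table] entry per open table
        self.tables = []   # text of each innermost table, in closing order

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.stack:
                self.stack[-1][1] = True   # the enclosing table is not innermost
            self.stack.append([[], False])

    def handle_endtag(self, tag):
        if tag == "table" and self.stack:
            chunks, has_nested = self.stack.pop()
            if not has_nested:
                self.tables.append("".join(chunks).strip())

    def handle_data(self, data):
        if self.stack:
            self.stack[-1][0].append(data)


parser = InnermostTables()
parser.feed(
    "<table><tr><td>outer<table><tr><td>inner</td></tr></table></td></tr></table>"
    "<table><tr><td>alone</td></tr></table>"
)
parser.close()
```

After the feed, parser.tables holds the text of the inner table and of the stand-alone one, while the outer table is skipped because it contains another table.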

Stefan
--
http://mail.python.org/mailman/listinfo/python-list


Re: ask for a RE pattern to match TABLE in html

2008-06-26 Thread Cédric Lucantis
Le Thursday 26 June 2008 15:53:06 oyster, vous avez écrit :
> that is, there is no TABLE tag between a TABLE, for example
> something with out table tag
> what is the RE pattern? thanks
>
> the following is not right
> [^table]*?

The construct [abc] does not match a whole word but only one char, so  
[^table] means "any char which is not t, a, b, l or e".
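
A quick way to see the difference, as a small sketch (the sample string is made up):

```python
import re

# [^table] is a negated character class: it matches ONE character
# that is not t, a, b, l or e.
single_chars = re.findall(r"[^table]", "a table cell")

# A literal word has to be written out as a sequence, not inside brackets.
whole_word = re.findall(r"table", "a table cell")
```

single_chars contains only the two spaces and the "c", since every other character of the sample is in the excluded set; whole_word contains the single match "table".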

Anyway the inside table word won't match your pattern, as there are '<' 
and '>' in it, and these chars have to be escaped when used as simple text.
So this should work:

re.compile(r'.*')
^ this is to avoid matching a tag name starting with table 
(like )

-- 
Cédric Lucantis
--
http://mail.python.org/mailman/listinfo/python-list


ask for a RE pattern to match TABLE in html

2008-06-26 Thread oyster
that is, there is no TABLE tag between a TABLE, for example
something with out table tag
what is the RE pattern? thanks

the following is not right
[^table]*?
--
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Matching Over Python Lists

2008-06-22 Thread eliben
> Fair enough. To help you understand the method I used, I'll give you
> this hint. It's true that regex only works on strings. However, is there
> any way to convert arbitrarily complex data structures to string
> representations? You don't need to be an experienced Python user to
> answer this ;)

As Paddy noted before, your solution has a problem: regexes can't
match nested parentheses, so I think your method will break on
nested lists, unless your actual inputs are much simpler than the
general case.

Eli
--
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Matching Over Python Lists

2008-06-22 Thread Chris
On Jun 19, 9:03 pm, John Machin <[EMAIL PROTECTED]> wrote:
> On Jun 20, 10:45 am, Chris <[EMAIL PROTECTED]> wrote:
>
> > On Jun 17, 1:09 pm, [EMAIL PROTECTED] wrote:
>
> > > Kirk Strauser:
>
> > > > Hint: recursion.  Your general algorithm will be something like:
>
> > > Another solution is to use a better (different) language, that has
> > > built-in pattern matching, or allows to create one.
>
> > > Bye,
> > > bearophile
>
> > Btw, Python's stdlib includes a regular expression library. I'm not
> > sure if you're trolling or simply unaware of it, but I've found it
> > quite adequate for most tasks.
>
> Kindly consider a third possibility: bearophile is an experienced
> Python user, has not to my knowledge exhibited any troll-like
> behaviour in the past, and given that you seem to be happy using the
> re module not on strings but on lists of integers, may have been
> wondering whether *you* were trolling or just plain confused but just
> too polite to wonder out loud :-)

Fair enough. To help you understand the method I used, I'll give you
this hint. It's true that regex only works on strings. However, is there
any way to convert arbitrarily complex data structures to string
representations? You don't need to be an experienced Python user to
answer this ;)
--
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Matching Over Python Lists

2008-06-20 Thread MRAB
On Jun 20, 1:45 am, Chris <[EMAIL PROTECTED]> wrote:
> On Jun 17, 1:09 pm, [EMAIL PROTECTED] wrote:
>
> > Kirk Strauser:
>
> > > Hint: recursion.  Your general algorithm will be something like:
>
> > Another solution is to use a better (different) language, that has
> > built-in pattern matching, or allows to create one.
>
> > Bye,
> > bearophile
>
> Btw, Python's stdlib includes a regular expression library. I'm not
> sure if you're trolling or simply unaware of it, but I've found it
> quite adequate for most tasks.

bearophile was talking about matching lists and tuples, not matching
strings.

Python's regular expression module works with characters in strings,
but the same approach can be applied to items in lists and tuples.
--
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Matching Over Python Lists

2008-06-19 Thread Paddy
On Jun 20, 1:44 am, Chris <[EMAIL PROTECTED]> wrote:
> Thanks for your help. Those weren't quite what I was looking for, but
> I ended up figuring it out on my own. Turns out you can actually
> search nested Python lists using simple regular expressions.

Strange?
How do you match nested '[' ... ']' brackets?

- Paddy.
--
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Matching Over Python Lists

2008-06-19 Thread John Machin
On Jun 20, 10:45 am, Chris <[EMAIL PROTECTED]> wrote:
> On Jun 17, 1:09 pm, [EMAIL PROTECTED] wrote:
>
> > Kirk Strauser:
>
> > > Hint: recursion.  Your general algorithm will be something like:
>
> > Another solution is to use a better (different) language, that has
> > built-in pattern matching, or allows to create one.
>
> > Bye,
> > bearophile
>
> Btw, Python's stdlib includes a regular expression library. I'm not
> sure if you're trolling or simply unaware of it, but I've found it
> quite adequate for most tasks.

Kindly consider a third possibility: bearophile is an experienced
Python user, has not to my knowledge exhibited any troll-like
behaviour in the past, and given that you seem to be happy using the
re module not on strings but on lists of integers, may have been
wondering whether *you* were trolling or just plain confused but just
too polite to wonder out loud :-)
--
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Matching Over Python Lists

2008-06-19 Thread Chris
On Jun 17, 1:09 pm, [EMAIL PROTECTED] wrote:
> Kirk Strauser:
>
> > Hint: recursion.  Your general algorithm will be something like:
>
> Another solution is to use a better (different) language, that has
> built-in pattern matching, or allows to create one.
>
> Bye,
> bearophile

Btw, Python's stdlib includes a regular expression library. I'm not
sure if you're trolling or simply unaware of it, but I've found it
quite adequate for most tasks.
--
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Matching Over Python Lists

2008-06-19 Thread Chris
Thanks for your help. Those weren't quite what I was looking for, but
I ended up figuring it out on my own. Turns out you can actually
search nested Python lists using simple regular expressions.
--
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Matching Over Python Lists

2008-06-17 Thread bearophileHUGS
Kirk Strauser:
> Hint: recursion.  Your general algorithm will be something like:

Another solution is to use a better (different) language, that has
built-in pattern matching, or allows to create one.

Bye,
bearophile
--
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Matching Over Python Lists

2008-06-17 Thread Kirk Strauser
At 2008-06-17T05:55:52Z, Chris <[EMAIL PROTECTED]> writes:

> Is anyone aware of any prior work done with searching or matching a
> pattern over nested Python lists? I have this problem where I have a
> list like:
>
> [1, 2, [1, 2, [6, 7], 9, 9], 10]
>
> and I'd like to search for the pattern [1, 2, ANY] so that it returns:
>
> [1, 2, [1, 2, [6, 7], 9, 9], 10]
> [1, 2, [6, 7], 9, 9]

Hint: recursion.  Your general algorithm will be something like:

def compare(lst, function):
    # Print lst if it matches, then recurse into any nested lists.
    if function(lst):
        print(lst)
    for item in lst:
        if isinstance(item, list):
            compare(item, function)

def check(lst):
    # True when lst starts with [1, 2] and has at least one more item.
    return lst[:2] == [1, 2] and len(lst) > 2
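
A variation on the same recursion, sketched for Python 3: it collects the matches instead of printing them and takes the pattern as data. The ANY marker and the name find_matches are hypothetical, and the pattern is treated as a prefix (the list must start with it), which is how I read the expected output in the question.

```python
ANY = object()  # hypothetical wildcard marker standing for "any single item"


def find_matches(lst, pattern):
    """Yield lst and every nested list whose leading items fit pattern."""
    if len(lst) >= len(pattern) and all(
        p is ANY or p == item for p, item in zip(pattern, lst)
    ):
        yield lst
    for item in lst:
        if isinstance(item, list):
            # Recurse so that lists at any depth are checked as well.
            yield from find_matches(item, pattern)


data = [1, 2, [1, 2, [6, 7], 9, 9], 10]
found = list(find_matches(data, [1, 2, ANY]))
```

With this data, found holds the outer list followed by the first nested list; the innermost [6, 7] is too short to fit the pattern.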
-- 
Kirk Strauser
The Day Companies
--
http://mail.python.org/mailman/listinfo/python-list


Re: Tips Re Pattern Matching / REGEX

2008-03-27 Thread Miki
Hello,

> I have a large text file (1GB or so) with structure similar to the
> html example below.
>
> I have to extract content (text between div and tr tags) from this
> file and put it into a spreadsheet or a database - given my limited
> python knowledge I was going to try to do this with regex pattern
> matching.
>
> Would someone be able to provide pointers regarding how do I approach
> this? Any code samples would be greatly appreciated.
The ultimate tool for handling HTML is 
http://www.crummy.com/software/BeautifulSoup/
where you can do stuff like:
    soup = BeautifulSoup(html)
    for div in soup("div", {"class" : "special"}):
        ...

Not sure how fast it is though.

There is also the htmllib module that comes with python, it might do
the work as well and maybe a bit faster.
If the file is valid HTML and you need some speed, have a look at
xml.sax.
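
For a file that large, one option is to feed the parser in pieces so the whole thing never sits in memory. A rough sketch, assuming Python 3 and using the standard library's html.parser rather than any of the tools named above; DivText, extract_divs and the chunk size are invented for the example, and it only handles div (tr would work the same way).

```python
import io
from html.parser import HTMLParser


class DivText(HTMLParser):
    """Gather the text inside each top-level <div>, one string per div."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <div> elements are currently open
        self.chunks = []    # text pieces of the div being read
        self.rows = []      # finished div texts

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                # Join the pieces and collapse line breaks into single spaces.
                self.rows.append(" ".join("".join(self.chunks).split()))
                self.chunks = []

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_divs(fileobj, chunk_size=64 * 1024):
    """Feed fileobj to the parser piece by piece and return the div texts."""
    parser = DivText()
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)
    parser.close()
    return parser.rows


sample = io.StringIO("<div>Text1: spans\nseveral rows</div><div>Foot</div>")
rows = extract_divs(sample, chunk_size=8)
```

The tiny chunk size in the usage lines is only there to show that tags and text split across reads are still handled; for a real file you would open it in text mode and keep the default. For hundreds of thousands of items you would probably also write each row out as it completes instead of keeping the rows list.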

HTH,
--
Miki <[EMAIL PROTECTED]>
http://pythonwise.blogspot.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Tips Re Pattern Matching / REGEX

2008-03-27 Thread egonslokar
Hello Python Community,

I have a large text file (1GB or so) with structure similar to the
html example below.

I have to extract content (text between div and tr tags) from this
file and put it into a spreadsheet or a database - given my limited
python knowledge I was going to try to do this with regex pattern
matching.

Would someone be able to provide pointers regarding how do I approach
this? Any code samples would be greatly appreciated.

Thanks.

Sam





\\ there are hundreds of thousands of items

\\Item1

123

Text1: What do I do with these lines
That span several rows? 
...
Foot

\\Item2

First Line Can go here
But the second line can go here
...
Foot
Can span
Over several pages


\\Item3

First Line Can go here
But the second line can go here
...
This can
Span several rows




-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern combinations

2007-09-17 Thread dohertywa
On Sep 17, 3:11 pm, "Shawn Milochik" <[EMAIL PROTECTED]> wrote:
> On 9/17/07, dorje tarap <[EMAIL PROTECTED]> wrote:
>
>
>
> > Hi all,
>
> >  Given some patterns such as "...t...s." I need to make all possible
> > combinations given a separate list for each position. The length of the
> > pattern is fixed to 9, so thankfully that reduces a bit of the complexity.
>
> >  For example I have the following:
>
> >  pos1 = ['a',' t']
> >  pos2 = ['r', 's']
> >  pos3 = ['n', 'f']
>
> >  So if the pattern contains a '.' character at position 1 it could be 'a' or
> > 't'. For the pattern '.s.' (length of 3 as example) all combinations would
> > be:
>
> >  asn
> >  asf
> >  tsn
> >  tsf
>
> >  Thanks
> > --
> >http://mail.python.org/mailman/listinfo/python-list
>
> Sounds like homework to me.

Check out http://probstat.sf.net/; it will sort you out quickly.
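
If you'd rather stay in the standard library, itertools.product does the same job. A small sketch; the function name expand is made up, and I'm assuming the leading space in ' t' in the question is a typo for 't'.

```python
from itertools import product


def expand(pattern, choices):
    """Return every string obtained by filling each '.' in pattern.

    choices[i] lists the letters allowed at position i; positions holding
    a literal character are kept as they are.
    """
    slots = [choices[i] if ch == "." else [ch] for i, ch in enumerate(pattern)]
    return ["".join(combo) for combo in product(*slots)]


pos1 = ["a", "t"]
pos2 = ["r", "s"]
pos3 = ["n", "f"]
combos = expand(".s.", [pos1, pos2, pos3])
```

For the three-character example this gives asn, asf, tsn and tsf, in that order; a nine-character pattern works the same way with nine lists.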

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern combinations

2007-09-17 Thread Shawn Milochik
On 9/17/07, dorje tarap <[EMAIL PROTECTED]> wrote:
> Hi all,
>
>  Given some patterns such as "...t...s." I need to make all possible
> combinations given a separate list for each position. The length of the
> pattern is fixed to 9, so thankfully that reduces a bit of the complexity.
>
>  For example I have the following:
>
>  pos1 = ['a',' t']
>  pos2 = ['r', 's']
>  pos3 = ['n', 'f']
>
>  So if the pattern contains a '.' character at position 1 it could be 'a' or
> 't'. For the pattern '.s.' (length of 3 as example) all combinations would
> be:
>
>  asn
>  asf
>  tsn
>  tsf
>
>  Thanks
> --
> http://mail.python.org/mailman/listinfo/python-list
>


Sounds like homework to me.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern for error checking easiest-first?

2007-08-20 Thread Gabriel Genellina
On 20 ago, 18:01, [EMAIL PROTECTED] wrote:

> The problem is that code like this does error checking backwards. A
> call to NetworkedThing.changeMe will first do a slow error check and
> then a fast one. Obviously there are various ways to get around this -
> either have the subclass explicitly ask the superclass to error check
> first, or vice totally versa. Is there some accepted pattern/idiom for
> handling this issue?

What about this:

class AbstractThing(object):
    def changeMe(self, blah):
        self.verify_blah(blah)
        self.blah = blah

    def verify_blah(self, blah):
        if blah < 1:
            raise MyException

class NetworkedThing(AbstractThing):
    def verify_blah(self, blah):
        # Run the cheap base-class check first, then the slow network one.
        AbstractThing.verify_blah(self, blah)
        if blah > self.getUpperLimitOverTheNetworkSlowly():
            raise MyOtherException

That is, it's the verify step that is overridden/enhanced, not the
changeMe method, which stays the same.
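
To check the ordering, here is a self-contained sketch of the same idea. The exception classes are empty stand-ins, getUpperLimitOverTheNetworkSlowly is assumed to be a method and is stubbed to return 10, and the slow_calls counter is a made-up addition that only exists to show when the slow check runs.

```python
class MyException(Exception):
    pass


class MyOtherException(Exception):
    pass


class AbstractThing(object):
    def changeMe(self, blah):
        self.verify_blah(blah)
        self.blah = blah

    def verify_blah(self, blah):
        if blah < 1:
            raise MyException


class NetworkedThing(AbstractThing):
    slow_calls = 0  # hypothetical counter, only here to make the ordering visible

    def getUpperLimitOverTheNetworkSlowly(self):
        # Stand-in for the real network round trip.
        self.slow_calls += 1
        return 10

    def verify_blah(self, blah):
        AbstractThing.verify_blah(self, blah)   # fast check first
        if blah > self.getUpperLimitOverTheNetworkSlowly():
            raise MyOtherException


thing = NetworkedThing()
try:
    thing.changeMe(0)        # fails the fast check
except MyException:
    pass
calls_after_fast_failure = thing.slow_calls

thing.changeMe(5)            # passes both checks
calls_after_success = thing.slow_calls
```

The slow method is never reached when the cheap check fails, and it is called exactly once for a value that passes.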

--
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern match !

2007-07-15 Thread Asun Friere
On Jul 11, 9:29 pm, Helmut Jarausch <[EMAIL PROTECTED]>
wrote:
> import re
> P=re.compile(r'(\w+(?:[-.]\d+)+)-RHEL3-Linux\.RPM')
> S="hpsmh-1.1.1.2-0-RHEL3-Linux.RPM"
> PO= P.match(S)
> if  PO :
>print PO.group(1)

Isn't a regexp overkill here when this will do:

head = filename[:filename.index('-RHEL3')]

Of course if you need to make it more generic (as in Jay's solution
below), re is the way to go.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern match !

2007-07-14 Thread Jay Loden
[EMAIL PROTECTED] wrote:
>> A slightly more generic match in case your package names turn out to be less 
>> consistent than given in the test cases:
>>
>> #!/usr/bin/python
>>
>> import re
>> pattern = re.compile(r'(\w+?-(\d+[\.-])+\d+?)-\D+.*RPM')
>> pkgnames = ["hpsmh-1.1.1.2-0-RHEL3-Linux.RPM", 
>> "hpsmh-1.1.1.2-RHEL3-Linux.RPM"]
>> for pkg in pkgnames:
>>   matchObj = pattern.search(pkg)
>>   if matchObj:
>> print matchObj.group(1)
>>
>> Still assumes it will end in RPM (all caps), but if you add the flag "re.I" 
>> to the re.compile() call, it will match case-insensitive.
>>
>> Hope that helps,
>>
>> -Jay
> 
> How about if i had something like 1-3 words in the application name:
> websphere-pk543-1.1.4.2-1-RHEL3-i386.rpm (in this case are 2 words)?

Try this instead then:

#!/usr/bin/python

import re
pattern = re.compile(r'((\w+?-)+?(\d+[\.-])+\d+?)-\D+.*RPM', re.I)
pkgnames = ["hpsmh-1.1.1.2-0-RHEL3-Linux.RPM", "hpsmh-1.1.1.2-RHEL3-Linux.RPM", 
"websphere-pk543-1.1.4.2-1-RHEL3-i386.rpm"]
for pkg in pkgnames:
    matchObj = pattern.search(pkg)
    if matchObj:
        print(matchObj.group(1))
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern match !

2007-07-11 Thread Jay Loden

Helmut Jarausch wrote:
> [EMAIL PROTECTED] wrote:
>> Extract the application name with version from an RPM string like
>> hpsmh-1.1.1.2-0-RHEL3-Linux.RPM, i require to extract hpsmh-1.1.1.2-0
>> from above string. Sometimes the RPM string may be hpsmh-1.1.1.2-RHEL3-
>> Linux.RPM.
>>
> 
> Have a try with
> 
> import re
> P=re.compile(r'(\w+(?:[-.]\d+)+)-RHEL3-Linux\.RPM')
> S="hpsmh-1.1.1.2-0-RHEL3-Linux.RPM"
> PO= P.match(S)
> if  PO :
>print PO.group(1)


A slightly more generic match in case your package names turn out to be less 
consistent than given in the test cases:

#!/usr/bin/python

import re
pattern = re.compile(r'(\w+?-(\d+[\.-])+\d+?)-\D+.*RPM')
pkgnames = ["hpsmh-1.1.1.2-0-RHEL3-Linux.RPM", "hpsmh-1.1.1.2-RHEL3-Linux.RPM"]
for pkg in pkgnames:
    matchObj = pattern.search(pkg)
    if matchObj:
        print(matchObj.group(1))

Still assumes it will end in RPM (all caps), but if you add the flag "re.I" to 
the re.compile() call, it will match case-insensitive. 

Hope that helps, 

-Jay
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern match !

2007-07-11 Thread Helmut Jarausch
[EMAIL PROTECTED] wrote:
> Extract the application name with version from an RPM string like
> hpsmh-1.1.1.2-0-RHEL3-Linux.RPM, i require to extract hpsmh-1.1.1.2-0
> from above string. Sometimes the RPM string may be hpsmh-1.1.1.2-RHEL3-
> Linux.RPM.
> 

Have a try with

import re
P=re.compile(r'(\w+(?:[-.]\d+)+)-RHEL3-Linux\.RPM')
S="hpsmh-1.1.1.2-0-RHEL3-Linux.RPM"
PO= P.match(S)
if  PO :
   print PO.group(1)



-- 
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern match !

2007-07-10 Thread Asun Friere
On Jul 11, 1:40 pm, [EMAIL PROTECTED] wrote:
> Extract the application name with version from an RPM string like
> hpsmh-1.1.1.2-0-RHEL3-Linux.RPM, i require to extract hpsmh-1.1.1.2-0
> from above string. Sometimes the RPM string may be hpsmh-1.1.1.2-RHEL3-
> Linux.RPM.

Now that list-like slicing and indexing works on strings, why not
just slice the string, using .index to locate '-RHEL'?
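
Roughly like this, as a sketch; the helper name package_prefix and its marker parameter are made up, and returning None when the marker is missing is just one possible choice.

```python
def package_prefix(filename, marker="-RHEL"):
    """Return the part of filename before marker, or None if it is absent."""
    try:
        return filename[:filename.index(marker)]
    except ValueError:
        # str.index raises ValueError when the marker is not found.
        return None


names = ["hpsmh-1.1.1.2-0-RHEL3-Linux.RPM", "hpsmh-1.1.1.2-RHEL3-Linux.RPM"]
prefixes = [package_prefix(name) for name in names]
```

Both sample names from the question come out with everything from '-RHEL' onwards cut off.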

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern match !

2007-07-10 Thread Steven D'Aprano
On Wed, 11 Jul 2007 03:40:06 +, hari.siri74 wrote:

> Extract the application name with version from an RPM string like
> hpsmh-1.1.1.2-0-RHEL3-Linux.RPM, i require to extract hpsmh-1.1.1.2-0
> from above string. Sometimes the RPM string may be hpsmh-1.1.1.2-RHEL3-
> Linux.RPM.

Thank you for sharing.

The answer to your problem is here: 
http://tinyurl.com/anel


-- 
Steven.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Classification Frameworks?

2007-06-12 Thread Miki
Hello Evan,

> What frameworks are there available for doing pattern classification?
> ...
Two Bayesian classifiers are SpamBayes (http://spambayes.sf.net) and
Reverend Thomas (http://www.divmod.org/projects/reverend).
IMO the latter will be easier to play with.

> Also, as a sidenote, are there any texts that anyone can recommend to
> me for learning more about this area?
A good book about NLP is http://nlp.stanford.edu/fsnlp/ which has a
chapter about text classification. http://www.cs.cmu.edu/~tom/mlbook.html
has some good coverage of the subject as well.

HTH.
--
Miki Tebeka <[EMAIL PROTECTED]>
http://pythonwise.blogspot.com

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Classification Frameworks?

2007-06-12 Thread Evan Klitzke
On 6/12/07, Steven Bethard <[EMAIL PROTECTED]> wrote:
> In fact, a wide variety of classifiers are used in text classification,
> including Bayesian approaches, support vector machines, conditional
> random fields, etc.
>
> > Are there any other frameworks I should be aware of?
>
> I have used (but not recently) Orange:
>
>  http://www.ailab.si/orange
>
> I haven't used, but have been meaning to try, PyML:
>
>  http://pyml.sourceforge.net/
>
> A more recent addition (whose documentation needs work) is:
>
>  http://montepython.sourceforge.net/
>
> And here's a Summer of Code project to build an ML library:
>
>  http://projects.scipy.org/scipy/scipy/wiki/MachineLearning
>
> These are all general-purpose machine learning frameworks. So they can
> be applied to pretty much any classification problem (including the text
> classification problems you're looking at). You just need to pick out a
> set of relevant features to describe your data, and feed those features
> along with your chosen labels to a machine learning algorithm.
>
> STeVe

Thanks Steven (and Diez), the projects you pointed me to look like
great places to start.

-- 
Evan Klitzke <[EMAIL PROTECTED]>
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Classification Frameworks?

2007-06-12 Thread Steven Bethard
Evan Klitzke wrote:
> What frameworks are there available for doing pattern classification?
> I'm generally interested in the problem of mapping some sort of input
> to one or more categories. For example, I want to be able to solve
> problems like taking text and applying one or more tags to it like
> "romance", "horror", "poetry", etc. This isn't really my research
> specialty, but my understanding is that Bayesian classifiers are
> generally used for problems like this.

In fact, a wide variety of classifiers are used in text classification, 
including Bayesian approaches, support vector machines, conditional 
random fields, etc.

> Are there any other frameworks I should be aware of?

I have used (but not recently) Orange:

 http://www.ailab.si/orange

I haven't used, but have been meaning to try, PyML:

 http://pyml.sourceforge.net/

A more recent addition (whose documentation needs work) is:

 http://montepython.sourceforge.net/

And here's a Summer of Code project to build an ML library:

 http://projects.scipy.org/scipy/scipy/wiki/MachineLearning

These are all general-purpose machine learning frameworks. So they can 
be applied to pretty much any classification problem (including the text 
classification problems you're looking at). You just need to pick out a 
set of relevant features to describe your data, and feed those features 
along with your chosen labels to a machine learning algorithm.
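
To make "features plus labels" concrete, here is a toy sketch in plain Python 3. It is not taken from any of the frameworks above: the features are just lowercase words, the algorithm is a small multinomial naive Bayes with add-one smoothing, and the class name, training sentences and labels are all invented.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    @staticmethod
    def features(text):
        # Feature extraction here is just lowercase word splitting.
        return text.lower().split()

    def train(self, text, label):
        words = self.features(text)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the smoothed log likelihood of each word.
            score = math.log(n_docs / total_docs)
            for word in self.features(text):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


clf = NaiveBayes()
clf.train("a kiss under the moonlight", "romance")
clf.train("her heart longed for love", "romance")
clf.train("the ghost screamed in the dark cellar", "horror")
clf.train("blood dripped from the haunted door", "horror")
label = clf.classify("a ghost in the dark")
```

A real framework would give you better feature extraction, more algorithms and evaluation tools, but the shape of the problem is the same: turn each item into features, pair it with a label, train, then classify.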

STeVe
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Classification Frameworks?

2007-06-12 Thread Diez B. Roggisch
Evan Klitzke wrote:

> Hi all,
> 
> What frameworks are there available for doing pattern classification?
> I'm generally interested in the problem of mapping some sort of input
> to one or more categories. For example, I want to be able to solve
> problems like taking text and applying one or more tags to it like
> "romance", "horror", "poetry", etc. This isn't really my research
> specialty, but my understanding is that Bayesian classifiers are
> generally used for problems like this. I've had CRM114 recommended to
> me, but as far as I can tell there aren't any python bindings for
> this.

I've utilized the CRM114 classifier from python. It wasn't too hard to come
up with a simple wrapping that only needs the crm114 binary somewhere. The
rest was dealt with in python.

So if CRM114 fits your needs functionality-wise, you should go for it.

Diez
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern search

2007-03-29 Thread Fabian Braennstroem
Hi Paul,

Paul McGuire schrieb am 03/27/2007 07:19 PM:
> On Mar 27, 3:13 pm, Fabian Braennstroem <[EMAIL PROTECTED]> wrote:
>> Hi to all,
>>
>> Wojciech Muła schrieb am 03/27/2007 03:34 PM:
>>
>>> Fabian Braennstroem wrote:
>>>> Now, I would like to improve it by searching for different 'real'
>>>> patterns just like using 'ls' in bash. E.g. the entry
>>>> 'car*.pdf' should select all pdf files with a beginning 'car'.
>>>> Does anyone have an idea, how to do it?
>>> Use module glob.
>> Thanks for your help! glob works pretty good, except that I just
>> deleted all my latest pdf files :-(
>>
>> Greetings!
>> Fabian
> 
> Then I shudder to think what might have happened if you had used
> re's! :)

A different feature it had was to copy the whole home-partition
(about 19G) into one of its own directories ... the strange thing:
it just needed seconds to do that and I did not have the permission
to all files and directories! It was pretty strange! Hopefully it
was no security bug in python...

Greetings!
Fabian

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern search

2007-03-28 Thread Fabian Braennstroem
Hi,

Gabriel Genellina schrieb am 03/27/2007 10:09 PM:
> En Tue, 27 Mar 2007 18:42:15 -0300, Diez B. Roggisch <[EMAIL PROTECTED]>  
> escribió:
> 
>> Paul McGuire schrieb:
>>> On Mar 27, 10:18 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
>>>> Fabian Braennstroem wrote:
>>>>> while iter:
>>>>>     value = model.get_value(iter, 1)
>>>>>     if value.endswith("."+ pattern): [...]
>>>>>
>>>>> Now, I would like to improve it by searching for different 'real'
>>>>> patterns just like using 'ls' in bash. E.g. the entry
>>>>> 'car*.pdf' should select all pdf files with a beginning 'car'.
>>>>> Does anyone have an idea, how to do it?
>
>>>> Use regular expressions. They are part of the module "re". And if you
>>>> use them, ditch your code above, and make it just search for a pattern
>>>> all the time. Because the above is just the case of
>>>> *.ext
> 
>>> The glob module is a more direct tool based on the OP's example.  The
>>> example he gives works directly with glob.  To use re, you'd have to
>>> convert to something like "car.*\.pdf", yes?
> 
>> I'm aware of the glob-module. But it only works on files. I was under
>> the impression that he already has a list of files he wants to filter
>> instead of getting it fresh from the filesystem.
> 
> In that case the best way would be to use the fnmatch module - it already  
> knows how to translate from car*.pdf into the right regexp. (The glob  
> module is like a combo os.listdir+fnmatch.filter)

I already have a list, but 'glob' looked so easy ... maybe it is
faster to use fnmatch. When I have time I'll try it out...

Thanks!
Fabian

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern search

2007-03-27 Thread Gabriel Genellina
En Tue, 27 Mar 2007 18:42:15 -0300, Diez B. Roggisch <[EMAIL PROTECTED]>  
escribió:

> Paul McGuire schrieb:
>> On Mar 27, 10:18 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
>>> Fabian Braennstroem wrote:
>>>> while iter:
>>>>     value = model.get_value(iter, 1)
>>>>     if value.endswith("."+ pattern): [...]
>>>>
>>>> Now, I would like to improve it by searching for different 'real'
>>>> patterns just like using 'ls' in bash. E.g. the entry
>>>> 'car*.pdf' should select all pdf files with a beginning 'car'.
>>>> Does anyone have an idea, how to do it?

>>> Use regular expressions. They are part of the module "re". And if you  
>>> use them, ditch your code above, and make it just search for a pattern  
>>> all the time. Because the above is just the case of
>>> *.ext

>> The glob module is a more direct tool based on the OP's example.  The
>> example he gives works directly with glob.  To use re, you'd have to
>> convert to something like "car.*\.pdf", yes?

> I'm aware of the glob-module. But it only works on files. I was under
> the impression that he already has a list of files he wants to filter
> instead of getting it fresh from the filesystem.

In that case the best way would be to use the fnmatch module - it already  
knows how to translate from car*.pdf into the right regexp. (The glob  
module is like a combo os.listdir+fnmatch.filter)
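
A short sketch of both halves of that remark, with made-up file names: fnmatch.filter applied to an existing list, and fnmatch.translate showing the regular expression it uses.

```python
import fnmatch
import re

names = ["car.pdf", "car_manual.pdf", "cartoon.txt", "oscar.pdf"]

# fnmatch.filter keeps the names that fit a shell-style pattern.
selected = fnmatch.filter(names, "car*.pdf")

# fnmatch.translate returns the regular expression behind the same pattern.
regex = re.compile(fnmatch.translate("car*.pdf"))
same_selection = [name for name in names if regex.match(name)]
```

Both selections contain car.pdf and car_manual.pdf only. Note that fnmatch.filter normalises case the way the operating system does, so results for mixed-case names can differ between platforms; fnmatch.fnmatchcase is the case-sensitive variant.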

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern search

2007-03-27 Thread Diez B. Roggisch
Paul McGuire schrieb:
> On Mar 27, 10:18 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
>> Fabian Braennstroem wrote:
>>> Hi,
>>> I wrote a small gtk file manager, which works pretty well. Until
>>> now, I am able to select different file (treeview entries) just by
>>> extension (done with 'endswith'). See the little part below:
>>> self.pathlist1=[ ]
>>> self.patternlist=[ ]
>>> while iter:
>>>     #print iter
>>>     value = model.get_value(iter, 1)
>>>     #if value is what I'm looking for:
>>>     if value.endswith("."+ pattern):
>>>         selection.select_iter(iter)
>>>         selection.select_path(n)
>>>         self.pathlist1.append(n)
>>>         self.patternlist.append(value)
>>>     iter = model.iter_next(iter)
>>>     #print value
>>>     n=n+1
>>> Now, I would like to improve it by searching for different 'real'
>>> patterns just like using 'ls' in bash. E.g. the entry
>>> 'car*.pdf' should select all pdf files with a beginning 'car'.
>>> Does anyone have an idea, how to do it?
>> Use regular expressions. They are part of the module "re". And if you use
>> them, ditch your code above, and make it just search for a pattern all the
>> time. Because the above is just the case of
>>
>> *.ext
>>
>> Diez
> 
> The glob module is a more direct tool based on the OP's example.  The
> example he gives works directly with glob.  To use re, you'd have to
> convert to something like "car.*\.pdf", yes?
> 
> (Of course, re offers much more power than simple globbing.  Not clear
> how much more the OP was looking for.)

I'm aware of the glob-module. But it only works on files. I was under 
the impression that he already has a list of files he wants to filter 
instead of getting it fresh from the filesystem.

Diez
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern search

2007-03-27 Thread Paul McGuire
On Mar 27, 3:13 pm, Fabian Braennstroem <[EMAIL PROTECTED]> wrote:
> Hi to all,
>
> Wojciech Muła wrote on 03/27/2007 03:34 PM:
>
> > Fabian Braennstroem wrote:
> >> Now, I would like to improve it by searching for different 'real'
> >> patterns just like using 'ls' in bash. E.g. the entry
> >> 'car*.pdf' should select all pdf files with a beginning 'car'.
> >> Does anyone have an idea, how to do it?
>
> > Use module glob.
>
> Thanks for your help! glob works pretty well, except that I just
> deleted all my latest pdf files :-(
>
> Greetings!
> Fabian

Then I shudder to think what might have happened if you had used
re's! :)

-- Paul

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern search

2007-03-27 Thread Paul McGuire
On Mar 27, 10:18 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
> Fabian Braennstroem wrote:
> > Hi,
>
> > I wrote a small gtk file manager, which works pretty well. Until
> > now, I am able to select different file (treeview entries) just by
> > extension (done with 'endswith'). See the little part below:
>
> > self.pathlist1=[ ]
> > self.patternlist=[ ]
> > while iter:
> > #print iter
> > value = model.get_value(iter, 1)
> > #if value is what I'm looking for:
> > if value.endswith("."+ pattern):
> > selection.select_iter(iter)
> > selection.select_path(n)
> > self.pathlist1.append(n)
> > self.patternlist.append(value)
> > iter = model.iter_next(iter)
> > #print value
> > n=n+1
>
> > Now, I would like to improve it by searching for different 'real'
> > patterns just like using 'ls' in bash. E.g. the entry
> > 'car*.pdf' should select all pdf files with a beginning 'car'.
> > Does anyone have an idea, how to do it?
>
> Use regular expressions. They are part of the module "re". And if you use
> them, ditch your code above, and make it just search for a pattern all the
> time. Because the above is just the case of
>
> *.ext
>
> Diez

The glob module is a more direct tool based on the OP's example.  The
example he gives works directly with glob.  To use re, you'd have to
convert to something like "car.*\.pdf", yes?

(Of course, re offers much more power than simple globbing.  Not clear
how much more the OP was looking for.)
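
For what it's worth, the conversion does not have to be done by hand: fnmatch.translate builds the regular expression from the shell-style pattern. A quick sketch with made-up names, in current Python 3:

```python
import fnmatch
import re

# Let fnmatch build the regular expression for a shell-style pattern.
regex = fnmatch.translate("car*.pdf")
matcher = re.compile(regex)

names = ["car1.pdf", "carpet.pdf", "bus.pdf", "car.pdf.bak"]
matches = [name for name in names if matcher.match(name)]

print(regex)    # the exact text varies between Python versions
print(matches)
```

The generated expression is anchored at the end, so "car.pdf.bak" is not selected.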

-- Paul

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern search

2007-03-27 Thread Fabian Braennstroem
Hi to all,

Wojciech Muła wrote on 03/27/2007 03:34 PM:
> Fabian Braennstroem wrote:
>> Now, I would like to improve it by searching for different 'real'
>> patterns just like using 'ls' in bash. E.g. the entry
>> 'car*.pdf' should select all pdf files with a beginning 'car'.
>> Does anyone have an idea, how to do it?
> 
> Use module glob.

Thanks for your help! glob works pretty well, except that I just
deleted all my latest pdf files :-(

Greetings!
Fabian

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern search

2007-03-27 Thread Wojciech Muła
Fabian Braennstroem wrote:
> Now, I would like to improve it by searching for different 'real'
> patterns just like using 'ls' in bash. E.g. the entry
> 'car*.pdf' should select all pdf files with a beginning 'car'.
> Does anyone have an idea, how to do it?

Use module glob.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern search

2007-03-27 Thread Diez B. Roggisch
Fabian Braennstroem wrote:

> Hi,
> 
> I wrote a small gtk file manager, which works pretty well. Until
> now, I am able to select different file (treeview entries) just by
> extension (done with 'endswith'). See the little part below:
> 
> self.pathlist1=[ ]
> self.patternlist=[ ]
> while iter:
> #print iter
> value = model.get_value(iter, 1)
> #if value is what I'm looking for:
> if value.endswith("."+ pattern):
> selection.select_iter(iter)
> selection.select_path(n)
> self.pathlist1.append(n)
> self.patternlist.append(value)
> iter = model.iter_next(iter)
> #print value
> n=n+1
> 
> Now, I would like to improve it by searching for different 'real'
> patterns just like using 'ls' in bash. E.g. the entry
> 'car*.pdf' should select all pdf files with a beginning 'car'.
> Does anyone have an idea, how to do it?

Use regular expressions. They are part of the module "re". And if you use
them, ditch your code above, and make it just search for a pattern all the
time. Because the above is just the case of

*.ext



Diez
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern for foo tool <-> API <-> shell|GUI

2007-03-26 Thread Anastasios Hatzis
On Sunday 25 March 2007 16:44, Steven Bethard wrote:
> Anastasios Hatzis wrote:
> > I'm working on a tool which is totally command-line based and consisting
> > of multiple scripts. The user can execute a Python script in the shell,
> > this script does some basic verification before delegating a call into my
> > tool's package and depending on some arguments and options provided in
> > the command-line, e.g.
> > $python generate.py myproject --force --verbose
> > the tool processes whatever necessary. There are multiple command
> > handlers available in this package which are responsible for different
> > tasks and depending of the script that has been executed one or more of
> > these command handlers are fired to do their work ;)
>
> Side note: you might find argparse (http://argparse.python-hosting.com/)
>
> makes this a bit easier if you have positional arguments or sub-commands::

Steve, thank you, for the note. I didn't know argparse before. I have multiple 
scripts since optparse puts all arguments and options into one single help 
text, and the arguments and options are too specific for most commands (and 
thus the help would be absolutely overloaded and useless for new users). It 
seems that argparse has multiple help pages separated for each sub-command, 
as far as I understand the page.
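
A rough sketch of how that could look, using the argparse that later became part of the Python standard library. The program name "foo" and the "settings" sub-command with its --set option are invented for illustration; only generate, --force and --verbose come from the example above:

```python
import argparse

parser = argparse.ArgumentParser(prog="foo")
subparsers = parser.add_subparsers(dest="command")

# Each sub-command gets its own parser, and therefore its own help text.
generate = subparsers.add_parser("generate", help="generate code for a project")
generate.add_argument("name")
generate.add_argument("--force", action="store_true")
generate.add_argument("--verbose", action="store_true")

# Hypothetical second sub-command with unrelated options.
settings = subparsers.add_parser("settings", help="change project settings")
settings.add_argument("--set", metavar="KEY=VALUE")

args = parser.parse_args(["generate", "myproject", "--force"])
print(args.command, args.name, args.force, args.verbose)

# Help for one sub-command only lists that sub-command's options.
generate_help = generate.format_help()
print(generate_help)
```

With this layout, "foo generate -h" and "foo settings -h" each show only their own options, which is the separation the single optparse help text lacks.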

>
> > And I don't think that this is very trivial (at least not for my
> > programming skill level). In the given example "generate.py" (above) the
> > following scenario is pretty likely:
> >
> > (1) User works with UML tool and clicks in some dialog a "generate"
> > button (2) UML tool triggers this event an calls a magic generate()
> > method of my tool (via the API I provide for this purpose), like my
> > generate.py script would do same way
> > (3) Somewhen with-in this generate process my tool may need to get some
> > information from the user in order to continue (it is in the nature of
> > the features that I can't avoid this need of interaction in any case).
>
> So you're imagining an API something like::
>
>  def generate(name,
>               force=False,
>               verbose=False,
>               handler=command_line_handler):
>      ...
>      choice = handler.prompt_user(question_text, user_choices)
>      ...
>
> where the command-line handler might look something like::
>
>  class CommandLineHandler(object):
>      ...
>      def prompt_user(self, question_text, user_choices):
>          while True:
>              choice = raw_input(question_text)
>              if choice in user_choices:
>                  return choice
>              print 'invalid choice, choose from %s' % ', '.join(user_choices)
>
> and the GUI client would implement the equivalent thing with dialogs?

Exactly.

- Now, as I see your example, I wonder if this would work with a GUI which is 
event-driven... I have to look into my wx GUI prototype.
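
One possible shape for that, sketched without any real wx code: the handler takes a callback and returns immediately, and the GUI reports the answer later from its own event handler. Everything here (CallbackHandler, answer, the on_answer parameter, the question text) is a hypothetical name for illustration, written in current Python 3. A modal dialog, which blocks until it is dismissed, might make the simpler blocking prompt_user work unchanged.

```python
class CallbackHandler:
    """Hypothetical non-blocking handler: remembers the question and
    resumes the command when the GUI reports the user's answer."""

    def __init__(self):
        self.pending = None  # (question_text, user_choices, on_answer)

    def prompt_user(self, question_text, user_choices, on_answer):
        # A real GUI would open a dialog here and return immediately.
        self.pending = (question_text, user_choices, on_answer)

    def answer(self, choice):
        # Called from the GUI's event handler, e.g. a button click.
        question_text, user_choices, on_answer = self.pending
        if choice not in user_choices:
            raise ValueError("invalid choice: %r" % (choice,))
        self.pending = None
        on_answer(choice)


log = []


def generate(name, handler):
    # The command is split in two: everything after the question
    # lives in the continuation passed as on_answer.
    def finish(choice):
        log.append("%s: overwrite=%s" % (name, choice))

    handler.prompt_user("Replace existing directory?", ("yes", "no"), finish)


handler = CallbackHandler()
generate("myproject", handler)
handler.answer("yes")  # simulates the user clicking "yes"
print(log)
```

The cost of this style is that each command has to be cut at every point where it may need to ask something.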

> That seems basically reasonable to me, though you should be clear in the
> documentation of generate() -- and any other methods that accept handler
> objects -- exactly what methods the handler must provide.
>
> You also may find that "prompt_user" is a bit too generic -- e.g. a file
> chooser dialog looks a lot different from a color chooser dialog -- so
> you may need to split this up into "prompt_user_file",
> "prompt_user_color", etc. so that handlers don't have to introspect the
> question text to know what to do...
>
> STeVe

Hey, right, good idea. I didn't think about the different task-specific 
dialogs in most GUIs. But I see that usability will gain benefit from 
differentiated "prompt" methods.

Anastasios
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern for foo tool <-> API <-> shell|GUI

2007-03-25 Thread Steven Bethard
Anastasios Hatzis wrote:
> I'm working on a tool which is totally command-line based and consisting of 
> multiple scripts. The user can execute a Python script in the shell, this 
> script does some basic verification before delegating a call into my tool's 
> package and depending on some arguments and options provided in the 
> command-line, e.g.
> $python generate.py myproject --force --verbose
> the tool processes whatever necessary. There are multiple command handlers 
> available in this package which are responsible for different tasks and 
> depending of the script that has been executed one or more of these command 
> handlers are fired to do their work ;)

Side note: you might find argparse (http://argparse.python-hosting.com/) 
makes this a bit easier if you have positional arguments or sub-commands::

 >>> parser = argparse.ArgumentParser()
 >>> parser.add_argument('name')
 >>> parser.add_argument('--force', action='store_true')
 >>> parser.add_argument('--verbose', action='store_true')
 >>> parser.parse_args(['my_project', '--force', '--verbose'])
 Namespace(force=True, name='my_project', verbose=True)

 >>> parser = argparse.ArgumentParser()
 >>> subparsers = parser.add_subparsers()
 >>> cmd1_parser = subparsers.add_parser('cmd1')
 >>> cmd1_parser.add_argument('--foo')
 >>> cmd2_parser = subparsers.add_parser('cmd2')
 >>> cmd2_parser.add_argument('bar')
 >>> parser.parse_args(['cmd1', '--foo', 'X'])
 Namespace(foo='X')
 >>> parser.parse_args(['cmd2', 'Y'])
 Namespace(bar='Y')

> And I don't think that this is very trivial (at least not for my programming 
> skill level). In the given example "generate.py" (above) the following 
> scenario is pretty likely:
> 
> (1) User works with UML tool and clicks in some dialog a "generate" button
> (2) UML tool triggers this event and calls a magic generate() method of my
> tool (via the API I provide for this purpose), like my generate.py script
> would do in the same way
> (3) At some point within this generate process my tool may need to get some
> information from the user in order to continue (it is in the nature of the
> features that I can't avoid this need of interaction in any case).

So you're imagining an API something like::

 def generate(name,
              force=False,
              verbose=False,
              handler=command_line_handler):
     ...
     choice = handler.prompt_user(question_text, user_choices)
     ...

where the command-line handler might look something like::

 class CommandLineHandler(object):
     ...
     def prompt_user(self, question_text, user_choices):
         while True:
             choice = raw_input(question_text)
             if choice in user_choices:
                 return choice
             print 'invalid choice, choose from %s' % ', '.join(user_choices)

and the GUI client would implement the equivalent thing with dialogs? 
That seems basically reasonable to me, though you should be clear in the 
documentation of generate() -- and any other methods that accept handler 
objects -- exactly what methods the handler must provide.

You also may find that "prompt_user" is a bit too generic -- e.g. a file 
chooser dialog looks a lot different from a color chooser dialog -- so 
you may need to split this up into "prompt_user_file", 
"prompt_user_color", etc. so that handlers don't have to introspect the 
question text to know what to do...
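
One way to make the required methods explicit, as a sketch rather than a recommendation: an abstract base class in current Python 3 (the snippets above are Python 2). The names Handler, prompt_user_choice, ScriptedHandler and the sample answers are invented for illustration; only prompt_user_file echoes the suggestion above.

```python
from abc import ABC, abstractmethod


class Handler(ABC):
    """Hypothetical interface every client handler must implement."""

    @abstractmethod
    def prompt_user_choice(self, question_text, user_choices):
        """Return one element of user_choices."""

    @abstractmethod
    def prompt_user_file(self, question_text):
        """Return a file path chosen by the user."""


class ScriptedHandler(Handler):
    """Test double that replays canned answers instead of asking anyone."""

    def __init__(self, answers):
        self.answers = list(answers)

    def prompt_user_choice(self, question_text, user_choices):
        choice = self.answers.pop(0)
        if choice not in user_choices:
            raise ValueError("invalid choice: %r" % (choice,))
        return choice

    def prompt_user_file(self, question_text):
        return self.answers.pop(0)


handler = ScriptedHandler(["yes", "out/model.xml"])
first = handler.prompt_user_choice("Overwrite?", ("yes", "no"))
second = handler.prompt_user_file("Save as:")
print(first, second)


# A handler that forgets a method cannot even be instantiated.
class Incomplete(Handler):
    def prompt_user_choice(self, question_text, user_choices):
        return user_choices[0]


rejected = False
try:
    Incomplete()
except TypeError as exc:
    rejected = True
    print("rejected:", exc)
```

A scripted handler like this would also let generate() run in automated tests without a person at the keyboard.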

STeVe
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern for foo tool <-> API <-> shell|GUI

2007-03-24 Thread Anastasios Hatzis
On Saturday 24 March 2007 18:55, [EMAIL PROTECTED] wrote:
> On Mar 24, 10:31 am, Anastasios Hatzis <[EMAIL PROTECTED]> wrote:
> > I'm looking for a pattern where different client implementations can use
> > the same commands of some fictive tool ("foo") by accessing some kind of
> > API. Actually I have the need for such pattern for my own tool
> > (http://openswarm.sourceforge.net). I already started restructuring my
> > code to separate the actual command implementations from the command-line
> > scripts (which is optparser-based now) and have some ideas how to
> > proceed. But probably there is already a good pattern used for
> > Python-based tools.
> >
> > In the case that some of you are interested into this topic and my recent
> > thoughts, you may want to have a look at the description below. Any
> > comments are very much appreciated. Hopefully this list is a good place
> > for discussing a pattern, otherwise I would be okay to move this to
> > another place. Thank you.
> >
> > Here we go:
> > The tool package itself provides several commands, although not important
> > for the pattern itself, here some examples: modifying user-specific
> > preferences, creating and changing project settings files,
> > project-related
> > code-generation, or combinations of such commands ... later also commands
> > for transformation between several XML formats etc. The classes which
> > implement these commands are currently in multiple modules, each having a
> > class named CmdHandler.
> >
> > I have some Python scripts (each having a ScriptHandler classes), for use
> > via command-line. Each ScriptHandler class is responsible to add all
> > related command-line options and process those provided by the user
> > (based on optparse library from Python standard lib). The script then
> > calls the corresponding command and provide the verified options as
> > parameters.
> >
> > Due to the nature of the tool under specific conditions the following
> > results may come during command execution:
> > * successful execution, no interaction
> > * critical error, execution cancelled
> > * user interaction needed (e.g. prompt user to approve replace existing
> > directory (yes/no), prompt user to provide an alternative option)
> >
> > Command-line interactions work simply with raw_input().
> >
> > So far this works. Nevertheless, there are some other aspects that could
> > be improved, but this is another topic: The tool uses custom exceptions
> > (e.g. for critical errors) and logging features (based on logging from
> > Python standard lib). Currently no automated tests, but I have to add.
> >
> > For the next step I plan to support not only my own command-line scripts,
> > but also a GUI to access the commands, as well as 3rd-party products
> > (themselves command-line scripts or GUIs, such as foo plugins for any
> > 3rd-party-tools). As far as I see, these clients need to implement a
> > handler that:
> > (1) Collecting all required parameters and optional parameters from a
> > user (2) Provide these parameters for a particular call to command API
> > (3) Provides some kind of hooks that are called back from the API on
> > specific events, e.g. Question with user-choice; Information with
> > user-input (4) Provide a logging handler object from the tool logging
> > class or a sub-class of that in the case that a client-specific logging
> > object should be triggered on each debug, message, warning etc.
> >
> > (1) is very client-specific, e.g. in a GUI via dialogs.
> >
> > (2) Each command provides a signature for all required/optional
> > parameters. They are all verified from the command itself, although a
> > client could do some verification at the first place.
> >
> > (3) Example use-case: a command needs to know if the user wants the
> > command to proceed with a particular action, e.g. "Do you want to delete
> > bar.txt?" with "Yes", "No" and "Cancel" choice. So the client's handler
> > object (which is provided as first parameter to each command) implements
> > client-specific features to show the user this question (e.g. pop-up
> > dialog with question and three buttons), receive the user input (clicking
> > one of the buttons) and pass this choice back to the foo API.
> > Alternatively some kind of text information could be required, as in
> > raw_input(), so actually this probably would be two different interaction
> > features to be implemented.
> >
> > (4) The foo API also provides a logging class. The client needs to
> > initialize such an object and provide it as member of the handler object
> > provided to the API. I wonder if some clients may have own logging
> > features and want to include all log messages from foo tool to the own
> > logs. In this case a client could its own sub-class of the foo logging
> > class and extending it with callbacks to its (client-)native logging
> > object.
> >
> > What do you think about this?
> >
> > Best regards,
> > Anastasios
>
> I think if you want to use a GUI, wxpy

Re: Pattern for foo tool <-> API <-> shell|GUI

2007-03-24 Thread kyosohma
On Mar 24, 10:31 am, Anastasios Hatzis <[EMAIL PROTECTED]> wrote:
> I'm looking for a pattern where different client implementations can use the
> same commands of some fictive tool ("foo") by accessing some kind of API.
> Actually I have the need for such pattern for my own tool
> (http://openswarm.sourceforge.net). I already started restructuring my code
> to separate the actual command implementations from the command-line scripts
> (which is optparser-based now) and have some ideas how to proceed. But
> probably there is already a good pattern used for Python-based tools.
>
> In the case that some of you are interested into this topic and my recent
> thoughts, you may want to have a look at the description below. Any comments
> are very much appreciated. Hopefully this list is a good place for discussing
> a pattern, otherwise I would be okay to move this to another place. Thank
> you.
>
> Here we go:
> The tool package itself provides several commands, although not important for
> the pattern itself, here some examples: modifying user-specific preferences,
> creating and changing project settings files, project-related
> code-generation, or combinations of such commands ... later also commands for
> transformation between several XML formats etc. The classes which implement
> these commands are currently in multiple modules, each having a class named
> CmdHandler.
>
> I have some Python scripts (each having a ScriptHandler classes), for use via
> command-line. Each ScriptHandler class is responsible to add all related
> command-line options and process those provided by the user (based on
> optparse library from Python standard lib). The script then calls the
> corresponding command and provide the verified options as parameters.
>
> Due to the nature of the tool under specific conditions the following results
> may come during command execution:
> * successful execution, no interaction
> * critical error, execution cancelled
> * user interaction needed (e.g. prompt user to approve replace existing
> directory (yes/no), prompt user to provide an alternative option)
>
> Command-line interactions work simply with raw_input().
>
> So far this works. Nevertheless, there are some other aspects that could be
> improved, but this is another topic: The tool uses custom exceptions (e.g.
> for critical errors) and logging features (based on logging from Python
> standard lib). Currently no automated tests, but I have to add.
>
> For the next step I plan to support not only my own command-line scripts, but
> also a GUI to access the commands, as well as 3rd-party products (themselves
> command-line scripts or GUIs, such as foo plugins for any 3rd-party-tools). As
> far as I see, these clients need to implement a handler that:
> (1) Collecting all required parameters and optional parameters from a user
> (2) Provide these parameters for a particular call to command API
> (3) Provides some kind of hooks that are called back from the API on specific
> events, e.g. Question with user-choice; Information with user-input
> (4) Provide a logging handler object from the tool logging
> class or a sub-class of that in the case that a client-specific logging object
> should be triggered on each debug, message, warning etc.
>
> (1) is very client-specific, e.g. in a GUI via dialogs.
>
> (2) Each command provides a signature for all required/optional parameters.
> They are all verified from the command itself, although a client could do
> some verification at the first place.
>
> (3) Example use-case: a command needs to know if the user wants the command to
> proceed with a particular action, e.g. "Do you want to delete bar.txt?"
> with "Yes", "No" and "Cancel" choice. So the client's handler object (which
> is provided as first parameter to each command) implements client-specific
> features to show the user this question (e.g. pop-up dialog with question and
> three buttons), receive the user input (clicking one of the buttons) and pass
> this choice back to the foo API. Alternatively some kind of text information
> could be required, as in raw_input(), so actually this probably would be two
> different interaction features to be implemented.
>
> (4) The foo API also provides a logging class. The client needs to initialize
> such an object and provide it as member of the handler object provided to the
> API. I wonder if some clients may have own logging features and want to
> include all log messages from foo tool to the own logs. In this case a client
> could its own sub-class of the foo logging class and extending it with
> callbacks to its (client-)native logging object.
>
> What do you think about this?
>
> Best regards,
> Anastasios

I think if you want to use a GUI, wxPython or Tkinter would work well
for you. wxPython has more widgets from the start, but is also more
complex. Tkinter is good for quick and dirty GUIs, but in general it
gets harder to work with as the GUI grows more complex.

Re: pattern matching

2007-03-01 Thread Diez B. Roggisch
azrael wrote:

> can someone give me good links for pattern matching in images using
> python

There is a python-binding available for the OpenCV library, a collection of
state-of-the-art CV algorithms.

And it comes with a free manual

Diez
-- 
http://mail.python.org/mailman/listinfo/python-list

