Re: Pattern matching from a text document

2005-03-24 Thread Larry Bates
Ben,

Others have answered your specific questions, but I thought
I'd use this opportunity to make a general statement.  Unlike
some other programming languages, Python doesn't make the names
of its built-ins keywords.  You should never, ever, ever name a
variable 'list' (the same is true of dict, tuple, str, ...).
When you do, you mask the built-in with your own variable for
the rest of that scope.  If this hasn't bitten you before, it
will at some point.
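
A minimal sketch of how this bites, written for Python 3. The names
shadowing_demo, message, lines and letters are made up for illustration.

```python
def shadowing_demo():
    # Inside this function the local name `list` hides the built-in type.
    list = "Thierry Henry\nAshley Cole".splitlines()
    try:
        list("abc")            # calls our list *object*, not the built-in
    except TypeError as exc:
        return str(exc)
    return None


message = shadowing_demo()     # "'list' object is not callable"

# A descriptive name avoids the problem and leaves the built-in usable.
lines = "Thierry Henry\nAshley Cole".splitlines()
letters = list("abc")          # the built-in still works here
```

The failure usually shows up far from the assignment that caused it,
which is what makes it hard to spot.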

It really doesn't sound like you need the complexity of regular
expressions just to read in some data.  You might want to
investigate the csv module (for reading comma-delimited files),
or you might just be able to use the simple .split() method (for
tab-delimited files).

Hope this info helps.

Regards,
Larry Bates


Ben wrote:
> I'm currently trying to develop a demonstrator in Python for an
> ontology of a football team. At present all the fit players are
> exported to a text document.
>
> The program reads the document in and splits it into one string per
> line (since each fit player and their attributes are entered line by
> line in the text document) using list = target.splitlines()
>
> The program then performs a loop like so:
>
> while foo > 0:
>     if len(list) == 0:
>         break
>     else:
>         pat = r"([a-z]+)(\s+)([a-z]+)(\s+)([a-z]+)(\s+)(\d{1})(\d{1})(\d{1})(\d{1})(\d{1})([a-z]+)"
>         ph = re.compile(pat, re.IGNORECASE)
>
>         match = ph.match(list[1])
>
>         forename = match.group(1)
>         surname = match.group(3)
>         attacking = match.group(7)
>         defending = match.group(8)
>         fitness = match.group(9)
>
>         print forename
>         print len(list)
>         del list[0]
>
> The two main problems I'm having are these. First, the first entry in
> the list is not printing. Second, once I have overcome that, I need
> each player and their related variables to be stored separately. This
> is not happening at present because each time the loop runs it
> overwrites the value in each variable.
>
> Any help would be greatly appreciated.
>
> Ben.
 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern matching from a text document

2005-03-24 Thread Ben

George Sakkis wrote:
> Ben [EMAIL PROTECTED] wrote in message
> news:[EMAIL PROTECTED]
> > I'm currently trying to develop a demonstrator in Python for an
> > ontology of a football team. At present all the fit players are
> > exported to a text document.
> >
> > The program reads the document in and splits it into one string per
> > line (since each fit player and their attributes are entered line by
> > line in the text document) using list = target.splitlines()
> >
> > [snipped]
> >
> > The program then performs a loop like so:
> >
> > The two main problems I'm having are these. First, the first entry in
> > the list is not printing. Second, once I have overcome that, I need
> > each player and their related variables to be stored separately. This
> > is not happening at present because each time the loop runs it
> > overwrites the value in each variable.
> >
> > Any help would be greatly appreciated.
> >
> > Ben.
>
> Ben, can you post a sample line from the document and indicate the
> fields you want to extract? I'm sure it will be easier to help you
> this way.
>
> George
>
> ~
> If a slave say to his master: You are not my master, if they convict
> him his master shall cut off his ear.
>
> Hammurabi's Code of Laws
> ~

Below are a few sample lines. Each has the name, followed by the class
(not important), followed by 5 digits, each of which can range from 0
to 9 and details a different ability, such as fitness, attacking
ability, etc. Finally, the preferred foot is stated.

Freddie Ljungberg   Player  02808right
Dennis Bergkamp Player  90705either
Thierry Henry   Player  90906either
Ashley Cole Player  17705left


Thanks for your help

ben



Re: Pattern matching from a text document

2005-03-24 Thread F. Petitjean
On 24 Mar 2005 06:16:12 -0800, Ben wrote:
>
> Below are a few sample lines. Each has the name, followed by the class
> (not important), followed by 5 digits, each of which can range from 0
> to 9 and details a different ability, such as fitness, attacking
> ability, etc. Finally, the preferred foot is stated.
>
> Freddie Ljungberg Player  02808right
> Dennis Bergkamp   Player  90705either
> Thierry Henry     Player  90906either
> Ashley Cole       Player  17705left

filename = 'players'  # to adapt
players = {}  # mapping of name to abilities
fin = open(filename)
for line in fin:
    firstname, lastname, type_, ability = line.split()
    players[(lastname, firstname)] = Ability(ability)
fin.close()

where Ability can be a simple function that processes and returns the
information in the last word (string) of each line, or a class that
stores and manages that information:

class Ability(object):
    def __init__(self, ability):
        digits = ability[:5]
        self.details = [int(d) for d in digits]  # list of details
        self.preferred_foot = ability[5:]
    #  and so on
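
The "simple function" variant mentioned above might look like the
sketch below. The name parse_ability and its error handling are
illustrative assumptions, not part of the original post; it is written
for Python 3 and uses two of the sample lines from this thread.

```python
def parse_ability(ability):
    """Split a string such as '90906either' into five ints and a foot."""
    digits, foot = ability[:5], ability[5:]
    if len(digits) != 5 or not digits.isdigit():
        raise ValueError("expected five leading digits: %r" % ability)
    return [int(d) for d in digits], foot


players = {}  # mapping of (lastname, firstname) to parsed abilities
for line in ["Thierry Henry   Player  90906either",
             "Ashley Cole Player  17705left"]:
    # Assumes exactly four whitespace-separated words per line.
    firstname, lastname, type_, ability = line.split()
    players[(lastname, firstname)] = parse_ability(ability)
```

Note that unpacking line.split() into four names raises ValueError for
any player whose name is not exactly two words, so real data may need a
more forgiving split.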
 
 
> Thanks for your help
>
> ben
 


Pattern matching from a text document

2005-03-23 Thread Ben
I'm currently trying to develop a demonstrator in Python for an
ontology of a football team. At present all the fit players are
exported to a text document.

The program reads the document in and splits it into one string per
line (since each fit player and their attributes are entered line by
line in the text document) using list = target.splitlines()

The program then performs a loop like so:

while foo > 0:
    if len(list) == 0:
        break
    else:
        pat = r"([a-z]+)(\s+)([a-z]+)(\s+)([a-z]+)(\s+)(\d{1})(\d{1})(\d{1})(\d{1})(\d{1})([a-z]+)"
        ph = re.compile(pat, re.IGNORECASE)

        match = ph.match(list[1])

        forename = match.group(1)
        surname = match.group(3)
        attacking = match.group(7)
        defending = match.group(8)
        fitness = match.group(9)

        print forename
        print len(list)
        del list[0]

The two main problems I'm having are these. First, the first entry in
the list is not printing. Second, once I have overcome that, I need
each player and their related variables to be stored separately. This
is not happening at present because each time the loop runs it
overwrites the value in each variable.

Any help would be greatly appreciated.

Ben.



Re: Pattern matching from a text document

2005-03-23 Thread George Sakkis
Ben [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
> I'm currently trying to develop a demonstrator in Python for an
> ontology of a football team. At present all the fit players are
> exported to a text document.
>
> The program reads the document in and splits it into one string per
> line (since each fit player and their attributes are entered line by
> line in the text document) using list = target.splitlines()
>
> [snipped]
>
> The program then performs a loop like so:
>
> The two main problems I'm having are these. First, the first entry in
> the list is not printing. Second, once I have overcome that, I need
> each player and their related variables to be stored separately. This
> is not happening at present because each time the loop runs it
> overwrites the value in each variable.
>
> Any help would be greatly appreciated.
>
> Ben.


Ben, can you post a sample line from the document and indicate the
fields you want to extract? I'm sure it will be easier to help you
this way.

George


~
If a slave say to his master: You are not my master, if they convict
him his master shall cut off his ear.

Hammurabi's Code of Laws
~




Re: Pattern matching from a text document

2005-03-23 Thread infidel
First, if you're going to loop over each line, do it like this:

for line in open('playerlist.txt'):
    # do stuff here

Second, this statement is referencing the *second* item in the list,
not the first:

match = ph.match(list[1])

Third, a simple splitting of the lines by some delimiter character
would be easier than regular expressions, but whatever floats your
boat.  If you insist on using regexen, then you should compile the
pattern before the loop.  No need to do it over and over again.

Fourth, if you want to create a list of players in memory, then you
need either a class or some other structure to represent each player,
and then you need to add them to some kind of list as you go.  Like
this:

import re

pat = r'([a-z]+)(\s+)([a-z]+)(\s+)([a-z]+)(\s+)(\d{1})(\d{1})(\d{1})(\d{1})(\d{1})([a-z]+)'

# compile once, before the loop
ph = re.compile(pat, re.IGNORECASE)
players = []
for line in open('playerlist.txt'):
    match = ph.match(line)
    player = {
        'forename': match.group(1),
        'surname': match.group(3),
        'attacking': match.group(7),
        'defending': match.group(8),
        'fitness': match.group(9),
    }
    players.append(player)
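
For anyone reading this later, here is a hedged Python 3 sketch of the
same idea: the pattern is compiled once, named groups replace the
numeric indices, and lines that don't match are skipped instead of
raising AttributeError. The names PLAYER, parse_players, kind and foot
are made up for illustration, and the mapping of the first three digits
to attacking, defending and fitness is taken from the group numbers in
Ben's code. The meaning of the last two digits isn't stated in the
thread, so they are matched but left unnamed.

```python
import re

# Compiled once, outside the loop.
PLAYER = re.compile(
    r"(?P<forename>[a-z]+)\s+(?P<surname>[a-z]+)\s+(?P<kind>[a-z]+)\s+"
    r"(?P<attacking>\d)(?P<defending>\d)(?P<fitness>\d)\d\d(?P<foot>[a-z]+)",
    re.IGNORECASE,
)


def parse_players(lines):
    """Return one dict of string fields per line that matches PLAYER."""
    players = []
    for line in lines:
        match = PLAYER.match(line)
        if match is None:      # skip blank or malformed lines
            continue
        players.append(match.groupdict())
    return players


# Sample lines from this thread, plus a blank line to show the skip.
sample = [
    "Freddie Ljungberg   Player  02808right",
    "",
    "Ashley Cole Player  17705left",
]
players = parse_players(sample)
```

With a real file, parse_players(open('playerlist.txt')) should work the
same way, since iterating a file object yields its lines.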
