how to extract columns like awk $1 $5

2005-01-07 Thread Anand S Bisen
Hi
Is there a simple way to extract words separated by a space in Python
the way I do it in awk '{print $4 $5}'? I am sure there should be some
way, but I don't know it.

Thanks
n00b
--
http://mail.python.org/mailman/listinfo/python-list


Re: how to extract columns like awk $1 $5

2005-01-07 Thread Craig Ringer
On Sat, 2005-01-08 at 01:15, Anand S Bisen wrote:
> Hi
>
> Is there a simple way to extract words separated by a space in Python
> the way I do it in awk '{print $4 $5}'? I am sure there should be some
> way, but I don't know it.

The 'str.split' method is probably what you want:

>>> x = "The confused frog mumbled something about foxes"
>>> x.split()
['The', 'confused', 'frog', 'mumbled', 'something', 'about', 'foxes']
>>> x.split()[4:6]
['something', 'about']

so if 'x' is your string, the rough equivalent of that awk statement is:

>>> x_words = x.split()
>>> print x_words[4], x_words[5]

or perhaps

>>> print "%s %s" % tuple(x.split()[4:6])

--
Craig Ringer



Re: how to extract columns like awk $1 $5

2005-01-07 Thread beliavsky
It takes a few more lines in Python, but you can do something like

for text in open("file.txt", "r"):
    words = text.split()
    print words[4], words[5]
(assuming that awk starts counting from zero -- I forget).



Re: how to extract columns like awk $1 $5

2005-01-07 Thread Jeremy Sanders
On Fri, 07 Jan 2005 12:15:48 -0500, Anand S Bisen wrote:

> Is there a simple way to extract words separated by a space in Python
> the way I do it in awk '{print $4 $5}'? I am sure there should be some
> way, but I don't know it.

mystr = '1 2 3 4 5 6'
parts = mystr.split()
print parts[3:5]

Jeremy



Re: how to extract columns like awk $1 $5

2005-01-07 Thread Roy Smith
In article [EMAIL PROTECTED],
Anand S Bisen  [EMAIL PROTECTED] wrote:
> Hi
>
> Is there a simple way to extract words separated by a space in Python
> the way I do it in awk '{print $4 $5}'? I am sure there should be some
> way, but I don't know it.

Something along the lines of:

words = input.split()
print words[4], words[5]


Re: how to extract columns like awk $1 $5

2005-01-07 Thread Paul Rubin
[EMAIL PROTECTED] (Roy Smith) writes:
> Something along the lines of:
>
> words = input.split()
> print words[4], words[5]

That throws an exception if there are fewer than 6 fields, which might
or might not be what you want.
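If the exception is not wanted, one option is a small guarded lookup. This is only a sketch, written in Python 3 syntax (the thread itself uses Python 2 print statements); the helper name `get` and its `default` parameter are made up for illustration and do not come from any poster's code.

```python
def get(words, i, default=""):
    """Return words[i], or default when the list is too short."""
    return words[i] if 0 <= i < len(words) else default


short = "only three fields".split()

# Plain indexing raises on a line with fewer than 6 fields.
try:
    print(short[4], short[5])
except IndexError as exc:
    print("IndexError:", exc)

# The guarded lookup returns the default instead.
print(repr(get(short, 4)), repr(get(short, 5)))

# Slicing never raises; it simply returns fewer items.
print(short[4:6])
```

Whether a silent default or a loud exception is better depends on whether short lines are expected in the input.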


Re: how to extract columns like awk $1 $5

2005-01-07 Thread Dan Valentine
On Fri, 07 Jan 2005 12:15:48 -0500, Anand S Bisen wrote:

> Is there a simple way to extract words separated by a space in Python
> the way I do it in awk '{print $4 $5}'? I am sure there should be some
> way, but I don't know it.

i guess it depends on how faithfully you want to reproduce awk's behavior
and options.

as several people have mentioned, strings have the split() method for 
simple tokenization, but blindly indexing into the resulting sequence 
can give you an out-of-range exception.  out of range indexes are no
problem for awk; it would just return an empty string without complaint.

note that the index bases are slightly different: python sequences
start with index 0, while awk's fields begin with $1.  there IS a $0,
but it means the entire unsplit line.

the split() method accepts a separator argument, which can be used to
replicate awk's -F option / FS variable.
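The two points above (index bases and the separator argument) can be sketched as follows. This uses Python 3 syntax, and the sample line is invented for illustration.

```python
line = "root:x:0:0:root:/root:/bin/bash"

# awk -F: '{print $1, $7}' numbers fields from 1; Python lists start at 0,
# so $1 is fields[0] and $7 is fields[6].
fields = line.split(":")
print(fields[0], fields[6])

# With no argument, split() breaks on runs of whitespace and ignores
# leading and trailing blanks, which is close to awk's default FS.
print("  a   b  c ".split())

# An explicit " " splits on every single space, so empty strings appear.
print("  a   b  c ".split(" "))
```

The difference between `split()` and `split(" ")` matters for input with repeated or leading spaces; for single-space-separated text both give the same list.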

so, if you want to closely approximate awk's behavior without fear of
exceptions, you could try a small function like this:


def awk_it(instring,index,delimiter=" "):
  try:
    return [instring,instring.split(delimiter)[index-1]][max(0,min(1,index))]
  except:
    return ""


>>> print awk_it("a b c d e",0)
a b c d e

>>> print awk_it("a b c d e",1)
a

>>> print awk_it("a b c d e",5)
e

>>> print awk_it("a b c d e",6)


- dan


Re: how to extract columns like awk $1 $5

2005-01-07 Thread Roy Smith
Dan Valentine [EMAIL PROTECTED] wrote:

> On Fri, 07 Jan 2005 12:15:48 -0500, Anand S Bisen wrote:
>
>> Is there a simple way to extract words separated by a space in Python
>> the way I do it in awk '{print $4 $5}'? I am sure there should be some
>> way, but I don't know it.
>
> i guess it depends on how faithfully you want to reproduce awk's behavior
> and options.
>
> as several people have mentioned, strings have the split() method for
> simple tokenization, but blindly indexing into the resulting sequence
> can give you an out-of-range exception.  out of range indexes are no
> problem for awk; it would just return an empty string without complaint.

It's pretty easy to create a list type which has awk-ish behavior:

class awkList (list):
    def __getitem__ (self, key):
        try:
            return list.__getitem__ (self, key)
        except IndexError:
            return ""

l = awkList ("foo bar baz".split())
print "l[0] = ", repr (l[0])
print "l[5] = ", repr (l[5])

---

Roy-Smiths-Computer:play$ ./awk.py
l[0] =  'foo'
l[5] =  ''

Hmmm.  There's something going on here I don't understand.  The ref
manual (3.3.5 "Emulating container types") says for __getitem__(), "Note:
for loops expect that an IndexError will be raised for illegal indexes
to allow proper detection of the end of the sequence."  I expected my
little demo class to therefore break for loops, but they seem to work
fine:

>>> import awk
>>> l = awk.awkList ("foo bar baz".split())
>>> l
['foo', 'bar', 'baz']
>>> for i in l:
...     print i
...
foo
bar
baz
>>> l[5]
''

Given that I've caught the IndexError, I'm not sure how that's working.


Re: how to extract columns like awk $1 $5

2005-01-07 Thread Carl Banks
Roy Smith wrote:
> Hmmm.  There's something going on here I don't understand.  The ref
> manual (3.3.5 "Emulating container types") says for __getitem__(),
> "Note: for loops expect that an IndexError will be raised for illegal
> indexes to allow proper detection of the end of the sequence."  I
> expected my little demo class to therefore break for loops, but they
> seem to work fine:
>
> >>> import awk
> >>> l = awk.awkList ("foo bar baz".split())
> >>> l
> ['foo', 'bar', 'baz']
> >>> for i in l:
> ...     print i
> ...
> foo
> bar
> baz
> >>> l[5]
> ''
>
> Given that I've caught the IndexError, I'm not sure how that's
> working.


The title of that particular section is "Emulating container types",
which is not what you're doing, so it doesn't apply here.  For built-in
types, iterators are at work.  The list iterator probably doesn't even
call getitem, but accesses the items directly from the C structure.
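One way to check this is to count calls. The sketch below is in Python 3 syntax and reflects CPython's behavior; the class names `CountingList` and `Seq` are invented for the illustration. It contrasts a list subclass, whose inherited iterator bypasses the overridden __getitem__, with a class that has no __iter__ and so relies on the older __getitem__/IndexError protocol the manual's note describes.

```python
class CountingList(list):
    """list subclass that records every call to the overridden __getitem__."""
    calls = 0

    def __getitem__(self, key):
        CountingList.calls += 1
        return list.__getitem__(self, key)


items = CountingList("foo bar baz".split())
for item in items:
    pass
print(CountingList.calls)  # unchanged by the loop: list's iterator skips the override

items[0]
print(CountingList.calls)  # explicit indexing does go through the override


class Seq:
    """No __iter__, so iteration falls back to __getitem__(0), (1), ..."""

    def __init__(self, data):
        self.data = data

    def __getitem__(self, i):
        return self.data[i]  # the IndexError raised here is what ends the loop


print(list(Seq(["foo", "bar", "baz"])))
```

If `Seq.__getitem__` swallowed the IndexError and returned an empty string instead, iterating over a `Seq` would never terminate, which is the case the manual's note warns about.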
-- 
CARL BANKS
