Re: parsing

2006-05-27 Thread Keith McGavin
On Sat, May 27, Hadley Rich wrote:
 I must play with awk some more.
The explanation of some syntax in the awk/gawk manual is not
always obvious, but the awk/gawk gurus on news://comp.lang.awk
will clarify how it is used if Google fails. It is an active newsgroup.


---
keith


Re: parsing

2006-05-27 Thread Keith McGavin
On Sat, May 27, Volker Kuhlmann wrote:
 I don't think that awk allows a regex for field separation. 
A Unix reference manual mentions that (original) awk uses Space 
or Tab as the default field separator and a single character if
the option -Fc is used. 
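
A small sketch of the two cases described above, for anyone who wants to see the difference. The sample record and the variable names are made up for illustration.

```shell
rec='one two:three'    # made-up record: one blank, one colon

# Default separator: fields are split on runs of blanks and tabs.
default_f2=$(printf '%s\n' "$rec" | awk '{print $2}')

# -F: makes the single character ':' the separator instead.
colon_f2=$(printf '%s\n' "$rec" | awk -F: '{print $2}')

echo "$default_f2"     # two:three
echo "$colon_f2"       # three
```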


--
keith.



Re: parsing

2006-05-26 Thread Keith McGavin
On Fri, May 26, 2006 at 11:29:54AM, Nick Rout wrote:
 what is the best way to parse out the first load average figure, ie in
 this case 18.73

'awk' accepts an array of field separators -F'[fsfsfs]' between square 
brackets so the required field $12 can be printed out. Not mentioned in the 
manual.

echo $THAT_LINE | awk -F'[,:]' '{print $12}'
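
Counting to field 12 works for this exact line, but uptime leaves out the "N days," part when the machine has been up for less than a day, which shifts every later field. A possible variation, sketched below, looks for the field that ends in "load average" and prints the one after it. The function name first_load and both sample strings are assumptions for illustration (the second one is invented).

```shell
# Print the field that follows "load average", wherever it lands.
first_load() {
    awk -F'[,:]' '{
        for (i = 1; i < NF; i++)
            if ($i ~ /load average$/) {
                v = $(i + 1)
                sub(/^ +/, "", v)   # drop the blank after the colon
                print v
                exit
            }
    }'
}

long='18:25:15  up 2 days, 23:44,  1 user,  load average: 18.73, 20.96, 27.06'
short='18:25:15  up 23:44,  1 user,  load average: 0.10, 0.20, 0.30'

echo "$long"  | first_load    # 18.73
echo "$short" | first_load    # 0.10
```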




Re: parsing

2006-05-26 Thread Hadley Rich
On Saturday 27 May 2006 16:07, Keith McGavin wrote:
 'awk' accepts an array of field separators -F'[fsfsfs]' between square
 brackets so the required field $12 can be printed out. Not mentioned in the
 manual.

 echo $THAT_LINE | awk -F'[,:]' '{print $12}'

Nice, I would say that's a winner. I must play with awk some more.

-- 
There is always something new out of Africa.
-- Gaius Plinius Secundus


Re: parsing

2006-05-26 Thread Volker Kuhlmann
 'awk' accepts an array of field separators -F'[fsfsfs]' between square 
 brackets so the required field $12 can be printed out. Not mentioned in the 
 manual.

In the interest of portability awareness, let me be anally correct. I
don't think that awk allows a regex for field separation. However, gawk
does, and it is in the manual. Of course on Linux systems, awk is
typically a symlink to gawk. Good programming would be that when you
require features of gawk which are not part of awk, to call the program
by the name of gawk, and it will produce an obvious failure instead of
rubbish results.
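
One way to get that obvious failure even earlier is to check for the tool at the top of the script. This is only a sketch; require_tool is a hypothetical helper name, not something from the thread.

```shell
# require_tool NAME: succeed if NAME is on PATH, otherwise complain on stderr.
require_tool() {
    command -v "$1" >/dev/null 2>&1 && return 0
    echo "error: this script needs '$1', which was not found" >&2
    return 1
}

# Typical use at the top of a script that relies on gawk:
#   require_tool gawk || exit 1
```

After that check, the rest of the script can call gawk by name and never falls back silently to a different awk.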

Btw see my recent cable traffic script for an example of shell
programming with gawk components, it fits in with the script talk we had
last year too.

Volker

-- 
Volker Kuhlmann is list0570 with the domain in header
http://volker.dnsalias.net/ Please do not CC list postings to me.


Re: parsing

2006-05-25 Thread Carl Cerecke

Using which language?

Also, does anyone else find it odd that the line has one <table> and 2
</table>s?
Is there another table further up?

Cheers,
Carl.

On 26/05/06, Nick Rout [EMAIL PROTECTED] wrote:

given this string (all one line):

Tracker Load: (9 %)<table class=main border=0 width=400><tr><td style='padding: 0px; background-image: url(pic/loadbarbg.gif);
background-repeat: repeat-x'><img height=15 width=36 src=/pic/loadbargreen.gif alt='9%'></td></tr></table>18:25:15  
up 2 days, 23:44,  1 user,  load average: 18.73, 20.96, 27.06<br></td></tr></table>

what is the best way to parse out the first load average figure, ie in
this case 18.73
--
Nick Rout [EMAIL PROTECTED]




Re: parsing

2006-05-25 Thread Hadley Rich
On Friday 26 May 2006 11:29, Nick Rout wrote:
 given this string (all one line):

 Tracker Load: (9 %)<table class=main border=0 width=400><tr><td
 style='padding: 0px; background-image: url(pic/loadbarbg.gif);
 background-repeat: repeat-x'><img height=15 width=36
 src=/pic/loadbargreen.gif alt='9%'></td></tr></table>18:25:15  up 2 days,
 23:44,  1 user,  load average: 18.73, 20.96, 27.06<br></td></tr></table>

 what is the best way to parse out the first load average figure, ie in
 this case 18.73

Yeah, what language?

echo $THAT_LINE | sed -e 's/.*load average: //' | cut -d ',' -f 1

a bit dirty but it works.

-- 
I'd love to go out with you, but the man on television told me to stay 
tuned.


Re: parsing

2006-05-25 Thread David Mann

On May 26, 2006, at 11:29 AM, Nick Rout wrote:


given this string (all one line):

Tracker Load: (9 %)<table class=main border=0 width=400><tr><td
style='padding: 0px; background-image: url(pic/loadbarbg.gif);
background-repeat: repeat-x'><img height=15 width=36
src=/pic/loadbargreen.gif alt='9%'></td></tr></table>18:25:15  up 2 days,
23:44,  1 user,  load average: 18.73, 20.96, 27.06<br></td></tr></table>


what is the best way to parse out the first load average figure, ie in
this case 18.73


The best way will depend on what you're using, and how likely that  
string is to change.


Just off the top of my head, you could find the index of the  
substring load average and go from there, or maybe throw everything  
into a CSV parser to get the string  load average: 18.73, which  
wouldn't need much manipulation to convert to a numeric data type.
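
The index-of-substring idea could look roughly like this with awk's index() and substr(). It is a sketch only: the shortened sample line and the variable names are assumptions, and adding 0 is just one way to force a numeric value.

```shell
line='18:25:15  up 2 days, 23:44,  1 user,  load average: 18.73, 20.96, 27.06'

load=$(printf '%s\n' "$line" | awk '{
    key = "load average: "
    pos = index($0, key)                  # 0 means the key is absent
    if (pos == 0) exit 1
    rest = substr($0, pos + length(key))  # text after the key
    sub(/,.*/, "", rest)                  # cut at the first comma
    print rest + 0                        # + 0 forces a numeric value
}')

echo "$load"    # 18.73
```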


- Dave



Re: parsing

2006-05-25 Thread Volker Kuhlmann
 given this string (all one line):
 
 Tracker Load: (9 %)<table class=main border=0 width=400><tr><td
 style='padding: 0px; background-image: url(pic/loadbarbg.gif);
 background-repeat: repeat-x'><img height=15 width=36
 src=/pic/loadbargreen.gif alt='9%'></td></tr></table>18:25:15  up 2 days,
 23:44,  1 user,  load average: 18.73, 20.96, 27.06<br></td></tr></table>
 
 what is the best way to parse out the first load average figure, ie in
 this case 18.73

Easy. Given that you didn't specify the language, the default
fallback is bash, which, short of a compiled language, is probably also
the fastest:

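line=yourstring
line=${line##*load average:}
line=${line%%,*}
load=${line##* }
echo "$load"

The white space is stripped last, after the comma cut: ${line##* } removes
everything up to the last space of what is left, so it must run once only
the first figure remains. Kept separate from the load average: pattern, it
allows for variable white space.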

You get the idea for other languages: kill everything until load
average, and after the first comma of what's left.

Volker

-- 
Volker Kuhlmann is list0570 with the domain in header
http://volker.dnsalias.net/ Please do not CC list postings to me.


Re: parsing

2006-05-25 Thread Andrew Errington
 what is the best way to parse out the first load average figure, ie in
 this case 18.73

The best way is to get someone else to do it for you and guarantee it will 
work.

In Python (if foo is your string):

load_average = float(foo.split('load average: ')[1].split(',')[0])

But it depends what language you have at your disposal (or which one you 
are forced to use).

A


Re: parsing

2006-05-25 Thread Nick Rout

On Fri, 26 May 2006 12:04:48 +1200
Hadley Rich wrote:

 On Friday 26 May 2006 11:29, Nick Rout wrote:
  given this string (all one line):
 
  Tracker Load: (9 %)<table class=main border=0 width=400><tr><td
  style='padding: 0px; background-image: url(pic/loadbarbg.gif);
  background-repeat: repeat-x'><img height=15 width=36
  src=/pic/loadbargreen.gif alt='9%'></td></tr></table>18:25:15  up 2 days,
  23:44,  1 user,  load average: 18.73, 20.96, 27.06<br></td></tr></table>
 
  what is the best way to parse out the first load average figure, ie in
  this case 18.73
 
 Yeah, what language?
 
 echo $THAT_LINE | sed -e 's/.*load average: //' | cut -d ',' -f 1
 
 a bit dirty but it works.


yes that works thanks.  I think  I am right in saying that the sed part
cuts out everything up to the words load average: and the cut part
then takes up until (but not including) the comma.



-- 
Nick Rout [EMAIL PROTECTED]



Re: parsing

2006-05-25 Thread Christopher Sawtell
On Friday 26 May 2006 14:07, Nick Rout wrote:

 yes that works thanks.  I think  I am right in saying that the sed part
 cuts out everything up to the words load average: and the cut part
 then takes up until (but not including) the comma.
Correct!

-- 
CS


Re: parsing

2006-05-25 Thread Steve Holdoway
On Fri, 26 May 2006 14:07:17 +1200
Nick Rout [EMAIL PROTECTED] wrote:

 
 On Fri, 26 May 2006 12:04:48 +1200
 Hadley Rich wrote:
 
  On Friday 26 May 2006 11:29, Nick Rout wrote:
   given this string (all one line):
  
   Tracker Load: (9 %)<table class=main border=0 width=400><tr><td
   style='padding: 0px; background-image: url(pic/loadbarbg.gif);
   background-repeat: repeat-x'><img height=15 width=36
   src=/pic/loadbargreen.gif alt='9%'></td></tr></table>18:25:15  up 2 days,
   23:44,  1 user,  load average: 18.73, 20.96, 27.06<br></td></tr></table>
  
   what is the best way to parse out the first load average figure, ie in
   this case 18.73
  
  Yeah, what language?
  
  echo $THAT_LINE | sed -e 's/.*load average: //' | cut -d ',' -f 1
  
  a bit dirty but it works.
 
 
 yes that works thanks.  I think  I am right in saying that the sed part
 cuts out everything up to the words load average: and the cut part
 then takes up until (but not including) the comma.
 
 
 
 -- 
 Nick Rout [EMAIL PROTECTED]
 
...nearly. 

sed... substitute everything up to load average:[SPACE] with nothing.
cut... return the first field ( of what's piped to it ), using ',' as the 
delimiter.
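
To see the two stages separately, the pipeline can be run one step at a time. A sketch, assuming a shortened copy of the line in THAT_LINE; the intermediate variable names are made up.

```shell
THAT_LINE='18:25:15  up 2 days, 23:44,  1 user,  load average: 18.73, 20.96, 27.06'

# Step 1: sed replaces everything up to and including "load average: " with nothing.
after_sed=$(printf '%s\n' "$THAT_LINE" | sed -e 's/.*load average: //')
echo "$after_sed"      # 18.73, 20.96, 27.06

# Step 2: cut splits what is left on ',' and returns field 1.
load=$(printf '%s\n' "$after_sed" | cut -d ',' -f 1)
echo "$load"           # 18.73
```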

Steve


Re: Parsing Mail log files

2005-10-28 Thread Daniel

Jamie Dobbs wrote:

I am trying to parse my mail log files to find out the number of messages
received per day and the total size of the messages received.

The format of the log files is:


From x  Fri Oct 28 14:25:12 2005

 Subject: FW: Emailing: super_cop_1_.wmv
  Folder: ~/Maildir/new/1130462726.5627_0.xxx 2442024

Getting the number of messages per day is pretty easy, and I have done in
this way

for f in ~/Maildir/log*; do echo $f : `grep Folder: $f | wc -l`; done

I would also like to be able to get the total bytes/kbytes of the messages
received for the day and although I can see how it should be done
logically I do not know how to translate this into a command for a shell
script like the above. Logic tells me that we want the last word of the
line starting with the word Folder: and keep a running total of this.
Can anyone advise how I can do this?


Awk can do it reasonably easily:

for f in log-* ; do echo -n "$f : " ; awk -F ' ' '/Folder:/ {count++; sum += $NF}
END {printf("%d %d\n", count, sum)}' "$f"; done


And in kbytes

for f in log-* ; do echo -n "$f : " ; awk -F ' ' '/Folder:/ {count++; sum += $NF}
END {printf("%d %d\n", count, sum/1024)}' "$f"; done
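
For anyone who wants to try the awk part without real logs, here is a self-contained sketch. The first log entry is the one from the question; the second entry, its size and the temporary file are invented for the test.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
From x  Fri Oct 28 14:25:12 2005
 Subject: FW: Emailing: super_cop_1_.wmv
  Folder: ~/Maildir/new/1130462726.5627_0.xxx 2442024
From y  Fri Oct 28 15:02:40 2005
 Subject: hello
  Folder: ~/Maildir/new/1130464960.5701_0.xxx 2048
EOF

# Count the Folder: lines and add up their last field (the size in bytes).
totals=$(awk '/Folder:/ {count++; sum += $NF} END {printf("%d %d\n", count, sum)}' "$log")
echo "$totals"     # 2 2444072
rm -f "$log"
```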



Daniel



Re: parsing

2002-04-04 Thread Jeremy Bertenshaw

cat textfile | sed 's/\[.*\]//g'

:-)

jeremyb.

 From: Nick Rout [EMAIL PROTECTED]
 Date: 2002/04/05 Fri PM 04:39:28 GMT+12:00
 To: CLUG [EMAIL PROTECTED]
 Subject: parsing
 
 I have  a string like postfix/smtpd[25532]: I want to cut it off at the
 [ so I end up with postfix/smtpd 
 
 how do i do it?
 -- 
 Nick Rout [EMAIL PROTECTED]
 
 





Re: parsing

2002-04-04 Thread Volker Kuhlmann

 I have  a string like postfix/smtpd[25532]: I want to cut it off at the
 [ so I end up with postfix/smtpd 

sed 's/\[.*$//'    (dump everything from the first [ on the line onwards)
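
If the string is already in a shell variable, the same cut can be made without starting sed at all. A sketch comparing the two; the variable names are made up.

```shell
tag='postfix/smtpd[25532]:'

# sed: delete from the first [ to the end of the line.
with_sed=$(printf '%s\n' "$tag" | sed 's/\[.*$//')

# Shell only: %% removes the longest suffix matching the pattern "[*".
with_shell=${tag%%\[*}

echo "$with_sed"      # postfix/smtpd
echo "$with_shell"    # postfix/smtpd
```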

Volker

-- 
Volker Kuhlmann, list0570 at paradise dot net dot nz
http://volker.orcon.net.nz/ Please do not CC list postings to me.



Re: parsing

2002-04-04 Thread Wayne Rooney


-Original Message-
From: Nick Rout [EMAIL PROTECTED]
To: CLUG [EMAIL PROTECTED]
Date: Friday, April 05, 2002 4:39 PM
Subject: parsing


I have  a string like postfix/smtpd[25532]: I want to cut it off at the
[ so I end up with postfix/smtpd 

how do i do it?
-- 
Nick Rout [EMAIL PROTECTED]



For a fixed length string try

echo postfix/smtpd[25532] | cut -c 1-13

No guarantees, I haven't had a chance to test it.

Wayne






Re: parsing

2002-04-04 Thread Jim Cheetham

On Fri, Apr 05, 2002 at 05:03:24PM +1200, Wayne Rooney wrote:
 From: Nick Rout [EMAIL PROTECTED]
 I have  a string like postfix/smtpd[25532]: I want to cut it off at the
 [ so I end up with postfix/smtpd 
 
 For a fixed length string try
 
 echo postfix/smtpd[25532] | cut -c 1-13

Actually, cut is much more flexible than that, and if you're running it
over large data sets, it's faster than an equivalent sed.

In this instance you want to use the invocation 'cut -d\[ -f1'
The -d option (with a quoted [, in case your shell thinks you're trying
to introduce a real metacharacter) tells cut to parse the input into a
number of columns, with [ acting as the delimiter.
-f1 selects the first column. The [ character will not appear in any of
the output fields ... it's the delimiter, not data.
Other -f options include things like -f2-3 for columns 2 and 3,
-f3,5 for columns 3 and 5, and the useful -f2- for all columns from 2
onwards.
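
The -f variants above can be tried on a single made-up record. A sketch; the input string and variable names are invented for illustration. When several fields are selected, cut joins them with the delimiter again.

```shell
row='a[b[c[d[e'      # made-up input with [ as the delimiter

f1=$(printf '%s\n' "$row"   | cut -d'[' -f1)      # first column
f23=$(printf '%s\n' "$row"  | cut -d'[' -f2-3)    # columns 2 and 3
f35=$(printf '%s\n' "$row"  | cut -d'[' -f3,5)    # columns 3 and 5
f2on=$(printf '%s\n' "$row" | cut -d'[' -f2-)     # column 2 onwards

printf '%s\n' "$f1" "$f23" "$f35" "$f2on"
# a
# b[c
# c[e
# b[c[d[e
```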

But really, just getting the postfix/smtpd section isn't that useful -
it doesn't change. Perhaps you want to identify the postfix entries in
the log file, and collect their values?

$ fgrep postfix maillog | cut -d: -f2-

Given an input file of
things [234]: This is a message
postfix [266]: This is another message
somethings [456]: This is a useless message
thongs [567]: This is not a message
swings [678]: This is an interesting message
postfix/smtpd [789]: This is because of an email message

You'll get
 This is another message
 This is because of an email message

(fgrep, or 'grep -F', is faster if you're searching for a fixed string,
as opposed to a regexp)
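
The example above can be reproduced with a throwaway file. This sketch uses grep -F rather than fgrep, since newer GNU grep releases print an obsolescence warning for fgrep; the temporary file and variable names are assumptions.

```shell
maillog=$(mktemp)
cat > "$maillog" <<'EOF'
things [234]: This is a message
postfix [266]: This is another message
somethings [456]: This is a useless message
thongs [567]: This is not a message
swings [678]: This is an interesting message
postfix/smtpd [789]: This is because of an email message
EOF

# Keep the lines containing the fixed string, then drop everything up to the first colon.
result=$(grep -F postfix "$maillog" | cut -d: -f2-)
printf '%s\n' "$result"
rm -f "$maillog"
```

Each output line keeps the blank that followed the colon, as in the listing above.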

-jim