Re: parsing
On Sat, May 27, Hadley Rich wrote: I must play with awk some more. The explanation of some syntax in the awk/gawk manual is not always obvious, but the awk/gawk gurus on news://comp.lang.awk will clarify how it is used if Google fails. An active newsgroup. --- keith
Re: parsing
On Sat, May 27, Volker Kuhlmann wrote: I don't think that awk allows a regex for field separation. A Unix reference manual mentions that (original) awk uses Space or Tab as the default field separator and a single character if the option -Fc is used. -- keith.
Re: parsing
On Fri, May 26, 2006 at 11:29:54AM, Nick Rout wrote: what is the best way to parse out the first load average figure, ie in this case 18.73

'awk' accepts an array of field separators -F'[fsfsfs]' between square brackets so the required field $12 can be printed out. Not mentioned in the manual.

    echo $THAT_LINE | awk -F'[,:]' '{print $12}'
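A minimal sketch of how the fields are counted, using the sample line from this thread as it appears in the archive. The helper name first_load is made up for illustration, and the sub() call that strips the leading blank is an addition that is not part of the command above.

```shell
# Sample line as it appears in the archive (HTML remnants included).
line="Tracker Load: (9 %)table class=main border=0 width=400trtd style='padding: 0px; background-image: url(pic/loadbarbg.gif); background-repeat: repeat-x'img height=15 width=36 src=/pic/loadbargreen.gif alt='9%'/td/tr/table18:25:15 up 2 days, 23:44, 1 user, load average: 18.73, 20.96, 27.06br/td/tr/table"

# first_load is a hypothetical helper name.
# -F'[,:]' splits the line at every comma or colon. Field 12 is " 18.73"
# with a leading blank, so sub() removes leading blanks before printing.
first_load() {
    printf '%s\n' "$1" | awk -F'[,:]' '{ f = $12; sub(/^ +/, "", f); print f }'
}

first_load "$line"
```

The field number depends on how many commas and colons come before the figure, so it would need recounting if the surrounding text changed.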
Re: parsing
On Saturday 27 May 2006 16:07, Keith McGavin wrote: 'awk' accepts an array of field separators -F'[fsfsfs]' between square brackets so the required field $12 can be printed out. Not mentioned in the manual. echo $THAT_LINE | awk -F'[,:]' '{print $12}' Nice, I would say that's a winner. I must play with awk some more. -- There is always something new out of Africa. -- Gaius Plinius Secundus
Re: parsing
'awk' accepts an array of field separators -F'[fsfsfs]' between square brackets so the required field $12 can be printed out. Not mentioned in the manual.

In the interest of portability awareness, let me be anally correct. I don't think that awk allows a regex for field separation. However, gawk does, and it is in the manual. Of course on Linux systems, awk is typically a symlink to gawk. Good programming practice, when you require features of gawk which are not part of awk, is to call the program by the name gawk, so that a system without it produces an obvious failure instead of rubbish results.

Btw see my recent cable traffic script for an example of shell programming with gawk components; it fits in with the script talk we had last year too.

Volker
-- Volker Kuhlmann is list0570 with the domain in header http://volker.dnsalias.net/ Please do not CC list postings to me.
Re: parsing
Using which language? Also, does anyone else find it odd that the line has one table and 2 /tables? Is there another table further up? Cheers, Carl. On 26/05/06, Nick Rout wrote: given this string (all one line): Tracker Load: (9 %)table class=main border=0 width=400trtd style='padding: 0px; background-image: url(pic/loadbarbg.gif); background-repeat: repeat-x'img height=15 width=36 src=/pic/loadbargreen.gif alt='9%'/td/tr/table18:25:15 up 2 days, 23:44, 1 user, load average: 18.73, 20.96, 27.06br/td/tr/table what is the best way to parse out the first load average figure, ie in this case 18.73 -- Nick Rout
Re: parsing
On Friday 26 May 2006 11:29, Nick Rout wrote: given this string (all one line): Tracker Load: (9 %)table class=main border=0 width=400trtd style='padding: 0px; background-image: url(pic/loadbarbg.gif); background-repeat: repeat-x'img height=15 width=36 src=/pic/loadbargreen.gif alt='9%'/td/tr/table18:25:15 up 2 days, 23:44, 1 user, load average: 18.73, 20.96, 27.06br/td/tr/table what is the best way to parse out the first load average figure, ie in this case 18.73 Yeah, what language? echo $THAT_LINE | sed -e 's/.*load average: //' | cut -d ',' -f 1 a bit dirty but it works. -- I'd love to go out with you, but the man on television told me to stay tuned.
Re: parsing
On May 26, 2006, at 11:29 AM, Nick Rout wrote: given this string (all one line): Tracker Load: (9 %)table class=main border=0 width=400trtd style='padding: 0px; background-image: url(pic/loadbarbg.gif); background-repeat: repeat-x'img height=15 width=36 src=/pic/loadbargreen.gif alt='9%'/td/tr/table18:25:15 up 2 days, 23:44, 1 user, load average: 18.73, 20.96, 27.06br/td/tr/table what is the best way to parse out the first load average figure, ie in this case 18.73

The best way will depend on what you're using, and how likely that string is to change. Just off the top of my head, you could find the index of the substring "load average" and go from there, or maybe throw everything into a CSV parser to get the string "load average: 18.73", which wouldn't need much manipulation to convert to a numeric data type.

- Dave
Re: parsing
given this string (all one line): Tracker Load: (9 %)table class=main border=0 width=400trtd style='padding: 0px; background-image: url(pic/loadbarbg.gif); background-repeat: repeat-x'img height=15 width=36 src=/pic/loadbargreen.gif alt='9%'/td/tr/table18:25:15 up 2 days, 23:44, 1 user, load average: 18.73, 20.96, 27.06br/td/tr/table what is the best way to parse out the first load average figure, ie in this case 18.73

Easy. Given that you didn't specify the language, the default fallback is bash, which, short of a compiled language, is probably also the fastest:

    line="yourstring"
    line=${line##*load average:}
    line=${line#"${line%%[! ]*}"}
    load=${line%%,*}
    echo "$load"

The first two patterns could be combined, but as is, they allow for variable white space. You get the idea for other languages: kill everything until load average, and after the first comma of what's left.

Volker
-- Volker Kuhlmann is list0570 with the domain in header http://volker.dnsalias.net/ Please do not CC list postings to me.
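The "kill everything until load average, and after the first comma" idea can be traced step by step. This is only a sketch: the sample is a shortened tail of the line from the thread with extra blanks added on purpose, the variable name rest is made up, and the nested expansion used to trim leading blanks is one possible way to do it, not necessarily the form intended in the post.

```shell
# Shortened sample: only the tail of the line is kept, with three blanks
# after the colon to show that variable white space is handled.
line='1 user, load average:   18.73, 20.96, 27.06br/td/tr/table'

rest=${line##*load average:}     # drop everything up to and including "load average:"
rest=${rest#"${rest%%[! ]*}"}    # drop the run of leading blanks, however long
load=${rest%%,*}                 # drop everything from the first comma onwards
echo "$load"
```

Only parameter expansion is used, so no external process is started, which is where the speed claim comes from.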
Re: parsing
what is the best way to parse out the first load average figure, ie in this case 18.73

The best way is to get someone else to do it for you and guarantee it will work. In Python (if foo is your string):

    load_average = float(foo.split('load average: ')[1].split(',')[0])

But it depends what language you have at your disposal (or which one you are forced to use).

A
Re: parsing
On Fri, 26 May 2006 12:04:48 +1200 Hadley Rich wrote: Yeah, what language? echo $THAT_LINE | sed -e 's/.*load average: //' | cut -d ',' -f 1 a bit dirty but it works.

yes that works thanks. I think I am right in saying that the sed part cuts out everything up to the words load average: and the cut part then takes up until (but not including) the comma.

-- Nick Rout
Re: parsing
On Friday 26 May 2006 14:07, Nick Rout wrote: yes that works thanks. I think I am right in saying that the sed part cuts out everything up to the words load average: and the cut part then takes up until (but not including) the comma. Correct! -- CS
Re: parsing
On Fri, 26 May 2006 14:07:17 +1200 Nick Rout wrote: yes that works thanks. I think I am right in saying that the sed part cuts out everything up to the words load average: and the cut part then takes up until (but not including) the comma.

...nearly.
sed... substitute everything up to load average:[SPACE] with nothing.
cut... return the first field ( of what's piped to it ), using ',' as the delimiter.

Steve
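To see the two stages separately, here is a small sketch that keeps the intermediate result. The sample is a shortened tail of the line from the thread, and the variable names after_sed and load are made up for illustration.

```shell
# Shortened sample: only the tail of the line from the thread is kept.
line='1 user, load average: 18.73, 20.96, 27.06br/td/tr/table'

# Stage 1: sed substitutes everything up to and including "load average: "
# with nothing, leaving the three figures and the trailing remnants.
after_sed=$(printf '%s\n' "$line" | sed -e 's/.*load average: //')
echo "$after_sed"

# Stage 2: cut splits what is piped to it at ',' and returns the first field.
load=$(printf '%s\n' "$after_sed" | cut -d ',' -f 1)
echo "$load"
```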
Re: Parsing Mail log files
Jamie Dobbs wrote: I am trying to parse my mail log files to find out the number of messages received per day and the total size of the messages received. The format of the log files is:

    From x Fri Oct 28 14:25:12 2005
    Subject: FW: Emailing: super_cop_1_.wmv
    Folder: ~/Maildir/new/1130462726.5627_0.xxx 2442024

Getting the number of messages per day is pretty easy, and I have done it this way:

    for f in ~/Maildir/log*; do echo $f : `grep Folder: $f | wc -l`; done

I would also like to be able to get the total bytes/kbytes of the messages received for the day and although I can see how it should be done logically I do not know how to translate this into a command for a shell script like the above. Logic tells me that we want the last word of the line starting with the word Folder: and keep a running total of this. Can anyone advise how I can do this?

Awk can do it reasonably easily:

    for f in log-* ; do echo -n "$f : " ; awk '/Folder:/ {count++; sum += $NF} END {printf("%d %d\n", count, sum)}' "$f"; done

And in kbytes:

    for f in log-* ; do echo -n "$f : " ; awk '/Folder:/ {count++; sum += $NF} END {printf("%d %d\n", count, sum/1024)}' "$f"; done

Daniel
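A self-contained sketch of the counting and summing, run on a throwaway log file. The first entry is the one quoted above; the second entry, its size, and the variable names log and summary are invented for illustration.

```shell
# Hypothetical two-message log; the second entry is invented.
log=$(mktemp)
cat > "$log" <<'EOF'
From x Fri Oct 28 14:25:12 2005
 Subject: FW: Emailing: super_cop_1_.wmv
  Folder: ~/Maildir/new/1130462726.5627_0.xxx 2442024
From y Fri Oct 28 15:00:00 2005
 Subject: hello
  Folder: ~/Maildir/new/1130464800.5630_0.xxx 2048
EOF

# For every line containing "Folder:", count it and add its last field
# ($NF, the size in bytes) to a running total. Print count, bytes, kbytes.
summary=$(awk '/Folder:/ { count++; sum += $NF }
               END { printf("%d %d %d\n", count, sum, sum / 1024) }' "$log")
echo "$summary"
rm -f "$log"
```

A Subject line that happens to contain the text Folder: would also be counted; anchoring the pattern, for example /^ *Folder:/, is one way to guard against that.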
Re: parsing
cat textfile | sed 's/\[.*\]//g' :-) jeremyb. From: Nick Rout Date: 2002/04/05 Fri PM 04:39:28 GMT+12:00 To: CLUG Subject: parsing I have a string like postfix/smtpd[25532]: I want to cut it off at the [ so I end up with postfix/smtpd how do i do it? -- Nick Rout
Re: parsing
I have a string like postfix/smtpd[25532]: I want to cut it off at the [ so I end up with postfix/smtpd

    sed 's/\[.*$//'

(dump everything after first [ on line)

Volker
-- Volker Kuhlmann, list0570 at paradise dot net dot nz http://volker.orcon.net.nz/ Please do not CC list postings to me.
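A quick sketch of that sed command applied to the string from the question. The second variant, which uses shell parameter expansion instead of sed, is an alternative added here for comparison and was not suggested in the thread; the variable names are made up.

```shell
s='postfix/smtpd[25532]:'

# sed: delete from the first [ to the end of the line.
with_sed=$(printf '%s\n' "$s" | sed 's/\[.*$//')
echo "$with_sed"

# Alternative (an assumption, not from the thread): strip the longest
# suffix that starts with [ using parameter expansion, no external process.
with_expansion=${s%%\[*}
echo "$with_expansion"
```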
Re: parsing
-Original Message- From: Nick Rout To: CLUG Date: Friday, April 05, 2002 4:39 PM Subject: parsing I have a string like postfix/smtpd[25532]: I want to cut it off at the [ so I end up with postfix/smtpd how do i do it? -- Nick Rout

For a fixed length string try

    echo 'postfix/smtpd[25532]' | cut -c 1-13

No guarantees, I haven't had a chance to test it.

Wayne
Re: parsing
On Fri, Apr 05, 2002 at 05:03:24PM +1200, Wayne Rooney wrote: From: Nick Rout I have a string like postfix/smtpd[25532]: I want to cut it off at the [ so I end up with postfix/smtpd For a fixed length string try echo postfix/smtpd[25532] | cut -c 1-13

Actually, cut is much more flexible than that, and if you're running it over large data sets, it's faster than an equivalent sed. In this instance you want to use the invocation

    cut -d\[ -f1

The -d option (with a quoted [, in case your shell thinks you're trying to introduce a real metacharacter) tells cut to parse the input into a number of columns, with [ acting as the delimiter. -f1 selects the first column. The [ character will not appear in any of the output fields ... it's the delimiter, not data. Other -f options include things like -f2-3 for columns 2 and 3, -f3,5 for columns 3 and 5, and the useful -f2- for all columns from 2 onwards.

But really, just getting the postfix/smtpd section isn't that useful - it doesn't change. Perhaps you want to identify the postfix entries in the log file, and collect their values?

    $ fgrep postfix maillog | cut -d: -f2-

Given an input file of

    things [234]: This is a message
    postfix [266]: This is another message
    somethings [456]: This is a useless message
    thongs [567]: This is not a message
    swings [678]: This is an interesting message
    postfix/smtpd [789]: This is because of an email message

You'll get

    This is another message
    This is because of an email message

(fgrep, or 'grep -F', is faster if you're searching for a fixed string, as opposed to a regexp)

-jim
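The -d and -f options described above can be tried on a single line. This is a sketch only: the log entry text after the colon is invented, and the variable names are made up for illustration.

```shell
# Hypothetical log entry; the message text is invented.
entry='postfix/smtpd[25532]: connect from host: example'

# -d'[' -f1: the text before the first [ (the delimiter itself is dropped).
name=$(printf '%s\n' "$entry" | cut -d'[' -f1)
echo "$name"

# -d: -f2-: every column from the second onwards, colons between them kept.
rest=$(printf '%s\n' "$entry" | cut -d: -f2-)
echo "$rest"

# -d: -f1,3: columns 1 and 3, rejoined with the delimiter.
picked=$(printf '%s\n' "$entry" | cut -d: -f1,3)
echo "$picked"
```

Note that the blank following each colon stays attached to the start of the next column, so the -f2- result begins with a space.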