On Sat, Jul 11, 2009 at 03:57:26PM +0100, Penguin Lover Mick squawked:
> > > Hmm, I don't think it gets anywhere:
> > > =======================================
> > > cat test.xml | grep -v NaN | grep '<row>' | tr e ' ' | awk
> > > {'print "Q"$2"qcq"$3"qcq"$9"Q"'} | tr Q '"' | tr c ',' | tr q '"'"
> > > > test.csv
> > > =======================================
> > >
> > > It just sits there at the > cursor. I think it needs something more to
> > > it, or
> >
> > Looks like a syntax error with improperly nested quotations marks.
> >
> > The last command in the sequence, which reads
> >
> >   tr q '"'"
> >
> > try replacing that with
> >
> >   tr q '"'
> >
> > (remove the final double quote)
> >
> > W
>
> Thank you both! It works to a point. This is what the xml file contains:
>
> <database>
> <!-- 2009-07-02 07:41:00 EDT / 1246534860 --> <row><v>
> 7.3395000000e+01 </v><v> 4.7990000000e+01 </v></row>
>
> The CSV file only shows the first value and then it does not pick up the fact
> that it is exponential:
>
> "2009-07-02","07:41:00","7.3395000000"
>
> How could it be tweaked to a)account for e+01, b)include additional value
> fields?
>
Try:

  cat test.xml | grep -v NaN | grep '<row>' |
    awk {'print "Q"$2"qcq"$3"qcq"$9"qcq"$11"Q"'} |
    tr Q '"' | tr c ',' | tr q '"' > test.csv

Just to help you help yourself later: the 'tr' command "translates".
So the command

  tr e ' '

swaps occurrences of the letter e with a blank space. Removing that
command keeps the e in the numbers. (Though I am not certain how CSV
files deal with e notation...)

awk splits each line into whitespace-separated fields; $2, $3, etc.
refer to those fields by position. With the e left in place, each
number stays a single field, so the second value is $11, and adding
$11 prints that one additional field. This, of course, only works if
every row has the same number of fields.

HTH,

W
-- 
A gossip is someone with a great sense of rumour.
Sortir en Pantoufles: up 947 days, 22:39
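
If the number of values per row varies, the field-by-position idea above can be sketched with a loop in awk instead of hard-coding $9 and $11. This is only a sketch, not a tested recipe for the real file: it assumes each timestamp comment and its <row> sit on the same line, that values land on every second field starting at $9, and the function name xml_rows_to_csv is made up for illustration. It also writes the quotes and commas directly from awk, so the Q/q/c placeholders and the tr calls are not needed.

```shell
# Hypothetical helper (name invented for this example).
# Assumes lines shaped like:
#   <!-- DATE TIME TZ / EPOCH --> <row><v> V1 </v><v> V2 </v></row>
# so that $2 is the date, $3 the time, and values sit at $9, $11, $13, ...
xml_rows_to_csv() {
    grep -v NaN "$1" | grep '<row>' | awk '{
        # date and time, each wrapped in double quotes
        printf "\"%s\",\"%s\"", $2, $3
        # every second field from $9 onward is a value; the e notation is kept
        for (i = 9; i <= NF; i += 2)
            printf ",\"%s\"", $i
        printf "\n"
    }'
}

# Usage (not run here):
#   xml_rows_to_csv test.xml > test.csv
```

If whatever reads the CSV does not accept e notation, one option is to change the value format to something like "%.4f"; awk then treats the field as a number and prints it as a plain decimal. The precision is a guess on my part and should be chosen to suit the data.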