Steve Simon wrote:
Ok, I have spent half an hour trying to parse CSV files
and it's getting embarrassing. I could do it in C, but I should
be able to use rc + sed + awk.
The problem is that some fields in my CSV files contain whitespace
and thus have double quotes around them.
I thought rc knew about %q-quoted strings, so I could use it to
do my parsing, but it fails. Can this be done, or is C the answer?
It seems a shame to resort to sledgehammers.
-Steve
cpu% cat file.csv
a,b,"c,d,e",f,g
p,q,r,s,t
cpu%
cpu% cat extract
#!/bin/rc
sed 's/"([^"]*)"/''\1''/g; s/,/ /g' $* |
while (s=`{read})
	echo $s(1) $s(3) $s(4)
cpu% extract file.csv
a 'c d
p r s
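I thought about this case today. In native Plan 9 the solution is quite easy.
Programs share the environment, so a child rc can re-parse the line with its
quotes honoured and assign the result back to s, and this is the answer: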
sed 's/"([^"]*)"/''\1''/g; s/,/ /g' $* |
while (s=`{read}) {
	echo 's=('$"s')' | rc
	echo $s(1) $s(3) $s(4)
}
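
For comparison, here is a rough sketch of doing the field splitting in awk alone, so that no data is fed back to a shell as source text. It is not from the thread. The names csvcols and csvsplit are made up for illustration, the wrapper is written for a POSIX shell rather than rc, and it assumes fields contain no embedded newlines and no doubled "" escapes. The awk program contains no single quotes, so under rc it should carry over unchanged inside fn csvcols { ... }.

```shell
# csvcols: print fields 1, 3 and 4 of each CSV line read from stdin,
# separated by "|". Hypothetical helper, not part of the original thread.
csvcols() {
	awk '
	# Split one CSV line into the array f[]; return the field count.
	# Names after the extra spaces in the parameter list are locals.
	function csvsplit(line, f,    n, i, c, inq, cur) {
		split("", f)                # clear fields left over from the previous line
		n = 0; inq = 0; cur = ""
		for (i = 1; i <= length(line); i++) {
			c = substr(line, i, 1)
			if (c == "\"")
				inq = !inq          # toggle quoted state and drop the quote
			else if (c == "," && !inq) {
				f[++n] = cur        # an unquoted comma ends the field
				cur = ""
			} else
				cur = cur c         # ordinary character, or a comma inside quotes
		}
		f[++n] = cur                # last field has no trailing comma
		return n
	}
	{
		csvsplit($0, f)
		print f[1] "|" f[3] "|" f[4]
	}'
}
```

Run as csvcols <file.csv on the sample file above, this should print a|c,d,e|f and p|r|s. Unlike the sed version, the commas inside the quoted field are kept, because the field is never flattened to a whitespace-separated list.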