Zhao Peng <[EMAIL PROTECTED]> writes:
> You said that "there is an extra column in the 3rd line". I disagree
> with you from my perspective. As you can see, there are 3 commas in
> between "jesse" and "Dartmouth college". For these 3 commas, again, if
> we think of the 2nd one as merely an indication that the value for the
> age column is missing, then the 3rd line will be read as ["jesse",
> MISSING, "Dartmouth college"], not ["jesse",empty,empty, "Dartmouth
> college"] as you suggested.

From my perspective, your file format is harder to parse than it needs to be. If at all possible, I would suggest changing the format. If that isn't possible, this works on your input:

    perl -lne 's/,,/,MISSING/g; @F = split /,/; if (index($F[-1], "univ") != -1) { ($u = $F[-1]) =~ y/"//d; print $u }'

Formatted more readably, it looks like this:

    perl -lne '
        s/,,/,MISSING/g;
        @F = split /,/;
        if (index($F[-1], "univ") != -1) {
            ($u = $F[-1]) =~ y/"//d;
            print $u;
        }
    '

It rewrites each ",," as ",MISSING", splits the line on commas, and, when the last field contains "univ", prints that field with its double quotes stripped.

This seems to be a reasonable solution to your problem. I hope it helps.

Just another Perl hacker,
--kevin
-- 
GnuPG ID: B280F24E