Everyone has a different toolbox and different regexp logic wired into their brain. I prefer to break things down step by step and think ahead - mostly it is worth it.
The following looks more complex at first but, in my experience, as soon as you sort by the first column you will probably need to sort by, or do something with, the other columns too. So turning the data into clean, unquoted CSV has always been a good investment for me. I'd probably wire it up in this universal way (sorting differently means changing only the sort part; column operations can be done in the awk part alone):

cat sample.dat | sed "s/''/ /g;s/'//g" | sort -n -k1,1 -t , | sed "s/ /''/g" | awk -v FS=, '{printf("'"'%s',%s,'%s','%s','%s','%s'"'\n",$1,$2,$3,$4,$5,$6)}'

'8',11,'2000-07-18','Insecta','Ephemeroptera','Leptophlebiidae''Paraleptophlebia'
'11',11,'2000-07-18','Insecta','Trichoptera','Glossosomatidae''Agapetus'
'12',11,'2000-07-18','Insecta','Diptera','Tipulidae''Tipula'
'134',41,'2004-06-07','Insecta','Plecoptera','Nemouridae''Amphinemura'
'135',3,'2004-06-07','Insecta','Ephemeroptera','Baetidae''Baetus'
'137',41,'2004-06-07','Insecta','Ephemeroptera','Baetidae''Baetis'
'138',3,'2004-06-07','Insecta','Coleoptera','Hydrophilidae''Berosus'
'139',3,'2004-06-07','Insecta','Plecoptera','Chloroperlidae''Sweltsa'
'141',41,'2004-06-07','Insecta','Plecoptera','Chloroperlidae''Suwallia'
'145',3,'2004-06-07','Insecta','Diptera','Simulidae''Prosimulium'
'148',3,'2004-06-07','Annelida','Oligochaeta','Lumbricidae''Ilyodrilus/Tubifex'
'151',3,'2006-06-15','Insecta','Diptera','Chironomidae''Eukiefferiella'
'154',41,'2004-06-07','Insecta','Coleoptera','Dytiscidae''Hydrovatus'
'155',3,'2004-06-07','Insecta','Coleoptera','Dytiscidae''Hydrovatus'
'216',SC,'2005-07-13','Insecta','Diptera','Ephydridae'''
'592',17,'2011-07-11','Annelida','Oligochaeta','Tubificidae'
'648',17,'2011-07-11','Insecta','Plecoptera','Chloroperlidae''Suwallia'
'652',17,'2011-07-11','Insecta','Plecoptera','Pteronarcidae''Pteronarcella'
'895',17,'2010-09-13','Insecta','Ephemeroptera','Baetidae''Baetis'
'899',17,'2010-09-13','Insecta','Diptera','Psychodidae''Pericoma'
'901',17,'2010-09-13','Insecta','Coleoptera','Hydrophilidae''Cymbiodyta'
'907',17,'2010-09-13','Insecta','Trichoptera','Glossosomatidae''Glossosoma'
'909',17,'2010-09-13','Insecta','Diptera','Chironomidae''Cladotanytarsus'
'914',17,'2010-09-13','Insecta','Plecoptera','Nemouridae''Zapada'
'918',17,'2010-09-13','Insecta','Trichoptera','Hydropsychidae''Hydropsyche'
'919',17,'2010-09-13','Insecta','Coleoptera','Dytiscidae''Hydroporus'
'920',17,'2010-09-13','Insecta','Trichoptera','Lepidostomatidae''Lepidostoma'
'922',17,'2010-09-13','Insecta','Coleoptera','Elmidae''Narpus'
'1120',17,'2006-06-27','Insecta','Diptera','Chironomidae''Polypedilum'
'1126',17,'2006-06-27','Insecta','Ephemeroptera','Baetidae''Baetis'
'1128',17,'2006-06-27','Insecta','Trichoptera','Brachycentridae''Brachycentrus'
'1129',17,'2006-06-27','Insecta','Diptera','Chironomidae''Tvetenia'
'2060',11,'2012-07-11','Insecta','Coleoptera','Elmidae''Narpus'
'2061',11,'2012-07-11','Insecta','Diptera','Chironomidae''Natarsia'
'2062',11,'2012-07-11','Insecta','Trichoptera','Hydroptilidae''Ochrotrichia'

I really, really dislike quoted CSVs - what a waste.

Tomas

PS: I did not see any B(W) in your examples! Perhaps that is why looking for it with sed did not work.

On Tue, 2020-03-31 at 08:53 +0900, J. Hart wrote:
> try this :
>
> cat sample.dat | sed "s|^'\([0-9]*\)'|\1 '\1'|" | sort -n | sed
> "s|^[0-9]* ||" | tee sample.dat.new
>
>
> On 03/31/2020 08:30 AM, Rich Shepard wrote:
> > sample.dat:
> >
> > '648',17,'2011-07-11','Insecta','Plecoptera','Chloroperlidae''Suwallia'
> > '652',17,'2011-07-11','Insecta','Plecoptera','Pteronarcidae''Pteronarcella'
> >
>
> _______________________________________________
> PLUG mailing list
> PLUG@pdxlinux.org
> http://lists.pdxlinux.org/mailman/listinfo/plug
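
A rough sketch of the "change only the sort part" idea from the pipeline above, assuming a POSIX shell. The function name resort is hypothetical (it does not appear in the thread), and the three rows are copied from the sample data only to keep the example self-contained. The sed and awk stages are unchanged; only the sort options differ between the two calls.

```shell
#!/bin/sh
# Hypothetical wrapper: strip quotes, sort, then re-quote.
# Whatever arguments are given are passed straight to sort(1).
resort() {
    sed "s/''/ /g;s/'//g" |   # '' becomes a space, remaining quotes are dropped
    sort -t , "$@" |          # comma-separated; caller picks the key and order
    sed "s/ /''/g" |          # put the '' back where the space was
    awk -v FS=, '{printf("'"'%s',%s,'%s','%s','%s','%s'"'\n",$1,$2,$3,$4,$5,$6)}'
}

# Three rows taken from the sample data in this thread.
sample="'648',17,'2011-07-11','Insecta','Plecoptera','Chloroperlidae''Suwallia'
'8',11,'2000-07-18','Insecta','Ephemeroptera','Leptophlebiidae''Paraleptophlebia'
'135',3,'2004-06-07','Insecta','Ephemeroptera','Baetidae''Baetus'"

# Numeric sort on the first column, as in the original pipeline.
by_id=$(printf '%s\n' "$sample" | resort -n -k1,1)

# Same pipeline, newest date first: only the sort options change.
by_date_desc=$(printf '%s\n' "$sample" | resort -r -k3,3)

printf '%s\n\n%s\n' "$by_id" "$by_date_desc"
```

Like the original pipeline, this relies on no field containing a space of its own, since the space is used as the temporary stand-in for ''.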