Zhao Peng wrote:

string1_string2_string3_string4.sas7bdat

abc_st_nh_num.sas7bdat
abc_st_vt_num.sas7bdat
abc_st_ma_num.sas7bdat
abcd_region_NewEngland_num.sas7bdat
abcd_region_South_num.sas7bdat

My goal is to:
1. Extract string2 from each file name.
2. Sort them and keep only the unique ones.
3. Output them to a .txt file (one unique string2 per line).

Solution #1:
ls -1 *sas7bdat | awk -F_ '{print $2}' | sort -fu | cat -n > output.txt

Take the output of ls, one file per line (ls -1) - only files ending with sas7bdat.
Feed it into awk, splitting on _, and print the 2nd field.
Sort, ignoring case and eliminating duplicates (sort options: -f "folds case", -u "keeps only uniques").
Number the lines (cat -n).
Put the output in a file named output.txt.
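
With the five sample filenames above, field 2 is "st" for three of them and "region" for the other two, so output.txt should come out looking something like this (cat -n pads the numbers with whitespace; exact spacing may differ):

     1  region
     2  st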

Solution #2:
ls -1 *sas7bdat | sed 's/^\([a-zA-Z0-9]*_\)\([a-zA-Z0-9]*\)_.*$/\2/' | sort -fu | cat -n > output.txt

Use sed (the stream editor) to break each filename into atoms separated by _, and output the 2nd one (the \2). Regular expressions (regex) can be very handy: ^ matches the beginning of the string, [a-zA-Z0-9]*_ matches a letter/number string ending with _, and the backslashed parentheses group the patterns so the 2nd one can be extracted.
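
As a quick sanity check, you can run the sed expression by itself against one of the sample filenames; it should print just the second atom:

echo abc_st_nh_num.sas7bdat | sed 's/^\([a-zA-Z0-9]*_\)\([a-zA-Z0-9]*\)_.*$/\2/'
st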

There are many solutions to the problem, as you can see.
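
For instance (just a sketch, not tested against your files), cut can stand in for the awk or sed step, since the delimiter is a single underscore:

ls -1 *sas7bdat | cut -d_ -f2 | sort -fu | cat -n > output.txt

Here -d_ sets the delimiter and -f2 selects the 2nd field, so the rest of the pipeline stays the same.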

--
Dan Jenkins ([EMAIL PROTECTED])
Rastech Inc., Bedford, NH, USA --- 1-603-206-9951
*** Technical Support Excellence for over a quarter century

_______________________________________________
gnhlug-discuss mailing list
gnhlug-discuss@mail.gnhlug.org
http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss
