Dear Perl users,

I have hundreds of input files, named geneName_paml_formated.mlc. Each file contains content similar to the following:

    w (dN/dS) for branches:  0.00010 1.07967 145.81217 0.00010

    dN & dS for each branch

    branch      t       N      S     dN/dS      dN      dS  N*dN  S*dS

      4..5  0.000  1103.3  327.7    0.0001  0.0000  0.0000   0.0   0.0
      5..1  0.043  1103.3  327.7    1.0797  0.0144  0.0134  15.9   4.4
      5..2  0.004  1103.3  327.7  145.8122  0.0018  0.0000   2.0   0.0
      4..3  0.009  1103.3  327.7    0.0001  0.0000  0.0132   0.0   4.3

    tree length for dN:  0.0162
    tree length for dS:  0.0266

I want to extract the line that starts with "5..1" from each file and write all of these lines to a single output file. The first column of the output file should be the gene name, which is taken from the input file name.

My script is as follows:

    #!/usr/bin/perl
    # USAGE: perl
    # unix command line: perl parseOUTcodemlResult.pl *mlc
    use strict;
    use warnings;

    my $input_file;
    my $geneName;
    my $output_file = "summaryOFdNdS.txt";

    # One summary file for all inputs; write the header row first.
    open (OUT, ">", $output_file) or die "Cannot open $output_file: $!";
    print OUT "geneName\tbranch\tt\tN\tS\tdN/dS\tdN\tdS\tN*dN\tS*dS\n";

    for $input_file (@ARGV) {
        # Derive the gene name by stripping the fixed suffix from the file name.
        $geneName = $input_file;
        $geneName =~ s/\_paml\_formated\.mlc//;   # change the expression pattern here
        process_file($input_file, $geneName);
    }

    sub process_file {
        my ($input_file, $geneName) = @_;
        open (IN, "<", $input_file) or die "Cannot open $input_file: $!";
        while (my $line = <IN>) {
            $line =~ s/^\s+//g;                   # strip leading whitespace
            if ($line =~ m/^5\.\.1/) {            # change the expression here when the tree changes
                print OUT "$geneName\t$line\n";
            } #if
            else { print "no such line found\n"; }
        } #while
        close IN;
    } #sub

    close OUT;

It turns out that it did not work, and it seems that the script cannot find my pattern, $line =~ m/^5\.\.1/.

If someone can point out my error, I would appreciate it very much!

Best,
Li