Dear Shaji

I am sorry that I did not check my result file very carefully. My script 
actually works. The "else" branch runs for every line that does not match, so 
the script printed "no such line found" many times, which confused me. I have 
commented out the else statement now.
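
For anyone reading the archive: a minimal sketch of one way to report "no such 
line found" at most once per file, instead of once per non-matching line. The 
helper name find_branch_line and the in-memory sample are hypothetical and only 
for illustration; they are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): return the first line
# that starts with the given branch label, with leading whitespace and the
# trailing newline removed, or undef if no line matches.
sub find_branch_line {
    my ($fh, $branch) = @_;
    while (my $line = <$fh>) {
        $line =~ s/^\s+//;    # strip leading whitespace
        chomp $line;          # drop the trailing newline
        # \Q...\E quotes the dots in the label so they match literally
        return $line if $line =~ /^\Q$branch\E\s/;
    }
    return;                   # nothing matched in this file
}

# Demo on an in-memory "file" built from the sample data in this thread.
my $sample = <<'END';
 branch          t       N       S   dN/dS      dN      dS  N*dN  S*dS
   4..5      0.000  1103.3   327.7  0.0001  0.0000  0.0000   0.0   0.0
   5..1      0.043  1103.3   327.7  1.0797  0.0144  0.0134  15.9   4.4
END

open(my $fh, "<", \$sample) or die "Cannot open in-memory sample: $!";
my $hit = find_branch_line($fh, "5..1");
close $fh;

# The message is decided once, after the whole file has been scanned.
print defined $hit ? "$hit\n" : "no such line found\n";
```

Because the "not found" decision is made after the loop, a file with hundreds 
of unrelated lines produces at most one message.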

Thanks!

Best
Li
________________________________
From: Shaji Kalidasan [shajiin...@yahoo.com]
Sent: Thursday, November 07, 2013 4:30 AM
To: Wang, Li; beginners@perl.org
Subject: Re: please help correct my script

Dear Wang,

It is actually writing the desired info to the output file "summaryOFdNdS.txt".

Here is the content of the output file. In my case, I gave the filename 
"mydata.txt" as the command-line argument.

[content of summaryOFdNdS.txt]
geneName branch t N S dN/dS dN dS N*dN S*dS
mydata.txt 5..1      0.043  1103.3   327.7  1.0797  0.0144  0.0134  15.9   4.4
[/content of summaryOFdNdS.txt]

Is this what you really want? Please explain.

best,
Shaji
-------------------------------------------------------------------------------
Your talent is God's gift to you. What you do with it is your gift back to God.
-------------------------------------------------------------------------------


On Thursday, 7 November 2013 3:35 PM, "Wang, Li" <li.w...@ttu.edu> wrote:
Dear Perl Users

I have hundreds of input files, named geneName_paml_formated.mlc.
Each file contains content similar to the following:

w (dN/dS) for branches:  0.00010 1.07967 145.81217 0.00010
dN & dS for each branch
 branch          t       N       S   dN/dS      dN      dS  N*dN  S*dS
   4..5      0.000  1103.3   327.7  0.0001  0.0000  0.0000   0.0   0.0
   5..1      0.043  1103.3   327.7  1.0797  0.0144  0.0134  15.9   4.4
   5..2      0.004  1103.3   327.7 145.8122  0.0018  0.0000   2.0   0.0
   4..3      0.009  1103.3   327.7  0.0001  0.0000  0.0132   0.0   4.3
tree length for dN:       0.0162
tree length for dS:       0.0266

I want to extract the line that starts with "5..1" from each file and write it 
to a single output file. The first column of the output file will be the 
geneName, which is taken from the input file name.

My script is as follows:

#!/usr/bin/perl
# USAGE: perl
# unix command line: perl parseOUTcodemlResult.pl *mlc
use strict;
use warnings;
my $input_file;
my $geneName;

my $output_file = "summaryOFdNdS.txt";
open (OUT, ">", $output_file) or die "Cannot open $output_file for writing: $!";
print OUT "geneName\tbranch\tt\tN\tS\tdN/dS\tdN\tdS\tN*dN\tS*dS\n";

for $input_file (@ARGV) {
    $geneName = $input_file;
    $geneName =~ s/\_paml\_formated\.mlc//; #change the expression pattern here

    process_file($input_file, $geneName);
}


sub process_file {
    my ($input_file, $geneName) = @_;
    open (IN, "<", $input_file) or die "Cannot open $input_file: $!";
    while (my $line = <IN>) {
        $line =~ s/^\s+//;   # strip leading whitespace
        if ($line =~ m/^5\.\.1/) {   # change the expression here when the tree changes
            print OUT "$geneName\t$line\n";
        } #if
        else { print "no such line found\n"; }
    } #while

    close IN;

} #sub
close OUT;

It turns out that it did not work; the script does not seem to find my 
pattern $line=~m/^5\.\.1/

If someone can point out my error, I will appreciate it very much!

Best
Li


