search scalars of an array in another file

2012-06-21 Thread Wang, Li
Dear list members

I am a complete beginner at Perl programming.
I am trying to write a script that searches for every element of one array
(the gene IDs read from geneIDFile) in another file (annotationFile). When an
ID matches, the script should output the whole line of the annotation file.
My script is below. It does not work, and I cannot spot the error. Could
anyone help me?

#!/usr/bin/perl -w

# This script assigns gene function from specific poplar gene IDs using the
# Populus trichocarpa annotation
# USAGE:
# unix command line:
# ./assignGOpoplar.pl candidateGenes.name annotationFile.name 
# e.g. ./assignGOpoplar.pl top4018tags.xls Ptrichocarpa_156_annotation_info.txt
# the script takes the gene IDs from the first file and finds the
# annotation in the second file
# then outputs a third file with the geneID and annotation


use strict;
use warnings;

my $geneIDfile = shift @ARGV;
my @geneID=(); 
my @logFC=();
my @logCPM=();
my @LR=();
my @Pvalue=();
my @FDR=();

my $i=-1;
open (GENEIDFILE, "$geneIDfile") || die "GENEID File not found\n";
 while (<GENEIDFILE>) {
   chomp;
   $i++;
   next if ($i==0);
   ($geneID[$i], $logFC[$i], $logCPM[$i], $LR[$i], $Pvalue[$i], $FDR[$i]) = split(/\t/, $_);
 }
close(GENEIDFILE);


my $j= 1;
my $annotationFile = 
"/Users/olsonmatthew/Desktop/Perl/Ptrichocarpa_156_annotation_info.txt";
open (ANNOTFILE, "<$annotationFile") || die "ANNOTFILE File not found\n";
 while (<ANNOTFILE>) {
   chomp;

   if ($_=~/\n/){
     if ($_=~/$geneID[$j]/){
       print "$_\n";
     }
     ++$j;
   }
 }
close(ANNOTFILE);
exit;


Best wishes
Li
--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/




Re: search scalars of an array in another file

2012-06-21 Thread Jim Gibson

On Jun 21, 2012, at 4:47 PM, Wang, Li wrote:

> Dear list members
> 
> I am a complete beginner at Perl programming.
> I am trying to write a script that searches for every element of one array
> (the gene IDs read from geneIDFile) in another file (annotationFile). When an
> ID matches, the script should output the whole line of the annotation file.
> My script is below. It does not work, and I cannot spot the error. Could
> anyone help me?
> 
> #!/usr/bin/perl -w
> 
> # This script assigns gene function from specific poplar gene IDs using the
> # Populus trichocarpa annotation
> # USAGE:
> # unix command line:
> # ./assignGOpoplar.pl candidateGenes.name annotationFile.name 
> # e.g. ./assignGOpoplar.pl top4018tags.xls Ptrichocarpa_156_annotation_info.txt
> # the script takes the gene IDs from the first file and finds the
> # annotation in the second file
> # then outputs a third file with the geneID and annotation
> 
> 
> use strict;
> use warnings;
> 
> my $geneIDfile = shift @ARGV;
> my @geneID=(); 
> my @logFC=();
> my @logCPM=();
> my @LR=();
> my @Pvalue=();
> my @FDR=();
> 
> my $i=-1;
> open (GENEIDFILE, "$geneIDfile") || die "GENEID File not found\n";
> while (<GENEIDFILE>) {
>   chomp;
>   $i++;
>   next if ($i==0);
>   ($geneID[$i], $logFC[$i], $logCPM[$i], $LR[$i], $Pvalue[$i], $FDR[$i]) = split(/\t/, $_);
> }
> close(GENEIDFILE);
> 
> 
> my $j= 1;
> my $annotationFile = 
> "/Users/olsonmatthew/Desktop/Perl/Ptrichocarpa_156_annotation_info.txt";
> open (ANNOTFILE, "<$annotationFile") || die "ANNOTFILE File not found\n";
> while (<ANNOTFILE>) {
>   chomp; 
> 
>   if ($_=~/\n/){

You are only going to look at lines that have a newline in them, but chomp has 
removed the newline. Therefore, you will never find matching lines.
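
A minimal sketch of that effect, using a made-up annotation line (the real file contents were not posted): the same pattern matches before chomp and fails after it.

```perl
use strict;
use warnings;

# In-memory stand-in for one line of the annotation file (hypothetical data).
my $line = "geneB\tsome annotation\n";

my $had_newline_before = ($line =~ /\n/) ? 1 : 0;   # newline still present
chomp $line;                                        # strips the trailing "\n"
my $has_newline_after  = ($line =~ /\n/) ? 1 : 0;   # nothing left to match

print "before chomp: $had_newline_before, after chomp: $has_newline_after\n";
```

If the test was meant to skip empty lines, something like `next unless length;` after chomp would be one way to do that, but that is a guess at the intent.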


>   if ($_=~/$geneID[$j]/){
>   print "$_\n";
>   }
>   ++$j;

Do you realize that you are going to try to match each geneID against only one 
line of your annotation file? Is that what you want to do?
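
If the goal is instead to test every line against every ID, one possible sketch is to loop over all IDs for each line. The IDs and lines below are invented for illustration, and `\Q...\E` is added on the assumption that real IDs may contain regex metacharacters such as ".".

```perl
use strict;
use warnings;

# Hypothetical gene IDs and annotation lines, for illustration only.
my @geneID = ('geneB', 'geneC');
my @lines  = ("geneA\tkinase", "geneB\ttransporter", "geneC\tunknown");

my @matched;
for my $line (@lines) {
    # Test the current line against every ID, not just $geneID[$j].
    # \Q...\E keeps regex metacharacters inside an ID literal.
    if (grep { $line =~ /\Q$_\E/ } @geneID) {
        push @matched, $line;
    }
}
print "$_\n" for @matched;
```

This scans the whole ID list for each line, which is simple but slow for thousands of IDs.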

>   }
>   }
> close(ANNOTFILE);
> exit;
> 






Re: search scalars of an array in another file

2012-06-21 Thread John W. Krahn

Wang, Li wrote:

> Dear list members


Hello,



> I am a complete beginner at Perl programming.


Welcome to the Perl beginners mailing list.



> I am trying to write a script that searches for every element of one array
> (the gene IDs read from geneIDFile) in another file (annotationFile). When an
> ID matches, the script should output the whole line of the annotation file.
> My script is below. It does not work, and I cannot spot the error. Could
> anyone help me?

> #!/usr/bin/perl -w
>
> # This script assigns gene function from specific poplar gene IDs using the
> # Populus trichocarpa annotation
> # USAGE:
> # unix command line:
> # ./assignGOpoplar.pl candidateGenes.name annotationFile.name
> # e.g. ./assignGOpoplar.pl top4018tags.xls Ptrichocarpa_156_annotation_info.txt


You say that you want two file names on the command line but your code 
only uses one of those file names.




> # the script takes the gene IDs from the first file and finds the
> # annotation in the second file
> # then outputs a third file with the geneID and annotation


Your code also never opens an output file.



> use strict;
> use warnings;
>
> my $geneIDfile = shift @ARGV;
> my @geneID=();
> my @logFC=();
> my @logCPM=();
> my @LR=();
> my @Pvalue=();
> my @FDR=();
>
> my $i=-1;
> open (GENEIDFILE, "$geneIDfile") || die "GENEID File not found\n";


You shouldn't quote scalar variables; Perl is not the shell.

perldoc -q quoting

You should probably also include the $! variable in your error message 
so you know why open failed.


open GENEIDFILE, '<', $geneIDfile or die "Cannot open '$geneIDfile' because: $!";
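
As a small illustration of what $! adds, the sketch below tries to open a path that is assumed not to exist and builds the same style of message. The path is hypothetical, and the lexical filehandle ($fh) is a stylistic choice for the sketch, not something the original script uses.

```perl
use strict;
use warnings;

# Hypothetical path that is assumed not to exist, to trigger the error branch.
my $geneIDfile = 'no_such_directory/candidateGenes.txt';

my $message = '';
if (open my $fh, '<', $geneIDfile) {
    close $fh;
}
else {
    # $! holds the operating system's reason for the failure.
    $message = "Cannot open '$geneIDfile' because: $!";
}
print "$message\n";
```

Note that without parentheses around the arguments to open, the low-precedence `or` is needed; `||` would bind to $geneIDfile instead of to the result of open.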




> while (<GENEIDFILE>) {
>   chomp;
>   $i++;
>   next if ($i==0);
>   ($geneID[$i], $logFC[$i], $logCPM[$i], $LR[$i], $Pvalue[$i], $FDR[$i]) = split(/\t/, $_);


You never use the arrays @logFC, @logCPM, @LR, @Pvalue and @FDR so you 
don't really need them.  Your loop would probably be better as:


while ( <GENEIDFILE> ) {
    next if $. == 1;
    push @geneID, ( split /\t/ )[ 0 ];



> }
> close(GENEIDFILE);
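
A self-contained sketch of the shorter loop, reading from an in-memory string so it can be run without the real file. The column names and values are invented, and the lexical handle $in stands in for GENEIDFILE.

```perl
use strict;
use warnings;

# Hypothetical tab-separated contents standing in for the gene ID file.
my $data = "GeneID\tlogFC\tlogCPM\n"
         . "geneB\t1.5\t3.2\n"
         . "geneC\t-2.0\t4.1\n";

open my $in, '<', \$data or die "Cannot open in-memory file: $!";
my @geneID;
while (<$in>) {
    next if $. == 1;                    # $. is the current line number; skip the header
    chomp;
    push @geneID, ( split /\t/ )[0];    # keep only the first column
}
close $in;

print join(',', @geneID), "\n";
```

Because push appends, @geneID starts at index 0 here, so a later loop over it should not start at index 1 as $j does in the original script.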


> my $j= 1;
> my $annotationFile =
> "/Users/olsonmatthew/Desktop/Perl/Ptrichocarpa_156_annotation_info.txt";


Aren't you supposed to get this file name from the command line (@ARGV)?
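
One way to do that, sketched under the assumption that exactly two arguments are expected as in the usage comment. @ARGV is filled in by hand here only so the example runs on its own.

```perl
use strict;
use warnings;

# For a self-contained demonstration only: pretend the script was invoked
# with the two file names from the usage comment.
@ARGV = ('top4018tags.xls', 'Ptrichocarpa_156_annotation_info.txt');

die "Usage: $0 candidateGenes.name annotationFile.name\n" unless @ARGV == 2;
my ($geneIDfile, $annotationFile) = @ARGV;

print "gene IDs from:   $geneIDfile\n";
print "annotation from: $annotationFile\n";
```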



> open (ANNOTFILE, "<$annotationFile") || die "ANNOTFILE File not found\n";


open ANNOTFILE, '<', $annotationFile or die "Cannot open '$annotationFile' because: $!";




> while (<ANNOTFILE>) {
>   chomp;
>
>   if ($_=~/\n/){


The readline operator (<ANNOTFILE>) reads one line from the file, where a 
line is defined as zero or more characters ending in a newline, and then chomp 
removes that newline, so there is no newline for your regular expression 
to match.




>     if ($_=~/$geneID[$j]/){


You are only comparing one element from @geneID to the line instead of 
all elements which you stated at the beginning is what you wanted to do.
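
Without sample data any solution is a guess, but one common pattern is to put the IDs in a hash and look up a field of each annotation line. The sketch below ASSUMES the gene ID is the first tab-separated column of the annotation file and matches it exactly; all data shown is invented.

```perl
use strict;
use warnings;

# Hypothetical data; the real file layouts were not shown in the thread.
my @geneID = ('geneB', 'geneC');
my $annotation = "geneA\tkinase\n"
               . "geneB\ttransporter\n"
               . "geneD\tunknown\n";

# Build a lookup table so each line needs one hash access instead of a scan.
my %wanted = map { $_ => 1 } @geneID;

open my $annot, '<', \$annotation or die "Cannot open in-memory file: $!";
my @hits;
while (my $line = <$annot>) {
    chomp $line;
    my ($id) = split /\t/, $line;       # ASSUMPTION: the ID is the first column
    push @hits, $line if defined $id && $wanted{$id};
}
close $annot;

print "$_\n" for @hits;
```

If the ID sits in a different column, or only appears as part of a longer name, the split index or the comparison would need to change accordingly.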




>       print "$_\n";
>     }
>     ++$j;
>   }
> }
> close(ANNOTFILE);
> exit;


If you could provide some sample data from your two files it would be 
easier to come up with a solution.




John
--
Any intelligent fool can make things bigger and
more complex... It takes a touch of genius -
and a lot of courage to move in the opposite
direction.   -- Albert Einstein
