I have written a rather simple script to get used to LWP::Simple and friends. It uses a subroutine to "get" data from a website and "print" it to a file. It works, except that the first call to the subroutine receives no data at all; every call after that works fine. I suspect it has to do with how I am passing data into the subroutine. The output is shown below, followed by the Perl code that produced it:

<Begin OutPut>
Fetching                                                # See, it is fetching nothing
Appending "" to "fetched_ncbi_sequences.txt".           # Appending nothing
Fetching cv889431                                       # It works fine from this point on
Appending "cv889431" to "fetched_ncbi_sequences.txt".
Fetching cv889432
Appending "cv889432" to "fetched_ncbi_sequences.txt".
Fetching cv889433
Appending "cv889433" to "fetched_ncbi_sequences.txt".
Fetching cv889434
Appending "cv889434" to "fetched_ncbi_sequences.txt".
Fetching cv889435
Appending "cv889435" to "fetched_ncbi_sequences.txt".
Fetching cv889436
Appending "cv889436" to "fetched_ncbi_sequences.txt".
Fetching cv889437
Appending "cv889437" to "fetched_ncbi_sequences.txt".
Fetching cv889438
Appending "cv889438" to "fetched_ncbi_sequences.txt".
Fetching cv889439
Appending "cv889439" to "fetched_ncbi_sequences.txt".
Fetching cv889440
Appending "cv889440" to "fetched_ncbi_sequences.txt".
Fetching cv889441
Appending "cv889441" to "fetched_ncbi_sequences.txt".


**********Finished**********

</End OutPut>

The script is as follows:

<Begin code>
#!/usr/bin/perl -w
use strict;
use LWP::Simple;

open(FASTA, ">fetched_ncbi_sequences.txt")
        or die "Cannot open FASTA file: $!";

print "\n\t**Welcome to Mike Robeson's NCBI-fetch Script!**\n
        A - Just enter in the accession numbers of the sequence data
        you wish to pull from genbank individually (e.g. cv889410) or
        by defining a range (cv889431-cv889441). Hit <enter> after
        each entry or entry range.\n
        B - When finished, hit <enter> one last time and press <ctrl-d>.\n
        C - All sequence data will be downloaded into one file in FASTA
        format (e.g. fetched_ncbi_sequences.txt).
        \n\n";

print "Enter a list of Sequence IDs to fetch:\n";
chomp (my @list = <ARGV>);

&printSequence;

foreach my $id (@list) {
        if ($id =~ s/([a-z]*)(\d+)-[a-z]*(\d+)//) {
                my @range = split(/-/,$id);
                my $init_range_letters = $1;
                my $init_range_num = $2;
                my $term_range_num = $3;
                for (my $count = $init_range_num; $count <= $term_range_num; $count++) {
                        my $genbank = $init_range_letters.$count;
                        &printSequence($genbank);           
                }
        } else {
                &printSequence($id);
        }
}

print "\n\n**********Finished**********\n\n";


sub printSequence {
my $accession = "@_";
print "Fetching $accession \n";
my $data = get("http://www.ncbi.nlm.nih.gov/entrez/batchseq.cgi?cmd=&txt=on&save=&cfm=&term=&list_uids=$accession&db=nucleotide&extrafeat=16&view=fasta&dispmax=20&SendTo=t&__from=&__to=&__strand=");
print FASTA $data;
print "Appending \"$accession\" to \"fetched_ncbi_sequences.txt\"\.\n";
}


</End code>
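
For reference, here is the range handling on its own, without the network call. This is only a sketch: `expand_range` is a hypothetical helper name that does not appear in the script, and it assumes every range has the form prefix+digits, a dash, then prefix+digits (e.g. cv889431-cv889433). Leading zeros in the numeric part are not preserved.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# expand_range is a hypothetical helper, not part of the script above.
# It turns "cv889431-cv889433" into ("cv889431", "cv889432", "cv889433")
# and returns any other entry unchanged as a one-element list.
sub expand_range {
    my ($entry) = @_;

    # Match without substituting, so $entry itself is left intact.
    if ($entry =~ /^([a-z]*)(\d+)-[a-z]*(\d+)$/i) {
        my ($prefix, $first, $last) = ($1, $2, $3);

        # Adding 0 forces a numeric range rather than a string range.
        return map { $prefix . $_ } ($first + 0) .. ($last + 0);
    }

    return ($entry);
}

print join(",", expand_range("cv889431-cv889433")), "\n";
print join(",", expand_range("cv889410")), "\n";
```

Run as is, it prints the three expanded accession numbers on one line and the single accession number on the next.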

I have been stumped on this for 3 hours now and can't figure out what is going on. Any suggestions?

-Mike

