Hi Syed, thanks for your answer. I have a couple of issues with that solution. First, I have often experienced that this feature fails, i.e. I never receive the email, especially when requesting large amounts of data. Second, I wanted to be able to do this automatically, in a cronjob for example, and although I assume that is possible, it would require somewhat more scripting than I was planning on doing for this (unless there is some smart option here that I'm overlooking).
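
Just to illustrate the kind of scripting I was trying to avoid: the only systematic splitting I can think of is one martservice POST per chromosome, along the lines of the sketch below. This is only a rough idea, not something I have tested end to end: I am assuming here that the chromosome_name filter on hsapiens_gene_ensembl is the right way to chunk the query, and the attributes are the same ones as in my script further down.

use strict;
use warnings;
use LWP::UserAgent;

# One POST per chromosome instead of one huge query (sketch only;
# chromosome_name as a chunking filter is my assumption).
my @chromosomes = (1 .. 22, 'X', 'Y', 'MT');
my $path = "http://www.biomart.org/biomart/martservice?";
my $ua   = LWP::UserAgent->new(timeout => 600);

for my $chr (@chromosomes) {
    my $xml = <<"XML";
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query virtualSchemaName="default" formatter="FASTA" header="0" uniqueRows="1">
  <Dataset name="hsapiens_gene_ensembl" interface="default">
    <Filter name="chromosome_name" value="$chr"/>
    <Attribute name="ensembl_gene_id"/>
    <Attribute name="ensembl_transcript_id"/>
    <Attribute name="coding"/>
    <Attribute name="external_gene_id"/>
  </Dataset>
</Query>
XML
    # Form-encoded POST, one result file per chromosome.
    my $response = $ua->post($path, { query => $xml });
    if ($response->is_success) {
        open(my $out, '>', "cds_chr$chr.fa") or die "cds_chr$chr.fa: $!";
        print {$out} $response->decoded_content;
        close($out);
    }
    else {
        warn "chromosome $chr failed: " . $response->status_line . "\n";
    }
}

That would be easy enough to drop into a crontab, but I was hoping BioMart already offered something like this server-side before I start maintaining it myself.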
Best,
Elfar

On Sat, Jan 30, 2010 at 3:47 PM, Syed Haider <[email protected]> wrote:
> Hi Elfar,
>
> the best option is to download them using the web browser's Export (email option). This
> will compile the results on the server side and then send you a link by email.
>
> Best,
> Syed
>
>
> Elfar Torarinsson wrote:
>>
>> Hi,
>>
>> I was trying to automate regular downloads of human CDS (and UTRs)
>> using BioMart. I have tried it using the Perl script generated at
>> biomart:
>>
>> use strict;
>> use BioMart::Initializer;
>> use BioMart::Query;
>> use BioMart::QueryRunner;
>>
>> my $confFile = "/home/projects/ensembl/biomart-perl/conf/apiExampleRegistry.xml";
>> my $action = 'cached';
>> my $initializer = BioMart::Initializer->new('registryFile'=>$confFile, 'action'=>$action);
>> my $registry = $initializer->getRegistry;
>>
>> my $query = BioMart::Query->new('registry'=>$registry, 'virtualSchemaName'=>'default');
>>
>> $query->setDataset("hsapiens_gene_ensembl");
>> $query->addAttribute("ensembl_gene_id");
>> $query->addAttribute("ensembl_transcript_id");
>> $query->addAttribute("coding");
>> $query->addAttribute("external_gene_id");
>>
>> $query->formatter("FASTA");
>>
>> my $query_runner = BioMart::QueryRunner->new();
>> # to obtain unique rows only
>> $query_runner->uniqueRowsOnly(1);
>>
>> $query_runner->execute($query);
>> $query_runner->printHeader();
>> $query_runner->printResults();
>> $query_runner->printFooter();
>>
>> This only retrieves a few sequences and then starts returning
>> "Problems with the web server: 500 read timeout".
>>
>> I have also tried posting the XML using LWP in Perl. This downloads
>> more sequences, but it also stops after a while, before all the
>> sequences have been downloaded:
>>
>> use strict;
>> use LWP::UserAgent;
>>
>> open(FH, $ARGV[0]) || die("\nUsage: perl postXML.pl Query.xml\n\n");
>> my $xml;
>> while (<FH>) {
>>     $xml .= $_;
>> }
>> close(FH);
>>
>> my $path = "http://www.biomart.org/biomart/martservice?";
>> my $request = HTTP::Request->new("POST", $path, HTTP::Headers->new(), 'query='.$xml."\n");
>> my $ua = LWP::UserAgent->new;
>> $ua->timeout(30000000);
>> my $response;
>>
>> $ua->request($request,
>>     sub {
>>         my ($data, $response) = @_;
>>         if ($response->is_success) {
>>             print "$data";
>>         }
>>         else {
>>             warn("Problems with the web server: ".$response->status_line);
>>         }
>>     }, 500);
>>
>> I have managed to download all the sequences using the browser before,
>> but it required several tries and I had to get them gzipped (also so
>> I could be sure I had got all of them when gunzipping them).
>>
>> So, my question is: is there anything I can do to be able to download
>> all the sequences? I.e. avoid timeouts, some easy, systematic way to
>> split my calls into much smaller calls, or something else?
>>
>> Thanks,
>>
>> Elfar
>
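
P.S. The other bit of scripting I keep ending up with is a retry wrapper around the POST, since the 500 read timeouts are intermittent. Again only a sketch, not something I consider a proper fix: the retry count and the pause are arbitrary, and $ARGV[0] is the same Query.xml as in postXML.pl above.

use strict;
use warnings;
use LWP::UserAgent;

# Retry a martservice POST a few times before giving up.
sub post_with_retries {
    my ($ua, $url, $xml, $tries) = @_;
    for my $attempt (1 .. $tries) {
        my $response = $ua->post($url, { query => $xml });
        return $response->decoded_content if $response->is_success;
        warn "attempt $attempt failed: " . $response->status_line . "\n";
        sleep 60 if $attempt < $tries;    # arbitrary pause before retrying
    }
    return undef;
}

my $ua  = LWP::UserAgent->new(timeout => 600);
my $xml = do { local $/; open(my $fh, '<', $ARGV[0]) or die $!; <$fh> };
my $data = post_with_retries($ua, "http://www.biomart.org/biomart/martservice?", $xml, 5);
die "giving up\n" unless defined $data;
print $data;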
