On Wed, Jul 14, 2010 at 5:02 PM, Syed Haider <[email protected]> wrote:
>
>
> On 14/07/2010 15:46, Leandro Hermida wrote:
>>
>> Hi again,
>>
>> In the new BioMart 0.8 will the SOAP and REST APIs have:
>> - an option to return results in JSON or other serialized data structure
>> form?
>
> tentative yes for results request. For all other API calls (meta data calls),
> a definite yes. For the former, there is very little point to e.g. wrap 1000
> bytes of gene ids in 20,000 bytes of JSON.
>
good point, but many times you are returning much more than that, records
with many attributes

>> - an option to return results sorted by some attribute(s)?
>
> no, that's a post processing option and tends to be very expensive as it
> needs all results to be collected in the first place. we can make it
> optional though. BioMart web interface would have this option for sure.
>

why not let the database do these things? (i.e. ... ORDER BY x1 ASC, y1 DESC,
z1 ASC ) I noticed that also in the current 0.7 you do many things
post-processed in Perl, e.g. unique rows are processed in Perl after
returning database results, why not just use SELECT DISTINCT ....?

>> - an option to return results with LIMITs in full form i.e. start_row,
>> end_row (for paging)?
>
> you will have limit as offset of zero. e.g. you can retrieve first 100,
> first 1000, first 10000 and so on.

again why not let the database do it? ( e.g. ... LIMIT 100,500 )

>
> HTH,
> Syed
>
>>
>> best,
>> Leandro
>>
>> On Wed, Jul 14, 2010 at 4:32 PM, Leandro Hermida
>> <[email protected]> wrote:
>>>
>>> Hi Syed,
>>>
>>> Since none of the BioMart APIs actually return results in a data
>>> structure (they only return formatted files like TSV, etc.) I was trying
>>> to be helpful and show other developers on this forum how they can go
>>> about populating a Perl data structure from the results returned by
>>> BioMart.
>>>
>>> It's not obvious after reading the docs and when you get started how
>>> you need to do this. One initially expects that there would be, e.g. in
>>> the Perl API, some method call ->getResults() which returns an @array of
>>> arrayrefs structure, or in the REST API an option to return e.g. a JSON
>>> serialized data structure that can be unserialized into a native data
>>> structure for the language you are using.
>>>
>>> best,
>>> Leandro
>>>
>>> On Wed, Jul 14, 2010 at 3:21 PM, Syed Haider <[email protected]> wrote:
>>>>
>>>> Hi Leandro,
>>>>
>>>> this is the only method that returns the results. What exactly are you
>>>> after?
>>>>
>>>> Best
>>>> Syed
>>>>
>>>> On 14/07/2010 13:14, Leandro Hermida wrote:
>>>>>
>>>>> Sorry forgot to post what I did before! For those of you who use the
>>>>> BioMart APIs and want to get results back into Perl data structures,
>>>>> here is the approach I use:
>>>>>
>>>>> If using the Perl API:
>>>>>
>>>>> use BioMart::Initializer;
>>>>> use BioMart::Query;
>>>>> use BioMart::QueryRunner;
>>>>>
>>>>> my $bm_initializer = BioMart::Initializer->new(
>>>>>     registryFile => "/path/to/myRegistry.xml",
>>>>>     action       => 'update',
>>>>> );
>>>>> my $bm_query = BioMart::Query->new(
>>>>>     registry          => $bm_initializer->getRegistry(),
>>>>>     virtualSchemaName => 'default'
>>>>> );
>>>>> $bm_query->setDataset('my_dataset');
>>>>> $bm_query->addFilter('attr1', ['Q6LTE1']);
>>>>> $bm_query->addAttribute('attr2');
>>>>> $bm_query->addAttribute('attr3');
>>>>> $bm_query->formatter('TSV');
>>>>> my $bm_query_runner = BioMart::QueryRunner->new();
>>>>> $bm_query_runner->uniqueRowsOnly(1);
>>>>> $bm_query_runner->execute($bm_query);
>>>>> open(RESULTS, '+>', \my $results) or die "$!\n";
>>>>> $bm_query_runner->printResults(\*RESULTS);
>>>>> seek(RESULTS, 0, 0);
>>>>> while (<RESULTS>) {
>>>>>     chomp;
>>>>>     my @row_fields = split /\t/;
>>>>>     # build up a data structure or process your fields here...
>>>>> }
>>>>> close(RESULTS);
>>>>>
>>>>>
>>>>> Using the REST API:
>>>>>
>>>>> use LWP::UserAgent ();
>>>>>
>>>>> my $query_xml = <<XML;
>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>> <!DOCTYPE Query>
>>>>> <Query virtualSchemaName="default" formatter="TSV" header="0"
>>>>> uniqueRows="1" count="" datasetConfigVersion="0.7">
>>>>>   <Dataset name="my_dataset" interface="default">
>>>>>     <Filter name="attr1" value="Q6LTE1"/>
>>>>>     <Attribute name="attr2" />
>>>>>     <Attribute name="attr3" />
>>>>>   </Dataset>
>>>>> </Query>
>>>>> XML
>>>>>
>>>>> my $ua = LWP::UserAgent->new();
>>>>> my $response =
>>>>>     $ua->post('http://myserver.mydomain:9002/biomart/martservice',
>>>>>               [ query => $query_xml ]);
>>>>> if ($response->is_success and $response->decoded_content !~
>>>>>     /BioMart::Exception/i) {
>>>>>     open(RESULTS, '<', \$response->decoded_content) or die "$!\n";
>>>>>     while (<RESULTS>) {
>>>>>         chomp;
>>>>>         my @row_fields = split /\t/;
>>>>>         # build up a data structure or process your fields here...
>>>>>     }
>>>>>     close(RESULTS);
>>>>> }
>>>>> else {
>>>>>     die $response->decoded_content, "\n";
>>>>> }
>>>>>
>>>>>
>>>>> On Thu, Jun 10, 2010 at 12:03 AM, Syed Haider <[email protected]> wrote:
>>>>>>
>>>>>> Hi Leandro,
>>>>>>
>>>>>> The data structure representation of results is not returned by the
>>>>>> API. If you are feeling adventurous please feel free to look into the
>>>>>> lib/BioMart/Formatter/ directory for the appropriate formatter that
>>>>>> you are interested in.
>>>>>>
>>>>>>
>>>>>> Best
>>>>>> Syed
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 09/06/2010 17:51, Leandro Hermida wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I was wondering if there is a way using the Perl API to get results
>>>>>>> in a Perl data structure and, if possible, row by row. For example
>>>>>>> each row returned as an array or arrayref.
>>>>>>> It seems inefficient to take
>>>>>>> printResults() and have to break everything up again when I know
>>>>>>> somewhere in the Perl API it was doing the reverse...
>>>>>>>
>>>>>>> thanks,
>>>>>>> Leandro
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>
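
The parsing step discussed in this thread (turning TSV text into an array of
arrayrefs) can be sketched on its own, assuming the results have already been
captured as a string, either from printResults() through an in-memory
filehandle or from the martservice response. This is only a sketch:
parse_tsv_rows is a hypothetical helper name, not part of the BioMart API, and
the sample data is made up. One difference from the loops quoted above is the
-1 limit passed to split, since a plain split /\t/ drops trailing empty
fields, which would shorten rows whose last attributes are empty.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the BioMart API): turn TSV text into an
# arrayref of arrayrefs, one inner arrayref per result row.
sub parse_tsv_rows {
    my ($tsv_text) = @_;
    my @rows;

    # Open a read-only in-memory filehandle on the string.
    open(my $fh, '<', \$tsv_text)
        or die "Cannot open in-memory filehandle: $!\n";

    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';    # skip blank lines

        # A limit of -1 keeps trailing empty fields, so rows with empty
        # attribute values keep the same number of columns as other rows.
        push @rows, [ split /\t/, $line, -1 ];
    }
    close($fh);

    return \@rows;
}

# Made-up sample data, for illustration only.
my $sample = "Q6LTE1\tvalue2\tvalue3\nQ6LTE2\tvalue2\t\n";
my $rows   = parse_tsv_rows($sample);

printf "%d rows, %d fields in the last row\n",
    scalar @$rows, scalar @{ $rows->[-1] };
```

With either approach quoted above, the same helper could be called on
$results (after printResults() has written into it) or on
$response->decoded_content.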
