Hi Juan et al.,

Yes, Phoenix has a search interface to ESGF data (but you can use other climate data archives as well).
Here are some preliminary screenshots: http://flyingpigeon.readthedocs.io/en/latest/tutorials/sdm.html

Best
Nils

On 01/06/2016 16:06, Juan M. Escamilla Molgora wrote:
> Hi Nils,
>
> Thank you for sharing!
>
> What is Phoenix about? Does it connect to the ESGF network? It's the
> first time I've read about this. Looks very, very interesting!
>
> Thanks everybody for this valuable feedback.
>
> Best wishes
>
> Juan
>
> On 01/06/16 10:09, Nils Hempelmann wrote:
>> Hi Juan et al.,
>>
>> Thanks a lot for triggering this discussion.
>> I am currently working on a Web Processing Service
>> (http://birdhouse.readthedocs.io/en/latest/) including a species
>> distribution model based on GBIF data (and climate model data). A
>> good connection to the GBIF database is still missing, so all hints are
>> quite useful!
>>
>> If you want to share code:
>> https://github.com/bird-house/flyingpigeon/blob/master/flyingpigeon/processes/wps_sdm.py
>>
>> Merci
>> Nils
>>
>> On 31/05/2016 22:08, Juan M. Escamilla Molgora wrote:
>>> Hi Tim,
>>>
>>> Thank you, especially for the DwC-A hint.
>>>
>>> The cells are by default in decimal degrees (WGS84), but the
>>> functions for generating them are general enough to use any
>>> projection supported by GDAL via PostGIS. It could be done on the
>>> fly or stored on the server side.
>>>
>>> I was thinking (daydreaming) of a standard way to encode unique
>>> but universal grids (similar to geohash or Open Location Code), but
>>> didn't find anything fast and ready. Maybe later :)
>>>
>>> I only use open-source software: Python, Django, GDAL, NumPy,
>>> PostGIS, Conda, Py2neo and ete2, among others.
>>>
>>> Currently I don't have an official release; the project is quite
>>> immature and unstable, and the installation can be non-trivial.
>>> I'm fixing all these issues, but it will take some time. Sorry
>>> for this.
>>>
>>> The GitHub repository is:
>>>
>>> https://github.com/molgor/biospytial.git
>>>
>>> And there's some very old documentation here:
>>>
>>> http://test.holobio.me/modules/gbif_taxonomy_class.html
>>>
>>> Please feel free to follow!
>>>
>>> Best wishes
>>>
>>> Juan
>>>
>>> P.S. The functions for generating the grid are in:
>>> biospytial/SQL_functions
>>>
>>> On 31/05/16 19:47, Tim Robertson wrote:
>>>> Thanks Juan
>>>>
>>>> You're quite right - you need the DwC-A download format to get
>>>> those IDs.
>>>>
>>>> Are the cells decimal degrees, then partitioned into smaller
>>>> units, or equal-area cells, or maybe UTM grids, or something else
>>>> perhaps? I am just curious.
>>>>
>>>> Are you developing this as OSS? I'd like to follow progress if
>>>> possible.
>>>>
>>>> Thanks,
>>>> Tim
>>>>
>>>> On 31 May 2016, at 20:31, Juan M. Escamilla Molgora
>>>> <j.escamillamolgora at lancaster.ac.uk> wrote:
>>>>
>>>>> Hi Tim,
>>>>>
>>>>> The grid is made by selecting a square area and dividing it into
>>>>> n x n subsquares, which form a partition of the bigger square.
>>>>>
>>>>> Each grid is a table in PostGIS, and there's a mapping between this
>>>>> table and a Django model (class).
>>>>>
>>>>> The class constructor has the attributes: id, cell and neighbours
>>>>> (next release).
>>>>>
>>>>> The cell is a polygon (square) and, with GeoDjango, inherits the
>>>>> properties of the osgeo module for polygons.
>>>>>
>>>>> I've tried to use the CSV data (downloaded as a CSV request), but
>>>>> I couldn't find a way to obtain the global IDs for each taxonomic
>>>>> level (idspecies, idgenus, idfamily, etc.).
>>>>>
>>>>> Do you know a way to obtain these fields?
>>>>>
>>>>> Thank you for your email and best wishes,
>>>>>
>>>>> Juan
>>>>>
>>>>> On 31/05/16 19:03, Tim Robertson wrote:
>>>>>> Hi Juan
>>>>>>
>>>>>> That sounds like a fun project!
>>>>>>
>>>>>> Can you please describe your grid / cells?
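[Editor's note: the grid construction Juan describes above (partitioning a square into n x n subsquares) can be sketched in plain Python. The function below is a hypothetical illustration, not code from biospytial; it returns each cell as a (minx, miny, maxx, maxy) tuple rather than a PostGIS geometry.]

```python
def make_grid(minx, miny, size, n):
    """Partition a square of side `size`, anchored at (minx, miny),
    into n x n equal subsquares that tile the bigger square.

    Returns a dict mapping a cell id to its bounding box
    (cell_minx, cell_miny, cell_maxx, cell_maxy).
    """
    step = size / float(n)
    cells = {}
    for i in range(n):       # column index (x direction)
        for j in range(n):   # row index (y direction)
            cell_id = j * n + i
            cells[cell_id] = (minx + i * step,
                              miny + j * step,
                              minx + (i + 1) * step,
                              miny + (j + 1) * step)
    return cells

# Example: a 1-degree square at the origin split into a 2 x 2 grid.
grid = make_grid(0.0, 0.0, 1.0, 2)
```

In a PostGIS setup, each tuple would become a `POLYGON` geometry in the grid table that the Django model maps onto.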
>>>>>>
>>>>>> Most likely your best bet will be to use the download API (as CSV
>>>>>> data) and ingest that. The other APIs will likely hit limits
>>>>>> (e.g. you can't page through indefinitely).
>>>>>>
>>>>>> Thanks,
>>>>>> Tim
>>>>>>
>>>>>> On 31 May 2016, at 18:55, Juan M. Escamilla Molgora
>>>>>> <j.escamillamolgora at lancaster.ac.uk> wrote:
>>>>>>
>>>>>>> Dear all,
>>>>>>>
>>>>>>> Thank you very much for your valuable feedback!
>>>>>>>
>>>>>>> I'll explain a bit of what I'm doing, just to clarify; sorry if
>>>>>>> this is spam to some.
>>>>>>>
>>>>>>> I want to build a model for species assemblages based on
>>>>>>> co-occurrence of taxa within an arbitrary area. I'm building a
>>>>>>> 2D lattice in which, for each cell, I'm collapsing the data (the
>>>>>>> occurrences) into a taxonomic tree. To do this I first need to
>>>>>>> obtain the data from the GBIF API and later, based on the IDs
>>>>>>> (or names) of each taxonomic level (from kingdom to occurrence),
>>>>>>> build a tree coupled to each cell.
>>>>>>>
>>>>>>> The implementation is done with PostgreSQL (PostGIS) for storing
>>>>>>> the raw GBIF data and Neo4j for storing the relation
>>>>>>>
>>>>>>> "is a member of the [species, genus, family, ...] [name/id]".
>>>>>>>
>>>>>>> The idea is to include data from different sources, similar to
>>>>>>> the project Matthew and Jennifer mentioned (in which I'm very
>>>>>>> interested and would like to hear more about), and traverse the
>>>>>>> network looking for significant merged information.
>>>>>>>
>>>>>>> One of the immediate problems I've found is importing big chunks
>>>>>>> of the GBIF data into my specification. Thanks to this thread
>>>>>>> I've found the tools most used by the community
>>>>>>> (pygbif, rgbif and python-dwca-reader). I was using urllib2 and
>>>>>>> things like that.
>>>>>>>
>>>>>>> I'll be happy to share any code or ideas with the people interested.
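[Editor's note: the per-cell "collapse occurrences into a taxonomic tree" step Juan describes above can be sketched without any database. The record fields below are a hypothetical subset of a GBIF occurrence row, chosen for illustration only.]

```python
from collections import defaultdict

def taxonomic_trees(records):
    """Collapse occurrence records into one nested taxonomic tree per cell.

    Each record is (cell_id, kingdom, family, species); the tree for a
    cell maps kingdom -> family -> species -> occurrence count.
    """
    trees = defaultdict(
        lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(int))))
    for cell_id, kingdom, family, species in records:
        trees[cell_id][kingdom][family][species] += 1
    return trees

# A few made-up occurrences spread over two grid cells.
occurrences = [
    (0, "Animalia", "Felidae", "Panthera onca"),
    (0, "Animalia", "Felidae", "Panthera onca"),
    (0, "Plantae", "Fagaceae", "Quercus robur"),
    (1, "Animalia", "Canidae", "Canis lupus"),
]
trees = taxonomic_trees(occurrences)
```

In the architecture Juan describes, each nesting level would instead become a "member of" edge in Neo4j, but the collapsing logic is the same.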
>>>>>>>
>>>>>>> Btw, I've checked the TinkerPop project, which uses the Gremlin
>>>>>>> traversal language independently of the DBMS.
>>>>>>>
>>>>>>> Perhaps it's possible to use it with Spark and GUODA as well?
>>>>>>>
>>>>>>> Is GUODA working now?
>>>>>>>
>>>>>>> Best wishes
>>>>>>>
>>>>>>> Juan
>>>>>>>
>>>>>>> On 31/05/16 17:02, Collins, Matthew wrote:
>>>>>>>>
>>>>>>>> Jorrit pointed out this thread to us at iDigBio. Downloading
>>>>>>>> and importing data into a relational database will work great,
>>>>>>>> especially if, as Jan said, you can cut the data size down to a
>>>>>>>> reasonable amount.
>>>>>>>>
>>>>>>>> Another approach we've been working on, in a collaboration
>>>>>>>> called GUODA [1], is to build an Apache Spark environment with
>>>>>>>> pre-formatted data frames containing common data sets for
>>>>>>>> researchers to use. This approach would offer a remote service
>>>>>>>> where you could write arbitrary Spark code, probably in Jupyter
>>>>>>>> notebooks, to iterate over data. Spark does a lot of cool stuff,
>>>>>>>> including GraphX, which might be of interest. This is definitely
>>>>>>>> pre-alpha at this point, and if anyone is interested I'd like
>>>>>>>> to hear your thoughts. I'll also be at SPNHC talking about this.
>>>>>>>>
>>>>>>>> One thing we've found in working on this is that importing data
>>>>>>>> into a structured data format isn't always easy. If you only
>>>>>>>> want a few columns, it'll be fine. But getting the data typing,
>>>>>>>> format standardization, and column name syntax of the whole
>>>>>>>> width of an iDigBio record right requires some code. I looked
>>>>>>>> to see if EcoData Retriever [2] had a GBIF data source; they
>>>>>>>> have an eBird one that you might find useful as a starting
>>>>>>>> point if you wanted to use someone else's code to download
>>>>>>>> and import data.
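[Editor's note: the typing problem Matthew describes above shows up even with a handful of columns. A minimal stdlib-only sketch follows; the column names are Darwin Core-style terms, but the sample values and coercions are made up for illustration, not iDigBio's actual schema.]

```python
import csv
import io

# A tiny DwC-style CSV as it might arrive from a download: every field
# is a string, and numeric fields may be empty.
RAW = """taxonKey,scientificName,decimalLatitude,decimalLongitude
5219426,Panthera onca,18.5,-95.1
2878688,Quercus robur,,
"""

def parse_occurrences(text):
    """Read CSV rows and coerce columns to proper types,
    mapping empty numeric fields to None instead of ''."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "taxonKey": int(row["taxonKey"]),
            "scientificName": row["scientificName"],
            "decimalLatitude":
                float(row["decimalLatitude"]) if row["decimalLatitude"] else None,
            "decimalLongitude":
                float(row["decimalLongitude"]) if row["decimalLongitude"] else None,
        })
    return rows

records = parse_occurrences(RAW)
```

Multiply this by the full width of a record (dates, nullable codes, free-text fields) and the "requires some code" point becomes clear.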
>>>>>>>>
>>>>>>>> For other data structures like BHL, we're kind of making stuff
>>>>>>>> up, since we're packaging a relational structure and not
>>>>>>>> something nearly as flat as the GBIF and DwC stuff.
>>>>>>>>
>>>>>>>> [1] http://guoda.bio/
>>>>>>>> [2] http://www.ecodataretriever.org/
>>>>>>>>
>>>>>>>> Matthew Collins
>>>>>>>> Technical Operations Manager
>>>>>>>> Advanced Computing and Information Systems Lab, ECE
>>>>>>>> University of Florida
>>>>>>>> 352-392-5414
>>>>>>>> ------------------------------------------------------------------------
>>>>>>>> *From:* jorrit poelen <jhpoelen at xs4all.nl>
>>>>>>>> *Sent:* Monday, May 30, 2016 11:16 AM
>>>>>>>> *To:* Collins, Matthew; Thompson, Alexander M; Hammock, Jennifer
>>>>>>>> *Subject:* Fwd: [API-users] Is there any NEO4J or graph-based
>>>>>>>> driver for this API ?
>>>>>>>>
>>>>>>>> Hey y'all:
>>>>>>>>
>>>>>>>> Interesting request below on the GBIF mailing list - sounds
>>>>>>>> like a perfect fit for the GUODA use cases.
>>>>>>>>
>>>>>>>> Would it be too early to jump onto this thread and share our
>>>>>>>> efforts/vision?
>>>>>>>>
>>>>>>>> thx,
>>>>>>>> -jorrit
>>>>>>>>
>>>>>>>>> Begin forwarded message:
>>>>>>>>>
>>>>>>>>> *From:* Jan Legind <jlegind at gbif.org>
>>>>>>>>> *Subject:* Re: [API-users] Is there any NEO4J or graph-based
>>>>>>>>> driver for this API ?
>>>>>>>>> *Date:* May 30, 2016 at 5:48:51 AM PDT
>>>>>>>>> *To:* Mauro Cavalcanti <maurobio at gmail.com>, "Juan M.
>>>>>>>>> Escamilla Molgora" <j.escamillamolgora at lancaster.ac.uk>
>>>>>>>>> *Cc:* api-users at lists.gbif.org
>>>>>>>>>
>>>>>>>>> Dear Juan,
>>>>>>>>>
>>>>>>>>> Unfortunately we have no tool for creating this kind of
>>>>>>>>> SQL-like query against the portal. I am sure you are aware
>>>>>>>>> that the filters in the occurrence search pages can be
>>>>>>>>> applied in combination in numerous ways.
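[Editor's note: the programmatic "user downloads" Jan goes on to describe are driven by a JSON predicate posted to the GBIF occurrence download API. A hedged sketch of building such a request body follows; the taxon keys and polygon are made up, and the exact field names should be checked against the GBIF developer documentation.]

```python
import json

def build_download_predicate(creator, taxon_keys, polygon_wkt):
    """Build a GBIF occurrence-download request body that combines a
    list of taxa with a polygon filter (an AND of two predicates)."""
    return {
        "creator": creator,  # your GBIF username
        "predicate": {
            "type": "and",
            "predicates": [
                {"type": "in", "key": "TAXON_KEY",
                 "values": [str(k) for k in taxon_keys]},
                {"type": "within", "geometry": polygon_wkt},
            ],
        },
    }

# Hypothetical taxon keys and a small polygon in WKT.
request_body = build_download_predicate(
    "my_username",
    [2435099, 5219404],
    "POLYGON((-96 17, -94 17, -94 19, -96 19, -96 17))",
)
payload = json.dumps(request_body)  # ready to POST to the downloads endpoint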
>>>>>>>>> The API can go even further in this regard [1], but it is not
>>>>>>>>> well suited for retrieving occurrence records, since there is
>>>>>>>>> a 200,000-record ceiling, making it unfit for species
>>>>>>>>> exceeding this number.
>>>>>>>>>
>>>>>>>>> There are going to be updates to the pygbif package [2] in the
>>>>>>>>> near future that will enable you to launch user downloads
>>>>>>>>> programmatically, where a whole list of different species can
>>>>>>>>> be used as a query parameter, as well as adding polygons. [3]
>>>>>>>>>
>>>>>>>>> In the meantime, Mauro's suggestion is excellent. If you can
>>>>>>>>> narrow your search down until it returns a manageable download
>>>>>>>>> (say, less than 100 million records), importing it into a
>>>>>>>>> database should be doable. From there, you can refine using
>>>>>>>>> SQL queries.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Jan K. Legind, GBIF Data Manager
>>>>>>>>>
>>>>>>>>> [1] http://www.gbif.org/developer/occurrence#search
>>>>>>>>> [2] https://github.com/sckott/pygbif
>>>>>>>>> [3] https://github.com/jlegind/GBIF-downloads
>>>>>>>>>
>>>>>>>>> *From:* API-users [mailto:api-users-bounces at lists.gbif.org]
>>>>>>>>> *On Behalf Of* Mauro Cavalcanti
>>>>>>>>> *Sent:* 30. maj 2016 14:06
>>>>>>>>> *To:* Juan M. Escamilla Molgora
>>>>>>>>> *Cc:* api-users at lists.gbif.org
>>>>>>>>> *Subject:* Re: [API-users] Is there any NEO4J or graph-based
>>>>>>>>> driver for this API ?
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> One solution I have successfully adopted for this is to
>>>>>>>>> download the records (either "manually" via browser or, better
>>>>>>>>> yet, with a Python script using the fine pygbif library),
>>>>>>>>> store them in a MySQL or SQLite database, and then perform the
>>>>>>>>> relational queries. I can provide examples if you are
>>>>>>>>> interested.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> 2016-05-30 8:59 GMT-03:00 Juan M.
>>>>>>>>> Escamilla Molgora <j.escamillamolgora at lancaster.ac.uk>:
>>>>>>>>>
>>>>>>>>> Hola,
>>>>>>>>>
>>>>>>>>> Is there any API for making relational queries on taxonomy,
>>>>>>>>> location or timestamp?
>>>>>>>>>
>>>>>>>>> Thank you and best wishes
>>>>>>>>>
>>>>>>>>> Juan
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> API-users mailing list
>>>>>>>>> API-users at lists.gbif.org
>>>>>>>>> http://lists.gbif.org/mailman/listinfo/api-users
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Dr. Mauro J. Cavalcanti
>>>>>>>>> E-mail: maurobio at gmail.com
>>>>>>>>> Web: http://sites.google.com/site/maurobio
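[Editor's note: Mauro's download-then-query workflow above can be sketched with nothing but the Python standard library. The table layout and sample rows below are made up for illustration; a real GBIF CSV download has many more columns.]

```python
import sqlite3

# Hypothetical, trimmed-down occurrence rows as they might come out of
# a GBIF CSV download: species, family, latitude, longitude, event date.
ROWS = [
    ("Panthera onca", "Felidae", -3.4653, -62.2159, "2015-07-01"),
    ("Panthera onca", "Felidae", 18.5000, -95.1000, "2016-01-12"),
    ("Canis lupus",   "Canidae", 52.5200,  13.4050, "2014-03-08"),
]

conn = sqlite3.connect(":memory:")  # use a file path for a persistent DB
conn.execute("""CREATE TABLE occurrence (
    species   TEXT,
    family    TEXT,
    lat       REAL,
    lon       REAL,
    eventdate TEXT)""")
conn.executemany("INSERT INTO occurrence VALUES (?, ?, ?, ?, ?)", ROWS)

# A relational query combining taxonomy, location and timestamp --
# exactly the kind of filter Juan asked about.
cur = conn.execute(
    """SELECT species, COUNT(*)
       FROM occurrence
       WHERE family = 'Felidae'
         AND lat BETWEEN -10 AND 20
         AND eventdate >= '2015-01-01'
       GROUP BY species""")
result = cur.fetchall()
```

The same schema works unchanged in MySQL, and indexing `family`, `lat`/`lon` and `eventdate` keeps these queries fast on larger downloads.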