Re: [BioMart Users] [biohackathon:270] Text-mining ontological data

Arek Kasprzyk Fri, 26 Aug 2011 06:36:51 -0700

Hi Martin,
This is great. It will be good enough for us.
Joachim: did you have any luck with collecting use cases for this project?


a

On Fri, Aug 26, 2011 at 1:16 AM, Martin Gerner
<[email protected]> wrote:
> Hi Arek,
>
> I can make sample data available for each of the entity types, if you are
> still interested. It's in an SQL table at the moment, so it'd be formatted
> as a simple tab-separated value file I imagine.
>
> -- Martin
>
> On 26/08/2011 02:16, Arek Kasprzyk wrote:
>>
>> Hi Goran,
>> is there a test dataset available right now that we could use for
>> quick prototyping?
>>
>> a
>>
>>
>> On Thu, Aug 25, 2011 at 1:03 PM, Goran Nenadic
>> <[email protected]>  wrote:
>>>
>>>>>> As for the data, we are finalising a datastore of mentions of genes,
>>>>>> proteins, species etc. for the entire Medline and (open-access) PMC -
>>>>>> e.g. some 80 million mentions of genes/proteins, associated with
>>>>>> anatomical location where available.
>>>>>
>>>>>   Sounds great! Will that be a continuation of www.pubmed2ensembl.org?
>>>
>>> The aim of pubmed2ensembl is to directly link genomic data (Ensembl) and
>>> biological literature (PubMed) on a large scale. There are links between
>>> over 2 million articles and nearly 150,000 genes from 50 species, and these
>>> are then integrated with other Ensembl data on these genes for integrated
>>> querying. One can query both PubMed for a particular term (e.g. disease,
>>> process) and get associated genomic data directly, or the other way around -
>>> e.g. use genomic coordinates as input and get associated papers. The data is
>>> available for browsing or download at www.pubmed2ensembl.org, and the
>>> associated paper was accepted two days ago (PLOS ONE), and is in production
>>> now :).
>>>
>>> The new dataset that Martin mentioned will be much larger, including links
>>> from some 200 million mentions of various 'things' in both PubMed and PubMed
>>> Central - from genes, species, anatomical terms etc., many of them linked to
>>> relevant databases and processes they are involved in. As I said, this
>>> dataset will be available very soon and it could be also potentially
>>> integrated in a biomart. I guess with all Joachim's efforts one would be
>>> also able to use SPARQL to query this data (if or once it's available via
>>> BioMart 0.8).
>>>
>>> Best,
>>> Goran
>>>
>>>
>>>
>
>
>
_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users
