Hi
We are setting up a web resource for a draft genome. We have GBrowse up
and running without problems, but now want the same data (sequence and
basic annotation) to be available via BioMart. I am not a
computer/bioinformatics 'expert'. I'm a biologist learning to use
computers, so I am at the mercy of the quality of the available
documentation for these resources. Pointing me to an existing Mart
doesn't really help me work out how to go from nothing to a working Mart.
I have searched the mailing list for everything relevant I can think of.
I have even found *identical* requests to this one, but nothing has come
of those posts (other than 'look at one of our existing Ensembl marts').
What I want to know is how to go from a starting point of having
a) genomic, transcript and protein FASTA files, b) a GFF3 file for
transcripts and c) a table of gene annotation information to a working
BioMart. I imagine this is a common task, so I am surprised there is no
documentation on it. As part of the GMOD project, it would be great to
see an integrated tutorial on setting up GBrowse and getting the same
data into a working Mart.
I have looked at and tried to use the gff2biomart script (various
versions of it) but after much effort got nowhere. I have also asked
someone far more familiar with MySQL and Perl to have a go. They got a
lot further than I did, but still not to a working Mart. We both managed
to get simple table structures for location and annotation information
working, but the sequence side of things is a complete mystery.
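To make the 'simple table' part concrete, here is a minimal sketch of the
kind of location table I mean. It is not our actual code and not the
gff2biomart approach. It assumes mRNA features carry ID and Parent
attributes, uses SQLite (from the Python standard library) in place of
MySQL so that it runs anywhere, and the table and column names are made
up for illustration. It does not follow any BioMart naming convention and
says nothing about the sequence side.

```python
import sqlite3
from urllib.parse import unquote


def parse_gff3_attributes(field):
    """Turn 'ID=x;Parent=y' into a dict (values are URL-decoded, as in GFF3)."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def gff3_rows(lines, feature_type="mRNA"):
    """Yield one tuple per feature of the requested type."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # anything after this directive is embedded sequence
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and other directives
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        attrs = parse_gff3_attributes(cols[8])
        yield (attrs.get("ID"), attrs.get("Parent"), cols[0],
               int(cols[3]), int(cols[4]), cols[6])


def load_locations(conn, lines):
    """Create the (hypothetical) location table and fill it from GFF3 lines."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transcript_location ("
        " transcript_id TEXT PRIMARY KEY,"
        " gene_id TEXT,"
        " seq_id TEXT,"
        " start_pos INTEGER,"
        " end_pos INTEGER,"
        " strand TEXT)"
    )
    rows = list(gff3_rows(lines))
    conn.executemany(
        "INSERT INTO transcript_location VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


# Invented example records, only to show the expected input layout.
EXAMPLE_GFF3 = [
    "##gff-version 3",
    "scaffold_1\texample\tgene\t100\t900\t.\t+\t.\tID=gene0001",
    "scaffold_1\texample\tmRNA\t100\t900\t.\t+\t.\tID=gene0001.1;Parent=gene0001",
    "scaffold_1\texample\texon\t100\t400\t.\t+\t.\tID=exon1;Parent=gene0001.1",
    "scaffold_2\texample\tmRNA\t50\t750\t.\t-\t.\tID=gene0002.1;Parent=gene0002",
]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(load_locations(conn, EXAMPLE_GFF3), "transcripts loaded")
    for row in conn.execute(
        "SELECT * FROM transcript_location ORDER BY transcript_id"
    ):
        print(row)
```

With a real file, the list would be replaced by an open file handle, e.g.
load_locations(conn, open("transcripts.gff3")), where the file name is
again just a placeholder.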
So, is there any documentation available and, if not, can anyone help me
go from FASTA+GFF3+annotation table to a working Mart? If someone can
help, I am more than willing to write up the final process as a
walk-through guide so that others benefit.
Thanks
Nathaniel
--
Nathaniel Street
Umeå Plant Science Centre
Department of Plant Physiology
University of Umeå
SE-901 87 Umeå
SWEDEN
email: [email protected]
tel: +46-90-786 5473
fax: +46-90-786 6676
www.popgenie.org