On 21 Jul 2006, at 22:49, Tom Oinn wrote:
Hi guys,
I know this isn't really a mart issue so much as a 'use of mart' one
but can people please bear in mind the implications of, say, renaming
all the filters in a mart? The (at least) human genome sequence
database at ensembl seems to have done this recently, the effect is
that all workflows which used to work on this data set have started
failing.
While 'chromosome_name_filter' is indeed prettier than
'chr_name_filter' oddly enough my code can't magically tell that these
are meant to be the same thing and so it breaks things. As the things
it's breaking are workflows written by our users this is a bad thing
mkay?
Any change to an API (and the mart schemas fall into this category the
moment you expose a query service) MUST be announced somewhere, it's
not acceptable for a public service to shift around like this with no
notice. Same goes for changes to the mart metadata, there should be a
channel by which you can notify us (or, better, our users) of these
changes.
As it is I've just spent an hour debugging a workflow that was working
fine a couple of weeks ago, I'm quite tolerant (hey, I'm used to
debugging distributed systems) but our users aren't, too many
occasions like this and they give up on us and they give up on biomart
by implication.
Hi Tom,
the major changes were announced a while back
http://listserver.ebi.ac.uk/mailing-lists-archives/mart-dev/msg00135.html.
However, this probably did not include the latest tweaks as these
happen all the time.
We of course try to minimize changes as much as we can for the services
that we have control over (i.e. Ensembl), but we definitely
cannot guarantee that such changes will not occur at some point in the
future. Bearing in mind that the great majority of
data sources evolve independently from us, changes of this
kind will certainly happen again.
BioMart has been designed in such a way that the buffer for such
changes is the dataset configuration file,
which describes what is _currently_ available in the database and
guarantees that anything in this file can be used to construct a valid
query.
These files were designed precisely for this purpose, so that the query
logic can be decoupled from evolving data.
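As a rough illustration of that decoupling, a client can derive the set of valid names from the configuration each time and check a query against it before sending it. The sketch below is only that: the XML snippet, the element names, the `internalName` attribute, the made-up names `band_start_filter` and `gene_id`, and both helper functions are assumptions for the example, not a statement of the real configuration format.

```python
import xml.etree.ElementTree as ET

# Illustrative stand-in for a dataset configuration file. The element and
# attribute names are assumptions for this sketch, not the real schema.
SAMPLE_CONFIG = """
<DatasetConfig dataset="example_dataset">
  <FilterDescription internalName="chromosome_name_filter"/>
  <FilterDescription internalName="band_start_filter"/>
  <AttributeDescription internalName="gene_id"/>
</DatasetConfig>
"""


def names_in_config(xml_text, tag):
    """Collect the internalName of every element with the given tag."""
    root = ET.fromstring(xml_text)
    return {el.get("internalName") for el in root.iter(tag)}


def unknown_names(query_filters, query_attributes, xml_text):
    """Return (filters, attributes) used by a query but absent from the config."""
    valid_filters = names_in_config(xml_text, "FilterDescription")
    valid_attrs = names_in_config(xml_text, "AttributeDescription")
    return (
        sorted(set(query_filters) - valid_filters),
        sorted(set(query_attributes) - valid_attrs),
    )


bad_filters, bad_attrs = unknown_names(
    ["chr_name_filter"], ["gene_id"], SAMPLE_CONFIG
)
print("unknown filters:", bad_filters)
print("unknown attributes:", bad_attrs)
```

A check like this fails early with a clear message about which name is no longer in the configuration, rather than failing at query time.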
The problems that you encountered typically arise in the following
scenarios:
1. You have filter and attribute names hardcoded somewhere in your
system
and/or
2. You forgot to update to the latest configuration, e.g. you are using
a cached config v 38 against database v 39 or something of that sort.
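For scenario (1), one possible mitigation on the client side is to compare the names stored in a saved workflow with the names in the current configuration and offer the closest current name as a hint. This is a minimal sketch using Python's standard `difflib`; the function `suggest_renames` and the name `band_start_filter` are hypothetical, and a suggestion is only a guess for a human to confirm, not a safe automatic rename.

```python
import difflib


def suggest_renames(saved_names, current_names, cutoff=0.6):
    """Map each saved name missing from the config to its closest current name.

    The value is None when nothing is similar enough to suggest.
    """
    current = list(current_names)
    suggestions = {}
    for name in saved_names:
        if name in current:
            continue  # still valid, nothing to report
        close = difflib.get_close_matches(name, current, n=1, cutoff=cutoff)
        suggestions[name] = close[0] if close else None
    return suggestions


# Names as they might be stored in an old workflow vs. the current config.
saved = ["chr_name_filter", "band_start_filter"]
current = ["chromosome_name_filter", "band_start_filter"]
print(suggest_renames(saved, current))
```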
I, of course, do not know enough about Taverna to see whether either of
these applies to your situation
or whether you are using the software in some other way that causes this
problem (in which case we
would like to know how).
Could you give us more detail? We'll try to see if we can somehow
safeguard against this situation
in the future so that BioMart and Taverna can work smoothly together
:-) If (2) is the problem, then you need
to check the db versions in the registry file and, if they differ from
your local config, pull out a new one.
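The version check for (2) could look roughly like the sketch below. It assumes the versions have already been read into plain dictionaries keyed by dataset name; how they are read from the registry file and the local cache is left out, and `stale_datasets` and `example_dataset` are hypothetical names.

```python
def stale_datasets(registry_versions, cached_versions):
    """Return dataset names whose cached config version differs from the registry's.

    A dataset with no cached config at all is also reported as stale.
    """
    return sorted(
        name
        for name, version in registry_versions.items()
        if cached_versions.get(name) != version
    )


# Database version advertised by the registry vs. version of the cached config.
registry = {"example_dataset": "39"}
cache = {"example_dataset": "38"}
for name in stale_datasets(registry, cache):
    print(f"{name}: cached config is out of date, fetch a fresh one")
```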
a.
Tom
-------------------------------------------------------------------------------
Arek Kasprzyk
EMBL-European Bioinformatics Institute.
Wellcome Trust Genome Campus, Hinxton,
Cambridge CB10 1SD, UK.
Tel: +44-(0)1223-494606
Fax: +44-(0)1223-494468
-------------------------------------------------------------------------------