On 22 Jul 2006, at 13:35, [EMAIL PROTECTED] wrote:
BioMart has been designed in such a way that the buffer for such
changes is the dataset configuration file, which describes what is
_currently_ available in the database and guarantees that anything in
this file can be used to construct a valid query.
These files were designed precisely for this purpose: so that the query
logic can be decoupled from evolving data.
Assuming that you always start from scratch, that's great. The problem
is that by its nature we store query specifications in the workflow.
These specifications include filter and attribute names (how else could
they do it?), and users expect that a query written against version x of
database y shouldn't suddenly stop working.
Our interface will pick up the new configuration, sure; it's not
hardcoded in our code. But that doesn't help a user who writes a
workflow and then comes back the next day to find that it randomly
fails. You need to take the potential for stored queries into account.
OK, in this case you may find the versioning helpful. It is coming to
the central server from 0.5, and it is something that the Bioconductor
community originally asked for. We'll be able to serve multiple versions
of the databases, so a workflow with hardcoded filters and attributes
from version 39 will still be able to run against v39. It is guaranteed
to work 'forever', rather than relying on the most recent published
version.
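To make the stored-query case concrete, here is a minimal sketch of a
workflow step that records the version it was written against and checks
its filter and attribute names against a listing before running. The
StoredQuery class, the missing_names helper and the example names are all
hypothetical illustrations, not part of the BioMart API.

```python
from dataclasses import dataclass


@dataclass
class StoredQuery:
    """A query specification as a workflow might store it (hypothetical)."""
    dataset: str
    version: int        # database version the query was written against
    filters: dict       # filter name -> value
    attributes: tuple   # attribute names to retrieve


def missing_names(query, available_filters, available_attributes):
    """Return (filters, attributes) used by the query but absent from a listing."""
    missing_filters = sorted(set(query.filters) - set(available_filters))
    missing_attributes = sorted(set(query.attributes) - set(available_attributes))
    return missing_filters, missing_attributes


# Illustrative names only; a real listing would come from the server.
query = StoredQuery(
    dataset="example_dataset",
    version=39,
    filters={"chromosome_name": "1"},
    attributes=("ensembl_gene_id", "old_xref"),
)
print(missing_names(query, ["chromosome_name"], ["ensembl_gene_id"]))
```

If both lists come back empty, the stored query can run as written;
otherwise the workflow can stay pinned to the version it recorded.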
The other thing that you may find helpful (also coming with 0.5) is the
listings of filters and attributes, so you can easily diff the versions
and see for yourself what has changed.
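As a rough sketch of such a diff, assuming each listing can be reduced
to a plain list of names (the listing format is not specified here, and
diff_listing and the sample names are made up for illustration):

```python
def diff_listing(old_names, new_names):
    """Compare two listings of filter or attribute names.

    Returns the names that disappeared and the names that appeared
    between the old and the new version, each sorted alphabetically.
    """
    old, new = set(old_names), set(new_names)
    return {"removed": sorted(old - new), "added": sorted(new - old)}


# Hypothetical attribute listings for two consecutive versions.
attributes_old = ["ensembl_gene_id", "chromosome_name", "old_xref"]
attributes_new = ["ensembl_gene_id", "chromosome_name", "new_xref"]

changes = diff_listing(attributes_old, attributes_new)
print("removed:", changes["removed"])
print("added:", changes["added"])
```

Note that a rename shows up as one removal plus one addition, so the
diff alone cannot tell a rename from an unrelated drop and add.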
BTW, the changes described in your email are rare anyway and do not
normally occur; they were only associated with our 'big cleaning up'
activities, as per my archive email. The changes which you may encounter
more frequently are the presence or absence of certain attributes (not
renaming), but these are usually limited to things like xrefs, while
everything else remains stable. For this, however, you need to take your
plea to the Ensembl team, as we only pick up the things that they map for
every release, and this is something that we do not have any control
over.
a.
Tom
-------------------------------------------------------------------------------
Arek Kasprzyk
EMBL-European Bioinformatics Institute.
Wellcome Trust Genome Campus, Hinxton,
Cambridge CB10 1SD, UK.
Tel: +44-(0)1223-494606
Fax: +44-(0)1223-494468
-------------------------------------------------------------------------------