Jonathan O'Donnell wrote:
Hi WSG'ers
In general, data (including metadata) should be stored in one place
only. This prevents drift: if it is only stored in one place, it can
only be updated in that place.
Often, the information that we want to store as metadata already
appears in the Web page. Examples include the title, description
(especially as opening paragraph) and the author's name. In footers,
we often find rights information, the URL, and date information.
If this information already exists in the data, and we replicate it in
the metadata, there is the danger of drift. Perhaps pointing to the
data from the metadata fields is a way of preventing drift, and
ensuring that the metadata is as up-to-date as the data.
** Method **
Hi Jonathan,
Given what you have said here, and what I would expect to see in serious
authoring tools and CMSs, I think this area is generally neglected in
most publishing tools (last time I looked).
Quite a few CMSs claim to be DC compliant, but as you mentioned,
do they actually store the data in one place, and not in the web pages?
Is it part of the workflow and version control of the documents? I
don't think so. I'd be glad if anyone can point me to a product that
does address this need.
For a CMS to address this properly, it needs to have incorporated a
normalised schema based on DC into its database. This way, all the
pages published from the system can incorporate the various metadata, as
well as "alt" and "longdesc" for images.
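
As a rough illustration of what "a normalised schema based on DC" could
look like, here is a minimal sketch using Python's built-in sqlite3
module. The table and column names, the demo data and the meta_tags()
helper are all made up for the example and do not come from any real
CMS; image "alt" and "longdesc" values are not modelled here.

```python
import html
import sqlite3

# Hypothetical schema: one row per document, one row per DC element value.
SCHEMA = """
CREATE TABLE document (
    id  INTEGER PRIMARY KEY,
    url TEXT NOT NULL UNIQUE
);
CREATE TABLE dc_element (
    document_id INTEGER NOT NULL REFERENCES document(id),
    name        TEXT NOT NULL,   -- DC element name, e.g. 'title', 'creator'
    value       TEXT NOT NULL
);
"""


def demo_connection():
    """Build an in-memory database holding one example page (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO document (id, url) VALUES (1, '/about.html')")
    conn.executemany(
        "INSERT INTO dc_element (document_id, name, value) VALUES (1, ?, ?)",
        [("title", "About us"), ("creator", "A. Author")],
    )
    return conn


def meta_tags(conn, url):
    """Render the stored DC values for one page as HTML meta elements."""
    rows = conn.execute(
        "SELECT e.name, e.value FROM dc_element e "
        "JOIN document d ON d.id = e.document_id "
        "WHERE d.url = ? ORDER BY e.rowid",
        (url,),
    ).fetchall()
    return [
        '<meta name="DC.%s" content="%s">' % (name, html.escape(value, quote=True))
        for name, value in rows
    ]
```

The point of the design is that the page template calls something like
meta_tags() at publish time, so the values live only in the database
and are never typed into the page by hand.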
Many organisations are legally required to keep snapshots of their
published data as it stood at any given time. A publishing system based
on DC not only allows this, but also allows a complete analysis of all
the subcomponents of a document and its various contributors.
That also leads to problems with document management systems that manage
their metadata from properties within the documents and from network
environment variables.
Last time I tried to extract metadata from MS Word, using Perl and
Python, I could only get the standard set of properties; any data in
custom properties was unretrievable (at least by me). I don't know what
OpenOffice or the latest MS Office offers.
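
For what it is worth, here is a minimal sketch of reading custom
properties, assuming the zip-based .docx format (not the older binary
.doc files) and assuming the custom properties part sits at its
conventional location, docProps/custom.xml. The custom_properties()
name is made up for the example, and every value is returned as plain
text regardless of its declared type.

```python
import zipfile
import xml.etree.ElementTree as ET

# Conventional part name for custom properties in a .docx package (assumed).
CUSTOM_PART = "docProps/custom.xml"
NS = {
    "cp": "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"
}


def custom_properties(docx):
    """Return {name: text value} for the custom properties in a .docx file.

    `docx` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as archive:
        if CUSTOM_PART not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read(CUSTOM_PART))

    props = {}
    for prop in root.findall("cp:property", NS):
        # Each property wraps a single typed child such as vt:lpwstr or vt:i4.
        children = list(prop)
        props[prop.get("name")] = children[0].text if children else None
    return props
```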
But I don't think asking users to maintain this data will work, unless
they are librarians. I think that it has to be as automated and as
transparent to the user as possible, because most users are just not
interested in this level of site QA, unless it is an important component
of their job.
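
To make the automation idea a bit more concrete, here is a small sketch
that derives two metadata values from what is already in the page, along
the lines Jonathan suggested (title, and the opening paragraph as the
description). It uses only Python's standard html.parser module; the
PageFacts class, the derive_metadata() function and the choice of
"first <p> equals description" are assumptions made for the example.

```python
from html.parser import HTMLParser


class PageFacts(HTMLParser):
    """Collect the <title> text and the text of the first <p> in a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.first_paragraph = ""
        self._target = None   # name of the attribute currently being filled
        self._seen_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._target = "title"
        elif tag == "p":
            # Only the first paragraph is captured; a later <p> also ends
            # an earlier paragraph that was never explicitly closed.
            self._target = None if self._seen_p else "first_paragraph"
            self._seen_p = True

    def handle_endtag(self, tag):
        if tag in ("title", "p"):
            self._target = None

    def handle_data(self, data):
        if self._target:
            setattr(self, self._target, getattr(self, self._target) + data)


def derive_metadata(page_html):
    """Return DC-style metadata taken from the page content itself."""
    parser = PageFacts()
    parser.feed(page_html)
    parser.close()
    return {
        # Collapse runs of whitespace so the values fit on one line.
        "DC.title": " ".join(parser.title.split()),
        "DC.description": " ".join(parser.first_paragraph.split()),
    }
```

Because the values are read from the page each time it is published,
there is no second copy for the author to keep in step.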
Regards
Geoff Deering
******************************************************
The discussion list for http://webstandardsgroup.org/
See http://webstandardsgroup.org/mail/guidelines.cfm
for some hints on posting to the list & getting help
******************************************************