Deborah-

  I can only speak for myself, but I have a somewhat different
interpretation of several terms in play here: Fedora's "internal"
datastream is not called oai_dc; it's called "DC", and uses an element
with the namespace prefix "oai_dc" as a container.  This doesn't seem
to be an over-reach, as some (and perhaps most) Fedora installations
are not using the OAI publication framework.  It is a default, so it
seems appropriate for proai to fall back to that in the absence of
other configuration.

  The reservation of oai_dc as a metadata prefix in the OAI-PMH spec
doesn't restrict the use of the namespace; it indicates that the use
of that prefix or namespace in an OAI-PMH response is reserved to the
expression of unqualified Dublin Core (which the DC datastream is).
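
  To make the naming concrete, here is a rough sketch of what the
content of a DC datastream looks like and how the container relates to
the fields.  The sample values (title, identifier, format) are invented
for illustration, and describe_dc is a hypothetical helper of mine, not
anything from Fedora or proai; it only uses Python's standard library.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical DC datastream content; the values are made up.
SAMPLE_DC = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example object</dc:title>
  <dc:identifier>demo:1</dc:identifier>
  <dc:format>text/xml</dc:format>
</oai_dc:dc>"""


def describe_dc(xml_text):
    """Return (container_is_oai_dc, [(field, value), ...])."""
    root = ET.fromstring(xml_text)
    # The container element is oai_dc:dc, even though the datastream
    # itself is identified as "DC".
    container_ok = root.tag == "{%s}dc" % OAI_DC_NS
    fields = [
        (child.tag.split("}", 1)[1], child.text)
        for child in root
        if child.tag.startswith("{%s}" % DC_NS)
    ]
    return container_ok, fields


if __name__ == "__main__":
    ok, fields = describe_dc(SAMPLE_DC)
    print(ok)
    for name, value in fields:
        print(name, value)
```

The point is just that "oai_dc" shows up as the namespace of the
container element, while the fields themselves live in the plain
Dublin Core elements namespace.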

  The "internal" character of the DC datastream isn't that dramatic:
your objects can certainly have identifiers, titles, and formats that
you define for them.  The most frequent points of confusion that I
have seen on the list are:
1. Its restriction to unqualified Dublin Core
(http://dublincore.org/documents/dcq-rdf-xml/#sec1)
2. The relationship of the DC datastream, RELS-EXT datastream, and the
Resource Index
3. The fact that the DC datastream is parsed and serialized
internally, and its after-effects  (usually noticed in checksum
differences, or if you've specified values as references rather than
element content).

... and the potential problems with performance come from its inline
character, particularly when versioned, and the resulting size of the
serialized objects.  For what it's worth, there's work going on to
end the inline requirement for DC.
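
On point 3 above, the checksum effect is easy to reproduce in general
terms.  The sketch below is not Fedora's serializer; it assumes
Python's standard xml.etree.ElementTree as a stand-in for "something
that parses XML and writes it back out", and the sample bytes are
invented.  The content survives the round trip, but the bytes (and so
the digest) do not.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical bytes as they might be submitted at ingest.
ORIGINAL = (
    b'<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"\n'
    b'           xmlns:dc="http://purl.org/dc/elements/1.1/">\n'
    b'  <dc:title>An example object</dc:title>\n'
    b'</oai_dc:dc>\n'
)


def md5_hex(data):
    """Hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def roundtrip(data):
    """Parse the XML and serialize it again (stand-in for a repository
    that does not store the submitted bytes verbatim)."""
    return ET.tostring(ET.fromstring(data))


if __name__ == "__main__":
    stored = roundtrip(ORIGINAL)
    print(md5_hex(ORIGINAL))
    print(md5_hex(stored))
    # Same title either way, different bytes on disk.
    print(ET.fromstring(stored)[0].text)
```

So a checksum computed on the file you submitted should not be
expected to match one computed on what comes back out.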

Looking back at the archives, I think some of your concerns are drawn
from an email from Thorny last April:
http://www.mail-archive.com/[email protected]/msg01541.html

... in which he was responding to a person wondering why *qualified*
dublin core was being dropped from his DC datastream after ingest.
Strictly speaking, that data should not be present in an oai_dc:dc
container element:
http://www.openarchives.org/OAI/2.0/oai_dc.xsd
... and I think Thorny's lament is, at least in part, owing to a
general difficulty in deciphering what someone means when they refer
to "dublin core metadata".  His other concern is a performance issue
with Fedora when the serialized objects become very large.  Arguably,
Fedora should just reject objects that have qualified Dublin Core in
the DC datastream, which would make the situation behind Thorny's
first concern much clearer.
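
As a rough illustration of that kind of check (this is my own sketch,
not something Fedora does today), one could look for children of the
container that are not among the fifteen simple Dublin Core elements
allowed by oai_dc.xsd.  The function name and the sample document are
hypothetical, and the check only looks at element names; it does not
catch qualification done through attributes such as xsi:type.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements of simple (unqualified) Dublin Core.
SIMPLE_DC = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical datastream content with one qualified term mixed in.
SAMPLE = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:title>An example object</dc:title>
  <dcterms:created>2011-02-24</dcterms:created>
</oai_dc:dc>"""


def split_tag(tag):
    """Split an ElementTree '{namespace}local' tag into its parts."""
    if tag.startswith("{"):
        ns, _, local = tag[1:].partition("}")
        return ns, local
    return "", tag


def unexpected_elements(xml_text):
    """List child elements that are not simple Dublin Core."""
    root = ET.fromstring(xml_text)
    bad = []
    for child in root:
        ns, local = split_tag(child.tag)
        if ns != DC_NS or local not in SIMPLE_DC:
            bad.append(child.tag)
    return bad


if __name__ == "__main__":
    for tag in unexpected_elements(SAMPLE):
        print("not simple DC:", tag)
```

Running something like this before ingest would at least tell you
which elements are at risk of being dropped from the DC datastream.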

Regarding his other concern, there is work going on in Fedora to
remove the inline requirement for the DC datastream's data:
https://jira.duraspace.org/browse/FCREPO-492

All in all, I think this is less a bug than a gap in the OAI provider
documentation.  In particular, if it's been a roadblock for you (or
anyone else), we need to pick up the slack there.  A number of folks
on these lists are working on improving documentation, but it's an
ongoing effort.  I created an issue to track this:
 https://jira.duraspace.org/browse/FCREPO-865

... and hopefully we'll get a volunteer for it!

- Ben



On 2/24/11, Kaplan, Deborah <[email protected]> wrote:
> Thanks, Ben and Richard. We are using PROAI, and we had come to the same
> conclusion: returning our Dublin core data stream mapped to oai_dc. That's
> what we are planning on implementing now.
>
> My concern is more about Fedora going forward. This is clearly nonintuitive
> for people; the Fedora users list fairly regularly has people ask about the
> oai_dc datastream and then be told they shouldn't be using it. Basically,
> there's an internal-only datastream which has the same name as a reserved
> namespace which most Fedora Commons users expect to be publishing. That
> feels to me like a pretty major bug that there should be some plan to fix in
> the future. Yes, there are workarounds with harvesters, but having to go
> through a workaround -- having to even realize that there is a need to go
> through a workaround -- is a roadblock for a lot of people.
>
> -Deborah
> ------------------------------------------------------------------------------
> Free Software Download: Index, Search & Analyze Logs and other IT data in
> Real-Time with Splunk. Collect, index and harness all the fast moving IT
> data
> generated by your applications, servers and devices whether physical,
> virtual
> or in the cloud. Deliver compliance at lower cost and gain new business
> insights. http://p.sf.net/sfu/splunk-dev2dev
> _______________________________________________
> Fedora-commons-developers mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers
>
