Deborah-

> (b) sharing a meaningless internal identifier via OAI PMH as dc:identifier is 
> bad practice, and should never be a fallback default.

You make a good point: the pid satisfies the requirements for a unique
identifier as per OAI-PMH section 2.4, but on its own it doesn't hold
up well as a resource identifier.  It seems like a natural extension
to the features requested in:
https://jira.duraspace.org/browse/FCREPO-650
https://jira.duraspace.org/browse/FCREPO-655
... to work towards having the OAI-PMH interfaces include a
dereferenceable URI.  That seems like an appropriate default to me,
but it still may not be what you want!

> The DC element's dc:identifier field is restricted to being the fedora pid.

This is a misunderstanding; at least as of Fedora 3.3 you can add
additional dc:identifier elements.  I'd have to check the code to see whether
you can actually eliminate the pid's inclusion, and what the
consequences of that would be (I wouldn't recommend it in any case).
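
To make the "additional identifiers" point concrete, here is a minimal
sketch of what a consumer might do with a DC record that carries both
the pid and a second dc:identifier. The sample record, the example.org
URI, and the function names are assumptions for illustration only; this
is not how Fedora or proai selects an identifier.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical DC datastream content: the pid plus one extra identifier.
SAMPLE_DC = """\
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example object</dc:title>
  <dc:identifier>demo:1</dc:identifier>
  <dc:identifier>http://example.org/objects/demo:1</dc:identifier>
</oai_dc:dc>
"""


def identifiers(dc_xml):
    """Return all non-empty dc:identifier values in document order."""
    root = ET.fromstring(dc_xml)
    return [
        el.text.strip()
        for el in root.iter("{%s}identifier" % DC_NS)
        if el.text and el.text.strip()
    ]


def preferred_identifier(dc_xml):
    """Prefer an http(s) URI; otherwise fall back to the first identifier."""
    ids = identifiers(dc_xml)
    for value in ids:
        if value.startswith(("http://", "https://")):
            return value
    return ids[0] if ids else None


if __name__ == "__main__":
    print(identifiers(SAMPLE_DC))
    print(preferred_identifier(SAMPLE_DC))
```

The fallback to the first identifier (the pid, in this sample) is only
a stand-in for whatever default a harvester or provider would choose.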

> Thorny has repeatedly said that we shouldn't be loading up the DC datastream 
> with metadata because it's bad practice (mostly because of the performance 
> problems he discusses in that e-mail you cite). Is that not true?

If you load inline DC datastreams up with large data, you may incur a
performance penalty; especially if you version the DC datastream.  So
there are three factors there: Is it inline, is it large, is it
versioned.  These three factors relate to any datastream in Fedora,
though, not just the DC datastream.
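
As a rough sketch of those three factors, the check below flags which
of them apply to a given datastream. The Datastream class, the
risk_factors function, and the 64 KiB threshold are all hypothetical;
the threshold in particular is an arbitrary placeholder, not a measured
limit. The control group codes ("X" for inline XML, "M" for managed
content) follow Fedora 3 conventions.

```python
from dataclasses import dataclass

# Hypothetical cutoff for "large"; pick a value that fits your own data.
LARGE_BYTES = 64 * 1024


@dataclass
class Datastream:
    dsid: str
    control_group: str  # "X" inline XML, "M" managed, "E" external, "R" redirect
    size_bytes: int
    versionable: bool


def risk_factors(ds, large_bytes=LARGE_BYTES):
    """List which of the three factors (inline, large, versioned) apply."""
    factors = []
    if ds.control_group == "X":
        factors.append("inline")
    if ds.size_bytes >= large_bytes:
        factors.append("large")
    if ds.versionable:
        factors.append("versioned")
    return factors


if __name__ == "__main__":
    heavy_dc = Datastream("DC", "X", 200_000, True)
    small_dc = Datastream("DC", "X", 512, False)
    print(heavy_dc.dsid, risk_factors(heavy_dc))
    print(small_dc.dsid, risk_factors(small_dc))
```

A small, unversioned inline DC datastream trips only the "inline"
factor, which is the common default case.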

Thorny also said you shouldn't try to include qualified Dublin Core
data, but that is a separate issue.

Thanks for hashing this out! These lists are an important source of
documentation for me, and it's important to revisit long-standing
issues to make sure received wisdom is still aligned with the state of
the application.

- Ben

On 2/25/11, Kaplan, Deborah <[email protected]> wrote:
>> Fedora's "internal"
>> datastream is not called oai_dc; it's called "DC", and uses an element
>> with the namespace prefix "oai_dc" as a container.
>
> Fair enough; I was sloppy with my terminology and I shouldn't have been.
>
> That being said, it's clear that this is a blocker for any number of
> people who are getting set up either with Fedora itself for the first time,
> or with Fedora and OAI PMH. The namespace prefix oai_dc *means* something
> out in the world, and for that matter so does the identifier "DC" (even if
> not as formally). Asking it in Fedora to mean something else confuses new
> users; it is one more roadblock between concept and production for a tool
> which is nontrivial to configure at the best of times.
>
> Quoting Thorny: "I wish we had called the DC datastream "repoMeta" or
> something. It was just intended to be the base metadata needed for the
> repository manager to be able to function and was not intended to be exposed
> externally."
>
> People setting up a Fedora instance, or implementing OAI PMH for an existing
> Fedora instance, see the apparently meaningful prefix oai_dc, on top of the
> apparently meaningful datastream name "DC", and get confused. Yes, I
> understand that oai_dc is reserved in such a way that it can't be used
> internal to Fedora, it's reserved for an OAI PMH implementation. Yet normal
> humans are going to see the namespace oai_dc and react accordingly. At a
> minimum, the documentation should be updated to highlight this potential
> confusion.
>
>>  It is a default, so it
>> seems appropriate for proai to fallback to that in the absence of
>> other configuration.
>
> Except it doesn't, because (a) the message on the mailing lists for the last
> year has repeatedly been "don't put any metadata into DC datastream, except
> what Fedora requires", and more importantly (b) sharing a meaningless
> internal identifier via OAI PMH as dc:identifier is bad practice, and should
> never be a fallback default.
>
>>   The "internal" character of the DC datastream isn't that dramatic-
>> your objects can certainly have identifiers, titles, and formats that
>> you define for them.
>
> Actually, that's not true. They can't have identifiers you define for them.
> The DC element's dc:identifier field is restricted to being the fedora pid.
> Unless there is something I am very much missing.
>
> Thorny has repeatedly said that we shouldn't be loading up the DC datastream
> with metadata because it's bad practice (mostly because of the performance
> problems he discusses in that e-mail you cite). Is that not true?
>
>
> -Deborah
> ------------------------------------------------------------------------------
> Free Software Download: Index, Search & Analyze Logs and other IT data in
> Real-Time with Splunk. Collect, index and harness all the fast moving IT
> data
> generated by your applications, servers and devices whether physical,
> virtual
> or in the cloud. Deliver compliance at lower cost and gain new business
> insights. http://p.sf.net/sfu/splunk-dev2dev
> _______________________________________________
> Fedora-commons-developers mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers
>
