Steve,
I like this idea.
Configurable levels of compliance (default: soft/warning; optional:
hard/failure), checked on ingest/update, would allow those of us who might
still have cmodels without the appropriate hasModel assertions to discover
that fact gracefully. Would checking on access also be a feasible and
appropriate configuration option?
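Concretely, I'm imagining something like the following sketch (all names are hypothetical - this is not an actual Fedora API) of per-operation compliance levels:

```python
from enum import Enum

class ComplianceLevel(Enum):
    OFF = "off"    # current behaviour: no check at all
    SOFT = "soft"  # record a warning, allow the operation
    HARD = "hard"  # reject the operation

class ValidationConfig:
    """Per-operation compliance levels; soft on ingest/update by default."""
    def __init__(self, ingest=ComplianceLevel.SOFT,
                 update=ComplianceLevel.SOFT,
                 access=ComplianceLevel.OFF):
        self.levels = {"ingest": ingest, "update": update, "access": access}

def check_compliance(config, operation, has_model_assertion, warnings):
    """Apply the configured level: pass, warn (returning False), or raise."""
    level = config.levels[operation]
    if has_model_assertion or level is ComplianceLevel.OFF:
        return True
    if level is ComplianceLevel.SOFT:
        warnings.append(f"{operation}: object lacks a hasModel assertion")
        return False
    raise ValueError(f"{operation}: object lacks a hasModel assertion")
```

With the defaults, an object missing its hasModel assertion would ingest with a warning; an administrator opting into hard failure would see the ingest rejected instead.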
I wonder too if validation checks could be invoked asynchronously as part of a
data integrity / preservation utility or suite of utilities. Having such
validation mechanisms available could encourage a broader suite of preservation
services facilitating audit/certification, such as in TRAC. I imagine this
could work really well with the Spring refactoring/decomposition that's been
discussed.
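As a sketch of what such an asynchronous utility might do (hypothetical names, nothing Fedora-specific), it could fan per-object checks out over a worker pool and gather a repository-wide report, rather than blocking any single ingest or update:

```python
from concurrent.futures import ThreadPoolExecutor

def validate_object(pid, relationships):
    """Hypothetical per-object check: does the object assert any hasModel?
    `relationships` maps each PID to its (predicate, object) pairs."""
    problems = []
    if not any(pred == "hasModel" for pred, _ in relationships.get(pid, [])):
        problems.append((pid, "missing hasModel assertion"))
    return problems

def audit(relationships, max_workers=4):
    """Run the per-object checks concurrently and flatten the results
    into one repository-wide report."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda pid: validate_object(pid, relationships),
                           relationships)
    return [problem for problems in results for problem in problems]
```

A report like this could then feed audit/certification services without any change to core ingest behaviour.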
- Bill
On Oct 30, 2010, at 3:20 AM, Steve Bayliss wrote:
> Some very interesting discussions. After digesting and thinking some more,
> my views are:
>
>> (a) when ingesting content model objects, should we enforce a RELS-EXT
>> assertion to a valid content model for content model objects? or
>
> This is difficult, conceptually, I think. We only know that an object is a
> content model object by the presence of a hasModel assertion to the
> <info:fedora/fedora-system:ContentModel-3.0> object, so we can't directly
> validate for the presence of that relationship. We should not infer that an
> object is a content model by any other means - for instance, by the presence
> of a content-model-reserved datastream such as DS-COMPOSITE-MODEL - since,
> conceptually at least, the "reserved" status (or interpretation) of that
> datastream holds only by virtue of the object being a content model object;
> there should be nothing that prevents data objects, for instance, from
> having a datastream of that name. (Furthermore, conceptually at least -- and
> certainly not currently implemented that way in code -- the interpretation
> of any "reserved" datastream should be through content models; for instance,
> DC, RELS-EXT and RELS-INT *should* be interpreted as reserved by dint of the
> object belonging to the default data object content model.)
>
> In short, I think that any interpretation of the type/kind of an object
> should come through explicit typing of the object via a hasModel
> relationship.
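For reference, the explicit typing in question is carried in the object's RELS-EXT datastream as an RDF assertion like the following (demo:MyCModel stands in for an actual content model PID):

```xml
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:fedora-model="info:fedora/fedora-system:def/model#">
  <!-- demo:MyCModel is typed as a content model object only by this assertion -->
  <rdf:Description rdf:about="info:fedora/demo:MyCModel">
    <fedora-model:hasModel
        rdf:resource="info:fedora/fedora-system:ContentModel-3.0"/>
  </rdf:Description>
</rdf:RDF>
```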
>
> However, there is probably some useful validation that could be done through
> ECM. For instance, validating that the target of a data object's hasModel
> relationship itself asserts membership of the content model content model,
> and similarly for the other CMA relationships. This particular validation
> would in fact be validation on ingest of a data object - validation that the
> network of relationships associated with the data object is correct.
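A sketch of that check (hypothetical names, with a toy set of triples standing in for the Resource Index - not the actual ECM API) might look like:

```python
SYSTEM_CMODEL = "info:fedora/fedora-system:ContentModel-3.0"

def validate_has_model_network(pid, triples):
    """Check that every hasModel target of `pid` is itself explicitly typed
    as a content model object, i.e. asserts hasModel of the system cmodel.
    `triples` is a set of (subject, predicate, object) tuples."""
    warnings = []
    targets = {o for s, p, o in triples if s == pid and p == "hasModel"}
    for target in targets:
        if (target, "hasModel", SYSTEM_CMODEL) not in triples:
            warnings.append(
                f"{target} is used as a content model by {pid} "
                f"but does not assert hasModel {SYSTEM_CMODEL}")
    return warnings
```

Run at ingest of a data object, a check like this would flag (or reject, depending on the configured level) references to content models that are not themselves explicitly typed.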
>
> I like the idea of configurable levels of validation: validation through an
> explicit call, "soft" validation (warnings only) on ingest, hard validation
> (errors) on ingest, and perhaps configurable levels of what to validate. The
> default behaviour should be as it is at the moment (therefore not imposing
> any restrictions on order of ingest).
>
>> (b) should we create a Resource Index triple identifying the
>> fedora-system:ContentModel-3.0 as a default for content model objects when
>> none is specified in RELS-EXT?
>
> As for (a), if the only way of identifying that an object is a content model
> object is through its hasModel assertion, then this is conceptually
> difficult.
>
>> (c) should we stop CMA features working (eg the dissemination execution)
>> if the object identified as the content model does not itself identify
>> through RELS-EXT that it is a content model object?
>
> Probably. A question as to when - should we explicitly introduce additional
> checks now, or should we leave this to when they are required by other
> features (for instance, introducing different types of system content
> models, allowing different ways of describing objects and the services
> available to them - if this was done then presumably it would be necessary
> for the hasModel assertion to be present in order to correctly interpret the
> cmodel/sdef/sdep objects).
>
> The downside of doing this sooner rather than later is that it could break
> existing, working, content models that don't make the hasModel assertion.
> (However it could be argued that it is a bug that these content models just
> happen to work at the moment.)
>
> Taking on board the thoughts on levels of validation, that could be applied
> here also: the implementation could consist of default "do nothing"
> behaviour, plus configurable settings that alternatively generate warnings
> or errors when the missing assertions are discovered as part of service
> execution. (Possibly the default should be warnings, to alert folks, as the
> hasModel assertion may be required in the future.)
>
> Steve
>
>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]]
>> Sent: 29 October 2010 23:12
>> To: Support and info exchange list for Fedora users.
>> Subject: Re: [fcrepo-user] Cmodel discovery?
>>
>>
>> Aaron--
>>
>> You don't have to convince me of the dangers. I have ugly
>> memories of watching Ross Wayland and Thorny Staples being
>> chafed by the straightjacket of strong integrity for
>> objects/bdefs/bmechs that was baked into the 2.x series. {grin}
>>
>> With time, though, I've forgotten the grim images enough to
>> reprice in my mind the guarantees that such strong
>> constraints buy. But that's only my situation, and I accept
>> your narrative and its import and the cautions they imply.
>>
>> Let me suggest the existence of a set of categories of
>> repository behavior desirable by different users, categories
>> that might be directly connected with scale. The cost of
>> correcting a failure that is publicly visible (e.g. a
>> dissemination that fails) is often in a direct relationship
>> with some function of the size of the repository. It may be
>> that such a cost is very high to some users (who would then
>> prefer to work in that comfortable, stylish, and supportive
>> straightjacket) but a marginal and uninteresting item to
>> others (who find it confining).
>>
>> I'm impressed by Scott's suggestion of parameterized
>> validation behavior, and I wonder if we could imagine
>> partitioning his proposed CMA validation flag further into a
>> module or service to include some of the options we've been
>> discussing, and perhaps others. Could we leverage some of the
>> enhanced content model validation functionality to support a
>> range of sizes of straightjacket? {grin}
>>
>> While I understand and accept that forcing every user to construct
>> workflows that fulfill these kinds of integrity constraints would be
>> wrong, I believe that there are enough users who would really benefit
>> from the workflow feedback and long-term stability that such integrities
>> provide that it's worth considering such provision.
>>
>> ---
>> A. Soroka
>> Digital Research and Scholarship R & D and Online Library Environment
>> the University of Virginia Library
>>
>>
>>
>>
>> On Oct 29, 2010, at 4:32 PM, Aaron Birkland wrote:
>>
>>>
>>>> The CMA is such a core part of the repository architecture that I think
>>>> a situation in which the repository can be said to be working but the
>>>> CMA can't be is a bad situation to enable.
>>>
>>> Ah, I see your perspective. "working" is a bit of a sticky point here.
>>> In conceptualizing the CMA, here were a few thoughts or principles that
>>> motivated its design:
>>>
>>> - Users may choose to ignore the CMA - simply preserving and providing
>>> access to objects without an explicit model is a valid use case.
>>>
>>> - The core repository is not concerned with referential integrity of
>>> RELS-EXT relationships.
>>>
>>> - There shall not be a prescribed order in which objects must be
>>> ingested into the repository.
>>>
>>> - Service binding will occur dynamically. If this cannot happen for some
>>> reason (missing objects, relationships, etc.), then a runtime error is
>>> reported.
>>>
>>> These thoughts were partly in response to problems encountered with the
>>> precursor to the CMA. In particular, the precursor *did* enforce a kind
>>> of referential integrity - and this turned out to be a bit of a sore
>>> point. In response, there was a trend to more lightweight and dynamic
>>> behaviour.
>>>
>>> So, in other words, it was intentional that the core repository would be
>>> capable of ingesting and preserving objects that don't yet contribute to
>>> a functioning CMA behaviour. I think it was viewed that higher-level
>>> validation, correction, etc. could occur later (if at all), or as part of
>>> some other functionality layered on top of the basic core and enabled
>>> separately.
>>>
>>> Perhaps somebody can give more on the history and motivation. It could
>>> be worth revisiting if the resulting behaviours are seen as generally
>>> unintuitive.
>>>
>>>> It's not obvious to me how feedback could be supplied to avoid that,
>>>> but perhaps the ingest method could continue without the additional
>>>> validation of a) but could provide a response notation like "Ingested
>>>> with PID", "Ingested as content model with PID", "Ingested as service
>>>> definition with PID", etc. for the system-defined models? It could even
>>>> be extended to user-defined models, which could provide valuable
>>>> feedback in a workflow. But admittedly, that would induce more
>>>> complexity and even if carried out cleverly, might break older fragile
>>>> installed workflows...
>>>
>>> That is an interesting line of thought.
>>>
>>> -Aaron
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Nokia and AT&T present the 2010 Calling All Innovators-North America contest
>>> Create new apps & games for the Nokia N8 for consumers in U.S. and Canada
>>> $10 million total in prizes - $4M cash, 500 devices, nearly $6M in marketing
>>> Develop with Nokia Qt SDK, Web Runtime, or Java and Publish to Ovi Store
>>> http://p.sf.net/sfu/nokia-dev2dev
>>> _______________________________________________
>>> Fedora-commons-users mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
>>
>>
>
>
Bill Parod
Library Technology Division - Enterprise Systems
Northwestern University Library
[email protected]
847 491 5368