[ https://jira.duraspace.org/browse/DS-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=24076#comment-24076 ]
Mark Diggory commented on DS-893:
---------------------------------

The challenge is that bundles are not manifestations. They are just... Bundles. And they are used internally to hold system-generated derivative files, licensing and original metadata. Different projects have "bastardized" the object for their own benefit: Search puts the text extracts in a "TEXT" bundle, Licensing puts the license file in a "LICENSE" bundle, and SWORD and LNI place the original XML manifest file in a "METADATA" bundle. And now in SWORDv2 we see every update to an Item from SWORD creating "BACKUP" bundles to represent the previous state of the bitstreams in the Item.

All of this is "Bad Practice". Bundle is a cesspool for stuff that system agents want to associate with an Item but do not want shown in the UI. IMO we need to protect DSpace from this abuse.

The problem with equating the DSpace data model to a FRBR model is that they do not fit together cleanly. FRBR is an abstract model of "types" to attach as concepts to a thing, and what folks don't seem to get is that a Bitstream resource can be an Item, Manifestation, Expression and Work all at the same time. This is because FRBR is not meant to be a containership model, and the DSpace data model is.

Christophe, in your case, it would be better to keep FRBR separate from the DSpace model. Use individual DSpace Items to represent the different types of FRBR entities:

    FRBR Work             Item
    FRBR Expression       Item
    FRBR Manifestation    Item
    FRBR Item             Item

Then you are free to build the relationships as metadata relations. From what I understand, you were already heading in this direction by capturing your authority concepts as individual items in your collections.
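
To make the mapping above concrete, here is a minimal Java sketch of FRBR entities held as plain items and linked only through metadata-style relations. It is an illustration under stated assumptions, not DSpace code: `FrbrType`, `FrbrItem`, `relate`, `targets`, the handles and the field names `frbr.realizationOf` and `frbr.embodimentOf` are all hypothetical and do not come from the DSpace API or from this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; none of this is the DSpace API.
enum FrbrType { WORK, EXPRESSION, MANIFESTATION, ITEM }

// Stand-in for a DSpace Item that carries a FRBR type plus metadata relations.
final class FrbrItem {
    final String handle;   // illustrative identifier for the item
    final FrbrType type;   // which FRBR entity this item represents

    // Each relation is a {field, targetHandle} pair, in the spirit of a
    // metadata value on one item that points at another item.
    private final List<String[]> relations = new ArrayList<>();

    FrbrItem(String handle, FrbrType type) {
        this.handle = handle;
        this.type = type;
    }

    // Record a relation as metadata on this item; no containership is implied.
    void relate(String field, FrbrItem target) {
        relations.add(new String[] { field, target.handle });
    }

    // Return the handles this item points to through the given field.
    List<String> targets(String field) {
        List<String> result = new ArrayList<>();
        for (String[] relation : relations) {
            if (relation[0].equals(field)) {
                result.add(relation[1]);
            }
        }
        return result;
    }
}

final class FrbrSketch {
    public static void main(String[] args) {
        FrbrItem work = new FrbrItem("123456789/1", FrbrType.WORK);
        FrbrItem expression = new FrbrItem("123456789/2", FrbrType.EXPRESSION);
        FrbrItem manifestation = new FrbrItem("123456789/3", FrbrType.MANIFESTATION);

        // Relations live in metadata, so the FRBR graph stays separate
        // from the Item/Bundle/Bitstream containment hierarchy.
        expression.relate("frbr.realizationOf", work);
        manifestation.relate("frbr.embodimentOf", expression);

        System.out.println(manifestation.targets("frbr.embodimentOf"));
    }
}
```

The point of the sketch is only that every FRBR entity is the same kind of object and the FRBR structure is carried entirely by the relations, so nothing depends on Bundles.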

This is the area of DSpace that needs to evolve. It was to be a very important feature of the DSpace 2.0 approach, and mapping it onto legacy DSpace is very important to do, especially in planning out future DSpace/Fedora integrations, because in Fedora there is not really anything that aligns with "Bundle" when trying to map DSpace Items to Fedora Objects. The best future for DSpace would be to deprecate the Bundle.

> suggestion for a re-implementation of Bundles in the DSpace data model
> -----------------------------------------------------------------------
>
>                 Key: DS-893
>                 URL: https://jira.duraspace.org/browse/DS-893
>             Project: DSpace
>          Issue Type: Improvement
>          Components: DSpace API
>    Affects Versions: 1.8.0
>            Reporter: Bill Hays
>
> Preliminary ideas for a new implementation of "Bundle" in the DSpace data
> model
>
> Current database model relationships:
>   Item <- Item2Bundle -> Bundle <- Bundle2Bitstream -> Bitstream
> Current Java object model relationships:
>   Item <-> Bundle <-> Bitstream
>
> Proposed database model relationships (1):
>   Item <- Item2Bitstream -> Bitstream(id, ..., bundlename, ...)
> or even more succinctly:
>   Item <- Bitstream(id, item_id, bundlename, ...)
>
> In current DSpace, there is no realized benefit from the container
> complexity in the current model for Bundles.
> This first step in the proposal removes the Bundle table and directly
> associates Bitstream to Item. The concept of "bundle" is replaced by an enum
> field in the Bitstream that identifies a bundle type (ORIGINAL, THUMBNAIL,
> etc). Functionally this is very similar to what we get now: A bitstream
> belongs to one item and is associated with one bundle. The bundle names are
> not constrained, but some names are expected in various parts of the
> codebase.
>
> Proposed database model relationships (2):
>   Item <- Bitstream -> MBundle(id, name, collection_id, derivative ...)
> This variation replaces the bundlename enum with a new class and database
> table "MBundle." Here bundles are not implemented as containers but are an
> associated type concept for a bitstream. With the association to a
> collection, bundles can be managed per collection or use a default set.
> Other properties of MBundle can be added to further enhance management
> capabilities, e.g.:
>   isDerivative - identify bundles for Thumbnails and DerivativeText
>   isVisible - indicate that the related bitstreams should be visible in
>     display contexts
>   isReserved - such as for very large "source" objects not for display or
>     filtermedia
> [needs work - how complex does bundle "metadata" need to be?]
>
> Issues:
> Primary Bitstream Id: This is currently only used for the ORIGINAL bundle,
> so conceptually there is one per item. Note that the current model (API and
> database) allows for multiple ORIGINAL bundles which therefore allows
> multiple primary bitstream ids; however, the implementation doesn't expose
> this possibility.
> Possible replacement API calls, depending on the implementation:
>   item.setPrimaryBitstream(Bitstream b)
>   bitstream.isPrimary(Boolean b)
> Various db solutions:
>   1. item.primaryBitstreamId - not standard database normalization but
>      consistent with DSpace practice
>   2. item2bitstream.primaryBitstream - a boolean, standard normalization but
>      requires some management to avoid duplicates
>   3. mbundle.primaryBitstreamId - not standard database normalization but
>      consistent with DSpace practice
>   4. item.primaryBitstream - a boolean, standard normalization but requires
>      some management to avoid duplicates
>
> In the event that someone has used primaryBitstreamId in non-ORIGINAL
> bundles for special purposes, only 3 or 4 would work.
> Affected Java code: Item and Bitstream would need to be adjusted. This is
> fairly low-level so should not be visible to much of the API. Group
> authorizations would need some work (this has not been fully analyzed).
> Custom code that uses the API might be affected. Custom SQL such as for
> reporting might break, but the replacement is shorter code. Collection
> management of bundle types would need a new tab on the collection page
> (XMLUI).
>
> Upgrading a DSpace instance: The database can be modified with queries. No
> effect on assetstores.
>
> Benefits:
>   Simpler, more concise model which removes unused/unnecessary containership
>     structure.
>   Enhanced bitstream management with bundle properties.
>   Enumeration of names instead of uncontrolled strings, preventing typos in
>     bundle names (e.g. from ItemImport).
>   Provides an easy solution to making derivative bitstreams visible.
>   Moving bitstreams between bundles does not require deleting and re-adding
>     the bitstream.
>   Fixes the data model problem with primary bitstream and multiple bundles
>     with the same name.
> Drawbacks:
>   Not a backward-compatible change. A fundamental change to the data
>     model.
>   Custom SQL code that uses bundles will require rework.
> Summary:
>   Bundles are categories for Bitstreams and do not need to be implemented as
>     containers.
>   Bundles could be improved with added metadata and management features.
>   The current Bundle implementation may not be a priority issue to merit the
>     work suggested. However, the ideas above may be suggestive for other
>     work, including metadata for all DSpace objects and exposing the data
>     model to external systems (e.g. Fedora).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://jira.duraspace.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

_______________________________________________
Dspace-devel mailing list
Dspace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-devel