Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
Hi Mark,

On 11/13/2013 12:06 PM, Mark H. Wood wrote:
> I had missed that we were hurrying to do something in 4.0. I had assumed that there wasn't time.

To clarify, I was not recommending we change identifier schemes in 4.0. As Graham mentions, all I was recommending is that we stay *consistent* with our current identifier scheme.

So, the only change I recommend for 4.0 is to change the REST-API to accept Handles ([prefix]/[suffix]) as the identifier. Currently the REST-API only works with Database IDs.

As noted earlier in this thread, using Database IDs for the REST API has several limitations:

(1) Database ID is not very easily discoverable to end users. So, if a user wanted to access this item from the REST API:

http://demo.dspace.org/xmlui/handle/10673/3

How would they be able to determine that it is actually available at:

http://demo.dspace.org/rest/items/2

(The same goes for specific Communities/Collections -- the ID from REST is almost always going to be different from the Handle ID)

(2) As mentioned previously, Database IDs are slightly more fragile than Handles currently -- e.g. they cannot be restored by AIP backup restore.

I hope this makes some sense. All I'm recommending for 4.0 is that we tweak the REST API to understand Handles (either instead of or in addition to Database IDs). For 5.0, we should investigate a better resolution for https://jira.duraspace.org/browse/DS-1782 (and related issues)

- Tim

___
Dspace-devel mailing list
Dspace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-devel
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
It is very easy to make the REST API understand handles. I have done it for items locally already. At the moment it is in addition to database ids, but I would be happy for DB ids to disappear from view completely.

Best regards,
Anja

On 15/11/2013 16:08, Tim Donohue wrote:
> So, the only change I recommend for 4.0 is to change the REST-API to accept Handles ([prefix]/[suffix]) as the identifier. Currently the REST-API only works with Database IDs.
> [...]
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
Sorry, but the current REST API does support lookup by handles.

https://github.com/DSpace/DSpace/tree/master/dspace-rest#handles
https://github.com/DSpace/DSpace/blob/master/dspace-rest/src/main/java/org/dspace/rest/HandleResource.java#L32

    @Path("/handle")
    public class HandleResource {
        // Handles GET /handle/{prefix}/{suffix}, answering in JSON or XML.
        @GET
        @Path("/{prefix}/{suffix}")
        @Produces({MediaType.APPLICATION_JSON, MediaType.APPLICATION_XML})
        public org.dspace.rest.common.DSpaceObject getObject(
                @PathParam("prefix") String prefix,
                @PathParam("suffix") String suffix,
                @QueryParam("expand") String expand) {
            // ... (method body not included in this message)

It should work just as well as looking it up by DB ID, with the exception that we haven't added the pagination params to the Handle lookup.

The handle lookup returns either a 404 or an object of the same type as the found DSO, i.e. Community, Collection, or Item.

Peter Dietz

On Fri, Nov 15, 2013 at 11:36 AM, Anja Le Blanc anja.lebl...@manchester.ac.uk wrote:
> It is very easy to make the REST API understand handles. I have done it for items locally already. At the moment it is in addition to database ids, but I would be happy for DB ids to disappear from view completely.
> [...]
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
Example: http://demo.dspace.org/rest/handle/10673/1?expand=all

Peter Dietz

On Fri, Nov 15, 2013 at 12:00 PM, Peter Dietz pdiet...@gmail.com wrote:
> Sorry, but the current REST API does support lookup by handles.
> [...]
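To illustrate the discoverability point raised in this thread, here is a minimal Java sketch that turns an XMLUI handle URL into the corresponding REST handle-lookup URL. `HandleUrls` and its methods are hypothetical names, not part of DSpace, and the sketch assumes the webapps are deployed under `/xmlui` and `/rest` on the same host, as in the demo URLs quoted above.

```java
import java.net.URI;

/** Hypothetical helper for illustration; not part of the DSpace codebase. */
final class HandleUrls {
    // Assumption: the XMLUI webapp serves handles under this path.
    private static final String XMLUI_MARKER = "/xmlui/handle/";

    private HandleUrls() {}

    /** Extracts "prefix/suffix" from an XMLUI handle URL. */
    static String handleOf(String xmluiUrl) {
        String path = URI.create(xmluiUrl).getPath();
        int at = (path == null) ? -1 : path.indexOf(XMLUI_MARKER);
        if (at < 0) {
            throw new IllegalArgumentException("Not an XMLUI handle URL: " + xmluiUrl);
        }
        String handle = path.substring(at + XMLUI_MARKER.length());
        // Limit -1 keeps trailing empty parts, so "10673/3/" is rejected.
        String[] parts = handle.split("/", -1);
        if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException("Expected prefix/suffix, got: " + handle);
        }
        return handle;
    }

    /** Builds the REST handle-lookup URL on the same scheme and host. */
    static String toRestUrl(String xmluiUrl) {
        URI source = URI.create(xmluiUrl);
        // Assumption: the REST webapp is deployed under /rest.
        return source.getScheme() + "://" + source.getAuthority()
                + "/rest/handle/" + handleOf(xmluiUrl);
    }
}
```

With this, a client that only knows the public handle page never needs the database ID: it can derive the REST URL from the handle alone.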
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
Hi Peter,

Thanks for clarifying! I must have overlooked that feature (I think most of the examples I had seen before were all DB ID based, and so that's how I was using REST).

In that case, I think we are good enough for 4.0. For 5.0 we can work on improvements to our local identifiers (DS-1782)

- Tim

On 11/15/2013 11:01 AM, Peter Dietz wrote:
> Example: http://demo.dspace.org/rest/handle/10673/1?expand=all
> [...]
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
On Wed, Nov 13, 2013 at 10:07:40PM +, Graham Triggs wrote:
> On 13 November 2013 18:06, Mark H. Wood mw...@iupui.edu wrote:
> > Please let us keep DS-220 in mind. SWORD needs a *globally unique* identifier to begin deposit, before we will have created a Handle or DOI or whatnot -- that happens when the Item is installed. So we are sort of being forced toward UUIDs or something like them.
> >
> > https://jira.duraspace.org/browse/DS-220
>
> As far as I can see it, SWORD does not demand a *globally* unique identifier. It expects an identifier, and it expects that it should persist.

SWORD does not, but SWORD uses AtomPub, and Atom does:

http://tools.ietf.org/html/rfc4287#section-4.2.6

    4.2.6. The "atom:id" Element

    The "atom:id" element conveys a permanent, universally unique
    identifier for an entry or feed.

So we need something that is universally unique, permanent, and generated before the object is installed in the repository. We need it regardless of whether DB IDs or Handles would serve for REST.

--
Mark H. Wood, Lead System Programmer mw...@iupui.edu
Machines should not be friendly. Machines should be obedient.
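A minimal Java sketch of the requirement stated above: an identifier that is universally unique, permanent, and available before the item is installed. `PendingDeposit` is a hypothetical class, not a DSpace or SWORD type, and using a random UUID in `urn:uuid:` form as the `atom:id` value is an assumption made for illustration.

```java
import java.util.UUID;

/** Hypothetical in-progress deposit; names are illustrative only. */
final class PendingDeposit {
    // Fixed when the deposit starts and never reassigned afterwards.
    private final String atomId;
    // Stays null until the item is installed and a Handle is minted.
    private String handle;

    PendingDeposit() {
        // Random (version 4) UUID in URN form, which is a valid IRI for atom:id.
        this.atomId = "urn:uuid:" + UUID.randomUUID();
    }

    String atomId() {
        return atomId;
    }

    /** Returns the Handle, or null if the item is not installed yet. */
    String handle() {
        return handle;
    }

    /** Records the Handle assigned at install time; atomId is untouched. */
    void install(String assignedHandle) {
        this.handle = assignedHandle;
    }
}
```

The point of the design is that the identifier returned at the start of the deposit does not depend on the Handle, so it stays valid whether or not a Handle is ever assigned.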
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
Hi Graham,

In all honesty, I agree with your points (in this email and others) about our assumptions about IDs and that we aren't necessarily providing all the proper guarantees needed for persistent IDs.

My main point here is that we need to find something that is still *achievable* for 4.0. Brainstorming a longer term resolution (which you seem to be doing) is also a great exercise. But it's not realistic in terms of fixing/improving the REST API in time for 4.0 (unless we were to delay 4.0, which I don't think anyone wants to do). It's great for figuring out what we may want to change in 5.0, however.

So, I agree with your points. We aren't doing a 100% perfect job of persistent IDs anyhow, and a lot is reliant on the DSpace installation (and whether they've purchased a handle prefix or not). But, based on what we have in place currently, I don't see any other short term fix for 4.0 other than to allow the REST API to work with [prefix]/[suffix] (even if we cannot guarantee that the [prefix] is a unique registered prefix).

It's still not perfect, but it's better than only using Database IDs (which are less obtainable to users, and unable to be restored by AIP tools as they are currently primary keys)

- Tim

On 11/12/2013 5:27 PM, Graham Triggs wrote:
> On 12 November 2013 16:50, Tim Donohue tdono...@duraspace.org wrote:
> > Therefore, I'm hesitant to limit the locally-unique ID to suffix only -- as I believe that may cause ID collisions in some DSpace instances. I think it needs to remain [prefix]/[suffix] until we have something better, like a UUID or similar (as noted in https://jira.duraspace.org/browse/DS-1782).
>
> And the prefix in two completely distinct instances may in fact be the same, because they haven't actually registered the prefix. So then you merge two DSpace instances (or just choose to migrate certain records), and you still have a collision between the IDs.
>
> We really need to (as I will reiterate elsewhere) get over our assumptions about IDs - because they really don't provide the guarantees that we treat them as providing.
>
> G
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
Hi Tim,

On 13 November 2013 14:53, Tim Donohue tdono...@duraspace.org wrote:
> My main point here is that we need to find something that is still *achievable* for 4.0. Brainstorming a longer term resolution (which you seem to be doing) is also a great exercise.

My main point is actually to highlight what may well be false assumptions in what exists in DSpace, in order to frame our decisions correctly.

> But it's not realistic in terms of fixing/improving the REST API in time for 4.0 (unless we were to delay 4.0, which I don't think anyone wants to do). It's great for figuring out what we may want to change in 5.0, however.
>
> So, I agree with your points. We aren't doing a 100% perfect job of persistent IDs anyhow, and a lot is reliant on the DSpace installation (and whether they've purchased a handle prefix or not). But, based on what we have in place currently, I don't see any other short term fix for 4.0 other than to allow the REST API to work with [prefix]/[suffix] (even if we cannot guarantee that the [prefix] is a unique registered prefix).

I tend to agree - the 'handle' is the only identifier that we make clearly and consistently available, and as such will have to be used in the current scenario.

I think we should start taking steps to signal a future intention here though - i.e. document clearly the implications of using an unregistered prefix, recommend strongly that a prefix is registered, and state that in a future version (maybe the following release?) we will look to make having a handle prefix entirely optional, transferring primary identification to a local, non-persistent identifier.

Looking ahead, we also have to bite the bullet that any local id - unless we go entirely for UUIDs (ugly) - will likely be an auto-incrementing sequence, and deal with it accordingly. For the purposes of straightforward backup / restore, this should not be a problem. Obviously, if we move a package from one instance to another, then we may have to assign those IDs, but in general I don't see that as a problem - and if it is, that's what you use externally registered persistent identifiers for!

I'm not expecting us to make a decision on that right now, I'm just stating it for the record and so that we can start thinking about it in the background.

G
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
I had missed that we were hurrying to do something in 4.0. I had assumed that there wasn't time.

Please let us keep DS-220 in mind. SWORD needs a *globally unique* identifier to begin deposit, before we will have created a Handle or DOI or whatnot -- that happens when the Item is installed. So we are sort of being forced toward UUIDs or something like them.

https://jira.duraspace.org/browse/DS-220

Let's see if I have all the constraints so far discovered:

o Not tied to external services.
o The DBMS is an external service.
o Not coordinated with other instances.
o Durable, because people for some reason want to remember these identifiers for use in other sessions.
o Globally unique (at least statistically) due to Atom requirements in SWORD.
o Usable across the several services making up a single instance.
o Durable across backup/restore. Destroyed and reassigned when transporting objects between instances.
o And we want *something* for 4.0.

Anything else? When this list is somewhat stable, we should add it to DS-1782.

https://jira.duraspace.org/browse/DS-1782

--
Mark H. Wood, Lead System Programmer mw...@iupui.edu
Machines should not be friendly. Machines should be obedient.
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
On 13 November 2013 18:06, Mark H. Wood mw...@iupui.edu wrote:
> Please let us keep DS-220 in mind. SWORD needs a *globally unique* identifier to begin deposit, before we will have created a Handle or DOI or whatnot -- that happens when the Item is installed. So we are sort of being forced toward UUIDs or something like them.
>
> https://jira.duraspace.org/browse/DS-220

As far as I can see it, SWORD does not demand a *globally* unique identifier. It expects an identifier, and it expects that it should persist. The definition of persistence can be considered vague, and I would take it to mean persistent for that instance - not that it should have some kind of persistence for migration to a different instance ( / SWORD endpoint).

Basically, if I return back that a SWORD deposit creates item 1, then it should remain item 1 when the deposit is complete - not that it gets reassigned as item 2 once accepted (or it increments as changes are made in a versioning scenario, etc.).

Trouble is, there is a lot of weight to the words we use, a lot of baggage tied up to concepts, and indeed the problems that we've noted in this discussion about which of our identifiers are public, which was leading us to prefer returning a handle as that identifier for SWORD (which is problematic, due to the timing of when it is assigned). But it doesn't mean that it couldn't be the database identifier, or anything else that we choose to be the public identifier.

And even if it needed to be globally unique, that doesn't mean the globally unique value has to be our primary / public identifier - we can register a UUID against the primary / public identifier, just as we register handles against the database identifiers.

> o Durable across backup/restore. Destroyed and reassigned when transporting objects between instances.

Which the database ID does accomplish. It's very important to remember that although these identifiers originate from sequences, and are assigned into the primary key, this is not a property of the database as we've defined it. We *manually* select the next sequence number, and *manually* insert it into the primary key column when we create a row in the database.

We can choose to ignore the sequence and insert any arbitrary integer in its place - provided the integer does not already exist in the table. Which it shouldn't in the backup / restore scenario. Complete restore in an empty instance - no ID has been used. Selective restore after an item has been destroyed - the previous row will be gone, and the sequence will be past this point, so the ID won't be present in the table. You just need to ensure after restore that the sequence is set to assign IDs from a reasonable arbitrary point.

The only time you run into trouble is migrating an item to a different instance, or trying to restore a deleted item after you've migrated another item with the same ID into the instance. But actually, those kinds of clashes can exist with handles in the way that they have been used. We can provide reasonable options for reassigning IDs - it needs to be catered for anyway, and it's not (imho) a problem. It would just need AIPs updated to also track the database (or other local object) identifiers.

> o And we want *something* for 4.0.

The only thing we can do for 4.0 is be consistent about our only public identifiers, and make clear statements that we will make handles properly optional for the next release, with a local object identifier taking over as the primary, mandatory non-global public identifier (which really could just be the database identifier).

G
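The backup / restore argument above can be sketched in Java as an in-memory model. `LocalIdAllocator` is a hypothetical class standing in for the database sequence plus primary-key column; it is not DSpace code, and it only illustrates the rule described: normal creation takes the next sequence value, restore inserts the recorded ID if it is free, and the sequence is then moved past every restored ID.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory model of a sequence-backed primary key. */
final class LocalIdAllocator {
    private final Set<Integer> used = new HashSet<>(); // IDs present in the "table"
    private int nextId = 1;                            // the "sequence"

    /** Normal creation: take the next value from the sequence. */
    int allocate() {
        int id = nextId++;
        used.add(id);
        return id;
    }

    /** Deleting a row frees its ID but never rewinds the sequence. */
    void delete(int id) {
        used.remove(id);
    }

    /** Restore: insert the ID recorded in the backup, if it is not in use. */
    void restore(int id) {
        if (!used.add(id)) {
            throw new IllegalStateException("ID already in use: " + id);
        }
        // Keep the sequence ahead of every restored ID.
        if (id >= nextId) {
            nextId = id + 1;
        }
    }
}
```

In this model a full restore into an empty instance and a selective restore of a deleted item both succeed, while the clash case (an ID that is already occupied) fails loudly and would need the reassignment option mentioned above.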
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
I like that. It fits in very well with how we locally operate our web application (using only the suffix of the handle) and we don't have to expose database ids. If the DSpace community can agree on this method, it would not be difficult to modify the REST code to use the handle suffix as ID.

Best regards,
Anja

On 11/11/2013 19:28, Mark H. Wood wrote:
> OK, if others see a use for persistent local identifiers then they had better be persistent. I wasn't thrilled with the idea of overloading database record IDs either, especially since we'd have to compound (ID, type) tuples into some string representation, to uniquely identify an object.
>
> So, it sounds like what we want is to split the two ways that Handles are used. One use is as, er, Handles: globally unique persistent identifiers that mean something outside of the local instance. That should remain as it is.
>
> The other use is as locally-unique identifiers of DSOs regardless of type. For this purpose the prefix is rubbish; unless we have multiple prefixes for some reason, we only care about the suffix. If this use were separated from the Handle system entirely then it could be a simple incrementing numeric label.
>
> If we can pull these meanings apart, then someone who doesn't want Handles can, theoretically, unplug them from DSpace and never think about them again, while DSpace still has something it can use to uniquely identify things locally.
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
On Tue, Nov 12, 2013 at 08:12:27AM +, Anja Le Blanc wrote:
> I like that. It fits in very well with how we locally operate our web application (using only the suffix of the handle) and we don't have to expose database ids. If the DSpace community can agree on this method, it would not be difficult to modify the REST code to use the handle suffix as ID.

Well, no, I wrote unclearly. I meant to separate these functions by inventing a new, distinct sequential identifier for local use, and no longer using Handles or any part of them for any local identification.

I think that global identifiers should only be attached to an Item as labels which are opaque to DSpace. We can resolve them to local objects as a service to others, and to link out to other services, but should not use them internally.

--
Mark H. Wood, Lead System Programmer mw...@iupui.edu
Machines should not be friendly. Machines should be obedient.
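As a rough illustration of the split Mark describes here (a sequential identifier for local use, with Handles or other global identifiers kept only as opaque labels), a minimal Python sketch follows. The `LocalRegistry` class and its method names are hypothetical and are not part of DSpace; the handle `10673/3` is the demo example from earlier in the thread.

```python
import itertools
from typing import Dict, Optional


class LocalRegistry:
    """Hypothetical registry, for illustration only; not DSpace code."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)   # simple incrementing local label
        self._labels: Dict[str, int] = {}    # opaque global label -> local ID

    def mint(self) -> int:
        """Mint a new local identifier, independent of any Handle."""
        return next(self._next_id)

    def attach_label(self, local_id: int, label: str) -> None:
        """Attach a global identifier (a Handle, say) as an opaque string."""
        self._labels[label] = local_id

    def resolve(self, label: str) -> Optional[int]:
        """Resolve a global label to a local ID; the label is never parsed."""
        return self._labels.get(label)


registry = LocalRegistry()
item_id = registry.mint()
registry.attach_label(item_id, "10673/3")

print(item_id)                        # 1
print(registry.resolve("10673/3"))    # 1
print(registry.resolve("10673/999"))  # None
```

In this sketch, unplugging Handles would amount to never calling `attach_label`; the local identifiers keep working either way.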
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
I've created a Jira issue to gather all the threads related to local object identifiers:

https://jira.duraspace.org/browse/DS-1782

I found another place where we need something like this, and it wants globally unique identifiers. That seems to point to something like a UUID.

--
Mark H. Wood, Lead System Programmer mw...@iupui.edu
Machines should not be friendly. Machines should be obedient.
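For readers unfamiliar with UUIDs, the short Python sketch below shows the property that makes them attractive here: identifiers can be minted independently, with no shared counter or registered prefix. It uses only the standard `uuid` module; the variable names are illustrative and nothing here reflects how DSpace would actually store or expose such identifiers.

```python
import uuid

# Two instances minting identifiers independently, with no coordination.
id_from_instance_a = uuid.uuid4()
id_from_instance_b = uuid.uuid4()

# A random (version 4) UUID carries no prefix, object type, or sequence number.
print(id_from_instance_a.version)  # 4

# The canonical string form round-trips, so it could be used in a URL path.
as_text = str(id_from_instance_a)
print(uuid.UUID(as_text) == id_from_instance_a)  # True
```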
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
On 11/11/2013 1:28 PM, Mark H. Wood wrote:
> OK, if others see a use for persistent local identifiers then they had better be persistent. I wasn't thrilled with the idea of overloading database record IDs either, especially since we'd have to compound (ID, type) tuples into some string representation, to uniquely identify an object.
>
> So, it sounds like what we want is to split the two ways that Handles are used. One use is as, er, Handles: globally unique persistent identifiers that mean something outside of the local instance. That should remain as it is. The other use is as locally-unique identifiers of DSOs regardless of type. For this purpose the prefix is rubbish; unless we have multiple prefixes for some reason, we only care about the suffix. If this use were separated from the Handle system entirely then it could be a simple incrementing numeric label.
>
> If we can pull these meanings apart, then someone who doesn't want Handles can, theoretically, unplug them from DSpace and never think about them again, while DSpace still has something it can use to uniquely identify things locally.

I think this all seems reasonable to me for 4.0. My only minor disagreement is that it *is* possible to have multiple (handle) prefixes in DSpace. For example:

* a DSpace which harvests from other DSpaces (via OAI-PMH/OAI-ORE), all of which have their own Handle prefix, OR
* a DSpace which is the 'merger' of two or more previous DSpace instances (e.g. a consortial DSpace which merged several institutional-based DSpace instances, each of which had their own prefix).

Therefore, I'm hesitant to limit the locally-unique ID to suffix only -- as I believe that may cause ID collisions in some DSpace instances. I think it needs to remain [prefix]/[suffix] until we have something better, like a UUID or similar (as noted in https://jira.duraspace.org/browse/DS-1782).

So, I think this means that the REST API needs to support [prefix]/[suffix] for the 4.0 release. Perhaps for 5.0, we can work on minting local object identifiers, which can also be used via REST.

- Tim
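The suffix-only collision Tim is worried about can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `parse_handle` is a hypothetical helper, not a DSpace function, and `99999` is a made-up second prefix standing in for a harvested or merged repository (`10673` is the demo prefix from earlier in the thread).

```python
from typing import Dict, List, NamedTuple


class Handle(NamedTuple):
    prefix: str
    suffix: str


def parse_handle(text: str) -> Handle:
    """Split '[prefix]/[suffix]' at the first slash (hypothetical helper)."""
    prefix, sep, suffix = text.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a [prefix]/[suffix] handle: {text!r}")
    return Handle(prefix, suffix)


# Hypothetical contents of an instance that holds objects under two prefixes.
handles = ["10673/3", "99999/3", "10673/4"]

by_suffix: Dict[str, List[str]] = {}
by_full: Dict[Handle, str] = {}
for text in handles:
    handle = parse_handle(text)
    by_suffix.setdefault(handle.suffix, []).append(text)
    by_full[handle] = text

# Keyed by suffix alone, two different objects land on the same ID.
collisions = {s: hs for s, hs in by_suffix.items() if len(hs) > 1}
print(collisions)    # {'3': ['10673/3', '99999/3']}

# Keyed by the full [prefix]/[suffix], all three stay distinct.
print(len(by_full))  # 3
```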
Re: [Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
On 12 November 2013 16:50, Tim Donohue tdono...@duraspace.org wrote:
> Therefore, I'm hesitant to limit the locally-unique ID to suffix only -- as I believe that may cause ID collisions in some DSpace instances. I think it needs to remain [prefix]/[suffix] until we have something better, like a UUID or similar (as noted in https://jira.duraspace.org/browse/DS-1782).

And the prefix in two completely distinct instances may in fact be the same, because they haven't actually registered the prefix. So then you merge two DSpace instances (or just choose to migrate certain records), and you still have a collision between the IDs.

We really need to (as I will reiterate elsewhere) get over our assumptions about IDs - because they really don't provide the guarantees that we treat them as providing.

G
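Graham's scenario, where even the full [prefix]/[suffix] collides, can be sketched the same way. In this Python illustration the prefix value, the titles, and the `colliding_handles` helper are all hypothetical; the only assumption taken from the message is that both instances used the same prefix without registering it.

```python
from typing import Dict, List

# Hypothetical: both instances kept the same unregistered prefix.
UNREGISTERED_PREFIX = "123456789"

instance_a: Dict[str, str] = {
    f"{UNREGISTERED_PREFIX}/1": "Thesis from instance A",
    f"{UNREGISTERED_PREFIX}/2": "Report from instance A",
}
instance_b: Dict[str, str] = {
    f"{UNREGISTERED_PREFIX}/1": "Dataset from instance B",
    f"{UNREGISTERED_PREFIX}/3": "Article from instance B",
}


def colliding_handles(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """Return handles that exist in both instances and so clash on merge."""
    return sorted(a.keys() & b.keys())


print(colliding_handles(instance_a, instance_b))  # ['123456789/1']
```

A merge or migration tool would need to check for this before importing, since the identifier alone cannot say which of the two objects is meant.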
[Dspace-devel] Instance-local object identifiers for REST and other uses [was Re: A few early REST API comments]
OK, if others see a use for persistent local identifiers then they had better be persistent. I wasn't thrilled with the idea of overloading database record IDs either, especially since we'd have to compound (ID, type) tuples into some string representation, to uniquely identify an object.

So, it sounds like what we want is to split the two ways that Handles are used. One use is as, er, Handles: globally unique persistent identifiers that mean something outside of the local instance. That should remain as it is. The other use is as locally-unique identifiers of DSOs regardless of type. For this purpose the prefix is rubbish; unless we have multiple prefixes for some reason, we only care about the suffix. If this use were separated from the Handle system entirely then it could be a simple incrementing numeric label.

If we can pull these meanings apart, then someone who doesn't want Handles can, theoretically, unplug them from DSpace and never think about them again, while DSpace still has something it can use to uniquely identify things locally.

--
Mark H. Wood, Lead System Programmer mw...@iupui.edu
Machines should not be friendly. Machines should be obedient.
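To make the "(ID, type) tuples into some string representation" concern concrete, here is a minimal Python sketch of what such a compound key might look like. The `type:id` format, the separator, and the type names are assumptions made for illustration; nothing in the thread specifies an encoding.

```python
from typing import Tuple

SEPARATOR = ":"  # hypothetical choice of separator


def encode(object_type: str, database_id: int) -> str:
    """Compound an (ID, type) pair into one string key."""
    return f"{object_type}{SEPARATOR}{database_id}"


def decode(key: str) -> Tuple[str, int]:
    """Recover the (type, ID) pair, rejecting malformed keys."""
    object_type, sep, raw_id = key.partition(SEPARATOR)
    if not sep or not object_type or not raw_id.isdecimal():
        raise ValueError(f"not a type{SEPARATOR}id key: {key!r}")
    return object_type, int(raw_id)


# An item and a collection can share database ID 2; only the type tells them apart.
item_key = encode("item", 2)
collection_key = encode("collection", 2)

print(item_key, collection_key)  # item:2 collection:2
print(decode(item_key))          # ('item', 2)
```

Every consumer of such a key has to know the encoding, which is part of why a single type-independent identifier looks simpler.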