Re: [Dspace-tech] Sequence ID generation

2007-05-10 Thread Gary Browne
Thanks to everyone for their input on this.

Well I had no idea that this would open such a can of worms! Mea
culpa...

I guess my question is really more about the philosophical aspect of *how* the
sequence ID is (mis)used. I have items being submitted to DSpace in
response to a trigger from another system, but I need to pass back an
identifier to that system which matches the *bitstream* per se (hehe),
not just the item metadata page. In this particular project, it is the
*bitstreams* which are considered persistent objects, not just the
items. Problems arise in this case in trying to programmatically extract
pointers to bitstreams for external systems, particularly in light of
the fact that bitstream names need not be unique (is that right?).

But I'm wandering into a [dspace-general] discussion here... I note
Richard's previous instigation of discussion on these issues at
http://mailman.mit.edu/pipermail/dspace-general/2003-September/15.html
and it would appear that the general consensus is that persistent bitstream
IDs are *not* a good idea. However, when faced with a project that
*requires* them, what is one to do?
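
For illustration only, here is a minimal sketch of composing the kind of
pointer an external system could be handed, using the bitstream URL shape
discussed in this thread (base URL, "bitstream", handle, sequence ID,
filename). The class name, method name, base URL and handle value are all
hypothetical, and such a pointer is only as stable as the sequence ID
itself.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class BitstreamUrlSketch {
    /** Builds baseUrl/bitstream/handle/sequenceId/filename (hypothetical helper). */
    static String bitstreamUrl(String baseUrl, String handle, int sequenceId, String filename) {
        // Only the filename is encoded; the handle keeps its prefix/suffix slash.
        String encodedName = URLEncoder.encode(filename, StandardCharsets.UTF_8)
                .replace("+", "%20");
        return baseUrl + "/bitstream/" + handle + "/" + sequenceId + "/" + encodedName;
    }

    public static void main(String[] args) {
        String base = "http://repository.example.edu/dspace"; // assumed base URL
        String handle = "123456789/42";                       // assumed handle
        // Two bitstreams may share a filename; the sequence ID tells them apart.
        System.out.println(bitstreamUrl(base, handle, 3, "foo.pdf"));
        System.out.println(bitstreamUrl(base, handle, 4, "foo.pdf"));
    }
}
```

Because filenames need not be unique, the handle plus sequence ID is the
part of the URL that actually identifies the bitstream; the filename is
there for readability.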

Regards
Gary


Gary Browne
Development Programmer
Library IT Services
University of Sydney
Australia
ph: 61-2-9351 5946 



-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech


Re: [Dspace-tech] Sequence ID generation

2007-05-08 Thread Larry Stone
> First, it is assigned sequentially and IDs are not reused if a bitstream
> is deleted. There is no magic ordering, and it was *not* intended for
> organizing a set of bitstreams into a meaningful sequence (e.g. PDF
> chapters of a book). Its sole purpose is to provide a *durable* unique
> ID for a bitstream - think of it as a 'sub-handle' ID - modulo an item

There's actually a bug in the data model, then.  It's possible to get
the same sequence ID reused, because when adding a Bitstream, the code
only looks for the highest existing SequenceID and increments that.

1. Take an existing Item, go into the "Edit Item" admin page
   (/dspace/tools/edit-item), and add a new Bitstream with a distinctive name.
   Say, "foo.pdf".

2. Determine its Sequence ID.  Go to the Item page
   /dspace/handle/ and observe the "View/Open" link next
   to your bitstream; the path element after its handle is the SequenceID.
   It should be the highest SequenceID there, since it was most recently added.
   There are some "invisible" Bitstreams (like licenses) that also take
   up SIDs.

3. Go back to the "Edit" page and delete that newest bitstream.

4. Add a different bitstream with a different name, say, "bar.pdf".

5. Go to a freshly-loaded copy of the Item page, and observe that
   "bar.pdf" has the same SequenceID that "foo.pdf" had before.

I'll submit this as a bug on SourceForge too.
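
The steps above can also be reproduced in isolation. The following is a
minimal sketch, not DSpace code: it models an item's bitstreams as a plain
list of sequence IDs and assumes only the "highest existing ID plus one"
rule described in this message. The class and method names are
hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SequenceReuseDemo {
    /** Next ID = highest ID still present + 1 (the rule described above). */
    static int nextSequenceId(List<Integer> existingIds) {
        int highest = 0;
        for (int id : existingIds) {
            if (id > highest) {
                highest = id;
            }
        }
        return highest + 1;
    }

    public static void main(String[] args) {
        // An item that already holds two bitstreams (say, a file and a license).
        List<Integer> ids = new ArrayList<>(List.of(1, 2));

        int fooId = nextSequenceId(ids);     // step 1: add foo.pdf
        ids.add(fooId);

        ids.remove(Integer.valueOf(fooId));  // step 3: delete foo.pdf

        int barId = nextSequenceId(ids);     // step 4: add bar.pdf
        ids.add(barId);

        // Both names end up with the same sequence ID.
        System.out.println("foo.pdf had " + fooId + ", bar.pdf got " + barId);
    }
}
```

Reuse only happens when the deleted bitstream held the highest ID; deleting
one from the middle leaves a gap that this rule never fills.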

-- Larry




Re: [Dspace-tech] Sequence ID generation

2007-05-08 Thread Richard Rodgers
Hi Gary:

Here's a little more explanation/history of the sequence ID.
First, it is assigned sequentially and IDs are not reused if a bitstream
is deleted. There is no magic ordering, and it was *not* intended for
organizing a set of bitstreams into a meaningful sequence (e.g. PDF
chapters of a book). Its sole purpose is to provide a *durable* unique
ID for a bitstream - think of it as a 'sub-handle' ID - modulo an item
(sorry for the Latin again, it just kind of crept in). DSpace originally
used the bitstream database key, but while unique, this wasn't durable
in the sense that if you moved to a different database, or compressed it,
etc., the key might change. The sequence ID, by contrast, is a bona fide
(damn!) piece of metadata, albeit a fairly opaque one. Remember that the
filename need not be unique (you can have 2 bitstreams with the same
name), so we do need something in this role. We actually kicked around
various proposals (e.g. MD5 checksums, date stamps, etc.), but the
sequence ID won primarily because it resulted in shorter, easier-to-type
URLs.
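
To make the "assigned sequentially, never reused" intent concrete, here is
a minimal sketch of an allocator that remembers the last ID it issued, so
that deleting a bitstream cannot free its number. It is an illustration of
the stated intent only, not DSpace code; the class, field and method names
are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DurableSequenceAllocator {
    // High-water mark: only ever increases, even when bitstreams are deleted.
    private int lastIssued = 0;
    // Live bitstreams of one item: sequence ID -> filename (names may repeat).
    private final Map<Integer, String> live = new LinkedHashMap<>();

    int add(String filename) {
        int id = ++lastIssued;
        live.put(id, filename);
        return id;
    }

    void delete(int sequenceId) {
        live.remove(sequenceId); // lastIssued is deliberately left untouched
    }

    String lookup(int sequenceId) {
        return live.get(sequenceId); // null once the bitstream is gone
    }

    public static void main(String[] args) {
        DurableSequenceAllocator item = new DurableSequenceAllocator();
        int first = item.add("report.pdf");
        int second = item.add("report.pdf"); // same name, different ID
        item.delete(second);
        int third = item.add("appendix.pdf");
        System.out.println(first + ", " + second + ", " + third);
    }
}
```

The cost of this design is one extra piece of per-item state (the
high-water mark) that has to be stored alongside the item.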

Hope this sheds some light,

Richard R



Re: [Dspace-tech] Sequence ID generation

2007-05-07 Thread Gary Browne
Thanks Claudia

Though I'm only inferring what a "numerus currens" is - I'm not really up
with my Latin.

Cheers
Gary


Gary Browne
Development Programmer
Library IT Services
University of Sydney
Australia
ph: 61-2-9351 5946 




Re: [Dspace-tech] Sequence ID generation

2007-05-03 Thread Claudia Jürgen
Hi Gary,

The sequence number is generated in:
org.dspace.content.Item
update()

    // Set sequence IDs for bitstreams in item
    int sequence = 0;
    Bundle[] bunds = getBundles();

    // find the highest current sequence number
    for (int i = 0; i < bunds.length; i++)
    {
        Bitstream[] streams = bunds[i].getBitstreams();

        for (int k = 0; k < streams.length; k++)
        {
            if (streams[k].getSequenceID() > sequence)
            {
                sequence = streams[k].getSequenceID();
            }
        }
    }

    // start sequencing bitstreams without sequence IDs
    sequence++;

    for (int i = 0; i < bunds.length; i++)
    {
        Bitstream[] streams = bunds[i].getBitstreams();

        for (int k = 0; k < streams.length; k++)
        {
            if (streams[k].getSequenceID() < 0)
            {
                streams[k].setSequenceID(sequence);
                sequence++;
                streams[k].update();
            }
        }
    }

It's just a numerus currens.
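
For anyone who wants to try the logic outside DSpace, the two passes quoted
above can be modelled with plain collections. This is a sketch under
assumptions: `Stream` is a hypothetical stand-in for a bitstream, a nested
list stands in for the bundles, and a negative ID marks a bitstream that
has not been numbered yet (as the `< 0` test in the excerpt suggests).

```java
import java.util.List;

class SequenceAssignmentModel {
    /** Hypothetical stand-in for a bitstream; -1 means "no sequence ID yet". */
    static class Stream {
        final String name;
        int sequenceId;

        Stream(String name, int sequenceId) {
            this.name = name;
            this.sequenceId = sequenceId;
        }
    }

    /** Pass 1 finds the highest ID in any bundle; pass 2 numbers unassigned streams. */
    static void assignSequenceIds(List<List<Stream>> bundles) {
        int sequence = 0;
        for (List<Stream> bundle : bundles) {
            for (Stream s : bundle) {
                if (s.sequenceId > sequence) {
                    sequence = s.sequenceId;
                }
            }
        }

        sequence++;
        for (List<Stream> bundle : bundles) {
            for (Stream s : bundle) {
                if (s.sequenceId < 0) {
                    s.sequenceId = sequence++;
                }
            }
        }
    }

    public static void main(String[] args) {
        List<Stream> original = List.of(new Stream("a.pdf", 1), new Stream("b.pdf", -1));
        List<Stream> license = List.of(new Stream("license.txt", 2), new Stream("extra.txt", -1));
        assignSequenceIds(List.of(original, license));
        for (Stream s : original) System.out.println(s.name + " -> " + s.sequenceId);
        for (Stream s : license) System.out.println(s.name + " -> " + s.sequenceId);
    }
}
```

Note that the numbering runs across all bundles of the item, so the IDs
follow the order in which bundles and bitstreams are visited rather than
any ordering within a single bundle.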

sunny greetings

Claudia Jürgen
University Dortmund


Gary Browne wrote:
> Hi everyone - I submitted this question previously but had no
> replies...thought I'd try my luck again with a cunningly disguised
> turned-about subject line.
>
> Regarding the sequence ID, the number between the handle and the
> filename in a DSpace bitstream URL:
>
> dspace url/bitstream/handle/sequence ID/filename
>
> can anyone tell me how the sequence ID number is generated by DSpace?
> Does it simply correspond to the sequence of bitstreams as outlined in
> the contents file?
>
> Thanks
>
> Gary
>
> Gary Browne
> Development Programmer
> Library IT Services
> University of Sydney
> Australia
> ph: 61-2-9351 5946