Re: [dspace-tech] handles, isreplacedby, and withdrawn items

2020-05-13 Thread Øyvind Gjesdal
I see the Jira issue is still open, so hopefully this is relevant for 
someone.

I just ran into a similar issue with dspace oai import -c on a DSpace 5 instance.

To locate the culprit item I got good help from an old Nabble thread on the 
dspace oai -c error, but our Discovery index returned no matches for items 
without handles, and the SQL query returned more than 1000 items. The OAI-PMH 
clean import completed once I had made the item without a DOI private, which 
sets discoverable to false and means the item is no longer harvestable 
through OAI. 

SELECT item_id FROM item
WHERE (in_archive=TRUE OR withdrawn=TRUE)
  AND discoverable=TRUE
  AND NOT EXISTS
    (SELECT resource_id FROM handle WHERE handle.resource_id = item.item_id);


I'm posting the query above in case someone hits a similar error when 
reindexing and needs to locate the item to fix the index.
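
For anyone who wants to try the query's logic without touching a live database, here is a rough sketch against an in-memory SQLite copy of just the columns involved. The rows are made up, and the resource_type_id column with the value 2 for items is my assumption about the DSpace 5 schema, not something stated in this thread. It also shows one possible caveat: the handle table holds handles for communities and collections too, so a collection that happens to share an id with an item could hide that item from the query unless the subquery also filters on the type.

```python
import sqlite3

ITEM_TYPE = 2  # assumption: resource_type_id used for items in DSpace 5


def items_without_handle(conn, item_type_only):
    """Return ids of discoverable archived/withdrawn items that lack a handle row."""
    # Optional extra condition so that only handle rows belonging to items count.
    type_filter = (
        f"AND handle.resource_type_id = {ITEM_TYPE}" if item_type_only else ""
    )
    sql = f"""
        SELECT item_id FROM item
        WHERE (in_archive = 1 OR withdrawn = 1)
          AND discoverable = 1
          AND NOT EXISTS
            (SELECT 1 FROM handle
             WHERE handle.resource_id = item.item_id {type_filter})
        ORDER BY item_id
    """
    return [row[0] for row in conn.execute(sql)]


# Toy stand-ins for the "item" and "handle" tables; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (item_id INTEGER PRIMARY KEY, in_archive INTEGER, "
    "withdrawn INTEGER, discoverable INTEGER)"
)
conn.execute(
    "CREATE TABLE handle (handle TEXT, resource_type_id INTEGER, resource_id INTEGER)"
)
conn.executemany(
    "INSERT INTO item VALUES (?, ?, ?, ?)",
    [
        (1, 1, 0, 1),  # archived item that still has its handle
        (2, 0, 1, 1),  # withdrawn item whose handle row was removed
        (3, 0, 1, 0),  # withdrawn and private: ignored by the query
    ],
)
conn.executemany(
    "INSERT INTO handle VALUES (?, ?, ?)",
    [
        ("123456789/1", 2, 1),  # handle of item 1
        ("123456789/2", 3, 2),  # handle of a collection that also has id 2
    ],
)

print(items_without_handle(conn, item_type_only=False))  # []
print(items_without_handle(conn, item_type_only=True))   # [2]
```

Whether the extra type filter matters in practice depends on whether ids overlap between items and other objects in a given instance, so treat it as something to check rather than a correction.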

I also noted that the SQL query used by the (OAI) clean index doesn't sort 
by date. We ran:

dspace oai import -c   # Failed after 12200 items
dspace oai import      # Ran successfully for 2000 items, probably because the
                       # failing item was older than the newest existing record
                       # in the index


This left us with an OAI index missing around 3000 items.
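
To make the suspected mechanism concrete, here is a small toy model. It is only a sketch of my guess above, not the actual XOAI code: it assumes the clean import walks the items in table order and stops at the first failure, and that the plain import only picks up items modified after the newest record already in the index. The item ids and dates are invented.

```python
from datetime import date


class IndexingError(Exception):
    """Stands in for the Solr failure on an item without a handle."""


def clean_import(items, bad_id):
    """Start from an empty index and add items in table order until one fails."""
    index = {}
    for item_id, modified in items:
        if item_id == bad_id:
            return index  # the real command aborts here with a stack trace
        index[item_id] = modified
    return index


def incremental_import(items, index, bad_id):
    """Add only items modified after the newest record already in the index."""
    newest = max(index.values())
    for item_id, modified in items:
        if modified <= newest:
            continue  # assumed to be indexed already, so it is skipped
        if item_id == bad_id:
            raise IndexingError(item_id)
        index[item_id] = modified
    return index


# (item_id, last modified) in table order, which is not date order.
items = [
    (1, date(2020, 1, 5)),
    (2, date(2020, 3, 1)),
    (3, date(2019, 6, 1)),   # the item without a handle
    (4, date(2019, 12, 1)),  # a perfectly good item that comes after it
    (5, date(2020, 4, 1)),
]
BAD = 3

index = clean_import(items, BAD)               # indexes 1 and 2, then stops
index = incremental_import(items, index, BAD)  # only picks up item 5
missing = [item_id for item_id, _ in items if item_id not in index]
print(sorted(index), missing)  # [1, 2, 5] [3, 4]
```

In this model item 4 is never indexed even though nothing is wrong with it, which matches the kind of gap we saw.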

I don't know if this has been fixed in later DSpace versions, but I notice 
the code has been refactored.

Best regards,
Øyvind

-- 
All messages to this mailing list should adhere to the DuraSpace Code of 
Conduct: https://duraspace.org/about/policies/code-of-conduct/
--- 
You received this message because you are subscribed to the Google Groups 
"DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to dspace-tech+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/dspace-tech/822f92f9-20e9-4b5b-8a70-7eee86056ed6%40googlegroups.com.


Re: [dspace-tech] handles, isreplacedby, and withdrawn items

2018-08-23 Thread Tim Donohue
Thanks for reporting back on your findings, Deborah.

It sounds like the act of pointing two handles at one item "works"...but,
removing the handle from the withdrawn item is the core issue.  I'm not
surprised that DSpace cannot fully manage Items without handles -- as
handles are still very "built into" DSpace.

I've logged this issue as a bug: https://jira.duraspace.org/browse/DS-3993

In the meantime, one option to possibly work around this issue would be to
define a "fake/dummy" handle for the withdrawn item.  For example, give it
a handle of "withdrawn/[old-id]" or something.  This isn't exactly ideal,
but if the only issue is that the Handle cannot be null, this might be a
possible workaround.
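
As a rough illustration of that idea (untested against a real DSpace, so please treat it as a sketch only), the following runs against an in-memory SQLite stand-in for the "item" and "handle" tables. The column names, the value 2 for the item resource type, and the choice of the item_id as the "[old-id]" part are all assumptions made for the example; a real "handle" table also has its own id column that would need a value.

```python
import sqlite3

ITEM_TYPE = 2  # assumption: resource_type_id used for items in DSpace 5


def add_placeholder_handles(conn):
    """Give each withdrawn item with no handle row a 'withdrawn/<item_id>' handle."""
    rows = conn.execute(
        "SELECT item_id FROM item WHERE withdrawn = 1 AND NOT EXISTS "
        "(SELECT 1 FROM handle WHERE handle.resource_type_id = ? "
        "AND handle.resource_id = item.item_id) ORDER BY item_id",
        (ITEM_TYPE,),
    ).fetchall()
    for (item_id,) in rows:
        conn.execute(
            "INSERT INTO handle (handle, resource_type_id, resource_id) "
            "VALUES (?, ?, ?)",
            (f"withdrawn/{item_id}", ITEM_TYPE, item_id),
        )
    return [item_id for (item_id,) in rows]


# Toy data: item 1 is live with a handle, item 2 is withdrawn without one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (item_id INTEGER PRIMARY KEY, withdrawn INTEGER)")
conn.execute(
    "CREATE TABLE handle (handle_id INTEGER PRIMARY KEY, handle TEXT UNIQUE, "
    "resource_type_id INTEGER, resource_id INTEGER)"
)
conn.executemany("INSERT INTO item VALUES (?, ?)", [(1, 0), (2, 1)])
conn.execute(
    "INSERT INTO handle (handle, resource_type_id, resource_id) VALUES (?, ?, ?)",
    ("123456789/1", ITEM_TYPE, 1),
)

fixed = add_placeholder_handles(conn)
print(fixed)  # [2]
```

Running the function a second time finds nothing left to fix, so it is safe to repeat on the toy data. Whether the OAI import accepts such a placeholder handle would still need to be tested on a dev server.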

Nonetheless, honestly, Claudia's fix seems very reasonable and it seems to
involve much less "messing around in the database".  So, I wouldn't fault
you for looking towards simply using that.

- Tim


RE: [dspace-tech] handles, isreplacedby, and withdrawn items

2018-08-22 Thread Fitchett, Deborah
Hi all,

I finally got around to trying this out on our dev server: updated the “handle” 
table to point two different handles to the one resource_id (and none to the 
withdrawn resource_id). It seemed in the first instance to work – following the 
links did what I expected.

Then I tried running the index-discovery -b job. That kept stopping 
mysteriously in the middle of it – no obvious errors in the solr or dspace 
logs, just stopped. (I don’t know, maybe our dev server’s just slow and ran out 
of memory or got distracted or something.) But running index-discovery with no 
options picked up where it left off, and after a couple of repetitions of this 
it finally indexed the item in question and all my search/browse tests worked 
as expected too.

So I was on the verge of declaring victory – and then I ran an oai import -c -v 
job. To my tremendous disappointment that failed partway through with:

Item with handle null indexed
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Document is missing mandatory uniqueKey field: item.handle
    at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
    at org.dspace.xoai.app.XOAI.index(XOAI.java:213)
    at org.dspace.xoai.app.XOAI.indexAll(XOAI.java:200)
    at org.dspace.xoai.app.XOAI.index(XOAI.java:131)
    at org.dspace.xoai.app.XOAI.main(XOAI.java:495)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
    at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)

This makes sense in retrospect: the OAI feed includes withdrawn items (in order 
to publish a “record status: deleted” record) but identifies them by their 
handles (so obviously requires a handle).
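
A toy model of that failure, for illustration only: the index below is a plain dictionary keyed on item.handle, standing in for Solr's uniqueKey check. The exception class, the function names and the item.deleted field name are made up for the sketch and are not taken from the DSpace or Solr code.

```python
class MissingUniqueKey(Exception):
    """Stand-in for Solr's 'Document is missing mandatory uniqueKey field' error."""


def to_oai_doc(item):
    """Build a toy index document; withdrawn items are kept and flagged as deleted."""
    return {"item.handle": item.get("handle"), "item.deleted": item["withdrawn"]}


def add_to_index(index, doc, unique_key="item.handle"):
    """Store the document under its unique key, refusing documents without one."""
    if doc.get(unique_key) is None:
        raise MissingUniqueKey(unique_key)
    index[doc[unique_key]] = doc


items = [
    {"handle": "123456789/10", "withdrawn": False},  # the replacement item
    {"handle": None, "withdrawn": True},             # withdrawn item, handle removed
]

index, failed = {}, []
for item in items:
    try:
        add_to_index(index, to_oai_doc(item))
    except MissingUniqueKey:
        failed.append(item)

print(list(index), len(failed))  # ['123456789/10'] 1
```

The withdrawn item still has to be written to the index so that it can be reported as deleted, and without a handle there is nothing to key it on.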

This is highly disappointing, but at least now we know. It looks like it would 
work for a repository which didn’t publish an OAI feed, but sadly that’s vital 
for us.

Our fall-back will be to follow Claudia’s suggestion of adjusting the Item 
Withdrawn page and using the metadata to point to the new item.

Deborah


From: Tim Donohue 
Sent: Friday, 15 June 2018 2:42 AM
To: Fitchett, Deborah 
Cc: dspace-tech@googlegroups.com
Subject: Re: [dspace-tech] handles, isreplacedby, and withdrawn items

Hi Deborah,

I'll admit, I've never tried this myself, but your suggestion to simply update 
the old "handle" table entries to point at the new "resource_id" seems like it 
*should work*.  The "handle" table in DSpace is really just used to 
resolve/assign Handles to Objects.  So, at least conceptually, it should 
support pointing two handles at the same object (Item).

That said, I'd recommend first trying this out on a test or development server. 
 I think it should work, but it'd be worth testing more thoroughly how DSpace 
behaves when one Item object has multiple Handles (and for example, whether 
both handles appear on the Item splash page, etc).  I'd recommend testing basic 
functionality like browse/search/reindex. I suspect they all should work, but 
as this isn't a documented feature, it's worth double checking.

Let us know how it goes (please report back on this list), as this seems like 
it might be of interest to others.

- Tim


On Wed, Jun 13, 2018 at 11:51 PM Fitchett, Deborah 
<deborah.fitch...@lincoln.ac.nz> wrote:
Kia ora all,

We’re occasionally getting duplicate records created in DSpace with no way to 
resolve the issue other than to withdraw the earlier record and go forward with 
the more recent one. (1)

But of course we don’t really want the handle of the earlier record to result 
in a dead-end – we’d like it to resolve to the new record, or at least be 
redirected there.

We have metadata fields dc.relation.isreplacedby and dc.relation.replaces. 
These seem to have no functional purpose in DSpace (i.e. it doesn’t do any 
automatic redirects); they’re just information. If the item is *not* withdrawn, I 
could add some JavaScript to the page to accomplish the redirect that way. But 
I’m not sure it’s the cleanest way.

I’m looking at the handle table in the database and wondering – what if I 
simply find the row with the handle that’s currently linked to the