[ https://jira.duraspace.org/browse/DS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=23074#comment-23074 ]
Richard Rodgers commented on DS-1073:
-------------------------------------
Hi - just a question. I took a quick look at the existing code, and that
maximum is supposed to apply only to items that are actually processed, not
to items that are merely 'visited'. If that is so, I'm confused about the
issue here: suppose there are 1000 items (always returned in the same order),
but most (say 990) have already been filtered by previous runs. The run should
skip those (and thus *not* add them to the 'processed' count) until it reaches
the 'last' 10. If you set a maximum of 20, it should handle them just fine.
The same will be true if 1000 becomes one million. But there could very well
be a bug in the implementation, so that it is not working as intended.
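
To make the distinction concrete, here is a minimal sketch of the two ways a
maximum could be counted. It is not the actual filter-media code: the class,
the method names, and the boolean-list model of "already filtered" items are
all hypothetical, chosen only to illustrate the 1000/990/20 example above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not DSpace code: each list entry is true when the
// item at that position was already filtered by a previous run.
class MaxFilterSketch {

    // Outcome of one simulated run.
    static final class Result {
        final int visited;    // items the loop looked at
        final int processed;  // items actually filtered in this run
        Result(int visited, int processed) {
            this.visited = visited;
            this.processed = processed;
        }
    }

    // Walks the items in a fixed order and stops once the chosen counter
    // reaches max. countVisited=false models the intended behavior
    // (only processed items count); countVisited=true models a cap that
    // is wrongly applied to every visited item.
    static Result run(List<Boolean> alreadyFiltered, int max, boolean countVisited) {
        int visited = 0;
        int processed = 0;
        for (int i = 0; i < alreadyFiltered.size(); i++) {
            int counted = countVisited ? visited : processed;
            if (counted >= max) {
                break;
            }
            visited++;
            if (alreadyFiltered.get(i)) {
                continue; // skipped: does not add to the processed count
            }
            alreadyFiltered.set(i, true);
            processed++;
        }
        return new Result(visited, processed);
    }

    // Builds 'total' items of which the first 'done' are already filtered.
    static List<Boolean> repository(int total, int done) {
        List<Boolean> items = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            items.add(i < done);
        }
        return items;
    }

    public static void main(String[] args) {
        Result intended = run(repository(1000, 990), 20, false);
        // prints "intended: visited=1000 processed=10"
        System.out.println("intended: visited=" + intended.visited
                + " processed=" + intended.processed);

        Result capOnVisited = run(repository(1000, 990), 20, true);
        // prints "cap on visited: visited=20 processed=0"
        System.out.println("cap on visited: visited=" + capOnVisited.visited
                + " processed=" + capOnVisited.processed);
    }
}
```

If the real code behaves like the second case, the reported symptom (new
items never being picked up) would follow directly; if it behaves like the
first, the fixed ordering should not matter.
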
> The maximum flag on filter-media is useless if results are returned in the
> same order every time
> ------------------------------------------------------------------------------------------------
>
> Key: DS-1073
> URL: https://jira.duraspace.org/browse/DS-1073
> Project: DSpace
> Issue Type: Bug
> Components: DSpace API
> Affects Versions: 1.8.0
> Reporter: Samuel Ottenhoff
> Attachments: DS-1073.patch
>
>
> Scenario: institution has a million PDFs on one server and needs to run
> filter-media every night. Institution only wants to run on 10k PDFs per night.
> There is a "-m" flag to set a maximum. But the results are returned in the
> same order every time, preventing new items from being picked up.
> Possible solutions:
> 1) Return items sorted by recently updated?
> 2) Return a random sort of elements instead of the same ones every time?