I hypothesise that retrieving the ResourcePolicy for each of the 359
Collections and then determining if a user is a member of the Group
associated with the Collection is why there is a long delay in
submissions from the MyDSpace interface.
George could insert log.info() calls into the SubmitServlet.java file at
line 306 and after line 309, then re-compile and re-deploy. These calls
would log the start and end times of the operation, both for users who
are administrators and for users who are not, to help determine whether
this is the reason for the long delays. If the logs show a clear
difference in elapsed time, the log calls can be moved further down the
call tree to determine exactly where most of the computation occurs.
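A minimal sketch of the kind of timing call described above. The class
SubmitTiming and the method timed() are hypothetical names, not part of
DSpace, and the sketch uses java.util.logging and lambdas from a current
JDK rather than the logger already present in SubmitServlet. In the servlet
itself the same effect comes from recording a timestamp before the call and
logging the difference after it.

```java
import java.util.function.Supplier;
import java.util.logging.Logger;

// Hypothetical helper, not part of DSpace: wraps a call and logs how long it took.
class SubmitTiming {
    private static final Logger log = Logger.getLogger(SubmitTiming.class.getName());

    /** Runs the operation, logs its elapsed wall-clock time, and returns its result. */
    static <T> T timed(String label, boolean isAdmin, Supplier<T> operation) {
        long start = System.currentTimeMillis();
        try {
            return operation.get();
        } finally {
            long elapsed = System.currentTimeMillis() - start;
            // Logging the admin flag lets the two groups of users be compared later.
            log.info(label + " admin=" + isAdmin + " took " + elapsed + " ms");
        }
    }

    public static void main(String[] args) {
        // Stand-in for the Collection.findAuthorized(context) call in SubmitServlet.
        String[] collections = timed("findAuthorized", false,
                () -> new String[] {"collection-a", "collection-b"});
        System.out.println(collections.length + " collections returned");
    }
}
```

Comparing the logged times for an administrator and a non-administrator
would show whether the difference really arises inside this call.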
The SubmitServlet adds a Collection[] to the select-collection step of
the submission process [-1] line 308.
This Collection[] is created through a call to the
Collection.findAuthorized(context) static method [0] line 1150.
The findAuthorized() method calls the findAll() method, [0] line 273,
which returns a list of all Collections in the collection database table.
This should not be an expensive operation, since it queries the database
and retrieves only 359 rows.
A line inside the findAll() method block, line 285, checks whether the
Collection object to be added to the Collection[] is already cached in
memory. This should not be an expensive operation either, since the
fromCache() method looks the Collection object up in a HashMap, which
offers O(1) access, [1] line 289.
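To illustrate why the cache check is cheap, here is a minimal sketch of a
HashMap-backed object cache. The class ObjectCacheSketch and the key format
(class name followed by the id) are assumptions made for illustration. The
real lookup lives in Context.java [1] and may differ in detail.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the object cache held by Context.
class ObjectCacheSketch {
    private final Map<String, Object> cache = new HashMap<>();

    /** Stores an object under a key built from its class name and id. */
    void cache(Object object, int id) {
        cache.put(object.getClass().getName() + id, object);
    }

    /** Returns the cached object, or null on a miss; one hash lookup either way. */
    Object fromCache(Class<?> type, int id) {
        return cache.get(type.getName() + id);
    }

    public static void main(String[] args) {
        ObjectCacheSketch sketch = new ObjectCacheSketch();
        sketch.cache("a cached collection", 7);
        System.out.println(sketch.fromCache(String.class, 7) != null); // hit
        System.out.println(sketch.fromCache(String.class, 8) != null); // miss
    }
}
```

Each lookup costs a single hash computation regardless of how many objects
are cached, so 359 lookups should be negligible next to a database query.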
The returned Collection[] is then examined, one Collection object at a
time, to determine whether the current user has authorisation to write to
that Collection. The AuthorizeManager class method authorizeBooleanAction()
[2] line 217 calls authorizeAction() [2] line 131, which in turn calls
authorize() [2] line 258.
The authorize() method in AuthorizeManager retrieves a ResourcePolicy
and determines if the user is authorized as per the ResourcePolicy to
have access to the collection.
Users who are admins do not have to step through the ResourcePolicy
stage. George states that he suffers a delay when trying to submit from
the MyDSpace page but that it is a shorter delay than what others are
experiencing: 30s - 60s vs 60s - 300s. This is the only stage in the
process where I can find a difference between Admins and Users.
Having to retrieve the ResourcePolicy for each of the 359 collection
objects and determine if a user is part of that ResourcePolicy could be
the choke point.
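The suspected cost can be sketched as a toy model that counts policy
retrievals. Everything here is hypothetical: AuthorizedCollectionsSketch,
policyAllows() and its arbitrary matching rule stand in for the real
AuthorizeManager logic, and the counter replaces what would be database
work in DSpace.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the per-collection authorisation check; not DSpace code.
class AuthorizedCollectionsSketch {
    /** Counts simulated ResourcePolicy retrievals so the two paths can be compared. */
    static int policyLookups = 0;

    /** Stand-in for retrieving a ResourcePolicy and testing group membership. */
    static boolean policyAllows(int collectionId, int userId) {
        policyLookups++;
        return collectionId % 10 == userId % 10; // arbitrary rule for the demo
    }

    /** Returns the ids of the collections the user may submit to. */
    static List<Integer> findAuthorized(int collectionCount, int userId, boolean isAdmin) {
        List<Integer> authorized = new ArrayList<>();
        for (int id = 1; id <= collectionCount; id++) {
            // Admins short-circuit here and never reach the policy lookup.
            if (isAdmin || policyAllows(id, userId)) {
                authorized.add(id);
            }
        }
        return authorized;
    }

    public static void main(String[] args) {
        findAuthorized(359, 3, true);
        System.out.println("admin policy lookups: " + policyLookups);
        policyLookups = 0;
        findAuthorized(359, 3, false);
        System.out.println("non-admin policy lookups: " + policyLookups);
    }
}
```

In this model a non-admin triggers one policy lookup per collection while
an admin triggers none. If each real lookup involves database access, that
difference would be consistent with the delays reported, though only the
timing logs can confirm it.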
I appreciate any thoughts on my analysis of what could be causing the
problem that George is experiencing.
[-1]
http://dspace.svn.sourceforge.net/viewvc/dspace/branches/dspace-1_4_x/dspace/src/org/dspace/app/webui/servlet/SubmitServlet.java?view=markup
[0]
http://dspace.svn.sourceforge.net/viewvc/dspace/branches/dspace-1_4_x/dspace/src/org/dspace/content/Collection.java?view=markup
[1]
http://dspace.svn.sourceforge.net/viewvc/dspace/branches/dspace-1_4_x/dspace/src/org/dspace/core/Context.java?view=markup
[2]
http://dspace.svn.sourceforge.net/viewvc/dspace/branches/dspace-1_4_x/dspace/src/org/dspace/authorize/AuthorizeManager.java?view=markup
--
Desmond Elliott | Hewlett-Packard Limited registered Office:
Research Associate| Cain Road,
HP Labs | Bracknell,
Bristol, UK | Berks
+44 117 312 8526 | RG12 1HN.
[EMAIL PROTECTED]| Registered No: 690597 England
George Kozak wrote:
Hi...
Back in January, I wrote to the list about a problem we are having
(at Cornell) with DSpace after we migrated from v. 1.3.2 to v. 1.4.2.
Basically, if someone logs into DSpace by clicking on My DSpace and
then clicks on the Start a New Submission button, there can be a
delay of anywhere from 30 seconds to several minutes before the
Submit: Choose Collection screen appears. Clicking on the Start a
New Submission button in My DSpace can run the CPU usage up over
40%. This doesn't happen if one goes directly to the Collection to
which he or she is authorized to submit and clicks on the Submit to
this Collection button there.
Randall Floyd of Indiana University thought that this might be a
cleanup program in PostgreSQL, but I run vacuumdb nightly and do a
re-index daily.
Claudia Jurgen of TU Dortmund suggested that this may be the result
of a complicated Community/SubCommunity/Collection structure (we have
118 Communities/Sub-Communities and 359 Collections).
I have been looking at alternatives. I have put a message on the My
DSpace main page telling people to