[Dspace-devel] [DuraSpace JIRA] Created: (DS-743) ItemImport fails on large batches with OutOfMemory exception

2010-11-05 Thread Graham Triggs (DuraSpace JIRA)
ItemImport fails on large batches with OutOfMemory exception


 Key: DS-743
 URL: https://jira.duraspace.org/browse/DS-743
 Project: DSpace
  Issue Type: Bug
Reporter: Graham Triggs
Priority: Major
 Fix For: 1.7.0


On large imports, ItemImport throws an OutOfMemory exception. This is 
because the context's cache is never cleared, so the Item objects that 
are created build up in memory.
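
The effect described above can be illustrated with a minimal, self-contained sketch. It does not use the real DSpace classes: SketchContext and BatchImportSketch are hypothetical stand-ins, and the clearing interval of 100 is an arbitrary assumption. It only shows why clearing a per-context cache periodically keeps memory bounded during a long import.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a context that caches every object it creates. */
final class SketchContext {
    private final Map<Integer, Object> cache = new HashMap<>();

    void cache(int id, Object o) { cache.put(id, o); }

    void clearCache() { cache.clear(); }

    int cacheSize() { return cache.size(); }
}

final class BatchImportSketch {
    /**
     * "Imports" itemCount items, clearing the cache after every clearEvery
     * items (0 means never clear). Returns the largest cache size observed.
     */
    static int importItems(SketchContext ctx, int itemCount, int clearEvery) {
        int peak = 0;
        for (int i = 1; i <= itemCount; i++) {
            ctx.cache(i, new Object()); // stands in for a newly created Item
            peak = Math.max(peak, ctx.cacheSize());
            if (clearEvery > 0 && i % clearEvery == 0) {
                ctx.clearCache(); // release references so they can be collected
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("peak without clearing: "
                + importItems(new SketchContext(), 10_000, 0));
        System.out.println("peak clearing every 100: "
                + importItems(new SketchContext(), 10_000, 100));
    }
}
```

Without clearing, the peak cache size grows with the size of the import; with periodic clearing it is capped at the clearing interval.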

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://jira.duraspace.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira



--
The Next 800 Companies to Lead America's Growth: New Video Whitepaper
David G. Thomson, author of the best-selling book "Blueprint to a 
Billion" shares his insights and actions to help propel your 
business during the next growth cycle. Listen Now!
http://p.sf.net/sfu/SAP-dev2dev
___
Dspace-devel mailing list
Dspace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-devel


[Dspace-devel] [DuraSpace JIRA] Created: (DS-742) Embargo Fails with Null Pointer Exception on Item Install.

2010-11-05 Thread Mark Diggory (DuraSpace JIRA)
Embargo Fails with Null Pointer Exception on Item Install.
--

 Key: DS-742
 URL: https://jira.duraspace.org/browse/DS-742
 Project: DSpace
  Issue Type: Bug
  Components: DSpace API
Affects Versions: 1.6.2, 1.6.1, 1.6.0, 1.7.0
Reporter: Mark Diggory
 Fix For: 1.7.0


Embargo fails because the code does not check for null values before using them.

Fix


DCDate result = null;

// It's very poor form to blindly use an object that could be null...
// Check the array length too, so terms[0] is never read from an empty array.
if (terms != null && terms.length > 0 && terms[0] != null)
{
    result = setter.parseTerms(context, item, terms[0].value);
}





Re: [Dspace-devel] Discovery Instances in DSpace 1.7

2010-11-05 Thread Graham Triggs
On 5 November 2010 19:55, Graham Triggs  wrote:

> There have been a few improvements in DSpace 1.7 recently. I just ran a
> test on my MacBook Pro. My local repository started with an existing 94072
> items already installed.
>
> Running the ItemImport command, over a period of 5 minutes, I was able to
> consistently observe ingest rates of between 8 and 12 items per second
> (minute intervals of 94722, 95060, 95550, 96249 and 96864 items installed).
> This is using Postgres based browse tables and a Lucene search index.
>
> Note that these were metadata only items, although not entirely random - if
> you take a look in DSpace trunk, I've added into an org.dspace.testing
> package a PubmedToImport class - which will use a SAX parser to spit out
> DSpace import format directories from a medline.xml file (you can easily
> generate a large file consisting of many thousands of items from
> http://www.ncbi.nlm.nih.gov). It's very rough around the edges, and it's
> not a complete mapping of the data, but it provides a decent amount of
> reasonably 'real world' test data very quickly.
>
>
I forgot to add: that is not an ItemImport-specific hack. This kind of
performance applies to web form submissions, item edits, deletions, and
SWORD deposits (although anyone expecting to fire items at the SWORD
server at that rate ought to bear in mind that the packaging format is going
to add overhead to the overall processing).

G


[Dspace-devel] [DuraSpace JIRA] Commented: (DS-164) Deposit Interface

2010-11-05 Thread Jim Ottaviani (DuraSpace JIRA)

[ 
https://jira.duraspace.org/browse/DS-164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17997#action_17997
 ] 

Jim Ottaviani commented on DS-164:
--

DGOC reviewed this on 2 Nov 2010, and while its status is not 100% clear to us, 
it doesn't appear to have progressed substantially since 2009's Google Summer 
of Code. Nevertheless, this would be a welcome change to the current functionality, 
which requires DSpace collection managers to have XML skills to deal with what 
can quickly become a very large and unwieldy input-forms.xml document. As a 
result, this is a barrier to new DSpace users, who often can't or don't customize 
as desired for fear of breaking this file (and thus bringing down submission 
capability altogether for their installation) with a single syntax error or 
missing bit of punctuation. We recommend this receive a boost in priority and, 
if possible, build on the GSoC work to implement it in a near-future release, 
since it would have a high impact on usability.

> Deposit Interface
> -
>
> Key: DS-164
> URL: https://jira.duraspace.org/browse/DS-164
> Project: DSpace
>  Issue Type: New Feature
>Reporter: Charles Kiplagat
>Priority: Major
>
> Suggestions included: a web interface for altering input-forms.xml, being 
> able to select an input form "on the fly" based on the type of item being 
> deposited, a web interface to the Configurable Submission System, eliminating 
> the need to restart the server after changes to input-forms.xml and the 
> Configurable Submission System, allowing more configuration (e.g. 
> input-forms.xml, Configurable Submission) and command-line actions (e.g. 
> batch imports) to be pushed down to community and collection administrators, 
> allowing metadata specific to an eperson (e.g. name, metadata fields to 
> exclude) to be stored in that eperson's profile. It was noted that the lack 
> of a web interface to many DSpace configuration files means that repository 
> managers who are not also systems administrators may not be able to configure 
> their installations fully.



Re: [Dspace-devel] Discovery Instances in DSpace 1.7

2010-11-05 Thread Graham Triggs
On 5 November 2010 16:10, Mark Diggory  wrote:

> Enabling Discovery as a separate webapplication is possible if we make
> the following concessions:
>
> 1.) Enable the "Discovery Consumer" by default in dspace.cfg and
> accept that all apps including the traditional xmlui use it.
> 2.) Create a second war project under modules called something like
> "xmlui-beta"
> 3.) copy the xmlui.xconf into xmlui-beta/src/main/webapp/WEB-INF
> 4.) Configure the discovery aspects there.
>
> This would give you a second XMLUI instance with discovery enabled
> within it and be usable in the testathon without having to do any of
> the installation steps.
>

I can see the value in having a Discovery enabled instance for testing, but
I stand by my point of the Discovery enabled applications being treated as
an entirely separate repository to that of the standard repository.

Enabling the Discovery consumer in both sets of web applications is skewing
the testing environment, and potentially giving us confusing data (if, for
example, there was a problem with the Discovery Consumer, it's useful to us
to have it isolated between the environments).

> Mark
>
> p.s. Peter, I think that we want to consider batch processing and the
> DiscoveryConsumer in changing to autocommit.  Ideally, we would seek a
> configuration that will optimize solr commits when processing a large
> number of items.  Note, until we get Browse completely out of the
> picture, we are stuck with that original problem with Browse
> interfering with batch loading scalability.
>
>
There have been a few improvements in DSpace 1.7 recently. I just ran a test
on my MacBook Pro. My local repository started with an existing 94072 items
already installed.

Running the ItemImport command, over a period of 5 minutes, I was able to
consistently observe ingest rates of between 8 and 12 items per second
(minute intervals of 94722, 95060, 95550, 96249 and 96864 items installed).
This is using Postgres based browse tables and a Lucene search index.

Note that these were metadata only items, although not entirely random - if
you take a look in DSpace trunk, I've added into an org.dspace.testing
package a PubmedToImport class - which will use a SAX parser to spit out
DSpace import format directories from a medline.xml file (you can easily
generate a large file consisting of many thousands of items from
http://www.ncbi.nlm.nih.gov). It's very rough around the edges, and it's not
a complete mapping of the data, but it provides a decent amount of
reasonably 'real world' test data very quickly.
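
For anyone wanting to reproduce the arithmetic, here is a small sketch that turns the item counts quoted above into per-minute and overall ingest rates. It assumes the starting count of 94072 was taken exactly one minute before the first reading and that the readings are exactly 60 seconds apart; the class and method names are made up for illustration.

```java
final class IngestRateSketch {
    /** Items per second between consecutive counts sampled intervalSeconds apart. */
    static double[] ratesPerSecond(long[] counts, int intervalSeconds) {
        double[] rates = new double[counts.length - 1];
        for (int i = 1; i < counts.length; i++) {
            rates[i - 1] = (counts[i] - counts[i - 1]) / (double) intervalSeconds;
        }
        return rates;
    }

    /** Average items per second across the whole run. */
    static double overallRate(long[] counts, int intervalSeconds) {
        long total = counts[counts.length - 1] - counts[0];
        return total / (double) (intervalSeconds * (counts.length - 1));
    }

    public static void main(String[] args) {
        // Starting count followed by the five one-minute readings from the message.
        long[] counts = {94072, 94722, 95060, 95550, 96249, 96864};
        for (double rate : ratesPerSecond(counts, 60)) {
            System.out.printf("%.2f items/s%n", rate);
        }
        System.out.printf("overall: %.2f items/s%n", overallRate(counts, 60));
    }
}
```

Under those assumptions, the run adds 2792 items in 300 seconds, which is roughly 9.3 items per second overall.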

G


[Dspace-devel] [DuraSpace JIRA] Assigned: (DS-741) Ability to store an incoming package as a file in the event that the ingest fails

2010-11-05 Thread Richard Rodgers (DuraSpace JIRA)

 [ 
https://jira.duraspace.org/browse/DS-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Rodgers reassigned DS-741:
--

Assignee: Sands Fish

> Ability to store an incoming package as a file in the event that the ingest 
> fails
> -
>
> Key: DS-741
> URL: https://jira.duraspace.org/browse/DS-741
> Project: DSpace
>  Issue Type: Improvement
>  Components: SWORD
>Affects Versions: 1.7.0
>Reporter: Sands Fish
>Assignee: Sands Fish
> Attachments: sword-fail-ingest.diff
>
>
> Provided by Bill Hays (wh...@mit.edu)...
> This patch adds the ability to store an incoming package as a file in the 
> event that the ingest fails and an exception is thrown.  Configuration for 
> this option comes from two new optional configuration lines in dspace.cfg:
> sword.keep-package-on-fail (true/false, where the default is false)
> sword.failed-package.dir   (directory location on file system)
> Both properties must be set for the new feature to take place.  If the first 
> is missing the default is false.  If the second is missing when the first is 
> true, a warn message is written to the logs and the original exception is 
> still thrown.
> =
> Documentation:
> For DSpace Manual:  [I only see properties listed, no general discussion 
> of features.]
> Property:  sword.keep-package-on-fail
> Example value:  false
> Informational note:  In the event of package ingest failure, provide an 
> option to store the package on the file system. The default is false.  File 
> names are not maintained in the SWORD protocol.  The new file name of the 
> package is in the form: 
> sword--.
> Property:  sword.failed-package.dir
> Example value: ${dspace.baseUrl}/upload
> Informational note:  Directory location where failed package files are 
> written.
> [SWORD README file in distribution does not discuss this type of DSpace 
> implementation detail.]



[Dspace-devel] [DuraSpace JIRA] Created: (DS-741) Ability to store an incoming package as a file in the event that the ingest fails

2010-11-05 Thread Sands Fish (DuraSpace JIRA)
Ability to store an incoming package as a file in the event that the ingest 
fails
-

 Key: DS-741
 URL: https://jira.duraspace.org/browse/DS-741
 Project: DSpace
  Issue Type: Improvement
  Components: SWORD
Affects Versions: 1.7.0
Reporter: Sands Fish
 Attachments: sword-fail-ingest.diff

Provided by Bill Hays (wh...@mit.edu)...

This patch adds the ability to store an incoming package as a file in the event 
that the ingest fails and an exception is thrown.  Configuration for this 
option comes from two new optional configuration lines in dspace.cfg:

sword.keep-package-on-fail (true/false, where the default is false)
sword.failed-package.dir   (directory location on file system)

Both properties must be set for the new feature to take effect.  If the first is 
missing, the default is false.  If the second is missing when the first is true, 
a warning is written to the logs and the original exception is still 
thrown.
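
The rules in the paragraph above can be summarised as a small decision function. This is only a sketch of the described behaviour, not the code in the attached patch: the class, enum and method names are hypothetical, and only the two property names come from the description.

```java
import java.util.Properties;

final class FailedPackageConfigSketch {
    /** What to do with the incoming package when an ingest fails. */
    enum Decision { DISABLED, WARN_MISSING_DIR, STORE_PACKAGE }

    static Decision decide(Properties cfg) {
        // A missing first property is treated as false, as described above.
        boolean keep = Boolean.parseBoolean(
                cfg.getProperty("sword.keep-package-on-fail", "false"));
        if (!keep) {
            return Decision.DISABLED;
        }
        String dir = cfg.getProperty("sword.failed-package.dir");
        if (dir == null || dir.trim().isEmpty()) {
            // Enabled but no directory: log a warning, store nothing.
            return Decision.WARN_MISSING_DIR;
        }
        return Decision.STORE_PACKAGE;
    }

    public static void main(String[] args) {
        Properties cfg = new Properties();
        cfg.setProperty("sword.keep-package-on-fail", "true");
        System.out.println(decide(cfg)); // directory not set yet
        cfg.setProperty("sword.failed-package.dir", "/tmp/failed-packages"); // example path
        System.out.println(decide(cfg));
    }
}
```

The sketch covers only the configuration check; writing the file and rethrowing the original exception are left out.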

=

Documentation:

For DSpace Manual:  [I only see properties listed, no general discussion 
of features.]

Property:  sword.keep-package-on-fail
Example value:  false
Informational note:  In the event of package ingest failure, provide an option 
to store the package on the file system. The default is false.  File names are 
not maintained in the SWORD protocol.  The new file name of the package is in 
the form: 
sword--.

Property:  sword.failed-package.dir
Example value: ${dspace.baseUrl}/upload
Informational note:  Directory location where failed package files are written.

[SWORD README file in distribution does not discuss this type of DSpace 
implementation detail.]



[Dspace-devel] Discovery Instances in DSpace 1.7

2010-11-05 Thread Mark Diggory
Sorry for missing the meeting.  Catching up on this thread...

[20:57]  PeterDietz -- just catching up here. maybe you
could enable two versions of XMLUI? One with Discovery and one
without? Not tried that myself, but it could be a way to promote the
differences between Discovery and normal XMLUI
[20:58]  PeterDietz -- what I mean is actually creating a
'http://demo.dspace.org/xmluiDisc/' path (or similar), to go alongside
the existing http://demo.dspace.org/xmlui/
[20:58]  Yes, would not like to see Discovery left out, but if
there can be only one then it should be the default one.
[20:59]  That might involve some tomcat trickery that
might give the user the misleading impression that they can run both
in tandem. It would be better to have separate instances, or to change
things up on people each day. Theme of Day One: Discover what's new
with DSpace, Theme of Day Two:
[20:59]  tdonohue_: yeah, i think the tricky bit we're
worried about is making sure users know it's switched off by default,
and that it may not be production-ready
[21:02]  discovery: who would be the one to configure a
separate instance? the one who has to do the work should probably
decide if it's feasible.
[21:03]  it would be possible to have the discovery
applications alongside the non-discovery applications... but the
non-discovery applications wouldn't update the discovery indexes if
you add/change an item (and possibly vice versa)... so, they would
have to be entirely separate 'repositories'
[21:04]  ...could emulate that by having a cron task run
every hour to keep the discovery-solr-index in sync
[21:05]  grahamtriggs -- that's an idea. Have a separate
testathon install with Discovery enabled -- and send people there for
testing of it
[21:06]  ...so, if anyone is going to host their own
testathon let me know. This would be easy to help people see the
difference between dspace-plain, and dspace + discovery
[21:09]  PeterDietz -- if there are no takers to setup
another testathon instance, technically we could just install a second
DSpace on demo.dspace.org, and enable Discovery in that second DSpace
only
[21:10]  i.e. you could still end up with a
demo.dspace.org/xmluiDisc/ path -- but it would actually point at an
entirely separate install of DSpace. Not sure if that would get way
too confusing for testers though (so, a separate site may be better,
if we have a volunteer)
[21:15]  also, I think I'll change the
update-discovery-index to do solr.search.autocommit. As I mentioned to
a few last week, you get a 10x performance boost for the initial
indexing
[21:16]  I'll leave it to hard solr.committing after every
update/delete so that those happen "for sure"

Enabling Discovery as a separate webapplication is possible if we make
the following concessions:

1.) Enable the "Discovery Consumer" by default in dspace.cfg and
accept that all apps including the traditional xmlui use it.
2.) Create a second war project under modules called something like "xmlui-beta"
3.) copy the xmlui.xconf into xmlui-beta/src/main/webapp/WEB-INF
4.) Configure the discovery aspects there.

This would give you a second XMLUI instance with discovery enabled
within it and be usable in the testathon without having to do any of
the installation steps.

Mark

p.s. Peter, I think that we want to consider batch processing and the
DiscoveryConsumer in changing to autocommit.  Ideally, we would seek a
configuration that will optimize solr commits when processing a large
number of items.  Note, until we get Browse completely out of the
picture, we are stuck with that original problem with Browse
interfering with batch loading scalability.

-- 
Mark R. Diggory
@mire - www.atmire.com
533 2nd Street - Encinitas, CA 92024 - USA
Technologielaan 9 - 3001 Heverlee - Belgium



Re: [Dspace-devel] Solr for 1.7

2010-11-05 Thread Mark Diggory
That's a relief.

On Thu, Nov 4, 2010 at 11:51 AM, TAYLOR Robin  wrote:
> Hi Sands,
>
> A fair question, but as it turns out it looks like the problem was that I had 
> a corrupt version of colt-1.2.0.jar in my local Maven repository, so Tomcat 
> repeatedly tried to load it without success. I deleted it and rebuilt Dspace 
> and I can now see a 'good' version so things look promising. Not sure why 
> this happened but I guess something had gone wrong with my previous Dspace 
> build.
>
> Cheers, Robin.
>
>
> Robin Taylor
> Main Library
> University of Edinburgh
> Tel. 0131 6513808
>
>> -Original Message-
>> From: Sands Alden Fish [mailto:sa...@mit.edu]
>> Sent: 04 November 2010 17:45
>> To: TAYLOR Robin
>> Cc: dspace-devel@lists.sourceforge.net
>> Subject: Re: [Dspace-devel] Solr for 1.7
>>
>> Robin, is it possible you just need to provide more memory to
>> your JVM?
>>
>>
>> --
>> sands fish
>> Senior Software Engineer
>> MIT Libraries
>> Technology Research & Development
>> sa...@mit.edu
>> E25-131
>>
>>
>>
>>
>>
>> On Nov 4, 2010, at 12:39 PM, TAYLOR Robin wrote:
>>
>>
>>       Can anyone help ?
>>
>>       I'm having trouble with my latest version of Dspace
>> installed from trunk. Solr is running out of Permgen memory
>> after about 15 mins. I suspect it is because it is
>> reinitialising itself every 10 secs approx. In the logs I see
>> this being repeated...
>>
>>       04-Nov-2010 15:50:37 org.apache.solr.core.SolrCore
>> registerSearcher
>>       INFO: [statistics] Registered new searcher searc...@a85af8 main
>>       04-Nov-2010 15:50:47
>> org.apache.catalina.loader.WebappClassLoader modified
>>       INFO:     Additional JARs have been added : 'colt-1.2.0.jar'
>>       04-Nov-2010 15:50:47
>> org.apache.catalina.core.StandardContext reload
>>       INFO: Reloading this Context has started
>>       04-Nov-2010 15:50:47 org.apache.solr.core.SolrCore close
>>       INFO: [search]  CLOSING SolrCore
>> org.apache.solr.core.solrc...@18654ae
>>       04-Nov-2010 15:50:47 org.apache.solr.core.SolrCore closeSearcher
>>       INFO: [search] Closing main searcher on request.
>>       04-Nov-2010 15:50:47
>> org.apache.solr.search.SolrIndexSearcher close
>>       INFO: Closing searc...@1b3bfc5 main
>>
>>
>>
>>       In case its relevant I'm on a Windows machine with
>> Tomcat 6.x and Java 1.6.
>>
>>       Any help appreciated.
>>
>>       Cheers, Robin.
>>
>>
>>       Robin Taylor
>>       Main Library
>>       University of Edinburgh
>>       Tel. 0131 6513808
>>       --
>>       The University of Edinburgh is a charitable body, registered in
>>       Scotland, with registration number SC005336.
>>
>>
>>
>>
>>
>>
>>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>
>



-- 
Mark R. Diggory
@mire - www.atmire.com
533 2nd Street - Encinitas, CA 92024 - USA
Technologielaan 9 - 3001 Heverlee - Belgium



[Dspace-devel] [DuraSpace JIRA] Commented: (DS-739) Improve performance of Lucene indexing

2010-11-05 Thread Graham Triggs (DuraSpace JIRA)

[ 
https://jira.duraspace.org/browse/DS-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17996#action_17996
 ] 

Graham Triggs commented on DS-739:
--

The various tasks (media-filter, indexing, itemimport) put DSIndexer into 
batch processing mode in all cases. The only variable is the value of the 
dspace.cfg property search.batch.documents, which controls the size of the 
batch (default is 20).

The delayed update for web applications is disabled by default, but can be 
enabled by adding the search.index.delay property to dspace.cfg.
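
As a rough illustration of the batch behaviour, here is a self-contained sketch. BatchIndexerSketch is a hypothetical stand-in, not DSIndexer itself; only the method name setBatchProcessingMode and the default batch size of 20 are taken from the issue description. It shows documents being held until the batch is full, and why batch mode has to be switched off at the end to flush the remainder.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchIndexerSketch {
    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    private final List<String> written = new ArrayList<>();
    private boolean batchMode = false;
    private int writeOperations = 0; // number of writes against the "index"

    BatchIndexerSketch(int batchSize) { this.batchSize = batchSize; }

    /** Turning batch mode off flushes any documents still waiting. */
    void setBatchProcessingMode(boolean on) {
        batchMode = on;
        if (!on) {
            flush();
        }
    }

    void index(String document) {
        pending.add(document);
        // Outside batch mode every document is written immediately.
        if (!batchMode || pending.size() >= batchSize) {
            flush();
        }
    }

    private void flush() {
        if (pending.isEmpty()) {
            return;
        }
        written.addAll(pending);
        pending.clear();
        writeOperations++;
    }

    int writtenCount() { return written.size(); }

    int writeOperations() { return writeOperations; }

    public static void main(String[] args) {
        BatchIndexerSketch indexer = new BatchIndexerSketch(20);
        indexer.setBatchProcessingMode(true);
        for (int i = 0; i < 45; i++) {
            indexer.index("doc-" + i);
        }
        System.out.println("written before final flush: " + indexer.writtenCount());
        indexer.setBatchProcessingMode(false); // flushes the last partial batch
        System.out.println("written after final flush: " + indexer.writtenCount());
        System.out.println("write operations: " + indexer.writeOperations());
    }
}
```

With 45 documents and a batch size of 20, two full batches are written during the loop and the remaining five only when batch mode is switched off.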

> Improve performance of Lucene indexing
> --
>
> Key: DS-739
> URL: https://jira.duraspace.org/browse/DS-739
> Project: DSpace
>  Issue Type: Improvement
>Reporter: Graham Triggs
>Assignee: Graham Triggs
> Fix For: 1.7.0
>
>
> Adds a batch processing mode for Lucene indexes.
> Can be controlled by calling DSIndexer.setBatchProcessingMode(boolean).
> NB: If you set batch processing mode to true, ensure that you set it to 
> 'false' at the end of the batch to flush any unwritten documents.
> The size of the batch can be controlled by setting a numeric value in 
> dspace.cfg for the property: search.batch.documents
> By default, the size of the batch is 20 documents.
> Additionally, there is the possibility to create a 'delayed index flusher'. 
> If a web application pushes multiple search requests (i.e. a barrage of SWORD 
> deposits, or multiple quick edits in the UI), then this will combine them 
> into a single index update (up to the limit of the batch defined above).
> To use the delayed update, set the property 'search.index.delay' in 
> dspace.cfg to the number of milliseconds to wait for an update. eg.
> search.index.delay = 5000
> will hold a Lucene update in a queue for up to 5 seconds. After 5 seconds - 
> or the batch limit above is reached - all waiting updates will be written to 
> the Lucene index.



[Dspace-devel] [DuraSpace JIRA] Commented: (DS-627) Documentation refers to install-configs script

2010-11-05 Thread Jeffrey Trimble (DuraSpace JIRA)

[ 
https://jira.duraspace.org/browse/DS-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17995#action_17995
 ] 

Jeffrey Trimble commented on DS-627:


We still have references to install-configs for configuration files that are 
not part of dspace.cfg.  For example, we refer to it in the section "Configuration 
Files for Other Applications", but no alternative is mentioned to 
replace it.  We need to make sure we have replacement instructions for scripts 
that are no longer used.

> Documentation refers to install-configs script
> --
>
> Key: DS-627
> URL: https://jira.duraspace.org/browse/DS-627
> Project: DSpace
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.6.2
>Reporter: Nick Nicholas
>Assignee: Jeffrey Trimble
> Fix For: 1.7.0
>
>
> Documentation continues to refer to [dspace]/bin/install-configs , which was 
> withdrawn three years ago: 
> http://www.mail-archive.com/dspace-t...@lists.sourceforge.net/msg04335.html
> In the pdf, this occurs 10 times: pp. 79, 152 (twice), 161, 162 (twice), 229, 
> 233 (twice), 234, 250.



[Dspace-devel] [DuraSpace JIRA] Resolved: (DS-628) Make the timeout for the extended resolver dnslookup configurable

2010-11-05 Thread Jeffrey Trimble (DuraSpace JIRA)

 [ 
https://jira.duraspace.org/browse/DS-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Trimble resolved DS-628.


Resolution: Fixed

> Make the timeout for the extended resolver dnslookup configurable
> -
>
> Key: DS-628
> URL: https://jira.duraspace.org/browse/DS-628
> Project: DSpace
>  Issue Type: Improvement
>  Components: DSpace API
>Affects Versions: 1.7.0
>Reporter: Claudia Jürgen
>Assignee: Jeffrey Trimble
> Fix For: 1.7.0
>
> Attachments: DnsLookup.java.patch, DnsLookup.java.patch
>
>
> This patch enables the configuration of the extended resolver's time out for 
> the dns lookup.
> A new configuration parameter solr.resolver.timeout (sorry for yet another 
> config param) is introduced and used to set the timeout in DnsLookup.java.
> For backward compatibility the timeout defaults to 20 milliseconds.
> OS timeouts vary between 2 and 5 seconds, from what I've learned so far.
> We did face an increasing number of solr errors (200- 900/day) due to 
> timeout, with the timeout being set to 20 milliseconds. Setting it to the 
> system defaults (5 seconds) got rid of most of these. Nowadays an average of 
> 2/day do occur.
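
A minimal sketch of reading such a setting with a backward-compatible default follows. It is not the code from the attached DnsLookup.java.patch: the class and method names are hypothetical, the value is assumed to be given in milliseconds, and falling back to the default on an unparsable or non-positive value is an additional assumption.

```java
import java.util.Properties;

final class ResolverTimeoutSketch {
    /** Default from the issue description: 20 milliseconds. */
    static final long DEFAULT_TIMEOUT_MS = 20;

    static long timeoutMillis(Properties cfg) {
        String raw = cfg.getProperty("solr.resolver.timeout");
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_TIMEOUT_MS;
        }
        try {
            long value = Long.parseLong(raw.trim());
            // Assumption: a non-positive timeout is treated as "not set".
            return value > 0 ? value : DEFAULT_TIMEOUT_MS;
        } catch (NumberFormatException e) {
            // Assumption: an unparsable value falls back to the default.
            return DEFAULT_TIMEOUT_MS;
        }
    }

    public static void main(String[] args) {
        Properties cfg = new Properties();
        System.out.println(timeoutMillis(cfg)); // property not set
        cfg.setProperty("solr.resolver.timeout", "5000"); // the 5 seconds mentioned above
        System.out.println(timeoutMillis(cfg));
    }
}
```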



[Dspace-devel] [DuraSpace JIRA] Updated: (DS-628) Make the timeout for the extended resolver dnslookup configurable

2010-11-05 Thread Jeffrey Trimble (DuraSpace JIRA)

 [ 
https://jira.duraspace.org/browse/DS-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Trimble updated DS-628:
---

Documentation Status: Complete or Committed  (was: Needed)

Documentation has been updated

> Make the timeout for the extended resolver dnslookup configurable
> -
>
> Key: DS-628
> URL: https://jira.duraspace.org/browse/DS-628
> Project: DSpace
>  Issue Type: Improvement
>  Components: DSpace API
>Affects Versions: 1.7.0
>Reporter: Claudia Jürgen
>Assignee: Jeffrey Trimble
> Fix For: 1.7.0
>
> Attachments: DnsLookup.java.patch, DnsLookup.java.patch
>
>
> This patch enables the configuration of the extended resolver's time out for 
> the dns lookup.
> A new configuration parameter solr.resolver.timeout (sorry for yet another 
> config param) is introduced and used to set the timeout in DnsLookup.java.
> For backward compatibility the timeout defaults to 20 milliseconds.
> OS timeouts vary between 2 and 5 seconds, from what I've learned so far.
> We did face an increasing number of solr errors (200- 900/day) due to 
> timeout, with the timeout being set to 20 milliseconds. Setting it to the 
> system defaults (5 seconds) got rid of most of these. Nowadays an average of 
> 2/day do occur.
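
The configuration lookup described above might look roughly like the sketch below. It is not the attached DnsLookup.java patch. The class ResolverTimeout and its methods are hypothetical, the value is assumed to be given in milliseconds, and java.util.Properties stands in for the DSpace configuration. Only the property name solr.resolver.timeout and the 20 millisecond default come from the issue.

```java
import java.util.Properties;

// Hypothetical helper for illustration; not the code from DnsLookup.java.patch.
final class ResolverTimeout {
    // Property name taken from the issue description.
    static final String KEY = "solr.resolver.timeout";
    // Backward-compatible default from the issue (assumed unit: milliseconds).
    static final int DEFAULT_MILLIS = 20;

    /** Returns the configured timeout in milliseconds, or the default if unset or invalid. */
    static int timeoutMillis(Properties config) {
        String raw = config.getProperty(KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_MILLIS;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            // Zero or negative values make no sense as a timeout; fall back to the default.
            return value > 0 ? value : DEFAULT_MILLIS;
        } catch (NumberFormatException e) {
            return DEFAULT_MILLIS;
        }
    }

    /** Splits milliseconds into {seconds, remaining milliseconds}, for APIs that want both. */
    static int[] split(int millis) {
        return new int[] { millis / 1000, millis % 1000 };
    }
}
```

With solr.resolver.timeout = 5000 this yields 5 seconds and 0 milliseconds; with the property missing or unparsable it falls back to 20 milliseconds, which keeps existing installations behaving as before.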



[Dspace-devel] [DuraSpace JIRA] Commented: (DS-640) Interal System Error when browsing with wrong argument

2010-11-05 Thread Jeffrey Trimble (DuraSpace JIRA)

[ 
https://jira.duraspace.org/browse/DS-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17993#action_17993
 ] 

Jeffrey Trimble commented on DS-640:


I'm not sure what documentation needs to be made.  It's a fix to a strange 
return on a 404.

> Interal System Error when browsing with wrong argument
> --
>
> Key: DS-640
> URL: https://jira.duraspace.org/browse/DS-640
> Project: DSpace
>  Issue Type: Bug
>  Components: Documentation, JSPUI
>Affects Versions: 1.5.2, 1.6.0, 1.6.1, 1.6.2
>Reporter: Hardik Mishra
>Assignee: Jeffrey Trimble
>Priority: Major
>
> On Browsing Items:
> If someone requests a browse type for which no browse index exists, 
> like browse=publisher
> e.g. http://dspace.webinito.com/browse?type 
> OR
> e.g. http://dspace.webinito.com/browse?type=xyz
> they will get an Internal System Error.



[Dspace-devel] [DuraSpace JIRA] Commented: (DS-739) Improve performance of Lucene indexing

2010-11-05 Thread Jeffrey Trimble (DuraSpace JIRA)

[ 
https://jira.duraspace.org/browse/DS-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17991#action_17991
 ] 

Jeffrey Trimble commented on DS-739:


So, for batch processing, what command does the user issue to make it happen?  
Is this in the command launcher?

> Improve performance of Lucene indexing
> --
>
> Key: DS-739
> URL: https://jira.duraspace.org/browse/DS-739
> Project: DSpace
>  Issue Type: Improvement
>Reporter: Graham Triggs
>Assignee: Graham Triggs
> Fix For: 1.7.0
>
>
> Adds a batch processing mode for Lucene indexes.
> Can be controlled by calling DSIndexer.setBatchProcessingMode(boolean).
> NB: If you set batch processing mode to true, ensure that you set it to 
> 'false' at the end of the batch to flush any unwritten documents.
> The size of the batch can be controlled by setting a numeric value in 
> dspace.cfg for the property: search.batch.documents
> By default, the size of the batch is 20 documents.
> Additionally, it is possible to create a 'delayed index flusher'. 
> If a web application pushes multiple search requests (i.e. a barrage of SWORD 
> deposits, or multiple quick edits in the UI), then this will combine them 
> into a single index update (up to the limit of the batch defined above).
> To use the delayed update, set the property 'search.index.delay' in 
> dspace.cfg to the number of milliseconds to wait for an update, e.g.
> search.index.delay = 5000
> will hold a Lucene update in a queue for up to 5 seconds. After 5 seconds, 
> or once the batch limit above is reached, all waiting updates will be 
> written to the Lucene index.
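
The two flushing rules described above (batch size reached, or delay elapsed) can be illustrated with a small self-contained sketch. It is not DSIndexer and does not touch Lucene. The class BatchingIndexSketch, its methods, and the explicit clock argument are all hypothetical, chosen so the behaviour is deterministic. Only setBatchProcessingMode(boolean), the default batch size of 20, and the property names in the comments come from the issue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an index writer; it only records what would be written.
final class BatchingIndexSketch {
    private final int batchSize;      // cf. search.batch.documents (20 by default per the issue)
    private final long delayMillis;   // cf. search.index.delay; 0 disables the delay rule
    private final List<String> pending = new ArrayList<>();
    private final List<List<String>> writes = new ArrayList<>(); // one entry per index write
    private boolean batchMode = false;
    private long oldestPendingAt = 0;

    BatchingIndexSketch(int batchSize, long delayMillis) {
        this.batchSize = batchSize;
        this.delayMillis = delayMillis;
    }

    /** Turning batch mode off flushes anything still queued, as the issue's NB requires. */
    void setBatchProcessingMode(boolean on) {
        batchMode = on;
        if (!on) {
            flush();
        }
    }

    /** Queues a document; the caller supplies the current time in milliseconds. */
    void add(String doc, long nowMillis) {
        if (pending.isEmpty()) {
            oldestPendingAt = nowMillis;
        }
        pending.add(doc);
        boolean writeImmediately = !batchMode && delayMillis <= 0;
        if (writeImmediately || pending.size() >= batchSize) {
            flush();
        }
    }

    /** Called periodically; flushes once the oldest queued document has waited long enough. */
    void tick(long nowMillis) {
        if (delayMillis > 0 && !pending.isEmpty()
                && nowMillis - oldestPendingAt >= delayMillis) {
            flush();
        }
    }

    private void flush() {
        if (!pending.isEmpty()) {
            writes.add(new ArrayList<>(pending));
            pending.clear();
        }
    }

    int writeCount() {
        return writes.size();
    }

    int documentsWritten() {
        int total = 0;
        for (List<String> batch : writes) {
            total += batch.size();
        }
        return total;
    }

    /** Batch import pattern: try/finally guarantees the final partial batch is flushed. */
    static BatchingIndexSketch importAll(List<String> docs, int batchSize) {
        BatchingIndexSketch index = new BatchingIndexSketch(batchSize, 0);
        index.setBatchProcessingMode(true);
        try {
            for (String doc : docs) {
                index.add(doc, 0L);
            }
        } finally {
            index.setBatchProcessingMode(false);
        }
        return index;
    }
}
```

The try/finally in importAll is the part worth copying into real batch code: if the loop throws, batch mode is still switched off, so documents already queued are not left unwritten.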
