[Dspace-tech] Anyone played with G1 garbage collector in JDK7?

2014-07-01 Thread Alan Orth
Hey, all.

I was just looking over the DSpace tuning guide[0] and got to reading
about garbage collectors.  The Tomcat 7 packages on Ubuntu 12.04 and
14.04 both use the ConcMarkSweep (CMS) garbage collector, but I wonder
if we should be using Java 7's new G1 garbage collector instead: JDK7
has been out for a few years now and there are some impressive numbers
for G1GC[1][2].

I'm currently using the following JAVA_OPTS on my dev/production servers
where the repository has ~20,000 items:

-Djava.awt.headless=true -Xms1024m -Xmx2048m -XX:MaxPermSize=320m
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode

We're moving to newer servers soon (Linode, 6 CPUs, 8GB RAM, from a
pretty slow EC2 m1.medium), so we'll see much better performance anyway,
but I'll be experimenting with different parameters there as our
repository is becoming increasingly important and busy (~500,000 hits
per month).
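
When experimenting with collectors, it can help to confirm which one the JVM actually picked up. Here is a minimal sketch using the standard java.lang.management API; the class name ShowCollectors is just an illustration, not anything from DSpace or Tomcat. Run it with the same JAVA_OPTS as the servlet container (for example with -XX:+UseG1GC in place of the CMS flags) and compare the names it prints.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of DSpace or Tomcat.
class ShowCollectors {
    /** Returns the names of the garbage collectors the running JVM reports. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Print each collector with its cumulative collection count and time.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": "
                    + gc.getCollectionCount() + " collections, "
                    + gc.getCollectionTime() + " ms");
        }
    }
}
```

The same beans can be polled periodically to compare pause behaviour between settings, though GC logs give a more detailed picture.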

Cheers,

[0] https://wiki.duraspace.org/display/DSDOC4x/Performance+Tuning+DSpace
[1] http://blog.sematext.com/2013/06/24/g1-cms-java-garbage-collector/
[2] http://www.aioug.org/sangam12/Presentations/20155.pdf
-- 
Alan Orth
alan.o...@gmail.com
http://alaninkenya.org
http://mjanja.co.ke
"I have always wished for my computer to be as easy to use as my
telephone; my wish has come true because I can no longer figure out how
to use my telephone." -Bjarne Stroustrup, inventor of C++
GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0



--
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

[Dspace-tech] dc.date.available for embargoed Items with Advanced Embargo functionality

2014-07-01 Thread Christian Scheible
Hi all,

We are using DSpace 4.1 and I have a question regarding the Advanced
Embargo functionality.
When an item is installed,
org.dspace.content.InstallItem.populateMetadata(Context, Item) (line 164)
checks whether the item has an embargo. But it uses the old
EmbargoManager, so it will never find an embargo set through Advanced
Embargo.
As a result, dc.date.available is always the day the item was accepted.
Is this a bug or the expected behaviour?
In pre-3.x DSpace instances, dc.date.available is set to the
embargo lift date when the EmbargoLifter runs and an embargo is
lifted. I think that's how date available should be set.

Regards

-- 
Christian Scheible
Softwareentwickler / Abt. Content-basierte Dienste
Kommunikations-, Informations- und Medienzentrum (KIM)
Universität Konstanz
78457 Konstanz
+49 (0)7531 / 88-2857
Raum B 703




[Dspace-tech] Can't find resource 'solrconfig.xml'

2014-07-01 Thread James Stockdale
Hello,

I have successfully installed DSpace 4.1 on Ubuntu 12.04.4 LTS. I was able to 
create some communities and collections and  upload items, but when I try to 
browse or search, I see the following stack trace (see end of message).

I am using Jetty as the webserver and I copied all of the generated webapps 
into the Jetty webapps directory. I notice that the file that appears to be 
missing, 'solrconfig.xml' is present in the [dspace]/solr/search/conf 
directory. However, I'm not sure how the webapp is configured to locate this 
file. As far as I know, I faithfully followed the instructions at 
https://wiki.duraspace.org/display/DSDOC4x/Installing+DSpace

Please advise me! Thanks.

Stacktrace follows:

6789985 [1361154225@qtp-119948-7] ERROR org.apache.solr.servlet.SolrDispatchFilter  – null:org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Could not load config for solrconfig.xml
    at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:860)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:251)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
    at org.dspace.solr.filters.LocalHostRestrictionFilter.doFilter(LocalHostRestrictionFilter.java:50)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:926)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.apache.solr.common.SolrException: Could not load config for solrconfig.xml
    at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:592)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:657)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:701)
Caused by: java.io.IOException: Can't find resource 'solrconfig.xml' in classpath or 'solr/collection1/conf/', cwd=/var/cache/jetty/tmp
    at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:322)
    at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:287)
    at org.apache.solr.core.Config.<init>(Config.java:116)
    at org.apache.solr.core.Config.<init>(Config.java:86)
    at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:120)
    at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:589)
    ... 11 more
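
The last "Caused by" reports a relative directory ('solr/collection1/conf/') together with a working directory (cwd=/var/cache/jetty/tmp). As a rough illustration only, the sketch below resolves such a relative Solr home against a working directory and checks whether solrconfig.xml exists there. The class and method names are hypothetical, and this is not Solr's actual lookup code; it only mirrors the two values printed in the error message.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic helper; it does not reproduce Solr's real resource loader.
class SolrConfigLookup {
    /**
     * Resolves a (possibly relative) Solr home against a working directory
     * and returns the expected location of conf/solrconfig.xml for a core.
     */
    static Path configPath(String cwd, String solrHome, String core) {
        return Paths.get(cwd)
                .resolve(solrHome)   // an absolute solrHome replaces cwd entirely
                .resolve(core)
                .resolve("conf")
                .resolve("solrconfig.xml")
                .normalize();
    }

    public static void main(String[] args) {
        // Values taken from the error message above.
        Path candidate = configPath("/var/cache/jetty/tmp", "solr", "collection1");
        System.out.println(candidate + " exists: " + Files.exists(candidate));
    }
}
```

If the path printed here is not where the DSpace Solr cores were installed, that would be consistent with the webapp not having been told where its Solr home is.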



TEC Centre – NIMBUS
Cork Institute of Technology,
Bishopstown,
Cork.
james.stockd...@cit.ie


Re: [Dspace-tech] index-lucene-update

2014-07-01 Thread Tim Donohue
Hi Jose,

This is a bit confusing, because the error you show says the 
NullPointerException is on line 616 of DSIndexer.java.  In version 4.x 
of DSpace, that's this line (and NOT the "liveDocs" line you sent 
previously):

https://github.com/DSpace/DSpace/blob/dspace-4_x/dspace-api/src/main/java/org/dspace/search/DSIndexer.java#L616

Have you modified/customized the DSIndexer.java file in some way? Are 
you certain you've fully upgraded all your code to 4.x?

Based on the docs for the MultiFields.getLiveDocs() method though, it 
does also sound like we are missing a check for null:
https://lucene.apache.org/core/4_0_0/core/org/apache/lucene/index/MultiFields.html#getLiveDocs(org.apache.lucene.index.IndexReader)

Since getLiveDocs can return null, it's likely the fix here may be to 
change the 'if' to be:

if (liveDocs!=null && !liveDocs.get(i))

So, this sounds like it also may be a possible bug.
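
To illustrate the suggested null check in isolation, here is a small self-contained sketch. The Bits interface below is a hypothetical stand-in for org.apache.lucene.util.Bits so that the example compiles without Lucene, and deletedDocs is an invented helper, not the actual DSIndexer code. It treats a null liveDocs as "no deleted documents".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the DSpace DSIndexer implementation.
class LiveDocsCheck {
    /** Stand-in for org.apache.lucene.util.Bits (assumption, to avoid a Lucene dependency). */
    interface Bits {
        boolean get(int index);
    }

    /** Collects ids of deleted documents; a null liveDocs means nothing was deleted. */
    static List<Integer> deletedDocs(Bits liveDocs, int docCount) {
        List<Integer> deleted = new ArrayList<Integer>();
        for (int i = 0; i < docCount; i++) {
            // The null guard comes first, so get(i) is never called on null.
            if (liveDocs != null && !liveDocs.get(i)) {
                deleted.add(i);
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        // No deletions: liveDocs is null, so nothing is reported.
        System.out.println(deletedDocs(null, 3));   // prints []

        // Document 1 is marked as deleted.
        Bits oneDeleted = new Bits() {
            public boolean get(int index) {
                return index != 1;
            }
        };
        System.out.println(deletedDocs(oneDeleted, 3));   // prints [1]
    }
}
```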

- Tim


On 6/27/2014 10:40 AM, Jose Blanco wrote:
> I just ran index-lucene-init and it completed successfully, but when I
> try to run
>
> index-lucene-update, it blows up at where I have ():
>
>   */
>  public static void cleanIndex(Context context) throws IOException,
> SQLException {
>
>  IndexReader reader = DSQuery.getIndexReader();
>
>  Bits liveDocs = MultiFields.getLiveDocs(reader);
>
>  for(int i = 0 ; i < reader.numDocs(); i++)
>  {
> ()   if (!liveDocs.get(i))
>
>
> The error I'm getting is
>
> Exception: null
> java.lang.NullPointerException
>  at org.dspace.search.DSIndexer.cleanIndex(DSIndexer.java:616)
>
> I know reader is not null, but liveDocs must be, but why?
>
> -Jose
>



Re: [Dspace-tech] index-lucene-update

2014-07-01 Thread Jose Blanco
Tim,

Thanks for helping me out with this.  I am using 4.1 code, it's just
that I have some changes in there particular to our instance.  The
changes I have I don't think should affect this.  I put in the check
you suggested, and now I don't get the null exception but get a whole
bunch of these sorts of errors:

2014-07-01 12:35:34,764 ERROR org.dspace.search.DSIndexer @
java.io.FileNotFoundException:
/dspace/repository/dev/search/_6o8q_Lucene41_0.tip (Too many open
files)

I was getting these sorts of errors when I was running
index-lucene-init, so I changed these config parameters:

search.index.delay = -1
search.batch.documents = -1

and index-lucene-init completed successfully.

I'm a bit confused as to whether I need to run index-lucene-update as
a cron job or not.  When I load an item into my instance, the Lucene
index metadata is updated because I have this set (note "search" in
the list):

event.dispatcher.default.consumers = search, versioning, discovery, eperson, harvester

But as I understood it, when filter-media is run (version 3.x), the
full-text search is updated in Lucene.  So I guess this is what
index-lucene-update is supposed to do in version 4.1?

My plan for the release is to run index-lucene-init, and then set up a
cron job to run index-lucene-update after filter-media runs.
Does that make sense?

Thank you again!

Jose





Re: [Dspace-tech] index-lucene-update

2014-07-01 Thread Tim Donohue
Hi Jose,

On 7/1/2014 11:46 AM, Jose Blanco wrote:
> Tim,
>
> Thanks for helping me out with this.  I am using 4.1 code, it's just
> that I have some changes in there particular to our instance.  The
> changes I have I don't think should affect this.  I put in the check
> you suggested, and now I don't get the null exception but get a whole
> bunch of these sorts of errors:
>
> 2014-07-01 12:35:34,764 ERROR org.dspace.search.DSIndexer @
> java.io.FileNotFoundException:
> /dspace/repository/dev/search/_6o8q_Lucene41_0.tip (Too many open
> files)

You are running into a "Too many open files" error. This actually may be 
*unrelated* to the indexing method. It's telling you that your DSpace 
has too many open file descriptors...so, somewhere in your code it is 
opening files without closing them (or just opening too many files at 
once -- the default setting is 1024 on most OS's)

More info on "Too many open files" errors:
* http://stackoverflow.com/a/4289528
* http://stackoverflow.com/questions/2272908/java-io-filenotfoundexception-too-many-open-files
(Or google the error; there are other resources out there.)

The place where this error shows up is somewhat random...it's going to 
just show up in an area where you are accessing a file after running out 
of available file descriptors.

You can increase the maximum number of opened files on your system 
(ulimit -n), but chances are there may be an issue in some custom code 
you've written (or possibly in DSpace API itself -- though this is less 
likely unless others are hitting this error as well)
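
As a generic illustration of "opening files without closing them" (not taken from DSpace; the class and method names are made up), here is a minimal Java 7 sketch using try-with-resources. The reader is closed on every path, so calling the method far more times than the descriptor limit does not exhaust file descriptors.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class; it only demonstrates the closing pattern.
class CloseFiles {
    /** Reads the first line of a file; the reader is closed even if readLine() throws. */
    static String firstLine(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("fd-demo", ".txt");
        Files.write(tmp, "hello\n".getBytes(StandardCharsets.UTF_8));

        // Many more opens than a 1024-descriptor limit; each one is closed again.
        for (int i = 0; i < 5000; i++) {
            firstLine(tmp);
        }
        System.out.println(firstLine(tmp));   // prints hello

        Files.delete(tmp);
    }
}
```

If custom code opens readers or streams without an equivalent close (or finally block), each call leaks one descriptor until the process hits its limit.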

You might want to try stopping Tomcat and restarting it & then do the 
full index-lucene-init & index-lucene-update. If it errors again, it's 
possible that the index-lucene-init is somehow keeping files open. E.g. 
Does your search/browse *work* from the UI after an "init"? Or does 
searching from the UI also throw an error?

> I was getting these sort of errors when I was running
> index-lucene-init and I changed these config parameters:
>
> search.index.delay = -1
> search.batch.documents = -1
>
> and index-lucene-init completed successfully.
>
> I'm a bit confused as to whether I need to run index-lucene-update as
> a cron job or not.  When I load an item into my instance, the lucene
> index metadata is updated  because, I have this set ( the search in
> the list ):

index-lucene-update need not be run via cron job. The indexes should be 
updated automatically on each newly submitted item. That being said, 
I've heard some institutions schedule it via a cron job "just in case", 
but it shouldn't be needed.

index-lucene-update is mostly there to *update* your index if you change 
any settings (e.g. add new search fields, etc.). It's not necessary to 
run it on an ongoing basis. But it also shouldn't fail -- its failing is a 
sign that something is wrong in your index or in DSpace.

- Tim



Re: [Dspace-tech] index-lucene-update

2014-07-01 Thread Tim Donohue
Jose,

I forgot to check JIRA for this issue. :)

Looks like you may be hitting this:

https://jira.duraspace.org/browse/DS-1970

It's fixed in the upcoming 4.2 release (hoping to be released in mid-July)

- Tim




[Dspace-tech] database import error

2014-07-01 Thread Alain Tschanz
Hello,
I made a database dump of the dspace database (running on CentOS with 
PostgreSQL 9.1 and DSpace version 3.2) and am trying to import the dump into 
PostgreSQL 9.1 running on Windows Server 2008 R2 (with DSpace 4.1) but for some 
reason I get the following error message:

--
-- Data for Name: bi_2_dis; Type: TABLE DATA; Schema: public; Owner: dspace
--

COPY bi_2_dis (id, authority, value, sort_value) FROM stdin;
1  \N   Kean, Thomas H.  kean, thomas h.


ERROR:  syntax error at or near "1"
LINE 1851: 1 \N Kean, Thomas H. kean, thomas h.
   ^

** Error **

ERROR: syntax error at or near "1"
SQL state: 42601
Character: 38687

Alain Tschanz


Re: [Dspace-tech] index-lucene-update

2014-07-01 Thread Jose Blanco
Tim, can you point me to the DSIndexer.java that has this fixed, so I
can try it out?






Re: [Dspace-tech] index-lucene-update

2014-07-01 Thread Jose Blanco
I found the change in DSQuery.java.  Let me try it out and let you know.

Thank you!  Jose


--
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette


Re: [Dspace-tech] index-lucene-update

2014-07-01 Thread Jose Blanco
Just tried it and it's working now.  Thank you very much for taking
the time to help me with this!

-Jose

On Tue, Jul 1, 2014 at 3:50 PM, Jose Blanco wrote:
> I found the change in DSQuery.java.  Let me try it out and let you know.
>
> Thank you!  Jose
>
> On Tue, Jul 1, 2014 at 2:31 PM, Jose Blanco wrote:
>> Tim, can you point me to the DSIndexer.java that has this fixed, so I
>> can try it out.
>>
>>
>>
>> On Tue, Jul 1, 2014 at 1:53 PM, Tim Donohue wrote:
>>> Jose,
>>>
>>> I forgot to check JIRA for this issue. :)
>>>
>>> Looks like you may be hitting this:
>>>
>>> https://jira.duraspace.org/browse/DS-1970
>>>
>>> It's fixed in the upcoming 4.2 release (hoping to be released in mid-July)
>>>
>>> - Tim
>>>
>>> On 7/1/2014 12:45 PM, Tim Donohue wrote:
 Hi Jose,

 On 7/1/2014 11:46 AM, Jose Blanco wrote:
> Tim,
>
> Thanks for helping me out with this.  I am using 4.1 code, it's just
> that I have some changes in there particular to our instance.  The
> changes I have I don't think should affect this.  I put in the check
> you suggested, and now I don't get the null exception but get a whole
> bunch of these sorts of errors:
>
> 2014-07-01 12:35:34,764 ERROR org.dspace.search.DSIndexer @
> java.io.FileNotFoundException:
> /dspace/repository/dev/search/_6o8q_Lucene41_0.tip (Too many open
> files)

 You are running into a "Too many open files" error. This actually may be
 *unrelated* to the indexing method. It's telling you that your DSpace
 has too many open file descriptors...so, somewhere in your code it is
 opening files without closing them (or just opening too many files at
 once -- the default setting is 1024 on most OSes)
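
 As an illustration of the "opening files without closing them" problem
 described above, here is a minimal Java sketch. It is not taken from
 DSpace or from DS-1970; the class name CloseFilesSketch and the helper
 firstLine are hypothetical. It only shows the general pattern that
 avoids leaking descriptors: open the file in a try-with-resources block
 (available since JDK7) so it is closed even if an exception is thrown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseFilesSketch {

    // Hypothetical helper: returns the first line of a file.
    // The reader (and its file descriptor) is closed when the try block
    // exits, whether normally or through an exception.
    static String firstLine(Path file) throws IOException {
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("fd-sketch", ".txt");
        Files.write(tmp, "hello\n".getBytes(StandardCharsets.UTF_8));

        // Open the same file far more often than a typical 1024 limit.
        // Because every reader is closed, descriptors are not exhausted.
        for (int i = 0; i < 5000; i++) {
            firstLine(tmp);
        }
        System.out.println(firstLine(tmp));
        Files.delete(tmp);
    }
}
```

 If the reader were opened without the try-with-resources block and never
 closed, a loop like this could plausibly end in the same
 FileNotFoundException ("Too many open files") shown in the log.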

 More info on "Too many open files" errors:
 * http://stackoverflow.com/a/4289528
 * http://stackoverflow.com/questions/2272908/java-io-filenotfoundexception-too-many-open-files

 (Or google the error, there are other resources out there)

 The place where this error shows up is somewhat random...it's going to
 just show up in an area where you are accessing a file after running out
 of available file descriptors.

 You can increase the maximum number of opened files on your system
 (ulimit -n), but chances are there may be an issue in some custom code
 you've written (or possibly in DSpace API itself -- though this is less
 likely unless others are hitting this error as well)
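
 Before raising the limit, it can help to see how close the JVM actually
 is to it. The following is a rough sketch, assuming an Oracle/OpenJDK
 JVM on a Unix-like system, where the platform MXBean implements
 com.sun.management.UnixOperatingSystemMXBean. The class name
 FdUsageSketch and the method fdUsage are hypothetical, and on other
 JVMs the method simply returns null.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdUsageSketch {

    // Returns {open, max} file descriptor counts for this JVM process,
    // or null if the JVM does not expose them (e.g. non-Unix platforms).
    static long[] fdUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] usage = fdUsage();
        if (usage == null) {
            System.out.println("File descriptor counts not available on this JVM");
        } else {
            System.out.println("open=" + usage[0] + " max=" + usage[1]);
        }
    }
}
```

 An open count that keeps climbing toward the maximum during indexing
 would point to a leak rather than a limit that is merely too low.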

 You might want to try stopping Tomcat and restarting it & then do the
 full index-lucene-init & index-lucene-update. If it errors again, it's
 possible that the index-lucene-init is somehow keeping files open. E.g.
 Does your search/browse *work* from the UI after an "init"? Or does
 searching from the UI also throw an error?

> I was getting these sorts of errors when I was running
> index-lucene-init and I changed these config parameters:
>
> search.index.delay = -1
> search.batch.documents = -1
>
> and index-lucene-init completed successfully.
>
> I'm a bit confused as to whether I need to run index-lucene-update as
> a cron job or not.  When I load an item into my instance, the lucene
> index metadata is updated because I have this set (the search in
> the list):

 index-lucene-update need not be run via cron job. The indexes should be
 updated automatically on each newly submitted item. That being said,
 I've heard some institutions schedule it via a cron job "just in case",
 but it shouldn't be needed.

 index-lucene-update is mostly there to *update* your index if you change
 any settings (e.g. add new search fields, etc.). It's not necessary to
 run on any ongoing basis. But, it also shouldn't fail -- it failing is a
 sign that something is wrong in your index or in DSpace.

 - Tim
>>>