Re: [dspace-tech] Re: [Dspace-tech] problem wth pdf thumbnail creation - imagemagick

2018-01-30 Thread Terry Brady
Try the following commands to check that your configuration is pointing to
the right installation location.

> grep ProcessStarter [dspace-install]/config/dspace.cfg
org.dspace.app.mediafilter.ImageMagickThumbnailFilter.ProcessStarter =
/usr/bin
> /usr/bin/convert -version
Version: ImageMagick 6.7.2-7 2017-01-12 Q16 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2011 ImageMagick Studio LLC
Features: OpenMP
>  /usr/bin/gs -version
GPL Ghostscript 8.70 (2009-07-31)
Copyright (C) 2009 Artifex Software, Inc.  All rights reserved.
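
If you want to repeat the check on several servers, the same idea can be wrapped in a small script. This is only a sketch: check_tool_dir is a made-up helper (not part of DSpace), and it assumes the directory you pass in is the value of your ProcessStarter setting.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSpace): report whether a directory
# contains executable "convert" and "gs" binaries.
check_tool_dir() {
    dir="$1"
    status=0
    for tool in convert gs; do
        if [ -x "$dir/$tool" ]; then
            echo "OK: $dir/$tool"
        else
            echo "MISSING: $dir/$tool"
            status=1
        fi
    done
    return "$status"
}

# Example: check the directory configured as ProcessStarter (assumed /usr/bin).
check_tool_dir /usr/bin || echo "Fix the ProcessStarter path in dspace.cfg"
```

A MISSING line for convert would be consistent with the FileNotFoundException in the original report, where ProcessStarter pointed at /root/software/ImageMagick.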

Terry

On Tue, Jan 30, 2018 at 12:38 PM, Radha Sankara 
wrote:

> Hi all,
>
>
> We are getting the same error too. How did you fix it?
>
> Thanks
>
>
> On Thursday, December 21, 2017 at 4:51:08 PM UTC-5, Jose Villanueva wrote:
>>
>>
>> I am using DSpace 5.7 and I get the same error when I try to generate a
>> thumbnail.
>> How did you solve it?
>>
>>
>> On Wednesday, August 26, 2015 at 13:43:48 (UTC-5), dspace user wrote:
>>>
>>> Hello,
>>>
>>> I am using the DSpace 5.2 JSPUI on CentOS 6.6. I have installed
>>> ImageMagick and Ghostscript and made the configuration changes for
>>> generating thumbnails. When I run the dspace filter-media command, I get
>>> the error below:
>>>
>>> ERROR filtering, skipping bitstream:
>>>
>>> Item Handle: 123456789/29
>>> Bundle Name: ORIGINAL
>>> File Size: 589
>>> Checksum: 9b68bef4ddfea5ae8c66b089f892d3f2 (MD5)
>>> Asset Store: 0
>>> org.im4java.core.CommandException: java.io.FileNotFoundException:
>>> convert
>>> org.im4java.core.CommandException: java.io.FileNotFoundException:
>>> convert
>>> at org.im4java.core.ImageCommand.run(ImageCommand.java:219)
>>>
>>>
>>> The path is configured as follows:
>>>  org.dspace.app.mediafilter.ImageMagickThumbnailFilter.ProcessStarter =
>>> /root/software/ImageMagick
>>>
>>>
>>> Regards.
>>>
>> --
> You received this message because you are subscribed to the Google Groups
> "DSpace Technical Support" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to dspace-tech+unsubscr...@googlegroups.com.
> To post to this group, send email to dspace-tech@googlegroups.com.
> Visit this group at https://groups.google.com/group/dspace-tech.
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology
https://github.com/terrywbrady/info
425-298-5498 (Seattle, WA)



[dspace-tech] handle server

2018-01-30 Thread Radha Sankara
Hi,
We have configured the handle server.
On creating a new record, the dc.identifier.uri shows as
http://server:8080/jspui/handle/123456/23512


How can we remove the port number from the dc.identifier.uri?

Thanks



[dspace-tech] Re: [Dspace-tech] problem wth pdf thumbnail creation - imagemagick

2018-01-30 Thread Radha Sankara
Hi all,


We are getting the same error too. How did you fix it?

Thanks

On Thursday, December 21, 2017 at 4:51:08 PM UTC-5, Jose Villanueva wrote:
>
>
> I am using DSpace 5.7 and I get the same error when I try to generate a
> thumbnail.
> How did you solve it?
>
>
> On Wednesday, August 26, 2015 at 13:43:48 (UTC-5), dspace user wrote:
>>
>> Hello, 
>>
>> I am using the DSpace 5.2 JSPUI on CentOS 6.6. I have installed
>> ImageMagick and Ghostscript and made the configuration changes for
>> generating thumbnails. When I run the dspace filter-media command, I get
>> the error below:
>>
>> ERROR filtering, skipping bitstream:
>>
>> Item Handle: 123456789/29
>> Bundle Name: ORIGINAL
>> File Size: 589
>> Checksum: 9b68bef4ddfea5ae8c66b089f892d3f2 (MD5)
>> Asset Store: 0
>> org.im4java.core.CommandException: java.io.FileNotFoundException: convert
>> org.im4java.core.CommandException: java.io.FileNotFoundException: convert
>> at org.im4java.core.ImageCommand.run(ImageCommand.java:219)
>>
>>
>> The path is configured as follows:
>>  org.dspace.app.mediafilter.ImageMagickThumbnailFilter.ProcessStarter = 
>> /root/software/ImageMagick
>>
>>
>> Regards.
>>
>



Re: [dspace-tech] AIP backup timing out

2018-01-30 Thread Gary Browne
Thanks a lot Tim, I’ll give that a go!

Gary Browne | Technical Manager, Developments
Online Services
University of Sydney Library
THE UNIVERSITY OF SYDNEY
Level 1, Fisher Library F03, The University of Sydney NSW 2006
T +61 2 9351 5946 | M +61 405 647 868
E 
gary.bro...@sydney.edu.au
Sent from my plain old desktop computer

From:  on behalf of Tim Donohue 

Date: Wednesday, 31 January 2018 at 1:30 am
To: Gary Browne 
Cc: DSpace Technical Support 
Subject: Re: [dspace-tech] AIP backup timing out

Hi Gary,

Perhaps bump up the memory available to that task?  By default the "dspace" 
commandline script will use the settings in your JAVA_OPTS (if set).  If it is 
*not* set, then it defaults to only 256MB of memory.  See this line here:
https://github.com/DSpace/DSpace/blob/master/dspace/bin/dspace#L43

To override this, you should be able to simply set JAVA_OPTS to some higher 
memory values.  Some examples are here:
https://wiki.duraspace.org/display/DSDOC6x/Performance+Tuning+DSpace#PerformanceTuningDSpace-GivetheCommandLineToolsMoreMemory

FWIW, I've run this AIP backup on sites in the 300GB+ range, so it should work 
fine at the range you are talking about.  But, I always have to provide more 
memory -- usually more like 1GB at least.

You also might want to look closer at your DSpace log files to see if any other 
errors are reported there prior to the server timeout.  My best guess here is 
you may be hitting memory issues, but it could be something else entirely.

- Tim

On Tue, Jan 30, 2018 at 5:44 AM Gary Browne <gary.bro...@sydney.edu.au> wrote:
Hi all,

JSPUI
DSpace 4.1
Tomcat 7
Apache 2.2
RHEL 6.9

I'm trying to do an AIP export of the entire site - but it keeps timing out 
after about 20GB of exports.

The command goes something like this:

./dspace packager -d -a -t AIP -e em...@email.com -i 
2123/0 /tmp/repo-dump/repo.zip

The dump begins ok but after some time, I get:

Timeout, server not responding.

And the program dies. This happens after about 20GB of exports, but I have 
about 450GB to export. Any suggestions, other than doing each community one by 
one :(

Thanks a lot,
Gary
--
Tim Donohue
Technical Lead for DSpace & DSpaceDirect
DuraSpace.org | DSpace.org | DSpaceDirect.org


[dspace-tech] OAI harvest

2018-01-30 Thread LinKeRo GM


Hello, how can I do an OAI harvest? A data-harvesting company is asking me
to send them a file with the metadata, but I don't know which file that is
or how to generate it. I hope you can help me. Many thanks in advance.



Re: [dspace-tech] Database error when trying to edit epersons (oracle - ORA-00932: inconsistent datatypes: expected -

2018-01-30 Thread Donald Bynum
Tim,

That was it.  Just made those two updates and rebuilt.  All looks good 
again.

Thank you for your assistance with this.  Very much appreciated.

-Don.

On Monday, January 29, 2018 at 3:15:21 PM UTC-5, Tim Donohue wrote:
>
> Hi Don,
>
> This stacktrace tells me that the issue you are having is *not* the same 
> as the issue that Francis is having (despite the similar error 
> messages). 
>
> Based on the error stack below, the error you see is coming from the 
> "EPerson.searchResultCount()" method, specifically the SQL on this line:
>
> https://github.com/DSpace/DSpace/blob/dspace-5_x/dspace-api/src/main/java/org/dspace/eperson/EPerson.java#L438
>  
>
> However, the error reported by Francis (and described in DS-3649: 
> https://jira.duraspace.org/browse/DS-3649), was coming from the 
> "EPerson.findAll()" method on this line: 
> https://github.com/DSpace/DSpace/blob/dspace-5_x/dspace-api/src/main/java/org/dspace/eperson/EPerson.java#L518
>
> So, I'm not surprised that the fix Tom provided won't work on your system, 
> as it fixes the latter method and not the former one.
>
> It looks to me like your error is resulting from a similar/related problem 
> though, that a CLOB data type cannot be used in "comparison conditions" 
> [1], and the SQL in that "searchResultCount()" method uses LOWER() on a 
> CLOB data type.
>
> So, I think, based on Tom's previous recommendation, you'd need to replace 
> the "text_value" fields with "dbms_lob.substr(text_value, 0, 4000)" on 
> these two lines:
> * Select statement here: 
> https://github.com/DSpace/DSpace/blob/dspace-5.5/dspace-api/src/main/java/org/dspace/eperson/EPerson.java#L441
> * And select statement here: 
> https://github.com/DSpace/DSpace/blob/dspace-5.5/dspace-api/src/main/java/org/dspace/eperson/EPerson.java#L442
>  
>
> That's my best guess here. It sounds to me like we need to take a closer 
> look at *all* the methods in this EPerson class, and ensure they are 
> updated similarly.
>
> - Tim
>
> [1] 
> https://docs.oracle.com/cd/B19306_01/server.102/b14200/conditions002.htm
>
>
> On Fri, Jan 26, 2018 at 1:05 PM Donald Bynum wrote:
>
>> I made the change as suggested.  Did a rebuild.  That resulted in a new 
>> additions-5.5.jar (which makes sense since the updated Java module was in 
>> additions).  Same error.  Here is the log entry (sorry it's so lengthy):
>>
>> 2018-01-26 13:55:13,555 ERROR org.dspace.storage.rdbms.DatabaseManager @ 
>> SQL query single Error - 
>> java.sql.SQLSyntaxErrorException: ORA-00932: inconsistent datatypes: 
>> expected - got CLOB
>>  at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:447)
>>  at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
>>  at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:951)
>>  at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:513)
>>  at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:227)
>>  at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
>>  at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:208)
>>  at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:886)
>>  at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1175)
>>  at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1296)
>>  at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3613)
>>  at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3657)
>>  at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1495)
>>  at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
>>  at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
>>  at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
>>  at org.dspace.storage.rdbms.DatabaseManager.query(DatabaseManager.java:295)
>>  at org.dspace.storage.rdbms.DatabaseManager.querySingle(DatabaseManager.java:342)
>>  at org.dspace.eperson.EPerson.searchResultCount(EPerson.java:438)
>>  at org.dspace.app.xmlui.aspect.administrative.eperson.ManageEPeopleMain.addBody(ManageEPeopleMain.java:118)
>>  at org.dspace.app.xmlui.wing.AbstractWingTransformer.startElement(AbstractWingTransformer.java:223)
>>  at sun.reflect.GeneratedMethodAccessor116.invoke(Unknown Source)
>>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>  at java.lang.reflect.Method.invoke(Method.java:498)
>>  at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
>>  at com.sun.proxy.$Proxy110.startElement(Unknown Source)
>>  at org.apache.cocoon.components.sax.XMLTeePipe.startElement(XMLTeePipe.java:87)
>>  at org.apache.cocoon.xml.Abs

[dspace-tech] DSpace CRIS 5.8 - Metadata from DOI

2018-01-30 Thread Natalie Neumann
Hello everyone,

We are using DSpace CRIS 5.8 and ran into a problem with the submission via 
DOI. When we have a DOI with only new authors everything works fine, but 
when we try to get the metadata for a DOI with multiple authors and one of 
them is already in our system, only the first author is transferred into 
the input form. The same thing happens 
on https://test.dspace-cris.4science.it/ (for example with the 
DOI 10.1016/j.procs.2014.06.019 only the first of four authors is 
transferred), so it seems to be a bug in the code. Does anyone have an idea 
how to fix this?

Thanks,

Natalie



Re: [dspace-tech] AIP backup timing out

2018-01-30 Thread Tim Donohue
Hi Gary,

Perhaps bump up the memory available to that task?  By default the "dspace"
commandline script will use the settings in your JAVA_OPTS (if set).  If it
is *not* set, then it defaults to only 256MB of memory.  See this line here:
https://github.com/DSpace/DSpace/blob/master/dspace/bin/dspace#L43

To override this, you should be able to simply set JAVA_OPTS to some higher
memory values.  Some examples are here:
https://wiki.duraspace.org/display/DSDOC6x/Performance+Tuning+DSpace#PerformanceTuningDSpace-GivetheCommandLineToolsMoreMemory

FWIW, I've run this AIP backup on sites in the 300GB+ range, so it should
work fine at the range you are talking about.  But, I always have to
provide more memory -- usually more like 1GB at least.
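
As a rough sketch (the 1GB value and the file.encoding flag are just assumptions here, so adjust them for your server), you could set it like this before running the packager:

```shell
#!/bin/sh
# Assumption: 1 GB of heap is enough for this export; raise -Xmx if needed.
JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8"
export JAVA_OPTS
echo "JAVA_OPTS is now: $JAVA_OPTS"

# Then, in the same shell session, rerun the packager from [dspace]/bin:
#   ./dspace packager -d -a -t AIP -e <eperson email> -i 2123/0 /tmp/repo-dump/repo.zip
```

Because the variable is exported, the "dspace" script started from that shell should pick it up instead of falling back to its 256MB default.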

You also might want to look closer at your DSpace log files to see if any
other errors are reported there prior to the server timeout.  My best guess
here is you may be hitting memory issues, but it could be something else
entirely.

- Tim

On Tue, Jan 30, 2018 at 5:44 AM Gary Browne 
wrote:

> Hi all,
>
> JSPUI
> DSpace 4.1
> Tomcat 7
> Apache 2.2
> RHEL 6.9
>
> I'm trying to do an AIP export of the entire site - but it keeps timing
> out after about 20GB of exports.
>
> The command goes something like this:
>
> ./dspace packager -d -a -t AIP -e em...@email.com -i 2123/0
> /tmp/repo-dump/repo.zip
>
> The dump begins ok but after some time, I get:
>
> Timeout, server not responding.
>
> And the program dies. This happens after about 20GB of exports, but I have
> about 450GB to export. Any suggestions, other than doing each community one
> by one :(
>
> Thanks a lot,
> Gary
>
-- 
Tim Donohue
Technical Lead for DSpace & DSpaceDirect
DuraSpace.org | DSpace.org | DSpaceDirect.org



Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-01-30 Thread Alan Orth
You're welcome, Hayden. Glad it helped.

Regarding your performance issues, you need to try to understand where the
load is coming from. You might just have legitimate traffic, either from
users or search bots. Start by looking at your web server logs. We use
nginx in front of Tomcat, so I started logging requests to XMLUI, OAI, and
REST separately so I could see exactly what was happening in each. You'll
get much more insight once you start to understand the nature of your
traffic.

For example, Google and Bing respect robots.txt, so if you enable the cron
job for nightly sitemap generation, then you can submit your sitemap to
their webmaster tools consoles, and then alter your robots.txt so that
dynamic search pages like Discovery and Browse are forbidden. This is a big
help because those pages are computationally heavy. See this wiki page for
how to enable sitemaps:

https://wiki.duraspace.org/display/DSDOC5x/Search+Engine+Optimization
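
As a rough sketch of the robots.txt side, something like the following prints a set of rules you could adapt. The Disallow paths and the hostname are assumptions for an XMLUI site, and print_robots_txt is just a made-up helper, so check which dynamic URLs your own UI exposes before using any of it.

```shell
#!/bin/sh
# Print example robots.txt rules. The Disallow paths below are assumptions
# for an XMLUI site; verify them against your own URL layout.
print_robots_txt() {
    sitemap_url="$1"
    cat <<EOF
User-agent: *
Disallow: /discover
Disallow: /search-filter
Disallow: /browse
Sitemap: $sitemap_url
EOF
}

# Hypothetical hostname; replace with your repository's sitemap URL.
print_robots_txt "https://repository.example.org/sitemap"

# The nightly sitemap job itself would be a crontab entry along these lines
# (assuming DSpace is installed in /dspace):
#   0 6 * * * /dspace/bin/dspace generate-sitemaps
```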

On that note, from looking at my logs I saw that Baidu doesn't respect
robots.txt and just crawls all the dynamic pages. I used nginx to identify
Baidu by its user agent and throttle its requests using nginx's limit_req
module. I wrote about it on my blog a few months ago:

https://mjanja.ch/2017/11/rate-limiting-baiduspider-using-nginx/

Another important tweak along those lines is Tomcat's Crawler Session
Manager valve. Basically, Tomcat assigns all users a new session
(JSESSIONID) when they connect. Bots are users too, so they get one as
well. The problem is that search bots like Google, Bing, and Baidu often
crawl from fifty (50) IP addresses concurrently, and each one of those gets
a session, each of which takes up precious CPU time, memory, and database
resources. If you enable the Crawler Session Manager valve Tomcat will
force all matching user agents to use one session. See the Tomcat docs or
look at our server.xml for more information:

https://github.com/ilri/rmg-ansible-public/blob/master/roles/dspace/templates/tomcat/server-tomcat7.xml.j2

Sometimes I will get an alert that our server's CPU load is very high so
I'll go look at the logs and do some grep and awk magic to figure out the
IP addresses for the day's top ten users. Often it's illuminating. For
example, one day in December some client in China downloaded 20,000 PDFs
from us. Another time some open access bot started harvesting us and made
400,000 requests in a few hours. I looked at the logs, found the contact
information in their user agent string, and contacted their developers.
They acknowledged that their bot had gone awry due to a bug and fixed it.
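
The "grep and awk magic" is nothing fancy. A minimal sketch, assuming the client IP is the first whitespace-separated field of each log line (as in nginx's default "combined" format) and using a hypothetical log path:

```shell
#!/bin/sh
# Print the ten client IPs with the most requests in an access log.
# Assumes the client IP is the first whitespace-separated field.
top_ips() {
    awk '{ print $1 }' "$1" | sort | uniq -c | sort -rn | head -n 10
}

# Hypothetical log path; point this at your own web server log.
if [ -r /var/log/nginx/access.log ]; then
    top_ips /var/log/nginx/access.log
fi
```

Each output line is a request count followed by an IP address, busiest client first.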

Now our repository has been using a pool with a maximum of 300 connections
for a month or so and I haven't had DSpace crash due to database issues
since then. We still have high load every few days and users complain that
the site is "down", but when I look at PostgreSQL activity I see we have
211 active database connections—so there's a problem but it's not the
database! This is when I started to look elsewhere in the application stack
to alleviate this issue. For example, Tomcat's default maxThreads is 200,
so it's likely that this is the bottleneck now. I've recently bumped it
and its companion processorCache up to 400 and I'm monitoring things now.

So start with looking at the logs! They are very interesting and
enlightening. :)

Cheers,

On Tue, Jan 30, 2018 at 2:11 PM Hayden Young 
wrote:

> Hi Alan
>
> Thanks for the in-depth analysis and possible remedies. I implemented the
> jndi connections as you had outlined and tested that it did indeed use the
> Tomcat settings and not the DSpace configuration. As soon as I redeployed
> our DSpace it certainly felt as if there was an improvement in performance.
> A couple of changes were required to the jndi tomcat configuration in
> server.xml (probably related to running tomcat8) which were changing
> maxWait to maxWaitMillis and maxActive to maxTotal.
>
> Unfortunately, we were soon back to where we started with
> org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for
> connection from pool errors when browsing bitstreams. We also noticed the
> problem arose quicker than it has in the past; usually it is 1 to 2 weeks
> before we see this problem; this time it has happened within 3 days. The
> site does not experience much traffic and a check of the Postgres idle
> connections shows only 8 which seems well below what would cause the db
> error.
>
> Are you able to outline possible steps to analyzing the problem from
> DSpace's end?
>
> Our other solution is to put PG Bouncer between the db and dspace although
> my feeling is that this is only a temporary solution.
>
> Thanks
>
>
> Hayden Young
> https://www.knowledgearc.com
>
>
> On Wednesday, January 17, 2018 at 1:36:16 AM UTC+8, Alan Orth wrote:
>
>> Thanks, Hardy. I've created a new Jira issue[0], updated my git commit
>> message with the DS issue number, and updated my pull request[1] title and
>> description to include the fac

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-01-30 Thread Hayden Young
Hi Alan

Thanks for the in-depth analysis and possible remedies. I implemented the 
jndi connections as you had outlined and tested that it did indeed use the 
Tomcat settings and not the DSpace configuration. As soon as I redeployed 
our DSpace it certainly felt as if there was an improvement in performance. 
A couple of changes were required to the JNDI Tomcat configuration in 
server.xml (probably related to running Tomcat 8): changing 
maxWait to maxWaitMillis and maxActive to maxTotal.
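
For anyone making the same change, here is a minimal sketch of the rename as a filter. The function name and the sample Resource line are made up for illustration; run it against a copy of your server.xml rather than the live file.

```shell
#!/bin/sh
# Rename the older pool attributes to the names Tomcat 8 expects.
# Reads a server.xml fragment on stdin and writes the result to stdout.
rename_pool_attrs() {
    sed -e 's/maxWait=/maxWaitMillis=/g' -e 's/maxActive=/maxTotal=/g'
}

# Example with a made-up Resource line:
echo '<Resource name="jdbc/dspace" maxActive="50" maxWait="5000"/>' | rename_pool_attrs
```

Because the patterns include the trailing "=", an attribute that is already named maxWaitMillis is left alone, so the filter can be run more than once.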

Unfortunately, we were soon back to where we started with 
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for 
connection from pool errors when browsing bitstreams. We also noticed the 
problem arose quicker than it has in the past; usually it is 1 to 2 weeks 
before we see this problem; this time it has happened within 3 days. The 
site does not experience much traffic and a check of the Postgres idle 
connections shows only 8 which seems well below what would cause the db 
error.

Are you able to outline possible steps to analyzing the problem from 
DSpace's end?

Our other solution is to put PG Bouncer between the db and dspace although 
my feeling is that this is only a temporary solution.

Thanks


Hayden Young
https://www.knowledgearc.com

On Wednesday, January 17, 2018 at 1:36:16 AM UTC+8, Alan Orth wrote:
>
> Thanks, Hardy. I've created a new Jira issue[0], updated my git commit 
> message with the DS issue number, and updated my pull request[1] title and 
> description to include the fact that this is slightly related to DS-3434. I 
> think that's good to go now.
>
> [0] https://jira.duraspace.org/browse/DS-3803
> [1] https://github.com/DSpace/DSpace/pull/1917
>
> Cheers,
>
> On Tue, Jan 16, 2018 at 5:41 PM Hardy Pottinger wrote:
>
>> Hi, Alan, yes, we require a matching Jira issue for each pull request. 
>> When you create a new issue, please mention DS-3434 as being related. 
>> Thanks!
>>
>> --Hardy
>>
>> On Sat, Jan 13, 2018 at 10:00 PM, Alan Orth wrote:
>>
>>> I've been testing DSpace 6.2 with Tomcat 8.5.24 and noticed that DSpace 
>>> does not start if you supply a database pool from JNDI(!). It seems to be a 
>>> known issue, as Mark Wood created a Jira ticket for it[0]. Also, DSpace 6.2 
>>> still has the db.jndi setting in dspace.cfg, even though this setting is no 
>>> longer user customizable. I've made a pull request against the dspace-6_x 
>>> branch to remove it, but I did not create a Jira ticket. Is it required?
>>>
>>> [0] https://jira.duraspace.org/browse/DS-3434
>>> [1] https://github.com/DSpace/DSpace/pull/1917
>>>
>>> Regards,
>>>
>>> On Thu, Jan 11, 2018 at 9:37 AM Alan Orth wrote:
>>>
>>>> To continue the discussion on a slightly related note: I've just
>>>> finished dealing with the fallout caused by some new bot (the only
>>>> fingerprint of which is its unique-but-normal-looking user agent) hitting
>>>> our XMLUI with 450,000 requests from six different IPs over just a few
>>>> hours. This generated a ridiculous amount of load on the server, including
>>>> 160 PostgreSQL connections and 52,000 Tomcat sessions before I was able to
>>>> mitigate it. Surprisingly, since I had increased our pool size to 300
>>>> after my last message, we never got pool timeout or database connection
>>>> errors in dspace.log, but the site was very unresponsive, and this is on a
>>>> beefy server with SSDs, plenty of RAM, large PostgreSQL buffer cache, etc!
>>>> I ended up having to rate limit this user agent in our frontend nginx web
>>>> server using the limit_req_zone directive[0].
>>>>
>>>> So a bit of a mixed success and frustration here. No amount of pool
>>>> tweaking will fix this type of issue, because there's always another
>>>> bigger, stupider bot that comes along eventually and doesn't match the
>>>> "bot" user agent. I will definitely look into implementing separate pools
>>>> as Tom had suggested, though, to limit the damage caused by high load to
>>>> certain DSpace web applications. Keep sharing your experiences! This is
>>>> very valuable and interesting to me.
>>>>
>>>> [0]
>>>> https://github.com/ilri/rmg-ansible-public/commit/368faaa99028c8e0c8a99de3f6c253a228d5f63b
>>>>
>>>> Cheers!
>>>>
>>>> On Thu, Jan 4, 2018 at 7:31 AM Alan Orth wrote:
>>>>
>>>>> That's a cool idea to use a separate pool for each web application,
>>>>> Tom! I'd much rather have my OAI fail to establish a database connection
>>>>> than my XMLUI. ;)
>>>>>
>>>>> Since I wrote the original mailing list message two weeks ago I've had
>>>>> DSpace fail to establish a database connection a few thousand times and
>>>>> I've increased my pool's max active from 50 to 75 and then 125. Our site
>>>>> gets about four million hits per month (from looking at nginx logs), so
>>>>> I'm still trying to find the "sweet spot" for the pool settings.
>>>>> Anything's better than setting the pool in dspace.cfg, though.
>>>>>
>>>>> I wish other 

[dspace-tech] AIP backup timing out

2018-01-30 Thread Gary Browne
Hi all,

JSPUI
DSpace 4.1
Tomcat 7
Apache 2.2
RHEL 6.9

I'm trying to do an AIP export of the entire site - but it keeps timing out 
after about 20GB of exports.

The command goes something like this:

./dspace packager -d -a -t AIP -e em...@email.com -i 2123/0 
/tmp/repo-dump/repo.zip

The dump begins ok but after some time, I get:

Timeout, server not responding.

And the program dies. This happens after about 20GB of exports, but I have 
about 450GB to export. Any suggestions, other than doing each community one 
by one :(

Thanks a lot,
Gary



Re: [dspace-tech] Database Migration 4.2->5.6 Issue

2018-01-30 Thread Scott Renton
Thanks Terry. It's good to gather evidence that they're clearly not 
essential for one's dspace to work properly!

Cheers
Scott

On Monday, January 29, 2018 at 8:31:57 PM UTC, Terry Brady wrote:
>
> Scott,
>
> I do not have those views in my installation.
>
> select schemaname, viewname from pg_catalog.pg_views
> where schemaname not in ('information_schema','pg_catalog');
>  schemaname |    viewname
> ------------+----------------
>  public     | community2item
>  public     | dcvalue
> (2 rows)
>
> I also did a scan of the DSpace 5 code base and I do not find the string 
> "v_files".
>
> I suspect those views are there for support purposes or they are a vendor 
> add-on.
>
> Terry
>
> On Mon, Jan 29, 2018 at 9:47 AM, Scott Renton wrote:
>
>> Hi folks, 
>>
>> I see this was mentioned before in 
>> http://dspace.2283337.n4.nabble.com/Upgrade-3-2-to-5-4-database-problem-td4680533.html
>>
>> Basically, we are migrating from 4.2 to 5.6, and have hit database 
>> migration problems between the 5.0.2014.09.25 and 5.0.2014.09.26 stages. 
>> Just like the above problem, it was down to stats related views: the first 
>> two it complained about (v_stats_workflow and stats.v_workflow) had only 6 
>> lines in each, so I cheerfully hived them off and dropped. I'm now through 
>> to my next problem:
>>
>> Caused by: org.postgresql.util.PSQLException: ERROR: cannot drop table 
>> bundle column name because other objects depend on it
>>
>>   Detail: view stats.v_files depends on table bundle column name
>>
>> view stats.v_files_coll depends on view stats.v_files
>>
>> view stats.v_files_comm depends on view stats.v_files
>>
>> view stats.v_item2bitstream depends on table bundle column name
>>
>>
>> These views have a lot more in them. I'm all for dropping them, but 
>> wondered if anyone could advise on implications. I guess they are dspace 
>> generated- this is one of our least configured sites, so I'm guessing 
>> they've happened through user interaction rather than anything we've done. 
>> The one site we have that has gone to 5.6 didn't have these views.
>>
>>
>> Cheers
>>
>> Scott
>>
>
>
>
> -- 
> Terry Brady
> Applications Programmer Analyst
> Georgetown University Library Information Technology
> https://github.com/terrywbrady/info
> 425-298-5498 (Seattle, WA)
>



Re: [dspace-tech] Database Migration 4.2->5.6 Issue

2018-01-30 Thread Scott Renton
Thanks Tim. I'll go back to the user and ask whether they know anything. It
would certainly be interesting to know what's generating them, given it's
not DSpace, but I've gone far enough through the process to feel confident
that deleting them will allow the migration to take place.

Cheers
Scott
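
For anyone who hits the same error: the "Detail" lines in the PSQLException
already say which views sit on top of which, so a safe drop order can be read
off them (dependents first). Below is a minimal sketch of that reasoning,
assuming Python 3.9+ for graphlib. The names DETAIL and drop_order are
illustrative only and not part of DSpace. It just prints DROP VIEW statements
and does not touch the database. It may be worth saving the view definitions
first (the "definition" column of pg_catalog.pg_views) in case the views turn
out to be needed.

```python
import re
from graphlib import TopologicalSorter

# Detail lines copied from the PSQLException quoted in this thread.
DETAIL = """\
view stats.v_files depends on table bundle column name
view stats.v_files_coll depends on view stats.v_files
view stats.v_files_comm depends on view stats.v_files
view stats.v_item2bitstream depends on table bundle column name
"""

# Matches "view X depends on view Y" and "view X depends on table Y ...".
LINE = re.compile(r"view (\S+) depends on (view|table) (\S+)")


def drop_order(detail: str) -> list[str]:
    """Return view names ordered so dependents come before the views they use."""
    graph: dict[str, set[str]] = {}
    for line in detail.splitlines():
        match = LINE.search(line)
        if not match:
            continue
        view, kind, target = match.groups()
        graph.setdefault(view, set())
        if kind == "view":
            # `view` has to be dropped before `target`, so record it as a
            # predecessor of `target` in the sorter's graph.
            graph.setdefault(target, set()).add(view)
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for name in drop_order(DETAIL):
        print(f"DROP VIEW {name};")
```

With the four lines above, stats.v_files_coll and stats.v_files_comm are
listed before stats.v_files. PostgreSQL's DROP VIEW ... CASCADE would also
remove the dependents in one go, but an explicit list makes it easier to see
exactly what is being deleted.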

On Monday, January 29, 2018 at 8:31:31 PM UTC, Tim Donohue wrote:
>
> Hi Scott,
>
> Those views are not generated by DSpace (as far as I'm aware).  Searching 
> Github does not show them in our codebase (and all views created by DSpace 
> are documented in our SQL database scripts).  So, my best guess is that 
> these were locally created, or someone installed a custom Statistics plugin 
> in your DSpace that uses those?  I notice they all start with "stats.", so 
> that's why I wonder if these are related either to a third-party plugin, or 
> were created locally to help run some detailed statistics on your DSpace.
>
> That's my best guess.  If they are from a plugin, then hopefully someone 
> else on this list will recognize these names and report back.
>
> In any case, I don't see any harm to your DSpace in deleting them.  But, 
> obviously, you may want to check to see if anyone knows why they were 
> created in the first place.
>
> Tim
>
> On Mon, Jan 29, 2018 at 2:19 PM Scott Renton wrote:
>
>> Hi folks, 
>>
>> I see this was mentioned before in 
>> http://dspace.2283337.n4.nabble.com/Upgrade-3-2-to-5-4-database-problem-td4680533.html
>>
>> Basically, we are migrating from 4.2 to 5.6, and have hit database 
>> migrations problems between the 5.0.2014.09.25 and 5.0.2014.09.26 stages. 
>> Just like the above problem, it was down to stats related views: the first 
>> two it complained about (v_stats_workflow and stats.v_workflow) had only 6 
>> lines in each, so I cheerfully hived them off and dropped. I'm now through 
>> to my next problem:
>>
>> Caused by: org.postgresql.util.PSQLException: ERROR: cannot drop table
>> bundle column name because other objects depend on it
>>   Detail: view stats.v_files depends on table bundle column name
>>   view stats.v_files_coll depends on view stats.v_files
>>   view stats.v_files_comm depends on view stats.v_files
>>   view stats.v_item2bitstream depends on table bundle column name
>>
>>
>> These views have a lot more in them. I'm all for dropping them, but 
>> wondered if anyone could advise on implications. I guess they are dspace 
>> generated- this is one of our least configured sites, so I'm guessing 
>> they've happened through user interaction rather than anything we've done. 
>> The one site we have that has gone to 5.6 didn't have these views.
>>
>>
>> Cheers
>>
>> Scott
>>
> -- 
> Tim Donohue
> Technical Lead for DSpace & DSpaceDirect
> DuraSpace.org | DSpace.org | DSpaceDirect.org
>
