Peter,

A queue would be awesome.  You're absolutely right regarding the cron
jobs; it's almost like you need to set a weekly reminder to go check the
execution times of your DSpace maintenance cron jobs to make sure
they're all completing and not running at the same time. :)  I find that
I tweak everything and then we add a bunch more content, get a bunch
more hits, etc., and all the timings are off again. :P
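
Until a real queue exists, one stopgap is to wrap each cron job so that it takes a lock and logs how long it ran. What follows is only a minimal sketch, assuming util-linux flock(1) is installed. The run_exclusive name and the lock and log paths are made up for illustration and are not anything DSpace ships.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of DSpace): run a maintenance command only if
# no other wrapped job currently holds the lock, and log how long it took.
# Assumes util-linux flock(1) is available.

LOCK_FILE="${LOCK_FILE:-/tmp/dspace-maintenance.lock}"   # assumed path
LOG_FILE="${LOG_FILE:-/tmp/dspace-maintenance.log}"      # assumed path

run_exclusive() {
    start=$(date +%s)
    # -n: fail immediately (exit status 1) instead of waiting for the lock
    flock -n "$LOCK_FILE" "$@"
    status=$?
    end=$(date +%s)
    # One line per run: timestamp, exit status, elapsed seconds, command
    echo "$(date '+%F %T') status=$status seconds=$((end - start)) cmd=$*" >> "$LOG_FILE"
    return "$status"
}
```

A cron script could then call, for example, `run_exclusive /blah/dspace/bin/dspace checker -d 1h -p`. Note that a status of 1 in the log can mean either that the lock was already held or that the command itself exited with 1.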

Cheers,

Alan

On 05/30/2014 05:16 PM, Peter Dietz wrote:
> My "hammer" JAVA_OPTS on our production server, for when some site has
> crazy big content, is to temporarily run it with:
> 
> JAVA_OPTS="-server -Xms256m -Xmx4g -XX:MaxPermSize=256m"
> 
> 
> We have 64GB ram on our boxes, so we'll survive.
> 
> 
> Not to derail onto a tangent, but one thing I'd like to see DSpace
> support is some type of background-processing-queue. 
> 
> i.e. new content submitted should be queued to get: initial checksum,
> virus check, media-filters to generate thumbnails and extract full
> text, and Discovery indexing.
> 
> And then there are maintenance jobs: Recompute the checksum, OAI
> harvest, index-maintenance, ...
> 
> New submissions add to the queue, some scheduler can add maintenance
> tasks to the queue. This way you don't run into the issue of 3+
> concurrent cron jobs because they didn't complete in time. Maybe you can
> even tie this in to the curation task queue system too. In the past we
> had GitHub Enterprise/Firewall, and being an admin of that showed you
> fancy admin bells and whistles; you could even inspect the queue.
> 
> As for what happens if the queue grows faster than it can be processed,
> we'll cross that bridge when we get there.
> 
> ________________
> Peter Dietz
> Longsight
> www.longsight.com <http://www.longsight.com>
> pe...@longsight.com <mailto:pe...@longsight.com>
> p: 740-599-5005 x809
> 
> 
> On Fri, May 30, 2014 at 6:11 AM, Alan Orth <alan.o...@gmail.com
> <mailto:alan.o...@gmail.com>> wrote:
> 
>     Peter,
> 
>     Ahh, that's very interesting.  I just looked up the -server flag and it
>     seems on recent Sun/Oracle JVMs -server is implied on 64-bit Linux
>     platforms[0].
> 
>     It seems my problem was that the OOM killer's heuristics were killing
>     Tomcat's java instead of whichever cron job (filter-media, etc.)
>     happened to be the final straw in exhausting the server's memory.
>     I've since re-evaluated my Tomcat -Xmx and -Xms values and determined
>     there wasn't enough physical RAM to run both Tomcat's java and the
>     background tasks, while DSpace's control panel showed that Tomcat's
>     java was actually underutilizing the RAM we'd allocated.  Reducing the
>     allocation there made a little more room for the background tasks and
>     things have been stable since then.
> 
>     Also, I suspect it was the checksum checker job (runs at 3am for us)
>     which was actually the final straw in exhausting the memory, so I've
>     modified it to work for 1 hour each run, instead of attempting to crawl
>     the whole repository (default):
> 
>     0 3 * * * nice -n19 /blah/dspace/bin/dspace checker -d 1h -p
> 
>     Cheers,
> 
>     Alan
> 
>     [0]
>     http://docs.oracle.com/javase/7/docs/technotes/guides/vm/server-class.html
> 
>     On 05/28/2014 05:33 PM, Peter Dietz wrote:
>     > Hi Alan,
>     >
>     > At Longsight, we customize the JAVA_OPTS in dspace/bin/dspace
>     >
>     > https://github.com/LongsightGroup/DSpace/blob/longsight-4_x/dspace/bin/dspace#L66
>     >
>     > #Allow user to specify java options through JAVA_OPTS variable
>     > if [ "$JAVA_OPTS" = "" ]; then
>     >   #Default Java to use 256MB of memory
>     >   JAVA_OPTS="-server -Xmx256m"
>     > fi
>     >
>     >
>     > Previously, when I was at Ohio State, I had more in my JAVA_OPTS, to
>     > help with permgen issues.
>     >
>     > https://github.com/osulibraries/DSpace/blob/osukb/dspace/bin/dspace#L66
>     >
>     > #Allow user to specify java options through JAVA_OPTS variable
>     > if [ "$JAVA_OPTS" = "" ]; then
>     >   #Default Java to use 512MB of memory
>     >   JAVA_OPTS="-server -Xmx512m -XX:MaxPermSize=128m -XX:+CMSClassUnloadingEnabled"
>     > fi
>     >
>     >
>     > By adding "-server" you're ensuring that Java runs in server mode,
>     > as opposed to client mode. Server has slower initial startup, but a
>     > better memory footprint, and better performance for a longer running
>     > task, as per:
>     > http://stackoverflow.com/questions/198577/real-differences-between-java-server-and-java-client
>     >
>     > Then, if one of our clients has some jumbo-sized content and the
>     > cron jobs just aren't completing, we'll temporarily bump the -Xmx
>     > memory limit up, e.g. to 4G.
>     > ________________
>     > Peter Dietz
>     > Longsight
>     > www.longsight.com <http://www.longsight.com>
>     > pe...@longsight.com <mailto:pe...@longsight.com>
>     > p: 740-599-5005 x809 <tel:740-599-5005%20x809>
>     >
>     >
>     > On Tue, May 27, 2014 at 7:03 PM, Terry Brady <tw...@georgetown.edu
>     > <mailto:tw...@georgetown.edu>> wrote:
>     >> Alan,
>     >>
>     >> We override JAVA_OPTS for the nightly filter-media task in our cron.
>     >>
>     >> export JAVA_OPTS=-Xmx1200m;dspace filter-media ...
>     >>
>     >> We have a set of automated ingest tools.  We set JAVA_OPTS in some of the
>     >> workflows that are run by those tools.
>     >>
>     >> https://github.com/Georgetown-University-Libraries/batch-tools/blob/master/bin-src/dspaceBatch.sh
>     >>
>     >> Terry
>     >>
>     >>
>     >>
>     >> On Tue, May 20, 2014 at 1:33 AM, Alan Orth <alan.o...@gmail.com
>     >> <mailto:alan.o...@gmail.com>> wrote:
>     >>>
>     >>> Hi,
>     >>>
>     >>> I'm curious if anyone sets memory limits for DSpace's various cron jobs?
>     >>>
>     >>> Lately we've been having Tomcat's java process get killed every morning
>     >>> around the same time, and all dmesg shows is that "java" was killed by
>     >>> the kernel's OOM killer.  Catalina logs don't show any "SEVERE" errors,
>     >>> so I have to assume it's the cron jobs which are using up loads of
>     >>> memory and then confusing the kernel, which then identifies Tomcat's
>     >>> java as the memory hog and kills it.
>     >>>
>     >>> So I'm just curious if anyone has had these kinds of problems, and
>     >>> if/what they set their JAVA_OPTS to in crontab.
>     >>>
>     >>> The long term plan of course is to move to a machine with more memory
>     >>> (currently 4GB).
>     >>>
>     >>> Thanks,
>     >>>
>     >>> DSpace version is 3.1, OS is Ubuntu 12.04.
>     >>>
>     >>> --
>     >>> Alan Orth
>     >>> alan.o...@gmail.com <mailto:alan.o...@gmail.com>
>     >>> http://alaninkenya.org
>     >>> http://mjanja.co.ke
>     >>> "I have always wished for my computer to be as easy to use as my
>     >>> telephone; my wish has come true because I can no longer figure out how
>     >>> to use my telephone." -Bjarne Stroustrup, inventor of C++
>     >>> GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0
>     >>>
>     >>> _______________________________________________
>     >>> DSpace-tech mailing list
>     >>> DSpace-tech@lists.sourceforge.net <mailto:DSpace-tech@lists.sourceforge.net>
>     >>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>     >>> List Etiquette:
>     >>> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
>     >>
>     >>
>     >>
>     >>
>     >> --
>     >> Terry Brady
>     >> Applications Programmer Analyst
>     >> Georgetown University Library Information Technology
>     >> https://www.library.georgetown.edu/lit/code
>     >> 425-298-5498 <tel:425-298-5498>
>     >>
> 
> 
> 
> 
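
For reference, the per-job memory overrides quoted above could be collected into one small helper rather than scattered through crontab. This is only a sketch: the java_opts_for function and the DSPACE_BIN variable are hypothetical, and the heap sizes are simply the ones mentioned in this thread (256m from the quoted bin/dspace default, 1200m for filter-media), not recommendations.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSpace): choose JAVA_OPTS per maintenance task.
java_opts_for() {
    case "$1" in
        filter-media) echo "-Xmx1200m" ;;   # value Terry quoted for filter-media
        *)            echo "-Xmx256m"  ;;   # default seen in the quoted bin/dspace snippet
    esac
}

# Example use (DSPACE_BIN is an assumed path to the dspace launcher):
#   task=filter-media
#   JAVA_OPTS="$(java_opts_for "$task")" "$DSPACE_BIN" "$task"
```

Keeping the mapping in one place makes it easier to see at a glance how much memory the background jobs may claim alongside Tomcat.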


-- 
Alan Orth
alan.o...@gmail.com
http://alaninkenya.org
http://mjanja.co.ke
"I have always wished for my computer to be as easy to use as my
telephone; my wish has come true because I can no longer figure out how
to use my telephone." -Bjarne Stroustrup, inventor of C++
GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0
