Re: [Resin-interest] Logrotation and NFS
Hi Scott

Thank you for confirming that.

On a side note, how do I go about submitting a feature request? As far as I can see, there is no way of controlling the programname/tag/application (three names for the same field in a syslog message); only facility and severity can be set. Unless I start overloading facility and severity and filter on just those two fields, I cannot easily route messages from 10+ different systems running in 20+ different Resin instances to a single file per instance, let alone a single log for each system, now that syslog is in play.

As far as I can tell from looking at src/resin/jni_vfs.c, all syslog messages will come from "Resin": there is a call to openlog("Resin", 0, LOG_DAEMON); followed only by calls to syslog(priority, "%s", buffer);. That makes filtering hard unless I change all existing logging to prepend a key for easier sorting, or build individually patched Resin versions per application with different names.

My feature request is that it should be possible to specify programname/tag/application as well as hostname via XML tags like so: daemon notice my_application my_non_FQDN

Regards, Jens Dueholm Christensen
Survey IT
P +45 5161 7879
j...@ramboll.com
Rambøll, Olof Palmes Allé 20, DK-8200 Aarhus N, Denmark
www.ramboll.dk <http://www.ramboll-management.dk/>

From: resin-interest-boun...@caucho.com [mailto:resin-interest-boun...@caucho.com] On Behalf Of Scott Ferguson
Sent: Wednesday, February 10, 2016 7:04 PM
To: resin-interest@caucho.com
Subject: Re: [Resin-interest] Logrotation and NFS

On 2/8/16 5:01 AM, Jens Dueholm Christensen wrote:

Hi Resin license #1016608 Recently we’ve migrated to a ceph-based storage setup where we use NFS to allow our multiple Resin instances to write their logs directly to Ceph over NFS.

The logging did change between Resin 3 and Resin 4 mostly to improve threading, but the log rotation is essentially the same. (As I remember.)
So if NFS stalls during log rotation, it's still going to be a problem. The log is written assuming the rotation might be slow (and avoids blocking other threads while it's happening), but it does assume the rotation will complete and not freeze. -- Scott However, on 2 of our client-machines we see NFS stalls and retransmissions a few minutes past midnight for some files. So far I’ve managed to write off posibility after posibility and right now I’m stuck a few last possibilities, and the next one in line is the way Resin does logrotation. A poll of open files each second (via lsof output) shows this where out stdout logfile gets rotated: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME Sun Feb 7 23:59:58 CET 2016 java 30865 results 238w REG 0,25 1215099201 1073774597 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log Sun Feb 7 23:59:59 CET 2016 java 30865 results 238w REG 0,25 1215100254 1073774597 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log Mon Feb 8 00:00:00 CET 2016 java 30865 results 107w REG 0,25 233381888 1073774632 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log.20160207 java 30865 results 118r REG 0,25 1215103872 1073774597 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log Mon Feb 8 00:00:04 CET 2016 java 30865 results 107w REG 0,25 929890304 1073774632 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log.20160207 java 30865 results 118r REG 0,25 1215103872 1073774597 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log Mon Feb 8 00:00:55 CET 2016 java 30865 results 161r REG 0,251439432 1073774597 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log All day on the 7th the logfile is opened on filehandle 238 with write access (238w in the FD column) Then around midnight a new file is created named resin-web-stdout.log.20160207 on filehandle 107 with write permission (107w in the FD column). 
At the same time it seems resin-web-stdout.log is opened with a new filehandle and only with read permission (118r). Then over time data is copied from the file with handle 118r to 107w – the size/offset numbers increase between 00:00:00 and 00:00:04. The stall I experience happens between 00:00:04 and 00:00:55 where lsof output stalls (the man-page mentions that lstat(2), readlink(2), and stat(2) calls are blocked if the NFS server is unresponsive). Other open files in other Resin instances running on the same server does not see these stalls – nor does other client-machines, so I am c
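The feature request in this thread asks for a configurable tag and hostname next to facility and severity. A minimal Python sketch of how those four fields end up in a BSD-style (RFC 3164) syslog line may show why a fixed tag makes per-application filtering hard. The helper name, the sample timestamp and the sample message are made up for illustration; this is not Resin's code.

```python
# Facility and severity codes as used by BSD syslog (RFC 3164); subset only.
FACILITIES = {"kern": 0, "user": 1, "mail": 2, "daemon": 3, "local0": 16}
SEVERITIES = {"emerg": 0, "alert": 1, "crit": 2, "err": 3,
              "warning": 4, "notice": 5, "info": 6, "debug": 7}


def format_syslog_line(facility, severity, tag, hostname, timestamp, message):
    """Build one RFC 3164 style line (hypothetical helper).

    The priority value is facility * 8 + severity. The tag is the field
    that syslog daemons usually expose as "programname".
    """
    pri = FACILITIES[facility] * 8 + SEVERITIES[severity]
    return "<%d>%s %s %s: %s" % (pri, timestamp, hostname, tag, message)


# Values taken from the feature request: daemon / notice / my_application / my_non_FQDN.
line = format_syslog_line("daemon", "notice", "my_application",
                          "my_non_FQDN", "Feb 10 19:04:00", "rotation done")
print(line)
```

With the tag fixed to "Resin", lines from every instance differ only in the message text; with a per-application tag, a syslog daemon can route on that one field.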
[Resin-interest] Logrotation and NFS
Hi

Resin license #1016608

Recently we've migrated to a Ceph-based storage setup where we use NFS to allow our multiple Resin instances to write their logs directly to Ceph over NFS. However, on 2 of our client machines we see NFS stalls and retransmissions a few minutes past midnight for some files.

So far I've managed to rule out possibility after possibility. Right now I'm down to the last few, and the next one in line is the way Resin does log rotation.

A poll of open files each second (via lsof output) shows this when our stdout logfile gets rotated:

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME

Sun Feb 7 23:59:58 CET 2016
java 30865 results 238w REG 0,25 1215099201 1073774597 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log

Sun Feb 7 23:59:59 CET 2016
java 30865 results 238w REG 0,25 1215100254 1073774597 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log

Mon Feb 8 00:00:00 CET 2016
java 30865 results 107w REG 0,25 233381888 1073774632 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log.20160207
java 30865 results 118r REG 0,25 1215103872 1073774597 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log

Mon Feb 8 00:00:04 CET 2016
java 30865 results 107w REG 0,25 929890304 1073774632 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log.20160207
java 30865 results 118r REG 0,25 1215103872 1073774597 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log

Mon Feb 8 00:00:55 CET 2016
java 30865 results 161r REG 0,25 1439432 1073774597 /nfsdata-web11nfs/results/logs/web9_results_backend1/resin-web-stdout.log

All day on the 7th the logfile is open on filehandle 238 with write access (238w in the FD column). Then around midnight a new file named resin-web-stdout.log.20160207 is created on filehandle 107 with write permission (107w in the FD column).

At the same time it seems resin-web-stdout.log is opened with a new filehandle and only with read permission (118r). Then over time data is copied from the file with handle 118r to 107w - the size/offset numbers increase between 00:00:00 and 00:00:04.

The stall I experience happens between 00:00:04 and 00:00:55, where lsof output stalls (the man page mentions that lstat(2), readlink(2), and stat(2) calls are blocked if the NFS server is unresponsive). Other open files in other Resin instances running on the same server do not see these stalls - nor do other client machines, so I am confident that our NFS server is alive and answering requests.

Now for the real questions:

Since we are running an ancient version of Resin (3.1.13 and 3.1.14), has the way of rotating logfiles changed in Resin 4.0 or later?

Are the stalls I'm seeing (I assume they could just as well happen locally if a disk subsystem was slow enough) a known issue that has been fixed in Resin 4.0 or later?

Right now the best way of solving this is to change to syslog logging, but since we already have a roadmap for Resin 4 planned for later this year (yay - finally!), this might just be enough for us to move it up to the top of our list.

Regards, Jens Dueholm Christensen
Survey IT
P +45 5161 7879
j...@ramboll.com
Rambøll, Olof Palmes Allé 20, DK-8200 Aarhus N, Denmark
www.ramboll.dk <http://www.ramboll-management.dk/>

___
resin-interest mailing list
resin-interest@caucho.com
http://maillist.caucho.com/mailman/listinfo/resin-interest
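The lsof trace in this message is consistent with a copy-based rotation: the live log is reopened read-only, its contents are streamed into the dated file, and the live log appears to keep its inode. On an NFS mount that means the whole day's log is read from and written back to the server at midnight. For comparison, a minimal Python sketch (not Resin's code; both function names are hypothetical) contrasts that pattern with a rename-based rotation, where only directory metadata changes as long as both names live on the same filesystem.

```python
import os
import shutil
import tempfile


def rotate_by_copy(log_path, archive_path):
    """Copy the live log into the archive, then truncate the live log.

    Mirrors the pattern suggested by the lsof trace: every byte is read
    from and written back to the (possibly remote) filesystem.
    """
    with open(log_path, "rb") as src, open(archive_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Truncate in place so the live log keeps its inode.
    with open(log_path, "wb"):
        pass


def rotate_by_rename(log_path, archive_path):
    """Rename the live log to the archive name and start a fresh file.

    Within one filesystem the rename moves no file data. A writer must
    reopen log_path afterwards, because its old descriptor now points
    at the archived file.
    """
    os.rename(log_path, archive_path)
    with open(log_path, "wb"):
        pass


def demo():
    """Run both strategies in a scratch directory and report the outcome."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name, rotate in (("copy", rotate_by_copy),
                             ("rename", rotate_by_rename)):
            log = os.path.join(tmp, name + ".log")
            archive = log + ".20160207"
            with open(log, "wb") as fh:
                fh.write(b"line 1\nline 2\n")
            inode_before = os.stat(log).st_ino
            rotate(log, archive)
            with open(archive, "rb") as fh:
                archived = fh.read()
            results[name] = {
                "archived": archived,
                "live_size": os.path.getsize(log),
                "archive_kept_inode": os.stat(archive).st_ino == inode_before,
            }
    return results
```

Whether Resin 3.1 or 4.0 can be configured to rotate by rename is not something this thread establishes; the sketch only illustrates why the two patterns load an NFS mount so differently.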
Re: [Resin-interest] Watchdog loglevel
From: resin-interest-boun...@caucho.com [mailto:resin-interest-boun...@caucho.com] On Behalf Of Scott Ferguson

> Well, remember that the watchdog itself doesn't normally shut down Resin on
> errors. Resin exits itself and the watchdog just starts a new instance.
> (Resin 4.0 communicates the reason better to the watchdog through exit
> codes.)
> So the problem is in the Resin instance itself.
> Are there hs_err* files or something similar?

No, not from this particular incident, but we have an older hs_err file and an old watchdog log that shows a restart around the same time as the hs_err file was created.

*Eeeek* (the same sound a small animal that's about to get squished makes!) - it seems the JVM crashed at that time due to:

# Problematic frame:
# C [libawt.so+0x6f881] IntRgbSrcMaskFill+0x1b1

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J sun.java2d.loops.MaskFill.MaskFill(Lsun/java2d/SunGraphics2D;Lsun/java2d/SurfaceData;Ljava/awt/Composite;[BII)V
J sun.java2d.pipe.TextRenderer.drawGlyphList(Lsun/java2d/SunGraphics2D;Lsun/font/GlyphList;)V
J sun.java2d.pipe.GlyphListPipe.drawString(Lsun/java2d/SunGraphics2D;Ljava/lang/String;DD)V
J sun.java2d.SunGraphics2D.drawString(Ljava/lang/String;FF)V
J org.jfree.text.TextLine.draw(Ljava/awt/Graphics2D;FFLorg/jfree/ui/TextAnchor;FFD)V

This app creates thousands upon thousands of PDF reports with JFreeChart every day. We'll have to dig into this - thanks Scott.

Regards, Jens Dueholm Christensen
Survey IT
[Resin-interest] Watchdog loglevel
Hi

We're running Resin 3.1.11 (soon to be 3.1.12 in the next service window) in our production environment, and a few days ago we had an app that was restarted several times by the watchdog for no apparent reason.

The watchdog log contains this:

[2013/02/08 11:17:46.478] WatchdogProcess[Watchdog[results],1] stopping Resin
[2013/02/08 11:17:46.478] WatchdogProcess[Watchdog[results],2] starting Resin
[2013/02/08 11:20:37.256] WatchdogProcess[Watchdog[results],2] stopping Resin
[2013/02/08 11:20:37.256] WatchdogProcess[Watchdog[results],3] starting Resin
[2013/02/08 11:23:24.221] WatchdogProcess[Watchdog[results],3] stopping Resin
[2013/02/08 11:23:24.221] WatchdogProcess[Watchdog[results],4] starting Resin

There was a lot of regular, normal activity in our app's stdout log before and in between the restarts, but nothing that, in our opinion, should cause a restart by the watchdog. The JVM log has no mention of problems (performing CMS and young generation GC as expected) and load on the server was also low; no automatic stacktraces were taken.

We have been running with the same Resin configuration, app codebase and OS software stack for a long time, so we are quite baffled; this struck us like lightning from a clear sky.

Is there any way of getting more verbose output about the watchdog and what it decides to do? We tried setting and restarting Resin completely (not just a restart of the JVM), but that did not seem to help.

Regards, Jens Dueholm Christensen
Survey IT
Re: [Resin-interest] Leap Second behaviour
Thanks Scott

I just needed a bit of confirmation. And yes, it's quite impressive and a bit scary that a single second could create problems on the scale we've seen - and judging by what we've seen so far, this will most likely not be the last time a leap second gets this much attention.

Let's just hope the next one doesn't happen on December 31st - I'd hate to have to use my reduced mental capacity on January 1st (seems to be a recurring issue on that date) on fixing crashed systems.. ;)

Regards, Jens Dueholm Christensen
Survey IT

From: resin-interest-boun...@caucho.com [mailto:resin-interest-boun...@caucho.com] On Behalf Of Scott Ferguson
Sent: Monday, July 02, 2012 9:25 PM
To: resin-interest@caucho.com
Subject: Re: [Resin-interest] Leap Second behaviour

On 07/02/2012 06:28 AM, Jens Dueholm Christensen (JEDC) wrote:

> Hi
>
> I'm quite well aware that the "base problem" regarding leap seconds is not connected to Resin, but to the JVM and OS. However, I saw some strange behaviour in the aftermath that might be interesting to clear up.
>
> Normally we run Resin 3.1.12 without any specific configuration for . This means a default sample-period of 60 seconds. Because of this every 59th second this was logged:
>
> [2012-07-01 02:14:59.000][02:14:59.000] StatService[] cpu-load=41.05
> [2012-07-01 02:14:59.000][02:14:59.000] StatService[] cpu-load=41.05
> [2012-07-01 02:14:59.000][02:14:59.000] StatService[] cpu-load=41.05
> [2012-07-01 02:14:59.000][02:14:59.000] StatService[] cpu-load=41.05
> [2012-07-01 02:14:59.000][02:14:59.000] StatService[] cpu-load=41.05
> ...
> ~26600 lines later
> [2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05
> [2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05
> [2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05
> [2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05
> [2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05
> [2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05
> [2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05
>
> This seemed a bit much to log every 60th second. I tried to add a section with 10 minute sample-period, and saw that those ~26600 lines were now only logged every 10th minute. But that is still loads more than usual (I would expect just a single line being logged).
>
> Was Resin to blame for this (i.e. is there a possible bug around here?) or was it the JVM and OS?

I'd guess it's related to the JVM/OS issue. In our nightly regressions, there were some sleep/waits that were stuck before we updated the system.

It's pretty impressive that a 1s change could cause that much chaos. :)

-- Scott

> Regards, Jens Dueholm Christensen
> Survey IT
[Resin-interest] Leap Second behaviour
Hi

I'm quite well aware that the "base problem" regarding leap seconds is not connected to Resin, but to the JVM and OS. However, I saw some strange behaviour in the aftermath that might be interesting to clear up.

Normally we run Resin 3.1.12 without any specific configuration for . This means a default sample-period of 60 seconds. Because of this every 59th second this was logged:

[2012-07-01 02:14:59.000][02:14:59.000] StatService[] cpu-load=41.05
[2012-07-01 02:14:59.000][02:14:59.000] StatService[] cpu-load=41.05
[2012-07-01 02:14:59.000][02:14:59.000] StatService[] cpu-load=41.05
[2012-07-01 02:14:59.000][02:14:59.000] StatService[] cpu-load=41.05
[2012-07-01 02:14:59.000][02:14:59.000] StatService[] cpu-load=41.05
...
~26600 lines later
[2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05
[2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05
[2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05
[2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05
[2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05
[2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05
[2012-07-01 02:14:59.999][02:14:59.999] StatService[] cpu-load=41.05

This seemed a bit much to log every 60th second. I tried to add a section with 10 minute sample-period, and saw that those ~26600 lines were now only logged every 10th minute. But that is still loads more than usual (I would expect just a single line being logged).

Was Resin to blame for this (i.e. is there a possible bug around here?) or was it the JVM and OS?

Regards, Jens Dueholm Christensen
Survey IT
Re: [Resin-interest] cannot ./configure resin - says java not installed, - it is installed and JAVA_HOME is set correctly..
.. or you can use the --with-java-home option for configure. E.g. something like this:

./configure --with-java-home=/opt/jdk/

Run "./configure --help" to see all available options.

Regards, Jens Dueholm Christensen

-----Original Message-----
From: resin-interest-boun...@caucho.com [mailto:resin-interest-boun...@caucho.com] On Behalf Of Alex Rojkov
Sent: Friday, April 13, 2012 6:15 PM
To: General Discussion for the Resin application server
Subject: Re: [Resin-interest] cannot ./configure resin - says java not installed, - it is installed and JAVA_HOME is set correctly..

> help?
> I have java installed in a custom folder due to distribution reasons;

Hi Tom,

Can you add /opt/jdk/bin to the PATH and try again? ./configure in 4.0.27 does 'which java' to find the java executable. I added a fallback to JAVA_HOME/bin/java for 4.0.28 and up.

Thanks, Alex

>
> root@ubuntu64#echo $JAVA_HOME
> /opt/jdk/
>
> when double checking the version:
> root@ubuntu64# $JAVA_HOME/bin/java -version
> java version "1.6.0_27"
> Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
> Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode)
>
>
> however when doing a configure I get this error: ( I even exported
> JAVA_HOME to /opt/jdk/ )
> ==
> root@ubuntu64:/opt/resin-4.0.27# ./configure
> checking build system type... x86_64-unknown-linux-gnu
> checking host system type... x86_64-unknown-linux-gnu
> ...
> ..
> checking if is Java 1.6... no
> configure: error: Java 1.6 required. returned: ./configure: line
> 11291: -version: command not found
> ==
>
> why is this happening?
>
> tia!
Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender
In case others are interested: I raised the thread-pool thread-max to 300 and the server-default thread-max to 1024. I'm still interested in hearing if and how they are connected, but for now we haven't seen the freeze for over a week.

Regards, Jens Dueholm Christensen
Rambøll Survey IT

From: resin-interest-boun...@caucho.com [mailto:resin-interest-boun...@caucho.com] On Behalf Of Jens Dueholm Christensen (JEDC)
Sent: Friday, March 02, 2012 10:11 AM
To: General Discussion for the Resin application server
Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender

Hi Scott

Thanks for that suggestion. However, I'm a bit confused about which thread-max to change - our resin.conf looks like this: 150 5 ... 512 10 20 800 10s

The documentation is IMO not quite clear on how the "thread-pool thread-max" is connected to a "server-default thread-max", so a recommendation would be appreciated.

Regards, Jens Dueholm Christensen
Rambøll Survey IT

From: resin-interest-boun...@caucho.com [mailto:resin-interest-boun...@caucho.com] On Behalf Of Scott Ferguson
Sent: Thursday, March 01, 2012 7:00 PM
To: resin-interest@caucho.com
Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender

Increase your thread-max for now. The 4.0.26 thread redesign will avoid this problem. Basically, the log rollover is asking for a new thread at the same time as you're hitting thread-max, so it's getting stuck.

-- Scott

On 02/28/2012 08:37 AM, Scott Ferguson wrote:

On 02/28/2012 01:27 AM, Jens Dueholm Christensen (JEDC) wrote:

Hej Knut

Thanks for the link. In my somewhat extensive googling I hadn't seen that page before. It seems to confirm what is being said on http://www.simonsite.org.uk/#activeasync about fallback to synchronous logging if an exception is thrown.
This brings an interesting question to mind: Can Resins logrotation somehow provoke an exception at the moment the logfiles are rotated, that causes log4j to fall back to synchronous logging? Ie. if a buffer is attempted flushed while the backing file is unavailable? Or the backend file is closed and reopened during a flush? It should just be delayed, presumably for longer than some assumption of log4j. The way log4j all of a sudden blocks (and with full buffers that are never flushed) seems to indicate that it has encountered an exception of some kind, so again I'd be happy for any comments from Caucho on this. Well, it's a log4j issue, so I can only comment after I've debugged their code. (And the upcoming 4.0.26 release means that's still a day or two away.) -- Scott Oh yes - last night we saw it happen again. Our world ground to a halt with stacktraces showing the exact same as those in my original post. This was after a quiet weekend where we didn't encounter the problem once. Regards, Jens Dueholm Christensen Rambøll Survey IT From: resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com> [mailto:resin-interest-boun...@caucho.com] On Behalf Of Knut Forkalsrud Sent: Monday, February 27, 2012 7:33 PM To: General Discussion for the Resin application server Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender You may find the discussion in http://glueclue.blogspot.com/2007/01/log4j-asyncappender-is-not-always_31.html useful. Knut Forkalsrud On Mon, Feb 27, 2012 at 00:16, Jens Dueholm Christensen (JEDC) mailto:jens.dueh...@r-m.com>> wrote: Bump? 
:) Regards, Jens Dueholm Christensen Rambøll Survey IT -Original Message- From: resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com> [mailto:resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com>] On Behalf Of Jens Dueholm Christensen (JEDC) Sent: Tuesday, February 21, 2012 9:01 PM To: General Discussion for the Resin application server Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender Hi Scott I'd really appreciate it, if you took the time to do that - I (and my coworkers) really quite baffled. We have had an increase of activity and load on the system during the last few weeks (more users and utilization, so nothing alarming), and now this problem has manifested itself, and I've never seen it before. My understanding of what's happening is that the blocking thread is waiting for an empty buffer so it can flush it's own full buffer, but that notification never comes. As the source for log4j shows the thread has called wait() on an ArrayList (the full buffer), and something has gone wrong, but what? We've seen it happen 3 or 4 times during the last week - always a few minutes after midnight when Resin performs i
Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender
Hi Scott Thanks for that suggestion. However I'm a bit confused about what thread-max to change - our resin.conf looks like this: 150 5 ... 512 10 20 800 10s The documentation is IMO not quite clear on how the "thread-pool thread-max" is connected to a "server-default thread-max", so a reccomendation would be appriciated. Regards, Jens Dueholm Christensen Rambøll Survey IT From: resin-interest-boun...@caucho.com [mailto:resin-interest-boun...@caucho.com] On Behalf Of Scott Ferguson Sent: Thursday, March 01, 2012 7:00 PM To: resin-interest@caucho.com Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender Increase your thread-max for now. The 4.0.26 thread redesign will avoid this problem. Basically, the log rollover is asking for a new thread at the same time as you're hitting thread-max, so it's getting stuck. -- Scott On 02/28/2012 08:37 AM, Scott Ferguson wrote: On 02/28/2012 01:27 AM, Jens Dueholm Christensen (JEDC) wrote: Hej Knut Thanks for the link. In my somewhat extensive googling I havn't seen that page before. It seems to confirm what is beeing said on http://www.simonsite.org.uk/#activeasync about fallback to synchronous logging if an exception is thrown. This brings an interesting question to mind: Can Resins logrotation somehow provoke an exception at the moment the logfiles are rotated, that causes log4j to fall back to synchronous logging? Ie. if a buffer is attempted flushed while the backing file is unavailable? Or the backend file is closed and reopened during a flush? It should just be delayed, presumably for longer than some assumption of log4j. The way log4j all of a sudden blocks (and with full buffers that are never flushed) seems to indicate that it has encountered an exception of some kind, so again I'd be happy for any comments from Caucho on this. Well, it's a log4j issue, so I can only comment after I've debugged their code. 
(And the upcoming 4.0.26 release means that's still a day or two away.) -- Scott Oh yes - last night we saw it happen again. Our world ground to a halt with stacktraces showing the exact same as those in my original post. This was after a quiet weekend where we didn't encounter the problem once. Regards, Jens Dueholm Christensen Rambøll Survey IT From: resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com> [mailto:resin-interest-boun...@caucho.com] On Behalf Of Knut Forkalsrud Sent: Monday, February 27, 2012 7:33 PM To: General Discussion for the Resin application server Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender You may find the discussion in http://glueclue.blogspot.com/2007/01/log4j-asyncappender-is-not-always_31.html useful. Knut Forkalsrud On Mon, Feb 27, 2012 at 00:16, Jens Dueholm Christensen (JEDC) mailto:jens.dueh...@r-m.com>> wrote: Bump? :) Regards, Jens Dueholm Christensen Rambøll Survey IT -Original Message- From: resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com> [mailto:resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com>] On Behalf Of Jens Dueholm Christensen (JEDC) Sent: Tuesday, February 21, 2012 9:01 PM To: General Discussion for the Resin application server Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender Hi Scott I'd really appreciate it, if you took the time to do that - I (and my coworkers) really quite baffled. We have had an increase of activity and load on the system during the last few weeks (more users and utilization, so nothing alarming), and now this problem has manifested itself, and I've never seen it before. My understanding of what's happening is that the blocking thread is waiting for an empty buffer so it can flush it's own full buffer, but that notification never comes. 
As the source for log4j shows the thread has called wait() on an ArrayList (the full buffer), and something has gone wrong, but what? We've seen it happen 3 or 4 times during the last week - always a few minutes after midnight when Resin performs its logrotation. In the stacktrace most of the threads that waits for the deadlocked thread is just (as is the deadlocked thread) doing a simple logger.info<http://logger.info>(String), so it might be possible to reproduce by spawning 10-20 threads that calls logger.info<http://logger.info>() and then have Resin perform a rollover once a minute. Regards, Jens Dueholm Christensen Rambøll Survey IT From: resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com> [resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com>] On Behalf O
Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender
Hej Knut Thanks for the link. In my somewhat extensive googling I havn’t seen that page before. It seems to confirm what is beeing said on http://www.simonsite.org.uk/#activeasync about fallback to synchronous logging if an exception is thrown. This brings an interesting question to mind: Can Resins logrotation somehow provoke an exception at the moment the logfiles are rotated, that causes log4j to fall back to synchronous logging? Ie. if a buffer is attempted flushed while the backing file is unavailable? Or the backend file is closed and reopened during a flush? The way log4j all of a sudden blocks (and with full buffers that are never flushed) seems to indicate that it has encountered an exception of some kind, so again I’d be happy for any comments from Caucho on this. Oh yes – last night we saw it happen again. Our world ground to a halt with stacktraces showing the exact same as those in my original post. This was after a quiet weekend where we didn’t encounter the problem once. Regards, Jens Dueholm Christensen Rambøll Survey IT From: resin-interest-boun...@caucho.com [mailto:resin-interest-boun...@caucho.com] On Behalf Of Knut Forkalsrud Sent: Monday, February 27, 2012 7:33 PM To: General Discussion for the Resin application server Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender You may find the discussion in http://glueclue.blogspot.com/2007/01/log4j-asyncappender-is-not-always_31.html useful. Knut Forkalsrud On Mon, Feb 27, 2012 at 00:16, Jens Dueholm Christensen (JEDC) mailto:jens.dueh...@r-m.com>> wrote: Bump? 
:) Regards, Jens Dueholm Christensen Rambøll Survey IT -Original Message- From: resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com> [mailto:resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com>] On Behalf Of Jens Dueholm Christensen (JEDC) Sent: Tuesday, February 21, 2012 9:01 PM To: General Discussion for the Resin application server Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender Hi Scott I'd really appreciate it, if you took the time to do that - I (and my coworkers) really quite baffled. We have had an increase of activity and load on the system during the last few weeks (more users and utilization, so nothing alarming), and now this problem has manifested itself, and I've never seen it before. My understanding of what's happening is that the blocking thread is waiting for an empty buffer so it can flush it's own full buffer, but that notification never comes. As the source for log4j shows the thread has called wait() on an ArrayList (the full buffer), and something has gone wrong, but what? We've seen it happen 3 or 4 times during the last week - always a few minutes after midnight when Resin performs its logrotation. In the stacktrace most of the threads that waits for the deadlocked thread is just (as is the deadlocked thread) doing a simple logger.info<http://logger.info>(String), so it might be possible to reproduce by spawning 10-20 threads that calls logger.info<http://logger.info>() and then have Resin perform a rollover once a minute. 
Regards, Jens Dueholm Christensen Rambøll Survey IT From: resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com> [resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com>] On Behalf Of Scott Ferguson [f...@cauchomail.com<mailto:f...@cauchomail.com>] Sent: 21 February 2012 19:01 To: resin-interest@caucho.com<mailto:resin-interest@caucho.com> Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender On 02/20/2012 12:56 PM, Jens Dueholm Christensen (JEDC) wrote: Hi Lately we are seeing increased blocking behaviour from log4j when Resin performs the nightly rollover of logfiles. We're running Resin Pro 3.1 (license #1013826) in a pre .11 snapshot (resin-pro-3.1.s110225) (due to bug #4349 and a bit of waiting before 3.1.11 was released). Updating to 3.1.12 is planned, but the changelog (http://www.caucho.com/resin-3.1/changes/changes.xtp is the best I could find?) does not mention any fix for our current problem, so I'm looking for any insights or advise. I'd need to look at the log4j code to understand what it's doing. I don't think that wait() should be related to Resin (other than timing) but I'd need to look at their code to be certain, and I can't think of any logging that would help. (Logging the logging code is tricky :) 3.1.13 should be out next week, by the way. -- Scott We've configured resin to perform rollovers of this logfile: In our log4j configuration we log to stdout with: <> Some time after
Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender
Bump? :)

Regards,
Jens Dueholm Christensen
Rambøll Survey IT

-----Original Message-----
From: resin-interest-boun...@caucho.com [mailto:resin-interest-boun...@caucho.com] On Behalf Of Jens Dueholm Christensen (JEDC)
Sent: Tuesday, February 21, 2012 9:01 PM
To: General Discussion for the Resin application server
Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender

Hi Scott

I'd really appreciate it if you took the time to do that - I (and my coworkers) are really quite baffled.

We have had an increase of activity and load on the system during the last few weeks (more users and utilization, so nothing alarming), and now this problem has manifested itself, and I've never seen it before.

My understanding of what's happening is that the blocking thread is waiting for an empty buffer so it can flush its own full buffer, but that notification never comes. As the source for log4j shows, the thread has called wait() on an ArrayList (the full buffer), and something has gone wrong, but what?

We've seen it happen 3 or 4 times during the last week - always a few minutes after midnight when Resin performs its log rotation.

In the stacktrace most of the threads that wait for the deadlocked thread are just (as is the deadlocked thread) doing a simple logger.info(String), so it might be possible to reproduce by spawning 10-20 threads that call logger.info() and then have Resin perform a rollover once a minute.

Regards,
Jens Dueholm Christensen
Rambøll Survey IT

From: resin-interest-boun...@caucho.com [resin-interest-boun...@caucho.com] On Behalf Of Scott Ferguson [f...@cauchomail.com]
Sent: 21 February 2012 19:01
To: resin-interest@caucho.com
Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender

On 02/20/2012 12:56 PM, Jens Dueholm Christensen (JEDC) wrote:

Hi

Lately we are seeing increased blocking behaviour from log4j when Resin performs the nightly rollover of logfiles. We're running Resin Pro 3.1 (license #1013826) in a pre .11 snapshot (resin-pro-3.1.s110225) (due to bug #4349 and a bit of waiting before 3.1.11 was released). Updating to 3.1.12 is planned, but the changelog (http://www.caucho.com/resin-3.1/changes/changes.xtp is the best I could find?) does not mention any fix for our current problem, so I'm looking for any insights or advice.

I'd need to look at the log4j code to understand what it's doing. I don't think that wait() should be related to Resin (other than timing) but I'd need to look at their code to be certain, and I can't think of any logging that would help. (Logging the logging code is tricky :)

3.1.13 should be out next week, by the way.

-- Scott

We've configured resin to perform rollovers of this logfile:

In our log4j configuration we log to stdout with:

Some time after midnight when Resin performs the rollover, the JVM becomes unresponsive and a stacktrace shows multiple threads hanging, and we have to manually restart (it doesn't get picked up by the watchdog, and can/will hang for hours). A complete JVM stacktrace is attached to this mail.

All threads that are attempting to log are waiting for access to the object blocked by this thread:

"http-172.27.80.36:8080-30$1663241944" daemon prio=10 tid=0x7f8804005800 nid=0x2a40 in Object.wait() [0x7f87a08ed000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:485)
    at org.apache.log4j.AsyncAppender.append(AsyncAppender.java:195)
    - locked <0x0005ba882928> (a java.util.ArrayList)
    at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
    - locked <0x0005b8d6db00> (a org.apache.log4j.AsyncAppender)
    at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
    at org.apache.log4j.Category.callAppenders(Category.java:206)
    - locked <0x0005ba6f9e38> (a org.apache.log4j.spi.RootLogger)
    at org.apache.log4j.Category.forcedLog(Category.java:391)
    at org.apache.log4j.Category.info(Category.java:666)
    at com.pls.popinhandler.PopinScriptHandler.handleRequest(PopinScriptHandler.java:65)
    ...

It seems like something never calls notifyAll() on the ArrayList, and as a result our world grinds to a halt..

https://issues.apache.org/bugzilla/show_bug.cgi?id=38137#c16 (comment #16) has the same kind of stacktrace as we are seeing, and the comments in #17 do point towards Resin.

The last line in logfile resin-web-stdout.log.20120218 (before rotation) has the timestamp 2012-02-18 23:59:59,718. Th
Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender
Hi Scott

I'd really appreciate it if you took the time to do that - I (and my coworkers) are really quite baffled.

We have had an increase of activity and load on the system during the last few weeks (more users and utilization, so nothing alarming), and now this problem has manifested itself, and I've never seen it before.

My understanding of what's happening is that the blocking thread is waiting for an empty buffer so it can flush its own full buffer, but that notification never comes. As the source for log4j shows, the thread has called wait() on an ArrayList (the full buffer), and something has gone wrong, but what?

We've seen it happen 3 or 4 times during the last week - always a few minutes after midnight when Resin performs its log rotation.

In the stacktrace most of the threads that wait for the deadlocked thread are just (as is the deadlocked thread) doing a simple logger.info(String), so it might be possible to reproduce by spawning 10-20 threads that call logger.info() and then have Resin perform a rollover once a minute.

Regards,
Jens Dueholm Christensen
Rambøll Survey IT

From: resin-interest-boun...@caucho.com [resin-interest-boun...@caucho.com] On Behalf Of Scott Ferguson [f...@cauchomail.com]
Sent: 21 February 2012 19:01
To: resin-interest@caucho.com
Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 1.2.16 and AsyncAppender

On 02/20/2012 12:56 PM, Jens Dueholm Christensen (JEDC) wrote:

Hi

Lately we are seeing increased blocking behaviour from log4j when Resin performs the nightly rollover of logfiles. We’re running Resin Pro 3.1 (license #1013826) in a pre .11 snapshot (resin-pro-3.1.s110225) (due to bug #4349 and a bit of waiting before 3.1.11 was released). Updating to 3.1.12 is planned, but the changelog (http://www.caucho.com/resin-3.1/changes/changes.xtp is the best I could find?) does not mention any fix for our current problem, so I’m looking for any insights or advice.

I'd need to look at the log4j code to understand what it's doing. I don't think that wait() should be related to Resin (other than timing) but I'd need to look at their code to be certain, and I can't think of any logging that would help. (Logging the logging code is tricky :)

3.1.13 should be out next week, by the way.

-- Scott

We’ve configured resin to perform rollovers of this logfile:

In our log4j configuration we log to stdout with:

Some time after midnight when Resin performs the rollover, the JVM becomes unresponsive and a stacktrace shows multiple threads hanging, and we have to manually restart (it doesn’t get picked up by the watchdog, and can/will hang for hours). A complete JVM stacktrace is attached to this mail.

All threads that are attempting to log are waiting for access to the object blocked by this thread:

"http-172.27.80.36:8080-30$1663241944" daemon prio=10 tid=0x7f8804005800 nid=0x2a40 in Object.wait() [0x7f87a08ed000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:485)
    at org.apache.log4j.AsyncAppender.append(AsyncAppender.java:195)
    - locked <0x0005ba882928> (a java.util.ArrayList)
    at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
    - locked <0x0005b8d6db00> (a org.apache.log4j.AsyncAppender)
    at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
    at org.apache.log4j.Category.callAppenders(Category.java:206)
    - locked <0x0005ba6f9e38> (a org.apache.log4j.spi.RootLogger)
    at org.apache.log4j.Category.forcedLog(Category.java:391)
    at org.apache.log4j.Category.info(Category.java:666)
    at com.pls.popinhandler.PopinScriptHandler.handleRequest(PopinScriptHandler.java:65)
    …

It seems like something never calls notifyAll() on the ArrayList, and as a result our world grinds to a halt..

https://issues.apache.org/bugzilla/show_bug.cgi?id=38137#c16 (comment #16) has the same kind of stacktrace as we are seeing, and the comments in #17 do point towards Resin.

The last line in logfile resin-web-stdout.log.20120218 (before rotation) has the timestamp 2012-02-18 23:59:59,718. The first line in the new logfile resin-web-stdout.log.20120219 has the timestamp 2012-02-19 01:30:48.878 (here we actually waited to see if the problem corrected itself – but alas), which was when we did a restart.

All other logfiles (watchdog, jvm, resin stdout and stderr) do not contain any indication of what is wrong.

If it’s not Resin causing the problem, I guess we’ll just have to switch our entire log4j setup to use org.apac
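
The blocking pattern described in this message (a logging thread waits on a full buffer and is only released when something drains the buffer and notifies it) can be sketched in Python. This is an illustrative analogy only, not log4j's AsyncAppender code: the class name BoundedBuffer, its methods and the timeout parameter are all assumptions made for the sketch.

```python
import threading


class BoundedBuffer:
    """Hypothetical stand-in for a bounded logging buffer (not log4j code)."""

    def __init__(self, capacity):
        self._items = []
        self._capacity = capacity
        self._cond = threading.Condition()

    def append(self, item, timeout=None):
        """Block while the buffer is full.

        With timeout=None the caller waits until a consumer drains the
        buffer and notifies; if that never happens, the caller hangs.
        Returns False if the timeout expired before space became free.
        """
        with self._cond:
            has_room = self._cond.wait_for(
                lambda: len(self._items) < self._capacity, timeout
            )
            if not has_room:
                return False
            self._items.append(item)
            self._cond.notify_all()
            return True

    def drain(self):
        """Consumer side: empty the buffer and wake any waiting producers."""
        with self._cond:
            items, self._items = self._items, []
            self._cond.notify_all()
            return items
```

If the consumer side stalls (for example while its output is blocked), every producer that reaches a full buffer without a timeout stays parked in the wait, which matches the shape of the thread dump above without proving what caused it.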
Re: [Resin-interest] Deadlock in access log
Hi Paul

Thank you for that. For now we've disabled the access log in our resin.conf (and lo and behold, the problem went away.. ;) )

I know that the 3.1 branch is considered stable and won't get large new features, but the 4.0 branch is still (at least in our opinion) a bit too development-ish in nature, so we are sticking with 3.1 for now.

Regards,
Jens Dueholm Christensen

From: resin-interest-boun...@caucho.com [resin-interest-boun...@caucho.com] On Behalf Of Paul Cowan [co...@caucho.com]
Sent: 28 October 2011 19:48
To: General Discussion for the Resin application server
Subject: Re: [Resin-interest] Deadlock in access log

On Oct 26, 2011, at 8:57 AM, Jens Dueholm Christensen (JEDC) wrote:

Hi

We’ve been hit rather badly by errors very similar to these: http://bugs.caucho.com/view.php?id=3509 during the last few days, where we’ve had to restart Resin several times a day.

A lot of threads are blocked on the same object at com.caucho.server.log.AccessLog.log(AccessLog.java:345). I could attach a stacktrace, but it’s _very_ similar to those in mantis bug #3509.

Our app just stalls on accepting incoming connections while threads and other services within the VM continue to run just fine.

We are still on Resin Pro 3.1 (somewhere between 3.1.10 and 3.1.11, as we’ve been using a special build to alleviate another bug that’s not listed in the 3.1.11 changelog) – any chance of a backport of the fixes to the 3.1 branch and a new version – or just a nightly build with the fix?

Hi Jens,

We're generally only fixing critical bugs in the Resin 3.1 branch, but I've created a backport request for bug 3509 for tracking purposes.

http://bugs.caucho.com/view.php?id=4831

Please monitor the issue. If we won't fix it we'll mark it as such.

Thanks,
Paul

Regards,
Jens Dueholm Christensen
Survey IT

===
Paul Cowan, Software Engineer
Caucho Technology
co...@caucho.com
http://blog.caucho.com
http://twitter.com/cauchoresin

___
resin-interest mailing list
resin-interest@caucho.com
http://maillist.caucho.com/mailman/listinfo/resin-interest
[Resin-interest] Deadlock in access log
Hi

We've been hit rather badly by errors very similar to these: http://bugs.caucho.com/view.php?id=3509 during the last few days, where we've had to restart Resin several times a day.

A lot of threads are blocked on the same object at com.caucho.server.log.AccessLog.log(AccessLog.java:345). I could attach a stacktrace, but it's _very_ similar to those in mantis bug #3509.

Our app just stalls on accepting incoming connections while threads and other services within the VM continue to run just fine.

We are still on Resin Pro 3.1 (somewhere between 3.1.10 and 3.1.11, as we've been using a special build to alleviate another bug that's not listed in the 3.1.11 changelog) - any chance of a backport of the fixes to the 3.1 branch and a new version - or just a nightly build with the fix?

Regards,
Jens Dueholm Christensen
Survey IT
[Resin-interest] Resin 3.1.11 comming soon?
Hi

What are the chances of a new release of the Resin 3.1 branch (i.e. a version 3.1.11) any time soon? Bug #4349, which has affected us several times, is marked as fixed in version 3.1.11..

Regards,
Jens Dueholm Christensen
Survey IT
Re: [Resin-interest] Does dependency-check-intervalaffect jspchanges?
Ok, thanks for that Rob.

Regards,
Jens Dueholm Christensen
Business Process and Improvement, Rambøll Survey IT

From: resin-interest-boun...@caucho.com [mailto:resin-interest-boun...@caucho.com] On Behalf Of Rob Lockstone
Sent: 18 June 2009 01:09
To: General Discussion for the Resin application server
Subject: Re: [Resin-interest] Does dependency-check-intervalaffect jspchanges?

I'd recommend setting the global dependency-check-interval and the jsp one and not worrying about the inheritance aspect. That's what we do and it works fine. What you have below should do what you want. The only change I'd make is to use 60s instead of just 60. I'm not sure if Resin assumes 'seconds' or not.

Rob

On Jun 17, 2009, at 01:46, Jens Dueholm Christensen wrote:

Scott Ferguson wrote:

JSP is handled separately and has its own check interval. The concepts are similar, of course, but the actual needs are different enough that it made more sense to configure them separately.

Thanks Scott

However, as Rob Lockstone points out regarding the jsp version of the setting, the default value is inherited - but from where?

If I were to do something like

(Resin 3.0) -1 60
(Resin 3.1/4) -1 60

would I still get the behaviour I want (i.e. able to change resin.conf without restart, but still get on-the-fly recompile of jsp changes), or is the setting for jsp inherited from elsewhere (and thus the last dependency-check-interval is unnecessary)?

Regards,
Jens Dueholm Christensen
Business Process and Improvement, Rambøll Survey IT
Re: [Resin-interest] Does dependency-check-interval affect jspchanges?
Scott Ferguson wrote:

JSP is handled separately and has its own check interval. The concepts are similar, of course, but the actual needs are different enough that it made more sense to configure them separately.

Thanks Scott

However, as Rob Lockstone points out regarding the jsp version of the setting, the default value is inherited - but from where?

If I were to do something like

(Resin 3.0) -1 60
(Resin 3.1/4) -1 60

would I still get the behaviour I want (i.e. able to change resin.conf without restart, but still get on-the-fly recompile of jsp changes), or is the setting for jsp inherited from elsewhere (and thus the last dependency-check-interval is unnecessary)?

Regards,
Jens Dueholm Christensen
Business Process and Improvement, Rambøll Survey IT
[Resin-interest] Resin pro 3.0.24 with Apache frontend with increased load after a directory attack..
) = 16379
...
write(44, "last-updateS\0\n1073741823h\0\0c\0\0e\0"..., 1818) = 1818
close(44) = 0
rename("/tmp/resintmp-K8ylXW", "/tmp/localhost_6856") = 0
unlink("/tmp/resintmp-K8ylXW") = -1 ENOENT (No such file or directory)

Then I remembered seeing a post about "localhost_" files on the mailing list from Vlad Artamonov (03 Aug 2008) and Scott Ferguson's reply on the 4th.

Oh dear - I had some large localhost_ files:

-rw------- 1 apache apache 198379 Sep 16 11:22 /tmp/localhost_6856
-rw------- 1 apache apache 176417 Sep 16 11:22 /tmp/localhost_6862
-rw------- 1 apache apache 766 Sep 16 11:26 /tmp/localhost_6873
-rw------- 1 apache apache 152038 Sep 16 11:20 /tmp/localhost_6880
-rw------- 1 apache apache 139985 Sep 16 09:25 /tmp/localhost_6893
-rw------- 1 apache apache 140689 Sep 16 11:21 /tmp/localhost_6897

(The smaller one (localhost_6873) belongs to a site that's only available from specific IPs in the firewall and was never attacked, so I took that as a "normal" size.)

Looking at the contents of them (via less and strings) I could see a lot of what looked like leftover garbage from the directory attack we experienced Sunday.

I then stopped Apache, removed the files and restarted Apache, and as can be seen on the "last 2 hours" graph this immediately lowered the load (I removed the files around 11:55), and my problem vanished!

This left me wondering.

- Was there anything I could or should have done earlier to find the error (except to trace an Apache process as I did)?
- Was the behaviour of Resin's mod_caucho (from the pro version 3.0.24) as expected - i.e. it kept a really large cache file updated on every request, and the file persisted over restarts of both Apache and Resin?
- Can Resin detect when performance issues arise due to the large size and possibly do something?
- Can I somehow configure how often this file is updated - the documentation on the tag Scott mentioned doesn't mention the effect on the localhost_ files?
- Shouldn't these files be reset or removed when a VM is shut down or started to ensure optimal performance?
- Is the default behaviour different in Resin 3.1.x?
- Is there a bug somewhere or somehow?
- Did I do the "right thing" to remove the files, or should I have done something else entirely?

Regards,
Jens Dueholm Christensen
Rambøll Survey IT
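
The manual check described in this message (comparing the sizes of the /tmp/localhost_ files against a "normal" one) could be scripted roughly as below. This is a minimal sketch, not a Resin or mod_caucho tool: the function name, the localhost_* glob pattern and the size threshold are assumptions chosen for illustration, based only on the file names and sizes quoted above.

```python
from pathlib import Path


def find_large_cache_files(directory, threshold_bytes, pattern="localhost_*"):
    """Return (file name, size) pairs for matching files above the threshold.

    The pattern and the threshold are assumptions for this sketch; the
    thread only shows that an unaffected file was 766 bytes while the
    problematic ones were well over 100 KB.
    """
    large = []
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            size = path.stat().st_size
            if size > threshold_bytes:
                large.append((path.name, size))
    return large
```

A call such as find_large_cache_files("/tmp", 10 * 1024) would list candidates for a closer look; whether removing them is safe is exactly the open question asked above.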
Re: [Resin-interest] RHEL 5 Installation
Make sure you have the openssl-devel package installed - compiling usually needs the -devel packages of dependencies too.

I've had absolutely no problems with 3.1, and I doubt there's been much change in the configure-script for 3.2 (but I might just be totally wrong on that one..).

Regards,
Jens Dueholm Christensen
Business Process and Improvement, Rambøll Survey IT

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nathan Maves
Sent: 21 August 2008 07:14
To: resin-interest@caucho.com
Subject: [Resin-interest] RHEL 5 Installation

Hey guys,

I am trying to get resin 3.2 installed on some new linux boxes and I am having issues. I was curious if anyone else has done this to help me out.

Here are the details

64 bit RHEL 5
Java 6

I try to run the following :

./configure --enable-64bit --enable-ssl --with-java-home=/usr/java/default

But I get :

Openssl library was not found

I have also tried to specify the location of the OpenSSL lib but it does not work either.

--with-openssl=/usr/lib/openssl/engines

But I get :

configure: error: Can't find valid OpenSSL library in /usr/lib/openssl/engines/lib

Here is what is in that location :

[EMAIL PROTECTED] resin]$ ls -l /usr/lib/openssl/engines
total 132
-rwxr-xr-x 1 root root 14488 Jan 15 2008 lib4758cca.so
-rwxr-xr-x 1 root root 14604 Jan 15 2008 libaep.so
-rwxr-xr-x 1 root root 10456 Jan 15 2008 libatalla.so
-rwxr-xr-x 1 root root 18900 Jan 15 2008 libchil.so
-rwxr-xr-x 1 root root 16096 Jan 15 2008 libcswift.so
-rwxr-xr-x 1 root root 2444 Jan 15 2008 libgmp.so
-rwxr-xr-x 1 root root 8680 Jan 15 2008 libnuron.so
-rwxr-xr-x 1 root root 18748 Jan 15 2008 libsureware.so
-rwxr-xr-x 1 root root 14624 Jan 15 2008 libubsec.so

I hope this helps.

Thanks
Nathan
[Resin-interest] Resin 3.1 and pid-file
Hi

After using Resin 2.1 for a long, long time we upgraded to Resin 3.0 some time last year, and some months ago to 3.1.

We use stacktraces from the running VM for debugging, but the process of retrieving the correct PID when running on unix is now quite hard when we have 10+ Resin VMs running on the same server (jps just shows multiple Resin and WatchdogManager processes).

Previous versions (prior to 3.1 at least) used to have a -pid-file (or similar) option, so the parent Resin PID was recorded. This made it easy to get hold of the child PID with a bit of gawk-magic, but with 3.1 this is no longer a possibility.

What can I do to record the PID of a newly launched Resin process - or even better - the PID for the Java VM it spawns?

Regards,
Jens Dueholm Christensen
Business Process and Improvement, Rambøll Survey IT
Re: [Resin-interest] Problem with Resin 3.1.6 Pro and keepalive
Hi Scott

That seems to have done the trick!

A bit more digging told me that epoll(4) was introduced around 2.5.44, so that was spot on - thanks for the hint!

Apart from the increased usage of sockets (or so I understand it from http://www.caucho.com/resin/doc/server-tags.xtp#keepalive-select-enable) there shouldn't be any other side effects?

Oh well - seems like I should start working on upgrading my testserver.. :)

Regards,
Jens Dueholm Christensen
Rambøll Survey IT

From: [EMAIL PROTECTED] on behalf of Scott Ferguson
Sent: Tue 6/17/2008 19:07
To: General Discussion for the Resin application server
Subject: Re: [Resin-interest] Problem with Resin 3.1.6 Pro and keepalive

On Jun 17, 2008, at 9:47 AM, Jens Dueholm Christensen wrote:

> So far so good, however this also appears in resin-stderr.log at the same time:
>
> [2008-06-17 16:28:01.507]Exception in thread "resin-select-manager" java.io.IOException: failed to add EPOLL for pipe=47 (errno=-1)
> [2008-06-17 16:28:01.508] at com.caucho.server.port.JniSelectManager.initNative(Native Method)
> [2008-06-17 16:28:01.508] at com.caucho.server.port.JniSelectManager.run(JniSelectManager.java:274)
> [2008-06-17 16:28:01.508] at java.lang.Thread.run(Thread.java:619)

Can you try setting keepalive-select-enable="false" in the : ... ... false ...

I believe your older Linux version doesn't correctly support EPOLL, which the select manager uses to handle keepalives.

-- Scott

> Then upon accessing the site this appears in logs/resin-stdout.log:
>
> [2008-06-17 16:28:18.377][16:28:18.377] Tcp[results_test,0] failed keepalive (select)
> [2008-06-17 16:28:32.685][16:28:32.684] Tcp[results_test,1] failed keepalive (select)
> [2008-06-17 16:28:32.705][16:28:32.705] Tcp[results_test,2] failed keepalive (select)
> [2008-06-17 16:28:32.725][16:28:32.725] Tcp[results_test,3] failed keepalive (select)
> [2008-06-17 16:28:32.754][16:28:32.754] Tcp[results_test,4] failed keepalive (select)
> [2008-06-17 16:28:32.785][16:28:32.784] Tcp[results_test,5] failed keepalive (select)
> [2008-06-17 16:28:32.825][16:28:32.825] Tcp[results_test,6] failed keepalive (select)
>
> .. and the site changes between a 503-error page (as if the site isn't started) and a working site (using refresh in the browser a few times), so it's obviously not working as it should.
>
> This is my resin.conf:
>
> http://caucho.com/ns/resin" xmlns:resin="http://caucho.com/ns/resin/core">
> 150
> 5
> -Xmn32m
> -Xms64m
> -Xmx512m
> -server
> -XX:-OmitStackTraceInFastThrow
> 7875
> 100
> 30s
> true
> true
> 16384
> true
> true
> class="com.caucho.servlets.FileServlet"/>
> class="com.caucho.jsp.JspServlet">
> false
> 1024
> name="invoker"/>
> ... context params removed...
>
> I've looked at http://bugs.caucho.com/view.php?id=2555 (first and almost only relevant hit when googling for "resin failed keepalive" and variations) and tried to add the select-max>-tag, but to no avail (and I can't seem to find any documentation on that tag?).
>
> I think it's somehow related to the Exception in resin-select-manager at startup, but I'm not sure, and googling that error turns nothing up, so I'm lost on what to try..
>
> I built Resin with the following lines:
>
> $ export JAVA_HOME=/usr/java/jdk1.6.0_03
> $ ./configure --enable-jni --enable-ssl --with-apxs=/usr/sbin/apxs --prefix=/usr/local/resin/resin-pro-3.1.6
>
> and the output from the configure script does say:
>
> checking for JNI in /usr/java/jdk1.6.0_03/include/linux ... found
>
> Ideas are appreciated - the same resin.conf and configure-arguments (apart from --enable-64bit) work fine on a new production-ready server running RHEL5.2 (2.6.18-92) and JDK1.6.0_05.
>
> Regards,
> Jens Dueholm Christensen
> Rambøll Survey IT
[Resin-interest] Problem with Resin 3.1.6 Pro and keepalive
Hi

I have a test-server that runs a mix of Resin 2.1.17 and 3.0.23 Pro (haven't had a reason to upgrade yet) using 2 different Apache 2.2 instances as frontends for several vhosts on RHEL3u9 (2.4.21-50) with JDK1.6.0_03.

This setup is just fine, and performs as expected for all my needs. However, I'm in the process of rolling out Resin 3.1.6 Pro on a new server (since the 3.1 branch is now stable I might as well upgrade).

I've set everything up - added a new Apache instance for the 3.1.6 version of mod_caucho.so etc. - and started up a site.

In resin.conf (see below) I've used the -tag to output to logs/resin-stdout.log, and the -tag to logs/resin-stderr.log.

Upon startup of a site this is written to resin-stdout.log:

[2008-06-17 16:28:01.467][16:28:01.467] Linux 2.4.21-50.EL i386
[2008-06-17 16:28:01.467][16:28:01.467] Java(TM) SE Runtime Environment 1.6.0_03-b05, ISO-8859-1, en
[2008-06-17 16:28:01.468][16:28:01.467] Java HotSpot(TM) Server VM 1.6.0_03-b05, 32, mixed mode, Sun Microsystems Inc.
[2008-06-17 16:28:01.468][16:28:01.468] user.name: root
[2008-06-17 16:28:01.468][16:28:01.468] resin.home = /usr/local/resin/resin-pro-3.1.6/
[2008-06-17 16:28:01.469][16:28:01.469] resin.root = /usr/local/www/results_test
[2008-06-17 16:28:01.469][16:28:01.469] resin.conf = /usr/local/www/results_test/conf/resin.conf
[2008-06-17 16:28:01.469][16:28:01.469]
[2008-06-17 16:28:16.800][16:28:16.800] Loaded Socket JNI library.
[2008-06-17 16:28:16.804][16:28:16.804] hmux listening to localhost.localdomain:6875
[2008-06-17 16:28:16.816][16:28:16.816] Server[id=results_test,cluster=web] active
[2008-06-17 16:28:16.818][16:28:16.818] Resin started in 17788ms

So far so good, however this also appears in resin-stderr.log at the same time:

[2008-06-17 16:28:01.507]Exception in thread "resin-select-manager" java.io.IOException: failed to add EPOLL for pipe=47 (errno=-1)
[2008-06-17 16:28:01.508] at com.caucho.server.port.JniSelectManager.initNative(Native Method)
[2008-06-17 16:28:01.508] at com.caucho.server.port.JniSelectManager.run(JniSelectManager.java:274)
[2008-06-17 16:28:01.508] at java.lang.Thread.run(Thread.java:619)

Then upon accessing the site this appears in logs/resin-stdout.log:

[2008-06-17 16:28:18.377][16:28:18.377] Tcp[results_test,0] failed keepalive (select)
[2008-06-17 16:28:32.685][16:28:32.684] Tcp[results_test,1] failed keepalive (select)
[2008-06-17 16:28:32.705][16:28:32.705] Tcp[results_test,2] failed keepalive (select)
[2008-06-17 16:28:32.725][16:28:32.725] Tcp[results_test,3] failed keepalive (select)
[2008-06-17 16:28:32.754][16:28:32.754] Tcp[results_test,4] failed keepalive (select)
[2008-06-17 16:28:32.785][16:28:32.784] Tcp[results_test,5] failed keepalive (select)
[2008-06-17 16:28:32.825][16:28:32.825] Tcp[results_test,6] failed keepalive (select)

.. and the site changes between a 503-error page (as if the site isn't started) and a working site (using refresh in the browser a few times), so it's obviously not working as it should.

This is my resin.conf:

http://caucho.com/ns/resin" xmlns:resin="http://caucho.com/ns/resin/core"> 150 5 -Xmn32m -Xms64m -Xmx512m -server -XX:-OmitStackTraceInFastThrow 7875 100 30s true true 16384 true true false 1024 ... context params removed...

I've looked at http://bugs.caucho.com/view.php?id=2555 (first and almost only relevant hit when googling for "resin failed keepalive" and variations) and tried to add the -tag, but to no avail (and I can't seem to find any documentation on that tag?).

I think it's somehow related to the Exception in resin-select-manager at startup, but I'm not sure, and googling that error turns nothing up, so I'm lost on what to try..

I built Resin with the following lines:

$ export JAVA_HOME=/usr/java/jdk1.6.0_03
$ ./configure --enable-jni --enable-ssl --with-apxs=/usr/sbin/apxs --prefix=/usr/local/resin/resin-pro-3.1.6

and the output from the configure script does say:

checking for JNI in /usr/java/jdk1.6.0_03/include/linux ... found

Ideas are appreciated - the same resin.conf and configure-arguments (apart from --enable-64bit) work fine on a new production-ready server running RHEL5.2 (2.6.18-92) and JDK1.6.0_05.

Regards,
Jens Dueholm Christensen
Rambøll Survey IT
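
The two kernel versions named in this message (2.4.21-50 on the failing test server, 2.6.18-92 on the working one) can be compared against the "around 2.5.44" epoll figure given in the reply to it. The check below is a rough sketch under that assumption: the function names and the exact cut-off tuple are illustrative, and vendor kernels may backport or omit features, so the result is only a hint.

```python
import re

# Assumed cut-off, taken from the "introduced around 2.5.44" remark in the thread.
EPOLL_MIN_KERNEL = (2, 5, 44)


def parse_kernel_release(release):
    """Turn a release string such as '2.4.21-50.EL' into (2, 4, 21)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def kernel_probably_has_epoll(release):
    """Heuristic only: compares the parsed version with the assumed cut-off."""
    return parse_kernel_release(release) >= EPOLL_MIN_KERNEL
```

On a live machine the release string could come from platform.release(); a False result would suggest trying the keepalive-select-enable setting mentioned in the reply before digging further.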