I actually had this problem and I was able to duplicate it  
consistently by simply trying to run a default installation of  
Atlassian's Confluence software, version 2.6.

Rather than doing all the profiling like you did (which I would have
liked to do, but I was too lazy), I just took a thread dump. The time
appeared to be going into the VFS package: Resin 3.1.2 under Linux
CentOS 4.x was spending FOREVER inside the jar loading code.
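
For anyone who wants a thread dump without sending a signal to the JVM, here is a rough sketch of collecting one from inside the process. It is only an illustration: the class name ThreadDumpSketch and the output layout are made up for this example, and it relies only on Thread.getAllStackTraces(), which is available from Java 1.5 on.

```java
import java.util.Map;

final class ThreadDumpSketch {
    /** Builds a plain-text dump of every live thread and its current stack. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(" state=").append(t.getState()).append('\n');
            // One line per stack frame, innermost frame first.
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Taking a few dumps some seconds apart and looking for frames that show up every time (here, the com.caucho.vfs ones) is usually enough to see where a stuck server is spending its time.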

I tested Confluence under Tomcat with no apparent problem.   
Everything loaded FAST.

Finally, after about 6 hours trying to diagnose this problem, I went
ahead and downloaded the newest Resin snapshot (dated Sep 27, 2007 on
Caucho's download page). I was this close to moving over to
either JBoss or Tomcat. But the good part is that the new snapshot
actually solved my problem. I checked the bug tracker and didn't notice
any bug fixes related to this issue, so I'm not sure which change fixed it.

Anyway, try installing the newest snapshot version (not the stable
3.1.2 release, which apparently was too stable for my own good) and  
see if it solves your problem.

Thanks,
Chris


On Oct 6, 2007, at 4:03 AM, Mattias Jiderhamn wrote:

> Thanks for all the suggestions.
>
> Serge wrote:
>> I've seen something like this, and we were able to diagnose it on a Unix
>> environment by using a program called truss
>> (http://www.scit.wlv.ac.uk/cgi-bin/mansec?1+truss) which allows you to
>> watch every file that is getting read.  I don't know of a Windows
>> equivalent.  Most Java profilers I know of do not help you find IO
>> related issues very well, so you might want to look for a lower level OS
>> tool to see what's changing.
>>
>> What we found was that the application was looking for a class that did
>> not exist.  Java does not hold a cache entry for "class not found", so
>> it kept opening every jar in the entire classpath to find said class,
>> numerous times for a single http request because of where this library
>> was used.  You might want to check all the resin/lib and WEB-INF/lib and
>> whatever other classes to see if a non-critical class is missing but
>> being searched for over and over.
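
Serge's suggestion of checking resin/lib and WEB-INF/lib for a class that is looked up but never found could be sketched roughly as below. This is an assumption-laden illustration, not a Resin tool: JarClassFinder, toEntryName and findJarsContaining are hypothetical names, and it only looks at jars directly inside one directory.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipFile;

final class JarClassFinder {
    /** Converts "com.example.Foo" to the jar entry name "com/example/Foo.class". */
    static String toEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns the jars directly inside libDir that contain the given class. */
    static List<File> findJarsContaining(File libDir, String className)
            throws IOException {
        List<File> hits = new ArrayList<>();
        File[] files = libDir.listFiles();
        if (files == null) {
            return hits; // not a directory, or not readable
        }
        String entry = toEntryName(className);
        for (File f : files) {
            if (!f.isFile() || !f.getName().endsWith(".jar")) {
                continue;
            }
            try (ZipFile zip = new ZipFile(f)) {
                if (zip.getEntry(entry) != null) {
                    hits.add(f);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarClassFinder <lib-dir> <class-name>");
            return;
        }
        List<File> hits = findJarsContaining(new File(args[0]), args[1]);
        System.out.println(hits.isEmpty() ? "not found in any jar" : hits.toString());
    }
}
```

Running it once per lib directory with the class name taken from a thread dump or a truss/FileMon trace shows whether that class exists in any jar at all. An empty result for a class that keeps being requested would match the situation Serge describes.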
> I have watched file access using FileMon on Windows. Unfortunately, I
> don't get enough information to see if I have the same problem you had.
> But since the initial start of the application is also slow, I lean
> towards there being a generic problem with class loading and/or
> dependency checking.
> There seem to be several dependency checks during a single request,
> which is why a simple page can take 40 seconds. If I increase the
> dependency check interval, so that there is only one check during the
> request monitored by FileMon, it will finish much faster.
>
>
> Scott wrote:
>> Do you see a difference if you remove the resin.dll? (Which would
>> disable the JNI.)
>>
> Not much.
>
>> Also, what is the stack trace for the slow read() call?
> With JNI, 57% of the time for loading a single page (total 78 s) is
> spent in
> com.caucho.vfs.JniStream.read, caused by (only!!!) 22 calls from
>   com.caucho.vfs.ReadStream.readBuffer
>   com.caucho.vfs.ReadStream.waitForRead
>   com.caucho.server.port.TcpConnection.run
>   ...
>
>
> Without JNI, 43% of the time (total 33 s) is spent in
> java.io.File.lastModified, caused by 28000 calls (dependency checking over
> and over) from
>   com.caucho.vfs.FilePath.getLastModified
>   com.caucho.vfs.Depend.isModified
>   ...
> and 41% of the time is spent in
> java.io.File.length called 28000 times from
>   com.caucho.vfs.FilePath.getLength
>   com.caucho.vfs.Depend.isModified
>   ...
>
>
> I created a simple test page that calls File.length and File.lastModified
> 100,000 times and measures the time. I then ran the page in another
> webapp that does not seem to be affected (on the same computer, though).
> The results were the same (about 0.2 ms / operation; I'm not sure if this
> is normal, so I'll have to compare with the office computer next week).
> Anyway, this could indicate that the problem is not with the I/O calls
> themselves, but with the number of I/O calls... (which makes Serge's tip
> more interesting). Again, I will have to profile on the other computer next
> week and compare.
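
A stand-alone version of the kind of timing test Mattias describes might look like the sketch below. It is not his test page: the class FileStatBenchmark, the method name and the default target path "." are assumptions for illustration. It times pairs of File.length() and File.lastModified() calls, the same two calls that show up under com.caucho.vfs.Depend.isModified in the profile.

```java
import java.io.File;

final class FileStatBenchmark {
    /**
     * Returns the average time in nanoseconds for one iteration, where an
     * iteration is one File.length() call plus one File.lastModified() call.
     */
    static double averageNanosPerIteration(File file, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long sink = 0; // accumulate results so the calls are not optimized away
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += file.length();
            sink += file.lastModified();
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // practically never true; keeps sink live
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        File target = new File(args.length > 0 ? args[0] : ".");
        int iterations = 100000;
        double nanos = averageNanosPerIteration(target, iterations);
        System.out.printf("%d iterations, %.4f ms per iteration%n",
                iterations, nanos / 1e6);
    }
}
```

Running it against a file on the external drive and one on the internal drive, on both computers, gives a baseline for the cost of a single check. If that cost is similar everywhere, the 28000 calls per request are the more likely culprit.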
>
>
> Tom Hintz wrote:
>> I’ve seen small TCP/IP MTU sizes cause similar behavior.  We had VPN
>> software that set the MTU to less than the IP default, which I think
>> should have been 1500(?).  Result was packet fragmentation that caused
>> huge delays in the Microsoft TCP implementation.  I think the MTU
>> setting, in this case, was in the VPN software.
> I have not run any VPN software other than the built-in PPTP client, and
> have not changed the MTU of the NICs. (MTU isn't fetched via DHCP, is it?)
> I thought maybe there was a problem fetching external DTDs when reading
> Hibernate mappings or whatever (lots of profiling time spent parsing doc
> types, and my ISP has had some issues lately). However, disabling the
> NIC altogether makes no difference.
>
>
>> By the way, the CPU usage is peaking all the way through. Around 49% on
>> my dual core system.
>> (And there is even more memory on the computer with the problem than the
>> other one)
>>
>>
>>> -----Original Message-----
>>> From: [EMAIL PROTECTED]
>>> [mailto:[EMAIL PROTECTED] Behalf Of Mattias  
>>> Jiderhamn
>>> Sent: Friday 05 October 2007 11:20
>>> To: Resin
>>> Subject: [Resin-interest] Slooow file reads (really weird!)
>>>
>>>
>>> Hi list.
>>> I have my J2EE webapp on an external hard drive, which I carry between
>>> my office and my home computer.
>>> On each computer - running Windows XP and Java 1.5 - I have a Resin
>>> (3.0.22) installation and a shortcut to start Resin with the server root
>>> on the external drive.
>>> This has worked flawlessly for over a year.
>>>
>>> Now suddenly (ok, after returning from vacation), the application is
>>> immensely slow - on one of the computers!
>>> Starting Resin and the application now takes anywhere from 3 to 5
>>> minutes on my home computer, compared to the usual 30 or so seconds.
>>> Loading a simple page can take 40 seconds. It seems that most of the
>>> time is spent inside the disk access of the dependency checking.
>>> (Turning dependency checking off was much faster, but still slower than
>>> normal).
>>> I tried to copy the project to the internal drive to see if there was
>>> some interface hardware issue - no difference.
>>> I profiled the application with JProfiler, and it thinks there is a
>>> hotspot in com.caucho.vfs.JniStream.read(). Why...?
>>>
>>> I am running out of ideas on what could be wrong and how to track it down.
>>>
>>> Any tips would be much welcome!
>>>
>>>   /Mattias
>
>
>
> _______________________________________________
> resin-interest mailing list
> resin-interest@caucho.com
> http://maillist.caucho.com/mailman/listinfo/resin-interest


