Re: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files

2001-10-25 Thread Paul Speed


I couldn't find alot of info on testing.  I also couldn't find any
tests that included multiple files... so I may be looking in the wrong
place.  I eventually found and played with the tester stuff.

Attached are the files I added to the tester to exploit the include
problem.  SSIInclude09.shtml simply includes another .shtml file 
twice.

Here is the section I added to tester.xml to get the test to run.
tester host=${host} port=${port} protocol=${protocol}
 request=${context.path}/SSIInclude09.shtml debug=${debug}
  golden=${golden.path}/SSIInclude03.txt/

I now have this working on my system here.  It currently passes all 
of the tester tests in addition to about 7 more tests that I added 
myself here locally.  I also added the initial support for the set 
directive and variable substitution.  I have one more command to get 
working and then some clean-up and I'll see about posting the diffs.  

Actually, while I'm on that subject, the diffs are extensive since
I've pretty much touched every SSI related file in a very significant
way... in addition to removing a few of them.  What is the preferred
way to submit such a large patch?

Thanks,
-Paul Speed

Bip Thelin wrote:
 
  -Original Message-
  From: Paul Speed [mailto:[EMAIL PROTECTED]]
 
  For the curious reader, after looking into this code at some length
  it seems clear why the set command was not added.  All SSI requests
  share the same environment, which not only makes a set command
  impossible but also means that multiple SSI requests (or even nested
  SSI requests) trample all over each other.  A simple shtml file that
  includes two other shtml files illustrates this quite nicely.
 
 Do you have a smal testcase? We have unittests with Tomcat that have
 nested includes and several includes in one page. All Ssi directives
 share the same enviroment per page through a mediator, this is due to
 the fact that you can have a config directive that changes the error
 message that you would get for a failed include further down on the same
 page, for instance.
 
 However if pageA includes pageB, if pageB is also an shtml/ssi file it
 would have a new fresh enviroment and could not tamper with pageA's
 enviroment.
 
 So you could easily do a set command simmilar to the config command.
 
  Since I'm between assignments at the moment, I'm working on a patch
  here locally.  It's pretty significant, though, so it may take me a
  few days.  It will include the set command though since that's what
  I'm going to use to test it. :)
 
 Patches and additions are gladly appreciated.
 
 Bip Thelin

This is Content of includeme.shtml

This is Content of includeme.shtml




RE: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files

2001-10-25 Thread Bip Thelin

 -Original Message-
 From: Paul Speed [mailto:[EMAIL PROTECTED]] 
 
 Actually, while I'm on that subject, the diffs are extensive since
 I've pretty much touched every SSI related file in a very significant
 way... in addition to removing a few of them.  What is the preferred
 way to submit such a large patch?

Send them along as .zip or .tar.gz if they're *really* big maybe
you could put them somewhere and send along the link.

Bip



RE: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files

2001-10-25 Thread Bip Thelin

 -Original Message-
 From: Paul Speed [mailto:[EMAIL PROTECTED]] 

 [...]
 I now have this working on my system here.  It currently passes all 
 of the tester tests in addition to about 7 more tests that I added 
 myself here locally.  I also added the initial support for the set 
 directive and variable substitution.  I have one more command to get 
 working and then some clean-up and I'll see about posting the diffs.  
 
 Actually, while I'm on that subject, the diffs are extensive since
 I've pretty much touched every SSI related file in a very significant
 way... in addition to removing a few of them.  What is the preferred
 way to submit such a large patch?

I don't know how much you have left but I'm going to be unavailable for
the coming weeks starting from tomorrow. I'm relocating back to Sweden
from San Francisco. So if you want/can you could send me the files today
and I could try and integrate the changes before I take off.


Bip Thelin



Re: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files

2001-10-24 Thread Paul Speed



Bip Thelin wrote:
 
  -Original Message-
  From: Paul Speed [mailto:[EMAIL PROTECTED]]
 
  For the curious reader, after looking into this code at some length
  it seems clear why the set command was not added.  All SSI requests
  share the same environment, which not only makes a set command
  impossible but also means that multiple SSI requests (or even nested
  SSI requests) trample all over each other.  A simple shtml file that
  includes two other shtml files illustrates this quite nicely.
 
 Do you have a smal testcase? We have unittests with Tomcat that have
 nested includes and several includes in one page. All Ssi directives
 share the same enviroment per page through a mediator, this is due to
 the fact that you can have a config directive that changes the error
 message that you would get for a failed include further down on the same
 page, for instance.

Actually, SsiInvokerServlet has a static reference to SsiMediator.
Furthermore, all of SsiMediators fields are static.  

A simple test case that I use is a .shtml page that includes the same 
.shtml page twice.  Only the first one will actually be included 
because of the way the Request object in SsiMediator is overwritten.

 
 However if pageA includes pageB, if pageB is also an shtml/ssi file it
 would have a new fresh enviroment and could not tamper with pageA's
 enviroment.
 
 So you could easily do a set command simmilar to the config command.

Actually, includes should share the environment of the parent...
in fact, if they set server variables the parent will see them.

 
  Since I'm between assignments at the moment, I'm working on a patch
  here locally.  It's pretty significant, though, so it may take me a
  few days.  It will include the set command though since that's what
  I'm going to use to test it. :)
 
 Patches and additions are gladly appreciated.

Cool.  I'm almost done refactoring.  I'm basically replacing the
SsiMediator with an SsiEnvironment that is then stuck into a
request attribute.  In the process, I'm moving some things around 
a little since all of the commands were relying on the fact that 
they were SsiMediator subclasses... and therefore directly accessing 
the static fields of SsiMediator.

Hopefully I'll have something done in the morning.  Even more 
hopeful, it will be worth committing. ;)  We'll see...
-Paul Speed



DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files

2001-10-24 Thread bugzilla

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361

SsiServlet potentially leaks files





--- Additional Comments From [EMAIL PROTECTED]  2001-10-24 07:47 ---
The file leaking problem appears to have been solved in the latest
servlets-ssi.jar :-)

This was the last SSI related bug I have to report for now. Thank you Bip for
your time and contributions.



RE: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files

2001-10-24 Thread Bip Thelin

 -Original Message-
 From: Paul Speed [mailto:[EMAIL PROTECTED]] 
 
 [...]
 
 Actually, includes should share the environment of the parent...
 in fact, if they set server variables the parent will see them.

Ok, that might be true(just looked at Apache's behavior and they
seem to do just that), when we implemented SSI we strictly followed
the NCSA standard which don't have set, and doesn't talk about if
the included page should see commands set by the parent. God catch!

 Cool.  I'm almost done refactoring.  I'm basically replacing the
 SsiMediator with an SsiEnvironment that is then stuck into a
 request attribute.  In the process, I'm moving some things around 
 a little since all of the commands were relying on the fact that 
 they were SsiMediator subclasses... and therefore directly accessing 
 the static fields of SsiMediator.

Ok, sounds good, send along some code and I can take a look at it
and commit it.


Bip



DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files

2001-10-23 Thread bugzilla

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361

SsiServlet potentially leaks files

[EMAIL PROTECTED] changed:

   What|Removed |Added

 Status|NEW |ASSIGNED



--- Additional Comments From [EMAIL PROTECTED]  2001-10-23 09:57 ---
Hmm, the SsiPackage doesn't open any FileStreams explicity, only indirect
through RequestDispatcher.include(). Could it be that your lengthly data 
import operation that accesses thousands of HTML documents exceeds the number 
of open files currently allowed on your system?

You could try to raise proc/sys/kernel/file-max AND /proc/sys/kernel/inode-max
and see if things work better, you could also look at how many files currently
open with 'lsof'.

Could you try that out and report back if it works or not.

   Bip Thelin



DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files

2001-10-23 Thread bugzilla

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361

SsiServlet potentially leaks files





--- Additional Comments From [EMAIL PROTECTED]  2001-10-23 12:39 ---
I can't recreate the error here but I did happen to notice something as I was 
browsing through the code.  Each SsiInvokerServlet request causes it to open the 
requested file as a Resource.  The InputStream that is obtained from this 
Resource is never closed.  I tried to poke around in the Resource classes to see 
if some magic was closing the stream but couldn't find anything.

I can attach a patch if needed, but the change is pretty simple.  Just a 
try/finally with a close().



DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files

2001-10-23 Thread bugzilla

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361

SsiServlet potentially leaks files





--- Additional Comments From [EMAIL PROTECTED]  2001-10-23 13:17 ---
Well, my lengthy operation succeeds if SSI operations have been disabled, and
fails if SSI is enabled. The problem may of course be in my own application too,
but between the successfull and failed tests cases, the only thing that is
different is that wether SSI has been enabled or not. The same application has
also proven to work in other production environments, so I have some confidence
on it. On the other hand, Tomcat4 is a new and different runtime environment, so
it may be that some old problem is just now manifesting itself in my
application.

Anyway, the SSI servlet must surely open a stream to the material it is about to
include into a master document, so the big question is, does the servlet ever
close that stream explicitly? And does it do it right, like this:

InputStream in = someOpenOperation();

try
{
  :
}
finally // Close the stream even if something goes wrong
{
  in.close();
}

Anyway, to make sure, I also run some 'lsof' tests. Without SSI servlets, my
Java application kept about 4.000 open files at any given time. However, with
SSI servlets enabled, the number started to grow. Although the garbage collertor
(or something) seemed to periodically close files, the average number of open
files soon exceeded 10.000, and the peak values grew near 80.000. And after some
time, the peak value hit 100.000, halting effectively all server services. After
the test, the number remained high until Tomcat was restarted, and all was well
again.

Here is a sample of the number of open files data, taken at 2 second intervals,
and at the moment the Too many files errors started to appeared. As you see,
there is some cyclic behaviour, but the average increased all the time.

  36351
  43159
  49996
  57696
  62826
  67088
  77168
  88502
  96290
 110256
 100572
  44237
  48503
  56869
  63395
  71967
  80834
  87933
 100460
 115717
 118438
 118438
 118424
 118438
 118429
 118438

As a final minor detail, I should propably mention that I'm mapping the SSI
servlet to *.html file name pattern. This is propably not significant, but you
never know ...



Re: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files

2001-10-23 Thread Paul Speed

On a vaguely related note...

For the curious reader, after looking into this code at some length
it seems clear why the set command was not added.  All SSI requests
share the same environment, which not only makes a set command 
impossible but also means that multiple SSI requests (or even nested
SSI requests) trample all over each other.  A simple shtml file that
includes two other shtml files illustrates this quite nicely.

Since I'm between assignments at the moment, I'm working on a patch
here locally.  It's pretty significant, though, so it may take me a 
few days.  It will include the set command though since that's what
I'm going to use to test it. :)

If noone else beats me to it, I'll post it when I'm done.
-Paul Speed

[EMAIL PROTECTED] wrote:
 
 DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
 RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
 http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361.
 ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
 INSERTED IN THE BUG DATABASE.
 
 http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361
 
 SsiServlet potentially leaks files
 
 --- Additional Comments From [EMAIL PROTECTED]  2001-10-23 12:39 ---
 I can't recreate the error here but I did happen to notice something as I was
 browsing through the code.  Each SsiInvokerServlet request causes it to open the
 requested file as a Resource.  The InputStream that is obtained from this
 Resource is never closed.  I tried to poke around in the Resource classes to see
 if some magic was closing the stream but couldn't find anything.
 
 I can attach a patch if needed, but the change is pretty simple.  Just a
 try/finally with a close().



RE: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files

2001-10-23 Thread Bip Thelin

 -Original Message-
 From: Paul Speed [mailto:[EMAIL PROTECTED]] 
 
 For the curious reader, after looking into this code at some length
 it seems clear why the set command was not added.  All SSI requests
 share the same environment, which not only makes a set command 
 impossible but also means that multiple SSI requests (or even nested
 SSI requests) trample all over each other.  A simple shtml file that
 includes two other shtml files illustrates this quite nicely.

Do you have a smal testcase? We have unittests with Tomcat that have
nested includes and several includes in one page. All Ssi directives
share the same enviroment per page through a mediator, this is due to
the fact that you can have a config directive that changes the error
message that you would get for a failed include further down on the same
page, for instance.

However if pageA includes pageB, if pageB is also an shtml/ssi file it
would have a new fresh enviroment and could not tamper with pageA's
enviroment.

So you could easily do a set command simmilar to the config command.

 Since I'm between assignments at the moment, I'm working on a patch
 here locally.  It's pretty significant, though, so it may take me a 
 few days.  It will include the set command though since that's what
 I'm going to use to test it. :)

Patches and additions are gladly appreciated.


Bip Thelin



DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files

2001-10-23 Thread bugzilla

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361

SsiServlet potentially leaks files

[EMAIL PROTECTED] changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED



--- Additional Comments From [EMAIL PROTECTED]  2001-10-23 17:18 ---
I just commited some code that could very well fix the problem your having. You
were right about not closing some of the streams. Even though the 
garbagcollector
should it takes a while for it to do it and is generally ugly to keep unused 
open
streams.

Please try the upcoming nightly build and let me know if the problem persists.
And about mapping SSI to *.html, that's not an optional solution since it adds 
overhead to each request.(It needs to scan every HTML file served).