Re: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files
I couldn't find alot of info on testing. I also couldn't find any tests that included multiple files... so I may be looking in the wrong place. I eventually found and played with the tester stuff. Attached are the files I added to the tester to exploit the include problem. SSIInclude09.shtml simply includes another .shtml file twice. Here is the section I added to tester.xml to get the test to run. tester host=${host} port=${port} protocol=${protocol} request=${context.path}/SSIInclude09.shtml debug=${debug} golden=${golden.path}/SSIInclude03.txt/ I now have this working on my system here. It currently passes all of the tester tests in addition to about 7 more tests that I added myself here locally. I also added the initial support for the set directive and variable substitution. I have one more command to get working and then some clean-up and I'll see about posting the diffs. Actually, while I'm on that subject, the diffs are extensive since I've pretty much touched every SSI related file in a very significant way... in addition to removing a few of them. What is the preferred way to submit such a large patch? Thanks, -Paul Speed Bip Thelin wrote: -Original Message- From: Paul Speed [mailto:[EMAIL PROTECTED]] For the curious reader, after looking into this code at some length it seems clear why the set command was not added. All SSI requests share the same environment, which not only makes a set command impossible but also means that multiple SSI requests (or even nested SSI requests) trample all over each other. A simple shtml file that includes two other shtml files illustrates this quite nicely. Do you have a smal testcase? We have unittests with Tomcat that have nested includes and several includes in one page. All Ssi directives share the same enviroment per page through a mediator, this is due to the fact that you can have a config directive that changes the error message that you would get for a failed include further down on the same page, for instance. However if pageA includes pageB, if pageB is also an shtml/ssi file it would have a new fresh enviroment and could not tamper with pageA's enviroment. So you could easily do a set command simmilar to the config command. Since I'm between assignments at the moment, I'm working on a patch here locally. It's pretty significant, though, so it may take me a few days. It will include the set command though since that's what I'm going to use to test it. :) Patches and additions are gladly appreciated. Bip Thelin This is Content of includeme.shtml This is Content of includeme.shtml
RE: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files
-Original Message- From: Paul Speed [mailto:[EMAIL PROTECTED]] Actually, while I'm on that subject, the diffs are extensive since I've pretty much touched every SSI related file in a very significant way... in addition to removing a few of them. What is the preferred way to submit such a large patch? Send them along as .zip or .tar.gz if they're *really* big maybe you could put them somewhere and send along the link. Bip
RE: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files
-Original Message- From: Paul Speed [mailto:[EMAIL PROTECTED]] [...] I now have this working on my system here. It currently passes all of the tester tests in addition to about 7 more tests that I added myself here locally. I also added the initial support for the set directive and variable substitution. I have one more command to get working and then some clean-up and I'll see about posting the diffs. Actually, while I'm on that subject, the diffs are extensive since I've pretty much touched every SSI related file in a very significant way... in addition to removing a few of them. What is the preferred way to submit such a large patch? I don't know how much you have left but I'm going to be unavailable for the coming weeks starting from tomorrow. I'm relocating back to Sweden from San Francisco. So if you want/can you could send me the files today and I could try and integrate the changes before I take off. Bip Thelin
Re: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files
Bip Thelin wrote: -Original Message- From: Paul Speed [mailto:[EMAIL PROTECTED]] For the curious reader, after looking into this code at some length it seems clear why the set command was not added. All SSI requests share the same environment, which not only makes a set command impossible but also means that multiple SSI requests (or even nested SSI requests) trample all over each other. A simple shtml file that includes two other shtml files illustrates this quite nicely. Do you have a smal testcase? We have unittests with Tomcat that have nested includes and several includes in one page. All Ssi directives share the same enviroment per page through a mediator, this is due to the fact that you can have a config directive that changes the error message that you would get for a failed include further down on the same page, for instance. Actually, SsiInvokerServlet has a static reference to SsiMediator. Furthermore, all of SsiMediators fields are static. A simple test case that I use is a .shtml page that includes the same .shtml page twice. Only the first one will actually be included because of the way the Request object in SsiMediator is overwritten. However if pageA includes pageB, if pageB is also an shtml/ssi file it would have a new fresh enviroment and could not tamper with pageA's enviroment. So you could easily do a set command simmilar to the config command. Actually, includes should share the environment of the parent... in fact, if they set server variables the parent will see them. Since I'm between assignments at the moment, I'm working on a patch here locally. It's pretty significant, though, so it may take me a few days. It will include the set command though since that's what I'm going to use to test it. :) Patches and additions are gladly appreciated. Cool. I'm almost done refactoring. I'm basically replacing the SsiMediator with an SsiEnvironment that is then stuck into a request attribute. In the process, I'm moving some things around a little since all of the commands were relying on the fact that they were SsiMediator subclasses... and therefore directly accessing the static fields of SsiMediator. Hopefully I'll have something done in the morning. Even more hopeful, it will be worth committing. ;) We'll see... -Paul Speed
DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361 SsiServlet potentially leaks files --- Additional Comments From [EMAIL PROTECTED] 2001-10-24 07:47 --- The file leaking problem appears to have been solved in the latest servlets-ssi.jar :-) This was the last SSI related bug I have to report for now. Thank you Bip for your time and contributions.
RE: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files
-Original Message- From: Paul Speed [mailto:[EMAIL PROTECTED]] [...] Actually, includes should share the environment of the parent... in fact, if they set server variables the parent will see them. Ok, that might be true(just looked at Apache's behavior and they seem to do just that), when we implemented SSI we strictly followed the NCSA standard which don't have set, and doesn't talk about if the included page should see commands set by the parent. God catch! Cool. I'm almost done refactoring. I'm basically replacing the SsiMediator with an SsiEnvironment that is then stuck into a request attribute. In the process, I'm moving some things around a little since all of the commands were relying on the fact that they were SsiMediator subclasses... and therefore directly accessing the static fields of SsiMediator. Ok, sounds good, send along some code and I can take a look at it and commit it. Bip
DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361 SsiServlet potentially leaks files [EMAIL PROTECTED] changed: What|Removed |Added Status|NEW |ASSIGNED --- Additional Comments From [EMAIL PROTECTED] 2001-10-23 09:57 --- Hmm, the SsiPackage doesn't open any FileStreams explicity, only indirect through RequestDispatcher.include(). Could it be that your lengthly data import operation that accesses thousands of HTML documents exceeds the number of open files currently allowed on your system? You could try to raise proc/sys/kernel/file-max AND /proc/sys/kernel/inode-max and see if things work better, you could also look at how many files currently open with 'lsof'. Could you try that out and report back if it works or not. Bip Thelin
DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361 SsiServlet potentially leaks files --- Additional Comments From [EMAIL PROTECTED] 2001-10-23 12:39 --- I can't recreate the error here but I did happen to notice something as I was browsing through the code. Each SsiInvokerServlet request causes it to open the requested file as a Resource. The InputStream that is obtained from this Resource is never closed. I tried to poke around in the Resource classes to see if some magic was closing the stream but couldn't find anything. I can attach a patch if needed, but the change is pretty simple. Just a try/finally with a close().
DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361 SsiServlet potentially leaks files --- Additional Comments From [EMAIL PROTECTED] 2001-10-23 13:17 --- Well, my lengthy operation succeeds if SSI operations have been disabled, and fails if SSI is enabled. The problem may of course be in my own application too, but between the successfull and failed tests cases, the only thing that is different is that wether SSI has been enabled or not. The same application has also proven to work in other production environments, so I have some confidence on it. On the other hand, Tomcat4 is a new and different runtime environment, so it may be that some old problem is just now manifesting itself in my application. Anyway, the SSI servlet must surely open a stream to the material it is about to include into a master document, so the big question is, does the servlet ever close that stream explicitly? And does it do it right, like this: InputStream in = someOpenOperation(); try { : } finally // Close the stream even if something goes wrong { in.close(); } Anyway, to make sure, I also run some 'lsof' tests. Without SSI servlets, my Java application kept about 4.000 open files at any given time. However, with SSI servlets enabled, the number started to grow. Although the garbage collertor (or something) seemed to periodically close files, the average number of open files soon exceeded 10.000, and the peak values grew near 80.000. And after some time, the peak value hit 100.000, halting effectively all server services. After the test, the number remained high until Tomcat was restarted, and all was well again. Here is a sample of the number of open files data, taken at 2 second intervals, and at the moment the Too many files errors started to appeared. As you see, there is some cyclic behaviour, but the average increased all the time. 36351 43159 49996 57696 62826 67088 77168 88502 96290 110256 100572 44237 48503 56869 63395 71967 80834 87933 100460 115717 118438 118438 118424 118438 118429 118438 As a final minor detail, I should propably mention that I'm mapping the SSI servlet to *.html file name pattern. This is propably not significant, but you never know ...
Re: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files
On a vaguely related note... For the curious reader, after looking into this code at some length it seems clear why the set command was not added. All SSI requests share the same environment, which not only makes a set command impossible but also means that multiple SSI requests (or even nested SSI requests) trample all over each other. A simple shtml file that includes two other shtml files illustrates this quite nicely. Since I'm between assignments at the moment, I'm working on a patch here locally. It's pretty significant, though, so it may take me a few days. It will include the set command though since that's what I'm going to use to test it. :) If noone else beats me to it, I'll post it when I'm done. -Paul Speed [EMAIL PROTECTED] wrote: DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361 SsiServlet potentially leaks files --- Additional Comments From [EMAIL PROTECTED] 2001-10-23 12:39 --- I can't recreate the error here but I did happen to notice something as I was browsing through the code. Each SsiInvokerServlet request causes it to open the requested file as a Resource. The InputStream that is obtained from this Resource is never closed. I tried to poke around in the Resource classes to see if some magic was closing the stream but couldn't find anything. I can attach a patch if needed, but the change is pretty simple. Just a try/finally with a close().
RE: DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files
-Original Message- From: Paul Speed [mailto:[EMAIL PROTECTED]] For the curious reader, after looking into this code at some length it seems clear why the set command was not added. All SSI requests share the same environment, which not only makes a set command impossible but also means that multiple SSI requests (or even nested SSI requests) trample all over each other. A simple shtml file that includes two other shtml files illustrates this quite nicely. Do you have a smal testcase? We have unittests with Tomcat that have nested includes and several includes in one page. All Ssi directives share the same enviroment per page through a mediator, this is due to the fact that you can have a config directive that changes the error message that you would get for a failed include further down on the same page, for instance. However if pageA includes pageB, if pageB is also an shtml/ssi file it would have a new fresh enviroment and could not tamper with pageA's enviroment. So you could easily do a set command simmilar to the config command. Since I'm between assignments at the moment, I'm working on a patch here locally. It's pretty significant, though, so it may take me a few days. It will include the set command though since that's what I'm going to use to test it. :) Patches and additions are gladly appreciated. Bip Thelin
DO NOT REPLY [Bug 4361] - SsiServlet potentially leaks files
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4361 SsiServlet potentially leaks files [EMAIL PROTECTED] changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED --- Additional Comments From [EMAIL PROTECTED] 2001-10-23 17:18 --- I just commited some code that could very well fix the problem your having. You were right about not closing some of the streams. Even though the garbagcollector should it takes a while for it to do it and is generally ugly to keep unused open streams. Please try the upcoming nightly build and let me know if the problem persists. And about mapping SSI to *.html, that's not an optional solution since it adds overhead to each request.(It needs to scan every HTML file served).