RE: mod_cache does not return a 304 Not Modified

2004-09-09 Thread Michael Corcoran

Ok, I've tried with 2.0.50, 2.0.51-rc2, and also APACHE_2_0_BRANCH.
They all seem to exhibit the same mishandling of a conditional
request.

Here's what I've found out.

The problem seems to be with the If-None-Match condition, or more
specifically, any condition that requires an ETag.

It seems that, during the execution of mod_cache.c (and
mod_disk_cache.c), the function ap_meets_conditions() (in
http_protocol.c) is called, which **properly** evaluates the request
conditions and returns the correct status code.  The problem is that,
right at the beginning, the function tries to retrieve the ETag value
from the r->headers_out table, but the value is always NULL
(at least when called from mod_cache.c), which causes the later
conditional checks to always fail and a status of OK to be returned.
The one case where you can get an HTTP_NOT_MODIFIED status returned is
if you specify an asterisk (*) as the ETag in the request, e.g.
If-None-Match: *
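
The failure mode described above can be sketched in isolation. This is a simplified stand-in, not the httpd source: check_if_none_match and its parameters are hypothetical names, and the comparison is a plain exact string match rather than a full entity-tag list comparison. It assumes only the behavior reported in this message: an asterisk matches without needing an ETag, while any other value needs the stored ETag, so a NULL ETag falls through to OK.

```c
#include <stddef.h>
#include <string.h>

/* Status values mirroring the names used in the message above. */
enum { STATUS_OK = 0, STATUS_NOT_MODIFIED = 304 };

/*
 * Hypothetical, simplified If-None-Match check.
 *   if_none_match: value of the request header, or NULL if absent
 *   etag:          ETag of the stored response, or NULL if never loaded
 */
int check_if_none_match(const char *if_none_match, const char *etag)
{
    if (if_none_match == NULL) {
        return STATUS_OK;               /* no condition to evaluate */
    }
    if (strcmp(if_none_match, "*") == 0) {
        return STATUS_NOT_MODIFIED;     /* "*" matches without needing an ETag */
    }
    if (etag != NULL && strcmp(if_none_match, etag) == 0) {
        return STATUS_NOT_MODIFIED;     /* tags match: client copy is current */
    }
    return STATUS_OK;                   /* a NULL ETag always ends up here */
}
```

With a NULL ETag, only the asterisk branch can produce a 304, which matches the symptom reported here.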

I would love to help out and try to fix this but, after looking at the
code and debugging for quite a bit, I can't seem to find the appropriate
place where the ETag value is supposed to be read in from the ".header"
file into the r->headers_out table.  If anyone wants to point me in the
right direction or have a look themselves, it would be much appreciated.

Best regards,
Michael Corcoran.


-----Original Message-----
From: Justin Erenkrantz [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, September 08, 2004 6:14 PM
To: [EMAIL PROTECTED]
Subject: Re: mod_cache does not return a 304 Not Modified

--On Wednesday, September 8, 2004 2:51 PM -0700 Michael Corcoran
<[EMAIL PROTECTED]> wrote:

> I've looked at the code a little bit (Apache 2.0.50), and at first
> glance, it seems as though proper 304 response handling might not
> actually be fully implemented yet.  Is that actually the case, or am I
> just missing something?

You should try the APACHE_2_0_BRANCH HEAD, the latest CVS snapshots, or
2.0.51 release candidates at: 

2.0.51 (coming soon) will have a bunch of changes to mod_cache that
*may* fix this.  mod_disk_cache doesn't do anything worthwhile in 2.0.50
due to a lot of bugs and brokenness.  -- justin


Re: Seg fault: Possible race conditions in mod_mem_cache.c

2004-09-09 Thread Jean-Jacques Clar

>>> [EMAIL PROTECTED] 09/09/04 2:27 AM >>>
>>As far as performance, I currently don't see any slow down on my
>>box, but will run longer tests tomorrow morning.
>no worries about that here

Looking more closely at the implications of the submitted patch,
I don't think we want to lock decrement_refcount() with a
global mutex affecting the whole cache. Yes, it fixes the double free,
but it is kind of ugly and increases mutex contention drastically,
at least on my 4-processor NetWare box. The tests I did using
SLES9 show no slowdown with the new code, but the
story is different on NetWare when running with more
than 500 threads.
I am trying to find a solution using the current atomic APIs,
but have not been successful so far. If someone has a better
idea, I am more than open to trying it.
Thanks,
JJ
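
The general idea of an atomic reference count, as an alternative to a cache-wide mutex, can be sketched as follows. This is only an illustration under stated assumptions: it uses C11 <stdatomic.h> rather than the APR atomic API mentioned in the message, cache_obj and its functions are hypothetical names, and it does not model the cleanup logic of mod_mem_cache, so it is not offered as a fix for the double free discussed in this thread.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical cache object; not the mod_mem_cache structure. */
typedef struct {
    atomic_uint refcount;
    void *payload;
} cache_obj;

cache_obj *cache_obj_create(void)
{
    cache_obj *obj = malloc(sizeof(*obj));
    if (obj != NULL) {
        atomic_init(&obj->refcount, 1);   /* creator holds one reference */
        obj->payload = NULL;
    }
    return obj;
}

void cache_obj_acquire(cache_obj *obj)
{
    atomic_fetch_add(&obj->refcount, 1);
}

/*
 * Drop one reference. atomic_fetch_sub returns the value held *before*
 * the subtraction, so exactly one caller observes 1 and frees the object.
 * Returns 1 if this call freed the object, 0 otherwise.
 */
int cache_obj_release(cache_obj *obj)
{
    if (atomic_fetch_sub(&obj->refcount, 1) == 1) {
        free(obj->payload);
        free(obj);
        return 1;
    }
    return 0;
}
```

The design point is that the decision to free is tied to the return value of a single atomic operation, so two threads releasing at the same time cannot both conclude that they hold the last reference.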
 
 
 


RE: Memory leak!? - the explanation

2004-09-09 Thread Morten Due Jorgensen
Hello all,

I have found an explanation of the "problem" I reported a little while ago - and I 
have been asked to post it here, in case someone else has observed something similar.

I spent a day in the debugger with Apache and found that I could in fact reproduce the 
problem in "the lab" - with some patience. The short description of the problem: 
Apache's virtual memory consumption apparently increases over time as pages are 
viewed, but not necessarily every time.

Apparently... It was (as "expected") not a memory leak, but an effect of the large 
number of threads that I have configured, 144. It appears each thread will allocate a 
"completion context" from a pool (see mpm\winnt\child.c), which will be recycled 
shortly after. Each context takes around half a megabyte. Usually a limited number of 
contexts is needed, but in a worst case scenario, each thread will request a context, 
and the full amount will be allocated. However, larger numbers of contexts appear 
only after "some" not well defined use. This is why it looks like a leak. Initially, 
my Apache uses around 10MB, and if all threads allocate a context, it will amount to 
86MB, but it will get there very slowly, raising the memory usage only when one or 
more threads need a context and don't find one in the pool.
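
The arithmetic behind these figures can be written out as a small sketch. The three constants are the numbers reported in this message; the function names are hypothetical, and the model (a fixed baseline plus one equally sized context per thread at most) is an assumption made for illustration.

```c
/* Figures reported in the message above. */
#define THREADS        144    /* configured worker threads */
#define BASELINE_MB    10.0   /* memory use shortly after startup */
#define CEILING_MB     86.0   /* reported upper limit */

/* Average size of one completion context implied by those figures. */
double context_size_mb(void)
{
    return (CEILING_MB - BASELINE_MB) / THREADS;
}

/* Expected memory use once the pool has grown to `contexts` contexts. */
double expected_usage_mb(int contexts)
{
    if (contexts > THREADS) {
        contexts = THREADS;   /* assumption: at most one context per thread */
    }
    return BASELINE_MB + contexts * context_size_mb();
}
```

The implied context size is (86 - 10) / 144, a little over half a megabyte, which agrees with the "around half a megabyte" estimate.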

The reason I see this is www.mdjnet.dk/discog.html, an HTML document with well over 100 
small pictures embedded. If I don't have at least a thread per picture, I will get 
these in the error.log:
"[warn] Server ran out of threads to serve requests. Consider raising the 
ThreadsPerChild setting"
and pictures missing on the page. When someone views that page, or if several users 
access pages in general at the same time, the server will be literally "showered" 
with requests, mostly requests for pictures. Each request will activate a thread. 
Usually only a limited number of threads will need to allocate a context at the same 
time, but - perhaps because my server is an old 200MHz with only 128MB - eventually, a 
larger and larger number of threads will - coincidentally - need contexts at the same 
time. This does not necessarily happen immediately, and with 144 threads there is room 
for the "record" to be broken many times over a period of many weeks/months, so this 
is why it appears to be a memory leak, as memory consumption will grow "forever" and 
to proportions much larger than the entire content of my simple site, even though 
there is indeed an upper limit, in my case 86MB, which it will approach 
"asymptotically".

So, I am no longer worried about a leak in Apache, but there are still a few things I 
don't understand in depth:

- The allocation of contexts. It appears the contexts are not released to the pool 
until after a few seconds; downloading (locally) the page in question takes less time 
than that. Why is the total need for contexts still relatively small - typically? 
After one download of the page, the pool will typically contain around 10-20 contexts, 
while the next download may result in 30 contexts being used. Why the difference, if 
the recycling is relatively slow anyway?

- Why do I so easily get the "out of threads" error? My server is indeed a low volume 
server, and I know Apache serves a significantly larger number of clients at the same 
time elsewhere, so even without the 100+ jpgs page, the number of simultaneous 
requests will be rather large. Are people out there really running their Apaches with a 
gigantic number of threads, or could the fact that I see the error message be related 
to my "2.0.x / W2K / slow CPU / slow DSL (128kbs)" combination? - or have I simply 
configured my Apache all wrong?

Not questions I lose any sleep over, but it could be interesting to understand in 
detail. If you feel like discussing this further, please cc me on any posts, as I will 
be leaving this list some time soon.

I hope this analysis is interesting to someone, and I hope it might even save someone 
some headache, if they encounter the same effect.

Regards,
Morten Due Jorgensen
http://www.mdjnet.dk



Re: Bug 31126: Reiser4

2004-09-09 Thread Rici Lake
On 9-Sep-04, at 12:48 PM, William A. Rowe, Jr. wrote:
> You miss a case with this patch...
> Consider a request for /content/foo/index.php/extra-path-info
> The path doesn't end where you think it does, and directory walk
> would usually handle this case.  Perhaps a patch to perform an
> extra sanity check on EACCES results is a better fix, at least
> introduced by a compile-time flag?

I'm sure you know the code better than I do. I thought that
r->finfo.filetype would be 0 in this case, since extra-path-info
doesn't exist; in which case the optimisation is already being skipped.
But I might be completely out to lunch.
R,




Re: Bug 31126: Reiser4

2004-09-09 Thread William A. Rowe, Jr.
At 11:12 AM 9/9/2004, Rici Lake wrote:

>>EACCES is just the wrong error, because it has different semantics.
>
>I tend to agree, but it's not totally clear. Reiser4's files can also function as 
>directories, so ENOTDIR is incorrect. ENOENT seems like a reasonable choice, but 
>POSIX also mandates the return of EACCES instead of ENOENT if search permissions are 
>not present. I don't know enough about the Reiser filesystem to know how one 
>specifies search permissions on a file which can also be a directory.
>
>>>It is worth noting that in the particular circumstances which give rise
>>>to this error, ap_directory_walk could tell if the path refers to a
>>>directory or a file; it should be possible to avoid walking too far,
>>>although there may well be edge cases I haven't thought of.
>
>>-> Race condition.
>
>What about just changing line 930 of server/request.c from
>
>   if (r->finfo.filetype
>to
>
>   if (r->finfo.filetype && *r->path_info
>
>All that will do is not optimise on the last segment; I don't see how not performing 
>the optimization could create a race condition that didn't already exist.

You miss a case with this patch...

Consider a request for /content/foo/index.php/extra-path-info

The path doesn't end where you think it does, and directory walk
would usually handle this case.  Perhaps a patch to perform an
extra sanity check on EACCES results is a better fix, at least
introduced by a compile-time flag?

Bill




[PATCH] don't crash with per-dir (location) rewrite config and NULL r->filename

2004-09-09 Thread Jeff Trawick
See attached patch.  Given a module with a map-to-storage hook which
leaves r->filename NULL, and config like the following, you get a
segfault on platforms that don't like strlen(NULL).


<Location /silly>
RewriteEngine On
RewriteCond %{SERVER_PORT} ^8080$
RewriteRule (.*) https://%{SERVER_NAME}%{REQUEST_URI}
</Location>


/silly is handled by a module which implements a map-to-storage hook
and leaves r->filename NULL
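
The class of crash described here, calling strlen() on a NULL pointer, is usually avoided by testing the pointer first. The sketch below shows that kind of guard in isolation. It is not the attached patch, whose contents are not reproduced in this archive; filename_length and filename_has_prefix are hypothetical helper names and do not exist in mod_rewrite.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: length of a filename that may be NULL.
 * strlen(NULL) is undefined behavior, so test the pointer first.
 */
size_t filename_length(const char *filename)
{
    return (filename != NULL) ? strlen(filename) : 0;
}

/*
 * Hypothetical check: does `filename` start with `prefix`?
 * A NULL filename never matches.
 */
int filename_has_prefix(const char *filename, const char *prefix)
{
    if (filename == NULL || prefix == NULL) {
        return 0;
    }
    return strncmp(filename, prefix, strlen(prefix)) == 0;
}
```

Whether a NULL r->filename should be treated as an empty string or should make the hook decline early is a separate design question that the sketch does not settle.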

comments?  better way to do it?

odd to me that rewrite's translate_name hook, used for processing
server config directives, can update r->filename permanently even when
it declines


patch
Description: Binary data


Re: Bug 31126: Reiser4

2004-09-09 Thread Rici Lake
I want to make it clear here that I have never used Reiser4, so on a 
personal level the issue is somewhat academic.

On 9-Sep-04, at 1:11 AM, André Malo wrote:
> SUSv3/POSIX disagrees as well. So I think, Mr. Reiser should just fix,
> what's broken. It's not such an uncommon case to test an arbitrary file
> path, if it's possible to open it.

So, Apache is only going to work on POSIX systems, you're saying?
Even under POSIX, there are cases where this particular piece of error 
analysis could fail. For example, the redundant open on 
pathname/.htaccess could return ENAMETOOLONG even though pathname was 
within limits and names a regular file. (Admittedly, that is a 
pathological case, but...)

> EACCES is just the wrong error, because it has different semantics.

I tend to agree, but it's not totally clear. Reiser4's files can also 
function as directories, so ENOTDIR is incorrect. ENOENT seems like a 
reasonable choice, but POSIX also mandates the return of EACCES instead 
of ENOENT if search permissions are not present. I don't know enough 
about the Reiser filesystem to know how one specifies search 
permissions on a file which can also be a directory.

>> It is worth noting that in the particular circumstances which give rise
>> to this error, ap_directory_walk could tell if the path refers to a
>> directory or a file; it should be possible to avoid walking too far,
>> although there may well be edge cases I haven't thought of.
>
> -> Race condition.

What about just changing line 930 of server/request.c from
   if (r->finfo.filetype
to
   if (r->finfo.filetype && *r->path_info
All that will do is not optimise on the last segment; I don't see how 
not performing the optimization could create a race condition that 
didn't already exist.



html Parser...

2004-09-09 Thread Manos Moschous

Hi,


I would like to use an HTML parser in my Apache module.
Is there any well-known HTML parser to do this?
Do I need to recompile the Apache web server in order to use this parser?
I found the old/net/libxml2/HTMLparser.c; what do I have to do to extend the
Apache web server (recompile?)

Thanks in advance!

Manos Moschous



Re: HTTP proxy working for folks on 2.1-dev?

2004-09-09 Thread Mladen Turk
Jeff Trawick wrote:
>> Just committed the needed changes to allow forward proxies.
>
> I saw... definitely gets farther now... This shows how far it gets
> when I configure mozilla to use Apache as HTTP proxy:
>
> [Thu Sep 09 06:53:17 2004] [crit] [Thu Sep 09 06:53:17 2004] file
> http_protocol.c, line 981, assertion "readbytes > 0" failed
> [Thu Sep 09 06:53:18 2004] [notice] child pid 4606 exit signal Abort
> (6), possible coredump in /export/home/trawick/inst/21

Yes, I've just tried mozilla too, and it core dumps :(.
Interestingly, when using IE as a client everything works.
I know where the problem is. Give me a couple of hours to test
that on each mpm.
Regards,
MT.




Re: Seg fault: Possible race conditions in mod_mem_cache.c

2004-09-09 Thread Jim Jagielski
Jeff Trawick wrote:
> 
> >As far as performance, I currently don't see any slow down on my 
> >box, but will run longer tests tomorrow morning. 
> 
> no worries about that here
>   
> >Should I commit to 2.1 and add an entry in the status file for the 
> >2.0 branch? 
> 
> yes; I haven't seen any potential 3rd +1s participate in this thread,
> so better put it in STATUS
> 
> >I will like to see that fixed in time for 2.0.51 (if there 
> >is an rc3). 
> 
> this has to go in 2.0.51; no need to ship such a problem
>   

I should have some testing done by lunchtime (eastern) today!

-- 
===
   Jim Jagielski   [|]   [EMAIL PROTECTED]   [|]   http://www.jaguNET.com/
  "A society that will trade a little liberty for a little order
 will lose both and deserve neither" - T.Jefferson


Re: HTTP proxy working for folks on 2.1-dev?

2004-09-09 Thread Nick Kew
On Thu, 9 Sep 2004, Mladen Turk wrote:

> Q:
> Is it possible to have forward and reverse proxies mixed together
> on the same box?

Of course!  I have that defined in different virtual hosts,
but AFAICS it should also work fine simply using  for
the reverse proxies and  for the forward.

-- 
Nick Kew


Re: HTTP proxy working for folks on 2.1-dev?

2004-09-09 Thread Jeff Trawick
On Thu, 09 Sep 2004 12:52:53 +0200, Mladen Turk <[EMAIL PROTECTED]> wrote:
> 
> 
> Jeff Trawick wrote:
> > On Fri, 3 Sep 2004 12:30:34 -0400, Jeff Trawick <[EMAIL PROTECTED]> wrote:
> >>>
> 
> 192.168.1.11 - - [03/Sep/2004:12:05:59 -0400] "GET
> http://127.0.0.1:10101/cgi-bin/printenv HTTP/1.0" 404 236
> 
> error log has:
> 
> [Fri Sep 03 12:05:59 2004] [error] [client 127.0.0.1] File does not
> exist: proxy:http://127.0.0.1:10101/cgi-bin/printenv
> >>
> > I had time to dig into it enough to get the feeling that it is something
> > that the balancer/worker folks ought to have a look at ;)  It would be
> > a big head start knowing what is supposed to happen in the handler
> > hook.  See attached function call trace.  Does the balancer's handler
> > have to return OK?  Does the balancer's proxy-pre_request hook have to
> > return OK?
> 
> Just committed the needed changes to allow forward proxies.

I saw... definitely gets farther now... This shows how far it gets
when I configure mozilla to use Apache as HTTP proxy:

[Thu Sep 09 06:53:16 2004] [debug] mod_proxy.c(654): Trying to run
scheme_handler
[Thu Sep 09 06:53:16 2004] [debug] proxy_http.c(1195): proxy: HTTP:
serving URL http://planetsun.org/
[Thu Sep 09 06:53:16 2004] [debug] proxy_util.c(1483): proxy:
initialized worker for (*:0) min=0 max=25 smax=25
[Thu Sep 09 06:53:16 2004] [debug] proxy_util.c(1414): proxy: socket
is constructed
[Thu Sep 09 06:53:16 2004] [debug] proxy_util.c(1586): proxy: HTTP:
has acquired connection for (*:0)
[Thu Sep 09 06:53:16 2004] [debug] proxy_util.c(1640): proxy:
connecting http://planetsun.org/ to planetsun.org:80
[Thu Sep 09 06:53:16 2004] [debug] proxy_util.c(1789): proxy: HTTP:
fam 2 socket created to connect to *:0
[Thu Sep 09 06:53:16 2004] [debug] proxy_util.c(1880): proxy: HTTP:
connection complete to 194.70.142.72:80 (planetsun.org)
[Thu Sep 09 06:53:17 2004] [debug] proxy_http.c(1016): proxy: start body send
[Thu Sep 09 06:53:17 2004] [crit] [Thu Sep 09 06:53:17 2004] file
http_protocol.c, line 981, assertion "readbytes > 0" failed
[Thu Sep 09 06:53:18 2004] [notice] child pid 4606 exit signal Abort
(6), possible coredump in /export/home/trawick/inst/21

(gdb) where
#0  0xd116200c in _lwp_kill () from /lib/libc.so.1
#1  0xd115f24d in thr_kill () from /lib/libc.so.1
#2  0xd110c7af in raise () from /lib/libc.so.1
#3  0xd10eef34 in abort () from /lib/libc.so.1
#4  0x080cf256 in ap_log_assert (szExp=0x80fa235 "readbytes > 0",
szFile=0x80fa0f0 "http_protocol.c", nLine=981) at log.c:708
#5  0x08091d86 in ap_http_filter (f=0x823dff8, b=0x8232e58,
mode=AP_MODE_READBYTES, block=APR_BLOCK_READ, readbytes=0) at
http_protocol.c:981
#6  0x080da031 in ap_get_brigade (next=0x823dff8, bb=0x8232e58,
mode=AP_MODE_READBYTES, block=APR_BLOCK_READ, readbytes=0)
at util_filter.c:474
#7  0x080e3c46 in net_time_filter (f=0x8233e30, b=0x8232e58,
mode=AP_MODE_READBYTES, block=APR_BLOCK_READ, readbytes=0) at
core.c:3768
#8  0x080da031 in ap_get_brigade (next=0x8233e30, bb=0x8232e58,
mode=AP_MODE_READBYTES, block=APR_BLOCK_READ, readbytes=0)
at util_filter.c:474
#9  0x0808d4e9 in ap_proxy_http_process_response (p=0x8231f78,
r=0x8237fc8, backend=0x81e7e30, origin=0x82326b8, conf=0x8180598,
server_portstr=0xcf66dd20 ":8080") at proxy_http.c:1027
#10 0x0808da6f in ap_proxy_http_handler (r=0x8237fc8,
worker=0x8186490, conf=0x8180598, url=0x8232658 "/", proxyname=0x0,
proxyport=0)
at proxy_http.c:1254
#11 0x080828cf in proxy_run_scheme_handler (r=0x8237fc8,
worker=0x8186490, conf=0x8180598, url=0x8239216
"http://planetsun.org/",
proxyhost=0x0, proxyport=0) at mod_proxy.c:1749
#12 0x08080419 in proxy_handler (r=0x8237fc8) at mod_proxy.c:656
#13 0x080cabb5 in ap_run_handler (r=0x8237fc8) at config.c:156
#14 0x080cb312 in ap_invoke_handler (r=0x8237fc8) at config.c:368
#15 0x08095864 in ap_process_request (r=0x8237fc8) at http_request.c:246
#16 0x0808fb2e in ap_process_http_connection (c=0x82320a0) at http_core.c:253
#17 0x080d719a in ap_run_process_connection (c=0x82320a0) at connection.c:42
#18 0x080d7586 in ap_process_connection (c=0x82320a0, csd=0x8231fb0)
at connection.c:175
#19 0x080c7305 in process_socket (p=0x8231f78, sock=0x8231fb0,
my_child_num=1, my_thread_num=24, bucket_alloc=0x8235f88) at
worker.c:520
#20 0x080c7a96 in worker_thread (thd=0x81e4418, dummy=0x817e650) at worker.c:856

> 
> Q:
> Is it possible to have forward and reverse proxies mixed together
> on the same box?

definitely


Re: HTTP proxy working for folks on 2.1-dev?

2004-09-09 Thread Mladen Turk
Jeff Trawick wrote:
> On Fri, 3 Sep 2004 12:30:34 -0400, Jeff Trawick <[EMAIL PROTECTED]> wrote:
>
>> 192.168.1.11 - - [03/Sep/2004:12:05:59 -0400] "GET
>> http://127.0.0.1:10101/cgi-bin/printenv HTTP/1.0" 404 236
>>
>> error log has:
>>
>> [Fri Sep 03 12:05:59 2004] [error] [client 127.0.0.1] File does not
>> exist: proxy:http://127.0.0.1:10101/cgi-bin/printenv
>
> I had time to dig into it enough to get the feeling that it is something
> that the balancer/worker folks ought to have a look at ;)  It would be
> a big head start knowing what is supposed to happen in the handler
> hook.  See attached function call trace.  Does the balancer's handler
> have to return OK?  Does the balancer's proxy-pre_request hook have to
> return OK?

Just committed the needed changes to allow forward proxies.

Q:
Is it possible to have forward and reverse proxies mixed together
on the same box?
For example: httpd is acting as a forward proxy, but also has a few
reverse proxies defined. In that case, instead of the default worker, we
could directly use the reverse proxies' workers together with balancing etc...
Regards,
MT.




Re: Seg fault: Possible race conditions in mod_mem_cache.c

2004-09-09 Thread Jeff Trawick
>As far as performance, I currently don't see any slow down on my 
>box, but will run longer tests tomorrow morning. 

no worries about that here
  
>Should I commit to 2.1 and add an entry in the status file for the 
>2.0 branch? 

yes; I haven't seen any potential 3rd +1s participate in this thread,
so better put it in STATUS

>I will like to see that fixed in time for 2.0.51 (if there 
>is an rc3). 

this has to go in 2.0.51; no need to ship such a problem
  
>JJ 

hey, good debugging on this!!!