[jira] Resolved: (MODPYTHON-156) Module imports from server side includes and new importer.

2006-05-01 Thread Graham Dumpleton (JIRA)
 [ http://issues.apache.org/jira/browse/MODPYTHON-156?page=all ]
 
Graham Dumpleton resolved MODPYTHON-156:


Fix Version: 3.3
 Resolution: Fixed

 Module imports from server side includes and new importer.
 --

  Key: MODPYTHON-156
  URL: http://issues.apache.org/jira/browse/MODPYTHON-156
  Project: mod_python
 Type: Sub-task

   Components: importer
 Reporter: Graham Dumpleton
 Assignee: Graham Dumpleton
  Fix For: 3.3


 With the old module importer, where Python*Handler directives are used in a 
 directory context, that directory is added to sys.path. So where Python code 
 is used with server side includes in that same directory, and code of the 
 form:
   module = apache.import_module(xxx)
 is used with xxx.py also in the same directory, the module is found because 
 the directory has been added to sys.path.
 With the new module importer, the directory isn't added to sys.path and so 
 the module would not be found.
 In the case of a handler module (rather than SSI Python code), such a module 
 import would still work, as the new module importer is smart enough to 
 realise that the caller was also imported using the new module importer and 
 thus would look in the same directory first or as necessary, in the directory 
 the Python*Handler directive was specified for. This will not work for SSI 
 code though, as it is not part of an imported module and is eval/exec'd on 
 each page request.
 What is thus required is for the global environment in which the SSI code is 
 executed for a specific page, to be automatically marked up in such a way 
 that the new module importer believes it was imported as a module by the new 
 module importer, thus triggering it to look in the same directory for modules.
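
A minimal sketch of the idea described above, not mod_python's actual implementation. The marker key `__importer_dir__` and the helpers `import_module` and `run_ssi_code` are hypothetical names chosen for illustration: the globals used to exec the SSI code are tagged with the page's directory, and the toy importer consults that tag to look for the module next to the page.

```python
import importlib.util
import os
import tempfile

# Hypothetical marker key; the real importer keeps its own bookkeeping.
MARKER = "__importer_dir__"


def import_module(name, caller_globals):
    """Toy importer: load name.py from the directory the caller is marked with."""
    directory = caller_globals.get(MARKER)
    if directory is None:
        raise ImportError(f"caller is not marked; cannot locate {name!r}")
    path = os.path.join(directory, name + ".py")
    if not os.path.isfile(path):
        raise ImportError(f"no {name}.py in {directory}")
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def run_ssi_code(source, page_path):
    """Exec SSI code in globals marked as if imported from the page's directory."""
    env = {MARKER: os.path.dirname(os.path.abspath(page_path))}
    env["import_module"] = lambda name: import_module(name, env)
    exec(source, env)
    return env


def demo():
    """Create xxx.py beside a page and import it from exec'd SSI code."""
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "xxx.py"), "w") as f:
            f.write("VALUE = 42\n")
        page = os.path.join(d, "page.shtml")
        code = "module = import_module('xxx')\nresult = module.VALUE"
        return run_ssi_code(code, page)["result"]
```

Because the marker lives in the globals dictionary, it is set up again on every page request, which matches the note that SSI code is eval/exec'd each time rather than imported once.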




Re: svn commit: r398494 - in /httpd/site/trunk: docs/security/vulnerabilities_13.html docs/security/vulnerabilities_20.html docs/security/vulnerabilities_22.html xdocs/security/vulnerabilities_22.xml

2006-05-01 Thread Ruediger Pluem


On 05/01/2006 03:32 AM, [EMAIL PROTECTED] wrote:
 Author: pquerna
 Date: Sun Apr 30 18:32:18 2006
 New Revision: 398494
 
 URL: http://svn.apache.org/viewcvs?rev=398494&view=rev
 Log:
 rebuild all.
 
 Modified:
 httpd/site/trunk/docs/security/vulnerabilities_13.html
 httpd/site/trunk/docs/security/vulnerabilities_20.html
 httpd/site/trunk/docs/security/vulnerabilities_22.html
 httpd/site/trunk/xdocs/security/vulnerabilities_22.xml

This killed the list of vulnerabilities for all versions. Was this intended?
And if yes, where can they be found now?

Anyway, many thanks for doing this release work :-).

Regards

Rüdiger



Re: svn commit: r398492 - in /httpd/site/trunk: docs/download.html docs/index.html xdocs/download.xml xdocs/index.xml

2006-05-01 Thread Ruediger Pluem


On 05/01/2006 03:25 AM, [EMAIL PROTECTED] wrote:
 Author: pquerna
 Date: Sun Apr 30 18:25:38 2006
 New Revision: 398492
 
 URL: http://svn.apache.org/viewcvs?rev=398492&view=rev
 Log:
 Rev website for 2.2.2
 
 Modified:
 httpd/site/trunk/docs/download.html
 httpd/site/trunk/docs/index.html
 httpd/site/trunk/xdocs/download.xml
 httpd/site/trunk/xdocs/index.xml

I see that 2.2.2 and 2.0.58 are announced. What about 1.3.35? Did it not hit 
the mirrors in time?

Regards

Rüdiger



Re: Possible new cache architecture

2006-05-01 Thread Brian Akins

Graham Leggett wrote:


The potential danger with this is for race conditions to happen while 
expiring cache entries. If the data entity expired before the header 
entity, it potentially could confuse the cache - is the entry cached or 
not? The headers say yes, data says no.


Nope.  Look at the way the current http cache works. An http object, 
headers and data, is only valid if both headers and data are valid.


Each variant should be an independent cached entry, the cache should 
allow different variants to be cached side by side.


Yes.  Each is distinguished by its key.

As far as mod_cache is concerned these are 3 independent entries, but 
mod_http_cache knows how to stitch them together.


mod_cache should *not* be HTTP specific in any way.


mod_cache need not be HTTP specific, it only needs the ability to cache 
multiple entities (data, headers) under the same key, 


No.



In other words, there must be the ability to cache by a key and a subkey.



No. mod_http_cache generates new keys for headers (key.header), data 
(key.data) and each variant (key1.header, key2.header, key1.data... 
etc.).  As far as the underlying generic cache is concerned, they are 
all independent entries.



--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Possible new cache architecture

2006-05-01 Thread Brian Akins

Davi Arnaut wrote:


mod_cache needs only to cache key/value pairs. The key/value format is up to
the mod_cache user.


correct.

--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


[Fwd: svn commit: r398585 - in /httpd/site/trunk: docs/download.html docs/index.html xdocs/download.xml xdocs/index.xml]

2006-05-01 Thread William A. Rowe, Jr.
+<li>Win32 Binary (Self extracting): <a 
href="[preferred]/httpd/binaries/win32/apache_1.3.35-win32-x86-no_src.exe">apache_1.3.35-win32-x86-no_src.exe</a> 



There is no more .exe (and won't be).  By 2006 everyone has at least
msiexec 1.10 installed ;-)

Only -src.msi and -no_src.msi remain for 1.3, while 2.0 and 2.2 will have
-ssl.msi and -no-ssl.msi flavors.

Bill



Re: Possible new cache architecture

2006-05-01 Thread Brian Akins

Here is a scenario.  We will assume a cache hit.

Client asks for http://domain/uri.html?args

mod_http_cache generates a key: http-domain-uri.html-args-header

asks mod_cache for value with this key.

mod_cache fetches the value, looks at expire time, it's good, and returns 
the blob


mod_http_cache examines blob, it's Vary information on Accept-Encoding.

mod_http_cache generates a new key: http-domain-uri.html-args-header-gzip 
(value from client)


asks mod_cache for value with this key.

mod_cache fetches the value, looks at expire time, it's good, and returns 
the blob


mod_http_cache examines blob, it's a normal header blob. The request does 
not meet conditions, so we need to get the data.


mod_http_cache generates a new key: http-domain-uri.html-args-data-gzip 
(value from client)


asks mod_cache for value with this key.

mod_cache fetches the value, looks at expire time, it's good, and returns 
the blob



mod_http_cache returns headers and data to client.


Notice there is a pattern to this...
--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies
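
The scenario in the message above can be sketched roughly as follows. This is an illustration under assumptions, not code from httpd: `KVCache`, `store` and `lookup` are hypothetical names, the key layout follows the keys written out in the scenario, and the vary blob is modelled as a small dict naming the request header to vary on.

```python
import time


class KVCache:
    """Generic key/value store with per-entry expiry; knows nothing about HTTP."""

    def __init__(self):
        self._store = {}

    def add(self, key, value, ttl):
        self._store[key] = (value, time.monotonic() + ttl)

    def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        value, expires = item
        if time.monotonic() >= expires:
            del self._store[key]  # expired entries count as a miss
            return None
        return value


def base_key(host, path, args):
    return f"http-{host}-{path}-{args}"


def store(cache, host, path, args, vary_on, variant, headers, body, ttl=60):
    """HTTP layer: write three independent entries for one variant."""
    base = base_key(host, path, args)
    cache.add(f"{base}-header", {"vary": vary_on}, ttl)
    cache.add(f"{base}-header-{variant}", headers, ttl)
    cache.add(f"{base}-data-{variant}", body, ttl)


def lookup(cache, host, path, args, request_headers):
    """HTTP layer: three generic lookups; any miss makes the object invalid."""
    base = base_key(host, path, args)
    vary = cache.get(f"{base}-header")                # hit 1: vary information
    if vary is None:
        return None
    variant = request_headers.get(vary["vary"], "")
    headers = cache.get(f"{base}-header-{variant}")   # hit 2: variant headers
    body = cache.get(f"{base}-data-{variant}")        # hit 3: variant data
    if headers is None or body is None:
        return None
    return headers, body
```

The generic store only ever sees opaque keys and values; the relationship between the three entries exists only in the HTTP layer's key naming.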


Re: Possible new cache architecture

2006-05-01 Thread Davi Arnaut
On Mon, 01 May 2006 14:51:53 +0200
Graham Leggett [EMAIL PROTECTED] wrote:

 Davi Arnaut wrote:
 
  mod_cache need not be HTTP specific, it only needs the ability to cache 
  multiple entities (data, headers) under the same key, and be able to 
  replace zero or more entities independently of the other entities (think 
  updating headers without updating content).
  
  mod_cache needs only to cache key/value pairs. The key/value format is up to
  the mod_cache user.
 
 It's a design flaw to create problems that have to be specially coded 
 around, when you can avoid the problem entirely.

Maybe I'm missing something, what problems do you foresee ?

 The cache needs to be generic, yes - but there is no need to stick to 
 the key/value cliché of cache code, if a variation to this is going to 
 make your life significantly easier.
 

And the variation is..?

--
Davi Arnaut


Re: Possible new cache architecture

2006-05-01 Thread Davi Arnaut
On Mon, 01 May 2006 09:02:31 -0400
Brian Akins [EMAIL PROTECTED] wrote:

 Here is a scenario.  We will assume a cache hit.

I think the usage scenario is clear. Moving on, I would like to be able to
stack up the cache providers (like the apache filter chain). Basically,
mod_cache will expose the functions:

add(key, value, expiration, flag)
get(key)
remove(key)

mod_cache will then pass the request (add/get or remove) down the chain,
similar to apache filter chain. ie:

apr_status_t mem_cache_get_filter(ap_cache_filter_t *f,
  apr_bucket_brigade *bb, ...);

apr_status_t disk_cache_get_filter(ap_cache_filter_t *f,
   apr_bucket_brigade *bb, ...);

This way it would be possible for one cache to act as a cache of another
cache provider, mod_mem_cache would work as a small/fast MRU cache for
mod_disk_cache.

--
Davi Arnaut
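
A rough sketch of the stacking idea in the message above, assuming a simple linked chain of providers. The `Provider` class is hypothetical, and the `expiration` and `flag` arguments from the proposed add() are left out to keep it short.

```python
class Provider:
    """One cache layer; requests it cannot satisfy are passed down the chain."""

    def __init__(self, name, next_provider=None):
        self.name = name
        self.next = next_provider
        self.data = {}

    def add(self, key, value):
        # Write through every layer so the slower ones stay authoritative.
        self.data[key] = value
        if self.next is not None:
            self.next.add(key, value)

    def get(self, key):
        """Return (value, name of the layer that answered), or (None, None)."""
        if key in self.data:
            return self.data[key], self.name
        if self.next is None:
            return None, None
        value, source = self.next.get(key)
        if value is not None:
            self.data[key] = value  # promote into this faster layer
        return value, source

    def remove(self, key):
        self.data.pop(key, None)
        if self.next is not None:
            self.next.remove(key)
```

For example, with `disk = Provider("disk")` and `mem = Provider("mem", disk)`, a first `mem.get(key)` for an entry that is only on disk is answered by the disk layer, and the second one is answered from memory.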



Re: Possible new cache architecture

2006-05-01 Thread Brian Akins

Davi Arnaut wrote:

This way it would be possible for one cache to act as a cache of another
cache provider, mod_mem_cache would work as a small/fast MRU cache for
mod_disk_cache.


Slightly off subject, but in my testing, mod_disk_cache is much faster 
than mod_mem_cache.  Thanks to sendfile!


I was thinking about scenarios where each cache had its own local cache 
(disk, mem, whatever) with memcache behind it.  That way each object 
only has to be generated once for the entire farm.  This would be an 
easy way to have a distributed cache.


Also, the squid type htcp (or icp) could be a fallback for the local 
cache as well without mucking up all the proxy and cache code.



--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: svn commit: r398494 - in /httpd/site/trunk: docs/security/vulnerabilities_13.html docs/security/vulnerabilities_20.html docs/security/vulnerabilities_22.html xdocs/security/vulnerabilities_22.xml

2006-05-01 Thread Mark J Cox
 This killed the list of vulnerabilities for all versions. Was this intended?
 And if yes, where can they be found now?

Must be someone with bad java foo, fixing.

Mark
--
Mark J Cox | www.awe.com/mark





Re: svn commit: r398494 - in /httpd/site/trunk: docs/security/vulnerabilities_13.html docs/security/vulnerabilities_20.html docs/security/vulnerabilities_22.html xdocs/security/vulnerabilities_22.xml

2006-05-01 Thread Paul Querna
Mark J Cox wrote:
 This killed the list of vulnerabilities for all versions. Was this intended?
 And if yes, where can they be found now?
 
 Must be someone with bad java foo, fixing.
 

Er. ya. It wasn't my intention to break stuff, I just ran build.sh and
it kept saying it wanted to do this

java version 1.5.0_06

Intel Mac.

How could a version of java change the behavior of the site build stuff?

-Paul


Re: plain file name of a request

2006-05-01 Thread Greg Ames

Markus Litz wrote:

Hello,

how can I get only the filename of the requested uri? For example, if 
"http://www.example.com/test.html" is requested, I only want test.html. 
request_rec::filename only gives the full filename on disk.


basename(r->filename)

Greg


Re: plain file name of a request

2006-05-01 Thread William A. Rowe, Jr.

Greg Ames wrote:

Markus Litz wrote:


Hello,

how can I get only the filename of the requested uri? For example, if 
"http://www.example.com/test.html" is requested, I only want 
test.html. request_rec::filename only gives the full filename on disk.


basename(r->filename)


:)  Or portably, apr_filepath_name_get() declared in apr_lib.h


Re: Possible new cache architecture

2006-05-01 Thread Graham Leggett

Davi Arnaut wrote:

It's a design flaw to create problems that have to be specially coded 
around, when you can avoid the problem entirely.


Maybe I'm missing something, what problems do you foresee ?


There are lots of issues that were uncovered when I split the proxy and 
cache code for httpd v2.0.


A web cache requires two separately alterable cached entities (headers, 
body) just for caching a single variant. This pair of entities need to 
expire and/or be forcibly expired (think Cache-Control no-cache) 
atomically. Sure, you can code and debug a lot of code to try and create 
the effect of atomically expiring multiple cache entries at once. Or you 
can avoid this issue entirely by building a generic cache that works 
with key/subkey/data.
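
As a sketch of what a key/subkey/data store might look like (the `SubkeyCache` class and its method names are hypothetical, not an existing httpd interface): all subkeys of an entry live under one key, so a single expire call removes headers and body together.

```python
import threading


class SubkeyCache:
    """Key -> {subkey: data}; an entry is read and expired as one unit."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def put(self, key, subkey, data):
        # Replace one entity (e.g. headers) without touching the others.
        with self._lock:
            self._entries.setdefault(key, {})[subkey] = data

    def get(self, key):
        """One lookup returns a snapshot of every subkey stored under key."""
        with self._lock:
            entry = self._entries.get(key)
            return dict(entry) if entry is not None else None

    def expire(self, key):
        """Drop headers and body together; returns True if anything was cached."""
        with self._lock:
            return self._entries.pop(key, None) is not None
```

With this layout a reader can never observe headers without a body after an expiry, since both disappear under the same lock.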


There are a number of other issues that have been listed as bugs since 
httpd v1.3 that are still present, most notably the thundering herd 
problem, and the independent caching of variants. There is no point in 
refactoring the cache code if the new code isn't going to be 
significantly better than the existing code.


Regards,
Graham
--


smime.p7s
Description: S/MIME Cryptographic Signature


Re: Possible new cache architecture

2006-05-01 Thread Brian Akins

Graham Leggett wrote:
 the independent caching of variants. 


The example I posted should address this issue.

I also have some ideas concerning the thundering herd problem, it's just 
a matter of whether you think it should be handled in cache or http_cache.




--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Possible new cache architecture

2006-05-01 Thread Graham Leggett

Brian Akins wrote:

Nope.  Look at the way the current http cache works. An http object, 
headers and data, is only valid if both headers and data are valid.


That's two hits to find whether something is cached.

How are races prevented?

Regards,
Graham
--




Re: Possible new cache architecture

2006-05-01 Thread Brian Akins

Graham Leggett wrote:
 Or you 
can avoid this issue entirely by building a generic cache that works 
with key/subkey/data.


and then you have to find a way to bridge the gap between this interface 
and all the key/value caches that currently exist (memcache being the 
most popular example).


what if mod_http_cache had a way to record its cached objects? It 
could keep up with the relationships there.  Basically, you have a 
provider that has a few functions that get called whenever 
mod_http_cache caches or expires an object.



--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Possible new cache architecture

2006-05-01 Thread Brian Akins

Graham Leggett wrote:


That's two hits to find whether something is cached.


You must have two hits if you support vary.


How are races prevented?


shouldn't be any.  something is in the cache or not.  if one piece of 
an http object is not valid or in cache, the object is invalid. 
Although other variants may be valid/in cache.



--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Possible new cache architecture

2006-05-01 Thread William A. Rowe, Jr.

Brian Akins wrote:

Graham Leggett wrote:


That's two hits to find whether something is cached.


You must have two hits if you support vary.


Well, one to three hits.  One, if you use an arbitrary page (MRU or most
frequently referenced would be most optimal, but it really doesn't matter)
and then determine what varies, and if you are in the right place, or what
that right place is (page by language, or whatever fields it varied by.)

Three hits or more if your variant also varies ;)


How are races prevented?


shouldn't be any.  something is in the cache or not.  if one piece of 
an http object is not valid or in cache, the object is invalid. 
Although other variants may be valid/in cache.


And, of course, inserting the hit once it's composed is important, and can
happen in parallel (3 clients looking for the same, and then fetching the
same page from the origin).  But it's harmless if the insertion is mutex
protected, and the insertion can only happen once the page is fetched
complete.


Re: Possible new cache architecture

2006-05-01 Thread Brian Akins

William A. Rowe, Jr. wrote:


And, of course, inserting the hit once it's composed is important, and can
happen in parallel (3 clients looking for the same, and then fetching the
same page from the origin).  But it's harmless if the insertion is mutex
protected, and the insertion can only happen once the page is fetched
complete.



in the case of mod_disk_cache the way I would do it is to have a 
deterministic tempfile name rather than use apr_tempfile, and open it EXCL.
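
A small sketch of the exclusive-create idea, using Python's `os.open` rather than APR. The file naming (SHA-1 of the key plus `.tmp` / `.data`) and the helper names are assumptions for illustration only. Because the temp name is derived from the key, only one of several parallel fetchers can create it; the others see the file exists and skip storing.

```python
import hashlib
import os
import tempfile


def _digest(key):
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def temp_path(cache_dir, key):
    # Deterministic: every request for the same key computes the same name.
    return os.path.join(cache_dir, _digest(key) + ".tmp")


def try_begin_store(cache_dir, key):
    """Return an open fd if this caller won the right to store key, else None."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    try:
        return os.open(temp_path(cache_dir, key), flags, 0o600)
    except FileExistsError:
        return None  # someone else is already storing this object


def finish_store(cache_dir, key, fd, body):
    """Write the complete body, then publish it under its final name."""
    with os.fdopen(fd, "wb") as f:
        f.write(body)
    final = os.path.join(cache_dir, _digest(key) + ".data")
    os.replace(temp_path(cache_dir, key), final)
    return final


def demo():
    key = "http-domain-uri.html-args-data"
    with tempfile.TemporaryDirectory() as d:
        first = try_begin_store(d, key)
        second = try_begin_store(d, key)  # loses the race
        final = finish_store(d, key, first, b"page body")
        with open(final, "rb") as f:
            stored = f.read()
        return first is not None, second is None, stored
```

The final rename only happens once the body has been written in full, so readers never see a partially stored object.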


--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Possible new cache architecture

2006-05-01 Thread Davi Arnaut
On Mon, 01 May 2006 15:46:58 -0400
Brian Akins [EMAIL PROTECTED] wrote:

 Graham Leggett wrote:
 
  That's two hits to find whether something is cached.
 
 You must have two hits if you support vary.
 
  How are races prevented?
 
 shouldn't be any.  something is in the cache or not.  if one piece of 
 an http object is not valid or in cache, the object is invalid. 
 Although other variants may be valid/in cache.
 

More important, if we stick with the key/data concept it's possible to
implement the header/body relationship under single or multiple keys.

I think Brian wants mod_cache to be only a layer (glue) between the
underlying providers and the cache users. Each set of problems is better
dealt with in its own layer. The storage layer (cache providers) is going
to worry only about storing the key/data pairs (and expiring ?) while the
protocol layer will deal with the underlying concepts of each protocol
(mod_http_cache).

The current design leads to bloat, just look at mem_cache and disk_cache,
both have their own duplicated quirks (serialize/unserialize, et cetera)
and need special handling of the headers and file format. Under the new
design this duplication will be gone, think that we will assemble the
HTTP-specific part and generalize the storage part.

--
Davi Arnaut


Re: Possible new cache architecture

2006-05-01 Thread Graham Leggett

Brian Akins wrote:


That's two hits to find whether something is cached.


You must have two hits if you support vary.


You need only one - bring up the original cached entry with the key, and 
then use cheap subkeys over a very limited data set to find both the 
variants and the header/data.



How are races prevented?


shouldn't be any.  something is in the cache or not.  if one piece of 
an http object is not valid or in cache, the object is invalid. 
Although other variants may be valid/in cache.


I can think of one race off the top of my head:

- the browser says send me this URL.

- the cache has it cached, but it's stale, so it asks the backend 
If-None-Match.


- the cache reaper comes along, says oh, this is stale, and reaps the 
cached body (which is independent, remember?). The data is no longer 
cached even though the headers still exist.


- The backend says 304 Not Modified.

- the cache says cool, will send my copy upstream. Oops, where has my 
data gone?


The end user will probably experience this as oh, the website had a 
glitch, let me try again, so it won't be reported as a bug.


Ok, so you tried to lock the body before going to the backend, but 
searching for and locking the body would have been an additional wasted 
cache hit if the backend answered with its own body. Not to mention 
having to write and debug code to do this.


Races need to be properly handled, and atomic cache operations will go a 
long way to prevent them.


Regards,
Graham
--




Re: [PATCH] #39275 MaxClients on startup [Was: Bug in 2.0.56-dev]

2006-05-01 Thread Greg Ames

Jeff Trawick wrote:


I have been working with a user on one of these fork bomb scenarios
and assumed it was the child_init hook.  But after giving them a test
fix that relies on a child setting scoreboard fields in child_main
before child-init hooks run, and also adds some debugging traces
related to calling child-init hooks, it is clear that their stall
occurs BEFORE the child-init hook.  Which leaves a stretch of fairly
simple code.

(Best theory is bad stuff happening in an atfork handler registered by
a third-party module or some library it uses.  But that's beside the
point.)


after more thought, there is a simpler patch that should do the job.  the key to both of 
these is how threads in SERVER_DEAD state with a pid in the scoreboard are treated.  this 
means that p_i_s_m forked on a previous timer pop but some thread never made it into 
SERVER_STARTING state.


the difference:  this patch just counts those potential threads as idle, and allows 
MinSpareThreads worth of processes to be forked before putting on the brakes.  the 
previous patch pauses the forking immediately when the strange situation is detected but 
requires more code and a new variable.  I'm leaning toward this one because it is simpler. 
 opinions?


Greg

--- server/mpm/worker/worker.c  (revision 398659)
+++ server/mpm/worker/worker.c  (working copy)
@@ -1422,7 +1422,7 @@
              */
             if (ps->pid != 0) { /* XXX just set all_dead_threads in outer for
                                    loop if no pid?  not much else matters */
-                if (status <= SERVER_READY && status != SERVER_DEAD &&
+                if (status <= SERVER_READY &&
                         !ps->quiescing &&
                         ps->generation == ap_my_generation) {
                     ++idle_thread_count;



Re: Possible new cache architecture

2006-05-01 Thread William A. Rowe, Jr.

Graham Leggett wrote:

Brian Akins wrote:


That's two hits to find whether something is cached.



You must have two hits if you support vary.



You need only one - bring up the original cached entry with the key, and 
then use cheap subkeys over a very limited data set to find both the 
variants and the header/data.



How are races prevented?



shouldn't be any.  something is in the cache or not.  if one piece 
of an http object is not valid or in cache, the object is invalid. 
Although other variants may be valid/in cache.



I can think of one race off the top of my head:

- the browser says send me this URL.

- the cache has it cached, but it's stale, so it asks the backend 
If-None-Match.


- the cache reaper comes along, says oh, this is stale, and reaps the 
cached body (which is independent, remember?). The data is no longer 
cached even though the headers still exist.


- The backend says 304 Not Modified.

- the cache says cool, will send my copy upstream. Oops, where has my 
data gone?


I think that can be avoided by, instead of reaping the cached body, actually
setting aside the cached body (public -> private), by changing its key or
whatnot.  Then - throw it away after the backend says 200 OK, and replace
it with something new.  Or, rekey it a second time (private -> public) when
the backend reports 304 NOT MODIFIED.

In the race, one will set it aside looking for another, the second will make
a fresh request (it doesn't see it in the cache), and either the first or
second request will wrap up -last- to place the final copy back into the
cache, replacing the document from the winner.  No harm no foul.
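
The set-aside approach described above might look roughly like this. It is a sketch under assumptions: the `Cache` class, the `#private-` key suffix and the `backend` callable returning `(status, body)` are all hypothetical.

```python
class Cache:
    """Plain key -> body store with rekeying helpers."""

    def __init__(self):
        self.entries = {}

    def set_aside(self, key, owner):
        """public -> private: hide the body from the reaper and other requests."""
        body = self.entries.pop(key, None)
        if body is None:
            return None
        private = f"{key}#private-{owner}"
        self.entries[private] = body
        return private

    def restore(self, private, key):
        """private -> public: the backend said 304, so the old body is still good."""
        self.entries[key] = self.entries.pop(private)

    def replace(self, private, key, new_body):
        """The backend said 200: throw the old body away and publish the new one."""
        self.entries.pop(private, None)
        self.entries[key] = new_body


def revalidate(cache, key, owner, backend):
    """Set the stale body aside, ask the backend, then rekey or replace."""
    private = cache.set_aside(key, owner)
    status, body = backend()
    if status == 304 and private is not None:
        cache.restore(private, key)
    elif status == 200:
        cache.replace(private, key, body)
    return cache.entries.get(key)
```

A request that finds nothing under the public key simply makes a fresh, unconditional request, and whichever request finishes last leaves its copy in the cache.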

Bill


Re: Possible new cache architecture

2006-05-01 Thread Davi Arnaut
On Mon, 01 May 2006 22:46:44 +0200
Graham Leggett [EMAIL PROTECTED] wrote:

 Brian Akins wrote:
 
  That's two hits to find whether something is cached.
  
  You must have two hits if you support vary.
 
 You need only one - bring up the original cached entry with the key, and 
 then use cheap subkeys over a very limited data set to find both the 
 variants and the header/data.
 
  How are races prevented?
  
  shouldn't be any.  something is in the cache or not.  if one piece of 
  an http object is not valid or in cache, the object is invalid. 
  Although other variants may be valid/in cache.
 
 I can think of one race off the top of my head:
 
 - the browser says send me this URL.
 
 - the cache has it cached, but it's stale, so it asks the backend 
 If-None-Match.
 
 - the cache reaper comes along, says oh, this is stale, and reaps the 
 cached body (which is independent, remember?). The data is no longer 
 cached even though the headers still exist.
 
 - The backend says 304 Not Modified.
 
 - the cache says cool, will send my copy upstream. Oops, where has my 
 data gone?

Sorry, but this only happens in your imagination. It's pretty obvious
that mod_http_cache will handle this.

 The end user will probably experience this as oh, the website had a 
 glitch, let me try again, so it won't be reported as a bug.

No.

 Ok, so you tried to lock the body before going to the backend, but 
 searching for and locking the body would have been an additional wasted 
 cache hit if the backend answered with its own body. Not to mention 
 having to write and debug code to do this.

Locks are not necessary, perhaps you are imagining something very different.
If a data body disappears under mod_http_cache it is not a big deal! It will
refuse to serve the request from the cache and a new version of the page will
be cached.

 Races need to be properly handled, and atomic cache operations will go a 
 long way to prevent them.

I think we are discussing apples and oranges. First, we only want to *organize*
the current cache code into a more layered solution. The current semantics won't
change, yet!

--
Davi Arnaut