Re: svn commit: r1879591 - in /httpd/httpd/trunk: configure.in server/log.c

2020-07-15 Thread Joe Orton
On Wed, Jul 15, 2020 at 02:58:43PM +0200, Graham Leggett wrote:
> On 07 Jul 2020, at 15:40, jor...@apache.org wrote:
> > Author: jorton
> > Date: Tue Jul  7 13:40:15 2020
> > New Revision: 1879591
> > 
> > URL: http://svn.apache.org/viewvc?rev=1879591&view=rev
> > Log:
> > Check for and use gettid() directly if available; glibc 2.30 and later
> > provides a wrapper for the system call:
> > 
> > * configure.in: Check for gettid() and define HAVE_SYS_GETTID if
> >  gettid() is only usable via syscall().
> > 
> > * server/log.c (log_tid): Use gettid() directly if available.
> 
> This is not working for me on CentOS8.
> 
> ./configure says this:
> 
> checking for gettid()... yes

Interesting, can you provide the config.log and "rpm -q glibc"?  With a 
RHEL8 vm here it does not detect gettid (as I'd expect for glibc 2.28) 
and builds fine.

Regards, Joe



Re: Still Failing: apache/httpd#896 (trunk - 9af2218)

2020-06-29 Thread Joe Orton
On Mon, Jun 29, 2020 at 11:16:54AM +0200, Graham Leggett wrote:
> On 29 Jun 2020, at 09:19, Joe Orton  wrote:
> 
> > The litmus tests are failing, not the perl-framework tests:
> > 
> > https://travis-ci.org/github/apache/httpd/jobs/702768269#L2491
> 
> Ah - that’s where  I was going wrong.
> 
> > You can also see that there are segfaults logged from line 2519 onwards:
> > 
> > https://travis-ci.org/github/apache/httpd/jobs/702768269#L2519
> > 
> > Running litmus locally, the backtrace looks like this:
> 
> So the simple workaround is to be defensive against ctx->r being NULL, 
> but I need to check - is there ever a legitimate reason for there to 
> be a filled out ctx structure including a filename, but with no 
> request?

I don't know, you added the ctx->r field in r823703 and it's not obvious 
exactly what the semantics are supposed to be.

If the request_rec is supposed to correspond to the dav_resource object 
then it makes sense that it is NULL when looking at the dav_resource 
representing the parent collection (in this case) or some other resource 
during a walk.  It would be good to document this!

Regards, Joe



Re: Still Failing: apache/httpd#896 (trunk - 9af2218)

2020-06-29 Thread Joe Orton
On Sun, Jun 28, 2020 at 05:15:03PM +0200, Graham Leggett wrote:
> On 28 Jun 2020, at 15:47, Travis CI  wrote:
> 
> > apache / httpd
> > trunk
> > Build #896 is still failing9 mins and 44 secs
> 
> Travis mavens, I’ve been looking for the test/perl-framework directory 
> as referred to in --with-test-suite=test/perl-framework but I’m coming 
> up with a blank.

The litmus tests are failing, not the perl-framework tests:

https://travis-ci.org/github/apache/httpd/jobs/702768269#L2491

You can also see that there are segfaults logged from line 2519 onwards:

https://travis-ci.org/github/apache/httpd/jobs/702768269#L2519

Running litmus locally, the backtrace looks like this:

warning: Unexpected size of section `.reg-xstate/44301' in core file.
#0  0x7fcea5f9059a in dav_fs_getetag (resource=) at 
repos.c:1869
1869er.request_time = ctx->r->request_time;
[Current thread is 1 (Thread 0x7fcea6d70800 (LWP 44301))]
Missing separate debuginfos, use: dnf debuginfo-install 
pcre2-10.35-3.fc32.x86_64
(gdb) where
#0  0x7fcea5f9059a in dav_fs_getetag (resource=) at 
repos.c:1869
#1  0x7fcea5fda7af in dav_validate_resource_state (p=0xfb5288, 
resource=, lockdb=0xfcf060, if_header=0xfcf008, 
flags=flags@entry=288, pbuf=pbuf@entry=0x7ffdc5916010, r=0xfb5300) at 
util.c:1046
#2  0x7fcea5fdb9cf in dav_validate_request (r=r@entry=0xfb5300, 
resource=0xfcec70, depth=depth@entry=0, 
locktoken=locktoken@entry=0x0, response=response@entry=0x7ffdc5916168, 
flags=flags@entry=32, lockdb=)
at util.c:1652
#3  0x7fcea5fd1e4e in dav_method_put (r=0xfb5300) at mod_dav.c:985
#4  0x004170d0 in ap_run_handler (r=r@entry=0xfb5300) at config.c:170
#5  0x00417666 in ap_invoke_handler (r=r@entry=0xfb5300) at config.c:444
#6  0x0045a3eb in ap_process_async_request (r=r@entry=0xfb5300) at 
http_request.c:463
#7  0x0045a41f in ap_process_request (r=r@entry=0xfb5300) at 
http_request.c:498
#8  0x00456fbe in ap_process_http_sync_connection (c=0xf90440) at 
http_core.c:214
#9  ap_process_http_connection (c=0xf90440) at http_core.c:255
#10 0x004283c0 in ap_run_process_connection (c=c@entry=0xf90440) at 
connection.c:42
#11 0x00428919 in ap_process_connection (c=c@entry=0xf90440, 
csd=) at connection.c:219
#12 0x7fcea75439fe in child_main (child_num_arg=child_num_arg@entry=2, 
child_bucket=child_bucket@entry=0) at prefork.c:621
#13 0x7fcea7543d7e in make_child (s=, slot=2) at 
prefork.c:723
#14 0x7fcea75444bf in perform_idle_server_maintenance (p=) 
at prefork.c:827
#15 prefork_run (_pconf=, plog=, s=) at prefork.c:1020
#16 0x0042b7f0 in ap_run_mpm (pconf=pconf@entry=0x9f33f8, 
plog=0xa1e608, s=0xa1a750) at mpm_common.c:101
#17 0x00415347 in main (argc=, argv=) at 
main.c:844



Re: s390x, ppc64le and arm64 Travis jobs always fail

2020-06-23 Thread Joe Orton
On Tue, Jun 23, 2020 at 12:14:01PM +0200, Yann Ylavic wrote:
> Argh, fixed in r1879110 (hopefully).

Thanks a lot, looking green!  If we didn't find some actual bugs I would 
feel like burning thousands of hours of CPU time in CI was a wasted 
effort ;)





Re: s390x, ppc64le and arm64 Travis jobs always fail

2020-06-23 Thread Joe Orton
On Tue, Jun 23, 2020 at 10:53:13AM +0100, Joe Orton wrote:
> On Fri, Jun 19, 2020 at 02:21:48PM +0100, Joe Orton wrote:
> > so if Apache-Test is trying to connect to ::1 but httpd is only 
> > listening on AF_INET port 80, it will look like the server is not 
> > running, even if it is.
> 
> After r1879103 they are now at least starting up, but failing in the 
> proxy tests, which I find a bit surprising:
> 
> https://travis-ci.org/github/apache/httpd/jobs/701189909
> 
> # Failed test 3 in t/modules/proxy.t at line 22
> # Failed test 4 in t/modules/proxy.t at line 23
> t/modules/proxy.t ... 
> Failed 2/31 subtests 
> 
> that seems to be failing in the -context ProxyPass test 
> *only*.

cc Yann, I'm guessing this is an unsigned-char bug introduced in:

http://svn.apache.org/viewvc?view=revision&revision=1879094

these tests are going to default to *true* not *false* for unsigned char 
architectures:

-if (!dconf->mapping_decoded
+if (dconf->use_original_uri > 0




Re: s390x, ppc64le and arm64 Travis jobs always fail

2020-06-23 Thread Joe Orton
On Fri, Jun 19, 2020 at 02:21:48PM +0100, Joe Orton wrote:
> so if Apache-Test is trying to connect to ::1 but httpd is only 
> listening on AF_INET port 80, it will look like the server is not 
> running, even if it is.

After r1879103 they are now at least starting up, but failing in the 
proxy tests, which I find a bit surprising:

https://travis-ci.org/github/apache/httpd/jobs/701189909

# Failed test 3 in t/modules/proxy.t at line 22
# Failed test 4 in t/modules/proxy.t at line 23
t/modules/proxy.t ... 
Failed 2/31 subtests 

that seems to be failing in the -context ProxyPass test 
*only*.

Not at all impossible this is still some IPv6 vs localhost vs 
getaddrinfo-related fubar but I can't see why that would cause the 
-specific tests to fail when the other tests are apparently 
working.  Any ideas?




Re: s390x, ppc64le and arm64 Travis jobs always fail

2020-06-19 Thread Joe Orton
On Fri, Jun 19, 2020 at 02:08:18PM +0200, Ruediger Pluem wrote:
> Anyone an idea why the Travis builds on s390x, ppc64le and arm64 always fail
> (e.g. https://travis-ci.org/github/apache/httpd/builds/700041805)? It looks 
> like the server fails to start in time when running
> the test suite.

My guess:

The test config in the test framework use

Listen 0.0.0.0:

throughout... that is hard-coded in Apache::Test IIRC

but if you look at the Apache::Test output in those failing jobs, it 
always looks like this:

https://travis-ci.org/github/apache/httpd/jobs/700041828#L2397

server ip6-localhost:8529 started
server ip6-localhost:8530 listening (mod_nntp_like)

which implies it thinks localhost is ::1

so if Apache-Test is trying to connect to ::1 but httpd is only 
listening on AF_INET port 80, it will look like the server is not 
running, even if it is.



Re: Broken: apache/httpd#804 (trunk - 97bc128)

2020-06-17 Thread Joe Orton
On Tue, Jun 16, 2020 at 12:08:41PM +0200, Ruediger Pluem wrote:
> 
> 
> On 6/16/20 9:29 AM, Stefan Eissing wrote:
> >> Am 15.06.2020 um 21:46 schrieb Ruediger Pluem :
> >>
> >> I would like to unblock the test failure on trunk soon. Any comments on 
> >> the below?
> > 
> > I am not really familiar with ap_parse_request_line() and why it was added 
> > there. Yann?
> > 
> > As to "faking" the http version of a request, that does not look good. Do 
> > we need to preserve the fairy tale for our code that everything is HTTP/1.x?
> 
> I think the point is that ap_parse_request_line is only designed to parse <= 
> HTTP/1.x requests, if it should parse >= HTTP/2.0
> I guess we need to adjust the API such that we can tell it which major and 
> probably minor version to consider.
> But looking at the code again that brought me to the point that I would need 
> to reset r->the_request to
> 
> r->the_request = apr_psprintf(r->pool, "%s %s HTTP/2.0", req->method, 
> req->path ? req->path : "");
> 
> after the ap_parse_request_line() call.
> And yes this makes me think if this kind of fake really makes sense. The need 
> to reset 3 request_rec fields after
> ap_parse_request_line doesn't sound good. OTOH calling ap_parse_request_line 
> makes it possible to apply the same policies to
> HTTP/2.0 and HTTP/1.x requests at least where they overlap. Maybe the change 
> to the API is the way out.
> Keen to see Yann's feedback on this :-)

I'm not Yann but my 2c is that ap_parse_request_line() is designed to 
safely parse an HTTP/1.x request-line off the wire and probably 
shouldn't be used in mod_h2.  The complexity of that function is dealing 
with all the error cases, which is not useful since none of them will be 
hit with HTTP/2.

It looks cheaper - and not too complex - to set the fields of the 
request_rec directly as appropriate for HTTP/2 as 
ap_parse_request_line() does for HTTP/1.1?

Regards, Joe




Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use

2020-06-16 Thread Joe Orton
On Thu, May 07, 2020 at 09:50:26AM -0400, Eric Covener wrote:
> On Thu, May 7, 2020 at 9:39 AM Joe Orton  wrote:
> > Better question... or stupider question?  For "modern" OpenLDAP where
> > ldap_set_rebind_proc takes a void *, this linked list cache is
> > completely redundant and you can "simply" pass the (bindDN, bindPW)
> > through to the rebind callback via the void *, and that will work
> > correctly?
> 
> That makes sense, I would think so.  The manual on my Mac has the
> userdata parm too.

I have merged that change in r1878890 - thanks a lot for reviewing & 
walking me through this far, apologies for moving slowing but I wanted 
to get a working test case before proceeding.

Reviewing this thread again I am not sure if the change to 
remove/reorder the removal of the rebind is safe for non-OpenLDAP 
toolkits.  I can revert that bit if required?

https://github.com/apache/httpd/commit/130eac3ae6e8f7a8465c7792660ce8b747642b23#diff-f0cf47f5d582509d36e95f170231bc21L226

Regards, Joe



Re: Errored: notroj/httpd#100 (ldaprebind - 1255c78)

2020-06-09 Thread Joe Orton
Apologies, I didn't realize that forks would spam the list as well.  :( 
There will be a few more errors before the commit which turns off 
e-mails gets through and/or I get lucky and the thing passes.

Looks like we can use a conditional to disable notifications by default 
for forks, I'll try adjusting it in trunk.








Re: Trunk mod_md MDStoreDir default

2020-05-18 Thread Joe Orton
On Sat, May 16, 2020 at 11:19:58AM +0200, Steffen wrote:
> Current trunk 1877795  Win
> 
> Change with current 2.4.43 :
> 
> MDStoreDire defaults to ServerRoot/logs and not relative to ServerRoot
> 
> 
> Updating 2.4.43 to Trunk :  issue my mailserver cannot find certificates.

As I said in my previous message you need to think about the appropriate 
default DEFAULT_REL_STATEDIR for Windows.  I set it to logs/ to get your 
build working but maybe setting it to "" is better.  I don't know, you 
should adjust it as you think best.

http://svn.apache.org/viewvc?view=revision&revision=1877471

Regards, Joe



Re: svn commit: r1877397 - in /httpd/httpd/trunk: CHANGES modules/ssl/ssl_engine_init.c modules/ssl/ssl_engine_io.c modules/ssl/ssl_engine_kernel.c modules/ssl/ssl_private.h modules/ssl/ssl_util_ssl.c

2020-05-18 Thread Joe Orton
On Fri, May 15, 2020 at 11:20:51PM +0200, Yann Ylavic wrote:
> On Fri, May 15, 2020 at 8:59 PM Ruediger Pluem  wrote:
> >
> > On 5/15/20 6:50 PM, Yann Ylavic wrote:
> > >
> > > Somehow this change (bisected) broke many framework tests for me:
> > > t/ssl/* and t/security/CVE-*, the ones using mod_ssl I suppose.
> > > This is with openssl 1.1.1, and "SSLProtocol all -TLSv1.3" (which is
> > > the default $sslproto in "Apache-Test/lib/Apache/TestSSLCA.pm").
> >
> > Good catch that it fails with these settings, I and I guess Travis as well
> > use more recent versions of Net::SSLeay such that $sslproto is set to "all".
> 
> Actually it was just Joe's keyboard needing a new "I" key :)
> Everything works for me now with r1877795.

Oh my... can I blame autocomplete for this one, somehow?

Thanks a lot, Yann!  I guess we can have a special travis test run which 
should catch that, will investigate.

Regards, Joe



Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use

2020-05-07 Thread Joe Orton
On Thu, May 07, 2020 at 08:06:00AM -0400, Eric Covener wrote:
> On Thu, May 7, 2020 at 2:04 AM Ruediger Pluem  wrote:
> > Not looked at the problem further, but there is now a PR for this:
> >
> > https://bz.apache.org/bugzilla/show_bug.cgi?id=64414
> >
> > Maybe this revives the discussion :-)
> 
> Nothing stood out in the source, is it OK to hash a pointer just by
> passing APR_SIZEOF_VOIDP as the length?

I don't see why not.

Better question... or stupider question?  For "modern" OpenLDAP where 
ldap_set_rebind_proc takes a void *, this linked list cache is 
completely redundant and you can "simply" pass the (bindDN, bindPW) 
through to the rebind callback via the void *, and that will work 
correctly?





Re: Trunk DEFAULT_REL_STATEDIR undeclared

2020-05-07 Thread Joe Orton
On Thu, May 07, 2020 at 12:55:03PM +0200, Steffen wrote:
> Tried to build trunk again after 2 years :)
> 
> server\core.c(5618,58): error C2065: 'DEFAULT_REL_STATEDIR': undeclared
> identifier

That was added in r1842929 (October 2018) and nobody has noticed Windows 
was broken before now?  "Discontinued Integration".  Try r1877471 but if 
Windows MPMs use privilege separation like Unix then the STATEDIR should 
be a different directory to logs/ so that's not ideal.

Can someone who cares about Windows pretty plaseee work out how to 
get it building in Travis (or appveyor, or whatever)?





Re: RFC: mod_ssl features to dump for 2.5

2020-05-06 Thread Joe Orton
On Wed, May 06, 2020 at 11:44:37AM +0100, Joe Orton wrote:
> On Mon, May 04, 2020 at 05:23:23PM +0200, Ruediger Pluem wrote:
> > On 5/4/20 3:49 PM, Joe Orton wrote:
> > > d) SSLRandomSeed.  This might have made sense in 1998 but at least with 
> > > OpenSSL 1.1.1 which has a rewritten and fork-safe RAND, I think httpd 
> > > should not be doing RAND seeding ever.  Currently mod_ssl will splat 
> > > random stack data, time() and the pid into the RNG state for each new 
> > > connection.  Unless someone can prove this is valuable and the OpenSSL 
> > > PRNG is somehow broken OOTB, I think this code + directive should be 
> > > dropped for OpenSSL 1.1.1+, including EGD support etc.
> > 
> > Do we drop it only for OpenSSL 1.1.1 or are there other older versions of 
> > OpenSSL where this is save to drop?
> 
> From https://wiki.openssl.org/index.php/Random_fork-safety it seems like 
> there is some reason to believe the <1.1.1 RNG is not safe after fork 
> unless you help it.
> 
> I was looking at the Fedora default mod_ssl config which does have a 
> default "SSLRandom", but the example httpd-ssl.conf shipped does not. So 
> *maybe* configuring SSLRandomSeed is useful, but really if it is not 
> needed by default we should do something by default, which we don't.

^ Apologies for garbled grammar, I meant:

"if it IS needed by default, we should do something by default"

... and we *do* have something configured by default:

# Note: The following must must be present to support
#   starting without SSL on platforms with no /dev/random equivalent
#   but a statically compiled-in mod_ssl.
#

SSLRandomSeed startup builtin
SSLRandomSeed connect builtin


but if OpenSSL does not have entropy source beyond that provided by 
mod_ssl calling getpid() and time() it is IMO far better to fail to 
start up.

So maybe we should still call RAND_status() and fail startup if the PRNG 
is not initialized correctly?

Regards, Joe



Re: RFC: mod_ssl features to dump for 2.5

2020-05-06 Thread Joe Orton
On Mon, May 04, 2020 at 05:23:23PM +0200, Ruediger Pluem wrote:
> On 5/4/20 3:49 PM, Joe Orton wrote:
> > d) SSLRandomSeed.  This might have made sense in 1998 but at least with 
> > OpenSSL 1.1.1 which has a rewritten and fork-safe RAND, I think httpd 
> > should not be doing RAND seeding ever.  Currently mod_ssl will splat 
> > random stack data, time() and the pid into the RNG state for each new 
> > connection.  Unless someone can prove this is valuable and the OpenSSL 
> > PRNG is somehow broken OOTB, I think this code + directive should be 
> > dropped for OpenSSL 1.1.1+, including EGD support etc.
> 
> Do we drop it only for OpenSSL 1.1.1 or are there other older versions of 
> OpenSSL where this is save to drop?

>From https://wiki.openssl.org/index.php/Random_fork-safety it seems like 
there is some reason to believe the <1.1.1 RNG is not safe after fork 
unless you help it.

I was looking at the Fedora default mod_ssl config which does have a 
default "SSLRandom", but the example httpd-ssl.conf shipped does not. So 
*maybe* configuring SSLRandomSeed is useful, but really if it is not 
needed by default we should do something by default, which we don't.

(I feel like there should be a assumption in favour of correctness with 
OpenSSL and any code which assumes incorrectness should have very strong 
justification for its continued existence.  Instead we have a tendency 
to carry a lot of code merely because "we've always done it like this".)

> And if we drop how do we drop it? If we can only drop it for OpenSSL 1.1.1 I 
> would be in favour
> of sending a message to the log (INFO level) that it is just ignored. This 
> avoids that a config working with OpenSSL < 1.1.1
> fails with OpenSSL 1.1.1 but the same Apache version.

Very good idea, I'll do it like that.  Thanks for the feedback!

Regards, Joe



Re: svn commit: r1877397 - in /httpd/httpd/trunk: CHANGES modules/ssl/ssl_engine_init.c modules/ssl/ssl_engine_io.c modules/ssl/ssl_engine_kernel.c modules/ssl/ssl_private.h modules/ssl/ssl_util_ssl.c

2020-05-05 Thread Joe Orton
On Tue, May 05, 2020 at 03:23:18PM +0200, Ruediger Pluem wrote:
> On 5/5/20 2:40 PM, jor...@apache.org wrote:
> > Author: jorton
> > Date: Tue May  5 12:40:38 2020
> > New Revision: 1877397
> > 
> > URL: http://svn.apache.org/viewvc?rev=1877397&view=rev
> > Log:
> > mod_ssl: Switch to using SSL_OP_NO_RENEGOTATION (where available) to
> > block client-initiated renegotiation with TLSv1.2 and earlier.
...
> > Modified: httpd/httpd/trunk/modules/ssl/ssl_engine_kernel.c
> > URL: 
> > http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/ssl/ssl_engine_kernel.c?rev=1877397&r1=1877396&r2=1877397&view=diff
> > ==
> > --- httpd/httpd/trunk/modules/ssl/ssl_engine_kernel.c (original)
> > +++ httpd/httpd/trunk/modules/ssl/ssl_engine_kernel.c Tue May  5 12:40:38 
> > 2020
> 
> > @@ -1182,8 +1182,6 @@ static int ssl_hook_Access_modern(reques
> >  return HTTP_FORBIDDEN;
> >  }
> >  
> > -old_state = sslconn->reneg_state;
> > -sslconn->reneg_state = RENEG_ALLOW;
> >  modssl_set_app_data2(ssl, r);
> >  
> >  SSL_do_handshake(ssl);
> > @@ -1193,7 +1191,6 @@ static int ssl_hook_Access_modern(reques
> >   */
> >  SSL_peek(ssl, peekbuf, 0);
> >  
> > -sslconn->reneg_state = old_state;
> >  modssl_set_app_data2(ssl, NULL);
> >  
> >  /*
> 
> I don't understand why this can be removed unconditionally.

It is removed unconditionally in ssl_hook_Access_modern, which is used 
only for TLSv1.3.  The aim of this code was to protect against 
ClientHello being sent unexpectedly and that is always illegal in 
TLSv1.3.  https://www.rfc-editor.org/rfc/rfc8446.html#page-28 -

   Because TLS 1.3 forbids renegotiation, if a server has negotiated
   TLS 1.3 and receives a ClientHello at any other time, it MUST
   terminate the connection with an "unexpected_message" alert.

So the ->reneg_state setting already had no effect for TLSv1.3, and the 
code in ssl_callback_Info *before* my commit already checked and was a 
noop for the TLSv1.3 case, in line 2289 of the function:

http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/ssl/ssl_engine_kernel.c?revision=1877349&view=markup&pathrev=1877397#l2273

Does that make sense?

Regards, Joe



Re: RFC: mod_ssl features to dump for 2.5

2020-05-04 Thread Joe Orton
On Mon, May 04, 2020 at 09:59:24AM -0400, Eric Covener wrote:
> On Mon, May 4, 2020 at 9:49 AM Joe Orton  wrote:
> > c) Client-initiated renegotiation prevention mechanism.  This was
> > introduced mostly as a temporary workaround for CVE-2009-3555, and as
> > the saying goes, there is nothing as permanent as a temporary
> > workaround.  This already doesn't apply for TLSv1.3, and it doesn't
> > really add much for TLS < v1.3 so I think it can go completely.
> 
> I am not familiar with this one in mod_ssl but I am familiar with the issue.
> Does it generate distinctive log messages for TLS < 1.3 that are
> useful for e.g. fail2ban?

Yes - APLOGNO(02042) is generated here.

> Has OpenSSL caught up and can we directly kill client-initiated renegotiation?

Great question.  Looks like OpenSSL 1.1.1 added a new option flag, 
SSL_OP_NO_RENEGOTIATION, which does exactly this, so we could use that 
instead of the current code which has to track and intercept handshakes 
manually.  It also sends a TLS alert rather than simply aborting the 
connection, so it's better behaved than the current code.  I'll look at 
switching over to this instead of dropping it instead.  Thanks!

Regards, Joe



RFC: mod_ssl features to dump for 2.5

2020-05-04 Thread Joe Orton
I'd like to gauge consensus on removing the following mod_ssl features 
for 2.5.  I am +1 (more or less strongly) on removing all the following:

a) SSLInsecureRengotiation.  If you haven't patched your clients for 
CVE-2009-3555 there is no hope.  This should definitely be removed.

b) SSLRequire - this has been deprecated since it was subsumed into the 
better "Require expr" interface in 2.4.x. 

c) Client-initiated renegotiation prevention mechanism.  This was 
introduced mostly as a temporary workaround for CVE-2009-3555, and as 
the saying goes, there is nothing as permanent as a temporary 
workaround.  This already doesn't apply for TLSv1.3, and it doesn't 
really add much for TLS < v1.3 so I think it can go completely.

d) SSLRandomSeed.  This might have made sense in 1998 but at least with 
OpenSSL 1.1.1 which has a rewritten and fork-safe RAND, I think httpd 
should not be doing RAND seeding ever.  Currently mod_ssl will splat 
random stack data, time() and the pid into the RNG state for each new 
connection.  Unless someone can prove this is valuable and the OpenSSL 
PRNG is somehow broken OOTB, I think this code + directive should be 
dropped for OpenSSL 1.1.1+, including EGD support etc.

e) SSLCompression - enabling this has been considered (and documented 
as) a bad idea for a good while.  IMO we should have "SSLCompression 
off" the hard-coded default and drop the directive.

Regards, Joe



Re: Can github activity (new PRs, comments) be forwarded to dev@ ?

2020-05-04 Thread Joe Orton
On Wed, Apr 29, 2020 at 11:47:33AM -0500, Daniel Gruno wrote:
> And done, plus we have GH notifications going there now :)

Nice, thanks a lot Daniel.




Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use

2020-04-30 Thread Joe Orton
On Wed, Apr 29, 2020 at 03:09:28PM -0400, Eric Covener wrote:
> On Wed, Apr 29, 2020 at 2:43 PM Ruediger Pluem  wrote:
> > Hm. Don't we need to do the apr_pool_clear(ldc->rebind_pool); in any case 
> > if ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON or
> > does this only makes sense if ldc->ldap != NULL?
> > And can we still do an ldap_unbind_s(ldc->ldap); if we did an 
> > apr_ldap_rebind_remove(ldc->ldap); before?
> 
> I see what you mean. If we lost the ldc->ldap some other way, anything
> allocated from ldc->rebind_pool is leaked because the LDAP key can't
> be looked up in the xref linked list.
> We would likely have linked list growth in that case too.

Shouldn't clearing the pool here be sufficient anyway?  Since there is a
cleanup registered by apr_ldap_rebind_add() which calls
apr_ldap_rebind_remove() with the original LDAP * pointer value it looks
safe, and it avoids entering _rebind_remove twice.

Doing it before the ldap_unbind() call so that LDAP * pointer is not a
dangling pointer seems better.

The underlying problem here is a performance issue which looked like
linked list growth so actually with:

a) threaded MPM and large number of threads,
b) each thread may have its own LDAP * and an associated rebind entry  (is that 
really right?!)
c) on unbind each thread walks the linked list w/mutex held

which looks quite painful regardless, and doing (c) twice is clearly 
worse.  So as well as changing this the indexed linked list should 
really be a hash table?

Regards, Joe

-- 
Joe Orton // Red Hat Core Services



[PATCH] mod_ldap: fix apr_ldap_rebind_remove() use

2020-04-29 Thread Joe Orton
In uldap_connection_unbind, apr_ldap_rebind_remove() is always passed 
NULL since ldc->ldap is either NULL on entry or is set to NULL.  This 
looks safe, but seems like an expensive noop since 
apr_ldap_rebind_remove() acquires a mutex and iterates a linked list 
trying to find a rebind reference matching NULL (I assume always 
failing).

http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/ldap/util_ldap.c?revision=1876599&view=markup#l222

Can this be removed or should it be moved up so it's not a noop?  I've 
not tried to reproduce problems in an LDAP config with referrals.

(Credit to Abhishek and Angel who saw apr_ldap_rebind_remove() spinning 
and debugged this)

Index: modules/ldap/util_ldap.c
===
--- modules/ldap/util_ldap.c(revision 1877157)
+++ modules/ldap/util_ldap.c(working copy)
@@ -225,6 +225,12 @@
 
 if (ldc) {
 if (ldc->ldap) {
+/* forget the rebind info for this conn */
+if (ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON) {
+apr_ldap_rebind_remove(ldc->ldap);
+apr_pool_clear(ldc->rebind_pool);
+}
+
 if (ldc->r) { 
 ap_log_rerror(APLOG_MARK, APLOG_TRACE5, 0, ldc->r, "LDC %pp 
unbind", ldc); 
 }
@@ -233,11 +239,6 @@
 }
 ldc->bound = 0;
 
-/* forget the rebind info for this conn */
-if (ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON) {
-apr_ldap_rebind_remove(ldc->ldap);
-apr_pool_clear(ldc->rebind_pool);
-}
 }
 
 return APR_SUCCESS;



Re: Can github activity (new PRs, comments) be forwarded to dev@ ?

2020-04-29 Thread Joe Orton
On Tue, Apr 28, 2020 at 05:19:09PM +0200, Ruediger Pluem wrote:
> +1 g...@httpd.apache.org (I declare the naming discussion for the list name 
> opened :-)).
> This offers an easy opt-in for those who are interested in these updates.

+1 from me, does anybody know if it's actually technically possible to 
do though?  AFAICT there is not github project setting to do this kind 
of thing specifically so we'd either need to use the API somehow?  Or 
add whate...@httpd.apache.org to a personal github account and have 
notifications through that, which seems ugly.

Regards, Joe



Re: Move FILE buckets on setaside (like/as a morphing bucket)?

2020-04-27 Thread Joe Orton
On Mon, Apr 27, 2020 at 04:27:56PM +0200, Yann Ylavic wrote:
> On Mon, Apr 27, 2020 at 4:11 PM Joe Orton  wrote:
> >
> > +1 from me for using the term "opaque buckets" as a synonym for
> > "e->length == (apr_size_t)-1" aka "buckets with indeterminate length".
> 
> OK thanks, just committed "something" in r1877077 because I keep
> messing up with my attachments today.
> Let's discuss from "that" if needed..

That change is definitely better.  Isn't the code still making the 
assumption that FILE is the only possible non-opaque morphing bucket 
type?

http://svn.apache.org/viewvc/httpd/httpd/trunk/server/util_filter.c?revision=1877077&view=markup#l1108

...and by "non-opaque morphing bucket type" the distinction intended 
here is... "representing a determinate length bucket type which is not 
fully mapped into RAM until you try read()ing it until exhaustion".

(!)



Re: Move FILE buckets on setaside (like/as a morphing bucket)?

2020-04-27 Thread Joe Orton
On Mon, Apr 27, 2020 at 03:57:37PM +0200, Yann Ylavic wrote:
> On Mon, Apr 27, 2020 at 3:43 PM Yann Ylavic  wrote:
> >
> > On Mon, Apr 27, 2020 at 2:09 PM Yann Ylavic  wrote:
> > >
> > > https://github.com/apache/httpd/pull/117
> >
> > I closed it, how about the second patch there though?
> >
> > https://github.com/apache/httpd/pull/117/commits/cdf4c4a3c61340f061267cc70c456d9469e9faac
> 
> Actually the attached one (with no mention of morphing since we are
> not talking about FILE buckets anymore).
> I have chosen the term "opaque" because it's concise (e.g.
> opaque_buckets_in_brigade), but it's just in comments now (and
> accompanied by "(length == -1)" to specify what it means). The goal
> being to axe AP_BUCKET_IS_MORPHING() which I introduced recently..

+1 from me for using the term "opaque buckets" as a synonym for 
"e->length == (apr_size_t)-1" aka "buckets with indeterminate length". 

Regards, Joe



Re: Providing near-time averaged monitoring data for mod_systemd and mod_status

2020-04-27 Thread Joe Orton
On Sat, Apr 25, 2020 at 08:10:40PM +0200, Rainer Jung wrote:
> Patch available at
> 
> home.apache.org/~rjung/patches/httpd-trunk-mon-snaps-v1_2.patch

Very nice!  +1 from me.

Does the times_per_thread logic still make any sense?  It's always been 
wrong for Linux AFAICT so maybe can just be dropped.

A minor nit, I think the snap_index should be read/written using 
apr_atomic_t since it's going to be accessed concurrently across 
threads?

Regards, Joe



> Some notes:
> 
> - in order to use the data from mod_systemd (monitor hook), which runs in
> the maim process, and also from mod_status, so from child processes, it
> needs to sit in shared memory. Since most of the data is tightly coupled to
> the scoreboard, I added the new cached data to the global part of the
> scoreboard.
> 
> - it was then IMHO a good fit to move the existing ap_get_sload(ap_sload_t
> *ld) from server/util.c to server/scoreboard.c.
> 
> - ap_sload_t is used as a collection of data which can be used to generate
> averaged data from a pair of ap_sload_t structs. It was extended to also
> contain the total request duration plus total usr and sys CPU and the
> timestamp when the data was taken. So it should now contain all data from
> which mod-systemd and mod_status currently like to display derived averaged
> values.
> 
> - busy and idle in ap_sload_t have been changed from int to float, because
> they were already only used a percentages. IMHO it doesn't make sense to
> express idle and busy as percentages based on a total of busy plus idle,
> because that sum is not a meaningful total. The server can grow by creating
> new processes and suddenly your busy percentage might shrink, although the
> absolute number of busy threads increases. That's counter intuitive. So I
> added the unused process slots to the total count, and we have three
> counters that sum up to this total, busy, idle and dead. We need a better
> name than "dead" for these unused process slots that might get used later.
> "unused" is to close to "idle" and I don't have a good name yet.
> 
> - the new ap_mon_snap_t contains a pointer to an ap_sload_t, six averaged
> values generated from two ap_sload_t and a member that conatins the time
> delta between those two ap_sload_t. The ap_sload_t referenced by
> ap_mon_snap_t contains the data at time t1. During the next monitor run, new
> t1 data will be collected and the previous t1 data will be used as t0 data
> to generate the new averages.
> 
> - the scoreboard contains two ap_mon_snap_t (plus the two ap_sload_t
> referenced by them) to be able to update one of them without breaking access
> by consumers to the other one. After the update the roles get switched.
> 
> - both structs, ap_sload_t and ap_mon_snap_t are declared in httpd.h,
> because ap_sload_t already had been there. t might be a better fit to move
> them to scoreboard.h, but I'm not sure about that and I don't know if that
> would be allowed by the compatibility rules.
> 
> - also in httpd.h there are now three new function declarations. Two only
> used by server/core.c as hook functions:
> 
>   int ap_scoreboard_monitor(apr_pool_t *p, server_rec *s);
>   void ap_scoreboard_child_init(apr_pool_t *p, server_rec *s);
> 
> and one for public consumption:
> 
>   AP_DECLARE(void) ap_get_mon_snap(ap_mon_snap_t *ms);
> 
> - mod_systemd and mod_status now call ap_get_mon_snap(&snap) to retrieve the
> latest averaged data. mod_status still uses the scoreboard in addition to
> retrieve up-to-date current scalar metrics. Small adjustments to the
> mod_status output plus additions to mod_systemd notification messages. Tne
> STATUS in the notification of mod_systemd now looks liek this:
> 
> STATUS=Pct Idle workers 28.4; Pct Busy workers 10.6; Pct Dead workers 60.9;
> Requests/sec: 5030; Bytes served/sec: 2.9MB/sec; Bytes served/request: 596
> B/req; Avg Duration(ms): 5.78; Avg Concurrency: 29.1; Cpu Pct: 40.5
> 
> - scoreboard.c changes:
> 
>   - take over ap_get_sload(ap_sload_t *sl) from server/util.c.
> Added copied code from mod_status to that function to also add up the
> request duration and the usr and sys CPU times.
> 
>   - ap_scoreboard_monitor() runs in the monitor hook. It calls
> ap_get_sload() and then a static utility function calc_mon_data() to
> calculate the new averages.
> 
> - Minor mmn change not yet part of the patch.
> 
> It compiles fine (maintainer mode) on RHEL 7 x86_64 and on Solaris 10 Sparc
> and I did some tests with mod_status and mod_systemd.
> 
> Regards,
> 
> Rainer
> 
> Am 24.04.2020 um 18:32 schrieb Rainer Jung:
> > Am 24.04.2020 um 16:21 schrieb Joe Orton:
> > > On Fri, Apr 24

Re: Move FILE buckets on setaside (like/as a morphing bucket)?

2020-04-27 Thread Joe Orton
On Mon, Apr 27, 2020 at 02:29:55PM +0200, Yann Ylavic wrote:
> The call chain is:
>   apr_bucket_setaside(bucket, newpool)
>   => file_bucket_setaside(bucket, newpool)
>  => apr_file_setaside(&newfd, bucket->data->fd, newpool)
> => copy bucket->data->fd to newfd, move cleanup to newpool
> => bucket->data->fd->filedes = -1
>  => bucket->data->fd = newfd
> 
> And since bucket->data is shared, all other buckets with the same data
> get their filedes set to -1.

You mentioned this was a problem for split buckets, which I don't get.  
For a split FILE bucket bucket->data is common across copies, so this 
seems fine, the last "bucket->data->fd = newfd;" updates _all_ the split 
buckets.

I can see this is a problem *outside* of _split, so if you do something 
like:

e1 = apr_brigade_insert_file(bb, fd, ...);
e2 = apr_brigade_insert_file(bb, fd, ...);

and then a later apr_bucket_setaside(e1) will break e2 as you describe, 
because e1 and e2 _don't_ have a common ->data.

I can see in the list archive a discussion of apr_file_setaside() where 
cliffw suggested apr_file_setaside() should work like apr_mmap_dup() 
which doesn't appear to have this problem.

> > > The case I'm facing is a bit different but similar. A handler is 
> > > saving large bodies into a file and inserts an input filter for 
> > > mod_proxy to use the file bucket as request body (much like 
> > > mod_request). The file is still needed/used after mod_proxy (in an 
> > > output filter), so I don't expect the file to be invalidated by 
> > > mod_proxy's output filtering.
> >
> > I do agree that it is surprising behaviour that the apr_file_t * is
> > potentially invalidated after being wrapped in a FILE bucket though. Not
> > sure how to avoid it, IIRC this is some optimisation to avoid excessive
> > dup2() calls?
> 
> Yes I think so, that plus a new fd to take into account for ulimit nofile.
> Alternatively we could play with apr_os_file_get()+apr_os_file_set(),
> which will leave alone the fd and cleanup from the original pool, but
> at that point I think we'd better let the caller responsible for the
> lifetime..

I think within the constraint of the current APR/httpd buckets/filtering 
API it is probably necessary to say that you cannot insert a FILE bucket 
referring to the same apr_file_t * twice.  Or you should dup2 inbetween.

That is clearly the implication of the apr_file_setaside() API warning 
which extends to apr_bucket_setaside().  Alternatively you could say the 
apr_file_setaside() implementation is broken and should work like 
apr_mmap_dup().

(I think treating setaside as special for FILE is a bad road to start 
down - I should be able copy and paste the FILE implementation as a new 
MYFILE bucket type and have the core behave correctly, if not optimally)

Regards, Joe



Re: Move FILE buckets on setaside (like/as a morphing bucket)?

2020-04-27 Thread Joe Orton
On Sun, Apr 26, 2020 at 05:26:02PM +0200, Yann Ylavic wrote:
> For FILE buckets, the behaviour of apr_bucket_setaside() is to take
> *full* ownership of the underlying apr_file, that is: allocate/move
> the file/cleanup to the new pool AND set the old file's fd to -1 (see
> apr_file_setaside, [1]).
> 
> This can be problematic in an output filters chain whenever a FILE
> bucket gets splitted. Suppose that an FTYPE_RESOURCE output filter
> splits a FILE bucket in two and sends the first one through the chain,
> and then due to network congestion the core output filter needs to set
> the FILE bucket aside. It happens that the second FILE bucket becomes
> a dangling bucket (with fd == -1).

Hang on, I don't follow this, how does data->fd end up as -1 in this 
case?  The buckets split from the original FILE bucket have a common 
->data since it is a refcounted bucket type.  If any of them are 
setaside the common ->data->fd is replaced with a new apr_file_t 
allocated from a new (longer-lived) pool.  No?

> The case I'm facing is a bit different but similar. A handler is
> saving large bodies into a file and inserts an input filter for
> mod_proxy to use the file bucket as request body (much like
> mod_request). The file is still needed/used after mod_proxy (in an
> output filter), so I don't expect the file to be invalidated by
> mod_proxy's output filtering.

I do agree that it is surprising behaviour that the apr_file_t * is 
potentially invalidated after being wrapped in a FILE bucket though. Not 
sure how to avoid it, IIRC this is some optimisation to avoid excessive 
dup2() calls?

Regards, Joe



Re: mod_systemd suggestion

2020-04-24 Thread Joe Orton
On Fri, Apr 24, 2020 at 12:17:19PM +0200, Rainer Jung wrote:
> Thinking further: I think it would make sense to have a module or core
> implement the monitor hook to generate that derived data (requests/sec,
> bytes/sec, durationMs/request, avgConcurrency) in the last monitor interval
> and to provide that data to consumers like mod_systemd or - new - mod_status
> - instead of the long term averages since start. It could probably be added
> to the code that already provides "sload". That way mod_status would also
> profit from the more precise average values (taken over the last monitor
> interval).

I definitely like the patch, it has bothered me that the "per second" 
stats are not very useful but wasn't sure how to make it better.

This is also an interesting idea.

So you would suggest having a new monitor hook which runs REALLY_LAST in 
the order, calls ap_get_sload() and stores it in a global, and then we'd 
have an ap_get_cached_sload() (or whatever) which gives you the cached 
data from the last iteration?  Or are you thinking of a more 
sophisticated API which does the "diff" between intervals internally?

Regards, Joe

> 
> Regards,
> 
> Rainer
> 
> Am 23.04.2020 um 21:29 schrieb Rainer Jung:
> > Hi all,
> > 
> > triggered by the new mod_systemd I drafted a patch to enhance the
> > monitoring data it provides during the monitor hook run.
> > 
> > Currently it publishes important data, like idle and busy slots and
> > total request count, but also not so useful info like requests/second
> > and bytes/second as a long term average (since start). These two figues
> > tend to become near constant after a longer time of operation.
> > 
> > Since the monitor hook of the module always seems to run in the same
> > (parent) process, it is easy to remember the previous request and byte
> > count data and average only over the last monitor hook interval. This
> > should give more meaningful data. And is a change local to mod_systemd.
> > 
> > In addition we have a third metric available in the scoreboard, namely
> > the total request duration. From that we can get the average request
> > duration and the average request concurrency. This part also needs a
> > change to the sload structure. Maybe we need a minor MMN bump for that.
> > 
> > I scetched a patch under
> > 
> > home.apache.org/~rjung/patches/httpd-trunk-mod_systemd-interval-stats.patch
> > 
> > Any comments, likes or dislikes?
> > 
> > Thanks and regards,
> > 
> > Rainer
> 



Travis notifications for trunk?

2020-04-23 Thread Joe Orton
In the past two weeks we've had three true negative build failures from 
Travis on trunk - i.e. bugs were introduced and CI caught them - and one 
false negative (where the gcc 9 build broke because of some change in 
the Ubuntu repo).  So I think we're reaching the point where signal is 
dominating noise.

Is everybody OK with sending e-mail notifications for CI failures for 
trunk to this list?  It's quite likely we will still see occassional 
false negatives, and can continue to mark as "allowed failure" any build 
combination which looks unstable, so it doesn't fail the job overall.

As with 2.4.x we will only see fail->pass and pass->fail transitions via 
e-mail notification.

Regards, Joe



Re: svn commit: r1876870 - in /httpd/httpd/branches/2.4.x: ./ acinclude.m4

2020-04-23 Thread Joe Orton
On Thu, Apr 23, 2020 at 08:56:23AM -, rj...@apache.org wrote:
> URL: http://svn.apache.org/viewvc?rev=1876870&view=rev
> Log:
> systemd dependencies are only needed by mod_systemd.
> They should currently not be needed by httpd directly
> or any other binary. So no need to add them to
> HTTPD_LIBS.
..
> --- httpd/httpd/branches/2.4.x/acinclude.m4 (original)
> +++ httpd/httpd/branches/2.4.x/acinclude.m4 Thu Apr 23 08:56:23 2020
> @@ -623,7 +623,6 @@ case $host in
>if test "${ac_cv_header_systemd_sd_daemon_h}" = "no" || test -z 
> "${SYSTEMD_LIBS}"; then
>  AC_MSG_WARN([Your system does not support systemd.])
>else
> -APR_ADDTO(HTTPD_LIBS, [$SYSTEMD_LIBS])
>  AC_DEFINE(HAVE_SYSTEMD, 1, [Define if systemd is supported])
>fi
> fi

I think this is wrong on trunk but right on 2.4.x, as seen in Travis =>

trunk: FAIL https://travis-ci.org/github/apache/httpd/builds/678511325
2.4.x: PASS https://travis-ci.org/github/apache/httpd/builds/678512386

On trunk, systemd socket activation support is in the core so httpd 
should be linked against $SYSTEMD_LIBS if built with systemd support.

On 2.4.x mod_systemd has been backported but socket activation has not, 
so this is correct.

Also I don't think 2.4.x Unix buildsystem changes are CTR ;)

tl;dr: +1 for 2.4.x, -1 for trunk.

Regards, Joe



[RFC] static type checking for AP_INIT_FLAG etc

2020-04-22 Thread Joe Orton
Another one I'd like to check consensus on, before disappearing too far 
down the rabbit-hole.

The ap_set_*_slot directive function initializers throw away type safety 
completely since you can pass anything to APR_OFFSETOF(x,y) and it's 
cast to void * regardless.  I've seen one bug because of this in a 
third-party module which was using a bool [sizeof(char)] rather than an 
int with ap_set_int_slot.  r1876823 has another example.

You can get away with "minor" errors like that on little-endian 
platforms since the least significant bytes are in the "right" place 
even if the type is wrong.  So bugs don't show up until customers run 
your code on, just as an example, IBM hardware! ;)

I don't see a way to fix all the code with hard-coded APR_OFFSETOF(), 
but we can offer some alternative wrapper macros which get rid of 
OFFSETOF and check the types at compile-time with some gcc builtin 
magic.  PoC attached.

The failure will look like this one from the mod_proxy_html case:

In file included from mod_proxy_html.c:55:
/home/jorton/src/asf/httpd-git/include/http_config.h:162:5: error: void value 
not ignored as it ought to be
  162 | __builtin_choose_expr(sizeof(actual) == sizeof(expected), result, 
(void)0)
  | ^
/home/jorton/src/asf/httpd-git/include/http_config.h:169:13: note: in expansion 
of macro ‘AP_INIT_CHECKED_TYPE’
  169 | AP_INIT_CHECKED_TYPE(((structname *)0)->fieldname, int, 
\
  | ^~~~
mod_proxy_html.c:1316:5: note: in expansion of macro ‘AP_INIT_TAKE1_INT_SLOT’
 1316 | AP_INIT_TAKE1_INT_SLOT("ProxyHTMLBufSize", proxy_html_conf, bufsz,
  | ^~

which is not totally obvious but at least it's a failure.

Opinions?

Regards, Joe
diff --git a/include/http_config.h b/include/http_config.h
index 2aac6d4325..0c32d94b3c 100644
--- a/include/http_config.h
+++ b/include/http_config.h
@@ -155,6 +155,31 @@ typedef union {
 # define AP_INIT_FLAG(directive, func, mconfig, where, help) \
 { directive, { .flag=func }, mconfig, where, FLAG, help }
 
+#ifdef __GNUC__
+/* Expands to a compile-time error (void)0 if size of actual and
+ * expected arguments do not match, otherwise expands to results. */
+# define AP_INIT_CHECKED_TYPE(actual, expected, result) \
+__builtin_choose_expr(sizeof(actual) == sizeof(expected), result, (void)0)
+#else
+# define AP_INIT_CHECKED_TYPE(actual, expected, result) result
+#endif
+
+# define AP_INIT_TAKE1_INT_SLOT(directive, structname, fieldname, where, help) 
\
+{ directive, { .take1=ap_set_int_slot }, \
+AP_INIT_CHECKED_TYPE(((structname *)0)->fieldname, int, \
+ (void *)APR_OFFSETOF(structname, fieldname)), 
\
+ where, TAKE1, help }
+# define AP_INIT_FLAG_SLOT(directive, structname, fieldname, where, help) \
+{ directive, { .flag=ap_set_flag_slot },\
+AP_INIT_CHECKED_TYPE(((structname *)0)->fieldname, int, \
+ (void *)APR_OFFSETOF(structname, fieldname)), \
+where, FLAG, help }
+# define AP_INIT_FLAG_CHAR_SLOT(directive, structname, fieldname, where, help) 
\
+{ directive, { .flag=ap_set_flag_slot_char },\
+AP_INIT_CHECKED_TYPE(((structname *)0)->fieldname, char, \
+ (void *)APR_OFFSETOF(structname, fieldname)), \
+where, FLAG, help }
+
 #else /* AP_HAVE_DESIGNATED_INITIALIZER */
 
 typedef const char *(*cmd_func) ();
@@ -193,6 +218,15 @@ typedef const char *(*cmd_func) ();
 { directive, func, mconfig, where, TAKE3, help }
 # define AP_INIT_FLAG(directive, func, mconfig, where, help) \
 { directive, func, mconfig, where, FLAG, help }
+# define AP_INIT_TAKE1_INT_SLOT(directive, structname, fieldname, where, help) 
\
+{ directive, ap_set_int_slot }, (void *)APR_OFFSETOF(structname, 
fieldname)), \
+ where, TAKE1, help }
+# define AP_INIT_FLAG_SLOT(directive, structname, fieldname, where, help) \
+{ directive, ap_set_flag_slot, (void *)APR_OFFSETOF(structname, 
fieldname), \
+  where, FLAG, help }
+# define AP_INIT_FLAG_CHAR_SLOT(directive, structname, fieldname, where, help) 
\
+{ directive, ap_set_flag_slot_char, APR_OFFSETOF(structname, fieldname)), \
+where, FLAG, help }
 
 #endif /* AP_HAVE_DESIGNATED_INITIALIZER */
 
diff --git a/modules/aaa/mod_authnz_ldap.c b/modules/aaa/mod_authnz_ldap.c
index 08f5fa1bc9..a0d55aa8db 100644
--- a/modules/aaa/mod_authnz_ldap.c
+++ b/modules/aaa/mod_authnz_ldap.c
@@ -1755,12 +1755,12 @@ static const command_rec authnz_ldap_cmds[] =
 AP_INIT_TAKE1("AuthLDAPBindPassword", set_bind_password, NULL, OR_AUTHCFG,
   "Password to use to bind to LDAP server. If not provided, 
will do an anonymous bind."),
 
-AP_INIT_FLAG("AuthLDAPBindAuthoritative", ap_set_flag_slot,
-  (void *)APR_OFFSETOF(authn_ldap_config_t, 
bind_authoritati

Re: Env var default value

2020-04-22 Thread Joe Orton
On Wed, Apr 22, 2020 at 12:50:36PM +0200, Yann Ylavic wrote:
> On Wed, Apr 22, 2020 at 12:10 PM Yann Ylavic  wrote:
> >
> > > ':'
> >
> > This one looks special already in ap_resolve_env(), though it's
> > forbidden in Define so that may be it.
> 
> I'm afraid ':' will collide with mod_rewrite's
> "${mapname:key|default}" syntax for RewriteMap.
> Same goes for '|' it seems, naming discussion still open :)
> 
> bash uses '=' for the default value too, looks quite readable/meaningful to 
> me..

+1 to the idea, and '=' seems like a good choice.

Regards, Joe



Re: [RFC] optional Listen options= argument

2020-04-21 Thread Joe Orton
On Tue, Apr 21, 2020 at 03:13:47PM +0100, Joe Orton wrote:
> On Tue, Apr 21, 2020 at 03:37:04PM +0200, Ruediger Pluem wrote:
> > Shouldn't that be argc < 2?
> 
> It is < 3 to make the second arg truly optional, so a proto default is 

No, you were right the logic was borked.

Updated patch which fixes this, and checks the protocol name for the 
obvious typo problem attached.  Thanks for reviews.

commit e9161f977d7633baf1ef34890f0de31058353c97
Author: Joe Orton 
Date:   Mon Apr 20 16:22:06 2020 +0100

Add "ListenFree" directive to create listener with APR_SO_FREEBIND set
(for use with an IP address which is not yet available on the system).

Reimplement "use_specific_errors" listener flag under generic
ap_listen_rec flags field to hold both these options.

* include/ap_listen.h: Add AP_LISTEN_* flags.
  (ap_listen_rec): Rename use_specific_errors to flags.

* server/listen.c (make_sock): Set APR_SO_FREEBIND if
  AP_LISTEN_FREEBIND flag is set on listener.
  (alloc_listener): Take flags argument.
  (ap_setup_listeners): Set AP_LISTEN_SPECIFIC_ERRORS flag here.
  (ap_set_listener): Parse optional options=... argument.
  (ap_duplicate_listeners): Duplicate flags.

Submitted by: jkaluza, Lubos Uhliarik , jorton
PR: 61865

diff --git a/include/ap_listen.h b/include/ap_listen.h
index 9cbaaa4910..2329cae70c 100644
--- a/include/ap_listen.h
+++ b/include/ap_listen.h
@@ -38,6 +38,11 @@ typedef struct ap_slave_t ap_slave_t;
 typedef struct ap_listen_rec ap_listen_rec;
 typedef apr_status_t (*accept_function)(void **csd, ap_listen_rec *lr, 
apr_pool_t *ptrans);
 
+/* Flags for ap_listen_rec.flags */
+#define AP_LISTEN_SPECIFIC_ERRORS (0x0001)
+#define AP_LISTEN_FREEBIND(0x0002)
+#define AP_LISTEN_REUSEPORT   (0x0004)
+
 /**
  * @brief Apache's listeners record.
  *
@@ -73,10 +78,9 @@ struct ap_listen_rec {
 ap_slave_t *slave;
 
 /**
- * Allow the accept_func to return a wider set of return codes
+ * Various AP_LISTEN_* flags.
  */
-int use_specific_errors;
-
+apr_uint32_t flags;
 };
 
 /**
diff --git a/include/ap_mmn.h b/include/ap_mmn.h
index 24ac648ac9..1271ce18ed 100644
--- a/include/ap_mmn.h
+++ b/include/ap_mmn.h
@@ -628,14 +628,15 @@
  * 20200331.2 (2.5.1-dev)  Add ap_proxy_should_override to mod_proxy.h
  * 20200331.3 (2.5.1-dev)  Add ap_parse_request_line() and
  * ap_check_request_header()
+ * 20200420.0 (2.5.1-dev)  Add flags to listen_rec in place of 
use_specific_errors
  */
 
 #define MODULE_MAGIC_COOKIE 0x41503235UL /* "AP25" */
 
 #ifndef MODULE_MAGIC_NUMBER_MAJOR
-#define MODULE_MAGIC_NUMBER_MAJOR 20200331
+#define MODULE_MAGIC_NUMBER_MAJOR 20200420
 #endif
-#define MODULE_MAGIC_NUMBER_MINOR 3/* 0...n */
+#define MODULE_MAGIC_NUMBER_MINOR 0/* 0...n */
 
 /**
  * Determine if the server's current MODULE_MAGIC_NUMBER is at least a
diff --git a/os/unix/unixd.c b/os/unix/unixd.c
index bde859022b..3b0e695727 100644
--- a/os/unix/unixd.c
+++ b/os/unix/unixd.c
@@ -323,7 +323,7 @@ AP_DECLARE(apr_status_t) ap_unixd_accept(void **accepted, 
ap_listen_rec *lr,
 }
   
 /* Let the caller handle slightly more varied return values */
-if (lr->use_specific_errors && ap_accept_error_is_nonfatal(status)) { 
+if ((lr->flags & AP_LISTEN_SPECIFIC_ERRORS) && 
ap_accept_error_is_nonfatal(status)) {
 return status;
 }
 
diff --git a/server/listen.c b/server/listen.c
index 87a4bafe80..f9c8ff5a54 100644
--- a/server/listen.c
+++ b/server/listen.c
@@ -150,7 +150,8 @@ static apr_status_t make_sock(apr_pool_t *p, ap_listen_rec 
*server, int do_bind_
 #endif
 
 #if defined(SO_REUSEPORT)
-if (ap_have_so_reuseport && ap_listencbratio > 0) {
+if (server->flags & AP_LISTEN_REUSEPORT
+|| (ap_have_so_reuseport && ap_listencbratio > 0)) {
 int thesock;
 apr_os_sock_get(&thesock, s);
 if (setsockopt(thesock, SOL_SOCKET, SO_REUSEPORT,
@@ -166,6 +167,21 @@ static apr_status_t make_sock(apr_pool_t *p, ap_listen_rec 
*server, int do_bind_
 }
 #endif
 
+
+#if defined(APR_SO_FREEBIND)
+if (server->flags & AP_LISTEN_FREEBIND) {
+if (apr_socket_opt_set(s, APR_SO_FREEBIND, one) < 0) {
+stat = apr_get_netos_error();
+ap_log_perror(APLOG_MARK, APLOG_CRIT, stat, p, APLOGNO()
+  "make_sock: apr_socket_opt_set: "
+  "error setting APR_SO_FREEBIND");
+apr_socket_close(s);
+return stat;
+}
+}
+#endif
+
+
 if (do_bind_listen) {
 #if APR_HAVE_IPV6
 if (server->bind_addr->family == APR_INET6) {
@@ -467,7 +483,7 @@ static int find_listeners(ap_listen_rec **from, 
ap_listen_rec **to,
 static const char *alloc_listener(process_rec *proce

Re: [RFC] optional Listen options= argument

2020-04-21 Thread Joe Orton
On Tue, Apr 21, 2020 at 03:37:04PM +0200, Ruediger Pluem wrote:
> Looks good in general apart from the comment below.
> 
> On 4/21/20 2:53 PM, Joe Orton wrote:
> > @@ -1058,7 +1104,24 @@ AP_DECLARE_NONSTD(const char *) 
> > ap_set_listener(cmd_parms *cmd, void *dummy,
> >  return "Port must be specified";
> >  }
> >  
> > -if (argc != 2) {
> > +if (argc == 3) {
> > +if (strncasecmp(argv[2], "options=", 8)) {
> > +return "Third argument to Listen must be options=...";
> > +}
> > +
> > +err = parse_listen_flags(cmd->temp_pool, argv[2] + 8, &flags);
> > +if (err) {
> > +return err;
> > +}
> > +}
> > +
> > +if (argc == 2 && strncasecmp(argv[1], "options=", 8) == 0) {
> > +err = parse_listen_flags(cmd->temp_pool, argv[1] + 8, &flags);
> > +if (err) {
> > +return err;
> > +}
> > +}
> > +else if (argc < 3) {
> 
> Shouldn't that be argc < 2?

It is < 3 to make the second arg truly optional, so a proto default is 
picked either for the 1-arg form or for the 2-arg form where the second 
arg is options= (i.e. first if condition applies) i.e. 

  Listen 80 options=foo

Making the protocol option truly optional was one of the review feedback 
comments in the original thread.

No strong opinion here on whether that's right, but it reminds me that 
it leads to a fragile outcome, e.g.

  Listen 80 optons=foo

silently succeeds with "optons=foo" as the protocol name (not sure what 
it does with it).  If it is safe to assume "=" can never appear in a 
protocol name, we could catch any proto with "=" and make that a config 
error again.

Alternatively, can simply make the protocol a required option if 
options= is used.

Regards, Joe



[RFC] optional Listen options= argument

2020-04-21 Thread Joe Orton
A previous conversation [1] about optionally enabling socket options for 
Listen seemed to gain consensus around adding an optional argument, 
rather than multiple alternative Listen-like directives.

I've attached a patch based on work by both Jan Kaluza and Lubos 
Uhliarik which implements this, updated for trunk.  Syntax used is:

  Listen [IP-address:]portnumber [protocol] [options=keyword,keyword,...]

where options is a comma-separated list of keywords.  In this patch the 
"reuseport" and "freebind" keywords are supported for APR_SO_REUSEPORT 
and APR_SO_FREEBIND respectively.  Key/value keywords could be used to 
allow per-listener backlog, TCP buffer sizes etc, though non-boolean 
options will require extending ap_listen_rec so that needs care.

Opinions?  Is there still consensus that extending Listen like this is a 
good idea?

Regards, Joe

[1] 
http://mail-archives.apache.org/mod_mbox/httpd-dev/201603.mbox/%3c56dd68e5.2040...@redhat.com%3e
diff --git a/include/ap_listen.h b/include/ap_listen.h
index 9cbaaa4910..2329cae70c 100644
--- a/include/ap_listen.h
+++ b/include/ap_listen.h
@@ -38,6 +38,11 @@ typedef struct ap_slave_t ap_slave_t;
 typedef struct ap_listen_rec ap_listen_rec;
 typedef apr_status_t (*accept_function)(void **csd, ap_listen_rec *lr, 
apr_pool_t *ptrans);
 
+/* Flags for ap_listen_rec.flags */
+#define AP_LISTEN_SPECIFIC_ERRORS (0x0001)
+#define AP_LISTEN_FREEBIND(0x0002)
+#define AP_LISTEN_REUSEPORT   (0x0004)
+
 /**
  * @brief Apache's listeners record.
  *
@@ -73,10 +78,9 @@ struct ap_listen_rec {
 ap_slave_t *slave;
 
 /**
- * Allow the accept_func to return a wider set of return codes
+ * Various AP_LISTEN_* flags.
  */
-int use_specific_errors;
-
+apr_uint32_t flags;
 };
 
 /**
diff --git a/include/ap_mmn.h b/include/ap_mmn.h
index 24ac648ac9..1271ce18ed 100644
--- a/include/ap_mmn.h
+++ b/include/ap_mmn.h
@@ -628,14 +628,15 @@
  * 20200331.2 (2.5.1-dev)  Add ap_proxy_should_override to mod_proxy.h
  * 20200331.3 (2.5.1-dev)  Add ap_parse_request_line() and
  * ap_check_request_header()
+ * 20200420.0 (2.5.1-dev)  Add flags to listen_rec in place of 
use_specific_errors
  */
 
 #define MODULE_MAGIC_COOKIE 0x41503235UL /* "AP25" */
 
 #ifndef MODULE_MAGIC_NUMBER_MAJOR
-#define MODULE_MAGIC_NUMBER_MAJOR 20200331
+#define MODULE_MAGIC_NUMBER_MAJOR 20200420
 #endif
-#define MODULE_MAGIC_NUMBER_MINOR 3/* 0...n */
+#define MODULE_MAGIC_NUMBER_MINOR 0/* 0...n */
 
 /**
  * Determine if the server's current MODULE_MAGIC_NUMBER is at least a
diff --git a/os/unix/unixd.c b/os/unix/unixd.c
index bde859022b..3b0e695727 100644
--- a/os/unix/unixd.c
+++ b/os/unix/unixd.c
@@ -323,7 +323,7 @@ AP_DECLARE(apr_status_t) ap_unixd_accept(void **accepted, 
ap_listen_rec *lr,
 }
   
 /* Let the caller handle slightly more varied return values */
-if (lr->use_specific_errors && ap_accept_error_is_nonfatal(status)) { 
+if ((lr->flags & AP_LISTEN_SPECIFIC_ERRORS) && 
ap_accept_error_is_nonfatal(status)) {
 return status;
 }
 
diff --git a/server/listen.c b/server/listen.c
index 87a4bafe80..3657ba65c7 100644
--- a/server/listen.c
+++ b/server/listen.c
@@ -150,7 +150,8 @@ static apr_status_t make_sock(apr_pool_t *p, ap_listen_rec 
*server, int do_bind_
 #endif
 
 #if defined(SO_REUSEPORT)
-if (ap_have_so_reuseport && ap_listencbratio > 0) {
+if (server->flags & AP_LISTEN_REUSEPORT
+|| (ap_have_so_reuseport && ap_listencbratio > 0)) {
 int thesock;
 apr_os_sock_get(&thesock, s);
 if (setsockopt(thesock, SOL_SOCKET, SO_REUSEPORT,
@@ -166,6 +167,21 @@ static apr_status_t make_sock(apr_pool_t *p, ap_listen_rec 
*server, int do_bind_
 }
 #endif
 
+
+#if defined(APR_SO_FREEBIND)
+if (server->flags & AP_LISTEN_FREEBIND) {
+if (apr_socket_opt_set(s, APR_SO_FREEBIND, one) < 0) {
+stat = apr_get_netos_error();
+ap_log_perror(APLOG_MARK, APLOG_CRIT, stat, p, APLOGNO()
+  "make_sock: apr_socket_opt_set: "
+  "error setting APR_SO_FREEBIND");
+apr_socket_close(s);
+return stat;
+   }
+   }
+#endif
+
+
 if (do_bind_listen) {
 #if APR_HAVE_IPV6
 if (server->bind_addr->family == APR_INET6) {
@@ -467,7 +483,7 @@ static int find_listeners(ap_listen_rec **from, 
ap_listen_rec **to,
 static const char *alloc_listener(process_rec *process, const char *addr,
   apr_port_t port, const char* proto,
   const char *scope_id, void *slave,
-  apr_pool_t *temp_pool)
+  apr_pool_t *temp_pool, apr_uint32_t flags)
 {
 ap_listen_rec *last;
 apr_status_t status;
@@ -511,6 +527,7 @@ static const char *alloc_listener(process_rec *process, 
const char *addr,
 new->next = 0;
 new->bind_addr = sa;
 new

Re: svn commit: r1876619 - /httpd/httpd/trunk/modules/core/mod_watchdog.c

2020-04-20 Thread Joe Orton
On Fri, Apr 17, 2020 at 07:22:00PM +0200, Yann Ylavic wrote:
> On Thu, Apr 16, 2020 at 7:55 PM  wrote:
> >
> > Author: jorton
> > Date: Thu Apr 16 17:55:48 2020
> > New Revision: 1876619
> >
> > URL: http://svn.apache.org/viewvc?rev=1876619&view=rev
> > Log:
> > * modules/core/mod_watchdog.c (wd_worker): Fix crashes snuck into
> >   r1876599 where a destroyed pool was reused.  Rename the "ctx"
> >   variable to reflect its purpose.  Also tweak the pool tags.
> 
> Sorry for messing up with that, I didn't notice that
> apr_pool_destroy() was used :/

No worries, I was quite confused for a while by the horrific failures 
which couldn't possibly be caused by the "just add the pool tags" 
commit, but this is why we have Travis ;)

> Hopefully my intent is better addressed in r1876675..

+1 nice, thanks.

And BTW I have not seen a single failure of the filter tests (which had 
been haunting the CI) after the ap_request_core_filter removal commit, 
so huge thanks again also for that.

Regards, Joe



Re: svn commit: r1876511 - /httpd/httpd/trunk/modules/core/mod_watchdog.c

2020-04-17 Thread Joe Orton
On Wed, Apr 15, 2020 at 09:50:01AM -0400, Jim Jagielski wrote:
> very, very elegant.

Thank you Jim!

I wonder if the new state variable is redundant with the other state 
variables in the watchdog structure?  I don't understand the watchdog 
model well enough to be confident here.

 struct ap_watchdog_t
{
apr_uint32_t  thread_started; /* set to 1 once thread running */
...
int   singleton;
int   active;


Also reviewing this module more, in the post_config hook:

http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/core/mod_watchdog.c?annotate=1876619#l434

it looks to me like the ap_state_query() call correctly guards against 
multiple post_config runs during start, so there will never be a point 
where the wd_server_conf attached to pconf is reused or reusable.

- in initial startup you get only one post_config invocation if the 
_PRE_CONFIG phase is skipped

- during subsequent server reloads pconf is cleared each time

I may be missing something.

Regards, Joe

> 
> > On Apr 14, 2020, at 8:37 AM, jor...@apache.org wrote:
> > 
> > Author: jorton
> > Date: Tue Apr 14 12:37:17 2020
> > New Revision: 1876511
> > 
> > URL: http://svn.apache.org/viewvc?rev=1876511&view=rev
> > Log:
> > * modules/core/mod_watchdog.c: Switch to simpler logic to avoid the
> >  thread cleanup running before the thread has started, avoiding
> >  mutex operations which both have undefined behaviour:
> > 
> >  a) double-locking an UNNESTED (non-recursive) mutex twice in the parent
> >  b) unlocking a mutex in the spawned thread which was locked by the parent
> > 
> >  (wd_startup, wd_worker_cleanup, wd_worker): Use a boolean to ensure
> >  the cleanup does nothing if the thread wasn't started, drop the mutex.
> > 
> > Modified:
> >httpd/httpd/trunk/modules/core/mod_watchdog.c
> > 
> > Modified: httpd/httpd/trunk/modules/core/mod_watchdog.c
> > URL: 
> > http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/core/mod_watchdog.c?rev=1876511&r1=1876510&r2=1876511&view=diff
> > ==
> > --- httpd/httpd/trunk/modules/core/mod_watchdog.c (original)
> > +++ httpd/httpd/trunk/modules/core/mod_watchdog.c Tue Apr 14 12:37:17 2020
> > @@ -24,6 +24,8 @@
> > #include "http_core.h"
> > #include "util_mutex.h"
> > 
> > +#include "apr_atomic.h"
> > +
> > #define AP_WATCHDOG_PGROUP"watchdog"
> > #define AP_WATCHDOG_PVERSION  "parent"
> > #define AP_WATCHDOG_CVERSION  "child"
> > @@ -43,7 +45,7 @@ struct watchdog_list_t
> > 
> > struct ap_watchdog_t
> > {
> > -apr_thread_mutex_t   *startup;
> > +apr_uint32_t  thread_started; /* set to 1 once thread running 
> > */
> > apr_proc_mutex_t *mutex;
> > const char   *name;
> > watchdog_list_t  *callbacks;
> > @@ -74,6 +76,10 @@ static apr_status_t wd_worker_cleanup(vo
> > apr_status_t rv;
> > ap_watchdog_t *w = (ap_watchdog_t *)data;
> > 
> > +/* Do nothing if the thread wasn't started. */
> > +if (apr_atomic_read32(&w->thread_started) != 1)
> > +return APR_SUCCESS;
> > +
> > if (w->is_running) {
> > watchdog_list_t *wl = w->callbacks;
> > while (wl) {
> > @@ -110,7 +116,8 @@ static void* APR_THREAD_FUNC wd_worker(a
> > w->pool = apr_thread_pool_get(thread);
> > w->is_running = 1;
> > 
> > -apr_thread_mutex_unlock(w->startup);
> > +apr_atomic_set32(&w->thread_started, 1); /* thread started */
> > +
> > if (w->mutex) {
> > while (w->is_running) {
> > if (ap_mpm_query(AP_MPMQ_MPM_STATE, &mpmq_s) != APR_SUCCESS) {
> > @@ -264,10 +271,7 @@ static apr_status_t wd_startup(ap_watchd
> > {
> > apr_status_t rc;
> > 
> > -/* Create thread startup mutex */
> > -rc = apr_thread_mutex_create(&w->startup, APR_THREAD_MUTEX_UNNESTED, 
> > p);
> > -if (rc != APR_SUCCESS)
> > -return rc;
> > +apr_atomic_set32(&w->thread_started, 0);
> > 
> > if (w->singleton) {
> > /* Initialize singleton mutex in child */
> > @@ -277,22 +281,12 @@ static apr_status_t wd_startup(ap_watchd
> > return rc;
> > }
> > 
> > -/* This mutex fixes problems with a fast start/fast end, where the pool
> > - * cleanup was being invoked before the thread completely spawned.
> > - */
> > -apr_thread_mutex_lock(w->startup);
> > -apr_pool_pre_cleanup_register(p, w, wd_worker_cleanup);
> > -
> > /* Start the newly created watchdog */
> > rc = apr_thread_create(&w->thread, NULL, wd_worker, w, p);
> > -if (rc) {
> > -apr_pool_cleanup_kill(p, w, wd_worker_cleanup);
> > +if (rc == APR_SUCCESS) {
> > +apr_pool_pre_cleanup_register(p, w, wd_worker_cleanup);
> > }
> > 
> > -apr_thread_mutex_lock(w->startup);
> > -apr_thread_mutex_unlock(w->startup);
> > -apr_thread_mutex_destroy(w->startup);
> > -
> > return rc;
> > }
> > 
> > 
> > 
> 



Re: svn commit: r1875947 - in /httpd/httpd/trunk: include/http_request.h modules/http/http_core.c modules/http/http_request.c server/core.c server/core_filters.c server/request.c server/util_filter.c

2020-04-02 Thread Joe Orton
On Thu, Apr 02, 2020 at 10:58:21AM +0200, Yann Ylavic wrote:
> On Thu, Apr 2, 2020 at 6:39 AM Ruediger Pluem  wrote:
> >
> > > +#define AP_BUCKET_IS_MORPHING(e)((e)->length == (apr_size_t)-1)
> >
> > Nitpick: After having a second thought on the whole thing, I think the 
> > above name is misleading to some extend. If MMAP is enabled
> > a file bucket is also a morphing bucket as a read on it causes the bucket 
> > to split in an MMAP bucket and a shorter file bucket.
> > A MMAP bucket consumes at least address space and I vaguely remember cases 
> > from the past (back in 32 bit times) where filters that
> > processed (doing apr_bucket_read) a whole brigade at once without passing 
> > results down the chain on a regular basis caused the
> > address space to be exhausted by MMAP buckets if the file bucket was huge.
> 
> How about a less subjective:
>   #define AP_BUCKET_HAS_LENGTH(e) ((e)->length != (apr_size_t)-1)
> ?
>
> If find it quite painful to type e->length == (apr_size_t)-1, though I
> could simply use -1 and rely on unsigned promotion..

Formally this is *determinate* length - all buckets have a length just 
some of them are indeterminate.

#define AP_BUCKET_DETERMINATE_LENGTH(e) ((e)->length != (apr_size_t)-1)
 
but it is quite a mouthful and exactly the same number of characters as 
writing it out longhand.  No strong opinion on having a macro for it, 
probably should be in APR tho ;)




Re: svn commit: r1875881 - /httpd/httpd/trunk/modules/ssl/ssl_engine_io.c

2020-04-02 Thread Joe Orton
On Thu, Apr 02, 2020 at 10:37:01AM +0200, Ruediger Pluem wrote:
> Sounds reasonable. Thanks for explaining. Do you fix <= vs < and > vs >= or 
> should I?

http://svn.apache.org/viewvc?view=revision&revision=1876037

I hope I have tweaked this to death now, but more review definitely 
welcome since I found another way to (prematurely?) optimize the logic 
after trawling through some debug logs.  Also really need some robust 
tests for this since it's a pain to validate the behaviour across 
various edge cases.

(and I am even more convinced this filter really should not exist and 
the buffering could be done more efficiently in ssl_io_filter_output but 
I will have to wait for a spare few days to try that...)

Regards, Joe



Re: svn commit: r1875881 - /httpd/httpd/trunk/modules/ssl/ssl_engine_io.c

2020-04-02 Thread Joe Orton
On Thu, Apr 02, 2020 at 07:30:50AM +0200, Ruediger Pluem wrote:
> On 4/1/20 11:44 PM, Mike Rumph wrote:
> > I'm not very familiar with this code, so my questions might not make sense.
> > 
> > But take a look at the following two code segments in 
> > ssl_io_filter_coalesce():
> > 
> >     for (e = APR_BRIGADE_FIRST(bb);
> >          e != APR_BRIGADE_SENTINEL(bb)
> >              && !APR_BUCKET_IS_METADATA(e)
> >              && e->length != (apr_size_t)-1
> >              && e->length < COALESCE_BYTES
> >              && (buffered + bytes + e->length) < COALESCE_BYTES;
> > 
> > and
> > 
> >         if (rv == APR_SUCCESS) {
> >             /* If the read above made the bucket morph, it may now fit
> >              * entirely within the buffer.  Otherwise, split it so it does
> >              * fit. */
> >             if (e->length >= COALESCE_BYTES
> >                 || e->length + buffered + bytes >= COALESCE_BYTES) {
> >                 rv = apr_bucket_split(e, COALESCE_BYTES - (buffered + 
> > bytes));
> >             }
> > 
> > Does this mean that either buffered or bytes might be negative?
> > Otherwise, testing both "e->length" and  "e->length + buffered + bytes" 
> > against  COALESCE_BYTES seems redundant. 
> 
> I guess they are some sort redundant. Joe?

I was being paranoid about unsigned integer overflow when writing this.  
(e->length + X) may overflow, comparing e->length against COALESCE_BYTES 
in both cases prevents that since max(bytes + buffered)=COALESCE_BYTES.

There is nothing preventing someone creating a bucket (e.g. FILE) with 
e->length = (apr_size_t)-2 and passing it through the filter stack 
AFAIK, so I think paranoia is reasonable.

But now you made me read the code *again* I think those tests should be 
e->length <= COALESCE_BYTES and e->length > COALESCE_BYTES etc. 

And I thought this would be a simple patch...  Very much appreciate all 
the review!

Regards, Joe



Re: svn commit: r1875881 - /httpd/httpd/trunk/modules/ssl/ssl_engine_io.c

2020-04-01 Thread Joe Orton
On Wed, Apr 01, 2020 at 02:04:21PM +0200, Ruediger Pluem wrote:
> On 3/30/20 3:18 PM, jor...@apache.org wrote:
> >  
> > -rv = apr_bucket_split(e, COALESCE_BYTES - (buffered + bytes));
> > +/* If the read above made the bucket morph, it may now fit
> > + * entirely within the buffer.  Otherwise, split it so it does
> > + * fit. */
> > +if (e->length < COALESCE_BYTES
> > +&& e->length + buffered + bytes < COALESCE_BYTES) {
> > +rv = APR_SUCCESS;
> 
> Hmm. If we had e->length == -1 above and the bucket read failed, e might 
> still be the morphing bucket and hence e->length == -1.
> I think all the code below assumes e->length >= 0 things can get off the 
> rails.

Thanks a lot for the review... I tried to keep that as simple as 
possible but there are too many cases to cover.  Yep, you're right.

> How about the following patch (minus whitespace changes) to fix this:

+1 that looks correct to me, please commit (or I can...)
 
> Index: ssl_engine_io.c
> ===
> --- ssl_engine_io.c   (revision 1875997)
> +++ ssl_engine_io.c   (working copy)
> @@ -1739,7 +1739,7 @@
>  && bytes + buffered < COALESCE_BYTES
>  && e != APR_BRIGADE_SENTINEL(bb)
>  && !APR_BUCKET_IS_METADATA(e)) {
> -apr_status_t rv;
> +apr_status_t rv = APR_SUCCESS;
> 
>  /* For an indeterminate length bucket (PIPE/CGI/...), try a
>   * non-blocking read to have it morph into a HEAP.  If the
> @@ -1753,17 +1753,16 @@
>  if (rv != APR_SUCCESS && !APR_STATUS_IS_EAGAIN(rv)) {
>  ap_log_cerror(APLOG_MARK, APLOG_ERR, rv, f->c, APLOGNO(10232)
>"coalesce failed to read from data bucket");
> +return AP_FILTER_ERROR;
>  }
>  }
> 
> +if (rv == APR_SUCCESS) {
>  /* If the read above made the bucket morph, it may now fit
>   * entirely within the buffer.  Otherwise, split it so it does
>   * fit. */
> -if (e->length < COALESCE_BYTES
> -&& e->length + buffered + bytes < COALESCE_BYTES) {
> -rv = APR_SUCCESS;
> -}
> -else {
> +if (e->length >= COALESCE_BYTES
> +|| e->length + buffered + bytes >= COALESCE_BYTES) {
>  rv = apr_bucket_split(e, COALESCE_BYTES - (buffered + bytes));
>  }
> 
> @@ -1787,6 +1786,7 @@
>  return AP_FILTER_ERROR;
>  }
>  }
> +}
> 
>  upto = e;
> 
> 
> Regards
> 
> Rüdiger
> 



Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-04-01 Thread Joe Orton
On Wed, Apr 01, 2020 at 10:06:20AM +0200, Ruediger Pluem wrote:
> 
> 
> On 4/1/20 9:53 AM, Joe Orton wrote:
> > On Wed, Apr 01, 2020 at 09:24:27AM +0200, Ruediger Pluem wrote:
> >> I have checked socket, pipe and cgi buckets and none of them seem to 
> >> return EOF. In case they hit EOF they replace themselves with
> >> an immortal bucket of length 0. So above seems just an additional safety 
> >> if a (future) morphing bucket behaves differently and
> >> returns EOF, but with the current code that path should not be hit really.
> > 
> > Yeah, the "read till EOF" is an implementation detail for those bucket 
> > types, I would very strongly argue if they ever exposed EOF on read() it 
> > would be a bucket type bug.  It could quite possibly obscure a bug 
> > elsewhere if filters ignored EOF on read() so I don't think that should 
> > be recommended.
> 
> So you recommend that part of the patch to be reverted?

Thinking about it more, the most likely way to hit that condition is a 
FILE bucket where the underlying file is truncated simultaneous to it 
being read (i.e. it's shorter than when the bucket was created).  That 
will definitely result in apr_bucket_read() returning EOF, and I think 
should definitely be treated as an error rather than silently ignored.

TL;DR -> yes, or am I missing something?



Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-04-01 Thread Joe Orton
On Wed, Apr 01, 2020 at 09:24:27AM +0200, Ruediger Pluem wrote:
> I have checked socket, pipe and cgi buckets and none of them seem to return 
> EOF. In case they hit EOF they replace themselves with
> an immortal bucket of length 0. So above seems just an additional safety if a 
> (future) morphing bucket behaves differently and
> returns EOF, but with the current code that path should not be hit really.

Yeah, the "read till EOF" is an implementation detail for those bucket 
types, I would very strongly argue if they ever exposed EOF on read() it 
would be a bucket type bug.  It could quite possibly obscure a bug 
elsewhere if filters ignored EOF on read() so I don't think that should 
be recommended.



Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-31 Thread Joe Orton
On Tue, Mar 31, 2020 at 01:19:08AM +0200, Yann Ylavic wrote:
> FWIW, attached is v2 with simpler "batching" in
> ap_filter_setaside_brigade(), no spurious hunk, bb cleaned up at the
> end, and APR_ENOTIMPL when called from a filter < AP_FTYPE_CONNECTION.

FWIW this makes more sense to me than what's in trunk, so +1 here too.




Re: Solaris, prefork, accept mutex and mod_ext_filter (Was: Prefork MPM and mod_watchdog)

2020-03-30 Thread Joe Orton
On Mon, Mar 30, 2020 at 12:31:05AM +0200, Rainer Jung wrote:
> I can now see the same problem on Linux (eg. RHEL 7, SLES 12 and SLES 15)
> when doing testing for 2.4.43. I think it is not a regression and for me it
> is not a showstopper, but something that would be nice to fix. Symptom is
> 
> [Sat Mar 28 15:53:21.121461 2020] [mpm_prefork:debug] [pid 4574]
> prefork.c(919): AH00165: Accept mutex: pthread (default: pthread)
> [Sat Mar 28 15:58:19.251858 2020] [mpm_prefork:emerg] [pid 6509] (22)Invalid
> argument: AH00144: couldn't grab the accept mutex
> [Sat Mar 28 15:58:20.624995 2020] [mpm_prefork:emerg] [pid 4902] (22)Invalid
> argument: AH00146: couldn't release the accept mutex
> [Sat Mar 28 15:58:21.410552 2020] [:emerg] [pid 4574] AH02818: MPM run
> failed, exiting
> 
> happening during t/modules/ext_filter.t and as a result all httpd processes
> are gone.
>
> Although it happens only rarely, I think it does not happen when using APR
> 1.6, but only for 1.7 (1.7.0 and also for latest head). The accept mutex is
> pthread based. There are changes in APR 1.7 for those.

This is interesting, I saw some similar mutex state snafu (Fedora 31) 
but assumed it was some kind of PEBKAC issue and moved on.  You have 
only seen this with prefork?

May be completely unrelated but since I added a grep for child segfaults 
in error_log that has triggered with prefork at least once:

https://travis-ci.org/github/apache/httpd/jobs/665874906

Regards, Joe

> Since I only observe it for mod_ext_filter, there should be some dependency
> with the forked perl process.
> 
> I didn't yet have the opportunity to check, how much of the follwing
> description for Solaris also holds for Linux.
> 
> Regards,
> 
> Rainer
> 
> Am 03.02.2019 um 13:30 schrieb Rainer Jung:
> > I can now frequently reproduce running t/modules/ext_filter.t only. I
> > stripped the reproducer down to the part of t/modules/ext_filter.t where
> > it runs
> > 
> > POST "/apache/extfilter/out-limit/modules/cgi/perl_echo.pl", content =>
> > "foo and bar\n"
> > 
> > 10 times in a row. If I increase the iteration count slightly to 20, the
> > problem happens nearly every time. I also replaced perl_echo.pl and
> > eval-cmd.pl by small C programs doing the echo and the s/foo/bar/ and
> > can still reproduce.
> > 
> > This test involved mod_ext_filter and LimitRequestBody.
> > 
> > It seems I can not reproduce, if I shorten the POST body, so that it no
> > longer hits the LimitRequestBody 6 configured for
> > /apache/extfilter/out-limit/.
> > 
> > What happens, but I do not understand is:
> > 
> > - the first few requests are handled by two children in an alternating
> > way. I can see the accept mutex being passed between these two children
> > using lwp_mutex_timedlock(ADDRESS, 0x) and
> > lwp_mutex_unlock(ADDRESS) using truss (Solaris variant of strace). So
> > Solaris seems to implement the pthread mutexes via these lwp mutexes.
> > 
> > - after a few requests alternating between the two children, something
> > strange happens:
> > 
> >    - child A returns from port_getn() for 22 (probably part of the
> > accept() impl)
> >    - child A calls accept()
> >    - child A unlocks the accept mutex using lwp_mutex_unlock()
> >    - child B locks the accept mutex using lwp_mutex_timedlock()
> >    - child B calls port_associate for 22 (probably part of accept() impl)
> >    - child A handles the request
> > ! - child A also calls port_associate for 22 (no sign of lock acquire)
> > ! - child A returns from port_getn() for 22
> >    - child A calls accept() and starts to handle the connection
> > ! - child B also returns from port_getn() for 22
> > ! - child B gets EAGAIN for the accept()
> >    - child B calls port_associate for 22
> >    - child A handles the request
> > ! - child A gets EDEADLK for pthread_mutex_lock() (this is from the
> > error_log; there's no system call for this instance of
> > pthread_mutex_lock() in the truss output). EDEADLK for
> > pthread_mutex_lock() means that this process already has that lock.
> >    - child B returns from port_getn() for 22
> >    - child B calls accept() and starts to handle the connection
> >    - child A exits (now B is the only child left)
> >    - child B proceeds request handling
> >    - child B does all further port_associate(), port_getn(), and
> > accept() plus connection and request handling. No more
> > lwp_mutex_timedlock() or lwp_mutex_unlock() for B, maybe due to some
> > optimization when only one process for a lock is left.
> > 
> > 
> > It is quite possible, that there is an underlying lwp_mutex or portfs
> > bug here. But it is strange, that this only comes up when used with
> > mod_ext_filter in combination with LimitRequestBody.
> > 
> > There was the fix
> > 
> > https://bz.apache.org/bugzilla/show_bug.cgi?id=60375
> > 
> > but I don't see, how this would influence the above list.
> > 
> > I can try to further narrow down (using static content to eliminate one
> > forked child; using L

Re: [VOTE] Release httpd-2.4.43

2020-03-26 Thread Joe Orton
On Thu, Mar 26, 2020 at 09:50:12AM -0500, Daniel Ruggeri wrote:
> Hi, all;
>    Please find below the proposed release tarball and signatures:
> https://dist.apache.org/repos/dist/dev/httpd/
> 
> I would like to call a VOTE over the next few days to release this
> candidate tarball as 2.4.43:
> [X] +1: It's not just good, it's good enough!
> [ ] +0: Let's have a talk.
> [ ] -1: There's trouble in paradise. Here's what's wrong.

+1 from me.

I don't know why my not-so-clever Travis tag matching boolean was 
defeated by this again, but CI passed on the branch, and tests pass 
locally.

Thanks twice for RMing!

Regards, Joe



Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-26 Thread Joe Orton
On Thu, Mar 26, 2020 at 01:11:10AM +0200, Graham Leggett wrote:
> The question you’re asking is “why is is an async path being taken 
> when AP_MPMQ_IS_ASYNC is false”. The setasides and reinstates should 
> be noops in this situation.

The "noop" path in ap_filter_setaside_brigade is when it is passed an 
empty brigade, right?  It is trivial to see this is not true for worker 
based off the logging for the tests which are triggering this:

sh-5.0$ > t/logs/error_log 
sh-5.0$ ./t/TEST t/apache/passbrigade.t &>/dev/null 
sh-5.0$ grep resuming t/logs/error_log 
[Thu Mar 26 11:26:28.632069 2020] [mpm_worker:notice] [pid 60770:tid 
139707977312256] AH00292: Apache/2.5.1-dev (Unix) OpenSSL/1.1.1d configured -- 
resuming normal operations
sh-5.0$ grep -c 'setaside full' t/logs/error_log 
392






Re: Broken: apache/httpd#515 (2.4.x - e936ddc)

2020-03-25 Thread Joe Orton
On Wed, Mar 25, 2020 at 11:20:32AM +0100, Rainer Jung wrote:
> And now I notice that's of course not appropriate (everone needing to change
> the call to the Makefile. So I will add local defaults in the spirit of your
> below suggestion into Makefile.PL.

Thanks a lot, looks good now with r1875640!



Re: Broken: apache/httpd#515 (2.4.x - e936ddc)

2020-03-25 Thread Joe Orton
On Tue, Mar 24, 2020 at 11:35:38PM +0100, Rainer Jung wrote:
> Excellent. That gave me the right idea where to look at.
> 
> I found a way to pass additionals args inside Makefile.PL into
> Apache::TestMM::Argv which get automatically added to $vars in
> Apache::TestConfig. It seems to work well. Committed in r1875598.

Doesn't seem to work for me or in Travis -

https://travis-ci.org/github/apache/httpd/jobs/06723#L2462

I don't up with with any Argv defined for limitrequestline if either 
passing -apxs to Makefile.PL or not passing any argv.

starting w/below patch applied:

sh-5.0$ perl ./Makefile.PL -apxs foo &> /dev/null 
sh-5.0$ grep Argv t/TEST 
$Apache::TestConfig::Argv{'limitrequestline'} = q|128|;
$Apache::TestConfig::Argv{'apxs'} = q|foo|;
$Apache::TestConfig::Argv{'limitrequestlinex2'} = q|256|;
sh-5.0$ svn revert Makefile.PL 
Reverted 'Makefile.PL'
sh-5.0$ perl ./Makefile.PL -apxs foo &> /dev/null 
sh-5.0$ grep Argv t/TEST 
$Apache::TestConfig::Argv{'apxs'} = q|foo|;


Index: Makefile.PL
===
--- Makefile.PL (revision 1875618)
+++ Makefile.PL (working copy)
@@ -28,13 +28,12 @@
 # supported in an Apache::Test release.
 # Code borrowed from Apache::TestMM::filter_args().
 my %local_args = (
-limitrequestline => 'Value for LimitRequestLine',
-limitrequestlinex2 => 'Twice the value for LimitRequestLine',
+limitrequestline => '128',
+limitrequestlinex2 => '256',
 );
-my($argv, $vars) = Apache::TestConfig::filter_args(\@ARGV, \%local_args);
-@ARGV = @$argv;
-push(@Apache::TestMM::Argv, %$vars);
 
+push(@Apache::TestMM::Argv, %local_args);
+
 for my $script (@scripts) {
 Apache::TestMM::generate_script($script);
 }



> 
> Thanks and regards,
> 
> Rainer
> 
> Am 24.03.2020 um 18:35 schrieb Joe Ortqon:
> > On Tue, Mar 24, 2020 at 05:55:20PM +0100, Rainer Jung wrote:
> > > I've got the following problem: I want to use a new config var in
> > > Apache::Test as a patern to replace in extra.conf.in. I added the bvar to
> > > Apache::Test::Config, but it seems Travis uses only a released version of
> > > Apache::Test. Is there a way of influencing Apache::Test early from our 
> > > own
> > > scripts, so that I can set a default value for the new var?
> > 
> > This is the downside of relying on a released Apache::Test :(
> > 
> > If we can patch t/TEST I think we can set the defaults.  Not sure
> > if there is a clean way to do it but Makefile.PL is generating the file
> > so it's possible in theory.
> > 
> > If I apply this then ./t/TEST runs again with external Apache::Test
> > 
> > --- t/TEST~ 2020-03-12 11:46:26.823610447 +
> > +++ t/TEST  2020-03-24 17:31:43.225348563 +
> > @@ -9,6 +9,8 @@
> >   BEGIN { eval { require blib && blib->import; } }
> >   $Apache::TestConfig::Argv{'apxs'} = 
> > q|/home/jorton/src/asf/httpd-git/check/bin/apxs|;
> > +$Apache::TestConfig::Argv{'limitrequestline'} = q|128|;
> > +$Apache::TestConfig::Argv{'limitrequestlinex2'} = q|256|;
> >   use strict;
> >   use warnings FATAL => 'all';
> > @@ -19,4 +21,4 @@
> >   );
> > -use Apache::TestRun ();Apache::TestRun->new->run(@ARGV);
> > \ No newline at end of file
> > +use Apache::TestRun ();Apache::TestRun->new->run(@ARGV);
> > 
> > 
> > 
> > 
> > 
> > > 
> > > Error message:
> > > 
> > > [  error] configure() has failed:
> > > 
> > > invalid token: @limitrequestline@ in file
> > > /home/travis/build/apache/httpd/test/perl-framework/t/conf/extra.conf.in
> > > 
> > > I introduces the new variable in extra.conf.in in r1875569 and the needed
> > > code in Apache::TestConfig in r1875568.
> > > 
> > > Any help welcome. Otherwise I will revert and run with local 
> > > modifications.
> > > 
> > > Thanks and regards,
> > > 
> > > Rainer
> > > 
> > > Am 24.03.2020 um 14:30 schrieb Travis CI:
> > > > apache
> > > > 
> > > > /
> > > > 
> > > > httpd
> > > > 
> > > > 
> > > > 
> > > > 
> > > > branch icon2.4.x 
> > > > 
> > > > build has failed
> > > > Build #515 was broken 
> > > > 
> > > > arrow to build time
> > > > clock icon9 mins and 52 secs
> > > > 
> > > > Jim Jagielski avatarJim Jagielski
> > > > 
> > > > e936ddc CHANGESET →
> > > > 
> > > > 
> > > > 2.4.42 was DOA
> > > > 
> > > > 
> > > > git-svn-id:
> > > > https://svn.apache.org/repos/asf/httpd/httpd/branches/2.4.x@1875576
> > > > 13f79535-47bb-0310-9956-ffa450edef68
> > > > 
> > > > Want to know about upcoming build environment updates?
> > > > 
> > > > Would you like to stay up-to-date with the upcoming Travis CI build
> > > > environment updates? We set up a mailing list for you!
> > > > 
> > > > SIGN UP HERE 
> > > > 
> > > > book icon
> > > > 
> > > > Documentatio

Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-24 Thread Joe Orton
On Tue, Mar 24, 2020 at 12:35:45AM +0200, Graham Leggett wrote:
> The most likely reason is because:
> 
> a) http://httpd.apache.org/docs/trunk/mod/core.html#asyncfilter defaults to 
> request, meaning request filters are expected to support async filters; and
> 
> b) the worker MPM doesn't call the ap_run_output_pending() hook, and 
> so by definition does not support async filters.
> 
> The fix (as far as I can see) is for the worker MPM to run the 
> ap_run_output_pending() hook.

Only 3/8 trunk MPMs have that call, so those are all broken right now?  
I know we have seen this fail for both prefork and worker in Travis.

> What happens if you set "AsyncFilter connection”? Does it skip the problem?

Are you unable to repro using the script I posted in the thread Ruediger 
referenced to confirm?  If this is the correct default I guess I'd ask 
why isn't this hard-coded - at least maybe for affected MPMs?  Maybe try 
it and see if the Travis failures go away.

Regards, Joe



Re: Use of [skip ci] in commit messages to avoid Travis builds

2020-03-23 Thread Joe Orton
On Sat, Feb 08, 2020 at 12:01:29PM +0100, Luca Toscano wrote:
> Hi everybody,
> 
> Travis is able to read commit messages and skip the CI workflow if the
> "[skip ci]" magic sequence is added somewhere. I keep forgetting about
> it too, but it would be nice if we could start using it to avoid using
> CI resources/workers when not needed (like docs changes, entries in
> STATUS, etc..). I didn't find a way to instruct Travis to avoid
> triggering a build if only certain file types are committed, so the
> only solution for the moment is to manually add the aforementioned
> sequence :(
> 
> It is not a big deal to trigger builds even for docs etc.., but the
> ASF resources/workers are limited (and shared among projects IIUC) so
> I am only suggesting to be good neighbors :) If this is totally insane
> or unacceptable I'll back off and stop vouching for it I promise!

I found a new option here while playing with conditionals, we can skip 
an entire build based on the commit message, it appears -

https://github.com/apache/httpd/pull/87/commits/7dd995c555ac0aea6cb3d58634750e2e076eb0cc

so this could catch a lot of the quite pointless Travis runs for STATUS 
changes by filtering by commit message, something like...

if: (branch != "2.4.x" and commit_message !~ /^[Tt]ransforms$/) or 
(branch = "2.4.x" and commit_message !~ 
/^([Pp]ropose|[Vv]ote|[Tt]ransforms)$/)

anything else??



async filters still borked (was Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis)

2020-03-20 Thread Joe Orton
On Fri, Mar 06, 2020 at 09:30:41AM +0100, Ruediger Pluem wrote:
> On 3/4/20 9:23 AM, jor...@apache.org wrote:
> > Author: jorton
> > Date: Wed Mar  4 08:23:55 2020
> > New Revision: 1874775
> > 
> > URL: http://svn.apache.org/viewvc?rev=1874775&view=rev
> > Log:
> > Update docs. The expr_string.t failure has not been seen since 
> > the workaround was added AFAICT.  The async filter bug
> > is still breaking the tests regularly. [skip ci].
> 
> Anyone found time to check
> 
> https://github.com/apache/httpd/pull/88
> 
> regarding the async filter bug?

[crickets]

We are two months on since discussing this first.  It seems quite 
worrying that the async filters stuff was introduced with apparently 
nasty bugs and yet is already bitrotting on trunk with not enough people 
to review and fix it.  Can we move the code to a branch until it is 
ready?

Regards, Joe



Re: Still Failing: apache/httpd#500 (2.4.x - 462a0bc)

2020-03-20 Thread Joe Orton
On Fri, Mar 20, 2020 at 12:28:31PM +, Travis CI wrote:
> Build Update for apache/httpd
> -
> 
> Build: #500
> Status: Still Failing

This is now only failing because of the apxs bug so I'll give Eric the 
pleasure of making it pass by merging the fix.



Re: Failed: apache/httpd#484 (2.4.42 - 5ad7f90)

2020-03-20 Thread Joe Orton
On Thu, Mar 19, 2020 at 04:27:06PM +, Joe Orton wrote:
> On Thu, Mar 19, 2020 at 08:32:25AM -0700, Mike Rumph wrote:
> > Build #484 fails on test #484.13, Linux Ubuntu, Regenerate ap_expr
> > +./buildconf --with-apr=/usr/bin/apr-1-config --with-regen-expr
> > unknown option  --with-regen-expr
> > The command "./test/travis run_${TRAVIS_OS_NAME}.sh" exited with 1.
> 
> It's something screwy with the conditional on the test, which should not 
> run for 2.4.x and is described as:
> 
> - if: branch != 2.4.x
>   name: Linux Ubuntu, Regenerate ap_expr
> 
> but this build is special because it's a tag.  Could adjust those tests 
> to be "if: branch = master" but then that's wrong when we branch 2.5.x.  
> Not obvious how to fix it right now.
> 
> Anyway, ignore the failure, everything else passed and it's good ;)

Urghh, apologies for the commit/build spam while I try to fix this.

I think the right way is probably:

   if: (branch IS present AND branch != 2.4.x) OR (tag IS present AND tag !~ 
/^2.4)

and

   if: (branch IS present AND branch = 2.4.x) OR (tag IS present AND tag =~ 
/^2.4)

but I did it the simpler match-against-trunk away.  And of course it 
should be "trunk" not "master" so r1875469 is also wrong but it's not 
giving false negatives for 2.4.x.




Re: Failed: apache/httpd#484 (2.4.42 - 5ad7f90)

2020-03-19 Thread Joe Orton
On Thu, Mar 19, 2020 at 08:32:25AM -0700, Mike Rumph wrote:
> Build #484 fails on test #484.13, Linux Ubuntu, Regenerate ap_expr
> +./buildconf --with-apr=/usr/bin/apr-1-config --with-regen-expr
> unknown option  --with-regen-expr
> The command "./test/travis run_${TRAVIS_OS_NAME}.sh" exited with 1.

It's something screwy with the conditional on the test, which should not 
run for 2.4.x and is described as:

- if: branch != 2.4.x
  name: Linux Ubuntu, Regenerate ap_expr

but this build is special because it's a tag.  Could adjust those tests 
to be "if: branch = master" but then that's wrong when we branch 2.5.x.  
Not obvious how to fix it right now.

Anyway, ignore the failure, everything else passed and it's good ;)



Re: httpd/test/framework location/mirroring (was Re: Errored: apache/httpd#357 (2.4.x - d56819e))

2020-02-27 Thread Joe Orton
On Thu, Feb 27, 2020 at 11:30:58AM +, Joe Orton wrote:
> and weirdly the tests failing everywhere in the same places:

Obviously worked it out just after writing this, duh.  It's because the 
svn2git process drops empty directories, which we need preserved.  Will 
see if infra can fix this in the mirroring, otherwise we could work 
around it.



Re: httpd/test/framework location/mirroring (was Re: Errored: apache/httpd#357 (2.4.x - d56819e))

2020-02-27 Thread Joe Orton
On Thu, Feb 20, 2020 at 09:04:21AM +, Joe Orton wrote:
> On Wed, Feb 19, 2020 at 08:20:20AM -0500, Eric Covener wrote:
> > On Wed, Feb 19, 2020 at 8:17 AM Joe Orton  wrote:
> > > OK so I will ask for httpd/test/framework to be mirrored to github, but
> > > first the really hard question - what to name it?  Assuming we can pick
> > > an arbitrary name, "httpd-tests" makes sense to me.
> > 
> > works for me
> 
> Thanks folks.  -> https://issues.apache.org/jira/browse/INFRA-19872

OK Infra set this up at https://github.com/apache/httpd-tests

I srarted using this in a PR 
https://travis-ci.org/apache/httpd/builds/655723094

and weirdly the tests failing everywhere in the same places:

Test Summary Report
---
t/modules/allowmethods.t  (Wstat: 0 Tests: 10 Failed: 1)
  Failed test:  10
t/modules/proxy_fcgi.t(Wstat: 0 Tests: 26 Failed: 1)
  Failed test:  26
  Parse errors: Bad plan.  You planned 42 tests but ran 26.

this seems extremely mysterious.  The arm build is still using verbose 
logging and the failures are:

# testing : Get - When Post/reset is allowed.
# expected: 200
# received: '405'
not ok 10
Failed 1/10 subtests 

and I guess all the php-fpm failures are because it is failing to run 
php-fpm properly, but, what has changed to make that happen?

[27-Feb-2020 10:16:12] ERROR: failed to open error_log 
(/home/travis/build/apache/httpd/test/perl-framework/t/php-fpm/log/php-fpm.log):
 No such file or directory (2)
[27-Feb-2020 10:16:12] ERROR: failed to post process the configuration
[27-Feb-2020 10:16:12] ERROR: FPM initialization failed
# Failed test 26 in t/modules/proxy_fcgi.t at line 245




Re: httpd/test/framework location/mirroring (was Re: Errored: apache/httpd#357 (2.4.x - d56819e))

2020-02-20 Thread Joe Orton
On Wed, Feb 19, 2020 at 08:20:20AM -0500, Eric Covener wrote:
> On Wed, Feb 19, 2020 at 8:17 AM Joe Orton  wrote:
> > OK so I will ask for httpd/test/framework to be mirrored to github, but
> > first the really hard question - what to name it?  Assuming we can pick
> > an arbitrary name, "httpd-tests" makes sense to me.
> 
> works for me

Thanks folks.  -> https://issues.apache.org/jira/browse/INFRA-19872



Re: httpd/test/framework location/mirroring (was Re: Errored: apache/httpd#357 (2.4.x - d56819e))

2020-02-19 Thread Joe Orton
On Mon, Feb 17, 2020 at 05:32:48PM -0500, Eric Covener wrote:
> > >> There is also the complicating factor of the svn:external for
> > >> https://svn.apache.org/repos/asf/perl/Apache-Test/trunk
> > >>
> > >
> > > Good point. Not sure how to map svn:external in the git world.
> >
> > https://help.github.com/en/github/using-git/about-git-subtree-merges
> >
> > seems to be an approach, but not sure how this works with our svn to git 
> > mirroring.
> 
> IIUC We could also just use it from CPAN with a README about dropping
> in an Apache-Test checkout if you need to debug the framework itself?
> Other than that, we could use a vendor/ like approach and see what's
> new around release time.

Yeah, so long as the existence of the svn:externals doesn't preclude 
mirroring, I think we can safely ignore it.  The test suite does seem to 
work with an external install of Apache::Test just fine.

OK so I will ask for httpd/test/framework to be mirrored to github, but 
first the really hard question - what to name it?  Assuming we can pick 
an arbitrary name, "httpd-tests" makes sense to me.

Regards, Joe




httpd/test/framework location/mirroring (was Re: Errored: apache/httpd#357 (2.4.x - d56819e))

2020-02-17 Thread Joe Orton
On Mon, Feb 17, 2020 at 10:52:10AM -0500, Eric Covener wrote:
> On Mon, Feb 17, 2020 at 10:44 AM Joe Orton  wrote:
> >
> > On Mon, Feb 17, 2020 at 04:30:40PM +0100, Stefan Eissing wrote:
> > > Does this look like a Travis problem? Just red on some "Linux Ubuntu"
> > > combinations, but I do not really see the cause.
> >
> > Yep, it's a timeout doing the "svn checkout" because the test suite was
> > updated, you can ignore.
> 
> Could we point travis at GitHub for that?

The Perl framework isn't mirrored to GitHub at the moment so we have to 
get it from SVN.  We could:

a) ask infra to mirror httpd/test/framework to some new github repo name
   (httpd-test-framework? httpd-perl-tests? httpd-tests?)

b) move httpd/test/framework under trunk/test/framework or something

c) work out the underlying issue which makes Travis/SVN fail, my 
   assuption is that since a Travis build will hit SVN with 30+
   near-simultaneous new connections which looks a bit DoS-like so it 
   gets rate-limited/banned by the ASF infra somehow, but it's a guess

I have a vague preference for (b) since it would be nice to be able to 
commit tests and code at the same time but it'd be quite a big change.
(a) might be easier

Regards, Joe



Re: Errored: apache/httpd#357 (2.4.x - d56819e)

2020-02-17 Thread Joe Orton
On Mon, Feb 17, 2020 at 04:30:40PM +0100, Stefan Eissing wrote:
> Does this look like a Travis problem? Just red on some "Linux Ubuntu" 
> combinations, but I do not really see the cause.

Yep, it's a timeout doing the "svn checkout" because the test suite was 
updated, you can ignore.



Re: svn commit: r1874102 - /httpd/httpd/trunk/modules/http/http_filters.c

2020-02-17 Thread Joe Orton
On Mon, Feb 17, 2020 at 10:01:17AM +0100, Ruediger Pluem wrote:
> On 02/17/2020 09:20 AM, jor...@apache.org wrote:
> > Author: jorton
> > Date: Mon Feb 17 08:20:52 2020
> > New Revision: 1874102
> > 
> > URL: http://svn.apache.org/viewvc?rev=1874102&view=rev
> > Log:
> > * modules/http/http_filters.c (parse_chunk_size): Reduce by four the
> >   limit to the number of bits that can be handled in a chunk size, to
> >   avoid undefined behaviour bitshifting a signed integer left.  Max
> >   chunk size on 32-bit arch is now 32MiB.  Avoids UBSan error in:
> 
> Why 32 MiB? 32 bit - 4 are 28 bit we should be able to store 256 MiB if read 
> as unsigned or 128 MiB signed.

Errr, you are right, no idea how I came up with that, maths before 
coffee clearly not a good idea.  I amended the log entry, thanks :)




Re: [PATCH] event idle_spawn_rate initialization?

2020-02-14 Thread Joe Orton
On Fri, Feb 14, 2020 at 11:33:50AM +0100, Ruediger Pluem wrote:
> On 02/14/2020 10:08 AM, Joe Orton wrote:
> > I've been playing with UBSan[1] which catches undefined behaviour, found a 
> > couple of interesting things so far.
> > 
> > One is with event, I messed with the line numbers but the error is:
> > 
> > event.c:3620:13: runtime error: null pointer passed as argument 2, which is 
> > declared to never be null
> > 
> > from the memcpy() in this code: 
> > https://github.com/apache/httpd/blob/trunk/server/mpm/event/event.c#L3619
> > 
> > new_ptr = (int *)apr_palloc(ap_pglobal, new_max * sizeof(int));
> > memcpy(new_ptr, retained->idle_spawn_rate,
> >retained->mpm->num_buckets * sizeof(int));
> 
> I guess the above only does not crash because retained->mpm->num_buckets is 0 
> at the same time.
> But this is probably something we should not rely on with all memcpy 
> implementations.

A, yes, that makes more sense now.  Thanks a lot.

...
> I guess there is no need for two cases here. We should only avoid the call to 
> memcpy
> if retained->idle_spawn_rate is NULL. The initialization happens then in the 
> block starting
> at line 3624.

Gotcha.  Done in r1874011.

Regards, Joe



[PATCH] event idle_spawn_rate initialization?

2020-02-14 Thread Joe Orton
I've been playing with UBSan[1] which catches undefined behaviour, found a 
couple of interesting things so far.

One is with event, I messed with the line numbers but the error is:

event.c:3620:13: runtime error: null pointer passed as argument 2, which is 
declared to never be null

from the memcpy() in this code: 
https://github.com/apache/httpd/blob/trunk/server/mpm/event/event.c#L3619

new_ptr = (int *)apr_palloc(ap_pglobal, new_max * sizeof(int));
memcpy(new_ptr, retained->idle_spawn_rate,
   retained->mpm->num_buckets * sizeof(int));
retained->idle_spawn_rate = new_ptr;
retained->mpm->max_buckets = new_max;

At startup it seems retained->idle_spawn_rate is NULL, and you can't 
(shouldn't?!) memcpy() from NULL.  I am not sure what the right fix is 
here, is that array supposed to be initialized to something other than 
zero here, or somewhere else?  Not obvious if the loop below will 
initialize it properly:

https://github.com/apache/httpd/blob/trunk/server/mpm/event/event.c#L3624

Is something like this correct?  (also this could use apr_pmemdup rather 
than palloc+memcpy now I think about it)

--- a/server/mpm/event/event.c
+++ b/server/mpm/event/event.c
@@ -3615,9 +3615,15 @@ static int event_open_logs(apr_pool_t * p, apr_pool_t * 
plog,
 if (new_max < num_buckets) {
 new_max = num_buckets;
 }
-new_ptr = (int *)apr_palloc(ap_pglobal, new_max * sizeof(int));
-memcpy(new_ptr, retained->idle_spawn_rate,
-   retained->mpm->num_buckets * sizeof(int));
+if (retained->idle_spawn_rate) {
+new_ptr = (int *)apr_palloc(ap_pglobal, new_max * sizeof(int));
+memcpy(new_ptr, retained->idle_spawn_rate,
+   retained->mpm->num_buckets * sizeof(int));
+}
+else {
+/* ### should initialize array to something other than 0?? */
+new_ptr = apr_pcalloc(ap_pglobal, new_max * sizeof(int));
+}
 retained->idle_spawn_rate = new_ptr;
 retained->mpm->max_buckets = new_max;
 }


[1] https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html



Re: Use of [skip ci] in commit messages to avoid Travis builds

2020-02-11 Thread Joe Orton
On Mon, Feb 10, 2020 at 07:22:45AM -0500, Eric Covener wrote:
> Is there anything possible in SVN like a pre-commit hook that can see
> what changed and append [skip ci] unless some other incantantation is
> present?

Seems like a neat idea, I think that could work *if* infra actually 
allows us to do that.  I can imagine they might not want to since the 
pre-commit hooks are global to the asf repos.

An alternative I found would be to check the diff in the travis jobs and 
pass if there are non-docs changes:

https://reflectoring.io/skip-ci-build/

con: we'd probably still hit the transient failures at apt install 
level, though I we could/should also move that to before_script level 
anyway

con: we'd have lots of "false" passes in the travis history, and having 
a pass even if the tree is genuinely broken by prior non-docs commits is 
bad.

(... ideally the travis jobs could report "skip" rather than pass/fail 
but I can't see a way to do that either.)

pro: we can easily control what to skip w/o needing infra

pro: doesn't need asf-wide repos changes & doesn't modify commits 
(possibly fragile)

Not sure about this.  Maybe status quo is best and everyone should learn 
to use [skip ci] until travis implement filtering.

Regards, Joe



Re: svn commit: r1866035 - /httpd/httpd/branches/2.4.x/STATUS

2020-02-07 Thread Joe Orton
On Thu, Feb 06, 2020 at 07:52:18AM -0600, Daniel Ruggeri wrote:
> Hey there, Joe; No idea how I didn't detect this much sooner. I have 
>access to hardware security modules with PKCS11 interfaces for key 
>operations and would be happy to put this through it's paces. The 
>2.5 docs are fairly light (note, this 2.4 patch seems to be missing 
>docs) on how to test this out. Pointers appreciated if you have a 
>working recipe.

That would be awesome.  The stuff I'm not really sure about & could use 
better docs is:

a) how to identify the right PKCS#11 URI for the key/cert objects, and
b) how to set up the OpenSSL pkcs11 engine correctly so this works

On recent Fedora/RHEL (b) works OOTB but I imagine this may take some 
effort on other systems or from-scratch builds.

For testing locally I used a USB smartcard reader, setting up the card 
following https://github.com/OpenSC/OpenSC/wiki/Quick-Start-with-OpenSC

If you can store a cert & private key on the token, mod_ssl will use 
both, but I think not all HSMs can store the cert, so you can load that 
from a PEM file if required and list the key only as a pkcs11: URI in 
SSLCertificateKeyFile.

Beyond that it should "just work" if you configure per the mod_ssl docs, 
running "p11tool --list-tokens" listed the URI for the token, and I 
used:

SSLCertificateFile 
"pkcs11:model=PKCS%2315;manufacturer=OpenSC%20Project;serial=0001C9540200;token=Joe%20Orton%20%28OpenSC%20Card%29"

Regards, Joe

> 
> On 2019/08/28 12:15:02 jor...@apache.org wrote:
> > Author: jorton
> > Date: Wed Aug 28 12:15:01 2019
> > New Revision: 1866035
> > 
> > URL: http://svn.apache.org/viewvc?rev=1866035&view=rev
> > Log:
> > Proposed mod_ssl PKCS#11 cert/key support.
> > 
> > Modified:
> > httpd/httpd/branches/2.4.x/STATUS
> > 
> > Modified: httpd/httpd/branches/2.4.x/STATUS
> > URL: 
> > http://svn.apache.org/viewvc/httpd/httpd/branches/2.4.x/STATUS?rev=1866035&r1=1866034&r2=1866035&view=diff
> > ==
> > --- httpd/httpd/branches/2.4.x/STATUS (original)
> > +++ httpd/httpd/branches/2.4.x/STATUS Wed Aug 28 12:15:01 2019
> > @@ -160,6 +160,21 @@ PATCHES PROPOSED TO BACKPORT FROM TRUNK:
> >rpluem says: -1 for now. See further discussion at
> > https://bz.apache.org/bugzilla/show_bug.cgi?id=63503
> >  
> > +   *) mod_ssl: Add support for loading certs & keys from PKCS#11 URLs via 
> > the
> > +   OpenSSL pkcs11 engine.  Includes related minor cleanups and
> > +   simplification to mod_ssl internals.
> > +  trunk patch: http://svn.apache.org/r1830819
> > +   http://svn.apache.org/r1830912
> > +   http://svn.apache.org/r1830913
> > +   http://svn.apache.org/r1830927
> > +   http://svn.apache.org/r1831168
> > +   http://svn.apache.org/r1831173
> > +   http://svn.apache.org/r1835240
> > +   http://svn.apache.org/r1835242
> > +   http://svn.apache.org/r1835615
> > +  2.4.x patch: http://people.apache.org/~jorton/mod_ssl_pkcs11.patch
> > +  +1: jorton, 
> > +
> >  PATCHES/ISSUES THAT ARE BEING WORKED
> >[ New entries should be added at the START of the list ]
> >  
> > 
> > 
> > 
> -- 
> Daniel Ruggeri



Re: mod_systemd question

2020-01-31 Thread Joe Orton
On Fri, Jan 31, 2020 at 08:30:06AM +0100, Ruediger Pluem wrote:
> On 01/31/2020 04:14 AM, Luca Toscano wrote:
> > Hi everybody,
> > 
> > I tested a bit the mod_systemd backport proposal (thanks Joe for
> > working on it!) and I have some doubts, that might be due to my
> > limited understanding of systemd. I tried the following unit on Debian
> > 10 (Buster):
> > 
> > 
> > # /etc/systemd/system/apache2.service
> > [Unit]
> > Description=The Apache HTTP Server
> > After=network.target
> > 
> > [Service]
> > Type=notify
> > ExecStart=/usr/local/apache2/bin/httpd -k start
> 
> What if you add -D FOREGROUND to the above?

That should fix the problem described.  The Fedora httpd.service is 
designed to be used with mod_systemd and looks like this:

https://src.fedoraproject.org/rpms/httpd/blob/master/f/httpd.service

We could add a cut-down version of that without some Fedora-specific 
stuff, but there are some policy choices here about what reload & stop 
do so it is maybe best left to distributions, not sure.  At least I 
could document it better!

Regards, Joe



catching empty APLOGNO()

2020-01-24 Thread Joe Orton
I'd like to catch empty APLOGNO() use in CI, I think this should be an 
error and IIRC it's tripped us up in 2.4 releases in the past.  Two ways 
to do this are:

1. For #ifdef AP_DEBUG, or for some other special-case, catch and fail 
at compile-time for empty APLOGNO(). I think this should be possible 
with some preprocessor tricks.

2. Have a special CI job which runs "make update-log-msg-tags", treats
duplicates as errors and fails if there any newly-filled APLOGNO tags

I think my preference is (2). During development it's easier to leave 
APLOGNO() empty and fill it in at the last minute, so having this caught 
in Travis is sufficient and should prevent errors getting as far as 
2.4.x backports.

Any other opinions, or an alternative method?

Regards, Joe



Re: mod_systemd in httpd 2.4

2020-01-14 Thread Joe Orton
On Mon, Jan 13, 2020 at 08:46:54AM +0300, Benson Muite wrote:
> 
> Hi,
> 
> There was an earlier message on this list [0] about systemd support in 2.4 -
> r1625531 which am looking to find. Will this likely happen? As many
> distributions now use systemd, there is interest from distributions other
> than the Fedora, CentOS and Redhat packaged versions, for example[1]. From
> [2], it seems that this would not be too challenging, and would be helpful
> to have when building httpd from source. Documentation seems to already be
> available for 2.5 [3]

There was feedback from the review of this previously which had not been 
addressed.  I've updated the module and will propose again for 2.4.

Regards, Joe


> Thanks,
> Benson
> 
> [0] 
> https://lists.apache.org/thread.html/050073e0b232297be307fcf08177a3a60290ea5d706387d51a0f862e%401410693687%40%3Cdev.httpd.apache.org%3E
> 
> [1] 
> https://unix.stackexchange.com/questions/262051/install-apache-module-mod-systemd
> 
> [2] 
> https://lists.apache.org/thread.html/f6f4deb66398b32cd7e587b599adc2c4b04d32e479a8f91dbb1a%40%3Cdev.httpd.apache.org%3E
> 
> [3] https://httpd.apache.org/docs/trunk/mod/mod_systemd.html
> 



Re: arm64 support for Travis CI testing

2020-01-13 Thread Joe Orton
On Fri, Jan 10, 2020 at 09:55:24AM -0800, Mike Rumph wrote:
> I'm interested in taking a look at this t/apache/expr_string.t 
> testcase failure. For example, is it the testcase that is in error or 
> is it the code that it's testing? Maybe this can be discussed in a 
> separate thread since it is not exclusive to arm64. Has a thread been 
> started on this already?

I don't think there's another thread.

I'm guessing this is some kind of timing issue in the test case, which 
tries to read an error_log entry immediately after the request which 
generates it. I've added a sleep which will maybe mitigate this - 
http://svn.apache.org/r1872705 so let's see how this one goes

Regards, Joe




Re: arm64 support for Travis CI testing

2020-01-09 Thread Joe Orton
On Thu, Jan 09, 2020 at 03:01:42PM +0100, Luca Toscano wrote:
> Il giorno gio 9 gen 2020 alle ore 14:19 Joe Orton 
> ha scritto:
> >
> > The caching does work on arm64 so the build & test only takes 11 minutes
> > which is reasonable.
> >
> > https://travis-ci.org/apache/httpd/jobs/634661216
> >
> > Looks like this one is the next unreliable test to attack -
> >
> > 2425# Failed test 17 in t/apache/expr_string.t at line 87
> > 2426# Failed test 20 in t/apache/expr_string.t at line 87 fail #2
> > 2427t/apache/expr_string.t ..
> > 2428Failed 2/44 subtests
> > 2429
> >
> > Can anybody reliably reproduce that failure?
> 
> Does it happen only for ARM or also for other architectures? Never
> seen this before..

Definitely all architectures, it's one Rainer reports as failing 
regularly in release testing.  Here's another one and I'm pretty sure it 
has shown up in more of the recent failures too -

https://travis-ci.org/apache/httpd/jobs/633442262



Re: arm64 support for Travis CI testing

2020-01-09 Thread Joe Orton
The caching does work on arm64 so the build & test only takes 11 minutes 
which is reasonable.  

https://travis-ci.org/apache/httpd/jobs/634661216

Looks like this one is the next unreliable test to attack -

2425# Failed test 17 in t/apache/expr_string.t at line 87
2426# Failed test 20 in t/apache/expr_string.t at line 87 fail #2
2427t/apache/expr_string.t .. 
2428Failed 2/44 subtests 
2429

Can anybody reliably reproduce that failure?



Re: arm64 support for Travis CI testing

2020-01-09 Thread Joe Orton
On Thu, Jan 09, 2020 at 09:10:31AM +, Pluem, Ruediger, Vodafone Group wrote:
> Would
> 
> lscpu -p | grep -c -E ^[0-9]
> 
> work on Ubuntu to deliver the number of cores?
> 
> So
> 
> make -j `lscpu -p | grep -c -E ^[0-9]`

I've got no idea if this DTRT for the mix of VMs & containers that 
travis uses, but it's probably worth trying!

Regards, Joe



Re: arm64 support for Travis CI testing

2020-01-09 Thread Joe Orton
On Wed, Jan 08, 2020 at 02:50:17PM -0800, Mike Rumph wrote:
> Okay the first Travis run with arm64 (#201) passed.
> The arm64 job (#201.4) took 33 minutes, 5 seconds.
> Compared tp ppc64le at 4 minutes, 30 seconds.
> This is possibly due to need for extra caching on the first run.
> Otherwise, the jobs seem to have the same results.

Nice, thanks Mike!  From the timings in the log a large chunk of the 
time (21m) is building and installing the Perl modules, so if the 
caching works it should indeed be much faster second time round.

I wonder if the non-x86 builds have more CPUs available, we have 
hardcoded "make -j2" at the moment but possibly it makes sense to adjust 
this.

Regards, Joe



Re: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

2020-01-08 Thread Joe Orton
On Tue, Jan 07, 2020 at 07:31:42PM +0100, Yann Ylavic wrote:
> Could you please try the attached patch?
> The goal is to make ap_request_core_filter() a connection filter, so
> that it remains in c->output_filters until the EOR is handled.
> The patch has subtleties, but ap_request_core_filter() is really
> special. It has to do its best to use r->pool for setasides and
> morphing buckets until the EOR, and depending on network congestion
> there may be multiple ap_request_core_filter()s stacked (one per
> request not fully flushed) so I had to introduce
> AP_FILTER_PROTO_INSERT_BEFORE to stack them in the right order...

This has passed 50 iterations with no failures, so LGTM, thanks for 
looking at this!  Without this I could reliably trigger a failure within 
~10 iterations.

BTW attached the repro script I'm using, different N for stress -cN may 
help. -c2 works well for my 4 core laptop.

Regards, Joe


repro.sh
Description: Bourne shell script


Re: arm64 support for Travis CI testing

2020-01-08 Thread Joe Orton
On Tue, Jan 07, 2020 at 01:27:33PM -0800, Mike Rumph wrote:
> I would like to add support for arm64 to httpd/trunk/.travis.yml.
> I would then devote some time to getting this support to work.
> Are there some steps I should take before adding this commit?

That'd be great!  If you are familiar with github & Travis, you could 
try it using a fork of the git repo, following the instructions in 
test/README.travis.  Otherwise you could just push the change to trunk 
and see ;)

I can't remember if I tested arm before, but it seems the non-x86 VMs in 
Travis don't have the same set of Debian packages installed as x86, so 
some tweaking to the apt: section might be required to get it working.

Regards, Joe



Re: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

2020-01-07 Thread Joe Orton
This is a classic heisenbug since it is timing related, when you add 
more debugging it slows the server down enough that the client can keep 
up, write never returns EAGAIN, and the bug disappears, I think...

I see from 2016 people have reported similar failures before ("Subject: 
Some test failures in trunk") and they have been seen with prefork too.

I'm afraid I don't understand the dance being done with setaside and 
reinstate in the filters.  Following the debugging through from the last 
ap_pass_brigade from mod_test_pass_brigade -

[Tue Jan 07 13:09:32.090818 2020] [:info] [pid 117876:tid 140235421763328] 
[client 127.0.0.1:41112] [mod_test_pass_brigade] wrote 8192 of 8192 bytes
[Tue Jan 07 13:09:32.090821 2020] [:info] [pid 117876:tid 140235421763328] 
[client 127.0.0.1:41112] [mod_test_pass_brigade] done writing 1024 of 
1024 bytes
[Tue Jan 07 13:09:32.090824 2020] [core:trace6] [pid 117876:tid 
140235421763328] util_filter.c(1015): [client 127.0.0.1:41112] reinstate full 
brigade to full brigade in 'req_core' output filter
[Tue Jan 07 13:09:32.090827 2020] [core:trace8] [pid 117876:tid 
140235421763328] util_filter.c(1133): [client 127.0.0.1:41112] brigade 
contains: bytes: 49205, non-file bytes: 49205, eor buckets: 1, morphing buckets
: 0
[Tue Jan 07 13:09:32.090829 2020] [core:trace6] [pid 117876:tid 
140235421763328] util_filter.c(942): [client 127.0.0.1:41112] setaside full 
brigade to empty brigade in 'req_core' output filter
[Tue Jan 07 13:09:32.090835 2020] [core:trace6] [pid 117876:tid 
140235421763328] util_filter.c(1015): [client 127.0.0.1:41112] reinstate full 
brigade to full brigade in 'core' output filter

>From the above we can see there is 48K of data buffered in one of the 
filters.  At this point we've passed through:

/* Prepend buckets set aside, if any. */
ap_filter_reinstate_brigade(f, bb, NULL);
if (APR_BRIGADE_EMPTY(bb)) {
return APR_SUCCESS;
}

I then dumped the contents of bb to the error log directly after:

[Tue Jan 07 13:09:32.090837 2020] [core:trace1] [pid 117876:tid 
140235421763328] core_filters.c(373): [client 127.0.0.1:41112] sbb: cof - 
bucket 0 HEAP length 1654
[Tue Jan 07 13:09:32.090839 2020] [core:trace1] [pid 117876:tid 
140235421763328] core_filters.c(373): [client 127.0.0.1:41112] sbb: cof - 
bucket 1 IMMORTAL length 2
[Tue Jan 07 13:09:32.090842 2020] [core:trace1] [pid 117876:tid 
140235421763328] core_filters.c(373): [client 127.0.0.1:41112] sbb: cof - 
bucket 2 FLUSH length 0
[Tue Jan 07 13:09:32.090844 2020] [core:trace1] [pid 117876:tid 
140235421763328] core_filters.c(373): [client 127.0.0.1:41112] sbb: cof - 
bucket 3 EOC length 0

so the 48K does not make it back to the core output filter and is never 
sent.

[Tue Jan 07 13:09:32.090846 2020] [core:trace1] [pid 117876:tid 
140235421763328] core_filters.c(655): [client 127.0.0.1:41112] sbb: 2 vectors, 
bytes to write 1656
[Tue Jan 07 13:09:32.090849 2020] [core:trace6] [pid 117876:tid 
140235421763328] core_filters.c(710): (11)Resource temporarily unavailable: 
[client 127.0.0.1:41112] writev_nonblocking: 0/1656
[Tue Jan 07 13:09:32.090852 2020] [core:trace1] [pid 117876:tid 
140235421763328] core_filters.c(431): (11)Resource temporarily unavailable: 
[client 127.0.0.1:41112] sbb: cof - send_brigade_nonblocking RETURNED
[Tue Jan 07 13:09:32.090856 2020] [core:trace6] [pid 117876:tid 
140235421763328] util_filter.c(1015): [client 127.0.0.1:41112] reinstate empty 
brigade to full brigade in 'core' output filter
[Tue Jan 07 13:09:32.090858 2020] [core:trace6] [pid 117876:tid 
140235421763328] util_filter.c(1108): [client 127.0.0.1:41112] will flush 
because of FLUSH bucket
[Tue Jan 07 13:09:32.090861 2020] [core:trace8] [pid 117876:tid 
140235421763328] util_filter.c(1110): [client 127.0.0.1:41112] seen in brigade 
so far: bytes: 1656, non-file bytes: 1656, eor buckets: 0, morphing buc
kets: 0
[Tue Jan 07 13:09:32.090863 2020] [core:trace8] [pid 117876:tid 
140235421763328] util_filter.c(1133): [client 127.0.0.1:41112] brigade 
contains: bytes: 0, non-file bytes: 0, eor buckets: 0, morphing buckets: 0
[Tue Jan 07 13:09:32.090865 2020] [core:trace1] [pid 117876:tid 
140235421763328] core_filters.c(446): [client 127.0.0.1:41112] sbb: BLOCKED 
EAGAIN, polling
[Tue Jan 07 13:09:32.094662 2020] [core:trace1] [pid 117876:tid 
140235421763328] core_filters.c(655): [client 127.0.0.1:41112] sbb: 2 vectors, 
bytes to write 1656
[Tue Jan 07 13:09:32.094678 2020] [core:trace6] [pid 117876:tid 
140235421763328] core_filters.c(710): [client 127.0.0.1:41112] 
writev_nonblocking: 1656/1656
[Tue Jan 07 13:09:32.094681 2020] [core:trace1] [pid 117876:tid 
140235421763328] core_filters.c(431): [client 127.0.0.1:41112] sbb: cof - 
send_brigade_nonblocking RETURNED
[Tue Jan 07 13:09:32.094683 2020] [core:trace1] [pid 117876:tid 
140235421763328] core_filters.c(431): [client 127.0.0.1:41112] sbb: cof - 
send_brigade_nonblocking RETURNED
[Tue Jan 07 13:09:32.094686 2020] [core:trace6] [pi

worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

2020-01-07 Thread Joe Orton
On Mon, Jan 06, 2020 at 06:38:29PM +, Travis CI wrote:
> Build Update for apache/httpd
> -
> 
> Build: #190
> Status: Still Failing
> 
> Duration: 7 mins and 11 secs
> Commit: 894b6a1 (trunk)
> Author: Luca Toscano
> Message: travis: add verbose config to perl test suite for Ubuntu Worker All 
> Modules
> 
> For some reason we get sporadic failures only in Ubuntu Worker All Modules' 
> test:
> 
> t/apache/rwrite.t ... 53/? # Failed test 113 in 
> /home/travis/build/apache/httpd/test/perl-framework/blib/lib/Apache/TestCommon.pm
>  at line 56 fail #113
> t/apache/rwrite.t ... Failed 1/114 subtests 

It is quite easy to reproduce this for me under worker as well, if you run

./t/TEST t/apache/rwrite.t t/apache/pass_brigade.t 

in a loop, one of them will fail quite quickly.  It looks like the 
client gets a socket closure instead of the very last block of the 
response, here is the client output:

#server response:
#HTTP/1.1 200 OK
#Connection: close
#Date: Tue, 07 Jan 2020 10:55:28 GMT
#Server: Apache/2.5.1-dev (Unix) OpenSSL/1.1.1d
#Vary: In-If1
#Content-Length: 0
#Client-Aborted: die
#Client-Date: Tue, 07 Jan 2020 10:55:28 GMT
#Client-Peer: 127.0.0.1:8529
#Client-Response-Num: 1
#Client-Transfer-Encoding: chunked
#DMMATCH1: 1
#X-Content-Length-Note: added by Apache::TestRequest
#X-Died: EOF when chunk header expected at 
/usr/share/perl5/vendor_perl/Net/HTTP/Methods.pm line 532.
#
# testing : bytes in body
# expected: 10190848
# received: 1024
not ok 114
Failed 1/114 subtests 

The tail end of error_log it looks like this:

[Tue Jan 07 10:55:28.414864 2020] [core:trace8] [pid 37542:tid 140091146098432] 
util_filter.c(1129): [client 127.0.0.1:49470] brigade contains: bytes: 49205, 
non-file bytes: 49205, eor buckets: 1, morphing buckets: 0
[Tue Jan 07 10:55:28.414866 2020] [core:trace6] [pid 37542:tid 140091146098432] 
util_filter.c(942): [client 127.0.0.1:49470] setaside full brigade to empty 
brigade in 'req_core' output filter
[Tue Jan 07 10:55:28.414872 2020] [core:trace6] [pid 37542:tid 140091146098432] 
util_filter.c(1015): [client 127.0.0.1:49470] reinstate full brigade to full 
brigade in 'core' output filter
[Tue Jan 07 10:55:28.414876 2020] [core:trace6] [pid 37542:tid 140091146098432] 
core_filters.c(670): (11)Resource temporarily unavailable: [client 
127.0.0.1:49470] writev_nonblocking: 0/1656
[Tue Jan 07 10:55:28.414879 2020] [core:trace6] [pid 37542:tid 140091146098432] 
util_filter.c(1015): [client 127.0.0.1:49470] reinstate empty brigade to full 
brigade in 'core' output filter
[Tue Jan 07 10:55:28.414881 2020] [core:trace6] [pid 37542:tid 140091146098432] 
util_filter.c(1104): [client 127.0.0.1:49470] will flush because of FLUSH bucket
[Tue Jan 07 10:55:28.414883 2020] [core:trace8] [pid 37542:tid 140091146098432] 
util_filter.c(1106): [client 127.0.0.1:49470] seen in brigade so far: bytes: 
1656, non-file bytes: 1656, eor buckets: 0, morphing buckets: 0
[Tue Jan 07 10:55:28.414888 2020] [core:trace8] [pid 37542:tid 140091146098432] 
util_filter.c(1129): [client 127.0.0.1:49470] brigade contains: bytes: 0, 
non-file bytes: 0, eor buckets: 0, morphing buckets: 0
[Tue Jan 07 10:55:28.418754 2020] [core:trace6] [pid 37542:tid 140091146098432] 
core_filters.c(670): [client 127.0.0.1:49470] writev_nonblocking: 1656/1656
[Tue Jan 07 10:55:28.418764 2020] [core:trace6] [pid 37542:tid 140091146098432] 
util_filter.c(942): [client 127.0.0.1:49470] setaside empty brigade to empty 
brigade in 'core' output filter
[Tue Jan 07 10:55:28.427067 2020] [optional_hook_import:error] [pid 37542:tid 
140091146098432] AH01866: Optional hook test said: GET 
/test_rwrite?8192,1024 HTTP/1.1
[Tue Jan 07 10:55:28.427072 2020] [optional_fn_export:error] [pid 37542:tid 
140091146098432] AH01871: Optional function test said: GET 
/test_rwrite?8192,1024 HTTP/1.1

and nothing else is logged for the connection.

In both cases, with an explicit FLUSH at the end of the response I 
cannot reproduce this any more.

I added one #if 0'ed out here:

http://svn.apache.org/viewvc?view=revision&revision=1872431

and it's a simple ap_rflush(r) in mod_test_rwrite.c as well.

This seems like a bug to me (not obviously in the tests) but I'm not 
sure exactly where.  The "writev_nonblocking: 1656/1656" line *appears* 
to suggest the last block was successfully written to the socket.

Regards, Joe

> Recent examples:
> https://travis-ci.org/apache/httpd/jobs/632425202
> https://travis-ci.org/apache/httpd/jobs/633250739
> 
> Add "-verbose" as test option to capture more data about the failure
> when it happens.
> 
> 
> 
> git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@1872389 
> 13f79535-47bb-0310-9956-ffa450edef68
> 
> View the changeset: 
> https://github.com/apache/httpd/compare/0bd4a3d88cba...894b6a185041
> 
> View the full build log and details: 
> https://travis-ci.org/apache/httpd/builds/633409517?utm_medium=notification&utm_source=e

Re: Travis CI failures

2019-12-02 Thread Joe Orton
On Sun, Dec 01, 2019 at 12:05:15PM +0100, Luca Toscano wrote:
> Hi everybody,
> 
> Travis seems not able to deliver emails to dev@, I opened a jira to
> Infra: https://issues.apache.org/jira/browse/INFRA-19508

Thanks!

> +svn export -q -r
> https://svn.apache.org/repos/asf/apr/apr/branches/1.7.x
> /home/travis/build/apr-1.7.x
> svn: E205000: Syntax error in revision argument
> 'https://svn.apache.org/repos/asf/apr/apr/branches/1.7.x'
> The command "./test/travis_before_${TRAVIS_OS_NAME}.sh" failed and
> exited with 1 during .
> 
> 
> This one seems more a temporary connectivity issue, so I am wondering
> if we should add a basic and limited retry/backoff logic in CI to make
> the external fetch operations more resilient to network glitches.

Looks like a brief outage yesterday -

https://status.apache.org/incidents/nbl2fw5f63fx

Yeah, I think it's good idea to retry that svn export a few times, I 
think I have seen it fail with some network error before.  I'd also 
considered caching the perl-framework checkout but would then need to 
duplicate the latest-revision checking stuff.

Regards, Joe



Re: svn commit: r1870077 - in /httpd/httpd/trunk: .travis.yml test/travis_before_linux.sh test/travis_run_linux.sh

2019-11-22 Thread Joe Orton
On Fri, Nov 22, 2019 at 09:35:01AM +0100, Ruediger Pluem wrote:
> On 11/21/2019 10:30 AM, jor...@apache.org wrote:
> > Author: jorton
> > Date: Thu Nov 21 09:30:34 2019
> > New Revision: 1870077
...
> > +function install_apx() {
> > +local name=$1
> > +local version=$2
> > +local root=https://svn.apache.org/repos/asf/apr/${name}
> > +local prefix=${HOME}/root/${name}-${version}
> > +local build=${HOME}/build/${name}-${version}
> > +local config=$3
> > +local buildconf=$4
> > +
> > +case $version in
> > +trunk) url=${root}/trunk ;;
> 
> This does the wrong thing for APR-UTIL trunk since there is none. I am 
> not sure if this really needs to fixed in code or if it should be just 
> documented that if APR_VERSION is set to trunk that no APR-UTIL 
> version should be set. Or we could ignore the APU_VERSION setting a 
> few lines down in case APR_VERSION is trunk.

Yes good point - I made this explicit in the docs now, which I think is 
enough.

Regards, Joe



Re: svn commit: r1869697 - in /httpd/httpd/trunk: .travis.yml buildconf test/travis_run_linux.sh

2019-11-18 Thread Joe Orton
On Fri, Nov 15, 2019 at 08:10:28PM +0100, Ruediger Pluem wrote:
> > --- httpd/httpd/trunk/test/travis_run_linux.sh (original)
> > +++ httpd/httpd/trunk/test/travis_run_linux.sh Tue Nov 12 12:45:57 2019
> > @@ -1,7 +1,7 @@
> >  #!/bin/bash -ex
> >  ### Installed apr/apr-util don't include the *.m4 files but the
> >  ### Debian packages helpfully install them, so use the system APR to 
> > buildconf
> > -./buildconf --with-apr=/usr/bin/apr-1-config
> > +./buildconf --with-apr=/usr/bin/apr-1-config ${BUILDCONFIG}
> 
> Why don't we use the APR sources we download in travis_before_linux.sh at 
> least if APR_VERSION is set?
> Like
> 
> ./buildconf --with-apr=$HOME/build/apr-${APR_VERSION} ${BUILDCONFIG}

Mainly because of caching - the ~/build/apr-* directories are only 
created in the case where a cached ~/root/apr-* is not available.  It 
might be useful to try more complicated combinations of buildconf too, 
though this is simple and works everywhere for now.

Regards, Joe



Re: trunk APR version requirement

2019-11-12 Thread Joe Orton
On Tue, Nov 12, 2019 at 02:39:58PM +0100, Luca Toscano wrote:
> Il giorno mar 12 nov 2019 alle ore 12:40 Joe Orton 
> ha scritto:
> >
> > Thanks to everyone who gave feedback, I made the change to require APR
> > 1.6 in r1869684.  While I kept the Travis builds passing, buildbot broke
> > since it's doing a trunk build against the system APR.  Does anybody
> > know if there is a buildbot slave running Bionic?
> 
> Or maybe we could think about swapping buildbot with Travis and set
> notifications for the latter to dev@. Travis seems working really well
> and after all your recent work IIUC it is already testing way more
> than what buildbot does :)

I'm fine with disabling bb if nobody wants to rescue it - and I didn't 
deliberately sabotage it!  We need to keep the doxygen bb stuff running, 
that has failed to build as well though I think it's only a 
non-reproducible failure, because an existing checkout has been 
reconfigured using the included APR (trunk) and still has objects built 
against APR 1.x.

How about:

1. If travis results look stable this week, turn on e-mail notifications 
to dev@ on Friday.

2. Friday following (22nd), disable the broken bits of buildbot if 
nobody has salvaged it.

Regards, Joe





Re: trunk APR version requirement

2019-11-12 Thread Joe Orton
Thanks to everyone who gave feedback, I made the change to require APR 
1.6 in r1869684.  While I kept the Travis builds passing, buildbot broke 
since it's doing a trunk build against the system APR.  Does anybody 
know if there is a buildbot slave running Bionic?

Regards, Joe



Re: svn commit: r1869459 - in /httpd/httpd/trunk: .travis.yml test/travis_before_linux.sh test/travis_run_linux.sh

2019-11-08 Thread Joe Orton
On Fri, Nov 08, 2019 at 11:04:32AM +0100, Ruediger Pluem wrote:
> On 11/06/2019 12:45 PM, jor...@apache.org wrote:
> > ==
> > --- httpd/httpd/trunk/test/travis_before_linux.sh (added)
> > +++ httpd/httpd/trunk/test/travis_before_linux.sh Wed Nov  6 11:45:21 2019
> > @@ -0,0 +1,5 @@
> > +#!/bin/bash -ex
> > +svn export -q https://svn.apache.org/repos/asf/apr/apr/trunk srclib/apr
> 
> Hmm. Why checking out apr trunk, when we supply --with-apr=/usr 
> --with-apr-util=/usr to configure?
> Just for buildconf to work?

Yup, I had it there for buildconf.  It was removed in r1869533 and 
actually we can switch to running buildconf against the APR version 
where available, thanks for the prompt.

Regards, Joe



trunk APR version requirement

2019-11-08 Thread Joe Orton
On trunk the configure script requires APR 1.4 but mod_proxy_balancer 
fails to build with APR < 1.5 since it uses apr_escape.h unconditionally 
[1].  Thinking of 2.5/2.6 it might make sense to bump up the APR version 
requirement to something newer, any opinions?  RHEL 8 and Bionic 
(18.04LTS) both have APR 1.6 so that might make a reasonable target for 
those wanting to build against distro APR versions.

grep for APR_VERSION_AT_LEAST --

./server/mpm/simple/simple_run.c:#if !APR_VERSION_AT_LEAST(1,4,0)
./server/mpm/event/event.c:#if !APR_VERSION_AT_LEAST(1,4,0)
./server/util_expr_eval.c:#if APR_VERSION_AT_LEAST(1,5,0)
./server/util_expr_eval.c:#if APR_VERSION_AT_LEAST(1,6,0)
./server/util_expr_eval.c:#if APR_VERSION_AT_LEAST(1,6,0)
./server/listen.c:#if APR_VERSION_AT_LEAST(1,7,0)
./server/listen.c:#if APR_VERSION_AT_LEAST(1,7,0)
./server/listen.c:#if !APR_VERSION_AT_LEAST(1,7,0)
./server/vhost.c:#if !APR_VERSION_AT_LEAST(1,7,0)
./server/vhost.c:#if APR_VERSION_AT_LEAST(1,7,0)
./modules/proxy/mod_serf.c:#if !APR_VERSION_AT_LEAST(1,4,0)
./modules/ssl/mod_ssl_ct.c:#if !APR_VERSION_AT_LEAST(1,5,0)
./modules/filters/mod_crypto.c:#if !APR_VERSION_AT_LEAST(2,0,0)
./modules/filters/mod_crypto.c:#if !APR_VERSION_AT_LEAST(2,0,0) && \
./modules/lua/lua_passwd.h:#if !APR_VERSION_AT_LEAST(2,0,0)
./modules/lua/lua_passwd.h:#if APR_VERSION_AT_LEAST(2,0,0) || \
./modules/metadata/mod_remoteip.c:#if !APR_VERSION_AT_LEAST(1,5,0)
./support/passwd_common.h:#if !APR_VERSION_AT_LEAST(2,0,0)
./support/passwd_common.h:#if APR_VERSION_AT_LEAST(2,0,0) || \

Regards, Joe

[1] https://travis-ci.org/notroj/httpd/jobs/609143667



Re: Integration tests running on Docker

2019-11-07 Thread Joe Orton
On Thu, Nov 07, 2019 at 10:37:50AM +, Pluem, Ruediger, Vodafone Group wrote:
> r1868314

Ah, thank you!  Obviously I tested this locally first too but hadn't 
updated my test/framework directory... duh.

Should we use RTC or can I get an exception for 2.4.x travis 
integration?  I have a few more things to merge to trunk then can 
backport:

  https://github.com/apache/httpd/compare/2.4.x...notroj:travisify-2.4.x

Index: STATUS
===
--- STATUS  (revision 1869499)
+++ STATUS  (working copy)
@@ -125,6 +125,7 @@
 . non-Unix, single-platform code
 . routine APLOGNO() backports
 . .gdbinit
+. Travis integration: .travis.yml and test/travis*.sh
 
 RELEASE SHOWSTOPPERS:
 



Re: Integration tests running on Docker

2019-11-06 Thread Joe Orton
Has anybody seen t/modules/deflate.t and t/modules/brotli.t fail on 
Debian-ish distro? I don't remember seeing this before but maybe it's something 
trivial?  t/TEST output -

https://travis-ci.org/notroj/httpd/jobs/608306769#L3244

(I'm trying to get Travis working in 2.4.x as well.)

Regards, Joe




Re: Integration tests running on Docker

2019-11-06 Thread Joe Orton
On Wed, Nov 06, 2019 at 10:54:56AM +0100, Luca Toscano wrote:
> I just noticed 
> https://docs.travis-ci.com/user/installing-dependencies/#installing-dependencies-on-multiple-operating-systems,
> so the 'addons' will be used by travis only if the os can use them (I
> was afraid of having windows and apt combined). The only thing that I
> would add to travis.yml are the ifs to separate
> before_install/script/etc.. for every os, I think that we should add
> windows as soon as possible.. Do we have a list of build commands to
> test in Travis?

I have no idea how to build on anything but Linux :) Someone could do 
osx too, travis supports that...

So it works for PRs and successfully failed for a bogus commit -

https://travis-ci.org/apache/httpd/pull_requests

This one is sensible?

https://github.com/apache/httpd/pull/71/commits/4a46327813e87f3d662e65361e6d9240ce855db5

I don't know if anything other than Linux can actually be tested like 
this tbh.  (Also trying to context switch between svn & git doing this 
stuff is going to melt my brain!)



Re: Integration tests running on Docker

2019-11-06 Thread Joe Orton
On Tue, Nov 05, 2019 at 10:06:23AM +0100, Luca Toscano wrote:
> Basically what Joe did, but following what the Apache Arrow Project
> did. There is a little bit of repetition, but in theory with this
> config people are free to add tests for other OSes (osx, windows) and
> I will probably be able to add custom ones with Docker to see the
> difference in execution timings etc.. (Apache Arrow's test do use
> docker, I didn't see it a first).

I haven't tried to follow all of what the Arrow .yml is doing (too much 
Java ;) but moving some of the logic into the repo itself out of 
.travis.yml seems like a good idea.  e.g. moving all of script: into 
something like ./test/travis_script_linux.sh or something for the Linux 
based tests, does that make sense?

This is working now -

https://travis-ci.org/apache/httpd/builds

and I guess it will work for forks and PRs too, haven't tried yet.

Anybody feel free to extend, play, hack, whatever with the travis.yml, I 
am no expert so may be doing something non-standard/wrong, feel free to 
correct/improve ;)

The jobs are quite quick but the queue time for the "apache" repos is 
long (10+ minutes) before they get started.  I guess it is doing 
notifications to committers directly, but we won't find out until a job 
fails?  Maybe let's keep experimenting with this a bit longer and then 
start sending mail to dev@ for state changes (exactly as buildbot does) 
when it seems stable?

Regards, Joe



Re: Time for httpd 2.6.x?

2019-11-06 Thread Joe Orton
On Mon, Nov 04, 2019 at 09:35:09PM +0100, Ruediger Pluem wrote:
> On 10/25/2019 12:52 PM, Yann Ylavic wrote:
> > My feeling is that it's easier to start from trunk, no strong feeling
> > (an intuitive one).
> > 
> > So how about:
> >  0. github workflow? meanwhile...
> >  1. compare trunk/2.4.x (inventory)
> >  2. discuss unused/deprecated in trunk to cleanup
> >  3. address STALLED entries in trunk if it's not for compatibily reasons
> >  4. do API/ABI breaking changes in trunk
> >  5. improve trunk w/ things we want in 2.6.x (the real part!)
> >  6. trunk => 2.6.x
> >  7. remove from 2.6.x things not deprecated in 2. but w/ no champion either
> >  8. 2.6.alpha/beta/gamma/rc/whatever
> >  9. fix in trunk and backport to 2.6.x
> > 10. if not good enough goto 8.
> > 11. 2.6.0
> > 12+ usual release cycle
> > ?
> > 
> > If in 1. we find that 2.4.x => 2.6.x is easier/better, well just
> > replace 2. with that and drop 6. and 7.
> 
> +1 to the above, with a preference to branch from trunk if possible.

+1 also and a strong preference to branch from trunk.

Regards, Joe



Re: Integration tests running on Docker

2019-11-04 Thread Joe Orton
Here is a proof of concept -

https://github.com/notroj/httpd/blob/a73adfc8f1c01fbb6a3d493582f9c49c7c942756/.travis.yml

This runs using the full test suite using a few different configurations 
and also does builds with -Werror and maintainer-mode -

https://travis-ci.org/notroj/httpd/builds/607213820

...this should be very easy to extend with more configurations.

Can we merge the docker way with this kind of matrix type travis 
configuration?

Regards, Joe



Re: Integration tests running on Docker

2019-10-29 Thread Joe Orton
On Sun, Oct 27, 2019 at 06:42:58PM +0100, Luca Toscano wrote:
> Some updates:
> 
> - We have support for httpd in travis - https://travis-ci.org/apache/httpd

Nice, thanks Luca & infra!

> - In order to configure automatic builds, a travis.yaml file is needed
> in the base path of the repository, in every branch that needs to be
> tested IIUC.
> - My idea is currently to add in trunk a travis.yaml initial
> configuration file, that builds a Docker image like [1] and runs the
> perl test suite.
>
> - Building the Docker image in [1] takes quite a while now (between 9
> and 13 mins on my laptop, depending also on network bandwidth etc..)
> so it will need more time before it could be as fast as we need, but
> we have to start from somewhere :)
> - In the travis.yaml file we'd need to put config options about what
> Docker image to build and with what parameters. Ideally I'd like to
> store the Docker images in the httpd-framework repository, and

Do you mean Dockerfiles rather than images here?

> reference them in the Travis config of the httpd branches, but not
> sure if it will be possible. Worst case scenario we'll need to add the
> Docker image in each httpd branch that we want to test (possibly in a
> dedicated dir, like "docker" or "testing".

I expected two goals for testing:

1. configure, build, and test using a variety of different configure 
options in the standard Ubuntu environment which Travis provides (enable 
different MPMs, different module sets, different 
--enable-foo/--with-foo, different gcc versions etc etc)

2. configure build and test in a variety of OS environments - this would 
make sense using container images.

I don't have much experience testing inside containers from Travis, but 
if we can do both (1) and (2) inside containers that might make sense?  
Or can we do both separately?  If you have a .travis.yml which works 
then I'd say go commit it and we can work from there.

Regards, Joe

> If you are strongly against the above or have more suggestions and
> knowledge about Travis please let me know!




Re: Migrate to git?

2019-10-14 Thread Joe Orton
On Mon, Oct 14, 2019 at 04:51:56PM +0200, Stefan Sperling wrote:
> On Mon, Oct 14, 2019 at 03:27:31PM +0100, Joe Orton wrote:
> > At the moment I think we have a quality control problem for 2.4.x, yet I 
> > find it hard to justify spending much time on writing test cases because 
> > that stuff is run so rarely.  How many tests proposed in 2.4.x STATUS 
> > have had a full test suite run?  I certainly don't always do it.  But if 
> > we get the tests running all the time automatically it's much easier to 
> > see a return on investment for improving test coverage.
> 
> I see there's already a buildbot job for httpd trunk:
> https://ci.apache.org/waterfall?tag=httpd-trunk
> It seems this build job is configured to run a compile but it does not
> run the test suite? Putting Github/Travis questions aside, an easy way
> to get automated tests going with minimal effort today could be to run
> the existing tests inside httpd's existing buildbot job.
> 
> The Subversion project has been doing this for years.
> See https://subversion.apache.org/buildbot/all

Yes, we have a buildbot which is great, but compared to a GitHub+Travis 
workflow it has two downsides: buildbot is horribly complicated, and it 
only runs *after* commits.  Seeing test results before merging a feature 
to trunk to 2.4.x is a world from waiting for a buildbot run.

(To the first: adding a new build with different ./configure arguments 
is a typically trivial copy&paste with .travis.yml even for somebody who 
doesn't know Travis syntax; I spent an hour reading our bb config once 
and was not confident to try changing it)

Regards, Joe


Re: Migrate to git?

2019-10-14 Thread Joe Orton
On Tue, Oct 08, 2019 at 02:17:12PM +0200, Luca Toscano wrote:
> Il giorno mar 8 ott 2019 alle ore 11:04 Greg Stein 
> ha scritto:
> >
> > Travis CI is possible *today* ... since the svn commits are 
> > replicated over to github, Travis can pick them up and run tests. 
> > Just file an INFRA ticket to enable it.
> >
> 
> Thanks for the pointer, will file a task to infra to enable it :)

This would be awesome, did you file something?  If not I can.

Like others I am no fan of git, but so long as we can get the process 
right, I think using PRs with CI would be a significant benefit for the 
project.  At minimum we could avoid some of the trivial build breakage 
type issues which have delayed 2.4 releases recently.

Even for trunk I would like to be able to develop new features and have 
a full test suite run (e.g. w/pool-debug, with different APR releases, 
etc etc) easily available without having to wait an hour with laptop 
fans giving me a headache.

At the moment I think we have a quality control problem for 2.4.x, yet I 
find it hard to justify spending much time on writing test cases because 
that stuff is run so rarely.  How many tests proposed in 2.4.x STATUS 
have had a full test suite run?  I certainly don't always do it.  But if 
we get the tests running all the time automatically it's much easier to 
see a return on investment for improving test coverage.

Regards, Joe


Re: Migrate to git?

2019-10-08 Thread Joe Orton
On Sat, Oct 05, 2019 at 04:09:34PM -0400, Jim Jagielski wrote:
> Various PMCs have made their default/de-facto SCM git and have seen an 
> increase in contributions and contributors...
> 
> Is this something the httpd project should consider? Especially w/ the 
> foundation officially supporting Github, it seems like time to have a 
> discussion about it, especially as we start thinking about the next 25 
> years of this project :)

Can we use Travis CI as well?  If so I am +1 on moving to github, being 
able to easily configure a consistent CI across branches and PRs will be 
a major improvement over the status quo.  (I have no idea how buildbot 
works or how to improve it and it's unusuable before commits)

Regards, Joe


Re: Backporting .gdbinit changes CTR?

2019-09-02 Thread Joe Orton
On Mon, Sep 02, 2019 at 08:12:38AM +0200, Ruediger Pluem wrote:
> I vaguely remember that backports of changes to .gdbinit are CTR but I cannot 
> find it any longer.
> So if we would agree on it I would add it to the "Current exceptions for RTC 
> for this branch:" in the STATUS file.
> Comments / feedback?

Definitely +1 on CTR for that.

Regards, Joe


Re: svn commit: r1864526 - /httpd/httpd/trunk/modules/metadata/mod_remoteip.c

2019-08-28 Thread Joe Orton
On Wed, Aug 28, 2019 at 02:24:40PM +0200, Ruediger Pluem wrote:
> On 08/06/2019 05:41 PM, jor...@apache.org wrote:
> > Author: jorton
> > Date: Tue Aug  6 15:41:22 2019
> > New Revision: 1864526
...
> > +ret = apr_brigade_length(ctx->bb, 1, &got);
> > +if (ret || got > want) {
> > +ap_log_cerror(APLOG_MARK, APLOG_ERR, ret, f->c, 
> > APLOGNO(10185)
> > +  "RemoteIPProxyProtocol header too long, "
> > +  "got %" APR_OFF_T_FMT " expected %" 
> > APR_OFF_T_FMT,
> > +  got, want);
> > +f->c->aborted = 1;
> 
> Shouldn't we do apr_brigade_destroy(ctx->bb) here as well like below?

The apr_brigade_destroy() calls should be all redundant in the failure 
cases AFAICT.  Can you see a reason why they need to be explicitly 
destroyed prior to the pool cleanups running?





Re: Disabling TLS session tickets causes handshake failures

2019-08-13 Thread Joe Orton
On Tue, Aug 13, 2019 at 02:50:17PM -0700, Brad Warren wrote:
> * httpd 2.4.18 (from Ubuntu 16.04)
> * httpd 2.4.25 (from Debian 9)
> * httpd 2.4.39 (from Amazon Linux 2)
> 
> They were presumably using the version of OpenSSL available in those 
> distributions as well although I haven’t been able to verify that yet.
> 
> The affected clients and their error messages were:
> 
> * Chrome: ERR_SSL_PROTOCOL_ERROR
> * Firefox: SSL_ERROR_RX_UNEXPECTED_NEW_SESSION_TICKET
> * Wget: OpenSSL: error:1408E0F4:SSL routines:ssl3_get_message:unexpected 
> message

I'm not aware of an issue like this, it might be a config which does not 
get widely tested.

Trying "SSLSessionTickets off" on Fedora 30 (httpd 2.4.39 + openssl 
1.1.1c) seems to work fine with TLSv1.1 thru v1.3 selected client side 
(w/same OpenSSL).

Can we get full mod_ssl configuration info, OpenSSL version and ideally 
the associated ssl_error_log output?

Regards, Joe

> 
> Clients that were reported to be unaffected were:
> 
> * OpenSSL s_client
> * Safari
> 
> Curl was affected for some users, but not others. One report of a failure 
> with curl provided the output below from which I removed their domain and IP:
> 
> $ curl -v https://example.com
> * Rebuilt URL to: https://example.com/
> *   Trying 1.2.3.4...
> * TCP_NODELAY set 
> * Connected to example.com (1.2.3.4) port 443 (#0)
> * schannel: SSL/TLS connection with example.com port 443 (step 1/3)
> * schannel: checking server certificate revocation
> * schannel: sending initial handshake data: sending 172 bytes...
> * schannel: sent initial handshake data: sent 172 bytes
> * schannel: SSL/TLS connection with example.com port 443 (step 2/3)
> * schannel: failed to receive handshake, need more data
> * schannel: SSL/TLS connection with example.com port 443 (step 2/3)
> * schannel: encrypted data got 2960
> * schannel: encrypted data buffer: offset 2960 length 4096
> * schannel: sending next handshake data: sending 126 bytes...
> * schannel: SSL/TLS connection with example.com port 443 (step 2/3)
> * schannel: encrypted data got 258 
> * schannel: encrypted data buffer: offset 258 length 4096
> * schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE 
> (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is 
> received (e.g. handshake failed). More detail may be available in the Windows 
> System event log.
> * Closing connection 0
> * schannel: shutting down SSL/TLS connection with example.com port 443 
> * schannel: clear security context handle
> curl: (35) schannel: next InitializeSecurityContext failed: 
> SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal 
> SSL/TLS alert is received (e.g. handshake failed). More detail may be 
> available in the Windows System event log.
> 
> Any ideas about what is going on here? So far we have been unable to even 
> reproduce the problem.
> 
> Thanks for any help,
> Brad Warren
> Senior Staff Technologist
> Electronic Frontier Foundation


Re: [VOTE] Release httpd-2.4.41

2019-08-11 Thread Joe Orton
On Fri, Aug 09, 2019 at 08:40:38AM -0500, Daniel Ruggeri wrote:
> Hi, all;
> Please find below the proposed release tarball and signatures:
> https://dist.apache.org/repos/dist/dev/httpd/
> 
> I would like to call a VOTE over the next few days to release this candidate 
> tarball as 2.4.41:
> [ ] +1: It's not just good, it's good enough!
> [ ] +0: Let's have a talk.
> [ ] -1: There's trouble in paradise. Here's what's wrong.
> 
> The computed digests of the tarball up for vote are:
> sha1: b713e835aa7cde823a4b7f8e3463164f3d9fe63e *httpd-2.4.41.tar.gz
> sha256: 3c0f9663240beb0f008acf3b4501c4f339d7467ee345a36c86c46b4d6f3a5461  
> *httpd-2.4.41.tar.gz

+1 for release, CHANGES good, digests good, test suite passes on Fedora 
30/x86_64.

Thanks for RMing!

Regards, Joe


<    1   2   3   4   5   6   7   8   9   10   >