Bus error for proxy_hcheck on Solaris

2016-01-21 Thread Rainer Jung

Probably an alignment problem, but I don't immediately see how:

at modules/proxy/mod_proxy_hcheck.c:410
wctx = 0x70706461
So this is not an address usable for a pointer.

But wctx comes from hc->s->context:

{name = "f9328\000/www.kippdata.de/", '\000' , scheme 
= "http", '\000' , hostname = "www.kippdata.de", 
'\000' ,
  route = '\000' , redirect = '\000' times>, flusher = '\000' , uds_path = '\000' <repeats 255 times>,
  hcuri = '\000' , hcexpr = '\000' times>, "\001", lbset = 0, retries = 0, lbstatus = 0, lbfactor = 1, 
min = 0, smax = 1, hmax = 0,
  flush_wait = 1, index = 0, passes = 1815278865, pcount = -1571419579, 
fails = 3, fcount = 0, hash = {def = 2239854630, fnv = 2173387474}, 
status = 10,
  flush_packets = flush_off, method = GET, updated = 0, error_time = 
6000, ttl = 0, retry = 0, timeout = 0, acquire = 0, ping_timeout = 
3000, conn_timeout = 0,
  interval = 0, recv_buffer_size = 5242880, io_buffer_size = 0, elected 
= 0, busy = 0, port = 80, transferred = 536870912, read = 
3400067598672358249, context = 0x70706461,
  keepalive = 0, disablereuse = 0, is_address_reusable = 0, retry_set = 
1, timeout_set = 0, acquire_set = 1, ping_timeout_set = 0, 
conn_timeout_set = 0,
  recv_buffer_size_set = 0, io_buffer_size_set = 1, keepalive_set = 1, 
disablereuse_set = 0, was_malloced = 0, is_name_matchable = 0}


and the context member gets set in line 403:

hc->context = wctx;

If I add a debug statement there the address looks different and fine. I 
don't immediately see where the address changes. Sparc is known to be 
picky about alignment. One might not observe this on x86.


Regards,

Rainer


Re: Bus error for proxy_hcheck on Solaris

2016-01-21 Thread Rainer Jung
I should add that I'm using the prefork MPM (event and worker have other
problems) and that I didn't access the web server at all. Simply starting
it and waiting a bit results in those cores.


A complete stack:

#0  hc_get_hcworker (ctx=ctx@entry=0xe3818, worker=worker@entry=0xf9328, 
p=p@entry=0x11ff38)

at modules/proxy/mod_proxy_hcheck.c:410
wctx = 0x70706461
hc = 0xe59f8
wptr = 0xe6058 "f9328"

#1  0xfeb432f4 in hc_check_http (worker=0xf9328, p=0x11ff38, 
ctx=0xe3818) at modules/proxy/mod_proxy_hcheck.c:650

status = <optimized out>
c = {pool = 0xfeeba558, base_server = 0xfe8fbe60, 
vhost_lookup_data = 0xff08a398, local_addr = 0x0, client_addr = 0x1c00, 
client_ip = 0x0, remote_host = 0x0,
  remote_logname = 0x0, local_ip = 0x5f5e100 <Address 0x5f5e100 out of bounds>,
  local_host = 0xfeb49788 "modules/proxy/mod_proxy_hcheck.c", 
id = -21649064, conn_config = 0x529dd,
  notes = 0x6ed21e3c, input_filters = 0x529dd, output_filters = 
0xe3818, sbh = 0xcc290, bucket_alloc = 0xfeb5a0fc, cs = 0x0, 
data_in_input_filters = 10,
  data_in_output_filters = 0, clogging_input_filters = 0, 
double_reverse = -1, aborted = 0, keepalive = 188, keepalives = 
-24133952, log = 0xfecc2b7c,
  log_id = 0xfe8fbec0 
"▒▒y`▒▒\223\060▒▒\207\220▒▒\222▒▒\217▒<", current_thread = 
0xfecc2aec, slaves = 0x0, master = 0x0, ctx = 0xfecc8b18,
  suspended_baton = 0xfe8fbf20, requests = 0x0, empty = 0x0, 
filters = 0x56a136d3, async_filter = 335104}

wctx = <optimized out>
cond = 0x4
backend = 0x0
hc = <optimized out>
r = 0x6ed21e3c
method = <optimized out>

#2  hc_check (worker=0xf9328, now=1453405907335111, p=0x11ff38, 
ctx=0xe3818) at modules/proxy/mod_proxy_hcheck.c:757

s = 0xcc290
rv = <optimized out>

#3  hc_watchdog_callback (state=<optimized out>, data=0xe3818, 
pool=<optimized out>) at modules/proxy/mod_proxy_hcheck.c:839

n = 0
workers = 0xf8bb0
worker = 0xf9328
i = <optimized out>
rv = <optimized out>
now = <optimized out>
balancer = <optimized out>
ctx = 0xe3818
s = <optimized out>
conf = <optimized out>
p = 0x11ff38
#4  0xfecc2d08 in wd_worker (thread=<optimized out>, data=0xd3710) at 
modules/core/mod_watchdog.c:203

ctx = 0x11df30
curr = 1453405907235104
wl = 0xd37f8
w = 0xd3710
rv = <optimized out>
locked = 1
probed = <optimized out>
inited = 0
mpmq_s = 1
#5  0xff095468 in dummy_worker (opaque=0x107f88) at 
apr-1.5.2/threadproc/unix/thread.c:142

thread = 0x107f88

Config in global server:

ProxyHCExpr ok234 {%{REQUEST_STATUS} =~ /^[234]/}

ProxyPass "/" "balancer://mycluster/"

BalancerMember "http://www.kippdata.de/" \
hcexpr=ok234 hcmethod=GET hcinterval=10


Regards,

Rainer



Re: Bus error for proxy_hcheck on Solaris

2016-01-21 Thread Jim Jagielski
That is weird... does this help?

hc->context = (void *)wctx;

??

> On Jan 21, 2016, at 2:59 PM, Rainer Jung wrote:
> 
> and the context member gets set in line 403:
> 
> hc->context = wctx;



Re: Bus error for proxy_hcheck on Solaris

2016-01-21 Thread Jim Jagielski
Hold on a tick... I don't use the context field in ->s.



Re: Bus error for proxy_hcheck on Solaris

2016-01-21 Thread Jim Jagielski
I found one place, which we should never hit, that
does adjust hc->context, but that's only if the
bal-mgr changes settings, which so far it can't.

Anyway, removed that.


Re: Bus error for proxy_hcheck on Solaris

2016-01-21 Thread Jim Jagielski
Based on your stack, then that was the section you hit. But I
have no idea how you hit it. The test is:

if (hc->s->method != worker->s->method)

but neither is ever changed :/

> On Jan 21, 2016, at 3:22 PM, Jim Jagielski wrote:
> 
> I found one place, which we should never hit, that
> does adjust hc->context, but that's only if the
> bal-mgr changes settings, which so far it can't.
> 
> Anyway, removed that.



Re: Bus error for proxy_hcheck on Solaris

2016-01-21 Thread Rainer Jung

At least it works now: probing works, and so does hc(). Cool.

Rainer

On 21.01.2016 at 21:28, Jim Jagielski wrote:
> Based on your stack, then that was the section you hit. But I
> have no idea how you hit it. The test is:
>
>  if (hc->s->method != worker->s->method)

I'll have a look.

> but neither is ever changed :/
>
>> On Jan 21, 2016, at 3:22 PM, Jim Jagielski wrote:
>>
>> I found one place, which we should never hit, that
>> does adjust hc->context, but that's only if the
>> bal-mgr changes settings, which so far it can't.
>>
>> Anyway, removed that.


Re: Bus error for proxy_hcheck on Solaris

2016-01-21 Thread Jim Jagielski
I added logging to that section... let me know if you see
'Updating hc worker ...' (it's @ DEBUG) because, if you do,
something is wonky.

> On Jan 21, 2016, at 3:53 PM, Rainer Jung wrote:
> 
> At least it works now, does probing and also hc() works. Cool.



Re: Bus error for proxy_hcheck on Solaris

2016-01-21 Thread Yann Ylavic
On Thu, Jan 21, 2016 at 9:53 PM, Rainer Jung wrote:
>
> Am 21.01.2016 um 21:28 schrieb Jim Jagielski:
>>
>> Based on your stack, then that was the section you hit. But I
>> have no idea how you hit it. The test is:
>>
>>  if (hc->s->method != worker->s->method)
>
>
> I'll have a look.

Could it be that sizeof(flush_packets) == 1 in struct
proxy_worker_shared (still 'apr_time_t updated' would be aligned but
possibly not 'method')?
What's APR_OFFSETOF(proxy_worker_shared, method)?

(Btw, shouldn't new 'method' field go to the end of the struct?)