At 12:26 AM 10/13/2002, you wrote:

OK, it's dying inside the name resolution routines, trying to obtain the
contents of a class IN record of type PTR for "42.192.99.68.in-addr.arpa".
You can do this yourself by doing "nslookup -q=ptr
42.192.99.68.in-addr.arpa", but I don't think you're going to see the
problem, because this is using the multithreaded version of the call, and
nslookup probably uses the single-threaded version.
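
As an illustration of where that query name comes from, here is a minimal C
sketch that turns a dotted-quad address into the name passed to the resolver
(class=1 is IN and type=12 is PTR in the backtrace). The function name
ptr_query_name is hypothetical; this is not code from nsd/dns.c or glibc.

```c
#include <stdio.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper: build the reverse-lookup (PTR) query name for a
 * dotted-quad IPv4 address by reversing its octets and appending
 * ".in-addr.arpa". Returns 0 on success, -1 if the address does not
 * parse or the output buffer is too small. */
int ptr_query_name(const char *dotted, char *out, size_t outlen)
{
    struct in_addr a;
    const unsigned char *b;
    int n;

    if (inet_pton(AF_INET, dotted, &a) != 1)
        return -1;

    /* s_addr is in network byte order, so b[0] is the first octet. */
    b = (const unsigned char *)&a.s_addr;
    n = snprintf(out, outlen, "%u.%u.%u.%u.in-addr.arpa",
                 b[3], b[2], b[1], b[0]);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Called with "68.99.192.42" and a 64-byte buffer, it should fill the buffer
with the same name that appears in frame #0 of the backtrace.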

Do you have a restriction in nsperm on what addresses may access as
nsadmin?  From what I'm reading in the code, nsd/dns.c is trying to verify
that the IP address is allowed to access as the specified user (in this
case 68.99.192.42 as "nsadmin").  If so, try removing the address
restriction and see if the segfault goes away.  If not, say so, and
perhaps someone else may notice something I'm missing.

Yes, I have a restriction based on IP, and taking it out removes the
problem.  I am starting to think that maybe something that got updated with
up2date hosed something here...

I have another server that has three instances of AOLserver running, and
one of the three has started doing exactly the same thing.  This one is RH
7.0.  It also has had up2date run on it recently.  But even restarting the
other two servers, to be sure they are running in the same state as the one
that's crashing, doesn't induce them to crash.

This behavior is all very recent; the hosts.allow setup has been
established for some time.

I also did some further testing.  I had forgotten that an address lookup
script stopped working as well.  The brunt of the page is simply:

  set ipaddr [ns_conn peeraddr]
  set host [get_host]
  if { [catch { set hostaddr [ns_hostbyaddr $ipaddr] }] } {
    set hostaddr "<unknown>"
  }

and then returning that in HTML.  I use it for administrative stuff at
times.  I just did some testing, and ns_hostbyaddr is the call that's
crashing from that routine.



Pete.

On Sat, 12 Oct 2002, Patrick Spence wrote:

> At 03:26 PM 10/12/2002, you wrote:
>
> I am using Red Hat 7.2, latest revisions via up2date.  AOLserver 3.4.2.
>
> No core file in evidence, but the gdb instructions you gave were
> perfect, and I definitely appreciate the assist here.
>
>
> here is the result:
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 6151 (LWP 23047)]
> 0x401d6ca0 in __res_nquery (statp=0x40503ddc, name=0x404f2ecc
> "42.192.99.68.in-addr.arpa",
>      class=1, type=12, answer=0x404f32dc "", anslen=65536) at
> res_query.c:110
> 110     res_query.c: No such file or directory.
>          in res_query.c
>
>
> stack backtrace:
>
> #0  0x401d6ca0 in __res_nquery (statp=0x40503ddc, name=0x404f2ecc
> "42.192.99.68.in-addr.arpa", class=1, type=12, answer=0x404f32dc "",
> anslen=65536) at res_query.c:110
> #1  0x401cd1f8 in _nss_dns_gethostbyaddr_r (addr=0x405033b0, len=4, af=2,
> result=0x4019dca4,buffer=0x84bddd8 "\177", buflen=1024, errnop=0x40503c68,
> h_errnop=0x40503374) at nss_dns/dns-host.c:292
> #2  0x40162919 in __gethostbyaddr_r (addr=0x405033b0, len=4, type=2,
> resbuf=0x4019dca4, buffer=0x84bddd8 "\177", buflen=1024, result=0x40503370,
> h_errnop=0x40503374) at ../nss/getXXbyYY_r.c:200
> #3  0x401626fb in gethostbyaddr (addr=0x405033b0, len=4, type=2) at
> ../nss/getXXbyYY.c:131
> #4  0x0806dffb in GetHost (dsPtr=0x4050345c, addr=0x821057c "68.99.192.42")
> at dns.c:231
> #5  0x0806de24 in DnsGet (getProc=0x806dfa0 <GetHost>, dsPtr=0x4050345c,
> cachePtr=0x813cdec, key=0x821057c "68.99.192.42") at dns.c:153
> #6  0x0806dc5f in Ns_GetHostByAddr (dsPtr=0x4050345c, addr=0x821057c
> "68.99.192.42") at dns.c:100
> #7  0x4023acc3 in ValidateUserAddr (userPtr=0x81fdeb8, peer=0x821057c
> "68.99.192.42") at nsperm.c:819
> #8  0x4023a4c1 in AuthProc (server=0x8156fb8 "ariven", method=0x8439640
> "GET", url=0x8439660 "/NS/Admin", user=0x83d07b8 "nsadmin", pass=0x9999999
> "XXXXXXX", peer=0x821057c "68.99.192.42") at nsperm.c:435
> #9  0x08061bb3 in Ns_AuthorizeRequest (server=0x8156fb8 "ariven",
> method=0x8439640 "GET", url=0x8439660 "/NS/Admin", user=0x83d07b8
> "nsadmin", passwd=0x9999999 "XXXXXXX", peer=0x821057c "68.99.192.42") at
> auth.c:76
> #10 0x0807e250 in ConnRun (connPtr=0x8210520) at serv.c:873
> #11 0x0807dc10 in NsConnThread (arg=0x822b3d0) at serv.c:671
> #12 0x0811833b in NsThreadMain (arg=0x822b3e0) at thread.c:228
> #13 0x40032c6f in pthread_start_thread (arg=0x40503be0) at manager.c:284
> #14 0x40032d5f in pthread_start_thread_event (arg=0x40503be0) at
> manager.c:308
>
>
>
> >If the process segfaults, it should leave a file named "core" in the
> >directory from which the process was started.  There are, however, a
> >number of reasons why a core file may have been suppressed; to figure this
> >out, if you don't have a core file, you'd need to say which OS you're
> >using.
> >
> >If you have gdb, you can run the aolserver process as follows:
> >
> >         - Change directories to the aolserver root dir
> >         - run "gdb bin/nsd8x"
> >         - Give gdb the command "set args -ft yourconfigfile.tcl"
> >                 (replace "yourconfigfile.tcl" with the name of
> >                 your actual config file)
> >         - give gdb the "run" command
> >         - induce the segfault.  You should get a gdb prompt.
> >         - Give gdb the command "where" and it will produce a stack
> >backtrace, which should tell you what the thread was executing when it
> >segfaulted.
> >
> >If your stack backtrace shows that the process died somewhere inside
> >"malloc", then something somewhere corrupted the dynamic memory allocation
> >heap of your process prior to the segfault.  These are really hard to
> >track down.
> >
> >Pete.
>
Patrick Spence <ariven AT ariven DOT com>
www.RandomRamblings.com
www.Ariven.com
