I don't have any really useful insights, but I do note:

* I can't match up the line numbers in the stack trace to the kfw-4.1
  source code; for instance, in the kfw-4.1-final tag,
  krb5_sendto_kdc() runs from lines 412-493, but the stack trace shows
  line 507 as being a call from krb5_sendto_kdc() to service_fds().

* There are many calls (over 12K) to service_tcp_write(), suggesting
  that select() is reporting the socket as writable when it isn't yet.

* It would be very useful to know the value of nwritten after each call
  to SOCKET_WRITEV().

On 05/08/2018 06:28 AM, Puran Chand wrote:
> Hi,
>
> PFA logs for the same.
>
> Any pointers regarding this is highly appreciated.
>
> -Puran
>
> On Wed, Feb 14, 2018 at 10:15 AM, Puran Chand <[email protected]
> <mailto:[email protected]>> wrote:
>
>     Also, the crash is seen with kfw-4.1 dlls as well(based on 1.13
>     version).
>
>
>     On Wed, Feb 14, 2018 at 10:04 AM, Puran Chand <[email protected]
>     <mailto:[email protected]>> wrote:
>
>         Hi Greg,
>
>         We can rule out the first possibility because had it been
>         NULL, SOCKET_WRITEV() it self will crash or at-least return an
>         error which will be handled immediately in next statement.
>
>         if (nwritten < 0) {
>             TRACE_SENDTO_KDC_TCP_ERROR_SEND(context, &conn->addr,
>                                             SOCKET_ERRNO);
>             kill_conn(context, conn, selstate);
>             return FALSE;
>         }
>
>         About third part where it could have been corrupted by another
>         thread, I want to inform (hope it helps) that my application is
>         single threaded.
>         Also the final token size for the user will go up-to 30k bytes.
>
>         I will keep looking and will keep you posted for further
>         assistance.
>
>         Appreciate all help, Thanks.
>
>         On Tue, Feb 13, 2018 at 9:03 PM, Greg Hudson <[email protected]
>         <mailto:[email protected]>> wrote:
>
>             On 02/12/2018 11:44 PM, Puran Chand wrote:
>             > The code works fine and generates token most of the time
>             > but once in a while it crashes and the crash happens in
>             > library.
>
>             I have looked at the stack traces and have a vague idea of
>             the problem area, but I don't see a bug in the code, nor do
>             I see any potentially related changes to sendto_kdc.c
>             between 1.16 and the last KfW release. I will describe
>             what's going on in case it helps you debug this further.
>
>             sendto_kdc.c:1113 (in krb5 1.16) is "if ((size_t)nwritten <
>             SG_LEN(sgp))", where SG_LEN(sgp) is sgp->len. Since the
>             code is crashing here, sgp is presumably a null or invalid
>             pointer.
>
>             sgp is set from conn->out.sgp. conn->out.sgp should have
>             been initialized to state->out.sgbuf in add_connection().
>             sgbuf is an array of scatter-gather vectors of up to two
>             elements. (We use this array to avoid having to recopy the
>             packet for TCP requests, while still sending the length and
>             the packet in one write operation.)
>
>             I can think of three general possibilities:
>
>             * conn->out.sgp for some reason never got set, so is a null
>             pointer at the time of the crash. But I don't know why it
>             wouldn't have been set.
>
>             * conn->out.sgp is incremented during each iteration of the
>             loop (at line 1119) until we run out of written bytes to
>             account for. If nwritten is for some reason much larger
>             than it should be, conn->out.sgp could run off the end of
>             conn->out.sgbuf by enough to produce a segmentation fault.
>             But I don't know why nwritten would ever be larger than the
>             lengths of the two scatter-gather vectors.
>
>             * conn->out.sgp could have been corrupted by a memory error
>             elsewhere. Since sendto_kdc() is synchronous, I would think
>             the corruption would have to have occurred in another
>             thread.
>
_______________________________________________
kfwdev mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/kfwdev
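
To make the nwritten question concrete, here is a rough sketch of the kind of instrumented advance loop that might help narrow this down. It is not the sendto_kdc.c code: sg_buf, struct out_state, out_init() and advance_after_write() are hypothetical stand-ins I made up for illustration, and the real loop has no bounds check. The idea is only to log nwritten after each write and to refuse to step the vector pointer past the end of the two-element array, which is the failure Greg's second possibility describes.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one scatter-gather vector. */
typedef struct {
    const char *buf;
    size_t len;
} sg_buf;

/* Hypothetical stand-in for the outgoing half of a TCP connection. */
struct out_state {
    sg_buf sgbuf[2];    /* [0] = length prefix, [1] = packet */
    sg_buf *sgp;        /* next vector to send */
    int sg_count;       /* vectors still to be sent */
};

/* Set up both vectors and point sgp at the first one. */
void out_init(struct out_state *out, const char *hdr, size_t hdr_len,
              const char *msg, size_t msg_len)
{
    out->sgbuf[0].buf = hdr;
    out->sgbuf[0].len = hdr_len;
    out->sgbuf[1].buf = msg;
    out->sgbuf[1].len = msg_len;
    out->sgp = out->sgbuf;
    out->sg_count = 2;
}

/*
 * Account for nwritten bytes reported by a write call.  Returns 0 on
 * success, or -1 if nwritten claims more bytes than were queued, in
 * which case sgp is left inside the array instead of running off it.
 */
int advance_after_write(struct out_state *out, size_t nwritten)
{
    /* Log the value we are curious about after every write. */
    fprintf(stderr, "nwritten=%lu sg_count=%d\n",
            (unsigned long)nwritten, out->sg_count);

    while (nwritten > 0) {
        if (out->sg_count <= 0) {
            fprintf(stderr, "nwritten exceeds queued data by %lu bytes\n",
                    (unsigned long)nwritten);
            return -1;
        }
        if (nwritten < out->sgp->len) {
            /* Partial write of this vector: skip the bytes sent. */
            out->sgp->buf += nwritten;
            out->sgp->len -= nwritten;
            nwritten = 0;
        } else {
            /* Vector fully written: move on to the next one. */
            nwritten -= out->sgp->len;
            out->sgp++;
            out->sg_count--;
        }
    }
    return 0;
}
```

If a check like this ever fired in the real code, it would point at the value coming back from SOCKET_WRITEV() rather than at conn->out.sgp itself.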
