Hi Matt

Trying out your patch. Will keep you posted. In the meantime we ran into more
valgrind issues, this time on the server end. Can you please comment on them?

==621== 8,680 (1,488 direct, 7,192 indirect) bytes in 62 blocks are definitely lost in loss record 899 of 952
==621==    at 0x4A05F80: malloc (vg_replace_malloc.c:296)
==621==    by 0x5BFCC86: default_malloc_ex (mem.c:79)
==621==    by 0x5BFD315: CRYPTO_malloc (mem.c:308)
==621==    by 0x5D2414D: pitem_new (pqueue.c:73)
==621==    by 0x5958F74: dtls1_buffer_message (d1_both.c:1233)
==621==    by 0x594E3B2: dtls1_send_server_done (d1_srvr.c:1032)
==621==    by 0x594D696: dtls1_accept (d1_srvr.c:564)
==621==    by 0x595C555: SSL_accept (ssl_lib.c:940)
==621==    by 0x59539F7: dtls1_listen (d1_lib.c:491)
==621==    by 0x59533BF: dtls1_ctrl (d1_lib.c:267)
==621==    by 0x595CAF2: SSL_ctrl (ssl_lib.c:1106)
==621==    by 0x416229: server_ssl_event_cb (server.c:3823)
==621==
==621== 67,766 (1,488 direct, 66,278 indirect) bytes in 62 blocks are definitely lost in loss record 933 of 952
==621==    at 0x4A05F80: malloc (vg_replace_malloc.c:296)
==621==    by 0x5BFCC86: default_malloc_ex (mem.c:79)
==621==    by 0x5BFD315: CRYPTO_malloc (mem.c:308)
==621==    by 0x5D2414D: pitem_new (pqueue.c:73)
==621==    by 0x5958F74: dtls1_buffer_message (d1_both.c:1233)
==621==    by 0x594FAD4: dtls1_send_server_certificate (d1_srvr.c:1612)
==621==    by 0x594D367: dtls1_accept (d1_srvr.c:426)
==621==    by 0x595C555: SSL_accept (ssl_lib.c:940)
==621==    by 0x59539F7: dtls1_listen (d1_lib.c:491)
==621==    by 0x59533BF: dtls1_ctrl (d1_lib.c:267)
==621==    by 0x595CAF2: SSL_ctrl (ssl_lib.c:1106)
==621==    by 0x416229: server_ssl_event_cb (server.c:3823)
==621==
==621== LEAK SUMMARY:
==621==    definitely lost: 2,976 bytes in 124 blocks
==621==    indirectly lost: 73,470 bytes in 248 blocks
==621==      possibly lost: 288 bytes in 1 blocks
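
As a quick sanity check on the numbers above, here is a minimal Python sketch that sums the direct and indirect bytes of the two loss records so they can be compared against the LEAK SUMMARY. The regex is an assumption based only on the wording of these two records, and `to_int` and `totals` are hypothetical helper names, not part of any tool.

```python
import re

# Loss-record headers copied from the valgrind output above
# (the wrapped header lines are joined into one line each).
RECORDS = [
    "8,680 (1,488 direct, 7,192 indirect) bytes in 62 blocks are "
    "definitely lost in loss record 899 of 952",
    "67,766 (1,488 direct, 66,278 indirect) bytes in 62 blocks are "
    "definitely lost in loss record 933 of 952",
]

# Assumed header shape: "<total> (<direct> direct, <indirect> indirect) bytes in <n> blocks"
PATTERN = re.compile(
    r"([\d,]+) \(([\d,]+) direct, ([\d,]+) indirect\) bytes in ([\d,]+) blocks"
)


def to_int(text):
    """Convert a valgrind-style number such as '7,192' to an int."""
    return int(text.replace(",", ""))


def totals(records):
    """Return (direct bytes, indirect bytes, directly lost blocks) summed over records."""
    direct = indirect = blocks = 0
    for line in records:
        match = PATTERN.search(line)
        if match is None:
            continue  # not a loss-record header of the assumed shape
        total, d, i, b = (to_int(group) for group in match.groups())
        # Each header's total should equal direct + indirect.
        assert total == d + i
        direct += d
        indirect += i
        blocks += b
    return direct, indirect, blocks


print(totals(RECORDS))
```

The sums come to 2,976 direct bytes in 124 blocks and 73,470 indirect bytes, which matches the "definitely lost" and "indirectly lost" lines of the summary. So these two dtls1_buffer_message call paths account for all of the definite leaks in this run.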


Thanks
-Praveen



On Tue, Nov 25, 2014 at 6:28 AM, Matt Caswell via RT <r...@openssl.org> wrote:

> On Mon Nov 24 21:52:04 2014, prav...@viptela.com wrote:
> > * state = 4384,*
>
> This is SSL3_ST_CR_SRVR_HELLO_A, i.e. we are trying to read a ServerHello.
> This confirms what we expected.
>
>
> > > So if s->init_num is 0 then frag_len is 0 and frag->fragment gets set to
> > > NULL.
>
> What I missed in the above is that there are some OPENSSL_assert calls in
> dtls1_buffer_message that check init_num, so it cannot be 0. Something else
> is happening.
>
>
> > *Agreed. All good points. Just another data point: we ran valgrind on
> > another node and saw a leak in this related code. See if this helps you.*
> >
> > *==697== HEAP SUMMARY:
> > ==697== in use at exit: 1,282,108 bytes in 20,788 blocks
> > ==697== total heap usage: 664,349 allocs, 643,561 frees, 105,419,006
> > bytes allocated
> > ==697==
> > ==697== 120 bytes in 1 blocks are definitely lost in loss record 27 of
> > 96
> > ==697== at 0x4A05F80: malloc (vg_replace_malloc.c:296)
> > ==697== by 0x5BFBC86: default_malloc_ex (mem.c:79)
> > ==697== by 0x5BFC315: CRYPTO_malloc (mem.c:308)
> > ==697== by 0x5955875: dtls1_hm_fragment_new (d1_both.c:199)
> > ==697== by 0x5956817: dtls1_reassemble_fragment (d1_both.c:625)
> > ==697== by 0x595720A: dtls1_get_message_fragment (d1_both.c:852)
> > ==697== by 0x5956174: dtls1_get_message (d1_both.c:443)
> > ==697== by 0x59504DA: dtls1_get_hello_verify (d1_clnt.c:918)
> > ==697== by 0x594F5AB: dtls1_connect (d1_clnt.c:360)
> > ==697== by 0x595B591: SSL_connect (ssl_lib.c:949)
> > ==697== by 0x430409: ssl_connect_timer_cb (vdaemon_peer.c:303)
> > ==697== by 0x48573E: timer_exec_pri (timer.c:612)
> > ==697==
> > ==697== LEAK SUMMARY:
> > ==697== definitely lost: 120 bytes in 1 blocks
> > ==697== indirectly lost: 0 bytes in 0 blocks
> > ==697== possibly lost: 0 bytes in 0 blocks
> > ==697== still reachable: 1,281,988 bytes in 20,787 blocks
> > ==697== suppressed: 0 bytes in 0 blocks
> > ==697== Reachable blocks (those to which a pointer was found) are not
> > shown.
> > ==697== To see them, rerun with: --leak-check=full
> > --show-leak-kinds=all
> > ==697==
> > ==697== For counts of detected and suppressed errors, rerun with: -v
> > ==697== Use --track-origins=yes to see where uninitialised values come
> > from *
> >
> > *==697== ERROR SUMMARY: 126394 errors from 117 contexts (suppressed: 1
> > from
> > 1)*
> >
>
> That's very interesting. I've tracked that down to a problem in
> dtls1_clear_queues, which is failing to correctly free buffered fragments.
> I've attached a patch. Please let me know if you have any problems with it.
> Unfortunately I think this is unconnected to your main problem.
>
> >
> >
> > > If I sent you some instrumented code would you be able to apply it and
> > > see if that helps us narrow down what's going on?
> > >
> >
> > *[viptela.com <http://viptela.com>] *
> >
> > *Of course. But as I mentioned earlier, we don't know the likelihood of
> > this happening again. Please send me any instrumented patch. We will keep
> > trying.*
>
> Ok, thanks. I've attached a second patch which adds a number of
> OPENSSL_assert calls at various points to check that frag->fragment is not
> null. I'm hoping it will help us track down why it's not being correctly
> set. If you get another crash with this patch applied, then please capture
> the core and let me know what you find out.
>
> Thanks
>
> Matt
>
>


-- 

Regards
-Praveen
