Hi,

Since I'll be away next week, I'm writing down where the recent study of
the core dumps of liquid stands.

I think we fixed the FPE (cf. /tmp/core.24944 on sci1). Here are the
relevant parts of the IRC logs:

<julio> Core was generated by `/usr/local/bin/liquidsoap -v
/etc/liquidsoap/geek.liq'.
<julio> Program terminated with signal 8, Arithmetic exception.
<julio> #0  0x0807c217 in __udivdi3 ()
<julio> #1  0x08067e42 in ocaml_mixer_convert_format ()
<julio> where is this file?
<smimou>        tools/
<smimou>        mixer_c.c
<smimou>        on my side I see two possibilities
<smimou>          unsigned long long out_buf_len = (in_buf_len * BUF_CHANS *
BUF_FREQ * BUF_SAMPLESIZE)
<smimou>            / (channels * sample_freq * sample_size) ;
<smimou>        or
<smimou>                float f = 44100 / ((float)sample_freq);
<smimou>        one, actually

I added a check to ensure that channels, sample_freq and sample_size are
all > 0 (raising Invalid_format otherwise). I guess the crash just came
from bogus files, we'll see.
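
For reference, here is a minimal sketch of that check, built on the
length formula quoted in the log. The values given to BUF_CHANS,
BUF_FREQ and BUF_SAMPLESIZE and the name checked_out_buf_len are made up
for the example, and it returns -1 where the real stub raises
Invalid_format:

```c
/* Hypothetical values for the internal buffer format; the real ones
   are defined in tools/mixer_c.c. */
#define BUF_CHANS      2
#define BUF_FREQ       44100
#define BUF_SAMPLESIZE 16

/* Compute the converted buffer length. Returns 0 on success, or -1
   when the input format would make the divisor zero or negative
   (a zero divisor is the presumed cause of the SIGFPE in __udivdi3). */
int checked_out_buf_len(unsigned long long in_buf_len,
                        int channels, int sample_freq, int sample_size,
                        unsigned long long *out_buf_len)
{
  if (channels <= 0 || sample_freq <= 0 || sample_size <= 0)
    return -1;

  /* Widen before multiplying so the product cannot overflow an int. */
  unsigned long long divisor =
    (unsigned long long)channels * sample_freq * sample_size;

  *out_buf_len =
    (in_buf_len * BUF_CHANS * BUF_FREQ * BUF_SAMPLESIZE) / divisor;
  return 0;
}
```

The float variant from the log cannot raise SIGFPE on its own, which is
probably why only the integer division is a candidate.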

Then, we had the following error (/tmp/core.22021):

*** glibc detected *** malloc(): memory corruption: 0xb4d17548 ***

The backtrace (on thread 2, the only relevant thread):

(gdb) bt
#0  0x0806270d in ocaml_vorbis_get_dec_file_info (d_f=-1270258428)
    at vorbis_stubs.c:803
#1  0x0807a095 in caml_fl_add_block ()
#2  0x08075eb0 in caml_weak_get_copy ()
#3  0x08075efa in caml_weak_get_copy ()
#4  0x080670b8 in ocaml_mixer_convert_format (fmt=135509008,
buff=-1253239664)
    at tools/mixer_c.c:64
#5  0xb7b94e60 in ?? ()
#6  0x0813b410 in ?? ()
#7  0xb54d1490 in ?? ()
#8  0xb54d1490 in ?? ()
#9  0xb54d1490 in ?? ()
#10 0xb54d1490 in ?? ()
#11 0x00000000 in ?? ()
(gdb) frame
#0  0x0806270d in ocaml_vorbis_get_dec_file_info (d_f=-1270258428)
    at vorbis_stubs.c:803
803       Store_field(ans, 7, Val_int(0));

The experienced reader will have immediately noticed that 0xb4d17548 =
-1261341368 is close to d_f (-1270258428 = 0xb4496504), though not equal
to it. However d_f is a *value*. And there I'm kinda stuck,
the only conclusion that comes to my mind is: WTF !!?? I cannot
understand how this line of the source is accessing d_f. Moreover, I
cannot see how it could jump from #4 to #1, #4 being:

(gdb) frame
#4  0x080670b8 in ocaml_mixer_convert_format (fmt=135509008,
buff=-1253239664)
    at tools/mixer_c.c:64
64        memcpy(buf, String_val(buff), in_buf_len);
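
To compare the address in the glibc message with the arguments gdb
prints, it helps to convert between hex and the signed decimal form. A
minimal sketch, assuming 32-bit addresses as in this core (the function
names are made up):

```c
#include <stdint.h>

/* Signed decimal as printed by gdb -> 32-bit address. */
uint32_t to_address(int32_t printed)
{
  return (uint32_t)printed;   /* the conversion is modulo 2^32 */
}

/* 32-bit address -> signed decimal as printed by gdb. */
int32_t to_printed(uint32_t address)
{
  int64_t wide = (int64_t)address;
  if (wide >= INT64_C(0x80000000))
    wide -= INT64_C(0x100000000);   /* fold the upper half to negatives */
  return (int32_t)wide;
}
```

With these, d_f = -1270258428 comes out as 0xb4496504, and 0xb4d17548
comes out as -1261341368.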

Anyway, let's try to debug this. Just before, df is declared by

  myvorbis_dec_file_t *df = Decfile_val(d_f);

and its value is stored in the block, at address

(gdb) printf "%p\n", &((value*)(d_f))[1]
0xb4496508
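
The [1] in that expression comes from the layout of OCaml custom blocks:
field 0 holds the pointer to the custom operations, and the user data
starts at field 1. Here is a small sketch of that layout. The runtime
macros are simplified stand-ins so that it builds without the OCaml
headers, the definition of Decfile_val is my assumption about what
vorbis_stubs.c does, and the structure is reduced to a single member:

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins for the OCaml runtime definitions. */
typedef intptr_t value;
#define Field(v, i)        (((value *)(v))[i])
#define Data_custom_val(v) ((void *)&Field((v), 1))

/* Hypothetical, reduced version of the decoder structure. */
typedef struct { int sample_size; } myvorbis_dec_file_t;

/* Assumed definition of the accessor used in vorbis_stubs.c. */
#define Decfile_val(v) (*(myvorbis_dec_file_t **)Data_custom_val(v))

/* Address at which the decoder pointer is stored for a given block. */
void *decfile_slot(value v)
{
  return Data_custom_val(v);
}

/* Build a fake two-word custom block holding df; returns 0 on failure. */
value alloc_decfile_block(myvorbis_dec_file_t *df)
{
  value *block = malloc(2 * sizeof(value));
  if (block == NULL)
    return 0;
  block[0] = 0;                      /* custom operations would go here */
  Decfile_val((value)block) = df;    /* user data lives in field 1 */
  return (value)block;
}
```

On the 32-bit machine the slot is therefore at d_f + 4, which matches
0xb4496508 above.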

We read the pointer stored there (gdb keeps it in $36):

(gdb) p *((myvorbis_dec_file_t**)0xb4496508)
$36 = (struct myvorbis__dec_file_t *) 0x816afa8

and try to look at the values of its members

(gdb) p $36 ->sample_size
$37 = 2
(gdb) p $36->sign
$39 = -1
(gdb) p $36->bitstream
$43 = 128000

so far so good... (even though I don't know exactly what bitstream
should look like)

(gdb) p $36->big_endian
$38 = 44100
(gdb) p $36->ovf
$40 = (OggVorbis_File *) 0x0
(gdb) p $36->read_func
$44 = -1
(gdb) p $36->seek_func
$45 = 2
(gdb) p $36->close_func
$46 = 135705176
(gdb) p $36->tell_func
$47 = 1

Ouch............. Let's recap:
- big_endian is supposed to be a boolean
- ovf is supposed to be a pointer to the oggvorbis file
- *_func are supposed to be callback functions (they are supposed to be
either all NULL when there's no callback or all valid values when there
are callbacks)
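
These three expectations can be written down as a small sanity check,
for instance to assert on while debugging. This is only a sketch: the
member names come from the gdb session, but their types and order are
guesses, and dec_file_sketch_t and dec_file_looks_sane are made-up
names.

```c
#include <stddef.h>
#include <stdint.h>

/* Guessed layout, not the real myvorbis_dec_file_t. */
typedef struct {
  int      sample_size;
  int      sign;
  int      bitstream;
  int      big_endian;   /* boolean: 0 or 1 */
  void    *ovf;          /* OggVorbis_File * in the real code */
  intptr_t read_func, seek_func, close_func, tell_func;
} dec_file_sketch_t;

/* Return 1 when the structure satisfies the three expectations,
   0 otherwise. Callbacks are only checked for "all set or none set";
   whether a set callback is actually valid cannot be told from here. */
int dec_file_looks_sane(const dec_file_sketch_t *df)
{
  if (df == NULL)
    return 0;
  if (df->big_endian != 0 && df->big_endian != 1)
    return 0;
  if (df->ovf == NULL)
    return 0;

  int set = (df->read_func != 0) + (df->seek_func != 0)
          + (df->close_func != 0) + (df->tell_func != 0);
  return set == 0 || set == 4;
}
```

Fed with the values dumped above, it fails on both big_endian (44100)
and ovf (NULL).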

I don't have much time to dig deeper into this mess. My guess here is
that close_dec_file was called *before* the decoder was accessed with
get_dec_file_info (phew, I would not be the one to be held responsible
for this one :)).
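
If that guess is right, one possible way to turn the silent corruption
into a clean error is to clear the stored pointer on close and test it
on every access. Here is a rough sketch of the idea. All names are
hypothetical and the OCaml custom block is replaced by a plain struct,
so this is not the actual vorbis_stubs.c code:

```c
#include <stdlib.h>

/* Hypothetical, reduced decoder structure. */
typedef struct { int sample_size; } closable_dec_t;

/* Stand-in for the custom block: one slot holding the decoder pointer. */
typedef struct { closable_dec_t *df; } dec_handle_t;

/* Free the decoder and clear the slot so later accesses can detect it.
   Calling it twice is harmless, since free(NULL) does nothing. */
void close_dec_file_sketch(dec_handle_t *h)
{
  free(h->df);
  h->df = NULL;
}

/* Return 0 and fill *sample_size, or -1 if the decoder was closed.
   The real stub would raise an OCaml exception instead. */
int get_dec_file_info_sketch(const dec_handle_t *h, int *sample_size)
{
  if (h->df == NULL)
    return -1;
  *sample_size = h->df->sample_size;
  return 0;
}
```

This does not explain who closes the decoder too early, it only makes
the failure visible at the point of use.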

Have a good weekend, everybody!

Cheers,

Sam.

