Re: [FFmpeg-soc] [PATCH] AMR-WB Decoder

Vitor Sessak Thu, 02 Sep 2010 14:11:47 -0700

On 09/02/2010 09:30 PM, Marcelo Galvão Póvoa wrote:

Sorry about the delay to reply


On 30 August 2010 12:39, Vitor Sessak<[email protected]>  wrote:

On 08/30/2010 04:44 PM, Marcelo Galvão Póvoa wrote:

On 30 August 2010 09:31, Vitor Sessak<[email protected]>    wrote:

On 08/27/2010 11:13 PM, Marcelo Galvão Póvoa wrote:

On 26 August 2010 18:55, Vitor Sessak<[email protected]>      wrote:

Marcelo Galvão Póvoa wrote:

Fixed excitation array incorrect length.

Should now be applicable cleanly to the latest ffmpeg revision
(fd151a5f8bd152c456a).

First look:

[...]

+/**
+ * Convert an ISF vector into an ISP vector
+ *
+ * @param[in] isf                  Isf vector
+ * @param[out] isp                 Output isp vector
+ * @param[in] size                 Isf/isp size
+ */
+static void isf2isp(const float *isf, double *isp, int size)
+{
+    int i;
+
+    for (i = 0; i<      size - 1; i++)
+        isp[i] = cos(2.0 * M_PI * isf[i]);
+
+    isp[size - 1] = cos(4.0 * M_PI * isf[size - 1]);
+}

Almost duplication of amrnbdec.c:lsf2lsp().

Yes, but this last element doubling diverges between AMR-NB and AMR-WB
reference sources.

Can't you just do "isp[size - 1] *=  2" or it is used elsewhere?

It's not the last element output that is doubled but the input. Since
it is a cosine, I would have to calculate it again for the last
element I guess.

I mean doubling the last element of the input before calling the function,
if it doesn't make the rest of the code more complex.

Well, I only call this function in two places, so it seems to worth
reusing this way


nice

[...]

+        for(j = i - 1; j>      1; j--)
+            f[j] += f[j - 1] * val + f[j - 2];
+        f[1] += 0.25 * val;
+    }
+}

Hmm, can't this function be replaced by ff_lsp2polyf() with some
rescaling?

Actually I realize now that the difference between these functions is
just the entire output scaled down by 0.25. That was silly because I
was scaling by 4.0 after this function. It is weird why the reference
decoder does this process. Maybe it should worth someone checking it
out to be sure about this.

I suppose you checked that after this change the output of your decoder
didn't change, no?

Actually I cannot compare the output bitwise because the high band
involves a random excitation, I am afraid that it is not the same for
each run (or it is?).

Yes, you always send 1 as seed, so the prng output is always the same (and
this is a good thing), but since you use floats output might not be
bitexact. I suggest you do:

./tests/tiny_psnr output_ref.wav output_patched.wav 2

And see if stddev is ~ 0.1

By "output_ref.wav" you mean the earlier output from my decoder or the
fixed-point reference implementation? I did both tests comparing my
recent decoder with a 1kHz sine wave to:
- 3GPP:
stddev:  340.60 PSNR: 45.68 MAXDIFF: 1130 bytes:   959360/   959360
- Mine before the reviews:
stddev:    0.00 PSNR:999.99 MAXDIFF:    0 bytes:   959360/   959360

I don't know how to analyze all these values

stddev = 1/N sqrt(sum[i=0,...,N-1] (file1[i] - file2[i])*(file1[i] -file2[i]))


PSNR = log(stddev)

thus,

stddev == 0 means the files are identical

stddev < 0.04 means the files are really close, just a rare +- 1 difference

stddev < 1 means the files are very close, no audible difference

But I suggest you do not test with a sine-wave, it might amplify someproblems of minor differences adding up to make a big difference.

[...]

+/**

+ * Parses a speech frame, storing data in the Context
+ *
+ * @param[in,out] ctx              The context
+ * @param[in] buf                  Pointer to the input buffer
+ * @param[in] buf_size             Size of the input buffer
+ *
+ * @return The frame mode
+ */
+static enum Mode unpack_bitstream(AMRWBContext *ctx, const uint8_t
*buf,
+                                  int buf_size)
+{
+    GetBitContext gb;
+    uint16_t *data;
+
+    init_get_bits(&gb, buf, buf_size * 8);
+
+    /* decode frame header (1st octet) */
+    skip_bits(&gb, 1);  // padding bit
+    ctx->fr_cur_mode  = get_bits(&gb, 4);
+    ctx->fr_quality   = get_bits1(&gb);
+    skip_bits(&gb, 2);  // padding bits
+
+    // XXX: We are using only the "MIME/storage" format
+    // used by libopencore. This format is simpler and
+    // does not carry the auxiliary information of the frame
+
+    data = (uint16_t *)&ctx->frame;
+    memset(data, 0, sizeof(AMRWBFrame));
+    buf++;
+
+    if (ctx->fr_cur_mode<      MODE_SID) { /* Normal speech frame */
+        const uint16_t *perm =
amr_bit_orderings_by_mode[ctx->fr_cur_mode];
+        int field_size;
+
+        while ((field_size = *perm++)) {
+            int field = 0;
+            int field_offset = *perm++;
+            while (field_size--) {
+               uint16_t bit_idx = *perm++;
+               field<<= 1;
+               /* The bit index inside the byte is reversed (MSB->LSB)
*/
+               field |= BIT_POS(buf[bit_idx>>      3], 7 - (bit_idx&
  7));
+            }
+            data[field_offset] = field;
+        }
+    }
+
+    return ctx->fr_cur_mode;
+}

Can't you reorder the bit indexes in a way it is in the LSB->MSB so you
can
use just the usual bitstream functions?

Which bitstream functions?

Those in libavcodec/get_bits.h (get_bits(), etc).

I am following the unpack_bitstream() from AMR-NB


Sorry, I remembered wrong how unpack_bitstream() was coded in amrnb.

If the tables were in LSB->MSB order, it
would be the same algorithm as in AMR-NB, the only difference is that
I do (7 - index) to reverse the order.

Can't you change the tables in a way you can reuse the unpack function of
AMRNB?

I guess so, after all it's an interesting method of reordering and
assigning the variables

So, until now we have three functions that can be reused and would
need to move their code to an appropriate header file:

Not only the header, but the actual code should be moved to theassociated .c files.

- sipr.c:lsp2lpc_sipr() ->  lsp.h: Do you have a suggestion to a new
name for this? There is already a function called ff_acelp_lsp2lpc()

I suggest some AMR-centric name like ff_amrwb_lsp2lpc(). SIPR probablycopied from AMRWB and not the other way around.

- amrnbdec.c:lsf2lsp() ->  lsp.h


Looks reasonable. I suggest ff_acelp_lsf2lspd() as name.

- internal part of amrnbdec.c:unpack_bitstream() ->  get_bits.h (I guess)

I think the internal part of unpack_bitstream() is still too AMRspecific to be in such widely used file. Unless someone came up with abetter idea, I'd say to create a amr.c for code shared between both.

I will try to create patches for these modifications and send to ffmpeg-devel


Great!

-Vitor
_______________________________________________
FFmpeg-soc mailing list
[email protected]
https://lists.mplayerhq.hu/mailman/listinfo/ffmpeg-soc

Re: [FFmpeg-soc] [PATCH] AMR-WB Decoder

Reply via email to