Hi Brian,
Apologies for taking so long to reply; I have had my head down working
on the fdmdv2 GUI program with Dave Witten, which we hope to release
before the end of this year.
Brian, I listened to the samples and I do think there are some
improvements on some samples. I'd agree with you that the samples sound
more natural on average. Nice work!
Some comments:
1/ Many of the samples sound less clicky or buzzy, more natural.
2/ However, the level of many samples has shifted down; they
sound quieter.
3/ Also some of the samples sound more band-limited, which might
affect intelligibility, not sure. morig is a good example of
this; it sounds like some HF filtering. The power law might be
doing this, enhancing higher amplitude LF sounds more while
attenuating lower energy HF sounds.
4/ The background noise on mmt1 was noticeably reduced. Maybe
this is a positive effect of (3), i.e. the annoying HF noise, which
in this case dominates the sample, has been attenuated, so we don't
miss the HF speech energy.
It would be good if we can keep (1) & (4), but improve on (2) and (3).
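To illustrate how a power law could produce both (2) and (3), here is a
minimal Python sketch. It is not Brian's actual code: the exponent (1.5),
the made-up harmonic amplitudes, and all function names are assumptions
chosen only to show the effect. It also shows one possible way to address
(2), by rescaling the modified amplitudes so the frame energy matches the
original.

```python
import math

def apply_power_law(amps, exponent):
    """Raise each harmonic amplitude to `exponent` (hypothetical power law)."""
    return [a ** exponent for a in amps]

def energy(amps):
    """Total energy of a set of harmonic amplitudes."""
    return sum(a * a for a in amps)

def match_energy(modified, reference):
    """Rescale `modified` so its total energy equals that of `reference`."""
    gain = math.sqrt(energy(reference) / energy(modified))
    return [gain * a for a in modified]

def hf_to_lf_db(amps, split):
    """Energy in harmonics from index `split` upwards relative to the
    energy in the harmonics below it, in dB."""
    return 10.0 * math.log10(energy(amps[split:]) / energy(amps[:split]))

# Made-up amplitudes for one frame: strong LF harmonics, weaker HF ones.
amps = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1, 0.05, 0.05]

shaped = apply_power_law(amps, 1.5)      # assumed exponent
levelled = match_energy(shaped, amps)    # undo the overall level shift

print("level change (dB):", 10.0 * math.log10(energy(shaped) / energy(amps)))
print("HF/LF before (dB):", hf_to_lf_db(amps, 4))
print("HF/LF after  (dB):", hf_to_lf_db(levelled, 4))
```

With amplitudes below 1.0 and an exponent above 1.0, every harmonic
shrinks (the level drop) and the weaker HF harmonics shrink
proportionally more (the band-limited sound). The energy matching fixes
the level but leaves the HF/LF balance as the power law set it, so (3)
would need a separate fix.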
I have been trying to remove the clicky/buzzy artefact of Codec 2 for
some time, usually by playing with phase models. Your work shows some
interesting possibilities. The recent LPC post filter also manipulated
the spectral amplitudes, and helped reduce the buzzy effect. So this
looks like a good area for further work. I might have been looking at
the wrong area (phase models), when trying to improve the naturalness.
It could also indicate the speech can be made more natural by reducing
the HF energy.
The reason I averaged the LPC spectrum over an entire harmonic was to
get around some issues with low pitched speakers and the LPC model.
Perhaps it's time to re-examine these assumptions.
BTW to test these sorts of improvements I usually start with c2sim and
test without quantisation, e.g.
c2sim ../raw/hts1a.raw --phase0 --lpc 10 -o hts1a_test.raw
(maybe with the post filters on as well). That lets us separate
quantisation issues and helps reduce the number of interacting
variables. It would also be good to separate out the effect of (i) how
you are sampling the harmonics and (ii) the power law.
This could be a switch we add to the decoder, as it won't affect the bit
stream.
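When comparing the outputs of test runs like these, it may help to put
rough numbers on (2) and (3) rather than relying on listening alone. The
Python sketch below is only a suggestion, not part of Codec 2. It assumes
the .raw files are headerless 16-bit signed little-endian mono samples,
and the "HF tilt" figure (energy of the first difference relative to the
signal energy) is a crude stand-in for a proper spectral measurement. The
function names are hypothetical.

```python
import array
import math
import sys

def read_raw(path):
    """Load a headerless file of 16-bit signed samples (assumed format)."""
    with open(path, "rb") as f:
        data = f.read()
    data = data[:len(data) - (len(data) % 2)]  # drop any odd trailing byte
    samples = array.array("h")
    samples.frombytes(data)
    if sys.byteorder == "big":
        samples.byteswap()  # files are assumed to be little-endian
    return samples

def rms_db(samples):
    """RMS level in dB relative to 16-bit full scale."""
    if len(samples) == 0:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / (32768.0 ** 2))

def hf_tilt_db(samples):
    """Crude HF indicator: energy of the first difference over the signal
    energy, in dB. Higher values mean relatively more HF content."""
    diff_energy = sum((b - a) ** 2 for a, b in zip(samples, samples[1:]))
    total_energy = sum(s * s for s in samples)
    if diff_energy == 0 or total_energy == 0:
        return float("-inf")
    return 10.0 * math.log10(diff_energy / total_energy)

if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_raw.py before.raw after.raw
    before, after = read_raw(sys.argv[1]), read_raw(sys.argv[2])
    print("level change (dB):  ", rms_db(after) - rms_db(before))
    print("HF tilt change (dB):", hf_tilt_db(after) - hf_tilt_db(before))
```

A negative level change would correspond to (2), and a negative HF tilt
change would be consistent with (3). Silence and background noise are
included in both figures, so they are only a rough guide.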
Others on the list - pls listen to these samples as well and give us
some feedback.
Cheers,
David
On Mon, 2012-11-05 at 22:54 +0000, Brian Smith wrote:
> Hi David.
> There are now samples at:
> http://www.shapeseeker.com/samples
> either as individual files or bundled up as a samples.tar.bz2 file. They
> are the standard codec2 raw/ directory samples encoded/decoded at
> 1200bps using the original (e.g. hts.1200.raw) and modified (e.g.
> hts.1200b.raw) versions for comparisons. Encode/decode is done with:
>
> c2enc 1200 xxx.raw xxx.c2
> c2dec 1200 xxx.c2 xxx.1200.raw
> c2dec.new 1200 xxx.c2 xxx.1200b.raw
>
> I can convert to wav as well if you want.
>
> David Rowe wrote:
> > Hello Brian,
> >
> > That sounds interesting. Could you pls post (i) some before and after
> > samples that illustrate the change and (ii) the command line you used to
> > test?
> >
> > Cheers,
> >
> > David
> >
>
_______________________________________________
Freetel-codec2 mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/freetel-codec2