"Ronald S. Bultje" <rsbul...@gmail.com> writes: > Hi, > > On Mon, Jul 18, 2011 at 8:33 AM, Joseph Artsimovich <jos...@mirriad.com> > wrote: >> On 18/07/2011 16:29, Kostya wrote: >>> On Mon, Jul 18, 2011 at 04:20:14PM +0100, Joseph Artsimovich wrote: >>>> On 18/07/2011 16:01, Kostya wrote: >>>>> On Mon, Jul 18, 2011 at 03:56:42PM +0100, Joseph Artsimovich wrote: >>>>>> On 18/07/2011 15:45, Måns Rullgård wrote: >>>>>>> Maxim<max_p...@gmx.de> writes: >>>>>>>> Am 18.07.2011 16:09, schrieb Måns Rullgård: >>>>>>>>> [...] >>>>>>>>>> An integer implementation is surely much faster than floating-point >>>>>>>>>> one but it's still insufficient for realtime applications >>>>>>>>>> >>>>>>>>> The point? >>>>>>>>> >>>>>>>> You'll be able to obtain reasonable performance benefits only by >>>>>>>> using >>>>>>>> SIMD-optimized code. That's the point. >>>>>>> >>>>>>> I fail to see how that justifies using the even slower floating-point >>>>>>> non-simd code. >>>>>>> >>>>>> I used that as a starting point, thinking it's the only option I had. >>>>>> Today I tried making various integer-based (I)DCT implementations >>>>>> work with 10-bit samples. I had success with >>>>>> ff_jpeg_fdct_islow() by setting BITS_IN_JSAMPLE to 10 and turning my >>>>>> unsigned samples into signed ones. The latter was required because: >>>>>> [0 .. 1023] samples -> [0 -> 8191] DCT coeffs -> [0 -> 65528] >>>>>> after >>>>>> post-scaling by 8, which is outside of DCTELEM (16 bit signed) >>>>>> range. With signed samples, I get: >>>>>> [-512 .. 511] samples -> [-4096 .. 4095] DCT coefs -> [-32768 .. >>>>>> 32760] after post-scaling by 8, which barely fits in. >>>>>> >>>>>> I still had no success with fdct_ifast() or j_rev_dct() though. >>>>> >>>>> change intermediate sample type into 32-bit integers too >>>> >>>> Do you mean DCTELEM? >>> >>> Only for the cases _inside_ DCT transform, input and output should stay >>> DCTELEM. Look at libavcodec/binkidct.c for example (not the best one >>> though). 
>> >> DCTELEM being 16-bit + 8 times post-scaling is already enough to make you go >> to signed samples for 10-bit and makes it impossible to go higher than that. > > The idea is (see h264) to make DCTELEM 32-bits for 10-bits.

Is this necessary?  What is the precision of the transformed values?  I
saw the figure 14 bits mentioned a while ago, and that should still fit
in 16 bits.  Intermediate values in the transform will of course need to
be 32-bit.

-- 
Måns Rullgård
m...@mansr.com
_______________________________________________
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel
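
The range arithmetic discussed in this thread can be checked with a short C sketch. It only takes the coefficient ranges as they were quoted (it does not derive them from a DCT), multiplies them by the post-scaling factor of 8 in 32-bit arithmetic, and tests the result against the 16-bit signed range of DCTELEM. The helper names fits_int16(), post_scale(), signed_bits() and check_thread_ranges() are made up for illustration and are not libavcodec functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if v is representable as a 16-bit signed value,
 * which is what DCTELEM is said to be in the thread. */
static bool fits_int16(int32_t v)
{
    return v >= INT16_MIN && v <= INT16_MAX;
}

/* Hypothetical helper: apply the "post-scaling by 8" mentioned in the thread,
 * using a 32-bit intermediate so the product itself cannot overflow here. */
static int32_t post_scale(int32_t coeff)
{
    return coeff * 8;
}

/* Number of bits a two's complement signed type needs to hold [lo, hi]. */
static int signed_bits(int32_t lo, int32_t hi)
{
    int bits = 1;
    while (bits < 32 &&
           (lo < -((int32_t)1 << (bits - 1)) ||
            hi > ((int32_t)1 << (bits - 1)) - 1))
        bits++;
    return bits;
}

/* Re-checks the figures quoted in the thread; aborts if one does not hold. */
static void check_thread_ranges(void)
{
    /* Unsigned samples: [0 .. 8191] coefficients, scaled by 8. */
    assert(post_scale(8191) == 65528);
    assert(!fits_int16(post_scale(8191)));   /* outside DCTELEM range */

    /* Signed samples: [-4096 .. 4095] coefficients, scaled by 8. */
    assert(post_scale(-4096) == -32768);
    assert(post_scale(4095) == 32760);
    assert(fits_int16(post_scale(-4096)));   /* barely fits */
    assert(fits_int16(post_scale(4095)));
}
```

Under these assumptions, signed_bits(-4096, 4095) gives 13 and signed_bits(0, 8191) gives 14, so the unscaled coefficients do fit in 16 bits either way. Only the post-scaled unsigned range [0 .. 65528] goes past what a 16-bit signed DCTELEM can hold.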