"Ronald S. Bultje" <rsbul...@gmail.com> writes:

> Hi,
>
> On Mon, Jul 18, 2011 at 8:33 AM, Joseph Artsimovich <jos...@mirriad.com> wrote:
>> On 18/07/2011 16:29, Kostya wrote:
>>> On Mon, Jul 18, 2011 at 04:20:14PM +0100, Joseph Artsimovich wrote:
>>>> On 18/07/2011 16:01, Kostya wrote:
>>>>> On Mon, Jul 18, 2011 at 03:56:42PM +0100, Joseph Artsimovich wrote:
>>>>>> On 18/07/2011 15:45, Måns Rullgård wrote:
>>>>>>> Maxim<max_p...@gmx.de>    writes:
>>>>>>>> Am 18.07.2011 16:09, schrieb Måns Rullgård:
>>>>>>>>> [...]
>>>>>>>>>> An integer implementation is surely much faster than a
>>>>>>>>>> floating-point one, but it's still insufficient for realtime
>>>>>>>>>> applications.
>>>>>>>>>>
>>>>>>>>> The point?
>>>>>>>>>
>>>>>>>> You'll be able to obtain reasonable performance benefits only by
>>>>>>>> using SIMD-optimized code. That's the point.
>>>>>>>
>>>>>>> I fail to see how that justifies using the even slower floating-point
>>>>>>> non-simd code.
>>>>>>>
>>>>>> I used that as a starting point, thinking it's the only option I had.
>>>>>> Today I tried making various integer-based (I)DCT implementations
>>>>>> work with 10-bit samples.  I had success with
>>>>>> ff_jpeg_fdct_islow() by setting BITS_IN_JSAMPLE to 10 and turning my
>>>>>> unsigned samples into signed ones.  The latter was required because:
>>>>>> [0 .. 1023] samples -> [0 .. 8191] DCT coeffs -> [0 .. 65528]
>>>>>> after post-scaling by 8, which is outside of DCTELEM (16-bit
>>>>>> signed) range.  With signed samples, I get:
>>>>>> [-512 .. 511] samples -> [-4096 .. 4095] DCT coeffs ->
>>>>>> [-32768 .. 32760] after post-scaling by 8, which barely fits in.
>>>>>>
>>>>>> I still had no success with fdct_ifast() or j_rev_dct() though.
>>>>>
>>>>> Change the intermediate sample type to 32-bit integers too.
>>>>
>>>> Do you mean DCTELEM?
>>>
>>> Only _inside_ the DCT transform; input and output should stay
>>> DCTELEM. Look at libavcodec/binkidct.c for an example (not the best
>>> one though).
>>
>> DCTELEM being 16-bit + 8 times post-scaling is already enough to make you go
>> to signed samples for 10-bit and makes it impossible to go higher than that.
>
> The idea is (see h264) to make DCTELEM 32 bits for 10-bit.

Is this necessary?  What is the precision of the transformed values?
I saw the figure 14 bits mentioned a while ago, and that should still
fit in 16 bits.  Intermediate values in the transform will of course
need to be 32-bit.
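
As a rough sanity check of the ranges discussed above, here is a minimal
C sketch.  It is an illustration only, not libavcodec code: range_t,
sample_range(), grow_bits(), dct_output_range(), fits_int16() and
fix_mul_q14() are hypothetical names, the 3 bits of growth through the
8x8 DCT are taken from the figures quoted above rather than measured,
and the Q14 constant is just an example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not part of libavcodec: an inclusive range. */
typedef struct { int32_t lo, hi; } range_t;

/* Range of samples with the given bit depth.  make_signed applies the
 * level shift (subtract half the range) described in the quoted mail. */
static range_t sample_range(int bits, int make_signed)
{
    assert(bits >= 1 && bits <= 16);
    int32_t n = (int32_t)1 << bits;
    range_t r;
    r.lo = make_signed ? -(n / 2) : 0;
    r.hi = make_signed ? n / 2 - 1 : n - 1;
    return r;
}

/* Widen a range by extra_bits bits of headroom. */
static range_t grow_bits(range_t r, int extra_bits)
{
    int32_t f = (int32_t)1 << extra_bits;
    range_t out = { r.lo * f, (r.hi + 1) * f - 1 };
    return out;
}

/* ASSUMPTION: the 8x8 DCT adds 3 bits, as in the figures quoted above;
 * the result is then multiplied by the post-scaling factor. */
static range_t dct_output_range(int bits, int make_signed, int32_t postscale)
{
    range_t c = grow_bits(sample_range(bits, make_signed), 3);
    range_t out = { c.lo * postscale, c.hi * postscale };
    return out;
}

/* Does the range fit a 16-bit signed element such as DCTELEM? */
static int fits_int16(range_t r)
{
    return r.lo >= INT16_MIN && r.hi <= INT16_MAX;
}

/* 16-bit input and output, 32-bit intermediate: multiply by a Q14
 * fixed-point constant, round, and shift back down.  The product does
 * not fit in 16 bits, hence the int32_t.  Right-shifting a negative
 * value is implementation-defined in C; real code must account for it. */
static int16_t fix_mul_q14(int16_t x, int32_t c_q14)
{
    int32_t t = (int32_t)x * c_q14 + (1 << 13);
    return (int16_t)(t >> 14);
}
```

Under these assumptions dct_output_range(10, 0, 8) gives [0 .. 65528],
which does not fit, while dct_output_range(10, 1, 8) gives
[-32768 .. 32760], which does.  A 14-bit signed range, sample_range(14, 1),
fits in 16 bits with room to spare.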

-- 
Måns Rullgård
m...@mansr.com
_______________________________________________
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel
