2015-10-13 17:44 GMT+02:00 Christophe Gisquet :
> But I'll check.
Indeed not bit-exact to faani and C simple idct:
stddev:0.00 PSNR:163.48 MAXDIFF:1
This would at least result in fate no longer passing as this stands. I
don't think it's worth the speed difference.
No change on dct-test r
2015-10-13 15:43 GMT+02:00 Michael Niedermayer :
> On Tue, Oct 13, 2015 at 01:33:07PM +0200, Christophe Gisquet wrote:
>> Hi,
>>
>> 2015-10-13 13:10 GMT+02:00 Michael Niedermayer :
>> > hmm, I am a bit concerned that adding the rounder (which effectively is
>> > 0.5) causes an overflow, that would if
On Tue, Oct 13, 2015 at 01:33:07PM +0200, Christophe Gisquet wrote:
> Hi,
>
> 2015-10-13 13:10 GMT+02:00 Michael Niedermayer :
> > hmm, I am a bit concerned that adding the rounder (which effectively is
> > 0.5) causes an overflow, that would if I am not mistaken imply that
> > things are very close
Hi,
2015-10-13 13:10 GMT+02:00 Michael Niedermayer :
> hmm, I am a bit concerned that adding the rounder (which effectively is
> 0.5) causes an overflow, that would, if I am not mistaken, imply that
> things are very close to overflowing already without it
It's true, but the immediate cause here is th
On Tue, Oct 13, 2015 at 09:01:44AM +0200, Christophe Gisquet wrote:
> Hi,
>
> 2015-10-13 2:26 GMT+02:00 Michael Niedermayer :
> > On Mon, Oct 12, 2015 at 07:37:46PM +0200, Christophe Gisquet wrote:
> >> When the input of a pass has 15 or 16 bits of precision (in particular
> >> the column pass), t
Hi,
2015-10-13 2:26 GMT+02:00 Michael Niedermayer :
> On Mon, Oct 12, 2015 at 07:37:46PM +0200, Christophe Gisquet wrote:
>> When the input of a pass has 15 or 16 bits of precision (in particular
>> the column pass), the addition of a bias to W4 may lead to overflows
>> in the input to pmaddwd.
>>
On Mon, Oct 12, 2015 at 07:37:46PM +0200, Christophe Gisquet wrote:
> When the input of a pass has 15 or 16 bits of precision (in particular
> the column pass), the addition of a bias to W4 may lead to overflows
> in the input to pmaddwd.
>
> This requires postponing the adding of the bias to afte
When the input of a pass has 15 or 16 bits of precision (in particular
the column pass), the addition of a bias to W4 may lead to overflows
in the input to pmaddwd.
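
To illustrate the kind of overflow described above, here is a minimal scalar
sketch in C. It is not the actual asm: it assumes the bias is folded into a
signed 16-bit operand before the multiply, and the helper names
(pmaddwd_lane, wrap16, bias_before, bias_after) and the coefficient value
used in the note below are hypothetical. It only models one 32-bit lane of
pmaddwd, which multiplies pairs of signed 16-bit words and sums the two
products.

```c
#include <stdint.h>

/* Scalar model of one 32-bit output lane of pmaddwd:
 * two signed 16-bit products, summed into 32 bits. */
static int32_t pmaddwd_lane(int16_t a0, int16_t a1, int16_t b0, int16_t b1)
{
    return (int32_t)a0 * b0 + (int32_t)a1 * b1;
}

/* Wrap a value to signed 16 bits, as a packed word add would
 * (hypothetical helper, written without implementation-defined casts). */
static int16_t wrap16(int32_t v)
{
    int32_t w = (int32_t)((uint32_t)v & 0xFFFFu);
    return (int16_t)(w >= 0x8000 ? w - 0x10000 : w);
}

/* Bias added to the 16-bit input before the multiply: the sum can wrap. */
static int32_t bias_before(int16_t x, int16_t w, int16_t bias)
{
    return pmaddwd_lane(wrap16((int32_t)x + bias), 0, w, 0);
}

/* Bias applied to the 32-bit result after the multiply: no 16-bit wrap.
 * Adding bias * w afterwards is the mathematically equivalent term. */
static int32_t bias_after(int16_t x, int16_t w, int16_t bias)
{
    return pmaddwd_lane(x, 0, w, 0) + (int32_t)bias * w;
}
```

With an input already at the top of the signed 16-bit range (x = 32767), a
bias of 1 and an assumed coefficient of 16384, bias_before() multiplies the
wrapped value -32768 and returns -536870912, while bias_after() returns
536870912. For a small input such as x = 100 the two agree.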
This requires postponing the adding of the bias to after the first
butterfly. To do so, the fact that m15, unused although zeroed, is