On Thu, Apr 23, 2015 at 12:20:37AM -0400, Tucker DiNapoli wrote:
> A few notes on changes from the last version of this patch.
> The main issue with the previous code was with the sse2/avx2
> implementation of the blockCopy function, so for the time being the MMX2
> version is used instead. I tried to place the MMX2 version into a function,
> but this did not work; my best guess as to why is alignment, but I really
> don't know. The way it's done now is a bit ugly, but it works and I don't
> have time to figure out the issue right now.
>
> This commit adds several new files containing yasm assembly code, they are:
>     PPContext.asm; Defines the PPContext struct using the yasm struc command
>     PPUtil.asm; Various utility macros used in the other asm code
>     block_copy.asm; Implements the block copy function, the sse2 and avx2
>     versions copy multiple blocks at once.
>     deinterlace.asm; Contains implementations of the postprocessing filters
>     with support for sse2 and avx2.
>
> Adding these new functions to postprocess_template entailed adding a new
> template for AVX2 and modifying the current SSE2 template to use the
> sse2 functions. A new deinterlace function was added to move the logic
> of which deinterlace function to use out of the postprocess function and
> make adding the new functions easier. The inline code for packing QP
> into pQPb was moved into a separate asm file and updated for sse2/avx2.
>
> Currently the sse2/avx2 deinterlace filters don't give results which are
> bitexact to the C results, so I've modified one of the postprocessing
> tests so that only the C functions are tested by fate. Ultimately either
> the sse2/avx2 code will need to be fixed or different tests will need to
> be added. I'm not sure if this is a problem with my code, a problem inherent
> in using sse2/avx2 or a problem that's a result of deinterlacing being done
> blockwise.
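Whether a mismatch against the C output is a rounding difference or whole
blocks being skipped can be checked mechanically. The following is only a
minimal sketch and not part of libpostproc: the helper names, the plane
layout (one 8-bit plane with a byte stride) and the 8-pixel block width in
the usage note are assumptions made for illustration. It compares a
reference plane with a SIMD-produced plane and marks which column blocks
differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not a libpostproc function): returns 1 if the
 * column block [x0, x0 + block_w) differs anywhere between the two
 * planes, 0 if it is identical. The block is clipped to the width. */
static int column_block_differs(const uint8_t *ref, const uint8_t *out,
                                int width, int height, ptrdiff_t stride,
                                int x0, int block_w)
{
    int x1 = x0 + block_w < width ? x0 + block_w : width;

    for (int y = 0; y < height; y++)
        for (int x = x0; x < x1; x++)
            if (ref[y * stride + x] != out[y * stride + x])
                return 1;
    return 0;
}

/* Hypothetical helper: prints one character per column block
 * ('.' = identical, 'X' = differs) and returns the number of
 * differing blocks. */
static int report_column_blocks(const uint8_t *ref, const uint8_t *out,
                                int width, int height, ptrdiff_t stride,
                                int block_w)
{
    int bad = 0;

    assert(block_w > 0);
    for (int x0 = 0; x0 < width; x0 += block_w) {
        int d = column_block_differs(ref, out, width, height, stride,
                                     x0, block_w);
        putchar(d ? 'X' : '.');
        bad += d;
    }
    putchar('\n');
    return bad;
}
```

Called with block_w = 8 on the luma plane of the C output and the sse2/avx2
output for the same frame, a regular pattern of differing blocks would point
at blocks being skipped rather than at rounding.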
On an AVX machine (using SSE2, I assume)
-vf pp=md
-vf pp=lb
-vf pp=ci
-vf pp=fd
-vf pp=l5
tested with
./ffplay matrixbench_mpeg2.mpg -vf tinterlace=4,pp=XX
show blocking artifacts, as if some deinterlace function is applied to only
every second column of some size.
That is not just "not bitexact", that is completely not working.

On an AVX2 machine the tests above produce output that looks as if
one out of 4 columns is filtered, with the other 3 still showing interlacing
artifacts.

Also, the code produces these warnings:
libpostproc/x86/PPContext.asm:37: warning: section flags ignored on section redeclaration
libpostproc/x86/PPContext.asm:77: warning: section flags ignored on section redeclaration
libpostproc/x86/PPUtil.asm:215: warning: section flags ignored on section redeclaration

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Old school: Use the lowest level language in which you can solve the problem
            conveniently.
New school: Use the highest level language in which the latest supercomputer
            can solve the problem without the user falling asleep waiting.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel