Pushed.

Thanks
Haihao
> background:
> not every pixel needs to be updated in post processing. This happens:
> - on left boundary blocks, since the dst horizontal offset must be dword
>   aligned
> - on right boundary blocks, when offset_x+width is not 16 aligned
> - on bottom boundary blocks, when offset_y+height is not 8 aligned
> - (we needn't take special care of top boundary blocks,
>   since the dst vertical offset only requires byte alignment)
> https://bugs.freedesktop.org/show_bug.cgi?id=51553
>
> solution:
> we introduce left/right/bottom masks to mask out the pixels that are of
> no interest.
> - dst horizontal offset is shifted left to meet dword alignment
> - src horizontal offset requires recalculation
> - src and dst width require recalculation
> - src_normalized_x and normalized_video_x_scaling_step require
>   recalculation
> - block horizontal mask is initialized
>   + to left mask (1*N or left edge of M*1)
>   + to right mask (right edge of M*1)
>   + to 0xffff, in else case
> - block horizontal mask is updated to middle mask after the very first
>   block, and updated to right mask for the very last block.
> - 'middle mask':
>   + middle mask can be equal to left mask for the left strip (in M*1
>     block partition),
>   + middle mask can be equal to right mask for the right strip (in M*1
>     block partition)
>   + middle mask is equal to 0xffff in 1*N block partition mode
>   + right edge condition is checked after middle block condition in case
>     there are just 2 blocks in the dst region
> - right mask uses a similar trick to middle mask
> - block vertical mask is initialized to 0xff except when it is a bottom
>   strip (in 1*N block partition mode),
> - block vertical mask is updated to 'bottom' mask for the very last
>   block.
>   + it is a real bottom mask in 1*N block partition, or bottom strip in
>     M*1 block partition.
>   + it is 0xff in else condition (M*1 block partition, not bottom strip)
>
> Besides the nv12 avs update, I also updated load/save and nv12 scaling,
> since I used them while debugging.
>
>
> Zhao Halley (8):
>   enable horizontal and vertical mask for bottom/right boundary blocks
>   reload block mask for the last(bottom/right) block in a strip
>   PL8x4_Save_IMC3.asm fix of masked block
>   work around hw limitation(dword alignment) of horizontal offset
>   reload horizontal mask after the first block in asm code
>   use load/save procedure instead of scaling only when
>   work around hw limitation(dword alignment) of horizontal offset
>   work around hw limitation(dword alignment) of horizontal offset
>
>  src/i965_post_processing.c | 136 ++++++++++++++++----
>  src/i965_post_processing.h | 22 +++-
>  .../gen5_6/Common/Multiple_Loop.asm | 11 +-
>  .../gen5_6/Common/PL8x4_Save_IMC3.asm | 2 +-
>  .../post_processing/gen5_6/Common/common.inc | 7 +
>  .../post_processing/gen5_6/nv12_avs_nv12.g4b.gen5 | 10 +-
>  .../post_processing/gen5_6/nv12_avs_nv12.g6b | 10 +-
>  .../post_processing/gen5_6/nv12_dn_nv12.g4b.gen5 | 10 +-
>  .../post_processing/gen5_6/nv12_dn_nv12.g6b | 10 +-
>  .../post_processing/gen5_6/nv12_dndi_nv12.g4b.gen5 | 10 +-
>  .../post_processing/gen5_6/nv12_dndi_nv12.g6b | 10 +-
>  .../gen5_6/nv12_load_save_nv12.g4b.gen5 | 10 +-
>  .../post_processing/gen5_6/nv12_load_save_nv12.g6b | 10 +-
>  .../gen5_6/nv12_load_save_pa.g4b.gen5 | 10 +-
>  .../post_processing/gen5_6/nv12_load_save_pa.g6b | 10 +-
>  .../gen5_6/nv12_load_save_pl3.g4b.gen5 | 18 ++-
>  .../post_processing/gen5_6/nv12_load_save_pl3.g6b | 18 ++-
>  .../gen5_6/nv12_load_save_rgbx.g4b.gen5 | 10 +-
>  .../post_processing/gen5_6/nv12_load_save_rgbx.g6b | 10 +-
>  .../gen5_6/nv12_scaling_nv12.g4b.gen5 | 10 +-
>  .../post_processing/gen5_6/nv12_scaling_nv12.g6b | 10 +-
>  .../gen5_6/pa_load_save_nv12.g4b.gen5 | 10 +-
>  .../post_processing/gen5_6/pa_load_save_nv12.g6b | 10 +-
>  .../gen5_6/pa_load_save_pl3.g4b.gen5 | 18 ++-
>  .../post_processing/gen5_6/pa_load_save_pl3.g6b | 18 ++-
>  .../gen5_6/pl3_load_save_nv12.g4b.gen5 | 10 +-
>  .../post_processing/gen5_6/pl3_load_save_nv12.g6b | 10 +-
>  .../gen5_6/pl3_load_save_pa.g4b.gen5 | 10 +-
>  .../post_processing/gen5_6/pl3_load_save_pa.g6b | 10 +-
>  .../gen5_6/pl3_load_save_pl3.g4b.gen5 | 18 ++-
>  .../post_processing/gen5_6/pl3_load_save_pl3.g6b | 18 ++-
>  .../gen5_6/rgbx_load_save_nv12.g4b.gen5 | 10 +-
>  .../post_processing/gen5_6/rgbx_load_save_nv12.g6b | 10 +-
>  33 files changed, 370 insertions(+), 136 deletions(-)
>  mode change 100644 => 100755 src/shaders/post_processing/gen5_6/Common/Multiple_Loop.asm
>  mode change 100644 => 100755 src/shaders/post_processing/gen5_6/Common/PL8x4_Save_IMC3.asm
>  mode change 100644 => 100755 src/shaders/post_processing/gen5_6/Common/common.inc
> _______________________________________________
Libva mailing list
Libva@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libva
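
For readers following the mask description in the quoted cover letter, here is a minimal sketch in C of how left/right/bottom masks of this kind could be derived. It is not taken from the patches. The names `pp_masks`, `compute_masks` and `row_block_mask` are hypothetical, and the sketch assumes 16x8 pixel blocks, one byte per pixel (so dword alignment means a multiple of 4 pixels), bit i of a mask standing for column or row i of a block, and blocks tiled from the aligned dst origin. The driver may count remainders differently.

```c
#include <stdint.h>

#define BLOCK_W 16  /* assumed block width in pixels */
#define BLOCK_H 8   /* assumed block height in rows */
#define X_ALIGN 4   /* assumed dword alignment, with 1-byte pixels */

/* Hypothetical container for the values described in the cover letter. */
struct pp_masks {
    int dst_x;       /* dst x after shifting left to dword alignment */
    int dst_w;       /* dst width including the pixels added on the left */
    uint16_t left;   /* horizontal mask for the first block in a row */
    uint16_t right;  /* horizontal mask for the last block in a row */
    uint8_t bottom;  /* vertical mask for the last block in a column */
};

/* Assumes x >= 0, w > 0 and h > 0. */
static struct pp_masks compute_masks(int x, int w, int h)
{
    struct pp_masks m;
    int pad = x % X_ALIGN;  /* pixels gained by shifting x to the left */
    int rem_w, rem_h;

    m.dst_x = x - pad;
    m.dst_w = w + pad;

    /* Clear the low 'pad' bits so the padding pixels are not written. */
    m.left = (uint16_t)(0xffffu << pad);

    /* Keep only the valid columns of a partial last block. */
    rem_w = m.dst_w % BLOCK_W;
    m.right = rem_w ? (uint16_t)((1u << rem_w) - 1u) : 0xffff;

    /* Keep only the valid rows of a partial last block. */
    rem_h = h % BLOCK_H;
    m.bottom = rem_h ? (uint8_t)((1u << rem_h) - 1u) : 0xff;

    return m;
}

/* Horizontal mask for block i in a row of n blocks (n >= 1). */
static uint16_t row_block_mask(const struct pp_masks *m, int i, int n)
{
    uint16_t mask = 0xffff;         /* middle blocks write every column */

    if (i == 0)
        mask &= m->left;            /* first block: drop the left padding */
    if (i == n - 1)
        mask &= m->right;           /* last block: drop the right overhang */
    return mask;
}
```

With these assumptions, `compute_masks(6, 30, 20)` moves the dst x offset to 4, widens the region to 32 pixels, and yields a left mask of 0xfffc, a right mask of 0xffff and a bottom mask of 0x0f. Both edge masks are ANDed together in `row_block_mask` so that a row with a single block is still handled.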