Re: [Libva] [PATCH v1 7/9] ENC: add VME pipeline for AVC encoder

2017-01-18 Thread Qu, Pengfei


-Original Message-
From: Mark Thompson [mailto:s...@jkqxz.net] 
Sent: Thursday, January 19, 2017 7:25 AM
To: libva@lists.freedesktop.org; Qu, Pengfei 
Subject: Re: [Libva] [PATCH v1 7/9] ENC: add VME pipeline for AVC encoder

On 13/01/17 09:24, Pengfei Qu wrote:
> VME pipeline:
> add resource and surface allocation and free function add init table 
> for frame mbbrc update add scaling kernel for AVC encoder add BRC init 
> reset kernel for AVC RC logic add BRC frame update-kernel for AVC RC 
> logic add BRC MB level update kernel for AVC RC logic add REF frame QA 
> caculation and MB level const data add MBENC kernel for AVC encoder 
> add ME kernel for AVC encoder add WP/SFD kernel for AVC encoder add 
> kernel init/destroy function for AVC encoder add kernel related 
> parameter check function for AVC add VME pipeline init prepare/run 
> function for AVC encoder
> 
> Reviewed-by: Sean V Kelley
> Signed-off-by: Pengfei Qu 
> ---
>  src/gen9_avc_encoder.c | 5745 
> 
>  1 file changed, 5745 insertions(+)
>  create mode 100755 src/gen9_avc_encoder.c
> 
> diff --git a/src/gen9_avc_encoder.c b/src/gen9_avc_encoder.c new file 
> mode 100755 index 000..5caa9f4
> --- /dev/null
> +++ b/src/gen9_avc_encoder.c
> ...
> +
> +static void
> +gen9_avc_update_misc_parameters(VADriverContextP ctx,
> +struct encode_state *encode_state,
> +struct intel_encoder_context 
> +*encoder_context) {
> +struct encoder_vme_mfc_context * vme_context = (struct 
> encoder_vme_mfc_context *)encoder_context->vme_context;
> +struct generic_enc_codec_state * generic_state = (struct 
> generic_enc_codec_state * )vme_context->generic_enc_state;
> +int i;
> +
> +/* brc */
> +generic_state->max_bit_rate = 
> ALIGN(encoder_context->brc.bits_per_second[0], 1000) / 1000;
> +generic_state->window_size = encoder_context->brc.window_size;
> +generic_state->brc_need_reset = encoder_context->brc.need_reset;
> +
> +if (generic_state->internal_rate_mode == VA_RC_CBR) {
> +generic_state->min_bit_rate = generic_state->max_bit_rate;
> +generic_state->mb_brc_enabled = 
> + encoder_context->brc.mb_rate_control[0];
> +
> +if (generic_state->target_bit_rate != generic_state->max_bit_rate) {
> +generic_state->target_bit_rate = generic_state->max_bit_rate;
> +generic_state->brc_need_reset = 1;
> +}
> +} else if (generic_state->internal_rate_mode == VA_RC_VBR) {
> +generic_state->min_bit_rate = generic_state->max_bit_rate * (2 * 
> encoder_context->brc.target_percentage[0] - 100) / 100;
> +generic_state->mb_brc_enabled = 
> + encoder_context->brc.mb_rate_control[0];
> +
> +if (generic_state->target_bit_rate != generic_state->max_bit_rate * 
> encoder_context->brc.target_percentage[0] / 100) {
> +generic_state->target_bit_rate = generic_state->max_bit_rate * 
> encoder_context->brc.target_percentage[0] / 100;
> +generic_state->brc_need_reset = 1;
> +}
> +}
> +
> +/*  frame rate */
> +generic_state->frames_per_100s = 
> + encoder_context->brc.framerate[0].num/encoder_context->brc.framerate
> + [0].den * 100;

The framerate need not be set in CQP mode - this crashes with a division by 
zero in that case.
[Pengfei] sure. I will add check here to fix it.

Also, it discards the fractional part of the framerate by rounding down before 
multiplying by 100 - is that deliberate?
[Pengfei] I check the RC algorithm interface, it use 100x to do calculation. 

> +generic_state->frame_rate = 
> + encoder_context->brc.framerate[0].num/encoder_context->brc.framerate
> + [0].den ;
> +
> +/*  HRD */
> +if (generic_state->internal_rate_mode != VA_RC_CQP)
> +{
> +generic_state->vbv_buffer_size_in_bit = 
> encoder_context->brc.hrd_buffer_size;//misc->buffer_size;
> +generic_state->init_vbv_buffer_fullness_in_bit = 
> encoder_context->brc.hrd_initial_buffer_fullness;//misc->initial_buffer_fullness;
> +}
> +
> +/* ROI */
> +generic_state->num_roi = MIN(encoder_context->brc.num_roi, 3);
> +if (generic_state->num_roi > 0) {
> +generic_state->max_delta_qp = encoder_context->brc.roi_max_delta_qp;
> +generic_state->min_delta_qp = 
> + encoder_context->brc.roi_min_delta_qp;
> +
> +for (i = 0; i < generic_state->num_roi; i++) {
> +generic_state->roi[i].left   = encoder_context->brc.roi[i].left;
> +generic_state->roi[i].right  = encoder_context->brc.roi[i].right;
> +generic_state->roi[i].top= encoder_context->brc.roi[i].top;
> +generic_state->roi[i].bottom = 
> encoder_context->brc.roi[i].bottom;
> +generic_state->roi[i].value  = 
> + encoder_context->brc.roi[i].value;
> +
> +generic_state->roi[i].left /= 16;
> +generic_state->roi[i].right /= 16;
> + 

[Libva] [PATCH] va: Keep compatibility with the backend driver built against 0.39.x

2017-01-18 Thread Xiang, Haihao
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99375
Signed-off-by: Xiang, Haihao 
---
 va/va.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/va/va.c b/va/va.c
index 5cf7220..5040dc1 100644
--- a/va/va.c
+++ b/va/va.c
@@ -303,6 +303,7 @@ static VAStatus va_openDriver(VADisplay dpy, char 
*driver_name)
 int minor;
 } compatible_versions[] = {
 { VA_MAJOR_VERSION, VA_MINOR_VERSION },
+{ 0, 39 },
 { 0, 38 },
 { 0, 37 },
 { 0, 36 },
-- 
1.9.1

___
Libva mailing list
Libva@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libva


Re: [Libva] [PATCH V1] libva-intel-driver: fix i965 encoder wrong bits shift operation

2017-01-18 Thread Xiang, Haihao

Thanks for the patch, applied.

> From: Kuang-che Wu 
> 
> shift uint32_t by 32 bits is undefined behavior.
> 
> For this particular case: when invoke avc_bitstream_put_ui() with 32
> bits value at byte position of multiple of 4, existing 32 bits garbage
> data in the buffer may be retained instead of cleared. The result is,
> the position of NALU start code (0x0001) looks like overwritten by
> garbage value.
> 
> Patch has been tested and used upstream:
> https://chromium-review.googlesource.com/#/c/410541/
> 
> Signed-off-by: Kuang-che Wu 
> Signed-off-by: Sean V Kelley 
> ---
>  src/i965_encoder_utils.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/src/i965_encoder_utils.c b/src/i965_encoder_utils.c
> index ac58cd1..8db1b87 100644
> --- a/src/i965_encoder_utils.c
> +++ b/src/i965_encoder_utils.c
> @@ -134,7 +134,11 @@ avc_bitstream_put_ui(avc_bitstream *bs, unsigned int val,
> int size_in_bits)
>  bs->buffer[pos] = (bs->buffer[pos] << size_in_bits | val);
>  } else {
>  size_in_bits -= bit_left;
> -bs->buffer[pos] = (bs->buffer[pos] << bit_left) | (val >>
> size_in_bits);
> +if (bit_left == 32) {
> +bs->buffer[pos] = val;
> +} else {
> +bs->buffer[pos] = (bs->buffer[pos] << bit_left) | (val >>
> size_in_bits);
> +}
>  bs->buffer[pos] = swap32(bs->buffer[pos]);
>  
>  if (pos + 1 == bs->max_size_in_dword) {
___
Libva mailing list
Libva@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libva


Re: [Libva] [PATCH v1 0/9]Encoder Architecture Changes (Primarily AVC)

2017-01-18 Thread Mark Thompson
On 13/01/17 09:24, Pengfei Qu wrote:
> Encoder architecture restructuring for H.264 (with some impact to HEVC now) 
> on HSW+
> * Improvements to the shaders
> * Improvements to the B frame efficiency
> * Improvements to the low bit rate mode
> * Improved features in two stage VME/PAK pipeline
> 
> v1:
> Reduce the patch number and re org for VME and MFX related patches.
> Patch re org for VME pipeline
> Patch re org for MFX pipeline
> keep assert for internal logic and replace assert for input validation 
> function.
> Remove unnecessary comments and enum value.
> Use the 64bit version OUT_BCS_RELOC64.
> Move kernel binary into header file.
> use misc parameter from encoder_context structure.

I've had a go with this on Skylake.  In general, I see significant gains in 
quality with similar performance (yay), however I found some issues as well.

CQP mode seems to have regressed significantly in speed - it is maybe 25% 
slower than CBR/VBR now (though indeed higher quality, particularly on 
B-frames).  Is this expected?  I would have thought it should be the "easiest" 
(and therefore fastest) mode.

Also, there seems to be something funny going on in the VBR rate controller.  
Sometimes (nondeterministically, with the same parameters) the beginning of the 
stream gets stuck at a very high QP / low bitrate for a long period, making the 
output video terrible quality.  After some time (maybe a few thousand frames) 
it recovers and thereafter acts normally.  It seems to happen entirely randomly 
with low probability (less than 10%, maybe?), with no obvious connection to the 
encoding parameters.

I found some things which might be related (but equally could just be 
perturbing something else, for example by changing the timing):
* It never seems to happen if the encoder input comes directly from a decoder - 
I have only seen it when there is a VPP instance in between them (though it 
need not do anything to the video - it can just copy to a new surface of the 
same size).
* I tested on two different machines and it only seems to happen on one of 
them: it happens on a 6260U (GT3), but not on a 6300 (GT2).

Can you offer any thoughts on what might be relevant which I could test for?  
(Currently my reproduction method is just "transcode videos between sizes 
repeatedly until it happens", which I realise is not very helpful.  I am happy 
to try to narrow that down a bit if I could have any idea what I can look for.)

Thanks,

- Mark
___
Libva mailing list
Libva@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libva


Re: [Libva] [PATCH v1 7/9] ENC: add VME pipeline for AVC encoder

2017-01-18 Thread Mark Thompson
On 13/01/17 09:24, Pengfei Qu wrote:
> VME pipeline:
> add resource and surface allocation and free function
> add init table for frame mbbrc update
> add scaling kernel for AVC encoder
> add BRC init reset kernel for AVC RC logic
> add BRC frame update-kernel for AVC RC logic
> add BRC MB level update kernel for AVC RC logic
> add REF frame QA caculation and MB level const data
> add MBENC kernel for AVC encoder
> add ME kernel for AVC encoder
> add WP/SFD kernel for AVC encoder
> add kernel init/destroy function for AVC encoder
> add kernel related parameter check function for AVC
> add VME pipeline init prepare/run function for AVC encoder
> 
> Reviewed-by: Sean V Kelley
> Signed-off-by: Pengfei Qu 
> ---
>  src/gen9_avc_encoder.c | 5745 
> 
>  1 file changed, 5745 insertions(+)
>  create mode 100755 src/gen9_avc_encoder.c
> 
> diff --git a/src/gen9_avc_encoder.c b/src/gen9_avc_encoder.c
> new file mode 100755
> index 000..5caa9f4
> --- /dev/null
> +++ b/src/gen9_avc_encoder.c
> ...
> +
> +static void
> +gen9_avc_update_misc_parameters(VADriverContextP ctx,
> +struct encode_state *encode_state,
> +struct intel_encoder_context 
> *encoder_context)
> +{
> +struct encoder_vme_mfc_context * vme_context = (struct 
> encoder_vme_mfc_context *)encoder_context->vme_context;
> +struct generic_enc_codec_state * generic_state = (struct 
> generic_enc_codec_state * )vme_context->generic_enc_state;
> +int i;
> +
> +/* brc */
> +generic_state->max_bit_rate = 
> ALIGN(encoder_context->brc.bits_per_second[0], 1000) / 1000;
> +generic_state->window_size = encoder_context->brc.window_size;
> +generic_state->brc_need_reset = encoder_context->brc.need_reset;
> +
> +if (generic_state->internal_rate_mode == VA_RC_CBR) {
> +generic_state->min_bit_rate = generic_state->max_bit_rate;
> +generic_state->mb_brc_enabled = 
> encoder_context->brc.mb_rate_control[0];
> +
> +if (generic_state->target_bit_rate != generic_state->max_bit_rate) {
> +generic_state->target_bit_rate = generic_state->max_bit_rate;
> +generic_state->brc_need_reset = 1;
> +}
> +} else if (generic_state->internal_rate_mode == VA_RC_VBR) {
> +generic_state->min_bit_rate = generic_state->max_bit_rate * (2 * 
> encoder_context->brc.target_percentage[0] - 100) / 100;
> +generic_state->mb_brc_enabled = 
> encoder_context->brc.mb_rate_control[0];
> +
> +if (generic_state->target_bit_rate != generic_state->max_bit_rate * 
> encoder_context->brc.target_percentage[0] / 100) {
> +generic_state->target_bit_rate = generic_state->max_bit_rate * 
> encoder_context->brc.target_percentage[0] / 100;
> +generic_state->brc_need_reset = 1;
> +}
> +}
> +
> +/*  frame rate */
> +generic_state->frames_per_100s = 
> encoder_context->brc.framerate[0].num/encoder_context->brc.framerate[0].den * 
> 100;

The framerate need not be set in CQP mode - this crashes with a division by 
zero in that case.

Also, it discards the fractional part of the framerate by rounding down before 
multiplying by 100 - is that deliberate?

> +generic_state->frame_rate = 
> encoder_context->brc.framerate[0].num/encoder_context->brc.framerate[0].den ;
> +
> +/*  HRD */
> +if (generic_state->internal_rate_mode != VA_RC_CQP)
> +{
> +generic_state->vbv_buffer_size_in_bit = 
> encoder_context->brc.hrd_buffer_size;//misc->buffer_size;
> +generic_state->init_vbv_buffer_fullness_in_bit = 
> encoder_context->brc.hrd_initial_buffer_fullness;//misc->initial_buffer_fullness;
> +}
> +
> +/* ROI */
> +generic_state->num_roi = MIN(encoder_context->brc.num_roi, 3);
> +if (generic_state->num_roi > 0) {
> +generic_state->max_delta_qp = encoder_context->brc.roi_max_delta_qp;
> +generic_state->min_delta_qp = encoder_context->brc.roi_min_delta_qp;
> +
> +for (i = 0; i < generic_state->num_roi; i++) {
> +generic_state->roi[i].left   = encoder_context->brc.roi[i].left;
> +generic_state->roi[i].right  = encoder_context->brc.roi[i].right;
> +generic_state->roi[i].top= encoder_context->brc.roi[i].top;
> +generic_state->roi[i].bottom = 
> encoder_context->brc.roi[i].bottom;
> +generic_state->roi[i].value  = encoder_context->brc.roi[i].value;
> +
> +generic_state->roi[i].left /= 16;
> +generic_state->roi[i].right /= 16;
> +generic_state->roi[i].top /= 16;
> +generic_state->roi[i].bottom /= 16;
> +}
> +}
> +
> +}
> +
___
Libva mailing list
Libva@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libva


[Libva] [PATCH V1] libva-intel-driver: fix i965 encoder wrong bits shift operation

2017-01-18 Thread Sean V Kelley
From: Kuang-che Wu 

shift uint32_t by 32 bits is undefined behavior.

For this particular case: when invoke avc_bitstream_put_ui() with 32
bits value at byte position of multiple of 4, existing 32 bits garbage
data in the buffer may be retained instead of cleared. The result is,
the position of NALU start code (0x0001) looks like overwritten by
garbage value.

Patch has been tested and used upstream:
https://chromium-review.googlesource.com/#/c/410541/

Signed-off-by: Kuang-che Wu 
Signed-off-by: Sean V Kelley 
---
 src/i965_encoder_utils.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/i965_encoder_utils.c b/src/i965_encoder_utils.c
index ac58cd1..8db1b87 100644
--- a/src/i965_encoder_utils.c
+++ b/src/i965_encoder_utils.c
@@ -134,7 +134,11 @@ avc_bitstream_put_ui(avc_bitstream *bs, unsigned int val, 
int size_in_bits)
 bs->buffer[pos] = (bs->buffer[pos] << size_in_bits | val);
 } else {
 size_in_bits -= bit_left;
-bs->buffer[pos] = (bs->buffer[pos] << bit_left) | (val >> 
size_in_bits);
+if (bit_left == 32) {
+bs->buffer[pos] = val;
+} else {
+bs->buffer[pos] = (bs->buffer[pos] << bit_left) | (val >> 
size_in_bits);
+}
 bs->buffer[pos] = swap32(bs->buffer[pos]);
 
 if (pos + 1 == bs->max_size_in_dword) {
-- 
2.11.0

___
Libva mailing list
Libva@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libva


Re: [Libva] [PATCH] libva-intel-driver: fix i965 encoder wrong bits shift operation

2017-01-18 Thread Sean V Kelley
On Tue, Jan 17, 2017 at 6:48 PM, Xiang, Haihao 
wrote:

> On Tue, 2017-01-17 at 13:15 -0800, Sean V Kelley wrote:
> > From: Kuang-che Wu 
> >
> > shift uint32_t by 32 bits is undefined behavior.
> >
> > For this particular case: when invoke avc_bitstream_put_ui() with 32
> > bits value at byte position of multiple of 4, existing 32 bits garbage
> > data in the buffer may be retained instead of cleared. The result is,
> > the position of NALU start code (0x0001) looks like overwritten by
> > garbage value.
> >
> > Patch has been tested and used upstream:
> > https://chromium-review.googlesource.com/#/c/410541/
> >
> > Signed-off-by: Kuang-che Wu 
> > Signed-off-by: Sean V Kelley 
> > ---
> >  src/i965_encoder_utils.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/i965_encoder_utils.c b/src/i965_encoder_utils.c
> > index ac58cd1a..e061d071 100644
> > --- a/src/i965_encoder_utils.c
> > +++ b/src/i965_encoder_utils.c
> > @@ -134,7 +134,11 @@ avc_bitstream_put_ui(avc_bitstream *bs, unsigned
> int val, int size_in_bits)
> >  bs->buffer[pos] = (bs->buffer[pos] << size_in_bits | val);
> >  } else {
> >  size_in_bits -= bit_left;
> > -bs->buffer[pos] = (bs->buffer[pos] << bit_left) | (val >>
> size_in_bits);
> > +if (bit_left == 32) {
> > +bs->buffer[pos] = (val >> size_in_bits);
>
> The input value of size_in_bits should be less than or equal to 32, so now
> the
> value of size_in_bits is 0 for this case, I think  "bs->buffer[pos] = val"
> is
> more clear.
>

That's a reasonable change.  Thanks Haihao.  I'll update the patch.

Sean


>
> > +} else {
> > +bs->buffer[pos] = (bs->buffer[pos] << bit_left) | (val >>
> size_in_bits);
> > +}
> >  bs->buffer[pos] = swap32(bs->buffer[pos]);
> >
> >  if (pos + 1 == bs->max_size_in_dword) {
> ___
> Libva mailing list
> Libva@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/libva
>



-- 
Sean V. Kelley 
Open Source Technology Center / SSG
Intel Corp.
___
Libva mailing list
Libva@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libva


Re: [Libva] [PATCH v1 8/9] ENC: add MFX pipeline for AVC encoder

2017-01-18 Thread Qu, Pengfei


-Original Message-
From: Zhao, Yakui 
Sent: Tuesday, January 17, 2017 10:47 AM
To: Qu, Pengfei 
Cc: libva@lists.freedesktop.org
Subject: Re: [Libva] [PATCH v1 8/9] ENC: add MFX pipeline for AVC encoder

On 01/13/2017 05:24 PM, Pengfei Qu wrote:
> MFX pipeline:
> add MFX command for AVC encoder
> add MFX Picture slice level command init for AVC
> add MFX pipeline init prepare run for AVC encode
> add VME/MFX context init for AVC encoder
>

Please see the inline comment.

> Reviewed-by: Sean V Kelley
> Signed-off-by: Pengfei Qu
> ---
>   src/gen9_avc_encoder.c | 1887 
> +++-
>   1 file changed, 1886 insertions(+), 1 deletion(-)
>
> diff --git a/src/gen9_avc_encoder.c b/src/gen9_avc_encoder.c
> index 5caa9f4..a7545f1 100755
> --- a/src/gen9_avc_encoder.c
> +++ b/src/gen9_avc_encoder.c
> @@ -5742,4 +5742,1889 @@ gen9_avc_kernel_init(VADriverContextP ctx,
>   generic_ctx->pfn_send_brc_mb_update_surface = 
> gen9_avc_send_surface_brc_mb_update;
>   generic_ctx->pfn_send_sfd_surface = gen9_avc_send_surface_sfd;
>   generic_ctx->pfn_send_wp_surface = gen9_avc_send_surface_wp;
> -}
> \ No newline at end of file
> +}
> +
> +/*
> +PAK pipeline related function
> +*/
> +extern int
> +intel_avc_enc_slice_type_fixup(int slice_type);
> +
> +static void
> +gen9_mfc_avc_pipe_mode_select(VADriverContextP ctx,
> +  struct encode_state *encode_state,
> +  struct intel_encoder_context *encoder_context)
> +{
> +struct encoder_vme_mfc_context * pak_context = (struct 
> encoder_vme_mfc_context *)encoder_context->vme_context;
> +struct gen9_avc_encoder_context * avc_ctx = (struct 
> gen9_avc_encoder_context * )pak_context->private_enc_ctx;
> +struct generic_enc_codec_state * generic_state = (struct 
> generic_enc_codec_state * )pak_context->generic_enc_state;
> +struct intel_batchbuffer *batch = encoder_context->base.batch;
> +
> +BEGIN_BCS_BATCH(batch, 5);
> +
> +OUT_BCS_BATCH(batch, MFX_PIPE_MODE_SELECT | (5 - 2));
> +OUT_BCS_BATCH(batch,
> +  (0<<  29) |
> +  (MFX_LONG_MODE<<  17) |   /* Must be long format for 
> encoder */
> +  (MFD_MODE_VLD<<  15) |
> +  (0<<  13) |   /* VDEnc mode  is 1*/
> +  ((generic_state->curr_pak_pass != 
> (generic_state->num_pak_passes -1))<<  10) |   /* Stream-Out 
> Enable */
> +  ((!!avc_ctx->res_post_deblocking_output.bo)<<  9)  |/* 
> Post Deblocking Output */
> +  ((!!avc_ctx->res_pre_deblocking_output.bo)<<  8)  | /* 
> Pre Deblocking Output */
> +  (0<<  7)  |   /* Scaled surface enable */
> +  (0<<  6)  |   /* Frame statistics stream 
> out enable, always '1' in VDEnc mode */
> +  (0<<  5)  |   /* not in stitch mode */
> +  (1<<  4)  |   /* encoding mode */
> +  (MFX_FORMAT_AVC<<  0));
> +OUT_BCS_BATCH(batch,
> +  (0<<  7)  | /* expand NOA bus flag */
> +  (0<<  6)  | /* disable slice-level clock gating */
> +  (0<<  5)  | /* disable clock gating for NOA */
> +  (0<<  4)  | /* terminate if AVC motion and POC table error 
> occurs */
> +  (0<<  3)  | /* terminate if AVC mbdata error occurs */
> +  (0<<  2)  | /* terminate if AVC CABAC/CAVLC decode error 
> occurs */
> +  (0<<  1)  |
> +  (0<<  0));
> +OUT_BCS_BATCH(batch, 0);
> +OUT_BCS_BATCH(batch, 0);
> +
> +ADVANCE_BCS_BATCH(batch);
> +}
> +
> +static void
> +gen9_mfc_avc_surface_state(VADriverContextP ctx,
> +   struct intel_encoder_context *encoder_context,
> +   struct i965_gpe_resource *gpe_resource,
> +   int id)
> +{
> +struct intel_batchbuffer *batch = encoder_context->base.batch;
> +
> +BEGIN_BCS_BATCH(batch, 6);
> +
> +OUT_BCS_BATCH(batch, MFX_SURFACE_STATE | (6 - 2));
> +OUT_BCS_BATCH(batch, id);
> +OUT_BCS_BATCH(batch,
> +  ((gpe_resource->height - 1)<<  18) |
> +  ((gpe_resource->width - 1)<<  4));
> +OUT_BCS_BATCH(batch,
> +  (MFX_SURFACE_PLANAR_420_8<<  28) |/* 420 planar YUV 
> surface */
> +  (1<<  27) |   /* must be 1 for 
> interleave U/V, hardware requirement */
> +  ((gpe_resource->pitch - 1)<<  3) |/* pitch */
> +  (0<<  2)  |   /* must be 0 for 
> interleave U/V */
> +  (1<<  1)  |   /* must be tiled */
> +  (I965_TILEWALK_YMAJOR<<  0)); /* tile walk, 
> TILEWALK_YMAJOR */
> +OUT_BCS_BATCH(batch,
> +  (0<<  1

Re: [Libva] [PATCH v1 0/9]Encoder Architecture Changes (Primarily AVC)

2017-01-18 Thread Qu, Pengfei


-Original Message-
From: Zhao, Yakui 
Sent: Tuesday, January 17, 2017 10:30 AM
To: Qu, Pengfei 
Cc: libva@lists.freedesktop.org
Subject: Re: [Libva] [PATCH v1 0/9]Encoder Architecture Changes (Primarily AVC)

On 01/13/2017 05:24 PM, Pengfei Qu wrote:
> Encoder architecture restructuring for H.264 (with some impact to HEVC 
> now) on HSW+
> * Improvements to the shaders
> * Improvements to the B frame efficiency
> * Improvements to the low bit rate mode
> * Improved features in two stage VME/PAK pipeline
>
> v1:
> Reduce the patch number and re org for VME and MFX related patches.
> Patch re org for VME pipeline
> Patch re org for MFX pipeline
> keep assert for internal logic and replace assert for input validation 
> function.
> Remove unnecessary comments and enum value.
> Use the 64bit version OUT_BCS_RELOC64.
> Move kernel binary into header file.
> use misc parameter from encoder_context structure.
>

The patch set looks much clearer.

But it seems that sometimes the newly added file is not added into 
src/Makefile.am in time. In such case it is not compiled before the last patch 
is applied.

It will be better that the newly added file is added into src/Makefile.am in 
time, which is also convenient to cherry-pick them.
[Pengfei] sure. As there are big code changes in this series patches and so add 
into Makefile.am at the last patch. 
>
> Pengfe (9):
>ENC: move gpe related function into src/i965_gpe_utils.h/c
>ENC: add common structure for AVC/HEVC encoder
>ENC: add const data/table for AVC encoder
>ENC: add AVC kernel binary on SKL
>ENC: add AVC common structure and functions
>ENC: add kernel related structure and define for AVC
>ENC: add VME pipeline for AVC encoder
>ENC: add MFX pipeline for AVC encoder
>ENC:support more quality level and switch to new AVC encoder solution
>  on SKL
>
>   src/Makefile.am|10 +
>   src/gen9_avc_const_def.c   |  1090 
>   src/gen9_avc_const_def.h   |   115 +
>   src/gen9_avc_encoder.c |  7630 +
>   src/gen9_avc_encoder.h |  2339 
>   src/gen9_avc_encoder_kernels.h | 12078 
> +++
>   src/gen9_vp9_encoder.c |   154 +-
>   src/gen9_vp9_encoder.h |10 -
>   src/i965_avc_encoder_common.c  |   319 ++
>   src/i965_avc_encoder_common.h  |   305 +
>   src/i965_defines.h | 3 +
>   src/i965_drv_video.c   | 8 +-
>   src/i965_drv_video.h   | 2 +
>   src/i965_encoder.c |52 +-
>   src/i965_encoder_api.h |47 +
>   src/i965_encoder_common.c  |   124 +
>   src/i965_encoder_common.h  |   541 ++
>   src/i965_gpe_utils.c   |   282 +
>   src/i965_gpe_utils.h   |86 +
>   19 files changed, 25027 insertions(+), 168 deletions(-)
>   create mode 100755 src/gen9_avc_const_def.c
>   create mode 100755 src/gen9_avc_const_def.h
>   create mode 100755 src/gen9_avc_encoder.c
>   create mode 100755 src/gen9_avc_encoder.h
>   create mode 100755 src/gen9_avc_encoder_kernels.h
>   create mode 100755 src/i965_avc_encoder_common.c
>   create mode 100755 src/i965_avc_encoder_common.h
>   create mode 100755 src/i965_encoder_api.h
>   create mode 100755 src/i965_encoder_common.c
>   create mode 100755 src/i965_encoder_common.h
>

___
Libva mailing list
Libva@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libva


Re: [Libva] [PATCH v1 2/9] ENC: add common structure for AVC/HEVC encoder

2017-01-18 Thread Qu, Pengfei


-Original Message-
From: Zhao, Yakui 
Sent: Tuesday, January 17, 2017 9:15 AM
To: Qu, Pengfei 
Cc: libva@lists.freedesktop.org
Subject: Re: [Libva] [PATCH v1 2/9] ENC: add common structure for AVC/HEVC 
encoder

On 01/13/2017 05:24 PM, Pengfei Qu wrote:
> add context init function for AVC encoder
>

More comments are added.

> Signed-off-by: Pengfei Qu
> Reviewed-by: Sean V Kelley
> ---
>   src/i965_encoder_api.h|  47 
>   src/i965_encoder_common.c | 124 +++
>   src/i965_encoder_common.h | 533 
> ++
>   3 files changed, 704 insertions(+)
>   create mode 100755 src/i965_encoder_api.h
>   create mode 100755 src/i965_encoder_common.c
>   create mode 100755 src/i965_encoder_common.h
>
> diff --git a/src/i965_encoder_api.h b/src/i965_encoder_api.h new file 
> mode 100755 index 000..ebb0edc
> --- /dev/null
> +++ b/src/i965_encoder_api.h
> @@ -0,0 +1,47 @@
> +/*
> + * Copyright © 2016 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person 
> +obtaining a
> + * copy of this software and associated documentation files (the
> + * "Software"), to deal in the Software without restriction, 
> +including
> + * without limitation the rights to use, copy, modify, merge, 
> +publish,
> + * distribute, sub license, and/or sell copies of the Software, and 
> +to
> + * permit persons to whom the Software is furnished to do so, subject 
> +to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notice (including 
> +the
> + * next paragraph) shall be included in all copies or substantial 
> +portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> +EXPRESS
> + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
> + * IN NO EVENT SHALL PRECISION INSIGHT AND/OR ITS SUPPLIERS BE LIABLE 
> +FOR
> + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF 
> +CONTRACT,
> + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
> + *
> + * Authors:
> + * Pengfei Qu
> + *
> + */
> +
> +#ifndef _I965_ENCODER_API_H_
> +#define _I965_ENCODER_API_H_
> +
> +#include
> +
> +struct intel_encoder_context;
> +struct hw_context;
> +
> +/* H264/AVC */
> +extern Bool
> +gen9_avc_vme_context_init(VADriverContextP ctx, struct 
> +intel_encoder_context *encoder_context);
> +
> +extern Bool
> +gen9_avc_pak_context_init(VADriverContextP ctx, struct 
> +intel_encoder_context *encoder_context);
> +
> +extern VAStatus
> +gen9_avc_coded_status(VADriverContextP ctx, char *buffer, struct 
> +hw_context *hw_context);
> +
> +#endif  // _I965_ENCODER_API_H_
> diff --git a/src/i965_encoder_common.c b/src/i965_encoder_common.c new 
> file mode 100755 index 000..930aba9
> --- /dev/null
> +++ b/src/i965_encoder_common.c
> @@ -0,0 +1,124 @@
> +/*
> + * Copyright ? 2016 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person 
> +obtaining a
> + * copy of this software and associated documentation files (the
> + * "Software"), to deal in the Software without restriction, 
> +including
> + * without limitation the rights to use, copy, modify, merge, 
> +publish,
> + * distribute, sub license, and/or sell copies of the Software, and 
> +to
> + * permit persons to whom the Software is furnished to do so, subject 
> +to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notice (including 
> +the
> + * next paragraph) shall be included in all copies or substantial 
> +portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> +EXPRESS
> + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
> + * IN NO EVENT SHALL PRECISION INSIGHT AND/OR ITS SUPPLIERS BE LIABLE 
> +FOR
> + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF 
> +CONTRACT,
> + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> + * SOFTWAR
> + *
> + * Authors:
> + * Pengfei Qu
> + *
> + */
> +#include
> +#include
> +#include "intel_batchbuffer.h"
> +#include "intel_driver.h"
> +#include "i965_encoder_common.h"
> +#include "i965_gpe_utils.h"
> +
> +
> +const unsigned int table_enc_search_path[2][8][16] = {
> +// I-Frame&  P-Frame
> +{
> +// MEMethod: 0
> +{
> +0x120FF10F, 0x1E22E20D, 0x20E2FF10, 0x2EDD06FC, 0x11D33FF1, 
> 0xEB1FF33D, 0x4EF1F1F1, 0xF1F21211,
> +0x0DE0, 0x11201F1F, 0x1105F1CF, 0x, 0x, 
> 0x, 0x, 0x
> +},
> +// MEMethod: 1
> +{
> +0x120FF10F, 0x1E22E20D, 0x20E2FF10, 0x2EDD06FC, 0x11D33FF1, 
> 0xEB1FF33D, 0x4EF1F1F1, 0xF1F21211,
> +0x0DE0

Re: [Libva] [PATCH v1 2/9] ENC: add common structure for AVC/HEVC encoder

2017-01-18 Thread Qu, Pengfei


-Original Message-
From: Zhao, Yakui 
Sent: Tuesday, January 17, 2017 9:06 AM
To: Qu, Pengfei 
Cc: libva@lists.freedesktop.org
Subject: Re: [Libva] [PATCH v1 2/9] ENC: add common structure for AVC/HEVC 
encoder

On 01/13/2017 05:24 PM, Pengfei Qu wrote:
> add context init function for AVC encoder
>
> Signed-off-by: Pengfei Qu
> Reviewed-by: Sean V Kelley
> ---
>   src/i965_encoder_api.h|  47 
>   src/i965_encoder_common.c | 124 +++
>   src/i965_encoder_common.h | 533 
> ++
>   3 files changed, 704 insertions(+)
>   create mode 100755 src/i965_encoder_api.h
>   create mode 100755 src/i965_encoder_common.c
>   create mode 100755 src/i965_encoder_common.h
>
> diff --git a/src/i965_encoder_api.h b/src/i965_encoder_api.h new file 
> mode 100755 index 000..ebb0edc
> --- /dev/null
> +++ b/src/i965_encoder_api.h
> @@ -0,0 +1,47 @@
> +/*
> + * Copyright © 2016 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person 
> +obtaining a
> + * copy of this software and associated documentation files (the
> + * "Software"), to deal in the Software without restriction, 
> +including
> + * without limitation the rights to use, copy, modify, merge, 
> +publish,
> + * distribute, sub license, and/or sell copies of the Software, and 
> +to
> + * permit persons to whom the Software is furnished to do so, subject 
> +to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notice (including 
> +the
> + * next paragraph) shall be included in all copies or substantial 
> +portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> +EXPRESS
> + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
> + * IN NO EVENT SHALL PRECISION INSIGHT AND/OR ITS SUPPLIERS BE LIABLE 
> +FOR
> + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF 
> +CONTRACT,
> + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
> + *
> + * Authors:
> + * Pengfei Qu
> + *
> + */
> +
> +#ifndef _I965_ENCODER_API_H_
> +#define _I965_ENCODER_API_H_
> +
> +#include
> +
> +struct intel_encoder_context;
> +struct hw_context;
> +
> +/* H264/AVC */
> +extern Bool
> +gen9_avc_vme_context_init(VADriverContextP ctx, struct 
> +intel_encoder_context *encoder_context);
> +
> +extern Bool
> +gen9_avc_pak_context_init(VADriverContextP ctx, struct 
> +intel_encoder_context *encoder_context);
> +
> +extern VAStatus
> +gen9_avc_coded_status(VADriverContextP ctx, char *buffer, struct 
> +hw_context *hw_context);
> +
> +#endif  // _I965_ENCODER_API_H_
> diff --git a/src/i965_encoder_common.c b/src/i965_encoder_common.c new 
> file mode 100755 index 000..930aba9
> --- /dev/null
> +++ b/src/i965_encoder_common.c
> @@ -0,0 +1,124 @@
> +/*
> + * Copyright ? 2016 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person 
> +obtaining a
> + * copy of this software and associated documentation files (the
> + * "Software"), to deal in the Software without restriction, 
> +including
> + * without limitation the rights to use, copy, modify, merge, 
> +publish,
> + * distribute, sub license, and/or sell copies of the Software, and 
> +to
> + * permit persons to whom the Software is furnished to do so, subject 
> +to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notice (including 
> +the
> + * next paragraph) shall be included in all copies or substantial 
> +portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> +EXPRESS
> + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
> + * IN NO EVENT SHALL PRECISION INSIGHT AND/OR ITS SUPPLIERS BE LIABLE 
> +FOR
> + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF 
> +CONTRACT,
> + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> + * SOFTWAR
> + *
> + * Authors:
> + * Pengfei Qu
> + *
> + */
> +#include
> +#include
> +#include "intel_batchbuffer.h"
> +#include "intel_driver.h"
> +#include "i965_encoder_common.h"
> +#include "i965_gpe_utils.h"
> +
> +
> +const unsigned int table_enc_search_path[2][8][16] = {
> +// I-Frame&  P-Frame
> +{
> +// MEMethod: 0
> +{
> +0x120FF10F, 0x1E22E20D, 0x20E2FF10, 0x2EDD06FC, 0x11D33FF1, 
> 0xEB1FF33D, 0x4EF1F1F1, 0xF1F21211,
> +0x0DE0, 0x11201F1F, 0x1105F1CF, 0x, 0x, 
> 0x, 0x, 0x
> +},
> +// MEMethod: 1
> +{
> +0x120FF10F, 0x1E22E20D, 0x20E2FF10, 0x2EDD06FC, 0x11D33FF1, 
> 0xEB1FF33D, 0x4EF1F1F1, 0xF1F21211,
> +0x0DE0, 0x11201F1F, 0x1105F1CF, 0

Re: [Libva] [PATCH v1 1/9] ENC: move gpe related function into src/i965_gpe_utils.h/c

2017-01-18 Thread Qu, Pengfei


-Original Message-
From: Zhao, Yakui 
Sent: Tuesday, January 17, 2017 8:55 AM
To: Qu, Pengfei 
Cc: libva@lists.freedesktop.org
Subject: Re: [Libva] [PATCH v1 1/9] ENC: move gpe related function into 
src/i965_gpe_utils.h/c

On 01/13/2017 05:24 PM, Pengfei Qu wrote:
> v1:
> add align version for obj surface conversion to gpe surface remove 
> comments and enum value
>

This version patch looks much clearer.

But it seems that the this patch still adds more defintions besides moving the 
GPE functions for generic usage.
[Pengfei] the function will be commented out.

> Signed-off-by: Pengfei Qu
> Reviewed-by: Sean V Kelley
> ---
>   src/gen9_vp9_encoder.c | 154 ++-
>   src/gen9_vp9_encoder.h |  10 --
>   src/i965_defines.h |   3 +
>   src/i965_gpe_utils.c   | 282 
> +
>   src/i965_gpe_utils.h   |  86 +++
>   5 files changed, 377 insertions(+), 158 deletions(-)
>
> diff --git a/src/gen9_vp9_encoder.c b/src/gen9_vp9_encoder.c index 
> 05d86da..32ed729 100644
> --- a/src/gen9_vp9_encoder.c
> +++ b/src/gen9_vp9_encoder.c
> @@ -58,7 +58,6 @@
>   #define BRC_KERNEL_AVBR 0x0040
>   #define BRC_KERNEL_CQL  0x0080
>
> -#define DEFAULT_MOCS  0x02
>   #define VP9_PIC_STATE_BUFFER_SIZE 192
>
>   typedef struct _intel_kernel_header_ @@ -842,7 +841,7 @@ 
> gen9_vp9_free_resources(struct gen9_encoder_context_vp9 *vme_context)
>
>   static void
>   gen9_init_media_object_walker_parameter(struct intel_encoder_context 
> *encoder_context,
> -struct 
> vp9_encoder_kernel_walker_parameter *kernel_walker_param,
> +struct 
> + gpe_encoder_kernel_walker_parameter *kernel_walker_param,
>   struct 
> gpe_media_object_walker_parameter *walker_param)
>   {
>   memset(walker_param, 0, sizeof(*walker_param)); @@ -924,147 
> +923,6 @@ gen9_init_media_object_walker_parameter(struct 
> intel_encoder_context *encoder_co
>   }
>
>   static void
> -gen9_add_2d_gpe_surface(VADriverContextP ctx,
> -struct i965_gpe_context *gpe_context,
> -struct object_surface *obj_surface,
> -int is_uv_surface,
> -int is_media_block_rw,
> -unsigned int format,
> -int index)
> -{
> -struct i965_gpe_resource gpe_resource;
> -struct i965_gpe_surface gpe_surface;
> -
> -memset(&gpe_surface, 0, sizeof(gpe_surface));
> -
> -i965_object_surface_to_2d_gpe_resource(&gpe_resource, obj_surface);
> -gpe_surface.gpe_resource =&gpe_resource;
> -gpe_surface.is_2d_surface = 1;
> -gpe_surface.is_uv_surface = !!is_uv_surface;
> -gpe_surface.is_media_block_rw = !!is_media_block_rw;
> -
> -gpe_surface.cacheability_control = DEFAULT_MOCS;
> -gpe_surface.format = format;
> -
> -gen9_gpe_context_add_surface(gpe_context,&gpe_surface, index);
> -i965_free_gpe_resource(&gpe_resource);
> -}
> -
> -static void
> -gen9_add_adv_gpe_surface(VADriverContextP ctx,
> - struct i965_gpe_context *gpe_context,
> - struct object_surface *obj_surface,
> - int index)
> -{
> -struct i965_gpe_resource gpe_resource;
> -struct i965_gpe_surface gpe_surface;
> -
> -memset(&gpe_surface, 0, sizeof(gpe_surface));
> -
> -i965_object_surface_to_2d_gpe_resource(&gpe_resource, obj_surface);
> -gpe_surface.gpe_resource =&gpe_resource;
> -gpe_surface.is_adv_surface = 1;
> -gpe_surface.cacheability_control = DEFAULT_MOCS;
> -gpe_surface.v_direction = 2;
> -
> -gen9_gpe_context_add_surface(gpe_context,&gpe_surface, index);
> -i965_free_gpe_resource(&gpe_resource);
> -}
> -
> -static void
> -gen9_add_buffer_gpe_surface(VADriverContextP ctx,
> -struct i965_gpe_context *gpe_context,
> -struct i965_gpe_resource *gpe_buffer,
> -int is_raw_buffer,
> -unsigned int size,
> -unsigned int offset,
> -int index)
> -{
> -struct i965_gpe_surface gpe_surface;
> -
> -memset(&gpe_surface, 0, sizeof(gpe_surface));
> -
> -gpe_surface.gpe_resource = gpe_buffer;
> -gpe_surface.is_buffer = 1;
> -gpe_surface.is_raw_buffer = !!is_raw_buffer;
> -gpe_surface.cacheability_control = DEFAULT_MOCS;
> -gpe_surface.size = size;
> -gpe_surface.offset = offset;
> -
> -gen9_gpe_context_add_surface(gpe_context,&gpe_surface, index);
> -}
> -
> -static void
> -gen9_add_buffer_2d_gpe_surface(VADriverContextP ctx,
> -   struct i965_gpe_context *gpe_context,
> -   struct i965_gpe_resource *gpe_buffer,
> -   i