Re: [RESEND RFC/PATCH 3/8] media: platform: mtk-vpu: Support Mediatek VPU

2015-12-01 Thread andrew-ct chen
On Mon, 2015-11-30 at 15:36 +, Daniel Thompson wrote:
> On 30 November 2015 at 11:43, andrew-ct chen
>  wrote:
> > On Fri, 2015-11-27 at 12:21 +, Daniel Thompson wrote:
> >> On 27/11/15 12:10, andrew-ct chen wrote:
> >> >>> +
> >> >>> > >+  memcpy((void *)send_obj->share_buf, buf, len);
> >> >>> > >+  send_obj->len = len;
> >> >>> > >+  send_obj->id = id;
> >> >>> > >+  vpu_cfg_writel(vpu, 0x1, HOST_TO_VPU);
> >> >>> > >+
> >> >>> > >+  /* Wait until VPU receives the command */
> >> >>> > >+  timeout = jiffies + msecs_to_jiffies(IPI_TIMEOUT_MS);
> >> >>> > >+  do {
> >> >>> > >+  if (time_after(jiffies, timeout)) {
> >> >>> > >+  dev_err(vpu->dev, "vpu_ipi_send: IPI 
> >> >>> > >timeout!\n");
> >> >>> > >+  return -EIO;
> >> >>> > >+  }
> >> >>> > >+  } while (vpu_cfg_readl(vpu, HOST_TO_VPU));
> >> >> >
> >> >> >Do we need to busy wait every time we communicate with the 
> >> >> >co-processor?
> >> >> >Couldn't we put this wait*before*  we write to HOST_TO_VPU above.
> >> >> >
> >> >> >That way we only spin when there is a need to.
> >> >> >
> >> > Since the hardware VPU only allows that one client sends the command to
> >> > it each time.
> >> > We need the wait to make sure VPU accepted the command and cleared the
> >> > interrupt and then the next command would be served.
> >>
> >> I understand that the VPU  can only have on message outstanding at once.
> >>
> >> I just wonder why we busy wait *after* sending the first command rather
> >> than *before* sending the second one.
> >
> > No other special reasons. Just send one command and wait until VPU gets
> > the command. Then, I think this wait also can be put before we write to
> > HOST_TO_VPU.Is this better than former? May I know the reason?
> 
> Busy waiting is bad; it is a waste of host CPU processor time and/or power.
> 
> When the busy wait occurs after queuing then we will busy wait for
> every command we send.
> 
> If busy wait occurs before next queuing then we will wait for a
> shorter time in total because we have the chance to do something
> useful on the host before we have to wait.
> 

Got it. Thanks a lot for the explanation.
I'll put the busy wait before next queuing in next version. 

> 
> >> Streamed decode/encode typically ends up being rate controlled by
> >> capture or display meaning that in these cases we don't need to busy
> >> wait at all (because by the time we send the next frame the VPU has
> >> already accepted the previous message).
> >
> > For now, only one device "encoder" exists, it is true.
> > But, we'll have encoder and decoder devices, the decode and encode
> > requested to VPU are simultaneous.
> 
> Sure, I accept that lock and busy-wait is an appropriate way to ensure
> safety (assuming the VPU is fairly quick clearing the HOST_TO_VPU
> flag).
> 

The busy wait time is less than 1ms in average.

> 
> > Is this supposed to be removed for this patches and we can add it back
> > if the another device(decoder) is ready for review?
> 
> No I'm not suggesting that.
> 
> I only recommend moving the busy wait *before* end sending of command
> (for efficiency reasons mentioned above).

Ok. Thanks.

> 
> 
> Daniel.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RESEND RFC/PATCH 3/8] media: platform: mtk-vpu: Support Mediatek VPU

2015-11-30 Thread Daniel Thompson
On 30 November 2015 at 11:43, andrew-ct chen
 wrote:
> On Fri, 2015-11-27 at 12:21 +, Daniel Thompson wrote:
>> On 27/11/15 12:10, andrew-ct chen wrote:
>> >>> +
>> >>> > >+  memcpy((void *)send_obj->share_buf, buf, len);
>> >>> > >+  send_obj->len = len;
>> >>> > >+  send_obj->id = id;
>> >>> > >+  vpu_cfg_writel(vpu, 0x1, HOST_TO_VPU);
>> >>> > >+
>> >>> > >+  /* Wait until VPU receives the command */
>> >>> > >+  timeout = jiffies + msecs_to_jiffies(IPI_TIMEOUT_MS);
>> >>> > >+  do {
>> >>> > >+  if (time_after(jiffies, timeout)) {
>> >>> > >+  dev_err(vpu->dev, "vpu_ipi_send: IPI 
>> >>> > >timeout!\n");
>> >>> > >+  return -EIO;
>> >>> > >+  }
>> >>> > >+  } while (vpu_cfg_readl(vpu, HOST_TO_VPU));
>> >> >
>> >> >Do we need to busy wait every time we communicate with the co-processor?
>> >> >Couldn't we put this wait*before*  we write to HOST_TO_VPU above.
>> >> >
>> >> >That way we only spin when there is a need to.
>> >> >
>> > Since the hardware VPU only allows that one client sends the command to
>> > it each time.
>> > We need the wait to make sure VPU accepted the command and cleared the
>> > interrupt and then the next command would be served.
>>
>> I understand that the VPU  can only have on message outstanding at once.
>>
>> I just wonder why we busy wait *after* sending the first command rather
>> than *before* sending the second one.
>
> No other special reasons. Just send one command and wait until VPU gets
> the command. Then, I think this wait also can be put before we write to
> HOST_TO_VPU.Is this better than former? May I know the reason?

Busy waiting is bad; it is a waste of host CPU processor time and/or power.

When the busy wait occurs after queuing then we will busy wait for
every command we send.

If busy wait occurs before next queuing then we will wait for a
shorter time in total because we have the chance to do something
useful on the host before we have to wait.


>> Streamed decode/encode typically ends up being rate controlled by
>> capture or display meaning that in these cases we don't need to busy
>> wait at all (because by the time we send the next frame the VPU has
>> already accepted the previous message).
>
> For now, only one device "encoder" exists, it is true.
> But, we'll have encoder and decoder devices, the decode and encode
> requested to VPU are simultaneous.

Sure, I accept that lock and busy-wait is an appropriate way to ensure
safety (assuming the VPU is fairly quick clearing the HOST_TO_VPU
flag).


> Is this supposed to be removed for this patches and we can add it back
> if the another device(decoder) is ready for review?

No I'm not suggesting that.

I only recommend moving the busy wait *before* end sending of command
(for efficiency reasons mentioned above).


Daniel.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RESEND RFC/PATCH 3/8] media: platform: mtk-vpu: Support Mediatek VPU

2015-11-30 Thread andrew-ct chen
On Fri, 2015-11-27 at 12:21 +, Daniel Thompson wrote:
> On 27/11/15 12:10, andrew-ct chen wrote:
> >>> +
> >>> > >+  memcpy((void *)send_obj->share_buf, buf, len);
> >>> > >+  send_obj->len = len;
> >>> > >+  send_obj->id = id;
> >>> > >+  vpu_cfg_writel(vpu, 0x1, HOST_TO_VPU);
> >>> > >+
> >>> > >+  /* Wait until VPU receives the command */
> >>> > >+  timeout = jiffies + msecs_to_jiffies(IPI_TIMEOUT_MS);
> >>> > >+  do {
> >>> > >+  if (time_after(jiffies, timeout)) {
> >>> > >+  dev_err(vpu->dev, "vpu_ipi_send: IPI 
> >>> > >timeout!\n");
> >>> > >+  return -EIO;
> >>> > >+  }
> >>> > >+  } while (vpu_cfg_readl(vpu, HOST_TO_VPU));
> >> >
> >> >Do we need to busy wait every time we communicate with the co-processor?
> >> >Couldn't we put this wait*before*  we write to HOST_TO_VPU above.
> >> >
> >> >That way we only spin when there is a need to.
> >> >
> > Since the hardware VPU only allows that one client sends the command to
> > it each time.
> > We need the wait to make sure VPU accepted the command and cleared the
> > interrupt and then the next command would be served.
> 
> I understand that the VPU  can only have on message outstanding at once.
> 
> I just wonder why we busy wait *after* sending the first command rather 
> than *before* sending the second one.

No other special reasons. Just send one command and wait until VPU gets
the command. Then, I think this wait also can be put before we write to
HOST_TO_VPU.Is this better than former? May I know the reason?


> 
> Streamed decode/encode typically ends up being rate controlled by 
> capture or display meaning that in these cases we don't need to busy 
> wait at all (because by the time we send the next frame the VPU has 
> already accepted the previous message).

For now, only one device "encoder" exists, it is true.
But, we'll have encoder and decoder devices, the decode and encode
requested to VPU are simultaneous.
Is this supposed to be removed for this patches and we can add it back
if the another device(decoder) is ready for review?


Andrew


> 
> 
> Daniel.
> 
> 
> ___
> Linux-mediatek mailing list
> linux-media...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-mediatek


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RESEND RFC/PATCH 3/8] media: platform: mtk-vpu: Support Mediatek VPU

2015-11-27 Thread Daniel Thompson

On 27/11/15 12:10, andrew-ct chen wrote:

+
> >+  memcpy((void *)send_obj->share_buf, buf, len);
> >+  send_obj->len = len;
> >+  send_obj->id = id;
> >+  vpu_cfg_writel(vpu, 0x1, HOST_TO_VPU);
> >+
> >+  /* Wait until VPU receives the command */
> >+  timeout = jiffies + msecs_to_jiffies(IPI_TIMEOUT_MS);
> >+  do {
> >+  if (time_after(jiffies, timeout)) {
> >+  dev_err(vpu->dev, "vpu_ipi_send: IPI timeout!\n");
> >+  return -EIO;
> >+  }
> >+  } while (vpu_cfg_readl(vpu, HOST_TO_VPU));

>
>Do we need to busy wait every time we communicate with the co-processor?
>Couldn't we put this wait*before*  we write to HOST_TO_VPU above.
>
>That way we only spin when there is a need to.
>

Since the hardware VPU only allows that one client sends the command to
it each time.
We need the wait to make sure VPU accepted the command and cleared the
interrupt and then the next command would be served.


I understand that the VPU  can only have on message outstanding at once.

I just wonder why we busy wait *after* sending the first command rather 
than *before* sending the second one.


Streamed decode/encode typically ends up being rate controlled by 
capture or display meaning that in these cases we don't need to busy 
wait at all (because by the time we send the next frame the VPU has 
already accepted the previous message).



Daniel.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RESEND RFC/PATCH 3/8] media: platform: mtk-vpu: Support Mediatek VPU

2015-11-27 Thread andrew-ct chen
Thanks a lot for your review comments.

On Wed, 2015-11-25 at 16:11 +, Daniel Thompson wrote:
> On 17/11/15 12:54, Tiffany Lin wrote:
> > From: Andrew-CT Chen 
> >
> > The VPU driver for hw video codec embedded in Mediatek's MT8173 SOCs.
> > It is able to handle video decoding/encoding of in a range of formats.
> > The driver provides with VPU firmware download, memory management and
> > the communication interface between CPU and VPU.
> > For VPU initialization, it will create virtual memory for CPU access and
> > IOMMU address for vcodec hw device access. When a decode/encode instance
> > opens a device node, vpu driver will download vpu firmware to the device.
> > A decode/encode instant will decode/encode a frame using VPU
> > interface to interrupt vpu to handle decoding/encoding jobs.
> >
> > Signed-off-by: Andrew-CT Chen 
> > ---
> >   drivers/media/platform/Kconfig |6 +
> >   drivers/media/platform/Makefile|2 +
> >   drivers/media/platform/mtk-vpu/Makefile|1 +
> >   .../platform/mtk-vpu/h264_enc/venc_h264_vpu.h  |  127 +++
> >   .../media/platform/mtk-vpu/include/venc_ipi_msg.h  |  212 +
> >   drivers/media/platform/mtk-vpu/mtk_vpu_core.c  |  823 
> > 
> >   drivers/media/platform/mtk-vpu/mtk_vpu_core.h  |  161 
> >   .../media/platform/mtk-vpu/vp8_enc/venc_vp8_vpu.h  |  119 +++
> >   8 files changed, 1451 insertions(+)
> >   create mode 100644 drivers/media/platform/mtk-vpu/Makefile
> >   create mode 100644 drivers/media/platform/mtk-vpu/h264_enc/venc_h264_vpu.h
> >   create mode 100644 drivers/media/platform/mtk-vpu/include/venc_ipi_msg.h
> >   create mode 100644 drivers/media/platform/mtk-vpu/mtk_vpu_core.c
> >   create mode 100644 drivers/media/platform/mtk-vpu/mtk_vpu_core.h
> >   create mode 100644 drivers/media/platform/mtk-vpu/vp8_enc/venc_vp8_vpu.h
> >
> > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > index ccbc974..f98eb47 100644
> > --- a/drivers/media/platform/Kconfig
> > +++ b/drivers/media/platform/Kconfig
> > @@ -148,6 +148,12 @@ config VIDEO_CODA
> >Coda is a range of video codec IPs that supports
> >H.264, MPEG-4, and other video formats.
> >
> > +config MEDIATEK_VPU
> > +   bool "Mediatek Video Processor Unit"
> > +   ---help---
> > +   This driver provides downloading firmware vpu and
> > +   communicating with vpu.
> > +
> 
> This looks pretty broken.
> 
> Shouldn't this option be tristate? Why are there no depends-on or selects?
> 

Yes, this should be tristate and depends on VIDEO_DEV && VIDEO_V4L2 &&
ARCH_MEDIATEK

> Also I think the help text could be more descriptive here (and so does 
> checkpatch ;-) ).
> 

I'll put more descriptive. Thanks.

> 
> > diff --git a/drivers/media/platform/mtk-vpu/h264_enc/venc_h264_vpu.h 
> > b/drivers/media/platform/mtk-vpu/h264_enc/venc_h264_vpu.h
> > new file mode 100644
> > index 000..9c8ebdd
> > --- /dev/null
> > +++ b/drivers/media/platform/mtk-vpu/h264_enc/venc_h264_vpu.h
> 
> Why is this file included in *this* patch? It is not included by 
> anything in the patch and defines functions that do not exist yet.

I'll move this to the h264 patch.Thanks

> 
> 
> > diff --git a/drivers/media/platform/mtk-vpu/include/venc_ipi_msg.h 
> > b/drivers/media/platform/mtk-vpu/include/venc_ipi_msg.h
> > new file mode 100644
> > index 000..a345b98
> 
> This file also is not included by anything and should, perhaps be 
> included in a different patch.
> 

I'll move this to the v4l2 encoder patch.

> 
> > diff --git a/drivers/media/platform/mtk-vpu/mtk_vpu_core.c 
> > b/drivers/media/platform/mtk-vpu/mtk_vpu_core.c
> > new file mode 100644
> > index 000..b524dbc
> > --- /dev/null
> > +++ b/drivers/media/platform/mtk-vpu/mtk_vpu_core.c
> > @@ -0,0 +1,823 @@
> > +/*
> > +* Copyright (c) 2015 MediaTek Inc.
> > +* Author: Andrew-CT Chen 
> > +*
> > +* This program is free software; you can redistribute it and/or modify
> > +* it under the terms of the GNU General Public License version 2 as
> > +* published by the Free Software Foundation.
> > +*
> > +* This program is distributed in the hope that it will be useful,
> > +* but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +* GNU General Public License for more details.
> > +*/
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "mtk_vpu_core.h"
> > +
> > +/**
> > + * VPU (video processor unit) is a tiny processor controlling video 
> > hardware
> > + * related to video codec, scaling and color format converting.
> > + * VPU interfaces with other blocks by share memory and interrupt.
> > + **/
> > +#define MTK_VPU_DRV_NAME   "mtk_vpu"
> 
> This is only used once, why not just put this string directly into the 
> .nam

Re: [RESEND RFC/PATCH 3/8] media: platform: mtk-vpu: Support Mediatek VPU

2015-11-25 Thread Daniel Thompson

On 17/11/15 12:54, Tiffany Lin wrote:

From: Andrew-CT Chen 

The VPU driver for hw video codec embedded in Mediatek's MT8173 SOCs.
It is able to handle video decoding/encoding of in a range of formats.
The driver provides with VPU firmware download, memory management and
the communication interface between CPU and VPU.
For VPU initialization, it will create virtual memory for CPU access and
IOMMU address for vcodec hw device access. When a decode/encode instance
opens a device node, vpu driver will download vpu firmware to the device.
A decode/encode instant will decode/encode a frame using VPU
interface to interrupt vpu to handle decoding/encoding jobs.

Signed-off-by: Andrew-CT Chen 
---
  drivers/media/platform/Kconfig |6 +
  drivers/media/platform/Makefile|2 +
  drivers/media/platform/mtk-vpu/Makefile|1 +
  .../platform/mtk-vpu/h264_enc/venc_h264_vpu.h  |  127 +++
  .../media/platform/mtk-vpu/include/venc_ipi_msg.h  |  212 +
  drivers/media/platform/mtk-vpu/mtk_vpu_core.c  |  823 
  drivers/media/platform/mtk-vpu/mtk_vpu_core.h  |  161 
  .../media/platform/mtk-vpu/vp8_enc/venc_vp8_vpu.h  |  119 +++
  8 files changed, 1451 insertions(+)
  create mode 100644 drivers/media/platform/mtk-vpu/Makefile
  create mode 100644 drivers/media/platform/mtk-vpu/h264_enc/venc_h264_vpu.h
  create mode 100644 drivers/media/platform/mtk-vpu/include/venc_ipi_msg.h
  create mode 100644 drivers/media/platform/mtk-vpu/mtk_vpu_core.c
  create mode 100644 drivers/media/platform/mtk-vpu/mtk_vpu_core.h
  create mode 100644 drivers/media/platform/mtk-vpu/vp8_enc/venc_vp8_vpu.h

diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index ccbc974..f98eb47 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -148,6 +148,12 @@ config VIDEO_CODA
   Coda is a range of video codec IPs that supports
   H.264, MPEG-4, and other video formats.

+config MEDIATEK_VPU
+   bool "Mediatek Video Processor Unit"
+   ---help---
+   This driver provides downloading firmware vpu and
+   communicating with vpu.
+


This looks pretty broken.

Shouldn't this option be tristate? Why are there no depends-on or selects?

Also I think the help text could be more descriptive here (and so does 
checkpatch ;-) ).




diff --git a/drivers/media/platform/mtk-vpu/h264_enc/venc_h264_vpu.h 
b/drivers/media/platform/mtk-vpu/h264_enc/venc_h264_vpu.h
new file mode 100644
index 000..9c8ebdd
--- /dev/null
+++ b/drivers/media/platform/mtk-vpu/h264_enc/venc_h264_vpu.h


Why is this file included in *this* patch? It is not included by 
anything in the patch and defines functions that do not exist yet.




diff --git a/drivers/media/platform/mtk-vpu/include/venc_ipi_msg.h 
b/drivers/media/platform/mtk-vpu/include/venc_ipi_msg.h
new file mode 100644
index 000..a345b98


This file also is not included by anything and should, perhaps be 
included in a different patch.




diff --git a/drivers/media/platform/mtk-vpu/mtk_vpu_core.c 
b/drivers/media/platform/mtk-vpu/mtk_vpu_core.c
new file mode 100644
index 000..b524dbc
--- /dev/null
+++ b/drivers/media/platform/mtk-vpu/mtk_vpu_core.c
@@ -0,0 +1,823 @@
+/*
+* Copyright (c) 2015 MediaTek Inc.
+* Author: Andrew-CT Chen 
+*
+* This program is free software; you can redistribute it and/or modify
+* it under the terms of the GNU General Public License version 2 as
+* published by the Free Software Foundation.
+*
+* This program is distributed in the hope that it will be useful,
+* but WITHOUT ANY WARRANTY; without even the implied warranty of
+* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+* GNU General Public License for more details.
+*/
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mtk_vpu_core.h"
+
+/**
+ * VPU (video processor unit) is a tiny processor controlling video hardware
+ * related to video codec, scaling and color format converting.
+ * VPU interfaces with other blocks by share memory and interrupt.
+ **/
+#define MTK_VPU_DRV_NAME   "mtk_vpu"


This is only used once, why not just put this string directly into the 
.name field?



+
+#define INIT_TIMEOUT_MS2000U
+#define IPI_TIMEOUT_MS 2000U
+#define VPU_FW_VER_LEN 16
+
+/* vpu extended virtural address */
+#define VPU_PMEM0_VIRT(vpu)((vpu)->mem.p_va)
+#define VPU_DMEM0_VIRT(vpu)((vpu)->mem.d_va)
+/* vpu extended iova address*/
+#define VPU_PMEM0_IOVA(vpu)((vpu)->mem.p_iova)
+#define VPU_DMEM0_IOVA(vpu)((vpu)->mem.d_iova)


These feel like obfuscation to me. The code looks clearer without the 
macro For example:


vpu->mem.p_va = dma_alloc_coherent(dev, ...);

Is much clearer than:

VPU_PMEM0_VIRT(vpu) = dma_alloc_coherent(dev, ...);


+
+#define VPU_PTCM(dev)  ((dev)->reg.sram)
+#