Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-19 Thread Ronald S. Bultje
Hi

On Wed, Aug 19, 2015 at 9:45 AM, Pedro Arthur  wrote:

> patch committed.


Nice work!

Ronald
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-19 Thread Pedro Arthur
patch committed.

2015-08-19 7:11 GMT-03:00 Michael Niedermayer :

> On Tue, Aug 18, 2015 at 11:36:16PM -0300, Pedro Arthur wrote:
> > Added copyright.
> > I've tried to push it (git push ffmpeg master --dry-run) but got the
> > following error:
> > fatal: remote error: access denied or repository not exported: /ffmpeg.git
> > with remote:
> > ffmpeg  git+ssh://source.ffmpeg.org/ffmpeg.git (fetch)
> > ffmpeg  git+ssh://source.ffmpeg.org/ffmpeg.git (push)
> >
> > is it correct?
>
> > Anyway I'm attaching the patch.
>
> still looks good
>
> [...]
>
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> He who knows, does not speak. He who speaks, does not know. -- Lao Tsu
>


Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-19 Thread Michael Niedermayer
On Tue, Aug 18, 2015 at 11:36:16PM -0300, Pedro Arthur wrote:
> Added copyright.
> I've tried to push it (git push ffmpeg master --dry-run) but got the
> following error:
> fatal: remote error: access denied or repository not exported: /ffmpeg.git
> with remote:
> ffmpeg  git+ssh://source.ffmpeg.org/ffmpeg.git (fetch)
> ffmpeg  git+ssh://source.ffmpeg.org/ffmpeg.git (push)
> 
> is it correct?

> Anyway I'm attaching the patch.

still looks good

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

He who knows, does not speak. He who speaks, does not know. -- Lao Tsu




Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-19 Thread Michael Niedermayer
On Tue, Aug 18, 2015 at 11:36:16PM -0300, Pedro Arthur wrote:
> Added copyright.
> I've tried to push it (git push ffmpeg master --dry-run) but got the
> following error:
> fatal: remote error: access denied or repository not exported: /ffmpeg.git
> with remote:
> ffmpeg  git+ssh://source.ffmpeg.org/ffmpeg.git (fetch)
> ffmpeg  git+ssh://source.ffmpeg.org/ffmpeg.git (push)

it's:
g...@source.ffmpeg.org:ffmpeg

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Rewriting code that is poorly written but fully understood is good.
Rewriting code that one doesn't understand is a sign that one is less smart
than the original author, trying to rewrite it will not make it better.




Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-18 Thread Pedro Arthur
Added copyright.
I've tried to push it (git push ffmpeg master --dry-run) but got the
following error:
fatal: remote error: access denied or repository not exported: /ffmpeg.git
with remote:
ffmpeg  git+ssh://source.ffmpeg.org/ffmpeg.git (fetch)
ffmpeg  git+ssh://source.ffmpeg.org/ffmpeg.git (push)

is it correct?
Anyway I'm attaching the patch.



2015-08-18 22:48 GMT-03:00 James Almer :

> On 18/08/15 6:30 PM, Pedro Arthur wrote:
> > diff --git a/libswscale/vscale.c b/libswscale/vscale.c
> > new file mode 100644
> > index 000..b62b385
> > --- /dev/null
> > +++ b/libswscale/vscale.c
> > @@ -0,0 +1,268 @@
> > +#include "swscale_internal.h"
> > +
> > +static int lum_planar_vscale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH)
>
> Please add a copyright header before pushing.
From deedf31c05dee4896a867b45790ded94c173a959 Mon Sep 17 00:00:00 2001
From: Pedro Arthur 
Date: Tue, 18 Aug 2015 11:47:55 -0300
Subject: [PATCH] swscale: refactor vertical scaler

---
 libswscale/Makefile   |   1 +
 libswscale/slice.c|  20 ++-
 libswscale/swscale.c  |  88 +++--
 libswscale/swscale_internal.h |  20 ++-
 libswscale/vscale.c   | 287 ++
 libswscale/x86/swscale.c  |   6 +-
 6 files changed, 380 insertions(+), 42 deletions(-)
 create mode 100644 libswscale/vscale.c

diff --git a/libswscale/Makefile b/libswscale/Makefile
index b2b6381..e70e358 100644
--- a/libswscale/Makefile
+++ b/libswscale/Makefile
@@ -17,6 +17,7 @@ OBJS = alphablend.o \
yuv2rgb.o\
slice.o  \
hscale.o \
+   vscale.o \
 
 OBJS-$(CONFIG_SHARED)+= log2_tab.o
 
diff --git a/libswscale/slice.c b/libswscale/slice.c
index 611e4e6..8fd16d3 100644
--- a/libswscale/slice.c
+++ b/libswscale/slice.c
@@ -214,6 +214,7 @@ int ff_init_filters(SwsContext * c)
 int index;
 int num_ydesc;
 int num_cdesc;
+int num_vdesc = isPlanarYUV(c->dstFormat) && !isGray(c->dstFormat) ? 2 : 1;
 int need_lum_conv = c->lumToYV12 || c->readLumPlanar || c->alpToYV12 || c->readAlpPlanar;
 int need_chr_conv = c->chrToYV12 || c->readChrPlanar;
 int srcIdx, dstIdx;
@@ -228,8 +229,8 @@ int ff_init_filters(SwsContext * c)
 num_ydesc = need_lum_conv ? 2 : 1;
 num_cdesc = need_chr_conv ? 2 : 1;
 
-c->numSlice = FFMAX(num_ydesc, num_cdesc) + 1;
-c->numDesc = num_ydesc + num_cdesc;
+c->numSlice = FFMAX(num_ydesc, num_cdesc) + 2;
+c->numDesc = num_ydesc + num_cdesc + num_vdesc;
 c->descIndex[0] = num_ydesc;
 c->descIndex[1] = num_ydesc + num_cdesc;
 
@@ -243,12 +244,13 @@ int ff_init_filters(SwsContext * c)
 
 res = alloc_slice(&c->slice[0], c->srcFormat, c->srcH, c->chrSrcH, c->chrSrcHSubSample, c->chrSrcVSubSample, 0);
 if (res < 0) goto cleanup;
-for (i = 1; i < c->numSlice-1; ++i) {
+for (i = 1; i < c->numSlice-2; ++i) {
 res = alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize + MAX_LINES_AHEAD, c->vChrFilterSize + MAX_LINES_AHEAD, c->chrSrcHSubSample, c->chrSrcVSubSample, 0);
 if (res < 0) goto cleanup;
 res = alloc_lines(&c->slice[i], FFALIGN(c->srcW*2+78, 16), c->srcW);
 if (res < 0) goto cleanup;
 }
+// horizontal scaler output
 res = alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize + MAX_LINES_AHEAD, c->vChrFilterSize + MAX_LINES_AHEAD, c->chrDstHSubSample, c->chrDstVSubSample, 1);
 if (res < 0) goto cleanup;
 res = alloc_lines(&c->slice[i], dst_stride, c->dstW);
@@ -256,6 +258,11 @@ int ff_init_filters(SwsContext * c)
 
 fill_ones(&c->slice[i], dst_stride>>1, c->dstBpc == 16);
 
+// vertical scaler output
+++i;
+res = alloc_slice(&c->slice[i], c->dstFormat, c->dstH, c->chrDstH, c->chrDstHSubSample, c->chrDstVSubSample, 0);
+if (res < 0) goto cleanup;
+
 index = 0;
 srcIdx = 0;
 dstIdx = 1;
@@ -290,6 +297,13 @@ int ff_init_filters(SwsContext * c)
 ff_init_desc_no_chr(&c->desc[index], &c->slice[srcIdx], &c->slice[dstIdx]);
 }
 
+++index;
+{
+srcIdx = c->numSlice - 2;
+dstIdx = c->numSlice - 1;
+ff_init_vscale(c, c->desc + index, c->slice + srcIdx, c->slice + dstIdx);
+}
+
 return 0;
 
 cleanup:
diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index 03019d4..d87efda 100644
--- a/libswscale/swscale.c
+++ b/libswscale/swscale.c
@@ -326,8 +326,8 @@ static int swscale(SwsContext *c, const uint8_t *src[],
 #endif
 const int dstW   = c->dstW;
 const int dstH   = c->dstH;
-const int chrDstW= c->chrDstW;
 #ifndef 

Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-18 Thread James Almer
On 18/08/15 6:30 PM, Pedro Arthur wrote:
> diff --git a/libswscale/vscale.c b/libswscale/vscale.c
> new file mode 100644
> index 000..b62b385
> --- /dev/null
> +++ b/libswscale/vscale.c
> @@ -0,0 +1,268 @@
> +#include "swscale_internal.h"
> +
> +static int lum_planar_vscale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH)

Please add a copyright header before pushing.


Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-18 Thread Michael Niedermayer
On Tue, Aug 18, 2015 at 06:30:28PM -0300, Pedro Arthur wrote:
> Patch with alpha fixed.
> 
> 2015-08-18 18:07 GMT-03:00 Michael Niedermayer :
> 
> > On Tue, Aug 18, 2015 at 04:27:42PM -0300, Pedro Arthur wrote:
> > > Attached patch with new vertical scaler code, added license and fixed
> > > compiler warnings.
> >
> >
> > split and applied first patch, had to change 2 asserts to make it work
> > without the vscale code
> >
> > [...]
> >
> > --
> > Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> >
> > Republics decline into democracies and democracies degenerate into
> > despotisms. -- Aristotle
> >

>  Makefile   |1 
>  slice.c|   20 +++
>  swscale.c  |   88 ++---
>  swscale_internal.h |   20 +++
>  vscale.c   |  268 
> +
>  x86/swscale.c  |6 -
>  6 files changed, 361 insertions(+), 42 deletions(-)
> a9980f543258376a8e333cdcc68115157db73bff  vscale.patch
> From 01e23c1b2302d9e4627a0ac872203f76d31a0492 Mon Sep 17 00:00:00 2001
> From: Pedro Arthur 
> Date: Tue, 18 Aug 2015 11:47:55 -0300
> Subject: [PATCH] swscale: refactor vertical scaler

patch looks good to me
feel free to push unless you prefer that i apply/push it

thanks
[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The educated differ from the uneducated as much as the living from the
dead. -- Aristotle 




Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-18 Thread Pedro Arthur
Patch with alpha fixed.

2015-08-18 18:07 GMT-03:00 Michael Niedermayer :

> On Tue, Aug 18, 2015 at 04:27:42PM -0300, Pedro Arthur wrote:
> > Attached patch with new vertical scaler code, added license and fixed
> > compiler warnings.
>
>
> split and applied first patch, had to change 2 asserts to make it work
> without the vscale code
>
> [...]
>
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Republics decline into democracies and democracies degenerate into
> despotisms. -- Aristotle
>
From 01e23c1b2302d9e4627a0ac872203f76d31a0492 Mon Sep 17 00:00:00 2001
From: Pedro Arthur 
Date: Tue, 18 Aug 2015 11:47:55 -0300
Subject: [PATCH] swscale: refactor vertical scaler

---
 libswscale/Makefile   |   1 +
 libswscale/slice.c|  20 +++-
 libswscale/swscale.c  |  88 --
 libswscale/swscale_internal.h |  20 +++-
 libswscale/vscale.c   | 268 ++
 libswscale/x86/swscale.c  |   6 +-
 6 files changed, 361 insertions(+), 42 deletions(-)
 create mode 100644 libswscale/vscale.c

diff --git a/libswscale/Makefile b/libswscale/Makefile
index b2b6381..e70e358 100644
--- a/libswscale/Makefile
+++ b/libswscale/Makefile
@@ -17,6 +17,7 @@ OBJS = alphablend.o \
yuv2rgb.o\
slice.o  \
hscale.o \
+   vscale.o \
 
 OBJS-$(CONFIG_SHARED)+= log2_tab.o
 
diff --git a/libswscale/slice.c b/libswscale/slice.c
index 611e4e6..8fd16d3 100644
--- a/libswscale/slice.c
+++ b/libswscale/slice.c
@@ -214,6 +214,7 @@ int ff_init_filters(SwsContext * c)
 int index;
 int num_ydesc;
 int num_cdesc;
+int num_vdesc = isPlanarYUV(c->dstFormat) && !isGray(c->dstFormat) ? 2 : 1;
 int need_lum_conv = c->lumToYV12 || c->readLumPlanar || c->alpToYV12 || c->readAlpPlanar;
 int need_chr_conv = c->chrToYV12 || c->readChrPlanar;
 int srcIdx, dstIdx;
@@ -228,8 +229,8 @@ int ff_init_filters(SwsContext * c)
 num_ydesc = need_lum_conv ? 2 : 1;
 num_cdesc = need_chr_conv ? 2 : 1;
 
-c->numSlice = FFMAX(num_ydesc, num_cdesc) + 1;
-c->numDesc = num_ydesc + num_cdesc;
+c->numSlice = FFMAX(num_ydesc, num_cdesc) + 2;
+c->numDesc = num_ydesc + num_cdesc + num_vdesc;
 c->descIndex[0] = num_ydesc;
 c->descIndex[1] = num_ydesc + num_cdesc;
 
@@ -243,12 +244,13 @@ int ff_init_filters(SwsContext * c)
 
 res = alloc_slice(&c->slice[0], c->srcFormat, c->srcH, c->chrSrcH, c->chrSrcHSubSample, c->chrSrcVSubSample, 0);
 if (res < 0) goto cleanup;
-for (i = 1; i < c->numSlice-1; ++i) {
+for (i = 1; i < c->numSlice-2; ++i) {
 res = alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize + MAX_LINES_AHEAD, c->vChrFilterSize + MAX_LINES_AHEAD, c->chrSrcHSubSample, c->chrSrcVSubSample, 0);
 if (res < 0) goto cleanup;
 res = alloc_lines(&c->slice[i], FFALIGN(c->srcW*2+78, 16), c->srcW);
 if (res < 0) goto cleanup;
 }
+// horizontal scaler output
 res = alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize + MAX_LINES_AHEAD, c->vChrFilterSize + MAX_LINES_AHEAD, c->chrDstHSubSample, c->chrDstVSubSample, 1);
 if (res < 0) goto cleanup;
 res = alloc_lines(&c->slice[i], dst_stride, c->dstW);
@@ -256,6 +258,11 @@ int ff_init_filters(SwsContext * c)
 
 fill_ones(&c->slice[i], dst_stride>>1, c->dstBpc == 16);
 
+// vertical scaler output
+++i;
+res = alloc_slice(&c->slice[i], c->dstFormat, c->dstH, c->chrDstH, c->chrDstHSubSample, c->chrDstVSubSample, 0);
+if (res < 0) goto cleanup;
+
 index = 0;
 srcIdx = 0;
 dstIdx = 1;
@@ -290,6 +297,13 @@ int ff_init_filters(SwsContext * c)
 ff_init_desc_no_chr(&c->desc[index], &c->slice[srcIdx], &c->slice[dstIdx]);
 }
 
+++index;
+{
+srcIdx = c->numSlice - 2;
+dstIdx = c->numSlice - 1;
+ff_init_vscale(c, c->desc + index, c->slice + srcIdx, c->slice + dstIdx);
+}
+
 return 0;
 
 cleanup:
diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index 5faf1e6..be950ed 100644
--- a/libswscale/swscale.c
+++ b/libswscale/swscale.c
@@ -326,8 +326,8 @@ static int swscale(SwsContext *c, const uint8_t *src[],
 #endif
 const int dstW   = c->dstW;
 const int dstH   = c->dstH;
-const int chrDstW= c->chrDstW;
 #ifndef NEW_FILTER
+const int chrDstW= c->chrDstW;
 const int chrSrcW= c->chrSrcW;
 const int lumXInc= c->lumXInc;
 const int chrXInc= c->chrXInc;
@@ -341,9 +341,9 @@ static int swscale(SwsContext *c, const 

Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-18 Thread Michael Niedermayer
On Tue, Aug 18, 2015 at 04:27:42PM -0300, Pedro Arthur wrote:
> Attached patch with new vertical scaler code, added license and fixed
> compiler warnings.


split and applied first patch, had to change 2 asserts to make it work
without the vscale code

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Republics decline into democracies and democracies degenerate into
despotisms. -- Aristotle




Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-18 Thread Michael Niedermayer
On Tue, Aug 18, 2015 at 04:27:42PM -0300, Pedro Arthur wrote:
> Attached patch with new vertical scaler code, added license and fixed
> compiler warnings.

the vscaler works much better now than previously, there's one bug left
in it though

./ffplay ./laraShadow_dl.flv -vf scale=400x400:flags=16,format=yuva420p,colorchannelmixer=ga=1

shows some horizontal line artifacts in the green channel (alpha from sws)

wget https://samples.mplayerhq.hu/FLV/flash_with_alpha/laraShadow_dl.flv

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

What does censorship reveal? It reveals fear. -- Julian Assange




Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-18 Thread Pedro Arthur
Attached patch with new vertical scaler code, added license and fixed
compiler warnings.


2015-08-17 20:35 GMT-03:00 Michael Niedermayer :

> On Mon, Aug 17, 2015 at 05:35:32PM -0300, Pedro Arthur wrote:
> > oops, added missing file.
> >
> >
> > 2015-08-17 17:31 GMT-03:00 Pedro Arthur :
> >
> > >
> > >
> > > 2015-08-17 0:19 GMT-03:00 Michael Niedermayer :
> > >
> > >> also feel free to split the batch addition into a separate commit
> > >> (should be easy as you already have a version with and without)
> > >>
> > > Attached proper patches.
> > >
> > > also, please send me your public ssh key, i think you should have
> > >> direct write access to ffmpeg git
> > >>
> > > I've sent it privately.
> > >
>
> applied patches
>
> please add license headers to the new files
> also please put variables which are just used by the old code
> under #ifndef NEW_FILTER  or something so they do not produce warnings
>
> the comments also could benefit from spellchecking but that's not
> important ATM
>
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Freedom in capitalist society always remains about the same as it was in
> ancient Greek republics: Freedom for slave owners. -- Vladimir Lenin
>
From 4bb7303d1849bbccd98c5c74c3c90d228b489bef Mon Sep 17 00:00:00 2001
From: Pedro Arthur 
Date: Tue, 18 Aug 2015 15:06:49 -0300
Subject: [PATCH 1/2] swscale: added copyrights & fixed compiler warnings

---
 libswscale/hscale.c   | 20 
 libswscale/slice.c| 20 
 libswscale/swscale.c  | 24 +++-
 libswscale/swscale_internal.h |  6 --
 libswscale/x86/swscale.c  | 15 +++
 5 files changed, 66 insertions(+), 19 deletions(-)

diff --git a/libswscale/hscale.c b/libswscale/hscale.c
index c8543a2..ca09576 100644
--- a/libswscale/hscale.c
+++ b/libswscale/hscale.c
@@ -1,3 +1,23 @@
+/*
+ * Copyright (C) 2015 Pedro Arthur 
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
 #include "swscale_internal.h"
 
 static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH)
diff --git a/libswscale/slice.c b/libswscale/slice.c
index 242367d..611e4e6 100644
--- a/libswscale/slice.c
+++ b/libswscale/slice.c
@@ -1,3 +1,23 @@
+/*
+ * Copyright (C) 2015 Pedro Arthur 
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
 #include "swscale_internal.h"
 
 static void free_lines(SwsSlice *s)
diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index e5bab9c..5faf1e6 100644
--- a/libswscale/swscale.c
+++ b/libswscale/swscale.c
@@ -321,35 +321,45 @@ static int swscale(SwsContext *c, const uint8_t *src[],
 {
 /* load a few things into local vars to make the code more readable?
  * and faster */
+#ifndef NEW_FILTER
 const int srcW   = c->srcW;
+#endif
 const int dstW   = c->dstW;
 const int dstH   = c->dstH;
 const int chrDstW= c->chrDstW;
+#ifndef NEW_FILTER
 const int chrSrcW= c->chrSrcW;
 const int lumXInc= c->lumXInc;
 const int chrXInc= c->chrXInc;
+#endif
 const enum AVPixelFormat dstFormat = c->dstFormat;
 const int flags  = c->flags;
 int32_t *vLumFilterPos   = c->vLumFilterPos;
 int32_t *vChrFilterPos 

Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-17 Thread Michael Niedermayer
On Mon, Aug 17, 2015 at 05:35:32PM -0300, Pedro Arthur wrote:
> oops, added missing file.
> 
> 
> 2015-08-17 17:31 GMT-03:00 Pedro Arthur :
> 
> >
> >
> > 2015-08-17 0:19 GMT-03:00 Michael Niedermayer :
> >
> >> also feel free to split the batch addition into a separate commit
> >> (should be easy as you already have a version with and without)
> >>
> > Attached proper patches.
> >
> > also, please send me your public ssh key, i think you should have
> >> direct write access to ffmpeg git
> >>
> > I've sent it privately.
> >

applied patches

please add license headers to the new files
also please put variables which are just used by the old code
under #ifndef NEW_FILTER  or something so they do not produce warnings
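
As a rough illustration of the request above (not the actual swscale code), a local that only the old code reads can sit inside the same guard as the code that reads it, so a build with NEW_FILTER defined never sees an unused variable. The helper name pick_width and its body are hypothetical; only the NEW_FILTER macro comes from this thread.

```c
#define NEW_FILTER 1  /* assumption: the new filter path is enabled */

/* Hypothetical helper: returns the width the active code path works with. */
static int pick_width(int srcW, int dstW)
{
#ifndef NEW_FILTER
    /* Only the old code reads this local, so it is declared only there;
     * with NEW_FILTER defined the compiler never sees it. */
    const int legacy_w = srcW;
    return legacy_w > dstW ? legacy_w : dstW;
#else
    (void)srcW;  /* the parameter is unused on the new path */
    return dstW;
#endif
}
```

With NEW_FILTER defined as above, pick_width(1920, 720) takes the new path and returns 720, and the compiler has no legacy_w to warn about.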

the comments also could benefit from spellchecking but that's not
important ATM

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Freedom in capitalist society always remains about the same as it was in
ancient Greek republics: Freedom for slave owners. -- Vladimir Lenin




Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-17 Thread Pedro Arthur
oops, added missing file.


2015-08-17 17:31 GMT-03:00 Pedro Arthur :

>
>
> 2015-08-17 0:19 GMT-03:00 Michael Niedermayer :
>
>> also feel free to split the batch addition into a separate commit
>> (should be easy as you already have a version with and without)
>>
> Attached proper patches.
>
> also, please send me your public ssh key, i think you should have
>> direct write access to ffmpeg git
>>
> I've sent it privately.
>
From 90c8dc145b64ffc775aadee8c4f851057d45bb69 Mon Sep 17 00:00:00 2001
From: Pedro Arthur 
Date: Mon, 17 Aug 2015 17:03:20 -0300
Subject: [PATCH] swscale: refactor horizontal scaling

+ split color conversion from scaling
- disabled gamma correction, until it's refactored too
---
 libswscale/Makefile   |   2 +
 libswscale/hscale.c   | 255 +++
 libswscale/slice.c| 301 ++
 libswscale/swscale.c  |  77 ++-
 libswscale/swscale_internal.h | 100 ++
 libswscale/utils.c|   3 +-
 libswscale/x86/swscale.c  |  31 -
 7 files changed, 758 insertions(+), 11 deletions(-)
 create mode 100644 libswscale/hscale.c
 create mode 100644 libswscale/slice.c

diff --git a/libswscale/Makefile b/libswscale/Makefile
index b11e789..b2b6381 100644
--- a/libswscale/Makefile
+++ b/libswscale/Makefile
@@ -15,6 +15,8 @@ OBJS = alphablend.o \
swscale_unscaled.o   \
utils.o  \
yuv2rgb.o\
+   slice.o  \
+   hscale.o \
 
 OBJS-$(CONFIG_SHARED)+= log2_tab.o
 
diff --git a/libswscale/hscale.c b/libswscale/hscale.c
new file mode 100644
index 000..bcbc87f
--- /dev/null
+++ b/libswscale/hscale.c
@@ -0,0 +1,255 @@
+#include "swscale_internal.h"
+
+static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH)
+{
+FilterContext *instance = desc->instance;
+int srcW = desc->src->width;
+int dstW = desc->dst->width;
+int xInc = instance->xInc;
+
+int i;
+for (i = 0; i < sliceH; ++i) {
+uint8_t ** src = desc->src->plane[0].line;
+uint8_t ** dst = desc->dst->plane[0].line;
+int src_pos = sliceY+i - desc->src->plane[0].sliceY;
+int dst_pos = sliceY+i - desc->dst->plane[0].sliceY;
+
+
+if (c->hyscale_fast) {
+c->hyscale_fast(c, (int16_t*)dst[dst_pos], dstW, src[src_pos], srcW, xInc);
+} else {
+c->hyScale(c, (int16_t*)dst[dst_pos], dstW, (const uint8_t *)src[src_pos], instance->filter,
+   instance->filter_pos, instance->filter_size);
+}
+
+if (c->lumConvertRange)
+c->lumConvertRange((int16_t*)dst[dst_pos], dstW);
+
+desc->dst->plane[0].sliceH += 1;
+
+if (desc->alpha) {
+src = desc->src->plane[3].line;
+dst = desc->dst->plane[3].line;
+
+src_pos = sliceY+i - desc->src->plane[3].sliceY;
+dst_pos = sliceY+i - desc->dst->plane[3].sliceY;
+
+desc->dst->plane[3].sliceH += 1;
+
+if (c->hyscale_fast) {
+c->hyscale_fast(c, (int16_t*)dst[dst_pos], dstW, src[src_pos], srcW, xInc);
+} else {
+c->hyScale(c, (int16_t*)dst[dst_pos], dstW, (const uint8_t *)src[src_pos], instance->filter,
+instance->filter_pos, instance->filter_size);
+}
+}
+}
+
+return sliceH;
+}
+
+static int lum_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH)
+{
+int srcW = desc->src->width;
+ColorContext * instance = desc->instance;
+uint32_t * pal = instance->pal;
+int i;
+
+desc->dst->plane[0].sliceY = sliceY;
+desc->dst->plane[0].sliceH = sliceH;
+desc->dst->plane[3].sliceY = sliceY;
+desc->dst->plane[3].sliceH = sliceH;
+
+for (i = 0; i < sliceH; ++i) {
+int sp0 = sliceY+i - desc->src->plane[0].sliceY;
+int sp1 = ((sliceY+i) >> desc->src->v_chr_sub_sample) - desc->src->plane[1].sliceY;
+const uint8_t * src[4] = { desc->src->plane[0].line[sp0],
+desc->src->plane[1].line[sp1],
+desc->src->plane[2].line[sp1],
+desc->src->plane[3].line[sp0]};
+uint8_t * dst = desc->dst->plane[0].line[i];
+
+if (c->lumToYV12) {
+c->lumToYV12(dst, src[0], src[1], src[2], srcW, pal);
+} else if (c->readLumPlanar) {
+c->readLumPlanar(dst, src, srcW, c->input_rgb2yuv_table);
+} 
+
+
+if (desc->alpha) {
+dst = desc->dst->plane[3].line[i];
+if (c->alpToYV12) {
+c->alpToYV12(dst, src[3], src[1], src[2], srcW, pal);
+} else if (c->readAlpPlanar) {
+  

Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-17 Thread Pedro Arthur
2015-08-17 0:19 GMT-03:00 Michael Niedermayer :

> also feel free to split the batch addition into a separate commit
> (should be easy as you already have a version with and without)
>
Attached proper patches.

also, please send me your public ssh key, i think you should have
> direct write access to ffmpeg git
>
I've sent it privately.
From 8430533c8a41b4abd4ce3b45e4b68627a0f76d58 Mon Sep 17 00:00:00 2001
From: Pedro Arthur 
Date: Mon, 17 Aug 2015 17:03:20 -0300
Subject: [PATCH 1/2] swscale: refactor horizontal scaling

+ split color conversion from scaling
- disabled gamma correction, until it's refactored too
---
 libswscale/Makefile   |   2 +
 libswscale/hscale.c   | 255 ++
 libswscale/swscale.c  |  77 +++--
 libswscale/swscale_internal.h | 100 +
 libswscale/utils.c|   3 +-
 libswscale/x86/swscale.c  |  31 -
 6 files changed, 457 insertions(+), 11 deletions(-)
 create mode 100644 libswscale/hscale.c

diff --git a/libswscale/Makefile b/libswscale/Makefile
index b11e789..b2b6381 100644
--- a/libswscale/Makefile
+++ b/libswscale/Makefile
@@ -15,6 +15,8 @@ OBJS = alphablend.o \
swscale_unscaled.o   \
utils.o  \
yuv2rgb.o\
+   slice.o  \
+   hscale.o \
 
 OBJS-$(CONFIG_SHARED)+= log2_tab.o
 
diff --git a/libswscale/hscale.c b/libswscale/hscale.c
new file mode 100644
index 000..bcbc87f
--- /dev/null
+++ b/libswscale/hscale.c
@@ -0,0 +1,255 @@
+#include "swscale_internal.h"
+
+static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH)
+{
+FilterContext *instance = desc->instance;
+int srcW = desc->src->width;
+int dstW = desc->dst->width;
+int xInc = instance->xInc;
+
+int i;
+for (i = 0; i < sliceH; ++i) {
+uint8_t ** src = desc->src->plane[0].line;
+uint8_t ** dst = desc->dst->plane[0].line;
+int src_pos = sliceY+i - desc->src->plane[0].sliceY;
+int dst_pos = sliceY+i - desc->dst->plane[0].sliceY;
+
+
+if (c->hyscale_fast) {
+c->hyscale_fast(c, (int16_t*)dst[dst_pos], dstW, src[src_pos], srcW, xInc);
+} else {
+c->hyScale(c, (int16_t*)dst[dst_pos], dstW, (const uint8_t *)src[src_pos], instance->filter,
+   instance->filter_pos, instance->filter_size);
+}
+
+if (c->lumConvertRange)
+c->lumConvertRange((int16_t*)dst[dst_pos], dstW);
+
+desc->dst->plane[0].sliceH += 1;
+
+if (desc->alpha) {
+src = desc->src->plane[3].line;
+dst = desc->dst->plane[3].line;
+
+src_pos = sliceY+i - desc->src->plane[3].sliceY;
+dst_pos = sliceY+i - desc->dst->plane[3].sliceY;
+
+desc->dst->plane[3].sliceH += 1;
+
+if (c->hyscale_fast) {
+c->hyscale_fast(c, (int16_t*)dst[dst_pos], dstW, src[src_pos], srcW, xInc);
+} else {
+c->hyScale(c, (int16_t*)dst[dst_pos], dstW, (const uint8_t *)src[src_pos], instance->filter,
+instance->filter_pos, instance->filter_size);
+}
+}
+}
+
+return sliceH;
+}
+
+static int lum_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH)
+{
+int srcW = desc->src->width;
+ColorContext * instance = desc->instance;
+uint32_t * pal = instance->pal;
+int i;
+
+desc->dst->plane[0].sliceY = sliceY;
+desc->dst->plane[0].sliceH = sliceH;
+desc->dst->plane[3].sliceY = sliceY;
+desc->dst->plane[3].sliceH = sliceH;
+
+for (i = 0; i < sliceH; ++i) {
+int sp0 = sliceY+i - desc->src->plane[0].sliceY;
+int sp1 = ((sliceY+i) >> desc->src->v_chr_sub_sample) - desc->src->plane[1].sliceY;
+const uint8_t * src[4] = { desc->src->plane[0].line[sp0],
+desc->src->plane[1].line[sp1],
+desc->src->plane[2].line[sp1],
+desc->src->plane[3].line[sp0]};
+uint8_t * dst = desc->dst->plane[0].line[i];
+
+if (c->lumToYV12) {
+c->lumToYV12(dst, src[0], src[1], src[2], srcW, pal);
+} else if (c->readLumPlanar) {
+c->readLumPlanar(dst, src, srcW, c->input_rgb2yuv_table);
+} 
+
+
+if (desc->alpha) {
+dst = desc->dst->plane[3].line[i];
+if (c->alpToYV12) {
+c->alpToYV12(dst, src[3], src[1], src[2], srcW, pal);
+} else if (c->readAlpPlanar) {
+c->readAlpPlanar(dst, src, srcW, NULL);
+}
+}
+}
+
+return sliceH;
+}
+
+int ff_init_desc_fmt_convert(SwsFilterDescriptor *desc, SwsSlice * src, SwsSlice *dst, 

Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-16 Thread Michael Niedermayer
On Sun, Aug 16, 2015 at 07:13:07PM -0300, Pedro Arthur wrote:
> 2015-08-15 7:24 GMT-03:00 Michael Niedermayer :
> 
> > these are not git patches
> >
> Yes, they are raw git diffs.
> 
> >
> > > A - New code
> >
> > doesn't compile (but that doesn't matter as you say this is slower anyway)
> > libswscale/swscale.c: In function ‘swscale’:
> > libswscale/swscale.c:529:18: error: ‘i’ undeclared (first use in this
> > function)
> >
> Fixed.
> 
> time ./ffmpeg -i matrixbench_mpeg2.mpg -an -vf
> > scale=1920:1080,scale=720:480 -f null -
> >
>  Performance seems good for C. But this is not a good test for measuring
> the difference
> between the split vs merged color conversion and horizontal scaling as the
> source slice passed
> to be scaled is already in YUV format and thus there is no need for color
> conversion.
> Indeed the split color conversion should perform better as the code path is
> shorter, instead
> of calling the "process" function twice, one for color conversion and one
> for  h scaling, it will
> call only the hscaling function.
> 
> 
> > also this seems well working except
> > make -j4 libswscale/swscale-test
> > gdb --args libswscale/swscale-test
> >
> It seems the api is being used incorrectly in swscale-test.c.
> 
> The following code creates a sws context with srcH = H / 12, srcW = W / 12.
> Next it calls the scaling functions with srcY = 0, srcH = H. Thus it is
> scaling more lines
> than were specified when creating the context.
> Is it intended or is it a bug? If it is a bug I can put a check in the
> sws_scale function, if not

bug
and yes, a check to avoid crashing is a good idea
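
One possible shape for such a check, as a sketch only: reject a slice whose line range does not fit inside the source height the context was created with. The helper slice_in_bounds and its parameter names are hypothetical and not part of libswscale; where the real check would live and what it would return on failure is left open here.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of libswscale: true when the slice
 * [srcSliceY, srcSliceY + srcSliceH) fits inside a source that is
 * ctxSrcH lines tall. */
static bool slice_in_bounds(int ctxSrcH, int srcSliceY, int srcSliceH)
{
    if (ctxSrcH < 0 || srcSliceY < 0 || srcSliceH < 0)
        return false;
    /* Written as a subtraction so that srcSliceY + srcSliceH cannot
     * overflow int. */
    return srcSliceH <= ctxSrcH - srcSliceY;
}
```

For the swscale-test case quoted above, taking H = 480 as an example value, the context has 40 source lines, so slice_in_bounds(40, 0, 480) is false while slice_in_bounds(40, 0, 40) is true.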


> I'll have to think of a solution for this, as the new code expects only H/12
> lines to be scaled.
> 
> sws = sws_getContext(W / 12, H / 12, AV_PIX_FMT_RGB32, W, H,
> AV_PIX_FMT_YUVA420P, SWS_BILINEAR, NULL, NULL, NULL);
> [...]
> sws_scale(sws, rgb_src, rgb_stride, 0, H, src, stride);
> 
> 
> I'm attaching the  diff for the fixed new code with split color
> conversion/hscaling
> (referenced as A previously) and a new one, that I'll call D, which is A
> with line batches.
> Thus you can test both approaches, split/merged color conversion
> with/without line
> batches.
> As soon as we decide which approach is better I can send a definitive patch.

the batches always seem faster (D and swscale_merge_batch.patch)
the difference between split and non-split seems a lot smaller in my tests
but having the steps split seems an advantage to me so possibly
the split variant might be best

also feel free to split the batch addition into a separate commit
(should be easy as you already have a version with and without)

also, please send me your public ssh key, i think you should have
direct write access to ffmpeg git

A2
real    0m20.942s
real    0m20.995s
real    0m20.989s

D
real    0m20.650s
real    0m20.647s
real    0m20.658s

swscale_merge_batch.patch
real    0m20.730s
real    0m20.663s
real    0m20.601s

time ./ffmpeg -i matrixbench_mpeg2.mpg -an -vf format=bgr32,scale=1920:1080,scale=720:480,format=bgr32 -t 30 -f null -

swscale_merge_batch.patch
real    0m34.334s
real    0m34.346s
real    0m34.339s

D
real    0m34.509s
real    0m34.470s
real    0m34.637s

ref:
real    0m34.579s
real    0m34.441s
real    0m34.486s

time ./ffmpeg -i matrixbench_mpeg2.mpg -an -vf format=yuyv422,scale=1920:1080,scale=720:480,format=yuyv422 -t 90 -f null -
real    0m18.415s

D
real    0m18.473s
real    0m18.512s
real    0m18.484s

swscale_merge_batch.patch
real    0m18.468s
real    0m18.500s
real    0m18.516s

A2
real    0m18.653s
real    0m18.582s

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The bravest are surely those who have the clearest vision
of what is before them, glory and danger alike, and yet
notwithstanding go out to meet it. -- Thucydides




Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-16 Thread Pedro Arthur
2015-08-15 7:24 GMT-03:00 Michael Niedermayer :

> these are not git patches
>
Yes, they are raw git diffs.

>
> > A - New code
>
> doesnt compile (but that doesnt matter as you say this is slower anyway)
> libswscale/swscale.c: In function ‘swscale’:
> libswscale/swscale.c:529:18: error: ‘i’ undeclared (first use in this
> function)
>
Fixed.

> time ./ffmpeg -i matrixbench_mpeg2.mpg -an -vf scale=1920:1080,scale=720:480 -f null -
>
Performance seems good for C. But this is not a good test for measuring
the difference between split and merged color conversion and horizontal
scaling, as the source slice passed to the scaler is already in YUV format
and thus needs no color conversion.
In this case the split color conversion should indeed perform better, as
the code path is shorter: instead of calling the "process" function twice,
once for color conversion and once for horizontal scaling, it calls only
the hscaling function.


> also this seems well working except
> make -j4 libswscale/swscale-test
> gdb --args libswscale/swscale-test
>
It seems the API is being used incorrectly in swscale-test.c.

The following code creates a sws context with srcH = H / 12, srcW = W / 12.
Next it calls the scaling function with srcY = 0, srcH = H. Thus it is
scaling more lines than were specified when creating the context.
Is this intended or is it a bug? If it is a bug I can put a check in the
sws_scale function; if not, I'll have to think of a solution for this, as
the new code expects only H/12 lines to be scaled.

sws = sws_getContext(W / 12, H / 12, AV_PIX_FMT_RGB32, W, H,
AV_PIX_FMT_YUVA420P, SWS_BILINEAR, NULL, NULL, NULL);
[...]
sws_scale(sws, rgb_src, rgb_stride, 0, H, src, stride);
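
If the extra lines turn out to be a bug in the test, a guard along the following lines could reject the call before any line is touched. This is only a sketch under assumptions: slice_in_bounds() and its parameter names are hypothetical and not part of libswscale, and a real check would have to live inside sws_scale() and use the fields of the context.

```c
/* Hypothetical helper, not libswscale API: returns 1 when the slice
 * [slice_y, slice_y + slice_h) lies inside a source of src_h lines,
 * 0 otherwise. */
static int slice_in_bounds(int src_h, int slice_y, int slice_h)
{
    if (src_h <= 0 || slice_y < 0 || slice_h <= 0)
        return 0;
    /* written as a subtraction so that slice_y + slice_h cannot overflow */
    return slice_h <= src_h - slice_y;
}
```

With a context created for H / 12 source lines, slice_in_bounds(H / 12, 0, H) returns 0, so a call like the one above would be refused instead of reading past the source lines.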


I'm attaching the diff for the fixed new code with split color
conversion/hscaling (referenced as A previously) and a new one, that
I'll call D, which is A with line batches.
Thus you can test both approaches, split/merged color conversion
with/without line batches.
As soon as we decide which approach is better I can send a definitive patch.
diff --git a/libswscale/Makefile b/libswscale/Makefile
index b11e789..24dae8a 100644
--- a/libswscale/Makefile
+++ b/libswscale/Makefile
@@ -15,6 +15,7 @@ OBJS = alphablend.o \
swscale_unscaled.o   \
utils.o  \
yuv2rgb.o\
+   slice.o  \
 
 OBJS-$(CONFIG_SHARED)+= log2_tab.o
 
diff --git a/libswscale/hscale.c b/libswscale/hscale.c
new file mode 100644
index 000..83f082e
--- /dev/null
+++ b/libswscale/hscale.c
@@ -0,0 +1,274 @@
+static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH)
+{
+    FilterContext *instance = desc->instance;
+    int srcW = desc->src->width;
+    int dstW = desc->dst->width;
+    int xInc = instance->xInc;
+
+    int i;
+    for (i = 0; i < sliceH; ++i) {
+        uint8_t ** src = desc->src->plane[0].line;
+        uint8_t ** dst = desc->dst->plane[0].line;
+        int src_pos = sliceY+i - desc->src->plane[0].sliceY;
+        int dst_pos = sliceY+i - desc->dst->plane[0].sliceY;
+
+
+        if (c->hyscale_fast) {
+            c->hyscale_fast(c, (int16_t*)dst[dst_pos], dstW, src[src_pos], srcW, xInc);
+        } else {
+            c->hyScale(c, (int16_t*)dst[dst_pos], dstW, (const uint8_t *)src[src_pos], instance->filter,
+                       instance->filter_pos, instance->filter_size);
+        }
+
+        if (c->lumConvertRange)
+            c->lumConvertRange((int16_t*)dst[dst_pos], dstW);
+
+        desc->dst->plane[0].sliceH += 1;
+
+        if (desc->alpha) {
+            src = desc->src->plane[3].line;
+            dst = desc->dst->plane[3].line;
+
+            src_pos = sliceY+i - desc->src->plane[3].sliceY;
+            dst_pos = sliceY+i - desc->dst->plane[3].sliceY;
+
+            desc->dst->plane[3].sliceH += 1;
+
+            if (c->hyscale_fast) {
+                c->hyscale_fast(c, (int16_t*)dst[dst_pos], dstW, src[src_pos], srcW, xInc);
+            } else {
+                c->hyScale(c, (int16_t*)dst[dst_pos], dstW, (const uint8_t *)src[src_pos], instance->filter,
+                           instance->filter_pos, instance->filter_size);
+            }
+        }
+    }
+
+    return sliceH;
+}
+
+static int lum_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH)
+{
+    int srcW = desc->src->width;
+    ColorContext * instance = desc->instance;
+    uint32_t * pal = instance->pal;
+    int i;
+
+    desc->dst->plane[0].sliceY = sliceY;
+    desc->dst->plane[0].sliceH = sliceH;
+    desc->dst->plane[3].sliceY = sliceY;
+    desc->dst->plane[3].sliceH = sliceH;
+
+    for (i = 0; i < sliceH; ++i) {
+        int sp0 = sliceY+i - desc->src->plane[0].sliceY;
+        int sp1 = ((sliceY+i) >> desc->src->v_chr_sub_sample) - desc->src->plane[1].sliceY;
+        const uint8_t *

Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-15 Thread Michael Niedermayer
On Sat, Aug 15, 2015 at 12:17:27AM -0300, Pedro Arthur wrote:
> Hi,
> Since the last patch I was trying to improve the performance regression.
> First I tried to process horizontal lines in batches, processing
> (horizontal_filter_size + n)
> lines at a time. I also tried to remove branch code from the processing
> function, for example:
> int process(...) {
>     if (c->hcscale_fast) {
>         do_x()
>     } else {
>         do_y()
>     }
> }
> changed to:
> int process_fast(...) {do_x()}
> int process_(...) {do_y()}
> 
> But these changes more or less didn't improve the performance at all.

yes, a single if() more or less per line is unlikely to make
much of a difference; lines normally have hundreds of pixels, so
per-line overhead has only a comparatively small impact next to
the per-pixel work
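
As an illustration of the change described in the quote above, the branch can be resolved once at init time by storing a function pointer, so the per-line call no longer tests the flag. This is a minimal sketch: process_fast(), process_generic(), select_process() and run_lines() are hypothetical stand-ins, not functions from the patch, and the arithmetic inside them is a placeholder for real scaling work.

```c
/* Hypothetical stand-ins for the fast and generic per-line scalers. */
typedef int (*process_fn)(int line);

static int process_fast(int line)    { return line * 2; }
static int process_generic(int line) { return line * 2 + 1; }

/* Pick the per-line function once, at init time, instead of
 * testing the flag again for every line. */
static process_fn select_process(int have_fast_path)
{
    return have_fast_path ? process_fast : process_generic;
}

/* Run the selected function over n lines and sum the results. */
static long run_lines(process_fn process, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += process(i);
    return sum;
}
```

Either variant costs one test or one indirect call per line, which is small next to the per-pixel work inside the scaler, so it is plausible that neither form shows up in the timings.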


> As the most significant difference between the old and new code is that
> the color conversion is separated from the horizontal scaling I merged
> back the color conversion with the horizontal scaling and the performance
> seemed to be on par with the original code again.
> 
> One point I would like to comment on is the performance measurement method.
> I used 3 methods:
> 1 - Using the scaling code, scale each line n times and measure the total
> scaling time. This method was the most reliable, as the measured time
> deviation between different runs was < 0.1%.
> 2 - Call the scaling function n times. This method was not very reliable,
> with a total time deviation of 0.1% to 20%.
> 3 - Run the program n times. The measured time was not reliable, with a
> deviation of 10%-30%.
> For all 3 methods the time measurement was done for only the horizontal
> scaling code.
> 
> I think methods 2 and 3 would be closer to real world usage, but their
> deviation is too high to draw any conclusion from their results.
> 
> 
> Using method 1 with merged color conversion + horizontal scaling,
> performance seems to be on par with the original code.
> 
> Some numbers. Performance penalty %. (< 0 means gain)
> 

these are not git patches

> A - New code

doesn't compile (but that doesn't matter as you say this is slower anyway)
libswscale/swscale.c: In function ‘swscale’:
libswscale/swscale.c:529:18: error: ‘i’ undeclared (first use in this function)


> B - New code with merged color conversion and horizontal scaling

time ./ffmpeg -i matrixbench_mpeg2.mpg -an -vf scale=1920:1080,scale=720:480 -f null -
old code:
real    0m20.730s
real    0m20.763s
real    0m20.765s

new code:
real    0m20.929s
real    0m20.892s
real    0m20.893s


> C - B + line batches

new code:
real    0m20.730s
real    0m20.690s
real    0m20.683s
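
To compare runs like the ones above, one could average the three "real" times per variant and express the difference as a percentage of the old code. The sketch below does only that arithmetic; mean_seconds() and penalty_percent() are hypothetical helper names, and three samples per variant are too few to say anything about statistical significance.

```c
#include <stddef.h>

/* Arithmetic mean of n wall-clock samples, in seconds. */
static double mean_seconds(const double *t, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += t[i];
    return n ? sum / (double)n : 0.0;
}

/* Penalty of "candidate" relative to "reference", in percent;
 * a negative value means the candidate is faster. */
static double penalty_percent(double reference, double candidate)
{
    return (candidate - reference) / reference * 100.0;
}
```

Fed with the numbers above, the means are about 20.753 s (old code), 20.905 s (B) and 20.701 s (C), which gives a penalty of roughly 0.7% for B and roughly -0.25% (a small gain) for C.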

also this seems to work well, except
make -j4 libswscale/swscale-test
gdb --args libswscale/swscale-test
r
bt
#0  ff_rgbaToY_avx.loop () at libswscale/x86/input.asm:524
#1  0x0044cc17 in lum_h_scale1 (c=0x6d7100, desc=0x6e29a0, sliceY=6, sliceH=5) at libswscale/hscale.c:115
#2  0x004059e9 in swscale (c=0x6d7100, src=0x7fffe120, srcStride=0x7fffe160, srcSliceY=0, srcSliceH=96, dst=0x7fffe140, dstStride=0x7fffe170) at libswscale/swscale.c:558
#3  0x004082d0 in sws_scale (c=0x6d7100, srcSlice=0x7fffe330, srcStride=0x7fffe370, srcSliceY=0, srcSliceH=96, dst=0x7fffe350, dstStride=0x7fffe380) at libswscale/swscale.c:1205
#4  0x004032c6 in main (argc=1, argv=0x7fffe4c8) at libswscale/swscale-test.c:402
(gdb) up
#1  0x0044cc17 in lum_h_scale1 (c=0x6d7100, desc=0x6e29a0, sliceY=6, sliceH=5) at libswscale/hscale.c:115
115         c->lumToYV12(lBuf, src[0], src[1], src[2], srcW, pal);
(gdb) print lBuf
$1 = (uint8_t *) 0x6e0460 ""
(gdb) print src[0]
$2 = (const uint8_t *) 0x0

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Asymptotically faster algorithms should always be preferred if you have
asymptotical amounts of data




Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-11 Thread Michael Niedermayer
On Tue, Aug 11, 2015 at 03:17:55PM -0300, Pedro Arthur wrote:
> Hi,
> 
> The past week I worked on refactoring the vertical scaler code.
> The vertical scaler was not split into a scaling pass and a color conversion
> pass (as the horizontal scaler was) because the output functions currently
> merge these passes and it would require rewriting all these functions.
> This week I should clean up the code, check the new code's performance and
> get the patch ready to push.

./ffmpeg -i lena.pnm -s 4096x2048 test.mp4
segfaults

./ffmpeg -i fate-suite/cine/bayer_gbrg8.cine -vf scale=300:300 -vframes 1 img.jpg
all green image

./ffplay -i ~/videos/lena.pnm -vf format=yuv410p
mostly green

also please submit patches more often than once a week (as there's
little time left) and if you easily can then please split the patch
into one for horizontal and one for vertical. But if this is a lot of
work then rather leave it in one patch.
For example if you could get the horizontal code bug free without
speed regression then that could be pushed and the amount of
code to work on would be half ...


[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

It is what and why we do it that matters, not just one of them.




Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-08-05 Thread Pedro Arthur
New patch with the green line bug fixed.
Now I intend to speed up the work on the vertical scaling and also try to
reduce the performance gap.

2015-07-29 10:05 GMT-03:00 Michael Niedermayer :

> On Tue, Jul 28, 2015 at 11:39:59PM -0300, Pedro Arthur wrote:
> > > do you think this patch would be ready to push to main ffmpeg once
> > > this (and any other remaining) issues are fixed
> > > or is there still some speed loss ?
> > >
> > I think I should work a bit more on it, in my tests some cases there is
>
> ok
>
>
> > still 3% speed loss.
> > Usually it occurs when the vertical filter size is ~1, so it processes one
> > line at a time. I think it can be improved if we process lines in
> > "batches". But that is more complex to implement because, for example,
> > there are filters which skip some lines, and the memory required will
> > increase.
diff --git a/libswscale/Makefile b/libswscale/Makefile
index a60b057..d876e75 100644
--- a/libswscale/Makefile
+++ b/libswscale/Makefile
@@ -14,6 +14,7 @@ OBJS = hscale_fast_bilinear.o   \
swscale_unscaled.o   \
utils.o  \
yuv2rgb.o\
+   slice.o  \
 
 OBJS-$(CONFIG_SHARED)+= log2_tab.o
 
diff --git a/libswscale/slice.c b/libswscale/slice.c
new file mode 100644
index 000..f882557
--- /dev/null
+++ b/libswscale/slice.c
@@ -0,0 +1,554 @@
+#include "swscale_internal.h"
+
+static void free_lines(SwsSlice *s)
+{
+    int i;
+    for (i = 0; i < 2; ++i) {
+        int n = s->plane[i].available_lines;
+        int j;
+        for (j = 0; j < n; ++j) {
+            av_freep(&s->plane[i].line[j]);
+            if (s->is_ring)
+                s->plane[i].line[j+n] = NULL;
+        }
+    }
+
+    for (i = 0; i < 4; ++i)
+        memset(s->plane[i].line, 0, sizeof(uint8_t*) * s->plane[i].available_lines * (s->is_ring ? 3 : 1));
+    s->should_free_lines = 0;
+}
+
+/*
+ slice lines contain extra bytes for vectorial code, thus @size
+ is the allocated memory size and @width is the number of pixels
+*/
+static int alloc_lines(SwsSlice *s, int size, int width)
+{
+    int i;
+    int idx[2] = {3, 2};
+
+    s->should_free_lines = 1;
+    s->width = width;
+
+    for (i = 0; i < 2; ++i) {
+        int n = s->plane[i].available_lines;
+        int j;
+        int ii = idx[i];
+
+        av_assert0(n == s->plane[ii].available_lines);
+        for (j = 0; j < n; ++j) {
+            // chroma plane line U and V are expected to be contiguous in memory
+            // by mmx vertical scaler code
+            s->plane[i].line[j] = av_malloc(size * 2 + 32);
+            if (!s->plane[i].line[j]) {
+                free_lines(s);
+                return AVERROR(ENOMEM);
+            }
+            s->plane[ii].line[j] = s->plane[i].line[j] + size + 16;
+            if (s->is_ring) {
+                s->plane[i].line[j+n] = s->plane[i].line[j];
+                s->plane[ii].line[j+n] = s->plane[ii].line[j];
+            }
+        }
+    }
+
+    return 0;
+}
+
+static int alloc_slice(SwsSlice *s, enum AVPixelFormat fmt, int lumLines, int chrLines, int h_sub_sample, int v_sub_sample, int ring)
+{
+    int i;
+    int size[4] = { lumLines,
+                    chrLines,
+                    chrLines,
+                    lumLines };
+
+    s->h_chr_sub_sample = h_sub_sample;
+    s->v_chr_sub_sample = v_sub_sample;
+    s->fmt = fmt;
+    s->is_ring = ring;
+    s->should_free_lines = 0;
+
+    for (i = 0; i < 4; ++i) {
+        int n = size[i] * ( ring == 0 ? 1 : 3);
+        s->plane[i].line = av_mallocz_array(sizeof(uint8_t*), n);
+        if (!s->plane[i].line)
+            return AVERROR(ENOMEM);
+
+        s->plane[i].tmp = ring ? s->plane[i].line + size[i] * 2 : NULL;
+        s->plane[i].available_lines = size[i];
+        s->plane[i].sliceY = 0;
+        s->plane[i].sliceH = 0;
+    }
+    return 0;
+}
+
+static void free_slice(SwsSlice *s)
+{
+    int i;
+    if (s) {
+        if (s->should_free_lines)
+            free_lines(s);
+        for (i = 0; i < 4; ++i) {
+            av_freep(&s->plane[i].line);
+            s->plane[i].tmp = NULL;
+        }
+    }
+}
+
+int ff_rotate_slice(SwsSlice *s, int lum, int chr)
+{
+    int i;
+    if (lum) {
+        for (i = 0; i <

Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-07-29 Thread Michael Niedermayer
On Tue, Jul 28, 2015 at 11:39:59PM -0300, Pedro Arthur wrote:
> > do you think this patch would be ready to push to main ffmpeg once
> > this (and any other remaining) issues are fixed
> > or is there still some speed loss ?
> >
> I think I should work a bit more on it, in my tests some cases there is

ok


> still 3% speed loss.
> Usually it occurs when the vertical filter size is ~1, so it processes one
> line at a time. I think it can be improved if we process lines in
> "batches". But that is more complex to implement because, for example,
> there are filters which skip some lines, and the memory required will
> increase.

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Many that live deserve death. And some that die deserve life. Can you give
it to them? Then do not be too eager to deal out death in judgement. For
even the very wise cannot see all ends. -- Gandalf




Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-07-28 Thread Pedro Arthur
> do you think this patch would be ready to push to main ffmpeg once
> this (and any other remaining) issues are fixed
> or is there still some speed loss ?
>
I think I should work a bit more on it; in my tests there is still a 3%
speed loss in some cases.
Usually it occurs when the vertical filter size is ~1, so it processes one
line at a time. I think it can be improved if we process lines in
"batches". But that is more complex to implement because, for example,
there are filters which skip some lines, and the memory required will
increase.


Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-07-28 Thread Michael Niedermayer
On Tue, Jul 28, 2015 at 06:06:04PM -0300, Pedro Arthur wrote:
> All these weird bugs were caused by the mmx code expecting the chroma U
> and V buffer to be contiguous in memory. As I was allocating the slice
> lines U
> and V separately the mmx code was using some random memory.
> In the attached patch there is a fix for it.
> I also added some documentation to the new structs.

very good, this works much better
one remaining issue i could find is
./ffplay -i matrixbench_mpeg2.mpg -vf scale=96:96,format=nv12,scale=128:128:flags=1
which shows a green line at the right border
it also happens with some other formats but seems specific to flags=1
-cpuflags 0 doesn't help

do you think this patch would be ready to push to main ffmpeg once
this (and any other remaining) issues are fixed
or is there still some speed loss ?
[...]

> +#define FREE_FILTERS_ON_ERROR(err, ctx) if ((err) < 0) {\
> +ff_free_filters((ctx)); \
> +return (err);   \
> +}

you can avoid the macro by using a goto
yes i know goto sucks but such error cleanup is one of the very few
cases where using a goto is IMO a good idea
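
A minimal sketch of the goto-based cleanup suggested above, written against plain malloc()/free() rather than the FFmpeg allocators. Ctx, init_ctx(), free_ctx() and the fail_at parameter are hypothetical and exist only to make the error path visible; the real function would call ff_free_filters() at the label.

```c
#include <stdlib.h>

/* Hypothetical two-buffer context, for illustration only. */
typedef struct Ctx {
    int *a;
    int *b;
} Ctx;

/* Returns 0 on success, -1 on failure. fail_at forces a failure at the
 * first (1) or second (2) allocation so the cleanup path can be exercised. */
static int init_ctx(Ctx *c, int fail_at)
{
    c->a = NULL;
    c->b = NULL;

    c->a = fail_at == 1 ? NULL : malloc(16 * sizeof(*c->a));
    if (!c->a)
        goto fail;

    c->b = fail_at == 2 ? NULL : malloc(16 * sizeof(*c->b));
    if (!c->b)
        goto fail;

    return 0;

fail:
    /* single exit point: free(NULL) is a no-op, so this is safe
     * no matter which step failed */
    free(c->b);
    free(c->a);
    c->a = c->b = NULL;
    return -1;
}

/* Releases both buffers of a successfully initialised context. */
static void free_ctx(Ctx *c)
{
    free(c->b);
    free(c->a);
    c->a = c->b = NULL;
}
```

Compared with a macro that hides a return, the label keeps all cleanup in one visible place and every error check stays an ordinary if statement.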

[...]

> diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h
> index 2299aa5..af4b635 100644
> --- a/libswscale/swscale_internal.h
> +++ b/libswscale/swscale_internal.h
> @@ -269,6 +269,9 @@ typedef void (*yuv2anyX_fn)(struct SwsContext *c, const int16_t *lumFilter,
>                              const int16_t **alpSrc, uint8_t **dest,
>                              int dstW, int y);
>  
> +struct SwsSlice;
> +struct SwsFilterDescriptor;
> +
>  /* This struct should be aligned on at least a 32-byte boundary. */
>  typedef struct SwsContext {
>      /**
> @@ -319,6 +322,12 @@ typedef struct SwsContext {
>      uint16_t *gamma;
>      uint16_t *inv_gamma;
>  
> +    int numDesc;
> +    int descIndex[2];
> +    int numSlice;
> +    struct SwsSlice *slice;
> +    struct SwsFilterDescriptor *desc;
> +
>      uint32_t pal_yuv[256];
>      uint32_t pal_rgb[256];
>  
> @@ -908,4 +917,75 @@ static inline void fillPlane16(uint8_t *plane, int stride, int width, int height
>      }
>  }
>  
> +#define MAX_SLICE_PLANES 4
> +
> +/// Slice plane
> +typedef struct SwsPlane
> +{
> +    int available_lines;    ///< max number of lines that can be held by this plane
> +    int sliceY;             ///< index of first line
> +    int sliceH;             ///< number of lines
> +    uint8_t **line;         ///< line buffer
> +    uint8_t **tmp;          ///< Tmp line buffer used by mmx code
> +} SwsPlane;
> +
> +/**
> + * Struct which defines a slice of an image to be scaled or an output for
> + * a scaled slice.
> + * A slice can also be used as intermediate ring buffer for scaling steps.
> + */
> +typedef struct SwsSlice
> +{
> +    int width;              ///< Slice line width
> +    int h_chr_sub_sample;   ///< horizontal chroma subsampling factor
> +    int v_chr_sub_sample;   ///< vertical chroma subsampling factor
> +    int is_ring;            ///< flag to identify if this slice is a ring buffer
> +    int should_free_lines;  ///< flag to identify if there are dynamically allocated lines
> +    enum AVPixelFormat fmt; ///< planes pixel format
> +    SwsPlane plane[MAX_SLICE_PLANES];   ///< color planes
> +} SwsSlice;
> +

> +/**
> + * Struct which holds all necessary data for processing a slice.
> + * A processing step can be a color conversion or horizontal/vertical scaling.
> + */
> +typedef struct SwsFilterDescriptor
> +{
> +    SwsSlice *src;  ///< Source slice
> +    SwsSlice *dst;  ///< Output slice
> +
> +    int alpha;      ///< Flag for processing alpha channel
> +    void *instance; ///< Filter instance data
> +
> +    /// Function for processing input slice sliceH lines starting from line sliceY
> +    int (*process)(SwsContext *c, struct SwsFilterDescriptor *desc, int sliceY, int sliceH);
> +} SwsFilterDescriptor;
> +
> +/// Color conversion instance data
> +typedef struct ColorContext
> +{
> +    uint32_t *pal;
> +} ColorContext;
> +
> +/// Scaler instance data
> +typedef struct FilterContext
> +{
> +    uint16_t *filter;
> +    int *filter_pos;
> +    int filter_size;
> +    int xInc;
> +} FilterContext;

This looks fine in what is in there but i think it should be
restructured a bit.

There should be a struct that contains only constant fields which
describes a type/class of filter, like a palette->yuv filter or whatever
(constant not in the sense of const but that they are the same for
 all instances)

then a generic context for filter instances is needed which would
contain all generic fields an instance needs

and then a "private" context for a filter instance which would contain
type/class specific fields
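
One possible reading of this three-level layout, as a sketch only. FilterClass, FilterInstance, PalToYuvPriv and pal_to_yuv_process() are hypothetical names chosen for illustration; none of them come from the patch or from libswscale, and the process callback does no real conversion.

```c
#include <stdint.h>

struct FilterInstance;

/* Constant per filter type: identical for every instance of that type. */
typedef struct FilterClass {
    const char *name;
    int (*process)(struct FilterInstance *f, int sliceY, int sliceH);
} FilterClass;

/* Generic per-instance context: fields every instance needs. */
typedef struct FilterInstance {
    const FilterClass *cls;
    int alpha;   /* process the alpha plane too */
    void *priv;  /* type-specific data, interpreted by the filter type */
} FilterInstance;

/* Private context of a hypothetical palette -> YUV filter. */
typedef struct PalToYuvPriv {
    const uint32_t *pal;
} PalToYuvPriv;

static int pal_to_yuv_process(FilterInstance *f, int sliceY, int sliceH)
{
    PalToYuvPriv *p = f->priv;
    (void)sliceY;
    /* a real filter would convert lines here; the sketch only
     * reports how many lines it was asked to handle */
    return p->pal ? sliceH : -1;
}

static const FilterClass pal_to_yuv_class = {
    .name    = "pal_to_yuv",
    .process = pal_to_yuv_process,
};
```

The class struct can be shared and read-only, while each instance carries only its own generic state plus a pointer to whatever its type needs.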

ColorContext / FilterContext would already match those private
contexts, i think
though FilterContext sounds a bit too generic for a pr

Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-07-28 Thread Pedro Arthur
All these weird bugs were caused by the mmx code expecting the chroma U
and V buffers to be contiguous in memory. As I was allocating the U and V
slice lines separately, the mmx code was reading random memory.
The attached patch contains a fix for it.
I also added some documentation to the new structs.
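
The idea behind the fix can be sketched with plain malloc(): one allocation holds the U line followed by the V line, so code that walks from U into V stays inside memory owned by the slice. alloc_chroma_pair() and free_chroma_pair() are hypothetical helpers; the "width * 2 + 32" size and the "+ width + 16" offset mirror alloc_lines() in the attached patch, which uses av_malloc() instead.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate one block that holds a U line followed
 * by a V line. Returns 0 on success, -1 on failure. */
static int alloc_chroma_pair(uint8_t **u, uint8_t **v, int width)
{
    uint8_t *block = malloc((size_t)width * 2 + 32);
    if (!block)
        return -1;
    *u = block;
    *v = block + width + 16; /* V lives inside the same allocation */
    return 0;
}

/* Only the U pointer owns memory; V must never be freed on its own. */
static void free_chroma_pair(uint8_t **u, uint8_t **v)
{
    free(*u);
    *u = NULL;
    *v = NULL;
}
```

A consequence of this layout is that freeing has to go through the first pointer of each pair only, which is why free_lines() in the patch loops over planes 0 and 1 and merely clears the others.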

2015-07-23 14:17 GMT-03:00 Pedro Arthur :

>
> ./ffplay -f lavfi testsrc  -vf format=yvyu422,scale=flags=1 (black
>> screen)
>>
> I'm working on a proper fix for it (I did a  workaround for the first bug
> that also solves it)
>
>> ./ffplay -f lavfi testsrc  -vf format=rgb555 (gray and some odd
>> distortion)
>>
> This I'll have to check.
>
>
>
diff --git a/libswscale/Makefile b/libswscale/Makefile
index a60b057..d876e75 100644
--- a/libswscale/Makefile
+++ b/libswscale/Makefile
@@ -14,6 +14,7 @@ OBJS = hscale_fast_bilinear.o   \
swscale_unscaled.o   \
utils.o  \
yuv2rgb.o\
+   slice.o  \
 
 OBJS-$(CONFIG_SHARED)+= log2_tab.o
 
diff --git a/libswscale/slice.c b/libswscale/slice.c
new file mode 100644
index 000..7541c9c
--- /dev/null
+++ b/libswscale/slice.c
@@ -0,0 +1,551 @@
+#include "swscale_internal.h"
+
+static void free_lines(SwsSlice *s)
+{
+    int i;
+    for (i = 0; i < 2; ++i) {
+        int n = s->plane[i].available_lines;
+        int j;
+        for (j = 0; j < n; ++j) {
+            av_freep(&s->plane[i].line[j]);
+            if (s->is_ring)
+                s->plane[i].line[j+n] = NULL;
+        }
+    }
+
+    for (i = 0; i < 4; ++i)
+        memset(s->plane[i].line, 0, sizeof(uint8_t*) * s->plane[i].available_lines * (s->is_ring ? 3 : 1));
+    s->should_free_lines = 0;
+}
+
+static int alloc_lines(SwsSlice *s, int width)
+{
+    int i;
+    int idx[2] = {3, 2};
+
+    s->should_free_lines = 1;
+    s->width = width;
+
+    for (i = 0; i < 2; ++i) {
+        int n = s->plane[i].available_lines;
+        int j;
+        int ii = idx[i];
+
+        av_assert0(n == s->plane[ii].available_lines);
+        for (j = 0; j < n; ++j) {
+            // chroma plane line U and V are expected to be contiguous in memory
+            // by mmx vertical scaler code
+            s->plane[i].line[j] = av_malloc(width * 2 + 32);
+            if (!s->plane[i].line[j]) {
+                free_lines(s);
+                return AVERROR(ENOMEM);
+            }
+            s->plane[ii].line[j] = s->plane[i].line[j] + width + 16;
+            if (s->is_ring) {
+                s->plane[i].line[j+n] = s->plane[i].line[j];
+                s->plane[ii].line[j+n] = s->plane[ii].line[j];
+            }
+        }
+    }
+
+    return 0;
+}
+
+static int alloc_slice(SwsSlice *s, enum AVPixelFormat fmt, int lumLines, int chrLines, int h_sub_sample, int v_sub_sample, int ring)
+{
+    int i;
+    int size[4] = { lumLines,
+                    chrLines,
+                    chrLines,
+                    lumLines };
+
+    s->h_chr_sub_sample = h_sub_sample;
+    s->v_chr_sub_sample = v_sub_sample;
+    s->fmt = fmt;
+    s->is_ring = ring;
+    s->should_free_lines = 0;
+
+    for (i = 0; i < 4; ++i) {
+        int n = size[i] * ( ring == 0 ? 1 : 3);
+        s->plane[i].line = av_mallocz_array(sizeof(uint8_t*), n);
+        if (!s->plane[i].line)
+            return AVERROR(ENOMEM);
+
+        s->plane[i].tmp = ring ? s->plane[i].line + size[i] * 2 : NULL;
+        s->plane[i].available_lines = size[i];
+        s->plane[i].sliceY = 0;
+        s->plane[i].sliceH = 0;
+    }
+    return 0;
+}
+
+static void free_slice(SwsSlice *s)
+{
+    int i;
+    if (s) {
+        if (s->should_free_lines)
+            free_lines(s);
+        for (i = 0; i < 4; ++i) {
+            av_freep(&s->plane[i].line);
+            s->plane[i].tmp = NULL;
+        }
+    }
+}
+
+int ff_rotate_slice(SwsSlice *s, int lum, int chr)
+{
+    int i;
+    if (lum) {
+        for (i = 0; i < 4; i+=3) {
+            int n = s->plane[i].available_lines;
+            int l = s->plane[i].sliceH;
+
+            if (l+lum >= n * 2) {
+                s->plane[i].sliceY += n;
+                s->plane[i].sliceH -= n;
+            }
+        }
+    }
+    if (chr) {
+        for (i = 1; i < 3; ++i) {
+            int n = s->plane[i].available_lines;
+            int l = s->plane[i].sliceH;
+
+            if (l+chr >= n * 2) {
+                s->plane[i].sliceY += n;
+                s->plane[i].sliceH -= n;
+            }
+        }
+    }
+    return 0;
+}
+
+int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int lumY, int lumH, int chrY, int chrH)
+{
+    int i = 0;
+
+    const int start[4] = {lumY,
+                          chrY,
+                          chrY,
+                          lumY};
+
+    const int end[4] = {lumY +lumH,
+                        chrY + chrH,
+                        chrY + chrH,
+ 

Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-07-23 Thread Pedro Arthur
> ./ffplay -f lavfi testsrc  -vf format=yvyu422,scale=flags=1 (black
> screen)
>
I'm working on a proper fix for it (I did a  workaround for the first bug
that also solves it)

> ./ffplay -f lavfi testsrc  -vf format=rgb555 (gray and some odd distortion)
>
This I'll have to check.


Re: [FFmpeg-devel] GSoC Weekly report (libswscale)

2015-07-22 Thread Michael Niedermayer
On Wed, Jul 22, 2015 at 10:49:38AM -0300, Pedro Arthur wrote:
> Last week I worked on fixing the following
> * Memory allocation checks
> * Code style
> I also investigated the ffplay bug (./ffplay -f lavfi testsrc -vf
> format=yuvj420p) which presents a grey screen. It is still not fixed in this
> patch but I already found the cause and I intend to fix it in the next patch.
> 
> For the next week I intend to write some documentation for the code added
> and start to work on the vertical scaler.

i suggest trying to fix all bugs before adding more features. Adding
docs and any other work should be no problem but it's better if the
code is in fully testable condition early so we could for example
use git bisect and not be stuck with a range of untestable revisions

Here are 2 more cases that don't work:
./ffplay -f lavfi testsrc  -vf format=yvyu422,scale=flags=1 (black screen)
valgrind shows this:

==4194== Invalid read of size 1
==4194==at 0xE4499B: ff_hyscale_fast_mmxext (hscale_fast_bilinear_simd.c:278)
==4194==by 0xE7D82C: lum_h_scale (slice.c:164)
==4194==by 0xE38C3A: swscale (swscale.c:530)
==4194==by 0xE3B54A: sws_scale (swscale.c:1177)
==4194==by 0x4BF5FE: scale_slice (vf_scale.c:440)
==4194==by 0x4BFC69: filter_frame (vf_scale.c:540)
==4194==by 0x446E6E: ff_filter_frame_framed (avfilter.c:1091)
==4194==by 0x4473E6: ff_filter_frame (avfilter.c:1172)
==4194==by 0x4469D1: default_filter_frame (avfilter.c:1002)
==4194==by 0x446E6E: ff_filter_frame_framed (avfilter.c:1091)
==4194==by 0x4473E6: ff_filter_frame (avfilter.c:1172)
==4194==by 0x4BFC8B: filter_frame (vf_scale.c:544)
==4194==  Address 0x20c643ff is 1 bytes before a block of size 720 alloc'd
==4194==at 0x4C2A6C5: memalign (vg_replace_malloc.c:727)
==4194==by 0x4C2A760: posix_memalign (vg_replace_malloc.c:876)
==4194==by 0xEB84BB: av_malloc (mem.c:97)
==4194==by 0xE7CE85: alloc_lines (slice.c:27)
==4194==by 0xE7E9BB: ff_init_filters (slice.c:465)
==4194==by 0xE428A0: sws_init_context (utils.c:1603)
==4194==by 0x4BF266: config_props (vf_scale.c:392)
==4194==by 0x444A3E: avfilter_config_links (avfilter.c:262)
==4194==by 0x4449D2: avfilter_config_links (avfilter.c:251)
==4194==by 0x447B6B: graph_config_links (avfiltergraph.c:275)
==4194==by 0x44AA20: avfilter_graph_config (avfiltergraph.c:1212)
==4194==by 0x41EAF3: configure_filtergraph (ffplay.c:1951)
...


./ffplay -f lavfi testsrc  -vf format=rgb555 (gray and some odd distortion)
the issue goes away with -cpuflags 0, so it seems related to the use
of asm code
valgrind shows this:
==4069== Thread 17:
==4069== Invalid read of size 8
==4069==at 0xE4B47B: yuv2rgb555_1_mmxext (swscale_template.c:1328)
==4069==by 0x1A37B5EF: ???
==4069==by 0x1A37B84F: ???
==4069==by 0xE39408: swscale (swscale.c:699)
==4069==by 0x13F: ???
==4069==  Address 0x20c50c80 is 16 bytes after a block of size 720 alloc'd
==4069==at 0x4C2A6C5: memalign (vg_replace_malloc.c:727)
==4069==by 0x4C2A760: posix_memalign (vg_replace_malloc.c:876)
==4069==by 0xEB84BB: av_malloc (mem.c:97)
==4069==by 0xE7CE85: alloc_lines (slice.c:27)
==4069==by 0xE7EAA9: ff_init_filters (slice.c:470)
==4069==by 0xE428A0: sws_init_context (utils.c:1603)
==4069==by 0x4BF266: config_props (vf_scale.c:392)
==4069==by 0x444A3E: avfilter_config_links (avfilter.c:262)
==4069==by 0x4449D2: avfilter_config_links (avfilter.c:251)
==4069==by 0x4449D2: avfilter_config_links (avfilter.c:251)
==4069==by 0x4449D2: avfilter_config_links (avfilter.c:251)
==4069==by 0x447B6B: graph_config_links (avfiltergraph.c:275)


also if you need help with debugging something / are stuck with
debugging, don't wait the whole week but please mail me or the list
or try IRC
i am happy to help!

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The real ebay dictionary, page 2
"100% positive feedback" - "All either got their money back or didnt complain"
"Best seller ever, very honest" - "Seller refunded buyer after failed scam"




Re: [FFmpeg-devel] GSoC Weekly Report (libswscale refactor)

2015-06-25 Thread wm4
On Thu, 25 Jun 2015 13:34:57 -0300
Pedro Arthur  wrote:

> > it passes fate but seems to crash for example with:
> > ./ffmpeg -i lena.pnm  -vf format=gbrp test.avi
> >
> I'll check it.
> 
> > how / over what did you measure the 3% exactly?
> > i am asking so i understand if this performance change is irrelevant
> > or not.
> > for init code that runs once some speed loss is irrelevant but for
> > example the horizontal scale code as a whole should not become 3%
> > slower.
> > but i do not suggest doing non trivial optimization on this yet.
> > better get it all working first.
> 
> The measurement was done only for the horizontal scaling loop.
> But it should improve without much effort as the implementation advances.
> 
> > please always attach patches, that makes reviewing them with
> > MUAs easy and also keeps them together with any discussions about
> > them. who knows if the github links will still work in 10 years if
> > someone tries to debug something and tries to look up old discussions
> > and patches ...
> >
> Attached.

For this it would be really better if you sent the patches with git
send-email. It sends 1 patch per mail, and makes it much easier to read
and review. Once you've learned how to do it, it'll be much easier for
you as well.


Re: [FFmpeg-devel] GSoC Weekly Report (libswscale refactor)

2015-06-25 Thread Pedro Arthur
> it passes fate but seems to crash for example with:
> ./ffmpeg -i lena.pnm -vf format=gbrp test.avi
>
I'll check it.

> how /over what did you measure the 3% exactly ?
> I am asking, so I understand if this performance change is irrelevant
> or not.
> for init code that runs once some speed loss is irrelevant but for
> example the horizontal scale code as a whole should not become 3%
> slower.
> but I do not suggest to do non trivial optimization on this yet.
> better get it all working first.

The measurement was done only for the horizontal scaling loop.
But it should improve without much effort as the implementation advances.

> please always attach patches, that makes reviewing them with
> MUAs easy and also keeps them together with any discussions about
> them. who knows if the github links will still work in 10 years if
> someone tries to debug something and tries to look up old discussions
> and patches ...
>
Attached.


Attachments:
0001-Add-gamma-encodign-decoding-before-after-scaling-in-.patch
0002-swscale-refactor-added-initial-filters.patch
0003-FIXES.patch
0004-swscale-move-variable-filter-attributes-to-filter-in.patch
0005-swscale-rename-instance-structs.patch
0006-swscale-corrected-chroma-sliceY-sliceH.patch
0007-swscale-initial-horizontal-chroma-scaling-work.patch
0008-swscale-plug-chroma-horizontal-scaling.patch


Re: [FFmpeg-devel] GSoC Weekly Report (libswscale refactor)

2015-06-24 Thread Michael Niedermayer
On Wed, Jun 24, 2015 at 02:41:54PM -0300, Pedro Arthur wrote:
> Hi,
> 
> I'm working on the libswscale refactoring and Michael advised me to send
> the changes to the
> mailing list so that I can get more feedback about it. Thus I added the
> references [1] - [7] which
> are links to commits on my github fork of FFmpeg.
> 
> Last week I wrote the horizontal chroma scaling (patches [3]-[7]) code and
> spent some time
> ensuring it passes all tests.

it passes fate but seems to crash for example with:
./ffmpeg -i lena.pnm  -vf format=gbrp test.avi


> 
> 
> As we are approaching the midterm I'll also present my scheduling plans.
> 
> - Line pool allocator for SwsSlice
> - Implement ring buffer logic into SwsSlice
> - Implement horizontal scaling (refactor)
> - Implement vertical scaling (refactor)
> - Measure refactor performance/overhead
> - Document new code
> 
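
For readers unfamiliar with the "line pool" and "ring buffer" items in the plan above, here is a minimal sketch of the general idea in C. It is not code from the attached patches and does not reflect the actual SwsSlice layout; the type LineRing and the functions ring_init, ring_push, ring_get and ring_free are hypothetical names chosen for illustration. A fixed pool of line buffers is allocated once, and the oldest line is recycled when a new row arrives and the pool is full.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical illustration only; not the SwsSlice struct from the patches. */
typedef struct LineRing {
    uint8_t **lines;  /* pool of preallocated line buffers */
    int capacity;     /* number of lines in the pool */
    int first;        /* image row number of the oldest line held */
    int count;        /* number of rows currently held */
} LineRing;

/* Release every line and the pool itself; safe on a partially built ring. */
static void ring_free(LineRing *r)
{
    if (!r->lines)
        return;
    for (int i = 0; i < r->capacity; i++)
        free(r->lines[i]);
    free(r->lines);
    r->lines = NULL;
}

/* Allocate `capacity` lines of `line_size` bytes once, up front. */
static int ring_init(LineRing *r, int capacity, int line_size)
{
    r->capacity = capacity;
    r->first    = 0;
    r->count    = 0;
    r->lines    = calloc(capacity, sizeof(*r->lines));
    if (!r->lines)
        return -1;
    for (int i = 0; i < capacity; i++) {
        r->lines[i] = malloc(line_size);
        if (!r->lines[i]) {
            ring_free(r);
            return -1;
        }
    }
    return 0;
}

/* Return the buffer for the next row, dropping the oldest row when full. */
static uint8_t *ring_push(LineRing *r)
{
    if (r->count == r->capacity) {
        r->first++;
        r->count--;
    }
    int row = r->first + r->count;
    r->count++;
    return r->lines[row % r->capacity];
}

/* Return the buffer holding image row `row`, or NULL if it is not held. */
static uint8_t *ring_get(const LineRing *r, int row)
{
    if (row < r->first || row >= r->first + r->count)
        return NULL;
    return r->lines[row % r->capacity];
}
```

With a capacity of 4, pushing rows 0 to 5 leaves rows 2 to 5 available through ring_get, while rows 0 and 1 have been recycled. A vertical filter that needs only a few neighbouring input rows could work on such a window without keeping the whole image in memory.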

> The horizontal scaling is already working. I did some tests and initially
> the overhead of the new
> code is ~3% (measured only in the modified scaling code) which should be
> almost 0% for total
> program time execution.

how /over what did you measure the 3% exactly ?
I am asking, so I understand if this performance change is irrelevant
or not.
for init code that runs once some speed loss is irrelevant but for
example the horizontal scale code as a whole should not become 3%
slower.
but I do not suggest to do non trivial optimization on this yet.
better get it all working first.
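
To make the question concrete, here is a minimal sketch in C of one way such a number could be obtained and put in context. It is not how the 3% in this thread was measured. The helper names (best_time_sec, overhead_percent, total_overhead_percent) and the stand-in workload busy_loop are hypothetical, and clock() is used only because it is standard C. The last helper shows why a local slowdown matters less for total run time: if the measured code accounts for a fraction f of the run time, a local overhead of p percent costs roughly f * p percent overall.

```c
#include <time.h>

/* Stand-in workload (hypothetical); replace with the code being measured. */
static void busy_loop(void *arg)
{
    int n = *(int *)arg;
    volatile unsigned sum = 0;
    for (int i = 0; i < n; i++)
        sum += (unsigned)i;
}

/* Run fn(arg) `runs` times and return the fastest run in CPU seconds.
 * Taking the minimum reduces noise from other activity on the machine.
 * Returns -1.0 if runs is not positive. */
static double best_time_sec(void (*fn)(void *), void *arg, int runs)
{
    double best = -1.0;
    for (int i = 0; i < runs; i++) {
        clock_t t0 = clock();
        fn(arg);
        clock_t t1 = clock();
        double dt = (double)(t1 - t0) / CLOCKS_PER_SEC;
        if (best < 0.0 || dt < best)
            best = dt;
    }
    return best;
}

/* Overhead of the new code relative to the old code, in percent. */
static double overhead_percent(double old_time, double new_time)
{
    return 100.0 * (new_time - old_time) / old_time;
}

/* Approximate whole-program overhead when the measured code accounts for
 * `fraction` (0..1) of the total run time. */
static double total_overhead_percent(double fraction, double local_percent)
{
    return fraction * local_percent;
}
```

For example, overhead_percent(100.0, 103.0) gives 3.0, and if the scaling loop were half of the total run time (an assumed figure), total_overhead_percent(0.5, 3.0) gives 1.5.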


> 
> For the next week (or 2) I plan to implement the line pool allocator and
> the ring buffer logic.
> 
> Besides the scheduling list, with the new scaling design I think it is
> possible to remove the need
> for cascade SwsContexts and also work on some kind of parallelization of
> the scaling code. I should work on these after finishing the scheduled items.
> 
> [1] 7efb0fae8ed52b6f841d70c4d8981399da42e7bd
> 
> [2] 0b955f28ab1cbba5188e0cbd44c250dc5b526d53
> 
> [3] b577b4d388743e6f90f54e65fdb8edddbeaf17de
> 
> [4] f83a83e8ab97e0fcab2248e828d4fb5433e8bcfe
> 
> [5] eaf6de8606b698baaeda1ecb9493a240d5b828e9
> 
> [6] b949d895da80979d44f6525de3a3f9a98118a0a6
> 
> [7] 353d20df59075d442e943f00c7fcb8bc7784089b
> 

please always attach patches, that makes reviewing them with
MUAs easy and also keeps them together with any discussions about
them. who knows if the github links will still work in 10 years if
someone tries to debug something and tries to look up old discussions
and patches ...

Thanks

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No great genius has ever existed without some touch of madness. -- Aristotle

