Re: Libraries with NEON backends

2011-04-12 Thread Richard Sandiford
Michael Hope  writes:
> Richard, the implementation uses NEON intrinsics so it'd be
> interesting to see if your pack/unpack patches apply to it.

Thanks for the heads up.  FWIW, though, I don't think my changes
help here, because there are no strided loads and stores involved.
Jan's version doesn't use the intrinsics associated with the vldN
and vstN instructions that I'm working on.
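
To illustrate the distinction, here is a scalar sketch only; it is not code
from either patch set, and the function names are made up for this example.
A vldN-style structure load de-interleaves N-element structures into separate
registers, whereas a plain vector load keeps the bytes in memory order.  For
N = 2, in C terms:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a contiguous 16-byte load: bytes land in the
 * destination in memory order. */
void load_contiguous16(const uint8_t *src, uint8_t dst[16])
{
    assert(src != NULL);
    for (size_t i = 0; i < 16; i++)
        dst[i] = src[i];
}

/* Scalar model of a 2-way structure load (the vld2 pattern): 16
 * interleaved bytes are split into even-indexed and odd-indexed lanes. */
void load_deinterleave2(const uint8_t *src, uint8_t even[8], uint8_t odd[8])
{
    assert(src != NULL);
    for (size_t i = 0; i < 8; i++) {
        even[i] = src[2 * i];
        odd[i]  = src[2 * i + 1];
    }
}
```

Code that only ever performs the first kind of access would not be expected
to change with work aimed at the second pattern.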

Richard

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: Libraries with NEON backends

2011-04-10 Thread Michael Hope
On Sun, Apr 10, 2011 at 5:47 AM, Jim Huang  wrote:
> On 31 March 2011 08:23, Michael Hope  wrote:
>> Thanks all for your replies.  I mixed these in with a bit of Googling
>> and recorded them here:
>>  https://wiki.linaro.org/MichaelHope/Sandbox/LibrariesWithNeon
>
> hi Michael,
>
> Jan Seiffert implemented a series of adler32 vectorization for zlib:
>    http://blackfin.uclinux.org/git/?p=users/vapier/zlib.git;a=summary
>
> ARM NEON and ARMv6 SIMD are included.  It looks great and is being
> reviewed in zlib mailing-list:
>     
> http://mail.madler.net/pipermail/zlib-devel_madler.net/2011-April/date.html

Hi jserv.  I had a quick play with this on one of my machines.  It
looks promising but is a bit broken at the moment:

michaelh@ursa1:/scratch/michaelh/zlib$ gdb ./example
...
Starting program: /scratch/michaelh/zlib/example
zlib version 1.2.5 = 0x1250, compile flags = 0x155
uncompress(): hello, hello!
gzread(): hello, hello!
gzgets() after gzseek:  hello!
inflate(): hello, hello!

Program received signal SIGSEGV, Segmentation fault.
0x00015c48 in adler32_vec (adler=2363950230, buf=0x7b000 , len=0)
at adler32_arm.c:162
162 in16 = *(const uint8x16_t *)buf;
(gdb) back
#0  0x00015c48 in adler32_vec (adler=2363950230, buf=0x7b000 , len=0)
at adler32_arm.c:162
#1  0x00016446 in adler32 (adler=2363950230, buf=0x26008
"x\001\354\320\261\r", len=2)
at adler32.c:418
#2  0xb81c in read_buf (strm=0x7ebf3634, buf=0x44ba8 "",
size=25536) at deflate.c:1005
#3  0xbe7a in fill_window (s=0x39898) at deflate.c:1380
#4  0xc06c in deflate_stored (s=0x39898, flush=0) at deflate.c:1484
#5  0xb252 in deflate (strm=0x7ebf3634, flush=0) at deflate.c:822
#6  0x922e in test_large_deflate (compr=0x26008
"x\001\354\320\261\r", comprLen=4,
uncompr=0x2fc50 "hello, hello!", uncomprLen=4) at example.c:281
#7  0x9ca6 in main (argc=1, argv=0x7ebf37f4) at example.c:551
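
Reading the backtrace: adler32_vec faults on a 16-byte load at line 162 with
len=0 and buf at 0x7b000, which looks like a full-width load issued when fewer
than 16 bytes remain.  That has not been confirmed against Jan's source, so
treat it as an assumption.  A minimal scalar sketch of a guarded block loop
follows; adler32_guarded and adler_block16 are hypothetical names, and the
block helper merely stands in for a NEON kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u   /* largest prime below 65536 */

/* Hypothetical scalar stand-in for a 16-byte vector kernel; a real
 * NEON version would fetch these 16 bytes with one vector load. */
static void adler_block16(uint32_t *a, uint32_t *b, const uint8_t *p)
{
    for (size_t i = 0; i < 16; i++) {
        *a += p[i];
        *b += *a;
    }
    *a %= ADLER_BASE;
    *b %= ADLER_BASE;
}

uint32_t adler32_guarded(uint32_t adler, const uint8_t *buf, size_t len)
{
    uint32_t a = adler & 0xffffu;
    uint32_t b = (adler >> 16) & 0xffffu;

    assert(buf != NULL || len == 0);

    /* Take the wide path only while a full 16 bytes remain. */
    while (len >= 16) {
        adler_block16(&a, &b, buf);
        buf += 16;
        len -= 16;
    }
    /* Tail: one byte at a time, never reading past buf + len. */
    while (len > 0) {
        a = (a + *buf++) % ADLER_BASE;
        b = (b + a) % ADLER_BASE;
        len--;
    }
    return (b << 16) | a;
}
```

With len=0 neither loop runs and the running checksum is returned unchanged,
so no byte of buf is touched.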

Richard, the implementation uses NEON intrinsics so it'd be
interesting to see if your pack/unpack patches apply to it.

I'll mention this on the zlib-devel list.

-- Michael



Re: Libraries with NEON backends

2011-04-09 Thread Jim Huang
On 31 March 2011 08:23, Michael Hope  wrote:
> Thanks all for your replies.  I mixed these in with a bit of Googling
> and recorded them here:
>  https://wiki.linaro.org/MichaelHope/Sandbox/LibrariesWithNeon

hi Michael,

Jan Seiffert implemented a series of adler32 vectorizations for zlib:
    http://blackfin.uclinux.org/git/?p=users/vapier/zlib.git;a=summary

ARM NEON and ARMv6 SIMD are included.  It looks great and is being
reviewed on the zlib mailing list:
    http://mail.madler.net/pipermail/zlib-devel_madler.net/2011-April/date.html

Regards,
-jserv



Re: Libraries with NEON backends

2011-03-30 Thread Michael Hope
Thanks all for your replies.  I mixed these in with a bit of Googling
and recorded them here:
 https://wiki.linaro.org/MichaelHope/Sandbox/LibrariesWithNeon

-- Michael




Re: Libraries with NEON backends

2011-03-30 Thread Matt Sealey
I forgot, btw: the true solution here is to adapt zlib so that the
functions are pluggable and the appropriate calls are made at the
appropriate times based on HW capabilities.

Then throw it at mainline; they may accept it.
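
A minimal sketch of what "pluggable" could look like, assuming a function
pointer that is chosen on first use.  Everything here is hypothetical:
adler32_dispatch, adler32_neon_stub and cpu_has_neon are illustration names,
not zlib API, and the probe is a stub (on Linux it would typically consult
the hwcap bits the kernel exposes).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*adler32_fn)(uint32_t adler, const uint8_t *buf, size_t len);

/* Portable fallback: plain byte-wise Adler-32. */
static uint32_t adler32_generic(uint32_t adler, const uint8_t *buf, size_t len)
{
    uint32_t a = adler & 0xffffu;
    uint32_t b = (adler >> 16) & 0xffffu;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        a = (a + *buf++) % 65521u;
        b = (b + a) % 65521u;
        len--;
    }
    return (b << 16) | a;
}

/* Placeholder for an optimized routine; it only forwards to the
 * generic code so that the sketch builds on any machine. */
static uint32_t adler32_neon_stub(uint32_t adler, const uint8_t *buf, size_t len)
{
    return adler32_generic(adler, buf, len);
}

/* Hypothetical capability probe; always reports "no NEON" here. */
static int cpu_has_neon(void)
{
    return 0;
}

static adler32_fn adler32_impl;   /* NULL until the first call */

uint32_t adler32_dispatch(uint32_t adler, const uint8_t *buf, size_t len)
{
    /* Lazy selection; not thread-safe as written. */
    if (adler32_impl == NULL)
        adler32_impl = cpu_has_neon() ? adler32_neon_stub : adler32_generic;
    return adler32_impl(adler, buf, len);
}
```

Callers keep a single entry point, so one binary can serve both NEON and
non-NEON parts without a separate package per variant.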

-- 
Matt Sealey 
Product Development Analyst, Genesi USA, Inc.






Re: Libraries with NEON backends

2011-03-30 Thread Matt Sealey
Konstantinos is right with regard to the package naming; we discussed
this with the zlib developers at the time.

It is easily solved with a packaging solution though: Provides:, as he
said.  You will have the same problems with something like libjpegturbo
if it forces NEON, or libmatrix shipped with the GLES test utilities
we've seen in our bug reports (btw that is a prime target for some NEON
optimization, Konstantinos can you throw some code in there?), or in fact
any library that has NEON code which is not properly inserted/overridden
at runtime based on NEON hwcaps (the concern here is i.MX515 TO2, Marvell
and nVidia chips which have broken NEON or no NEON at all).

Shipping a libturboz.deb would not be a huge imposition.  Given that
Genesi provides system images and installers for Ubuntu, we can install
it by default (TO2 support for installation is going away with Natty).
For Debian main, and other distributions which need to figure on
supporting more platforms than ours, and for Ubuntu in the future if they
ever get their act together on supporting real consumer products instead
of just dev boards (looking at you too, Linaro!), it will have to be a
user-installable option.  This might not be any more difficult than
supplying a metapackage for the platform (like omap-extra) with some
Recommends: line, which can only be resolved using an external repository
(partner, or so) which is not enabled by default.  As soon as someone
enables that repo they will have the option at next update to "upgrade"
their system to these new libraries.

Unfortunately, doing it from a distribution point of view takes away
all the easiest potential for performance optimization, but I think the
benefit of having it "standardized" is worth it.

Speaking of standardization, we have had a LOT of customer complaints
about xscreensaver-gl being installed by default on ARM platforms.  In
what world does the common ARM SoC ship with a full OpenGL implementation
bolted on?  Users are clicking some random 3D screensaver and complaining
there is no acceleration; users do not understand the difference here
between GL and GLES.  As well as making new packages (libturboz in some
other repo), it will have to be automated, or automatically educate users
so that they understand why they need this package and why, in fact, in
some cases they may not actually need it.

-- 
Matt Sealey 
Product Development Analyst, Genesi USA, Inc.






Re: Libraries with NEON backends

2011-03-30 Thread David Rusling
Konstantinos, Steve,
  I think that it depends on how you interpret "plainly mark".  I can imagine
several ways of doing this:

- naming the (binary) package explicitly
- installing an additional README file / including one in the binary package
- explicitly named source tar files

I think that the original authors did not want derived works being represented
as their original work.  So we need to avoid any confusion of the NEON-enabled
version with the original.  I'm with Steve that marking sources, adding
another notice, etc. is enough.

Are the original authors still involved?   It might be worth asking them...

Dave

Sent from yet another ARM powered mobile device




Re: Libraries with NEON backends

2011-03-29 Thread Konstantinos Margaritis
On 30 March 2011 01:45, Steve Langasek  wrote:
> I don't think this is a correct interpretation of the license.  You don't
> have to change a package name to "plainly mark" the source as modified;
> debian/copyright, changelogs, notices in the source files accomplish this.
> This is done for packages all the time, not just for zlib.

from http://www.gzip.org/zlib/zlib_license.html

2. Altered source versions must be plainly marked as such, and must not be
 misrepresented as being the original software.

I read this then as "you cannot distribute it as a replacement of the
original zlib library".
I'll take your word that it's not the case, but it still is confusing to me.

Konstantinos



Re: Libraries with NEON backends

2011-03-29 Thread Steve Langasek
On Tue, Mar 29, 2011 at 11:07:05AM +0300, Konstantinos Margaritis wrote:
> On 29 March 2011 10:53, Steve Langasek  wrote:
> > Hi Konstantinos,

> > There must be some misunderstanding here; no license that prohibited
> > distribution of binaries built from modified source would be considered a
> > Free Software license, and zlib is certainly considered free. :)

> Yes, you're right, the problem is that a modified zlib would have to be
> clearly marked as different -ie the package name would have to be
> different.

I don't think this is a correct interpretation of the license.  You don't
have to change a package name to "plainly mark" the source as modified;
debian/copyright, changelogs, notices in the source files accomplish this.
This is done for packages all the time, not just for zlib.

> I was probably wrong in my license interpretation in 2005, but I seem to
> remember it was something like that that basically made me stop my work in
> vectorizing zlib :)

What a shame!  I think you could have gone ahead in good conscience :)

> I'd love to be corrected if it meant having a NEON-optimized zlib in 2011 :)

And I don't see any reason we can't go ahead with this now!

-- 
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slanga...@ubuntu.com                                     vor...@debian.org




Re: Libraries with NEON backends

2011-03-29 Thread Riku Voipio
On Tue, Mar 29, 2011 at 11:07:05AM +0300, Konstantinos Margaritis wrote:
> On 29 March 2011 10:53, Steve Langasek  wrote:
> > Hi Konstantinos,
> 
> > There must be some misunderstanding here; no license that prohibited
> > distribution of binaries built from modified source would be considered a
> > Free Software license, and zlib is certainly considered free. :)
> 
> Yes, you're right, the problem is that a modified zlib would have to be 
> clearly
> marked as different -ie the package name would have to be different. 

Well, Debian zlib is already modified, hence the "dfsg" in the source and
binary versions.





Re: Libraries with NEON backends

2011-03-29 Thread Konstantinos Margaritis
On 29 March 2011 10:53, Steve Langasek  wrote:
> Hi Konstantinos,

> There must be some misunderstanding here; no license that prohibited
> distribution of binaries built from modified source would be considered a
> Free Software license, and zlib is certainly considered free. :)

Yes, you're right, the problem is that a modified zlib would have to be clearly
marked as different -ie the package name would have to be different. This
would be easily solved by means of a Provides: field, but I'm unsure if the
differentiation also should include the libz.so filename. I was probably wrong
in my license interpretation in 2005, but I seem to remember it was something
like that that basically made me stop my work in vectorizing zlib :)

I'd love to be corrected if it meant having a NEON-optimized zlib in 2011 :)

> The only relevant requirements in the license (according to
> /usr/share/doc/zlib1g/copyright) are:
>
>  1. The origin of this software must not be misrepresented; you must not
>     claim that you wrote the original software. If you use this software
>     in a product, an acknowledgment in the product documentation would be
>     appreciated but is not required.
>  2. Altered source versions must be plainly marked as such, and must not be
>     misrepresented as being the original software.

Yes, 2 is the problem, I think this was interpreted as having to rename
the package and possibly the .so name.

> Are you looking at a different zlib license than this one?

No, it's the same.

Konstantinos



Re: Libraries with NEON backends

2011-03-29 Thread Steve Langasek
Hi Konstantinos,

On Tue, Mar 29, 2011 at 10:21:53AM +0300, Konstantinos Margaritis wrote:
> On 28 March 2011 07:52, Jim Huang  wrote:

> The problem is the zlib license, it forbids distributing compiled
> versions that are modified from the original source, such optimizations
> can go in the contrib folder, but it's of little use to the average user.

There must be some misunderstanding here; no license that prohibited
distribution of binaries built from modified source would be considered a
Free Software license, and zlib is certainly considered free. :)

The only relevant requirements in the license (according to
/usr/share/doc/zlib1g/copyright) are:

  1. The origin of this software must not be misrepresented; you must not
 claim that you wrote the original software. If you use this software
 in a product, an acknowledgment in the product documentation would be
 appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be
 misrepresented as being the original software.

Are you looking at a different zlib license than this one?

-- 
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slanga...@ubuntu.com                                     vor...@debian.org




Re: Libraries with NEON backends

2011-03-29 Thread Konstantinos Margaritis
On 28 March 2011 07:52, Jim Huang  wrote:
> - zlib
>  Using SIMD, we can optimize 'copy / repeat an existing sequence' in
> LZ-style encoding.
>  The reference Intel SSE2 optimization patch is attached in this mail.

Regarding zlib in particular: in 2005 I did an AltiVec port of this.
Apart from vectorizing the Adler32 hashing function (which was ~2x faster
than the C version [1]), there are ~6 functions that are worth optimizing,
as I found out while profiling the code.  These functions are in deflate.c
and inflate.c, iirc; I have to search for the old tarball, it's here
somewhere.  The performance increase was from 20% to 50%, using plain C
AltiVec code.  I guess it should be similar with NEON.  IMHO, it's worth
it, but:

The problem is the zlib license: it forbids distributing compiled
versions that are modified from the original source.  Such optimizations
can go in the contrib folder, but that's of little use to the average user.

Konstantinos

[1]: http://www.freevec.org/old/whitepapers/Adler32-Altivec.pdf
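
For context on why Adler32 vectorizes at all: the two running sums can be
accumulated without a modulo for a bounded number of bytes, and that inner
loop is the part a SIMD version would replace.  A scalar sketch of this
deferred-modulo structure follows.  BASE and NMAX follow zlib's own naming;
adler32_deferred is a made-up name, and this is neither the AltiVec code from
[1] nor Jan's NEON code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BASE 65521u   /* largest prime smaller than 65536 */
#define NMAX 5552u    /* most bytes that can be summed before the
                         32-bit accumulators could overflow */

uint32_t adler32_deferred(uint32_t adler, const uint8_t *buf, size_t len)
{
    uint32_t a = adler & 0xffffu;
    uint32_t b = (adler >> 16) & 0xffffu;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        size_t n = len < NMAX ? len : NMAX;
        len -= n;
        /* Hot loop: additions only, no division. */
        while (n > 0) {
            a += *buf++;
            b += a;
            n--;
        }
        /* One reduction per chunk instead of one per byte. */
        a %= BASE;
        b %= BASE;
    }
    return (b << 16) | a;
}
```

The division-free inner loop is where wide adds and multiply-accumulates
would go in a vector port.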



Re: Libraries with NEON backends

2011-03-27 Thread Jim Huang
On 28 March 2011 05:09, Michael Hope  wrote:
> Hi there.  I'm looking for areas where the toolchain could generate
> faster code, and a good way of doing that is seeing how compiled code
> does against the best hand-written code.  I know of skia, ffmpeg,
> pixman, Orc, and efl - what others are out there?
>

hi Michael,

Great motivation to optimize the existing libraries with NEON!

As far as I know, Android depends on several libraries, and some of
them are compute-bound:

- libpixelflinger -- a bit like pixman
  There is no official documentation for PixelFlinger, but you can always
  check out its source:
    http://android.git.kernel.org/?p=platform/system/core.git;a=summary
  I submitted one NEON optimization patch for libpixelflinger to AOSP before:
    https://review.source.android.com//#change,16358

- zlib
  Using SIMD, we can optimize 'copy / repeat an existing sequence' in
  LZ-style encoding.
  The reference Intel SSE2 optimization patch is attached to this mail.
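
For readers who have not looked at inflate: the 'copy / repeat' step copies
len bytes from dist bytes back in the output window, and the source may
overlap the destination.  A rough sketch of the idea, assuming 8-byte chunks
and a hypothetical helper name lz_copy_match (this is not the code from the
attached patch):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `len` bytes of an LZ77 match to `out`, reading from `dist` bytes
 * earlier in the same buffer.  The caller guarantees that `dist` bytes
 * of history and `len` bytes of room exist.  Returns the new write
 * position. */
uint8_t *lz_copy_match(uint8_t *out, size_t dist, size_t len)
{
    const uint8_t *from = out - dist;

    assert(dist > 0);

    if (dist >= 8) {
        /* An 8-byte source chunk ends at or before `out`, so each
         * chunk is a non-overlapping fixed-size copy that a compiler
         * or SIMD unit can do in one go. */
        while (len >= 8) {
            memcpy(out, from, 8);
            out += 8;
            from += 8;
            len -= 8;
        }
    }
    /* Short distances (the "repeat" case) and tails: copy in byte
     * order so freshly written bytes are re-read. */
    while (len > 0) {
        *out++ = *from++;
        len--;
    }
    return out;
}
```

A SIMD version would widen the chunk and treat small distances specially,
for example by replicating the pattern into a register.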

Sincerely,
-jserv
diff -urNp zlib-1.2.5-orig/deflate.c zlib-1.2.5/deflate.c
--- zlib-1.2.5-orig/deflate.c   2010-04-20 12:12:21.0 +0800
+++ zlib-1.2.5/deflate.c2010-07-26 03:53:34.0 +0800
@@ -49,6 +49,17 @@
 
 /* @(#) $Id$ */
 
+/* We can use 2-byte chunks only if 'unsigned short' has been defined
+ * appropriately and MAX_MATCH has the default value.
+ */
+#ifdef UNALIGNED_OK
+#  include <limits.h>
+#  include "zutil.h"
+#  if (MAX_MATCH != 258) || (USHRT_MAX != 0xffff)
+#undef UNALIGNED_OK
+#  endif
+#endif
+
 #include "deflate.h"
 
 const char deflate_copyright[] =
@@ -1119,7 +1130,8 @@ local uInt longest_match(s, cur_match)
  * However the length of the match is limited to the lookahead, so
  * the output of deflate is not affected by the uninitialized values.
  */
-#if (defined(UNALIGNED_OK) && MAX_MATCH == 258)
+#ifdef UNALIGNED_OK
+
 /* This code assumes sizeof(unsigned short) == 2. Do not use
  * UNALIGNED_OK if your compiler uses a different size.
  */
diff -urNp zlib-1.2.5-orig/deflate.h zlib-1.2.5/deflate.h
--- zlib-1.2.5-orig/deflate.h   2010-04-19 12:00:46.0 +0800
+++ zlib-1.2.5/deflate.h2010-07-26 03:53:34.0 +0800
@@ -251,9 +251,12 @@ typedef struct internal_state {
 ulg bits_sent;  /* bit length of compressed data sent mod 2^32 */
 #endif
 
-ush bi_buf;
+ulg bi_buf;
 /* Output buffer. bits are inserted starting at the bottom (least
- * significant bits).
+ * significant bits).  Room for at least two short values to allow
+ * for a simpler overflow handling.  However, if more than 16 bits
+ * have been buffered, it will be flushed and no more than 16
+ * bits will be in use afterwards.
  */
 int bi_valid;
 /* Number of valid bits in bi_buf.  All bits above the last valid bit
@@ -274,6 +277,20 @@ typedef struct internal_state {
  */
 #define put_byte(s, c) {s->pending_buf[s->pending++] = (c);}
 
+/* Output a short LSB first on the stream.
+ * IN assertion: there is enough room in pendingBuf.
+ */
+#if defined(LITTLE_ENDIAN) && defined(UNALIGNED_OK)
+#  define put_short(s, w) { \
+*(ush*)(s->pending_buf + s->pending) = (ush)(w);\
+s->pending += 2; \
+}
+#else
+#  define put_short(s, w) { \
+put_byte(s, (uch)((w) & 0xff)); \
+put_byte(s, (uch)((ush)(w) >> 8)); \
+}
+#endif
 
 #define MIN_LOOKAHEAD (MAX_MATCH+MIN_MATCH+1)
 /* Minimum amount of lookahead, except at the end of the input file.
diff -urNp zlib-1.2.5-orig/inffast.c zlib-1.2.5/inffast.c
--- zlib-1.2.5-orig/inffast.c   2010-04-19 12:16:23.0 +0800
+++ zlib-1.2.5/inffast.c2010-07-26 03:53:34.0 +0800
@@ -1,5 +1,6 @@
 /* inffast.c -- fast decoding
- * Copyright (C) 1995-2008, 2010 Mark Adler
+ * Copyright (C) 1995-2004, 2010 Mark Adler
+ *   2010 Optimizations by Stefan Fuhrmann
  * For conditions of distribution and use, see copyright notice in zlib.h
  */
 
@@ -10,16 +11,35 @@
 
 #ifndef ASMINF
 
+/* This is a highly optimized implementation of the decoder function for
+ * large code blocks. It cannot be used to decode close to the end of
+ * input nor output buffers (see below).
+ *
+ * Before trying to hand-tune assembly code for your target, you should
+ * make sure that alignment, endianness, word size optimizations etc. have
+ * already been enabled for the respective target platform.
+ *
+ * For MS VC++ 2008, the performance gain of specialized code against
+ * DISABLE_INFLATE_FAST_OPTIMIZATIONS (base line) is as follows:
+ *
+ * x86 (32 bit):  +60% throughput
+ * x64 (64 bit):  +70% throughput
+ *
+ * Measurements were taken on a Core i7 CPU with a mix of small and large
+ * buffers (110MB total) of varying content and an average compression rate
+ * of 2.2.
+ */
+
 /* Allow machine dependent optimization for post-increment or pre-increment.
-   Based on testing to date,
-   Pre-increment preferred for:
-   - PowerPC G3 (Adler)
-   - MIPS R5000 (Randers-Pehrson)

Libraries with NEON backends

2011-03-27 Thread Michael Hope
Hi there.  I'm looking for areas where the toolchain could generate
faster code, and a good way of doing that is seeing how compiled code
does against the best hand-written code.  I know of skia, ffmpeg,
pixman, Orc, and efl - what others are out there?

Thanks for any input,

-- Michael
