> On Jan 21, 2021, at 6:48 PM, Justin Pryzby <pry...@telsasoft.com> wrote:
> 
> @cfbot: rebased
> <0001-Reorganize-pglz-compression-code.patch>

Review comments.

First, I installed a build from master without this patch, created a test 
installation with lots of compressed text and array columns, upgraded the 
binaries to a build with this patch included, and tried to find problems with 
the data left over from the pre-patch binaries.  Everything checks out.  This 
is on little-endian mac osx intel core i9, not on any ARM platform that you are 
targeting with portions of the patch.

+/**************************************
+ *  CPU Feature Detection            *
+ **************************************/
+/* PGLZ_FORCE_MEMORY_ACCESS
+ * By default, access to unaligned memory is controlled by `memcpy()`, which 
is safe and portable.
+ * Unfortunately, on some target/compiler combinations, the generated assembly 
is sub-optimal.
+ * The below switch allow to select different access method for improved 
performance.
+ * Method 0 (default) : use `memcpy()`. Safe and portable.
+ * Method 1 : direct access. This method is portable but violate C standard.
+ *         It can generate buggy code on targets which assembly generation 
depends on alignment.
+ *         But in some circumstances, it's the only known way to get the most 
performance (ie GCC + ARMv6)
+ * See 
https://fastcompression.blogspot.fr/2015/08/accessing-unaligned-memory.html for 
details.
+ * Prefer these methods in priority order (0 > 1)
+ */

The link to blogspot.fr has a lot more information than your summary in the 
code comments.  It might be hard to understand this comment if the blogspot 
article were ever to disappear.  Perhaps you could include a bit more of the 
relevant details?

+#ifndef PGLZ_FORCE_MEMORY_ACCESS   /* can be defined externally */
+#if defined(__GNUC__) && \
+  ( defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) || 
defined(__ARM_ARCH_6K__) \
+  || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) || 
defined(__ARM_ARCH_6T2__) )
+#define PGLZ_FORCE_MEMORY_ACCESS 1
+#endif
+#endif

I can understand wanting to set this on gcc + ARMv6, but doesn't this belong in 
a configure script rather than directly in the compression code?

The blogspot article indicates that the author lied about alignment to the 
compiler when using gcc on ARMv6, thereby generating a fast load instruction 
which happens to work on ARMv6.  You appear to be using that same approach.  
Your #if defined(__GNUC__), seems to assume that all future versions of gcc 
will generate the instructions that you want, and not start generating some 
other set of instructions.  Wouldn't you at least need a configure test to 
verify that the version of gcc being used generates the desired assembly?  Even 
then, you'd be banking on gcc doing the same thing for the test code and for 
the pglz code, which I guess might not be true.  Have you considered using 
inline assembly instead?



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Reply via email to