On Tue, Aug 30, 2022 at 12:17 AM Nathan Bossart
wrote:
> Thanks! I've attached a follow-up patch with a couple of small
> suggestions.
Pushed, thanks!
--
John Naylor
EDB: http://www.enterprisedb.com
On Mon, Aug 29, 2022 at 05:49:46PM +0700, John Naylor wrote:
> Bowerbird just reported the same error, so I went ahead and pushed a
> fix with this.
Thanks! I've attached a follow-up patch with a couple of small
suggestions.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
diff
On Mon, Aug 29, 2022 at 4:28 PM John Naylor
wrote:
>
> Here's the simplest fix I can think of:
>
> /*
> * Exactly like vector8_is_highbit_set except for the input type, so
> it still looks
> * at each _byte_ separately.
> *
> * XXX x86 uses the same underlying type for vectors with 8-bit,
>
On Mon, Aug 29, 2022 at 3:19 PM John Naylor
wrote:
>
> It turns out MSVC animal drongo doesn't like this cast -- on x86 they
> are the same underlying type. Will look into that as more results come
> in.
Here's the simplest fix I can think of:
/*
* Exactly like vector8_is_highbit_set except
On Mon, Aug 29, 2022 at 11:25 AM John Naylor
wrote:
> +static inline bool
> +vector32_is_highbit_set(const Vector32 v)
> +{
> +#ifdef USE_SSE2
> + return (_mm_movemask_epi8(v) & 0x) != 0;
> +#endif
> +}
>
> I'm not sure why we need this function -- AFAICS it just adds more
> work on x86 for
On Mon, Aug 29, 2022 at 12:44 PM Nathan Bossart
wrote:
> [v6]
Pushed with a couple comment adjustments, let's see what the build
farm thinks...
--
John Naylor
EDB: http://www.enterprisedb.com
On Mon, Aug 29, 2022 at 11:25:50AM +0700, John Naylor wrote:
> + uint32 nelem_per_vector = sizeof(Vector32) / sizeof(uint32);
> + uint32 nelem_per_iteration = 4 * nelem_per_vector;
>
> Using local #defines would be my style. I don't have a reason to
> object to this way, but adding const makes
John Naylor writes:
> I wonder if we should explain briefly what saturating arithmetic is. I
> had never encountered it outside of a SIMD programming context.
+1, it's at least worth a sentence to define the term.
regards, tom lane
On Sun, Aug 28, 2022 at 10:58 AM Nathan Bossart
wrote:
>
> Here is a new patch set in which I've attempted to address all feedback.
Looks in pretty good shape. Some more comments:
+ uint32 nelem_per_vector = sizeof(Vector32) / sizeof(uint32);
+ uint32 nelem_per_iteration = 4 * nelem_per_vector;
Here is a new patch set in which I've attempted to address all feedback.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From a5f381097819db05b6e47418597cd56bab411fad Mon Sep 17 00:00:00 2001
From: Nathan Bossart
Date: Thu, 25 Aug 2022 22:18:30 -0700
Subject: [PATCH v5 1/2]
On Sun, Aug 28, 2022 at 10:39:09AM +1200, Thomas Munro wrote:
> On Sun, Aug 28, 2022 at 10:12 AM Nathan Bossart
> wrote:
>> Yup. The problem is that AFAICT there's no equivalent to
>> _mm_movemask_epi8() on aarch64, so you end up with something like
>>
>> vmaxvq_u8(vandq_u8(v,
On Sun, Aug 28, 2022 at 10:12 AM Nathan Bossart
wrote:
> Yup. The problem is that AFAICT there's no equivalent to
> _mm_movemask_epi8() on aarch64, so you end up with something like
>
> vmaxvq_u8(vandq_u8(v, vector8_broadcast(0x80))) != 0
>
> But for pg_lfind32(), we really just want to
On Sat, Aug 27, 2022 at 05:18:34PM -0400, Tom Lane wrote:
> In short, I think the critical part of 0002 needs to look more like
> this:
>
> +#elif defined(__aarch64__) && defined(__ARM_NEON)
> +/*
> + * We use the Neon instructions if the compiler provides access to them
> + * (as indicated by
Thanks for taking a look.
On Sat, Aug 27, 2022 at 01:59:06PM +0700, John Naylor wrote:
> I don't forsee any use of emulating vector registers with uint64 if
> they only hold two ints. I wonder if it'd be better if all vector32
> functions were guarded with #ifndef NO_USE_SIMD. (I wonder if
>
I spent a bit more time researching the portability implications of
this patch. I think that we should check __ARM_NEON before #including
; there is authoritative documentation out there telling
you to, eg [1], and I can see no upside at all to not checking.
We cannot check *only* __ARM_NEON,
On Sat, Aug 27, 2022 at 1:24 AM Nathan Bossart wrote:
>
> Here is a rebased patch set that applies to HEAD.
0001:
#define USE_NO_SIMD
typedef uint64 Vector8;
+typedef uint64 Vector32;
#endif
I don't forsee any use of emulating vector registers with uint64 if
they only hold two ints. I
On Thu, Aug 25, 2022 at 11:13:47PM -0700, Nathan Bossart wrote:
> Here is a new patch set that applies on top of v9-0001 in the
> json_lex_string patch set [0] and v3 of the is_valid_ascii patch [1].
Here is a rebased patch set that applies to HEAD.
--
Nathan Bossart
Amazon Web Services:
Here is a new patch set that applies on top of v9-0001 in the
json_lex_string patch set [0] and v3 of the is_valid_ascii patch [1].
[0]
https://postgr.es/m/CAFBsxsFV4v802idV0-Bo%3DV7wLMHRbOZ4er0hgposhyGCikmVGA%40mail.gmail.com
[1]
On Fri, Aug 26, 2022 at 10:45:10AM +0700, John Naylor wrote:
> On Thu, Aug 25, 2022 at 11:57 AM Nathan Bossart
> wrote:
>> The ARM literature appears to indicate that Neon support is pretty standard
>> on aarch64, and AFAICT it's pretty common to just assume it's available.
>
> This doesn't
On Thu, Aug 25, 2022 at 11:57 AM Nathan Bossart
wrote:
>
> On Thu, Aug 25, 2022 at 10:38:34AM +0700, John Naylor wrote:
> > On Thu, Aug 25, 2022 at 1:01 AM Nathan Bossart
> > wrote:
> >> On Wed, Aug 24, 2022 at 11:07:03AM +0700, John Naylor wrote:
> >> > - Can a user on ARM64 ever get a runtime
On Thu, Aug 25, 2022 at 10:38:34AM +0700, John Naylor wrote:
> On Thu, Aug 25, 2022 at 1:01 AM Nathan Bossart
> wrote:
>> On Wed, Aug 24, 2022 at 11:07:03AM +0700, John Naylor wrote:
>> > - Can a user on ARM64 ever get a runtime fault if the machine attempts
>> > to execute NEON instructions?
>>
On Thu, Aug 25, 2022 at 1:01 AM Nathan Bossart wrote:
>
> On Wed, Aug 24, 2022 at 11:07:03AM +0700, John Naylor wrote:
> > The important thing is: if we compile with __aarch64__ as a target:
> > - Will the compiler emit the intended instructions from the intrinsics
> > without extra flags?
>
> My
On Wed, Aug 24, 2022 at 11:07:03AM +0700, John Naylor wrote:
> The important thing is: if we compile with __aarch64__ as a target:
> - Will the compiler emit the intended instructions from the intrinsics
> without extra flags?
My testing with GCC and Clang did not require any extra flags. GCC
On Tue, Aug 23, 2022 at 4:15 AM Nathan Bossart wrote:
>
> On Mon, Aug 22, 2022 at 11:50:35AM +0700, John Naylor wrote:
> > Is this also ever defined on 32-bit? If so, is it safe, meaning the
> > compiler will not emit these instructions without additional flags?
> > I'm wondering if __aarch64__
On Mon, Aug 22, 2022 at 11:50:35AM +0700, John Naylor wrote:
> On Sat, Aug 20, 2022 at 5:28 AM Nathan Bossart
> wrote:
>> Thanks for the pointer. GCC, Clang, and the Arm compiler all seem to
>> define __ARM_NEON, so here is a patch that uses that instead.
>
> Is this also ever defined on
On Sat, Aug 20, 2022 at 5:28 AM Nathan Bossart wrote:
>
> On Fri, Aug 19, 2022 at 02:26:02PM -0700, Andres Freund wrote:
> > Are you sure there's not an appropriate define for us to use here instead
> > of a
> > configure test? E.g.
> >
> > echo|cc -dM -P -E -|grep -iE 'arm|aarch'
> > ...
> >
On Fri, Aug 19, 2022 at 02:26:02PM -0700, Andres Freund wrote:
> Are you sure there's not an appropriate define for us to use here instead of a
> configure test? E.g.
>
> echo|cc -dM -P -E -|grep -iE 'arm|aarch'
> ...
> #define __AARCH64_SIMD__ 1
> ...
> #define __ARM_NEON 1
> #define
Hi,
On 2022-08-19 13:08:29 -0700, Nathan Bossart wrote:
> I've tested the patch on a recent macOS (M1 Pro) and Amazon Linux
> (Graviton2), and I've confirmed that the instructions aren't used on a
> Linux/Intel machine. I did add a new configure check to see if the
> relevant intrinsics are
Hi hackers,
This is a follow-up for recent changes that optimized [sub]xip lookups in
XidInMVCCSnapshot() on Intel hardware [0] [1]. I've attached a patch that
uses ARM Advanced SIMD (Neon) intrinsic functions where available to speed
up the search. The approach is nearly identical to the SSE2
29 matches
Mail list logo