> If you want to use some of the instructions as soon as possible

I can wait, and work around it with hardcoded WORD directives in the meantime.
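
For reference, a minimal sketch of how such hardcoded WORD values could be generated rather than typed by hand. It covers only the 4S form of the vector FADD, and the helper names `faddVec4S` and `wordLine` are made up for illustration; they are not part of the Go toolchain or of asm2plan9s. The base constant is derived from the `fadd v0.4s, v0.4s, v4.4s` example quoted further down in this thread.

```go
package main

import "fmt"

// faddVec4S returns the A64 encoding of "fadd Vd.4S, Vn.4S, Vm.4S".
// Hypothetical helper: register numbers are masked to 5 bits and
// placed in the Rm (bits 20-16), Rn (bits 9-5) and Rd (bits 4-0) fields.
func faddVec4S(rd, rn, rm uint32) uint32 {
	const base = 0x4E20D400 // FADD (vector), Q=1, sz=0, register fields zeroed
	return base | (rm&31)<<16 | (rn&31)<<5 | (rd & 31)
}

// wordLine formats the encoding as a Go assembler WORD directive and keeps
// the GNU-style mnemonic as a trailing comment for readability.
func wordLine(rd, rn, rm uint32) string {
	return fmt.Sprintf("WORD $0x%08X // fadd v%d.4s, v%d.4s, v%d.4s",
		faddVec4S(rd, rn, rm), rd, rn, rm)
}

func main() {
	fmt.Println(wordLine(0, 0, 4))
}
```

Running it prints `WORD $0x4E24D400 // fadd v0.4s, v0.4s, v4.4s`, which matches the hand-written line in the original message. Other instructions would need their own base constants and field layouts.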

> I'm working on these issues.

Glad to know that this is tracking along. 

I'm looking forward to trying out SVE on Graviton 3.

On Friday, January 21, 2022 at 3:57:45 PM UTC+13 eric...@arm.com wrote:

> > I see there's an existing issue to add a bunch of Neon floating point
> > instructions:
> > https://github.com/golang/go/issues/41092
> >
> > I actually spent a while having a go at adding the instructions myself,
> > but couldn't figure it out.
> >
> > I also see that there is a proposal and an MR to refactor the Arm64
> > assembler.
> > https://github.com/golang/go/issues/44734
> >
> > Is there any ongoing work there, or has that effort stalled?
>
> I'm working on these issues. The plan is to refactor the assembler first;
> the newly designed assembler should make it easier to add instructions. As
> you said, there are a large number of arm64 instructions (NEON, SVE) that
> are not supported, and we want to spend as little effort as possible on this.
>
> This will take some time, although the code is already under review.
> If you want to use some of the instructions as soon as possible, please
> submit an issue, or use WORD to work around it first. Thanks~
> On Friday, January 21, 2022 at 10:15:26 AM UTC+8 gr...@montoux.com wrote:
>
>> Hi team,
>>
>> I'm a recent Gopher, and have had great success over the past year 
>> developing an insurance modelling application. The tooling is great, thanks 
>> to the team for creating it.
>>
>> 1) SIMD Workflow
>>
>> I've got hot functions in my application which are doing element wise 
>> operations on float slices. Some are just element-wise addition, and 
>> multiplication, some are slightly more complicated.
>>
>> I'm currently deploying on AWS Lambda x86, which has AVX2 support (Xeon
>> Haswell+), but I'm also experimenting with Arm64 (Graviton 2), and would
>> also like to do some benchmarking on Graviton 3 (only available on EC2 ATM).
>>
>> I've been experimenting with implementing the hot functions in Go's ASM 
>> dialect, and using some simple code generation, to handle all the 
>> repetition. Nothing fancy, not much more than string templating. The 
>> results have been pretty good, but the workflow is pretty slow.
>>
>> As a side project I've been toying with the idea of writing a slightly 
>> more advanced tool, that could read a "SIMD kernel" written as a simple Go 
>> function with a specific form, and generate ASM implementations for it. No 
>> fancy optimisations, just loop unrolling and vector instructions.
>>
>> For example:
>>
>> import . "asmgen"
>>
>> // Implementation in a generated .s file
>> func Foo(dst []float32, a float32, x, y []float32)
>>
>> // AST used as input to ASM codegen
>> func kernelFoo(i int, dst []float32, a float32, x, y []float32) {
>>     dst[i] = min(a * x[i], y[i])
>> }
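
As a rough illustration of the semantics such a generator would have to reproduce, here is a plain scalar Go loop for the same kernel. This is only a sketch: the name `fooRef` is hypothetical, the slice-length handling (process the shortest of the three slices) is an assumption the message does not specify, and the comparison-based minimum does nothing special for NaN.

```go
package main

import "fmt"

// fooRef is a scalar reference for the kernel dst[i] = min(a*x[i], y[i]).
// Hypothetical helper: it processes only as many elements as the shortest
// of dst, x and y, which is an assumption rather than a stated requirement.
func fooRef(dst []float32, a float32, x, y []float32) {
	n := len(dst)
	if len(x) < n {
		n = len(x)
	}
	if len(y) < n {
		n = len(y)
	}
	for i := 0; i < n; i++ {
		v := a * x[i]
		// Plain comparison; NaN inputs are not treated specially.
		if y[i] < v {
			v = y[i]
		}
		dst[i] = v
	}
}

func main() {
	x := []float32{1, 2, 3}
	y := []float32{5, 3, 4}
	dst := make([]float32, len(x))
	fooRef(dst, 2, x, y)
	fmt.Println(dst) // prints [2 3 4]
}
```

A loop like this could also double as the expected-value oracle when testing a generated assembly implementation against random inputs.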
>>
>> In reality, I probably don't have the time to do that, but it does feel 
>> like something minimal that would actually cover most of my immediate use 
>> cases is not a huge amount of work.
>>
>> I guess this is basically just a limited form of c2goasm. See:
>> https://github.com/minio/c2goasm
>>
>> So maybe I should just use that, however including big blobs of hex 
>> encoded ASM doesn't seem great either. See: 
>> https://github.com/apache/arrow/blob/master/go/parquet/internal/utils/min_max_neon_arm64.s
>>
>> So apologies that this question is a bit vague and rambly. But the 
>> workflow for SIMD here is pretty slow, and it feels like there could be a 
>> better way to solve this. So I'm basically just reaching out to see if 
>> anyone else has been working on this, or thinking about it, or has ideas 
>> about better solutions.
>>
>>
>> 2) Arm64 ASM Neon Instructions:
>>
>> One problem that's come up is that there are a bunch of ARM instructions
>> which aren't defined in Go's assembler. So it looks like I'm going to have
>> to write some code to generate the hex for these. I can probably copy the
>> approach used here: 
>> https://github.com/minio/asm2plan9s/blob/master/asm2plan9s_arm64.go
>>
>> For example - I'm currently writing:
>>
>> WORD $0x4E24D400 // fadd v0.4s, v0.4s, v4.4s
>>
>> But would like to write:
>>
>> VFADD V0.S4, V0.S4, V4.S4
>>
>> I see there's an existing issue to add a bunch of Neon floating point 
>> instructions:
>> https://github.com/golang/go/issues/41092
>>
>> I actually spent a while having a go at adding the instructions myself, 
>> but couldn't figure it out.
>>
>> I also see that there is a proposal and an MR to refactor the Arm64
>> assembler.
>> https://github.com/golang/go/issues/44734
>>
>> Is there any ongoing work there, or has that effort stalled?
>>
>> Anyways, thanks for reading my big wall of text.
>>
>> Cheers,
>> Greg.
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/473da81d-be50-4f24-93cb-407f7b9726fcn%40googlegroups.com.
