> If you want to use some of the instructions as soon as possible

I can wait, and work around it with hardcoded words in the meantime.
> I'm working on these issues.

Glad to know that this is tracking along. I'm looking forward to trying out SVE on Graviton 3.

On Friday, January 21, 2022 at 3:57:45 PM UTC+13 eric...@arm.com wrote:

> > I see there's an existing issue to add a bunch of Neon floating point
> > instructions:
> > https://github.com/golang/go/issues/41092
> >
> > I actually spent a while having a go at adding the instructions myself,
> > but couldn't figure it out.
> >
> > I also see that there is a proposal and an MR to refactor the Arm64
> > assembler.
> > https://github.com/golang/go/issues/44734
> >
> > Is there any ongoing work there, or has that effort stalled?
>
> I'm working on these issues. The plan is to refactor the assembler first;
> the newly designed assembler should make it easier to add instructions. As
> you said, there are a large number of arm64 instructions (NEON, SVE) that
> are not supported, and we want to spend as little effort as possible on this.
>
> This will take some time, although the code is already under review.
> If you want to use some of the instructions as soon as possible, please
> submit an issue, or use WORD to work around it first. Thanks~
>
> On Friday, January 21, 2022 at 10:15:26 AM UTC+8 <gr...@montoux.com> wrote:
>
>> Hi team,
>>
>> I'm a recent Gopher, and have had great success over the past year
>> developing an insurance modelling application. The tooling is great, thanks
>> to the team for creating it.
>>
>> 1) SIMD Workflow
>>
>> I've got hot functions in my application which are doing element-wise
>> operations on float slices. Some are just element-wise addition and
>> multiplication, some are slightly more complicated.
>>
>> I'm currently deploying on AWS Lambda x86, which has AVX2 support (Xeon
>> Haswell+), but I'm also experimenting with Arm64 (Graviton 2), and would
>> also like to do some benchmarking on Graviton 3 (only available on EC2 ATM).
>>
>> I've been experimenting with implementing the hot functions in Go's ASM
>> dialect, and using some simple code generation to handle all the
>> repetition. Nothing fancy, not much more than string templating. The
>> results have been pretty good, but the workflow is pretty slow.
>>
>> As a side project I've been toying with the idea of writing a slightly
>> more advanced tool that could read a "SIMD kernel" written as a simple Go
>> function with a specific form, and generate ASM implementations for it. No
>> fancy optimisations, just loop unrolling and vector instructions.
>>
>> For example:
>>
>> import . "asmgen"
>>
>> // Implementation in a generated .s file
>> func Foo(dst []float32, a float32, x, y []float32)
>>
>> // AST used as input to ASM codegen
>> func kernelFoo(i int, dst []float32, a float32, x, y []float32) {
>>     dst[i] = min(a * x[i], y[i])
>> }
>>
>> In reality, I probably don't have the time to do that, but it does feel
>> like something minimal that would actually cover most of my immediate use
>> cases is not a huge amount of work.
>>
>> I guess this is basically just a limited form of c2goasm. See:
>> https://github.com/minio/c2goasm
>>
>> So maybe I should just use that; however, including big blobs of hex
>> encoded ASM doesn't seem great either. See:
>> https://github.com/apache/arrow/blob/master/go/parquet/internal/utils/min_max_neon_arm64.s
>>
>> So apologies that this question is a bit vague and rambly. But the
>> workflow for SIMD here is pretty slow, and it feels like there could be a
>> better way to solve this. So I'm basically just reaching out to see if
>> anyone else has been working on this, or thinking about it, or has ideas
>> about better solutions.
>>
>>
>> 2) Arm64 ASM Neon Instructions:
>>
>> One problem that's come up is that there's a bunch of ARM instructions which
>> aren't defined in Go's assembler. So it looks like I'm going to have to
>> write some code to generate the hex for these.
>> I can probably copy the approach used here:
>> https://github.com/minio/asm2plan9s/blob/master/asm2plan9s_arm64.go
>>
>> For example - I'm currently writing:
>>
>> WORD $0x4E24D400 // fadd v0.4s, v0.4s, v4.4s
>>
>> But would like to write:
>>
>> VFADD V0.S4, V0.S4, V4.S4
>>
>> I see there's an existing issue to add a bunch of Neon floating point
>> instructions:
>> https://github.com/golang/go/issues/41092
>>
>> I actually spent a while having a go at adding the instructions myself,
>> but couldn't figure it out.
>>
>> I also see that there is a proposal and an MR to refactor the Arm64
>> assembler.
>> https://github.com/golang/go/issues/44734
>>
>> Is there any ongoing work there, or has that effort stalled?
>>
>> Anyways, thanks for reading my big wall of text.
>>
>> Cheers,
>> Greg.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/473da81d-be50-4f24-93cb-407f7b9726fcn%40googlegroups.com.