zeroshade commented on code in PR #37785:
URL: https://github.com/apache/arrow/pull/37785#discussion_r1333184383
##########
go/parquet/internal/bmi/bitmap_neon_arm64.s:
##########
@@ -27,8 +27,8 @@ LBB0_2:
MOVD R0, res+16(FP)
RET
LBB0_4:
- WORD $0xaa1f03e0 // mov x0, xzr
WORD $0xa8c17bfd // ldp x29, x30, [sp], #16
+ MOVD ZR, res+16(FP)
Review Comment:
@cyb70289 I'd be fine with removing the arm assembly implementation of the
pext function. I haven't benchmarked the lookup table vs the ARM assembly
version yet. but I agree the lookup table would likely be faster.
The other issue i'm having is the bit packing side of things, though I think
i may have figured out a solution to it... and i've learned more about ARM64
assembly in the last few days than i ever intended on learning LOL as I've
needed to do a lot of handcoding assembly to get this to work. Looks like the
issue boils down to needing anything with a label to be explicit in the
assembly rather than using the `WORD`/`BYTE` syntax to pass them to the
processor stream so that the labels get translated to the correct PC locations.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]