Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v4]
On Mon, 19 Aug 2024 07:19:30 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD: Saturating signed addition. >> . SUSUB : Saturating unsigned subtraction. >> . SSUB: Saturating signed subtraction. >> . UMAX: Unsigned max >> . UMIN: Unsigned min. >> >> >> New vector operators are applicable to only integral types since their >> values wraparound in over/underflowing scenarios after setting appropriate >> status flags. For floating point types, as per IEEE 754 specs there are >> multiple schemes to handler underflow, one of them is gradual underflow >> which transitions the value to subnormal range. Similarly, overflow >> implicitly saturates the floating-point value to an Infinite value. >> >> As the name suggests, these are saturating operations, i.e. the result of >> the computation is strictly capped by lower and upper bounds of the result >> type and is not wrapped around in underflowing or overflowing scenarios. >> >> Summary of changes: >> - Java side implementation of new vector operators. >> - Add new scalar saturating APIs for each of the above saturating vector >> operator in corresponding primitive box classes, fallback implementation of >> vector operators is based over it. >> - C2 compiler IR and inline expander changes. >> - Optimized x86 backend implementation for new vector operators and their >> predicated counterparts. >> - Extends existing VectorAPI Jtreg test suite to cover new operations. >> >> Kindly review and share your feedback. >> >> Best Regards, >> PS: Intrinsification and auto-vectorization of new core-lib API will be >> addressed separately in a follow-up patch. >> >> [1] https://mail.openjdk.org/pipermail/panama-dev/2024-May/020408.html > > Jatin Bhateja has updated the pull request incrementally with one additional > commit since the last revision: > > Review comments resolutions. src/hotspot/share/opto/vectornode.hpp line 150: > 148: class SaturatingVectorNode : public VectorNode { > 149: private: > 150: bool _is_unsigned; Would it be better to make it a `const bool`? src/hotspot/share/opto/vectornode.hpp line 172: > 170: class SaturatingAddVBNode : public SaturatingVectorNode { > 171: public: > 172: SaturatingAddVBNode(Node* in1, Node* in2, const TypeVect* vt, bool > is_unsigned) : SaturatingVectorNode(in1,in2,vt,is_unsigned) {} Style: spaces after the commas in `SaturatingVectorNode(in1,in2,vt,is_unsigned)` - PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1723459735 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1723463554
Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v4]
On Mon, 19 Aug 2024 07:19:30 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD: Saturating signed addition. >> . SUSUB : Saturating unsigned subtraction. >> . SSUB: Saturating signed subtraction. >> . UMAX: Unsigned max >> . UMIN: Unsigned min. >> >> >> New vector operators are applicable to only integral types since their >> values wraparound in over/underflowing scenarios after setting appropriate >> status flags. For floating point types, as per IEEE 754 specs there are >> multiple schemes to handler underflow, one of them is gradual underflow >> which transitions the value to subnormal range. Similarly, overflow >> implicitly saturates the floating-point value to an Infinite value. >> >> As the name suggests, these are saturating operations, i.e. the result of >> the computation is strictly capped by lower and upper bounds of the result >> type and is not wrapped around in underflowing or overflowing scenarios. >> >> Summary of changes: >> - Java side implementation of new vector operators. >> - Add new scalar saturating APIs for each of the above saturating vector >> operator in corresponding primitive box classes, fallback implementation of >> vector operators is based over it. >> - C2 compiler IR and inline expander changes. >> - Optimized x86 backend implementation for new vector operators and their >> predicated counterparts. >> - Extends existing VectorAPI Jtreg test suite to cover new operations. >> >> Kindly review and share your feedback. >> >> Best Regards, >> PS: Intrinsification and auto-vectorization of new core-lib API will be >> addressed separately in a follow-up patch. >> >> [1] https://mail.openjdk.org/pipermail/panama-dev/2024-May/020408.html > > Jatin Bhateja has updated the pull request incrementally with one additional > commit since the last revision: > > Review comments resolutions. src/hotspot/share/opto/addnode.hpp line 404: > 402: > //--UMaxINode--- > 403: // Maximum of 2 unsigned integers. > 404: class UMaxINode : public Node { Would it be better to define `max_opcode()` and `min_opcode()` for `UMaxINode` and `UMinINode`? These are used to find commutative patterns in `AddNode::Ideal()` and `MulNode::Ideal()` and optimize them. - PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1723414044
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v6]
On Wed, 22 May 2024 14:47:43 GMT, Yudi Zheng wrote: >> Moving array construction within BigInteger.implMultiplyToLen intrinsic >> candidate to its caller simplifies the intrinsic implementation in JIT >> compiler. > > Yudi Zheng has updated the pull request incrementally with one additional > commit since the last revision: > > address comments. Tested tier1 on aarch64 and no failures. Also no regressions (or even gain) on aarch64 with the BigInteger testcase you mentioned. I think copyright year has not been updated for some of the files but I guess that's up to you. - PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2127335211
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v6]
On Wed, 22 May 2024 14:47:43 GMT, Yudi Zheng wrote: >> Moving array construction within BigInteger.implMultiplyToLen intrinsic >> candidate to its caller simplifies the intrinsic implementation in JIT >> compiler. > > Yudi Zheng has updated the pull request incrementally with one additional > commit since the last revision: > > address comments. src/java.base/share/classes/java/math/BigInteger.java line 1836: > 1834: > 1835: if (z == null || z.length < (xlen + ylen)) > 1836: z = new int[xlen + ylen]; Style: only 4 spaces indentation - PR Review Comment: https://git.openjdk.org/jdk/pull/18226#discussion_r1611422191
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v6]
On Wed, 22 May 2024 14:47:43 GMT, Yudi Zheng wrote: >> Moving array construction within BigInteger.implMultiplyToLen intrinsic >> candidate to its caller simplifies the intrinsic implementation in JIT >> compiler. > > Yudi Zheng has updated the pull request incrementally with one additional > commit since the last revision: > > address comments. src/hotspot/share/opto/library_call.cpp line 5925: > 5923: // Set the original stack and the reexecute bit for the interpreter > to reexecute > 5924: // the bytecode that invokes BigInteger.multiplyToLen() if > deoptimization happens > 5925: // on the return from z array allocation in runtime. Since we are not allocating z array during runtime anymore, do we still need these comments? - PR Review Comment: https://git.openjdk.org/jdk/pull/18226#discussion_r1611403873
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v3]
On Wed, 17 Apr 2024 12:37:01 GMT, Yudi Zheng wrote: >> `multiply_to_len` seems to be used by `generate_squareToLen` as well for >> aarch64 and riscv but `zlen` is still passed in a register. >> >> https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp#L4710 >> https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/riscv/stubGenerator_riscv.cpp#L2881 >> >> I think it might work anyway but it might be better to adapt them if only >> for completeness. > > @dafedafe @dean-long please take a look and let me know if there are further > issues, thanks! Hi @mur47x111, do you happen to have any performance results with this patch? - PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2120179701
Re: RFR: 8289552: Make intrinsic conversions between bit representations of half precision values and floats [v11]
On Wed, 28 Sep 2022 17:40:36 GMT, Smita Kamath wrote: >> 8289552: Make intrinsic conversions between bit representations of half >> precision values and floats > > Smita Kamath has updated the pull request incrementally with one additional > commit since the last revision: > > Addressed review comment to update test case Hi, would you be adding IR tests to verify the generation of the the newly introduced IR nodes? - PR: https://git.openjdk.org/jdk/pull/9781