Re: [PATCH] libstdc++: Optimize integer std::from_chars

2022-04-14 Thread Jonathan Wakely via Gcc-patches
On Thu, 14 Apr 2022 at 13:32, Patrick Palka via Libstdc++
 wrote:
>
> This applies the following optimizations to the integer std::from_chars
> implementation:
>
>   1. Use a lookup table for converting an alphanumeric digit to its
>  base-36 value instead of using a range test (for 0-9) and switch
>  (for a-z and A-Z).  The table is constructed using a C++14
>  constexpr function which doesn't assume a particular character
>  encoding or __CHAR_BIT__ value.  The new conversion function
>  __from_chars_alnum_to_val is templated on whether we care
>  only about the decimal digits, in which case we can perform the
>  conversion with a single subtraction since the digit characters
>  are guaranteed to be contiguous (unlike the letters).
>   2. Generalize __from_chars_binary to handle all power-of-two bases.
>  This function, now named __from_chars_pow2_base, is also templated
>  on whether we care only about the decimal digits in order to speed
>  up digit conversion for base 2, 4 and 8.
>   3. In __from_chars_digit, use
>static_cast(__c - '0') < __base
>  instead of
>'0' <= __c && __c <= ('0' + (__base - 1)).
>  as the digit recognition test (exhaustively verified that the two
>  tests are equivalent).
>   4. In __from_chars_alnum, use a nested loop to consume the rest of the
>  digits in the overflow case (mirroring __from_chars_digit) so that
>  the main loop doesn't have to maintain the __valid overflow flag.
>
> At this point, __from_chars_digit is nearly identical to
> __from_chars_alnum, so this patch combines the two functions, removing
> the former and templatizing the latter according to whether we care only
> about the decimal digits.  Finally,
>
>   5. In __from_chars_alnum, keep track of a lower bound on the number of
>  unused bits in the result and use that to omit the overflow check
>  when it's safe to do so.
>
> In passing this replaces the non-portable function ascii_to_hexit
> used by __floating_from_chars_hex with the new conversion function.
>
> Here are some runtime measurements for a simple 15-line benchmark that
> roundtrips printing/parsing 200 million integers via std::to/from_chars
> (average of 5 runs):
>
>   Base  Before  After (seconds, lower is better)
>  29.37   9.37
>  3   12.13  15.79
>  83.67   4.15
> 103.86   4.90
> 115.03   6.84
> 162.93   4.14
> 322.39   3.85
> 363.26   5.22
>
> Testedon x86_64-pc-linux-gnu, does this look OK for trunk?  Also tested
> against libc++'s from_chars tests for good measure.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/charconv (__from_chars_alnum_to_val_table): Define.
> (__from_chars_alnum_to_val): Define.
> (__from_chars_binary): Rename to ...
> (__from_chars_pow2_base): ... this.  Generalize to handle any
> power-of-two base using __from_chars_alnum_to_val.
> (__from_chars_digit): Optimize digit recognition to a single
> test instead of two tests.  Use [[__unlikely___]] attribute.
> (__from_chars_alpha_to_num): Remove.
> (__from_chars_alnum): Use __from_chars_alnum_to_val.  Use a
> nested loop for the overflow case.
> (from_chars): Adjust appropriately.
> * src/c++17/floating_from_chars.cc (ascii_to_hexit): Remove.
> (__floating_from_chars_hex): Use __from_chars_alnum_to_val
> to recognize a hex digit instead.
> ---
>  libstdc++-v3/include/std/charconv | 250 --
>  libstdc++-v3/src/c++17/floating_from_chars.cc |  18 +-
>  2 files changed, 105 insertions(+), 163 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/charconv 
> b/libstdc++-v3/include/std/charconv
> index 2ce9c7d4cb9..5e44459749a 100644
> --- a/libstdc++-v3/include/std/charconv
> +++ b/libstdc++-v3/include/std/charconv
> @@ -407,176 +407,127 @@ namespace __detail
>return true;
>  }
>
> -  /// std::from_chars implementation for integers in base 2.
> -  template
> +  // Construct and return a lookup table that maps 0-9, A-Z and a-z to the
> +  // corresponding corresponding base-36 value and maps all other characters
> +  // to 127.
> +  constexpr auto
> +  __from_chars_alnum_to_val_table()
> +  {
> +constexpr unsigned char __lower_letters[]
> +  = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
> + 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
> + 'u', 'v', 'w', 'x', 'y', 'z' };
> +constexpr unsigned char __upper_letters[]
> +  = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
> + 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T',
> + 'U', 'V', 'W', 'X', 'Y', 'Z' };
> +struct { unsigned char __data[1u << __CHAR_BIT__] = {}; } __table;
> +for (auto& __entry : __table.__data)
> +  __entry = 127;
> +for (int __i = 0; __i < 10; ++__i)
> +  __table.__data['0' + __i] = __i;
> +for (int __i = 0; __i < 26; +

Re: [PATCH] libstdc++: Optimize integer std::from_chars

2022-04-14 Thread Patrick Palka via Gcc-patches
On Thu, 14 Apr 2022, Patrick Palka wrote:

> This applies the following optimizations to the integer std::from_chars
> implementation:
> 
>   1. Use a lookup table for converting an alphanumeric digit to its
>  base-36 value instead of using a range test (for 0-9) and switch
>  (for a-z and A-Z).  The table is constructed using a C++14
>  constexpr function which doesn't assume a particular character
>  encoding or __CHAR_BIT__ value.  The new conversion function
>  __from_chars_alnum_to_val is templated on whether we care
>  only about the decimal digits, in which case we can perform the
>  conversion with a single subtraction since the digit characters
>  are guaranteed to be contiguous (unlike the letters).
>   2. Generalize __from_chars_binary to handle all power-of-two bases.
>  This function, now named __from_chars_pow2_base, is also templated
>  on whether we care only about the decimal digits in order to speed
>  up digit conversion for base 2, 4 and 8.
>   3. In __from_chars_digit, use
>static_cast(__c - '0') < __base
>  instead of
>'0' <= __c && __c <= ('0' + (__base - 1)).
>  as the digit recognition test (exhaustively verified that the two
>  tests are equivalent).
>   4. In __from_chars_alnum, use a nested loop to consume the rest of the
>  digits in the overflow case (mirroring __from_chars_digit) so that
>  the main loop doesn't have to maintain the __valid overflow flag.
> 
> At this point, __from_chars_digit is nearly identical to
> __from_chars_alnum, so this patch combines the two functions, removing
> the former and templatizing the latter according to whether we care only
> about the decimal digits.  Finally,
> 
>   5. In __from_chars_alnum, keep track of a lower bound on the number of
>  unused bits in the result and use that to omit the overflow check
>  when it's safe to do so.
> 
> In passing this replaces the non-portable function ascii_to_hexit
> used by __floating_from_chars_hex with the new conversion function.
> 
> Here are some runtime measurements for a simple 15-line benchmark that
> roundtrips printing/parsing 200 million integers via std::to/from_chars
> (average of 5 runs):
> 
>   Base  Before  After (seconds, lower is better)
>  29.37   9.37
>  3   12.13  15.79
>  83.67   4.15
> 103.86   4.90
> 115.03   6.84
> 162.93   4.14
> 322.39   3.85
> 363.26   5.22

Whoops, the second and third columns should be swapped (runtime is
smaller after the patch across the board).

> 
> Testedon x86_64-pc-linux-gnu, does this look OK for trunk?  Also tested
> against libc++'s from_chars tests for good measure.
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/std/charconv (__from_chars_alnum_to_val_table): Define.
>   (__from_chars_alnum_to_val): Define.
>   (__from_chars_binary): Rename to ...
>   (__from_chars_pow2_base): ... this.  Generalize to handle any
>   power-of-two base using __from_chars_alnum_to_val.
>   (__from_chars_digit): Optimize digit recognition to a single
>   test instead of two tests.  Use [[__unlikely___]] attribute.
>   (__from_chars_alpha_to_num): Remove.
>   (__from_chars_alnum): Use __from_chars_alnum_to_val.  Use a
>   nested loop for the overflow case.
>   (from_chars): Adjust appropriately.
>   * src/c++17/floating_from_chars.cc (ascii_to_hexit): Remove.
>   (__floating_from_chars_hex): Use __from_chars_alnum_to_val
>   to recognize a hex digit instead.
> ---
>  libstdc++-v3/include/std/charconv | 250 --
>  libstdc++-v3/src/c++17/floating_from_chars.cc |  18 +-
>  2 files changed, 105 insertions(+), 163 deletions(-)
> 
> diff --git a/libstdc++-v3/include/std/charconv 
> b/libstdc++-v3/include/std/charconv
> index 2ce9c7d4cb9..5e44459749a 100644
> --- a/libstdc++-v3/include/std/charconv
> +++ b/libstdc++-v3/include/std/charconv
> @@ -407,176 +407,127 @@ namespace __detail
>return true;
>  }
>  
> -  /// std::from_chars implementation for integers in base 2.
> -  template
> +  // Construct and return a lookup table that maps 0-9, A-Z and a-z to the
> +  // corresponding corresponding base-36 value and maps all other characters
> +  // to 127.
> +  constexpr auto
> +  __from_chars_alnum_to_val_table()
> +  {
> +constexpr unsigned char __lower_letters[]
> +  = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
> +   'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
> +   'u', 'v', 'w', 'x', 'y', 'z' };
> +constexpr unsigned char __upper_letters[]
> +  = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
> +   'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T',
> +   'U', 'V', 'W', 'X', 'Y', 'Z' };
> +struct { unsigned char __data[1u << __CHAR_BIT__] = {}; } __table;
> +for (auto& __entry : __table.__data)
> +  __entry = 127;
> +for (int __i = 0; __i < 10; ++__i)
> +  __tabl

RE: [PATCH] libstdc++: Optimize integer std::from_chars

2022-04-14 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Patrick Palka
> via Gcc-patches
> Sent: Thursday, April 14, 2022 1:31 PM
> To: gcc-patches@gcc.gnu.org
> Cc: libstd...@gcc.gnu.org
> Subject: [PATCH] libstdc++: Optimize integer std::from_chars
> 
> This applies the following optimizations to the integer std::from_chars
> implementation:
> 
>   1. Use a lookup table for converting an alphanumeric digit to its
>  base-36 value instead of using a range test (for 0-9) and switch
>  (for a-z and A-Z).  The table is constructed using a C++14
>  constexpr function which doesn't assume a particular character
>  encoding or __CHAR_BIT__ value.  The new conversion function
>  __from_chars_alnum_to_val is templated on whether we care
>  only about the decimal digits, in which case we can perform the
>  conversion with a single subtraction since the digit characters
>  are guaranteed to be contiguous (unlike the letters).
>   2. Generalize __from_chars_binary to handle all power-of-two bases.
>  This function, now named __from_chars_pow2_base, is also templated
>  on whether we care only about the decimal digits in order to speed
>  up digit conversion for base 2, 4 and 8.
>   3. In __from_chars_digit, use
>static_cast(__c - '0') < __base
>  instead of
>'0' <= __c && __c <= ('0' + (__base - 1)).
>  as the digit recognition test (exhaustively verified that the two
>  tests are equivalent).
>   4. In __from_chars_alnum, use a nested loop to consume the rest of the
>  digits in the overflow case (mirroring __from_chars_digit) so that
>  the main loop doesn't have to maintain the __valid overflow flag.
> 
> At this point, __from_chars_digit is nearly identical to
> __from_chars_alnum, so this patch combines the two functions, removing
> the former and templatizing the latter according to whether we care only
> about the decimal digits.  Finally,
> 
>   5. In __from_chars_alnum, keep track of a lower bound on the number of
>  unused bits in the result and use that to omit the overflow check
>  when it's safe to do so.
> 
> In passing this replaces the non-portable function ascii_to_hexit
> used by __floating_from_chars_hex with the new conversion function.
> 
> Here are some runtime measurements for a simple 15-line benchmark that
> roundtrips printing/parsing 200 million integers via std::to/from_chars
> (average of 5 runs):
> 
>   Base  Before  After (seconds, lower is better)
>  29.37   9.37
>  3   12.13  15.79
>  83.67   4.15
> 103.86   4.90
> 115.03   6.84
> 162.93   4.14
> 322.39   3.85
> 363.26   5.22

The after numbers look worse (higher)? Are the columns accidentally swapped?
Thanks,
Kyrill

> 
> Testedon x86_64-pc-linux-gnu, does this look OK for trunk?  Also tested
> against libc++'s from_chars tests for good measure.
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/std/charconv (__from_chars_alnum_to_val_table): Define.
>   (__from_chars_alnum_to_val): Define.
>   (__from_chars_binary): Rename to ...
>   (__from_chars_pow2_base): ... this.  Generalize to handle any
>   power-of-two base using __from_chars_alnum_to_val.
>   (__from_chars_digit): Optimize digit recognition to a single
>   test instead of two tests.  Use [[__unlikely___]] attribute.
>   (__from_chars_alpha_to_num): Remove.
>   (__from_chars_alnum): Use __from_chars_alnum_to_val.  Use a
>   nested loop for the overflow case.
>   (from_chars): Adjust appropriately.
>   * src/c++17/floating_from_chars.cc (ascii_to_hexit): Remove.
>   (__floating_from_chars_hex): Use __from_chars_alnum_to_val
>   to recognize a hex digit instead.
> ---
>  libstdc++-v3/include/std/charconv | 250 --
>  libstdc++-v3/src/c++17/floating_from_chars.cc |  18 +-
>  2 files changed, 105 insertions(+), 163 deletions(-)
> 
> diff --git a/libstdc++-v3/include/std/charconv b/libstdc++-
> v3/include/std/charconv
> index 2ce9c7d4cb9..5e44459749a 100644
> --- a/libstdc++-v3/include/std/charconv
> +++ b/libstdc++-v3/include/std/charconv
> @@ -407,176 +407,127 @@ namespace __detail
>return true;
>  }
> 
> -  /// std::from_chars implementation for integers in base 2.
> -  template
> +  // Construct and return a lookup table that maps 0-9, A-Z and a-z to the
> +  // corresponding corresponding base-36 value and maps all other
> characters
> +  // to 127.
> +  constexpr auto
> +  __from_chars_alnum_to_val_table()
> +  {
> +constexpr unsigned char __lower_letters[]
> +  = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
> +   'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
> +   'u', 'v', 'w', 'x', 'y', 'z' };
> +constexpr unsigned char __upper_letters[]
> +  = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
> +   'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S',