Hi, On 2019-09-03 20:10:37 +0200, Fabien COELHO wrote: > @@ -113,7 +113,7 @@ parse_output_parameters(List *options, uint32 > *protocol_version, > errmsg("conflicting or > redundant options"))); > protocol_version_given = true; > > - if (!scanint8(strVal(defel->arg), true, &parsed)) > + if (unlikely(pg_strtoint64(strVal(defel->arg), &parsed) > != PG_STRTOINT_OK)) > ereport(ERROR, > > (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > errmsg("invalid > proto_version")));
Unexcited about adding unlikely() to any place that's not a hot path - which this certainly is not. But I also wonder if we shouldn't just put the branch likelihood of pg_strtoint64 not failing into one central location. E.g. something like static inline pg_strtoint_status pg_strtoint64(const char *str, int64 *result) { pg_strtoint_status status; status = pg_strtoint64_impl(str, result); likely(status == PG_STRTOINT_OK); return status; } I've not tested this, but IIRC that should be sufficient to propagate that knowledge up to callers. > + if (likely(stat == PG_STRTOINT_OK)) > + return true; > + else if (stat == PG_STRTOINT_RANGE_ERROR) > { > - /* could fail if input is most negative number */ > - if (unlikely(tmp == PG_INT64_MIN)) > - goto out_of_range; > - tmp = -tmp; > - } > - > - *result = tmp; > - return true; > - > -out_of_range: > - if (!errorOK) > ereport(ERROR, > (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), > errmsg("value \"%s\" is out of range for type > %s", > str, "bigint"))); > - return false; > - > -invalid_syntax: > - if (!errorOK) > + return false; > + } > + else if (stat == PG_STRTOINT_SYNTAX_ERROR) > + { > ereport(ERROR, > (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), > errmsg("invalid input syntax for type %s: > \"%s\"", > "bigint", str))); > - return false; > + return false; > + } > + else > + /* cannot get here */ > + Assert(0); I'd write this as a switch over the enum - that way we'll get a compile-time warning if we're not handling a value. > +static bool > +str2int64(const char *str, int64 *val) > +{ > + pg_strtoint_status stat = pg_strtoint64(str, val); > + I find it weird to have a wrapper that's named 'str2...' that then calls 'strto' to implement itself. > +/* > + * pg_strtoint64 -- convert a string to 64-bit integer > + * > + * The function returns if the conversion failed, or > + * "*result" is set to the result. > + */ > +pg_strtoint_status > +pg_strtoint64(const char *str, int64 *result) > +{ > + const char *ptr = str; > + int64 tmp = 0; > + bool neg = false; > + > + /* > + * Do our own scan, rather than relying on sscanf which might be broken > + * for long long. > + * > + * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate > + * value as a negative number. > + */ > + > + /* skip leading spaces */ > + while (isspace((unsigned char) *ptr)) > + ptr++; > + > + /* handle sign */ > + if (*ptr == '-') > + { > + ptr++; > + neg = true; > + } > + else if (*ptr == '+') > + ptr++; > + > + /* require at least one digit */ > + if (unlikely(!isdigit((unsigned char) *ptr))) > + return PG_STRTOINT_SYNTAX_ERROR; > + > + /* process digits, we know that we have one ahead */ > + do > + { > + int64 digit = (*ptr++ - '0'); > + > + if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) || > + unlikely(pg_sub_s64_overflow(tmp, digit, &tmp))) > + return PG_STRTOINT_RANGE_ERROR; > + } > + while (isdigit((unsigned char) *ptr)); Hm. If we're concerned about the cost of isdigit etc - and I think that's reasonable, after looking at their implementation in glibc (it performs a lookup in a global table to handle potential locale changes) - I think we ought to just not use the provided isdigit, and probably not isspace either. This code effectively relies on the input being ascii anyway, so we don't need any locale specific behaviour afaict (we'd e.g. get wrong results if isdigit() returned true for something other than the ascii chars). To me the generated code looks considerably better if I use something like static inline bool pg_isdigit(char c) { return c >= '0' && c <= '9'; } static inline bool pg_isspace(char c) { return c == ' '; } (if we were to actually go for this, we'd probably want to add some docs that we don't expect EOF, or make the code safe against that). I've not benchmarked that, but I'd be surprised if it didn't improve matters. And once coded using the above, there's no point in the do/while conversion imo, as any compiler can trivially optimize redundant checks. > +/* > + * pg_strtouint64 -- convert a string to unsigned 64-bit integer > + * > + * The function returns if the conversion failed, or > + * "*result" is set to the result. > + */ > +pg_strtoint_status > +pg_strtouint64(const char *str, uint64 *result) > +{ > + const char *ptr = str; > + uint64 tmp = 0; > + > + /* skip leading spaces */ > + while (isspace((unsigned char) *ptr)) > + ptr++; > + > + /* handle sign */ > + if (*ptr == '+') > + ptr++; > + else if (unlikely(*ptr == '-')) > + return PG_STRTOINT_SYNTAX_ERROR; Hm. Seems like that should return PG_STRTOINT_RANGE_ERROR? > +typedef enum { Please don't define anonyous types (the enum itself, which now is only reachable via the typedef). Also, there's a missing newline here). Greetings, Andres Freund