gortiz commented on code in PR #14337:
URL: https://github.com/apache/pinot/pull/14337#discussion_r1832331941
##########
pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/JsonExtractScalarTransformFunction.java:
##########
@@ -184,8 +191,7 @@ public long[] transformToLongValuesSV(ValueBlock valueBlock) {
if (result instanceof Number) {
_longValuesSV[i] = ((Number) result).longValue();
} else {
- // Handle scientific notation
- _longValuesSV[i] = (long) Double.parseDouble(result.toString());
+ _longValuesSV[i] = Long.parseLong(result.toString());
Review Comment:
> JSON standard does allow that for numbers but not for integers.
That is not precise. In JSON there is only one type of numeric literal:
_number_. A _number_ is defined as `integer fraction exponent`. Therefore you
can write `9007199254740993e0` and it is a valid _number_ whose value is
`9007199254740993`. We have to interpret it as `9007199254740993`, but
`(long) Double.parseDouble("9007199254740993e0")` returns `9007199254740992`,
because `9007199254740993` (2^53 + 1) is not representable as a double.
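To make the precision loss concrete, here is a minimal sketch that parses the same literal both ways. The class and method names are made up for illustration and are not part of Pinot.

```java
import java.math.BigDecimal;

final class DoublePrecisionDemo {
  // Goes through double, like the line removed in this PR.
  static long viaDouble(String json) {
    return (long) Double.parseDouble(json);
  }

  // Goes through BigDecimal, which keeps every decimal digit.
  static long viaBigDecimal(String json) {
    return new BigDecimal(json).longValueExact();
  }

  public static void main(String[] args) {
    String json = "9007199254740993e0"; // 2^53 + 1
    System.out.println(viaDouble(json));     // 9007199254740992
    System.out.println(viaBigDecimal(json)); // 9007199254740993
  }
}
```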
Using `BigInteger`/`BigDecimal` should be correct, but IMHO pretty
expensive. I think it shouldn't be that hard to write our own parser. Something
like:
```java
int e = indexOfExponent(str);
if (e < 0) {
  return Long.parseLong(str);
}
long base = Long.parseLong(str, 0, e, 10);
int exp = Integer.parseInt(str, e + 1, str.length(), 10);
// as suggested in
// https://stackoverflow.com/questions/46983772/fastest-way-to-obtain-a-power-of-10
long powerOfTen = powerOfTen(exp);
// Not sure if checking if the value changed the sign is good enough
return verifyOverflow(base * powerOfTen);
```
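On the overflow question: a sign check alone can miss cases, since a wrapped product may keep the same sign (or even become `0`). A self-contained sketch of the idea above, using `Math.multiplyExact` instead, could look like the following. `JsonLongParser`, `indexOfExponent` and `parse` are hypothetical names, a plain loop stands in for the power-of-ten lookup table, and negative exponents are simply rejected, which may be stricter than what we want.

```java
final class JsonLongParser {
  // Hypothetical helper: index of 'e' or 'E', or -1 if there is no exponent.
  static int indexOfExponent(String str) {
    int e = str.indexOf('e');
    return e >= 0 ? e : str.indexOf('E');
  }

  static long parse(String str) {
    int e = indexOfExponent(str);
    if (e < 0) {
      return Long.parseLong(str);
    }
    // Throws NumberFormatException if the mantissa has a fraction, e.g. "1.5e1".
    long base = Long.parseLong(str, 0, e, 10);
    int exp = Integer.parseInt(str, e + 1, str.length(), 10);
    if (exp < 0) {
      // Assumption of this sketch: negative exponents are not supported.
      throw new NumberFormatException("Negative exponent: " + str);
    }
    long result = base;
    // Stop early on 0 so that inputs like "0e999999999" do not loop for long.
    for (int i = 0; i < exp && result != 0; i++) {
      // multiplyExact throws ArithmeticException instead of silently wrapping.
      result = Math.multiplyExact(result, 10L);
    }
    return result;
  }
}
```

With this, `JsonLongParser.parse("9007199254740993e0")` keeps the last digit, and `JsonLongParser.parse("1e19")` fails with an `ArithmeticException` rather than returning a wrapped value.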
Alternatively, if we are not sure if that algorithm is correct, we can do:
```java
int e = indexOfExponent(str);
if (e < 0) {
  return Long.parseLong(str);
}
if (str.indexOf('.') >= 0) {
  throw new NumberFormatException(str); // or whatever exception fits
}
// longValueExact throws ArithmeticException if the value is not
// representable as a long.
return new BigDecimal(str).longValueExact();
```