[
https://issues.apache.org/jira/browse/KAFKA-17792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17889545#comment-17889545
]
Martin Sillence commented on KAFKA-17792:
-----------------------------------------
I feel there are a few options:
* make the schema explicit
* an exclude list
* a limit on the number of digits
The last is the least intrusive, but possibly the worst in terms of surprises.
To quantify it:
positive exponents:
1e+1 time 0.0 totalMemory 532676608
1e+10 time 0.0 totalMemory 532676608
1e+100 time 0.0 totalMemory 532676608
1e+1000 time 0.0 totalMemory 532676608
1e+10000 time 0.005 totalMemory 532676608
1e+100000 time 0.035 totalMemory 532676608
1e+1000000 time 0.228 totalMemory 532676608
1e+10000000 time 4.308 totalMemory 926941184
1e+100000000 time 117.119 totalMemory 3221225472
1e+1000000000 time 0.0 totalMemory 3221225472 BigInteger would overflow supported range
1e+10000000000 time 0.001 totalMemory 3221225472 Too many nonzero exponent digits.
1e+100000000000 time 0.0 totalMemory 3221225472 Too many nonzero exponent digits.
1e+1000000000000 time 0.0 totalMemory 3221225472 Too many nonzero exponent digits.
1e+10000000000000 time 0.0 totalMemory 3221225472 Too many nonzero exponent digits.
negative exponents:
1e-1 time 0.001 totalMemory 532676608
1e-10 time 0.0 totalMemory 532676608
1e-100 time 0.001 totalMemory 532676608
1e-1000 time 0.0 totalMemory 532676608
1e-10000 time 0.005 totalMemory 532676608
1e-100000 time 0.034 totalMemory 532676608
1e-1000000 time 0.242 totalMemory 532676608
1e-10000000 time 4.342 totalMemory 926941184
1e-100000000 time 121.199 totalMemory 3368026112
1e-1000000000 time 0.0 totalMemory 3368026112 BigInteger would overflow supported range
1e-10000000000 time 0.0 totalMemory 3368026112 Too many nonzero exponent digits.
1e-100000000000 time 0.0 totalMemory 3368026112 Too many nonzero exponent digits.
1e-1000000000000 time 0.0 totalMemory 3368026112 Too many nonzero exponent digits.
1e-10000000000000 time 0.0 totalMemory 3368026112 Too many nonzero exponent digits.
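A rough sketch of how timings like these could be reproduced with plain java.math.BigDecimal, outside Kafka. This is not the harness that produced the numbers above, which is not shown here. The class and method names are made up for illustration, and the setScale(0, RoundingMode.CEILING) call is an assumption based on the BigDecimal.setScale frame in the issue's stack trace, not a copy of the Connect code.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class ExponentTimingSketch {

    // Parse the token, then force scale 0. Parsing alone is cheap because
    // BigDecimal stores only the unscaled value and the scale; setScale(0, ...)
    // is the step that materialises every digit of a large exponent.
    static BigDecimal expand(String token) {
        BigDecimal parsed = new BigDecimal(token);
        return parsed.setScale(0, RoundingMode.CEILING);
    }

    // Wall-clock seconds spent in expand() for one token.
    static double timeExpansionSeconds(String token) {
        long start = System.nanoTime();
        expand(token);
        return (System.nanoTime() - start) / 1e9;
    }

    public static void main(String[] args) {
        // Exponents are kept small here so the sketch finishes quickly;
        // raise the bound with care, the cost grows steeply.
        for (int exp = 1; exp <= 100_000; exp *= 10) {
            String token = "1e+" + exp;
            System.out.printf("%s time %.3f totalMemory %d%n",
                    token,
                    timeExpansionSeconds(token),
                    Runtime.getRuntime().totalMemory());
        }
    }
}
```

Absolute times will differ by machine and JVM settings; only the growth pattern is expected to be comparable.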
So 1e+1000000 and 1e-1000000 seem to be where things start to get more expensive.
We have two choices then: either do not process such values as exact, or leave
them as strings.
Leaving them as strings seems likely to break things unexpectedly, but quickly.
Not being exact sounds like it would lead to subtle errors. For us, we really
don't want our header to be rounded: it's not a number.
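To make the "limit on the number of digits" option concrete, here is a minimal sketch in plain Java. It is not a patch against Kafka's Values class. The class name, the method names, and the cap of 1000 digits are all hypothetical. The idea is to work out how many digits the value would need in plain notation from BigDecimal's precision and scale alone, and to fall back to the original string when that exceeds the cap.

```java
import java.math.BigDecimal;

class DigitLimitSketch {

    // Hypothetical cap on expanded digits; not an existing Kafka setting.
    static final long MAX_EXPANDED_DIGITS = 1_000;

    // Digits needed to write the value without an exponent. Uses only
    // precision() and scale(), so nothing is expanded to find out.
    static long expandedDigits(BigDecimal d) {
        long integerDigits = Math.max((long) d.precision() - d.scale(), 0);
        long fractionDigits = Math.max(d.scale(), 0);
        return integerDigits + fractionDigits;
    }

    // Returns a BigDecimal for modest numbers, otherwise the original string.
    static Object parseOrKeepString(String token) {
        BigDecimal parsed;
        try {
            parsed = new BigDecimal(token);
        } catch (NumberFormatException e) {
            return token; // e.g. "74320e6e26adc8f8" is not a number at all
        }
        return expandedDigits(parsed) > MAX_EXPANDED_DIGITS ? token : parsed;
    }

    public static void main(String[] args) {
        String[] samples = {"74320e6e26adc8f8", "407127e212797209", "12345"};
        for (String s : samples) {
            Object result = parseOrKeepString(s);
            System.out.println(s + " -> " + result.getClass().getSimpleName());
        }
    }
}
```

Under this sketch the span ID "407127e212797209" stays a string, because its plain form would need 212797215 digits, while an ordinary value such as "12345" still becomes a number. The surprise mentioned above remains: whether a header arrives as a string or as a number depends on its value.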
> header parsing ends up timing out and using large quantities of memory if the
> string looks like a number
> --------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-17792
> URL: https://issues.apache.org/jira/browse/KAFKA-17792
> Project: Kafka
> Issue Type: Bug
> Components: connect
> Reporter: Martin Sillence
> Priority: Major
>
> We have trace headers such as:
> "X-B3-SpanId": "74320e6e26adc8f8"
> If, however, the value happens to be "407127e212797209", it is treated as a
> numeric value, and the parser tries to convert it to an exact numeric
> representation using BigDecimal.
> We end up with the trace:
> BigDecimal.setScale(int, RoundingMode) line: 2876
> Values$ValueParser.parseAsExactDecimal(BigDecimal) line: 1044
> Values$ValueParser.parseAsNumber(String) line: 1025
> Values$ValueParser.parseNextToken(boolean, String) line: 892
> Values$ValueParser.parse(boolean) line: 875
> Values.parseString(String) line: 415
> SimpleHeaderConverter.toConnectHeader(String, String, byte[]) line: 68
> WorkerSinkTask.convertHeadersFor(ConsumerRecord<byte[],byte[]>) line: 578
>
> This takes a long time, because it tries to build an exact representation of
> an integer with about 212 million digits (the exponent is 212797209).