[ 
https://issues.apache.org/jira/browse/HIVE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710693#comment-14710693
 ] 

Gopal V commented on HIVE-11544:
--------------------------------

[~ashutoshc]: the JVM is a strange beast.

This method is written to be easily inlined in C2N (<35 bytes in bytecode).

You're not far off - the first cut did show a performance penalty when I tried 
to use a generic NumberUtils::isNumber() from Apache Commons, which wasn't 
getting inlined.

So I wrote a much smaller version that is good enough for the worst case.
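
For illustration only, here is a minimal sketch of what a small "looks numeric" 
guard in front of the parse could look like. This is not the code from 
HIVE-11544.1.patch: the class name NumericGuard and the methods 
looksLikeInteger() and parseOrNull() are hypothetical, and whether the JIT 
actually inlines the guard would have to be checked on a real JVM.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not the code from the attached patch. */
public final class NumericGuard {
    private NumericGuard() {}

    /**
     * Cheap pre-check: an optional sign followed by at least one ASCII digit.
     * It rejects the common malformed inputs without creating an exception.
     */
    public static boolean looksLikeInteger(byte[] bytes, int start, int length) {
        if (bytes == null || length <= 0) {
            return false;
        }
        int i = start;
        int end = start + length;
        if (bytes[i] == '+' || bytes[i] == '-') {
            i++;
            if (i == end) {
                return false; // a lone '+' or '-'
            }
        }
        for (; i < end; i++) {
            byte b = bytes[i];
            if (b < '0' || b > '9') {
                return false;
            }
        }
        return true;
    }

    /**
     * Returns null for malformed input. The try/catch remains only for the
     * rare inputs that pass the guard but overflow an int.
     */
    public static Integer parseOrNull(byte[] bytes, int start, int length) {
        if (!looksLikeInteger(bytes, start, length)) {
            return null;
        }
        try {
            // Allocating a String keeps the sketch short; a real
            // deserializer would parse the bytes directly.
            return Integer.parseInt(
                new String(bytes, start, length, StandardCharsets.US_ASCII));
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

With a guard like this, a caller can treat a null result the same way the 
existing callers treat a caught exception, while the exception path is only 
reached on overflow.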

In JDK8 at least, this extra code turns out to be faster even for the common 
case. After a few runs, the code generated by the JIT C2N no longer seems to 
contain any new Exception() branches, provided none of the bad cases ever make 
it into the Lazy*::parse() inner loops. Those branches are all replaced by 
uncommon trap markers (Uncommon trap: reason=unreached) instead.

If I get time, I'll add a JMH case under itests/hive-jmh/ to test that on 
different JDKs.

> LazyInteger should avoid throwing NumberFormatException
> -------------------------------------------------------
>
>                 Key: HIVE-11544
>                 URL: https://issues.apache.org/jira/browse/HIVE-11544
>             Project: Hive
>          Issue Type: Improvement
>          Components: Serializers/Deserializers
>    Affects Versions: 0.14.0, 1.2.0, 1.3.0, 2.0.0
>            Reporter: William Slacum
>            Assignee: Gopal V
>            Priority: Minor
>              Labels: Performance
>         Attachments: HIVE-11544.1.patch
>
>
> {{LazyInteger#parseInt}} will throw a {{NumberFormatException}} under these 
> conditions:
> # bytes are null
> # radix is invalid
> # length is 0
> # the string is '+' or '-'
> # {{LazyInteger#parse}} throws a {{NumberFormatException}}
> Most of the time, such as in {{LazyInteger#init}} and {{LazyByte#init}}, the 
> exception is caught, swallowed, and {{isNull}} is set to {{true}}.
> This is generally a bad workflow, as exception creation is a performance 
> bottleneck, and potentially repeating for many rows in a query can have a 
> drastic performance consequence.
> It would be better if this method returned an {{Optional<Integer>}}, which 
> would provide similar functionality with a higher throughput rate.
> I've tested against 0.14.0, and saw that the logic is unchanged in 1.2.0, so 
> I've marked those as affected. Any version in between would also suffer from 
> this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
