[
https://issues.apache.org/jira/browse/PIG-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613528#comment-13613528
]
Prashant Kommireddi commented on PIG-3259:
------------------------------------------
{quote} The check you have here does not accept all valid double string
representations {quote} - thanks for noticing that.
{quote} One way to avoid performance degradation for 'correct' case would be to
start by doing .valueOf() without checks, then use the number of non-numbers
encountered to decide if want to be making the sanityCheckIntegerLongDecimal()
calls {quote} - I am not clear on the advantage here. How would we count the
non-numbers without calling sanityCheck..() on every value?
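If the idea is to count the values that .valueOf() itself rejects (via NumberFormatException) and only switch on the pre-check once that count passes some threshold, it might look roughly like the sketch below. This is only one possible reading of the suggestion: AdaptiveLongConverter, failureThreshold and plausiblyNumeric() are hypothetical names, and plausiblyNumeric() is merely a stand-in for sanityCheckIntegerLongDecimal(), not the code in the patch.

```java
// Hypothetical sketch; names and threshold are illustrative, not Pig's actual code.
final class AdaptiveLongConverter {
    private final int failureThreshold;
    private int failures = 0;

    AdaptiveLongConverter(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    /** Returns the value as a Long (fraction truncated), or null if it is not numeric. */
    Long convert(String s) {
        if (s == null) {
            return null;
        }
        // Only after enough failed conversions do we pay for the pre-check.
        if (failures >= failureThreshold && !plausiblyNumeric(s)) {
            return null;
        }
        try {
            return Long.valueOf(s);
        } catch (NumberFormatException e) {
            try {
                return Double.valueOf(s).longValue();
            } catch (NumberFormatException e2) {
                failures++; // counted without any sanity check call
                return null;
            }
        }
    }

    int failures() {
        return failures;
    }

    // Stand-in for sanityCheckIntegerLongDecimal(): optional sign, digits, at most one '.'.
    private static boolean plausiblyNumeric(String s) {
        boolean seenDigit = false;
        boolean seenDot = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                seenDigit = true;
            } else if (c == '.' && !seenDot) {
                seenDot = true;
            } else if ((c == '-' || c == '+') && i == 0) {
                // a leading sign is allowed
            } else {
                return false;
            }
        }
        return seenDigit;
    }
}
```

With clean input the pre-check is never run, so the 'correct' case pays nothing extra. One caveat of this sketch: the stand-in check is stricter than Double.valueOf() (it rejects exponent notation such as 1e3), so results could differ before and after the threshold is reached.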
> Optimize byte to Long/Integer conversions
> -----------------------------------------
>
> Key: PIG-3259
> URL: https://issues.apache.org/jira/browse/PIG-3259
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11, 0.11.1
> Reporter: Prashant Kommireddi
> Assignee: Prashant Kommireddi
> Fix For: 0.12
>
> Attachments: byteToLong.xlsx
>
>
> These conversions could perform better. If the input is not numeric
> (1234abcd), the code still calls Double.valueOf(String) before finally
> returning null. Any script that inadvertently (user's mistake or not) tries
> to cast a non-numeric column to int or long results in many wasteful
> calls.
> We can avoid this by handling only the cases where the input is a decimal
> number (1234.56) and returning null otherwise, before ever calling
> Double.valueOf(String).
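A minimal sketch of the pre-check described in the issue, assuming a simple character scan is enough to recognize a decimal number. NumericPrecheck, looksLikePlainDecimal() and toLong() are hypothetical names used for illustration; this is not the attached patch.

```java
// Hypothetical helper; illustrates the idea only, not Pig's actual implementation.
final class NumericPrecheck {
    private NumericPrecheck() {}

    /** True for an optional leading sign, digits, and at most one '.', with at least one digit. */
    static boolean looksLikePlainDecimal(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        int i = (s.charAt(0) == '-' || s.charAt(0) == '+') ? 1 : 0;
        boolean seenDigit = false;
        boolean seenDot = false;
        for (; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                seenDigit = true;
            } else if (c == '.' && !seenDot) {
                seenDot = true;
            } else {
                return false; // e.g. the 'a' in 1234abcd
            }
        }
        return seenDigit;
    }

    /** Converts to Long, truncating any fraction; returns null for non-numeric input. */
    static Long toLong(String s) {
        if (s == null) {
            return null;
        }
        try {
            return Long.valueOf(s);
        } catch (NumberFormatException e) {
            // Skip Double.valueOf(String) entirely unless the input looks like a decimal.
            if (!looksLikePlainDecimal(s)) {
                return null;
            }
            try {
                return Double.valueOf(s).longValue();
            } catch (NumberFormatException e2) {
                return null;
            }
        }
    }
}
```

Here toLong("1234.56") yields 1234 and toLong("1234abcd") yields null without ever reaching Double.valueOf(String). The scan is deliberately narrow: it also returns null for strings that Double.valueOf(String) would accept, such as exponent notation (1e3).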