[ https://issues.apache.org/jira/browse/PIG-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614940#comment-13614940 ]
Prashant Kommireddi commented on PIG-3259:
------------------------------------------

{quote}
By counting the number of times exception has so far been thrown by .valueOf()
{quote}
I see what you mean. That could be an approach, though the heuristic for determining the threshold could be tricky.

{quote}I wonder if there are good libraries that we can use for the sanity checks, as the decimal check seems bit more complicated{quote}
I will look into whether any such libraries are available. There is a method to check for a Double in the javadoc you pointed to earlier, but it could be more expensive than we want: http://docs.oracle.com/javase/6/docs/api/java/lang/Double.html#valueOf%28java.lang.String%29

> Optimize byte to Long/Integer conversions
> -----------------------------------------
>
>                 Key: PIG-3259
>                 URL: https://issues.apache.org/jira/browse/PIG-3259
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.11, 0.11.1
>            Reporter: Prashant Kommireddi
>            Assignee: Prashant Kommireddi
>             Fix For: 0.12
>
>         Attachments: byteToLong.xlsx
>
>
> These conversions could perform better. If the input is not numeric
> (1234abcd), the code calls Double.valueOf(String) regardless before finally
> returning null. Any script that inadvertently (user's mistake or not) tries
> to cast a non-numeric column to int or long would result in many wasteful
> calls.
> We can avoid this by handling only the cases where we find the input to be a
> decimal number (1234.56), and returning null otherwise even before trying
> Double.valueOf(String).
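
To make the proposal in the description concrete, here is a rough Java sketch of a cheap pre-check that rejects non-numeric input before any parsing is attempted. It is only an illustration, not the code in the Pig source tree or in any patch on this issue: the class and method names (NumericCastSketch, looksLikeDecimal, toLongOrNull) are made up, and the check is deliberately narrower than what Double.valueOf(String) accepts (it assumes exponents, hex literals, NaN/Infinity and type suffixes can be treated as non-numeric).

```java
final class NumericCastSketch {

    // Hypothetical helper: true only for plain decimals such as "1234.56" or "-0.5".
    // Assumption: forms like "1e5", "0x1p3", "NaN" or "12d" are rejected here,
    // even though Double.valueOf(String) would accept them.
    static boolean looksLikeDecimal(String s) {
        int len = s.length();
        int i = 0;
        if (i < len && (s.charAt(i) == '-' || s.charAt(i) == '+')) {
            i++; // optional leading sign
        }
        int digits = 0;
        boolean seenDot = false;
        for (; i < len; i++) {
            char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                digits++;
            } else if (c == '.' && !seenDot) {
                seenDot = true; // at most one decimal point
            } else {
                return false;   // any other character: not a plain decimal
            }
        }
        return digits > 0;
    }

    // Hypothetical conversion: returns null for non-numeric input without
    // throwing or catching any exception on that path.
    static Long toLongOrNull(String s) {
        if (s == null || !looksLikeDecimal(s)) {
            return null;
        }
        try {
            return Long.valueOf(s);
        } catch (NumberFormatException e) {
            // Input has a fractional part or is out of long range;
            // fall back to Double and truncate.
            return Double.valueOf(s).longValue();
        }
    }

    public static void main(String[] args) {
        System.out.println(toLongOrNull("1234abcd"));
        System.out.println(toLongOrNull("1234.56"));
        System.out.println(toLongOrNull("42"));
    }
}
```

With this shape, an input like 1234abcd costs one pass over the characters and no exception, while 1234.56 still reaches Double.valueOf(String). Whether rejecting exponent notation is acceptable would need to be decided against existing scripts.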