Daniel and I were discussing the way Pig currently does these conversions
and whether we could simplify/optimize it further.
    Long ret = null;
    if (sanityCheckIntegerLong(s)) {
        try {
            ret = Long.valueOf(s);
        } catch (NumberFormatException nfe) {
            // ignored; ret stays null
        }
    }
The code checks that every character is a digit (a leading '-' is also
allowed) and only then attempts the conversion to Long.
    private static boolean sanityCheckIntegerLong(String number) {
        for (int i = 0; i < number.length(); i++) {
            char c = number.charAt(i);
            boolean isDigit = c >= '0' && c <= '9';
            boolean isLeadingMinus = i == 0 && c == '-';
            if (!isDigit && !isLeadingMinus) {
                // contains invalid characters, so it cannot be an integer or long
                return false;
            }
        }
        return true;
    }
If the input is not numeric (e.g. 1234abcd), the code still calls
Double.valueOf(String) before finally returning null. Any script
that inadvertently (user's mistake or not) tries to cast an alphanumeric
column to int or long would therefore make many wasteful calls.
I think we can avoid this by only handling the cases where we find the input
to be a decimal number (e.g. 1234.56), and returning null otherwise without
ever calling Double.valueOf(String).
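To make the proposal above concrete, here is a rough sketch of the idea. It is
not the actual Pig code: the class and method names (LongCastSketch,
looksNumeric, castToLong) are made up for illustration, and the fallback to
Double.valueOf when an all-digit string is out of Long range is my assumption
about how to stay close to the current behaviour.

```java
// Hypothetical sketch, not Pig's implementation.
final class LongCastSketch {

    // True if s consists of digits, an optional leading '-', and at most
    // one '.', with at least one digit (e.g. "1234", "-42", "1234.56").
    static boolean looksNumeric(String s) {
        boolean seenDot = false;
        int digits = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                digits++;
            } else if (c == '.' && !seenDot) {
                seenDot = true;
            } else if (c == '-' && i == 0) {
                // leading sign is fine
            } else {
                return false; // invalid character: bail out before any parsing
            }
        }
        return digits > 0;
    }

    static Long castToLong(String s) {
        if (s == null || !looksNumeric(s)) {
            return null; // e.g. "1234abcd": no valueOf call at all
        }
        if (s.indexOf('.') < 0) {
            try {
                return Long.valueOf(s);
            } catch (NumberFormatException nfe) {
                // Assumption: out-of-range integers fall through to the
                // Double path, as the current code appears to do.
            }
        }
        try {
            // Decimal input: parse as double and truncate toward zero.
            return Double.valueOf(s).longValue();
        } catch (NumberFormatException nfe) {
            return null;
        }
    }
}
```

With this, castToLong("1234.56") yields 1234 and castToLong("1234abcd") yields
null without ever reaching Double.valueOf. One thing to check: Double.valueOf
also accepts forms such as "1e5", " 12 " (surrounding whitespace), "NaN" and
"Infinity", all of which this sketch would turn into null up front. Whether
any existing scripts depend on those forms is something I have not verified.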
Thoughts/concerns? Just want to make sure such a change does not break
backward compatibility.
-Prashant