Paulo Magalhaes created SPARK-12479:
---------------------------------------
             Summary: sparkR collect on GroupedData throws R error "missing value where TRUE/FALSE needed"
                 Key: SPARK-12479
                 URL: https://issues.apache.org/jira/browse/SPARK-12479
             Project: Spark
          Issue Type: Bug
          Components: R, SparkR
    Affects Versions: 1.5.1
            Reporter: Paulo Magalhaes


sparkR collect on GroupedData throws "missing value where TRUE/FALSE needed"

Spark Version: 1.5.1
R Version: 3.2.2

I tracked down the root cause of this exception to a specific key for which the hashCode could not be calculated.

The following code recreates the problem when run in sparkR:

    hashCode <- getFromNamespace("hashCode", "SparkR")
    hashCode("bc53d3605e8a5b7de1e8e271c2317645")
    Error in if (value > .Machine$integer.max) { :
      missing value where TRUE/FALSE needed

I went one step further and realised that the problem happens because the bitwise shift below returns NA:

    bitwShiftL(-1073741824, 1)

where bitwShiftL is an R function.
I believe the bitwShiftL function is working as it is supposed to. Therefore, my PR will fix it in the SparkR package.
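
For illustration only, here is a minimal sketch of how a Java-style string hash could be computed in R without bit shifts, so that no intermediate value can become NA. It is not the SparkR implementation and not necessarily what the PR does. The names wrap_int32 and string_hash are hypothetical, and the sketch assumes the key is hashed byte by byte, which matches Java's String.hashCode only for ASCII keys. One plausible reading of the failure is that -1073741824 doubled is -2147483648, the 32-bit pattern R reserves for NA_integer_, and a comparison against NA cannot be used as an if() condition.

```r
# Hypothetical helper names; this is not the SparkR implementation.

# Wrap a double into the signed 32-bit range [-2^31, 2^31).
wrap_int32 <- function(x) {
  ((x + 2^31) %% 2^32) - 2^31
}

# Java-style string hash, h = 31 * h + byte, computed in doubles
# so that no step relies on bitwShiftL.
string_hash <- function(key) {
  h <- 0
  for (code in as.integer(charToRaw(key))) {
    h <- wrap_int32(31 * h + code)
  }
  h
}

# An NA condition is what raises the error quoted in the report.
msg <- tryCatch(
  if (NA_integer_ > .Machine$integer.max) "unreachable",
  error = function(e) conditionMessage(e)
)
print(msg)

print(string_hash("abc"))  # 96354
print(is.na(string_hash("bc53d3605e8a5b7de1e8e271c2317645")))
```

The sketch returns a double rather than an integer on purpose: a hash of exactly -2^31 is a valid 32-bit value, but as.integer() would turn it into NA, so any conversion to integer would need to handle that case separately.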