[ https://issues.apache.org/jira/browse/SPARK-25234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiangrui Meng updated SPARK-25234:
----------------------------------
    Description:
parallelize uses integer multiplication, which cannot handle sizes over ~47000. This causes issues with lapply.

{code:java}
SparkR:::parallelize(sc, 1:47000, 47000)
Error in rep(start, end - start) : invalid 'times' argument
Error in rep(start, end - start) : invalid 'times' argument
In addition: Warning message:
In x * length(coll) : NAs produced by integer overflow{code}

  was:parallelize uses integer multiplication, which cannot handle size over ~47000.

> SparkR:::parallelize doesn't handle integer overflow properly
> -------------------------------------------------------------
>
>                 Key: SPARK-25234
>                 URL: https://issues.apache.org/jira/browse/SPARK-25234
>             Project: Spark
>          Issue Type: Story
>          Components: SparkR
>    Affects Versions: 2.3.1, 2.4.0
>            Reporter: Xiangrui Meng
>            Priority: Major
>
> parallelize uses integer multiplication, which cannot handle sizes over
> ~47000. This causes issues with lapply.
>
> {code:java}
> SparkR:::parallelize(sc, 1:47000, 47000)
> Error in rep(start, end - start) : invalid 'times' argument
> Error in rep(start, end - start) : invalid 'times' argument
> In addition: Warning message:
> In x * length(coll) : NAs produced by integer overflow{code}
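
The warning in the report points at the product `x * length(coll)`. A minimal sketch of why that product fails near 47000 follows, written in Python rather than R so the arithmetic can be checked exactly. It assumes that `x` is a slice index running from 0 to one less than the slice count and that both operands are R 32-bit integers. The helper names `r_int_mul` and `wide_start` are hypothetical, and the widening shown is only one possible remedy, not necessarily the change made in Spark.

```python
# R integers are 32-bit; a result outside +/-(2**31 - 1) becomes NA with a warning.
INT_MAX = 2**31 - 1


def r_int_mul(a, b):
    """Model R's integer `*`: return None (standing in for NA) on overflow."""
    product = a * b
    return product if abs(product) <= INT_MAX else None


def wide_start(x, length, num_slices):
    """Hypothetical remedy: do the product in double precision before dividing."""
    return int(float(x) * length / num_slices)


length = 47000  # length(coll) in the reported call

# Smallest slice index x for which x * length(coll) no longer fits in 32 bits.
first_bad = next(x for x in range(length) if r_int_mul(x, length) is None)

print(first_bad)                       # 45692
print(r_int_mul(length - 1, length))   # None: the last slice index overflows
print(wide_start(length - 1, length, length))  # 46999
```

Under these assumptions, every slice index from 45692 upward yields NA, which is consistent with `rep(start, end - start)` rejecting its `times` argument.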