[ https://issues.apache.org/jira/browse/SPARK-21143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054707#comment-16054707 ]
Ryan Williams commented on SPARK-21143:
---------------------------------------

[~zsxwing]

bq. it's too risky to upgrade from 4.0.X to 4.1.X

Makes sense; I wasn't meaning to suggest that as the action to take.

bq. The reason you cannot use 4.0.42.Final is because you are using 4.1.X APIs?

I'm depending on [google-cloud-nio|https://github.com/GoogleCloudPlatform/google-cloud-java/tree/v0.10.0/google-cloud-contrib/google-cloud-nio] while running Spark apps in Google Cloud, and it depends transitively on Netty 4.1.6.Final:

{code}
org.hammerlab:google-cloud-nio:jar:0.10.0-alpha
\- com.google.cloud:google-cloud-storage:jar:0.10.0-beta:compile
   \- com.google.cloud:google-cloud-core:jar:0.10.0-alpha:compile
      \- com.google.api:gax:jar:0.4.0:compile
         \- io.grpc:grpc-netty:jar:1.0.3:compile
            +- io.netty:netty-handler-proxy:jar:4.1.6.Final:compile
{code}

Luckily, google-cloud-nio [publishes a shaded JAR|http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22com.google.cloud%22%20AND%20a%3A%22google-cloud-nio%22] with [all dependencies shaded+renamed|https://github.com/GoogleCloudPlatform/google-cloud-java/blob/v0.10.0/google-cloud-contrib/google-cloud-nio/pom.xml#L101-L129], so I can just use that to avoid this conflict.

I mostly read this kind of issue as a nudge toward isolating Spark's classpath more aggressively (by preemptively shading+renaming some or all dependencies), so that these kinds of conflicts can't happen.

[~sowen] thanks for the pointer; good to know that upgrade is in progress.

This is narrowly a Netty 4.0-vs-4.1 conflict, but per the above it could also be read as a shading / classpath-isolation concern. Anyway, feel free to triage as you like; I just doubt I'll be the last person to see these stack traces, and I didn't find any good Google hits for them yet.


> Fail to fetch blocks >1MB in size in presence of conflicting Netty version
> --------------------------------------------------------------------------
>
>                 Key: SPARK-21143
>                 URL: https://issues.apache.org/jira/browse/SPARK-21143
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.1
>            Reporter: Ryan Williams
>            Priority: Minor
>
> One of my Spark libraries inherited a transitive dependency on Netty 4.1.6.Final (vs. Spark's 4.0.42.Final), and I observed a strange failure I wanted to document: fetches of blocks larger than 1MB (pre-compression, afaict) seem to trigger a code path that results in {{AbstractMethodError}}s and ultimately stage failures.
>
> I put a minimal repro in [this github repo|https://github.com/ryan-williams/spark-bugs/tree/netty]: {{collect}} on a 1-partition RDD with 1032 {{Array\[Byte\]}}s of size 1000 works, but with 1033 {{Array}}s it dies in a confusing way.
>
> Not sure what fixing/mitigating this in Spark would look like, other than defensively shading+renaming Netty.
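For convenience, a minimal sketch (in Scala) of the repro from the quoted description above; the canonical version is in the linked spark-bugs repo, and the object name here is illustrative:

{code}
import org.apache.spark.{SparkConf, SparkContext}

// Sketch of the repro described above (the canonical version lives in the
// linked spark-bugs repo). Each element is 1000 bytes, so 1033 elements put
// the collected block just past the ~1MB threshold: 1032 elements work,
// 1033 die with AbstractMethodError-rooted stage failures when a Netty
// 4.1.x JAR sits on the classpath alongside Spark's 4.0.42.Final.
object NettyRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("netty-repro"))
    val numArrays = 1033  // 1032 works, 1033 fails
    sc.parallelize(1 to numArrays, numSlices = 1)
      .map(_ => Array.fill[Byte](1000)(0))
      .collect()
    sc.stop()
  }
}
{code}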
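To apply the shaded-JAR workaround from the comment above in an sbt build, something like the following should work; the {{shaded}} classifier is my assumption about how the artifact is published, so verify it against the Maven Central listing linked above:

{code}
// build.sbt -- sketch: depend on the google-cloud-nio JAR that ships with its
// dependencies (including Netty 4.1.x) shaded+renamed, so nothing collides
// with Spark's Netty 4.0.x. The "shaded" classifier is an assumption; check
// the actual classifier on Maven Central.
libraryDependencies += "com.google.cloud" % "google-cloud-nio" % "0.10.0-alpha" classifier "shaded"
{code}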
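More generally, the "defensively shading+renaming Netty" idea can also be applied on the application side. A hedged sketch using the sbt-assembly plugin's {{ShadeRule}} (assuming the app is deployed as an assembly JAR; the {{my.shaded}} prefix is an arbitrary choice):

{code}
// build.sbt -- sketch, assuming the sbt-assembly plugin is installed and the
// app ships as an assembly JAR. Relocating the app's Netty 4.1.x classes
// under a private package prevents them from shadowing Spark's Netty 4.0.x;
// "my.shaded" is an arbitrary prefix.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("io.netty.**" -> "my.shaded.io.netty.@1").inAll
)
{code}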