[jira] [Commented] (SPARK-1718) pyspark doesn't work with assembly jar containing over 65536 files/dirs built on redhat
[ https://issues.apache.org/jira/browse/SPARK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029422#comment-14029422 ]

Thomas Graves commented on SPARK-1718:
--------------------------------------

So this actually appears to be caused by building with jdk7 on redhat. If I switch back to jdk6 for the build, then the build works and pyspark works. Note that in both cases I'm running with jdk7, so it doesn't appear to be the same as SPARK-1520.

> pyspark doesn't work with assembly jar containing over 65536 files/dirs built on redhat
> ---------------------------------------------------------------------------------------
>
>              Key: SPARK-1718
>              URL: https://issues.apache.org/jira/browse/SPARK-1718
>          Project: Spark
>       Issue Type: Bug
>       Components: PySpark
> Affects Versions: 1.0.0
>         Reporter: Thomas Graves
>
> Recently pyspark was ported to yarn (pr 30), but when I went to try it I
> couldn't get it to work. I was building on a redhat 6 box. I figured out that
> if the assembly jar file contained over 65536 files/directories then it
> wouldn't work. If I unjarred the assembly and removed some stuff to get it
> under 65536 and jarred it back up, then it would work.
> It appears to only be an issue when building on a redhat box, as I can build
> on my mac and it works just fine there.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-1718) pyspark doesn't work with assembly jar containing over 65536 files/dirs built on redhat
[ https://issues.apache.org/jira/browse/SPARK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989906#comment-13989906 ]

Sean Owen commented on SPARK-1718:
----------------------------------

Yeah, I may not be adding anything here. I suppose I'd just advise double-checking what's being used to build, to run, and anything in between (like zip or jar). For example, does the python-related build zip or jar anything? (I don't know that part of the build.) That could reintroduce the problem if something outside of Java land is not using the zip64 format.
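To make the zip64 question above concrete, here is a minimal Python sketch (not part of the Spark build) that reports how many entries an archive holds and whether it carries a zip64 end-of-central-directory locator. The function name `inspect_archive` and the returned keys are made up for illustration, and it assumes a single, well-formed archive file.

```python
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"           # classic end-of-central-directory record
ZIP64_LOCATOR_SIG = b"PK\x06\x07"  # zip64 end-of-central-directory locator
ZIP64_LOCATOR_LEN = 20             # the locator sits directly before the EOCD
MAX_EOCD_LEN = 22 + 65535          # fixed EOCD fields plus the longest comment


def inspect_archive(path):
    """Return the entry count, the 16-bit count stored in the classic EOCD
    record, and whether a zip64 locator is present (hypothetical helper)."""
    # Let the zipfile module count entries; it understands zip64 archives.
    with zipfile.ZipFile(path) as zf:
        entries = len(zf.infolist())

    # Read just the tail of the file, where the end records live.
    with open(path, "rb") as fh:
        fh.seek(0, 2)
        size = fh.tell()
        tail_len = min(size, MAX_EOCD_LEN + ZIP64_LOCATOR_LEN)
        fh.seek(size - tail_len)
        tail = fh.read(tail_len)

    eocd_pos = tail.rfind(EOCD_SIG)
    if eocd_pos < 0:
        raise ValueError("no end-of-central-directory record found")

    # The total entry count is a 2-byte little-endian field at offset 10,
    # so the classic record cannot express more than 65535 entries.
    (eocd_count,) = struct.unpack("<H", tail[eocd_pos + 10:eocd_pos + 12])

    locator_start = eocd_pos - ZIP64_LOCATOR_LEN
    zip64 = (locator_start >= 0
             and tail[locator_start:locator_start + 4] == ZIP64_LOCATOR_SIG)
    return {"entries": entries, "eocd_count": eocd_count, "zip64": zip64}
```

Running this against the assembly jar from each build machine, and again after any unjar/rejar step, would show whether the archive crossed the 65535-entry limit and whether the tool that wrote it switched to zip64.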
[jira] [Commented] (SPARK-1718) pyspark doesn't work with assembly jar containing over 65536 files/dirs built on redhat
[ https://issues.apache.org/jira/browse/SPARK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989903#comment-13989903 ]

Thomas Graves commented on SPARK-1718:
--------------------------------------

I am running with jdk7 and building with jdk7.
[jira] [Commented] (SPARK-1718) pyspark doesn't work with assembly jar containing over 65536 files/dirs built on redhat
[ https://issues.apache.org/jira/browse/SPARK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989892#comment-13989892 ]

Sean Owen commented on SPARK-1718:
----------------------------------

I could be mistaken here. The primary problem combination is building with 7 and running with 6. But I also thought I understood that building or running with an older JDK 6 could be a problem too. (That, I am not 100% sure of.)

If you are running with 6, then you definitely don't want to build with 7. (The source/target can't be set to 7 in this case; either you build with 6 and it balks, or you successfully build with 7 but at runtime, 6 won't accept the bytecode.)

It sounds like you are building with 7, then? But is that your Mac build? If your RedHat build is using JDK 7, then I think this is just the same problem as in SPARK-1520 and you should use JDK 6 to build on that machine. (Keep in mind that unzipping / rezipping, and unjarring / rejarring, might affect the result, as it affects the format of the .jar file! Worth noting whether that alone is causing or solving the issue.)

If you are sure you're building with 6, then my next question would be whether it's actually building with an older JDK 6, whether that can be upgraded, and whether that resolves it.

Running on JDK 7 should be fine either way. I wasn't clear whether Andrew was saying that didn't work either: https://github.com/apache/spark/pull/30#issuecomment-42057384

But I assume the question is how to get it running on 6.
[jira] [Commented] (SPARK-1718) pyspark doesn't work with assembly jar containing over 65536 files/dirs built on redhat
[ https://issues.apache.org/jira/browse/SPARK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989854#comment-13989854 ]

Thomas Graves commented on SPARK-1718:
--------------------------------------

I can build the jar on my mac, copy it over to the same redhat boxes I run on, and it works fine. If the runtime environment were using jdk6, that wouldn't work. I assume you are saying that if you build the jar with jdk6 and try to run on jdk7 it has the same issue?

I also checked the MANIFEST to verify it was built with jdk7.

    $ cat MANIFEST.MF
    Build-Jdk: 1.7.0_25

I also went in and changed the pom.xml to use java version 1.7 as source and target, but that doesn't look like it's working: when I check the .class files for the major version it comes back as 50 (jdk6), so perhaps this is what is causing the issue.

It could still be that something in my environment is causing it, but I haven't yet figured out what, so I wanted to file a jira to track the issue. I took Andrew's comment to mean that he also tried it and ran into the same issue, but perhaps I misunderstood.

Do you happen to have a redhat box you could try it on?
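The two checks described above (the `Build-Jdk` manifest entry and the class-file major version) can be scripted. The following is only a sketch under stated assumptions: the helper names `build_jdk` and `class_major_versions` are hypothetical, and the jar is assumed to keep its manifest at the standard `META-INF/MANIFEST.MF` path.

```python
import struct
import zipfile

CLASS_MAGIC = 0xCAFEBABE  # first four bytes of every Java class file


def build_jdk(jar_path):
    """Return the Build-Jdk value from the jar manifest, or None if absent."""
    with zipfile.ZipFile(jar_path) as jar:
        try:
            manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8", "replace")
        except KeyError:
            return None  # the jar has no manifest
    for line in manifest.splitlines():
        if line.startswith("Build-Jdk:"):
            return line.split(":", 1)[1].strip()
    return None


def class_major_versions(jar_path):
    """Count .class entries per class-file major version (50 = Java 6, 51 = Java 7)."""
    counts = {}
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not name.endswith(".class"):
                continue
            with jar.open(name) as fh:
                header = fh.read(8)
            if len(header) < 8:
                continue  # too short to be a class file
            # Header layout: u4 magic, u2 minor_version, u2 major_version (big-endian).
            magic, _minor, major = struct.unpack(">IHH", header)
            if magic == CLASS_MAGIC:
                counts[major] = counts.get(major, 0) + 1
    return counts
```

Since an assembly jar bundles classes from many dependencies, a mix of major versions is expected; the per-version counts only say which targets the contained classes were compiled for, not which JDK ran the build.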
[jira] [Commented] (SPARK-1718) pyspark doesn't work with assembly jar containing over 65536 files/dirs built on redhat
[ https://issues.apache.org/jira/browse/SPARK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989666#comment-13989666 ]

Sean Owen commented on SPARK-1718:
----------------------------------

Yeah, but it seems like it could well be that the JDK binaries being used during the build aren't quite what is expected, because some home or path variable points to JDK 6. That was the substance of Patrick's last comment, and I wasn't sure whether Andrew was definitely confirming that the build happened with Java 7, or just that it was installed. (?)

I suppose it could also be some old version of zip being used to re-zip jars and such, though that strikes me as less likely, but hey.
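As a rough way to check the "home or path variable" possibility, the sketch below lists which `java` a build environment would resolve. The function name `jdk_candidates` and the returned keys are hypothetical; it only inspects `JAVA_HOME` and `PATH`, which is an assumption about where the build looks, and build tools may consult other settings as well.

```python
import os
import shutil


def jdk_candidates(env=None):
    """Report the java binary implied by JAVA_HOME and the one found on PATH.

    `env` is a mapping of environment variables; it defaults to os.environ.
    If `env` has no PATH entry, shutil.which falls back to the process PATH.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    return {
        "JAVA_HOME": java_home,
        # Where a JAVA_HOME-driven tool would expect the launcher to be.
        "java_home_bin": os.path.join(java_home, "bin", "java") if java_home else None,
        # First executable named "java" on the given PATH, or None.
        "java_on_path": shutil.which("java", path=env.get("PATH")),
    }
```

If `java_home_bin` and `java_on_path` point into different JDK installations, the build and the shell may not be using the same JDK, which is the situation described above.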
[jira] [Commented] (SPARK-1718) pyspark doesn't work with assembly jar containing over 65536 files/dirs built on redhat
[ https://issues.apache.org/jira/browse/SPARK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989658#comment-13989658 ]

Thomas Graves commented on SPARK-1718:
--------------------------------------

No, see the discussion on https://github.com/apache/spark/pull/30. This happens if you build on a redhat box and you both build and run with jdk7.
[jira] [Commented] (SPARK-1718) pyspark doesn't work with assembly jar containing over 65536 files/dirs built on redhat
[ https://issues.apache.org/jira/browse/SPARK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989642#comment-13989642 ]

Sean Owen commented on SPARK-1718:
----------------------------------

This is the same issue as https://issues.apache.org/jira/browse/SPARK-1520, right?