[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215345#comment-15215345 ]

Mohit Sabharwal commented on PIG-4837:
--------------------------------------

+1 (non-binding) for PIG-4837_3.patch

> TestNativeMapReduce test fix
> ----------------------------
>
>                 Key: PIG-4837
>                 URL: https://issues.apache.org/jira/browse/PIG-4837
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>             Fix For: spark-branch
>
>         Attachments: PIG-4837.patch, PIG-4837_2.patch, PIG-4837_3.patch, build23.PNG, build27.PNG
>

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215102#comment-15215102 ]

Mohit Sabharwal commented on PIG-4837:
--------------------------------------

I agree with [~pallavi.rao]. Running an MR job in Spark mode should not be our priority. We may want to support such a "mixed mode" in the future. My vote would be to a) add it to test/excluded-tests-spark and b) add a comment there with a reference to this JIRA.
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213994#comment-15213994 ]

Xianda Ke commented on PIG-4837:
--------------------------------

I also did some investigation on this issue: YARN's launchContainer uses ProcessBuilder to start a process. ProcessBuilder concatenates all the environment variables and passes them to the JDK's native interface UNIXProcess.forkAndExec(), which in turn calls the OS interface. If the concatenated environment variables are too long, the OS returns an error.

Hmm... It is a bit weird that the ARG_MAX of the Jenkins build server is big but the test still failed. It also works on my machine.

I agree with [~kellyzly] and [~pallavi.rao]. It is a configuration or environment problem, not a significant issue.
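To make the size argument above concrete, here is a minimal Java sketch that estimates how many bytes the environment of a child process would occupy once each entry is flattened to `KEY=VALUE` plus a terminating NUL. The class `EnvBlockSize` and its methods are hypothetical names, not part of Pig, Hadoop, or the JDK. The UTF-8 byte count is only an approximation of what the JVM and the kernel actually account for; pointer overhead and the argument vector are ignored.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Pig, Hadoop, or the JDK. */
final class EnvBlockSize {

    /** Approximate bytes used by all "KEY=VALUE\0" entries of the given environment. */
    static long environmentBytes(Map<String, String> env) {
        long total = 0;
        for (Map.Entry<String, String> e : env.entrySet()) {
            total += entryBytes(e) + 1; // +1 for the terminating NUL
        }
        return total;
    }

    /** Bytes of the longest single "KEY=VALUE" entry (without the NUL). */
    static long longestEntryBytes(Map<String, String> env) {
        long longest = 0;
        for (Map.Entry<String, String> e : env.entrySet()) {
            longest = Math.max(longest, entryBytes(e));
        }
        return longest;
    }

    private static long entryBytes(Map.Entry<String, String> e) {
        return e.getKey().getBytes(StandardCharsets.UTF_8).length
                + 1 // the '=' separator
                + e.getValue().getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // environment() returns a copy of this JVM's environment; nothing is started here.
        Map<String, String> env = new ProcessBuilder("bash").environment();
        System.out.println("entries:            " + env.size());
        System.out.println("approx. total bytes: " + environmentBytes(env));
        System.out.println("longest entry bytes: " + longestEntryBytes(env));
    }
}
```

Comparing the printed total with the output of `getconf ARG_MAX` on the build slave would give a rough idea of how close a container launch is to the limit.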
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213980#comment-15213980 ]

Pallavi Rao commented on PIG-4837:
----------------------------------

TestNativeMapReduce tests the MAPREDUCE operator, which really has no significance in the Spark world. I would prefer removing it from test/spark-tests and adding it to test/excluded-tests-spark. That way, the builds will be clean.
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213970#comment-15213970 ]

liyunzhang_intel commented on PIG-4837:
---------------------------------------

[~xuefuz], [~pallavi.rao], [~mohitsabharwal] and [~kexianda]: TestNativeMapReduce fails on the Jenkins build but passes on my own and Pallavi's Jenkins (see build27.png). Maybe it is related to the environment. Here are some proposals:
1. Remove it from test/spark-tests.
2. Leave it, like TestLoad, which randomly fails in trunk (https://builds.apache.org/job/Pig-trunk-commit/2305/testReport/junit/org.apache.pig.test/TestLoad/testCommaSeparatedString2/).
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213914#comment-15213914 ]

liyunzhang_intel commented on PIG-4837:
---------------------------------------

When I checked the code, nothing related to TestNativeMapReduce had changed except that we updated the Hadoop version from 2.5 to 2.6 in ivy/libraries.properties:
{code}
hadoop-common.version=2.6.0
hadoop-hdfs.version=2.6.0
hadoop-mapreduce.version=2.6.0
{code}

The error is thrown when YARN executes container-launch.sh, and it seems that the maximum argument length allowed for bash is exceeded.
{code}
Exception from container-launch.
Container id: container_1458599533105_0001_01_04
Exit code: 0
Exception message: Cannot run program "bash" (in directory "/x1/jenkins/jenkins-slave/workspace/Pig-spark/target/PigMiniCluster/PigMiniCluster-localDir-nm-0_0/usercache/jenkins/appcache/application_1458599533105_0001/container_1458599533105_0001_01_04"): error=7, Argument list too long
Stack trace: java.io.IOException: Cannot run program "bash" (in directory "/x1/jenkins/jenkins-slave/workspace/Pig-spark/target/PigMiniCluster/PigMiniCluster-localDir-nm-0_0/usercache/jenkins/appcache/application_1458599533105_0001/container_1458599533105_0001_01_04"): error=7, Argument list too long
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:485)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: error=7, Argument list too long
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1022)
    ... 10 more
{code}

The container-launcher.sh is like:
{code}
#!/bin/bash
export HADOOP_CONF_DIR="/home/zly/prj/oss/hadoop-2.6.0/etc/hadoop"
export MAX_APP_ATTEMPTS="2"
export JAVA_HOME="/usr/java/jdk1.8.0_20"
export APP_SUBMIT_TIME_ENV="1458808716291"
export NM_HOST="zly1.sh.intel.com"
export LD_LIBRARY_PATH="$PWD"
export HADOOP_HDFS_HOME="/home/zly/prj/oss/hadoop-2.6.0"
export LOGNAME="root"
export JVM_PID="$$"
export PWD="/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1458806299263_0002/container_1458806299263_0002_01_01"
export HADOOP_COMMON_HOME="/home/zly/prj/oss/hadoop-2.6.0"
export LOCAL_DIRS="/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1458806299263_0002"
export APPLICATION_WEB_PROXY_BASE="/proxy/application_1458806299263_0002"
export SHELL="/bin/bash"
export NM_HTTP_PORT="8042"
export LOG_DIRS="/home/zly/prj/oss/hadoop-2.6.0/logs/userlogs/application_1458806299263_0002/container_1458806299263_0002_01_01"
export NM_AUX_SERVICE_mapreduce_shuffle="AAA0+gA= "
export NM_PORT="34884"
export USER="root"
export HADOOP_YARN_HOME="/home/zly/prj/oss/hadoop-2.6.0"
export CLASSPATH="$PWD:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/share/hadoop/common/*:$HADOOP_COMMON_HOME/share/hadoop/common/lib/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*:$HADOOP_YARN_HOME/share/hadoop/yarn/*:$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*:job.jar/job.jar:job.jar/classes/:job.jar/lib/*:$PWD/*"
export HADOOP_TOKEN_FILE_LOCATION="/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1458806299263_0002/container_1458806299263_0002_01_01/container_tokens"
export HOME="/home/"
export CONTAINER_ID="container_1458806299263_0002_01_01"
export MALLOC_ARENA_MAX="4"
ln -sf "/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1458806299263_0002/filecache/13/job.xml" "job.xml"
hadoop_shell_errorcode=$?
if [ $hadoop_shell_errorcode -ne 0 ]
then
  exit $hadoop_shell_errorcode
fi
mkdir -p jobSubmitDir
hadoop_shell_errorcode=$?
if [ $hadoop_shell_errorcode -ne 0 ]
then
  exit $hadoop_shell_errorcode
fi
ln -sf "/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1458806299263_0002/filecache/12/job.split" "jobSubmitDir/job.split"
hadoo
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213860#comment-15213860 ]

Dapeng Sun commented on PIG-4837:
---------------------------------

[~kellyzly], Okay, I will change "Label Expression" from "Ubuntu" to "Hadoop". Please let me know if it works for you.
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213858#comment-15213858 ]

liyunzhang_intel commented on PIG-4837:
---------------------------------------

[~sundapeng]: Since you know how to change the build machine, could you help change it?
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213784#comment-15213784 ]

Pallavi Rao commented on PIG-4837:
----------------------------------

[~xuefuz], I generally file an Apache INFRA ticket for things like this. Maybe they will be able to help here too.
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213292#comment-15213292 ]

Xuefu Zhang commented on PIG-4837:
----------------------------------

Hi [~kellyzly], I don't know how to switch the build machine. Do you have any idea who can do that?
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213125#comment-15213125 ]

liyunzhang_intel commented on PIG-4837:
---------------------------------------

[~xuefuz]: I guess this failure is related to the build environment. The unit test passes on both my Jenkins and Pallavi's. Can we use another build machine, such as H4 (https://builds.apache.org/computer/H4/), instead of Ubuntu3 to run the unit test?
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200866#comment-15200866 ]

Xianda Ke commented on PIG-4837:
--------------------------------

Hi [~kellyzly], how about moving the static function executeCommand() to a utility class, such as org.apache.pig.impl.util.Utils or SparkUtil?
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200926#comment-15200926 ]

liyunzhang_intel commented on PIG-4837:
---------------------------------------

[~xuefuz]: This issue cannot be reproduced on my own Jenkins now. Please first check in PIG-4837.patch so we can output the ARG_MAX value in the log. If the ARG_MAX of the Jenkins server is very small, the failure is expected. If ARG_MAX is big, like 262144, this may be a random issue.
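A check that logs ARG_MAX from inside a test run could look roughly like the following. This is a sketch only: `ArgMaxProbe`, `firstLineOf` and `parseArgMax` are made-up names and are not the `executeCommand()` helper from PIG-4837.patch, whose code is not shown in this thread. It assumes `getconf` is on the PATH of the build machine and returns -1 when the value cannot be determined.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not the code from PIG-4837.patch. */
final class ArgMaxProbe {

    /** Runs a command and returns the first line of its output, or null on any failure. */
    static String firstLineOf(String... command) {
        try {
            Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
                String first = r.readLine();
                while (r.readLine() != null) {
                    // drain remaining output so the child cannot block on a full pipe
                }
                return p.waitFor() == 0 ? first : null;
            }
        } catch (IOException e) {
            return null; // command missing or could not be started
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    /** Parses the output of "getconf ARG_MAX"; returns -1 if it is not a number. */
    static long parseArgMax(String output) {
        if (output == null) {
            return -1;
        }
        try {
            return Long.parseLong(output.trim());
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        long argMax = parseArgMax(firstLineOf("getconf", "ARG_MAX"));
        System.out.println("ARG_MAX = " + argMax);
    }
}
```

Returning a sentinel instead of throwing keeps the probe from failing the test on machines where `getconf` is unavailable; the value is only diagnostic.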
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200990#comment-15200990 ]

Xuefu Zhang commented on PIG-4837:
----------------------------------

Committed. Thanks, Liyun! I will keep this JIRA open for now.
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200804#comment-15200804 ]

liyunzhang_intel commented on PIG-4837:
---------------------------------------

[~pallavi.rao]: Yes, I also hit the "Argument list too long" problem on my Jenkins server. But when I used the attached patch to output the ARG_MAX value of my Jenkins server from the program, it showed that ARG_MAX is 2621440 and the error disappeared. All the TestNativeMapReduce unit tests pass on my Jenkins (see build23.png).
[jira] [Commented] (PIG-4837) TestNativeMapReduce test fix
[ https://issues.apache.org/jira/browse/PIG-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197234#comment-15197234 ]

Pallavi Rao commented on PIG-4837:
----------------------------------

These tests pass on my machine. On the build machine, the error is:
{noformat}
Stack trace: java.io.IOException: Cannot run program "bash" (in directory "/home/jenkins/jenkins-slave/workspace/Pig-spark/target/PigMiniCluster/PigMiniCluster-localDir-nm-0_3/usercache/jenkins/appcache/application_1457627184976_0002/container_1457627184976_0002_01_04"): error=7, Argument list too long
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:485)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
Caused by: java.io.IOException: error=7, Argument list too long
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1022)
    ... 10 more
{noformat}
This, I believe, is due to this setting: http://www.in-ulm.de/~mascheck/various/argmax/

On my machine,
{noformat}
$ getconf ARG_MAX
262144
{noformat}
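For anyone who wants to see the same `error=7` outside of YARN, the sketch below tries to start a trivial `sh` process with an artificially padded environment variable. Everything here is an assumption for illustration: the class `OversizedEnvDemo`, the variable name `DEMO_PADDING`, and the padding sizes are invented, and `sh` is assumed to be on the PATH. On a typical Linux host the large launch is expected to be rejected by the kernel, which limits both the total size and the size of each individual string, but the exact threshold is system dependent.

```java
import java.io.IOException;
import java.util.Arrays;

/** Hypothetical reproduction sketch; not part of Pig or Hadoop. */
final class OversizedEnvDemo {

    /**
     * Starts "sh -c 'exit 0'" with one extra environment variable of the given size.
     * Returns null if the process started, otherwise the IOException message.
     */
    static String tryLaunch(int paddingBytes) {
        char[] padding = new char[paddingBytes];
        Arrays.fill(padding, 'x');

        ProcessBuilder pb = new ProcessBuilder("sh", "-c", "exit 0");
        pb.environment().put("DEMO_PADDING", new String(padding));
        try {
            Process p = pb.start();
            p.waitFor();
            return null;
        } catch (IOException e) {
            // This is where an "Argument list too long" failure would surface.
            return e.getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println("1 KiB padding: " + tryLaunch(1024));
        System.out.println("8 MiB padding: " + tryLaunch(8 * 1024 * 1024));
    }
}
```

The point of the sketch is that the failure is raised by `ProcessBuilder.start()` before the child runs at all, which matches the stack traces in this thread where the exception originates in `UNIXProcess.forkAndExec`.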