[jira] [Commented] (PIG-2956) Invalid cache specification for some streaming statement
[ https://issues.apache.org/jira/browse/PIG-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669566#comment-13669566 ]

Alan Gates commented on PIG-2956:

+1

> Invalid cache specification for some streaming statement
>
> Key: PIG-2956
> URL: https://issues.apache.org/jira/browse/PIG-2956
> Project: Pig
> Issue Type: Sub-task
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.12
>
> Attachments: PIG-2956-1_0.10.patch, PIG-2956-1.patch, PIG-2956-2.patch
>
> Another category of failure in e2e tests, such as ComputeSpec_1, ComputeSpec_2, ComputeSpec_3, RaceConditions_1, RaceConditions_3, RaceConditions_4, RaceConditions_7, RaceConditions_8.
> Here is stack:
> ERROR 6003: Invalid cache specification. File doesn't exist: C:/Program Files (x86)/GnuWin32/bin/head.exe
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:723)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:258)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:151)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1318)
>   at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1303)
>   at org.apache.pig.PigServer.execute(PigServer.java:1293)
>   at org.apache.pig.PigServer.executeBatch(PigServer.java:364)
>   at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:133)
>   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
>   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
>   at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
>   at org.apache.pig.Main.run(Main.java:561)
>   at org.apache.pig.Main.main(Main.java:111)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 6003: Invalid cache specification. File doesn't exist: C:/Program Files (x86)/GnuWin32/bin/head.exe
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.setupDistributedCache(JobControlCompiler.java:1151)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.setupDistributedCache(JobControlCompiler.java:1129)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:447)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-2956) Invalid cache specification for some streaming statement
[ https://issues.apache.org/jira/browse/PIG-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535335#comment-13535335 ]

Julien Le Dem commented on PIG-2956:

Daniel, any update on this?

> Invalid cache specification for some streaming statement
>
> Key: PIG-2956
> URL: https://issues.apache.org/jira/browse/PIG-2956
> Project: Pig
> Issue Type: Sub-task
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.11
>
> Attachments: PIG-2956-1_0.10.patch, PIG-2956-1.patch
[jira] [Commented] (PIG-2956) Invalid cache specification for some streaming statement
[ https://issues.apache.org/jira/browse/PIG-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500580#comment-13500580 ]

Julien Le Dem commented on PIG-2956:

I think you should still catch exceptions that come out of toURI() and handle them the same way the URI exception was handled before. A bad URI should still be handled.

> Invalid cache specification for some streaming statement
>
> Key: PIG-2956
> URL: https://issues.apache.org/jira/browse/PIG-2956
> Project: Pig
> Issue Type: Sub-task
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.11
>
> Attachments: PIG-2956-1_0.10.patch, PIG-2956-1.patch
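A minimal sketch of the handling suggested in this comment: try the plain string form first, fall back to an encoding conversion, and still report a bad URI as an error if the fallback fails too. This is not the patch itself. It uses only java.net.URI; the four-argument URI constructor stands in for the Hadoop path conversion, and the class name CacheUriFallback, the method toCacheUri and the exception type InvalidCacheSpecException are hypothetical names chosen for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative only: plain java.net.URI, no Pig or Hadoop classes.
class CacheUriFallback {

    // Hypothetical exception type; stands in for whatever error the caller reports.
    static class InvalidCacheSpecException extends RuntimeException {
        InvalidCacheSpecException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Try the raw string first; if that fails, retry with percent-encoding.
    // A string that fails both ways is reported instead of being silently accepted.
    static URI toCacheUri(String src) {
        try {
            return new URI(src);
        } catch (URISyntaxException first) {
            try {
                // Assumption: this constructor is used as a stand-in for an
                // encoding conversion; it quotes characters illegal in a path.
                return new URI(null, null, src, null);
            } catch (URISyntaxException second) {
                throw new InvalidCacheSpecException(
                        "Invalid cache specification: " + src, second);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(toCacheUri("/tmp/a#b"));
        System.out.println(toCacheUri("C:/Program Files (x86)/GnuWin32/bin/head.exe"));
    }
}
```

The point of the nested try is that the fallback is not assumed to be infallible: a string that neither form accepts still surfaces as an error.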
[jira] [Commented] (PIG-2956) Invalid cache specification for some streaming statement
[ https://issues.apache.org/jira/browse/PIG-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489134#comment-13489134 ]

Daniel Dai commented on PIG-2956:

Here is the failure when using "src.toURI()" directly with an ORDER BY statement:

Message: java.io.FileNotFoundException: File does not exist: /tmp/temp-1510081022/tmp-1308657145#pigsample_1889145873_1351808882314
  at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:517)
  at org.apache.hadoop.filecache.DistributedCache.getFileStatus(DistributedCache.java:185)
  at org.apache.hadoop.filecache.TrackerDistributedCacheManager.determineTimestamps(TrackerDistributedCacheManager.java:721)
  at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:763)
  at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:655)
  at org.apache.hadoop.mapred.JobClient.access$300(JobClient.java:174)
  at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:865)
  at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
  at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
  at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
  at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
  at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.pig.backend.hadoop20.PigJobControl.mainLoopAction(PigJobControl.java:157)
  at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:134)
  at java.lang.Thread.run(Thread.java:680)
  at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)

src.toURI() encodes the "#" character, and Hadoop on Linux has trouble finding the distributed cache item when "#" is encoded. Windows, however, accepts the encoded "#" character; I am not sure why. The original "new URI(src.toString())" fails if src contains a ":" character. So the logic becomes:

1. On Linux, "new URI(src.toString())" always succeeds, src is never encoded, and DistributedCache is happy.
2. On Windows, src contains ":", so "new URI(src.toString())" fails; src is then encoded using "src.toUri()", and DistributedCache doesn't mind.

> Invalid cache specification for some streaming statement
>
> Key: PIG-2956
> URL: https://issues.apache.org/jira/browse/PIG-2956
> Project: Pig
> Issue Type: Sub-task
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.11
>
> Attachments: PIG-2956-1.patch
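The two behaviours described in this comment can be reproduced with java.net.URI alone. The sketch below is illustrative and makes one assumption worth stating loudly: the four-argument URI constructor is used as a stand-in for the encoding conversion, since it percent-encodes characters that are illegal in a path. No Hadoop classes are involved, and the class name CacheUriDemo and its helper methods are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative only: plain java.net.URI, no Pig or Hadoop classes.
class CacheUriDemo {

    // True if the raw string is accepted by the single-argument URI constructor.
    static boolean parsesAsIs(String src) {
        try {
            new URI(src);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Fragment seen by the single-argument constructor, or null if the
    // string has no fragment or cannot be parsed.
    static String fragmentOf(String src) {
        try {
            return new URI(src).getFragment();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    // Assumption: stand-in for an encoding conversion. The multi-argument
    // constructor quotes characters illegal in a path, including '#' and ' '.
    static String encodeAsPath(String src) {
        try {
            return new URI(null, null, src, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String windows = "C:/Program Files (x86)/GnuWin32/bin/head.exe";
        String sample = "/tmp/temp-1510081022/tmp-1308657145#pigsample_1889145873_1351808882314";
        System.out.println(parsesAsIs(windows));   // raw Windows path
        System.out.println(fragmentOf(sample));    // '#' kept as a fragment separator
        System.out.println(encodeAsPath(sample));  // '#' percent-encoded into the path
    }
}
```

In the unencoded form the text after "#" stays a URI fragment, while in the encoded form it becomes part of the file name. That is at least consistent with the FileNotFoundException above, which reports a path that still contains the "#" suffix.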
[jira] [Commented] (PIG-2956) Invalid cache specification for some streaming statement
[ https://issues.apache.org/jira/browse/PIG-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481074#comment-13481074 ]

Daniel Dai commented on PIG-2956:

I tried that before and hit an exception on Linux (I forget which one). So src.toURI() is only for Windows; on Linux, we have to use new URI(src.toString()). I will post the Linux exception later.

> Invalid cache specification for some streaming statement
>
> Key: PIG-2956
> URL: https://issues.apache.org/jira/browse/PIG-2956
> Project: Pig
> Issue Type: Sub-task
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.11
>
> Attachments: PIG-2956-1.patch
[jira] [Commented] (PIG-2956) Invalid cache specification for some streaming statement
[ https://issues.apache.org/jira/browse/PIG-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477484#comment-13477484 ]

Dmitriy V. Ryaboy commented on PIG-2956:

Why not replace "new URI(src.toString())" with src.toURI() in the first place?

> Invalid cache specification for some streaming statement
>
> Key: PIG-2956
> URL: https://issues.apache.org/jira/browse/PIG-2956
> Project: Pig
> Issue Type: Sub-task
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.11
>
> Attachments: PIG-2956-1.patch