TestSortedTableUnion and TestSortedTableUnionMergeJoin fail on trunk due to
estimateNumberOfReducers bug
--------------------------------------------------------------------------------------------------------
Key: PIG-1652
URL: https://issues.apache.org/jira/browse/PIG-1652
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.8.0
Reporter: Daniel Dai
Fix For: 0.8.0
TestSortedTableUnion and TestSortedTableUnionMergeJoin fail on trunk because of
a bug in the input size estimation used by estimateNumberOfReducers. Here is the
stack trace from TestSortedTableUnionMergeJoin:
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias records3
at org.apache.pig.PigServer.storeEx(PigServer.java:877)
at org.apache.pig.PigServer.store(PigServer.java:815)
at org.apache.pig.PigServer.openIterator(PigServer.java:727)
at org.apache.hadoop.zebra.pig.TestSortedTableUnionMergeJoin.testStorer(TestSortedTableUnionMergeJoin.java:203)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected error during execution.
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:326)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1197)
at org.apache.pig.PigServer.storeEx(PigServer.java:873)
Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Illegal character in scheme name at index 69: org.apache.hadoop.zebra.pig.TestSortedTableUnionMergeJoin.testStorer1,file:
at org.apache.hadoop.fs.Path.initialize(Path.java:140)
at org.apache.hadoop.fs.Path.<init>(Path.java:126)
at org.apache.hadoop.fs.Path.<init>(Path.java:50)
at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:963)
at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:902)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:866)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:844)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getTotalInputFileSize(JobControlCompiler.java:715)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.estimateNumberOfReducers(JobControlCompiler.java:688)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SampleOptimizer.visitMROp(SampleOptimizer.java:140)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceOper.visit(MapReduceOper.java:246)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceOper.visit(MapReduceOper.java:41)
at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69)
at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:71)
at org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:52)
at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SampleOptimizer.visit(SampleOptimizer.java:69)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(MapReduceLauncher.java:491)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:116)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:301)
Caused by: java.net.URISyntaxException: Illegal character in scheme name at index 69: org.apache.hadoop.zebra.pig.TestSortedTableUnionMergeJoin.testStorer1,file:
at java.net.URI$Parser.fail(URI.java:2809)
at java.net.URI$Parser.checkChars(URI.java:2982)
at java.net.URI$Parser.parse(URI.java:3009)
at java.net.URI.<init>(URI.java:736)
at org.apache.hadoop.fs.Path.initialize(Path.java:137)
The reason is that we call globStatus on a location string that is a
comma-separated list of URLs rather than a single URL. Here is the string we
get in JobControlCompiler.getTotalInputFileSize:
file:///homes/jianyong/pig2/build/contrib/zebra/test/data/org.apache.hadoop.zebra.pig.TestSortedTableUnion.testStorer1,file:///homes/jianyong/pig2/build/contrib/zebra/test/data/org.apache.hadoop.zebra.pig.TestSortedTableUnion.testStorer2
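One possible way to avoid this is to split the location string into its
individual entries before globbing each of them. The following is only a
minimal, self-contained sketch of that splitting step and not the actual
patch: the class name InputLocations, the method split, and the /tmp paths are
hypothetical, and it uses java.net.URI instead of Hadoop's Path so that it runs
without a cluster. Commas inside {...} glob groups are left alone, on the
assumption that such groups are part of a single glob pattern.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig: splits a comma-separated
// location string into the individual locations it contains.
public class InputLocations {

    // Splits on top-level commas only; commas inside {...} glob
    // groups stay with their pattern. Empty entries are dropped.
    public static List<String> split(String locations) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int braceDepth = 0;
        for (char c : locations.toCharArray()) {
            if (c == '{') {
                braceDepth++;
            } else if (c == '}' && braceDepth > 0) {
                braceDepth--;
            }
            if (c == ',' && braceDepth == 0) {
                if (current.length() > 0) {
                    parts.add(current.toString());
                }
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }

    public static void main(String[] args) throws URISyntaxException {
        // Illustrative paths only; the real test data lives elsewhere.
        String joined = "file:///tmp/zebra/testStorer1,file:///tmp/zebra/testStorer2";
        for (String location : split(joined)) {
            // Each entry is now a single URI that can be handled on its own.
            URI uri = new URI(location);
            System.out.println(uri.getScheme() + " -> " + uri.getPath());
        }
    }
}
```

In getTotalInputFileSize, each entry returned by such a split would then be
globbed and sized separately, and the sizes summed. Whether that is the right
place for the fix is left to the actual patch.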