TestSortedTableUnion and TestSortedTableUnionMergeJoin fail on trunk due to 
estimateNumberOfReducers bug
--------------------------------------------------------------------------------------------------------

                 Key: PIG-1652
                 URL: https://issues.apache.org/jira/browse/PIG-1652
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.8.0
            Reporter: Daniel Dai
             Fix For: 0.8.0


TestSortedTableUnion and TestSortedTableUnionMergeJoin fail on trunk because of 
the input size estimation done in estimateNumberOfReducers. Here is the stack 
trace from TestSortedTableUnionMergeJoin:

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store 
alias records3
        at org.apache.pig.PigServer.storeEx(PigServer.java:877)
        at org.apache.pig.PigServer.store(PigServer.java:815)
        at org.apache.pig.PigServer.openIterator(PigServer.java:727)
        at 
org.apache.hadoop.zebra.pig.TestSortedTableUnionMergeJoin.testStorer(TestSortedTableUnionMergeJoin.java:203)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2043: 
Unexpected error during execution.
        at 
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:326)
        at 
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1197)
        at org.apache.pig.PigServer.storeEx(PigServer.java:873)
Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: 
Illegal character in scheme name at index 69: 
org.apache.hadoop.zebra.pig.TestSortedTableUnionMergeJoin.testStorer1,file:
        at org.apache.hadoop.fs.Path.initialize(Path.java:140)
        at org.apache.hadoop.fs.Path.<init>(Path.java:126)
        at org.apache.hadoop.fs.Path.<init>(Path.java:50)
        at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:963)
        at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
        at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
        at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
        at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
        at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
        at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
        at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
        at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
        at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
        at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
        at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
        at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
        at 
org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:902)
        at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:866)
        at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:844)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getTotalInputFileSize(JobControlCompiler.java:715)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.estimateNumberOfReducers(JobControlCompiler.java:688)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SampleOptimizer.visitMROp(SampleOptimizer.java:140)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceOper.visit(MapReduceOper.java:246)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceOper.visit(MapReduceOper.java:41)
        at 
org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69)
        at 
org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:71)
        at 
org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:52)
        at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SampleOptimizer.visit(SampleOptimizer.java:69)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(MapReduceLauncher.java:491)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:116)
        at 
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:301)
Caused by: java.net.URISyntaxException: Illegal character in scheme name at 
index 69: 
org.apache.hadoop.zebra.pig.TestSortedTableUnionMergeJoin.testStorer1,file:
        at java.net.URI$Parser.fail(URI.java:2809)
        at java.net.URI$Parser.checkChars(URI.java:2982)
        at java.net.URI$Parser.parse(URI.java:3009)
        at java.net.URI.<init>(URI.java:736)
        at org.apache.hadoop.fs.Path.initialize(Path.java:137)

The reason is that we call globStatus on a location string that is a 
comma-separated list of URLs, not a single URL. Here is the string we get in 
JobControlCompiler.getTotalInputFileSize:
file:///homes/jianyong/pig2/build/contrib/zebra/test/data/org.apache.hadoop.zebra.pig.TestSortedTableUnion.testStorer1,file:///homes/jianyong/pig2/build/contrib/zebra/test/data/org.apache.hadoop.zebra.pig.TestSortedTableUnion.testStorer2
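
One possible direction, sketched here only as an illustration and not as the 
actual patch: split the location string on commas before globbing, so that each 
entry is handled as its own path. The class name LocationSplitter, the method 
splitLocations, and the shortened file:///data/... paths are hypothetical. The 
sketch also assumes that commas only ever separate locations, so a glob 
pattern such as {a,b} would need extra handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig.
class LocationSplitter {

    // Split a comma-separated location string into individual locations.
    // Assumption: a comma always separates two locations; glob alternation
    // such as {a,b} is not handled by this naive split.
    static List<String> splitLocations(String location) {
        List<String> result = new ArrayList<>();
        for (String part : location.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened, made-up paths standing in for the two test tables.
        String location = "file:///data/testStorer1,file:///data/testStorer2";
        for (String single : splitLocations(location)) {
            // In getTotalInputFileSize, each entry would be globbed on its
            // own (for example fs.globStatus(new Path(single))) and the
            // resulting file sizes summed.
            System.out.println(single);
        }
    }
}
```

With the input above, the loop visits the two file: URLs one at a time, so no 
path component ever contains the ",file:" fragment that Path rejects as an 
illegal scheme name.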

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
