[ https://issues.apache.org/jira/browse/PIG-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915860#action_12915860 ]
Olga Natkovich commented on PIG-1652:
-------------------------------------

I think the code needs to be modified to default to 1 if we can't perform the computation

> TestSortedTableUnion and TestSortedTableUnionMergeJoin fail on trunk due to estimateNumberOfReducers bug
> --------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1652
>                 URL: https://issues.apache.org/jira/browse/PIG-1652
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.0
>            Reporter: Daniel Dai
>             Fix For: 0.8.0
>
>
> TestSortedTableUnion and TestSortedTableUnionMergeJoin fail on trunk due to the input size estimation. Here is the stack of TestSortedTableUnionMergeJoin:
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias records3
>     at org.apache.pig.PigServer.storeEx(PigServer.java:877)
>     at org.apache.pig.PigServer.store(PigServer.java:815)
>     at org.apache.pig.PigServer.openIterator(PigServer.java:727)
>     at org.apache.hadoop.zebra.pig.TestSortedTableUnionMergeJoin.testStorer(TestSortedTableUnionMergeJoin.java:203)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected error during execution.
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:326)
>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1197)
>     at org.apache.pig.PigServer.storeEx(PigServer.java:873)
> Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Illegal character in scheme name at index 69: org.apache.hadoop.zebra.pig.TestSortedTableUnionMergeJoin.testStorer1,file:
>     at org.apache.hadoop.fs.Path.initialize(Path.java:140)
>     at org.apache.hadoop.fs.Path.<init>(Path.java:126)
>     at org.apache.hadoop.fs.Path.<init>(Path.java:50)
>     at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:963)
>     at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>     at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>     at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>     at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>     at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>     at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>     at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>     at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>     at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>     at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>     at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>     at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>     at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:902)
>     at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:866)
>     at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:844)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getTotalInputFileSize(JobControlCompiler.java:715)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.estimateNumberOfReducers(JobControlCompiler.java:688)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SampleOptimizer.visitMROp(SampleOptimizer.java:140)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceOper.visit(MapReduceOper.java:246)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceOper.visit(MapReduceOper.java:41)
>     at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69)
>     at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:71)
>     at org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:52)
>     at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SampleOptimizer.visit(SampleOptimizer.java:69)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(MapReduceLauncher.java:491)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:116)
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:301)
> Caused by: java.net.URISyntaxException: Illegal character in scheme name at index 69: org.apache.hadoop.zebra.pig.TestSortedTableUnionMergeJoin.testStorer1,file:
>     at java.net.URI$Parser.fail(URI.java:2809)
>     at java.net.URI$Parser.checkChars(URI.java:2982)
>     at java.net.URI$Parser.parse(URI.java:3009)
>     at java.net.URI.<init>(URI.java:736)
>     at org.apache.hadoop.fs.Path.initialize(Path.java:137)
> The reason is we are trying to do globStatus on a URL which is a comma-separated list.
> Here is the URL we get in JobControlCompiler.getTotalInputFileSize:
> file:///homes/jianyong/pig2/build/contrib/zebra/test/data/org.apache.hadoop.zebra.pig.TestSortedTableUnion.testStorer1,file:///homes/jianyong/pig2/build/contrib/zebra/test/data/org.apache.hadoop.zebra.pig.TestSortedTableUnion.testStorer2
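
For illustration only, here is a rough sketch of the two ideas in this thread: treat the load location as a comma-separated list rather than a single URL, and default to 1 reducer whenever the input size cannot be computed. It is not the patch for PIG-1652 and does not use Pig or Hadoop classes. The class ReducerEstimateSketch, the SizeLookup interface, and the parameters bytesPerReducer and maxReducers are hypothetical names invented for this example; SizeLookup merely stands in for whatever file system query (such as globStatus) supplies the sizes.

```java
import java.io.FileNotFoundException;
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReducerEstimateSketch {

    /** Hypothetical stand-in for a file system size query. */
    public interface SizeLookup {
        long sizeOf(URI location) throws Exception;
    }

    /**
     * Naive split on commas. Assumption: no location contains a comma of
     * its own, e.g. inside a glob such as {a,b}; real code would need to
     * handle that case.
     */
    public static List<String> splitLocations(String commaSeparated) {
        List<String> locations = new ArrayList<>();
        for (String part : commaSeparated.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                locations.add(trimmed);
            }
        }
        return locations;
    }

    /**
     * Sums the size of every location in the list and derives a reducer
     * count from it. Any failure (bad URI, missing file, lookup error)
     * falls back to 1 instead of propagating.
     */
    public static int estimateReducers(String location, long bytesPerReducer,
                                       int maxReducers, SizeLookup lookup) {
        try {
            long totalBytes = 0;
            for (String single : splitLocations(location)) {
                // Each entry is parsed on its own, never the joined string.
                totalBytes += lookup.sizeOf(new URI(single));
            }
            // Ceiling division, then clamp to the range [1, maxReducers].
            long wanted = (totalBytes + bytesPerReducer - 1) / bytesPerReducer;
            return (int) Math.max(1, Math.min(maxReducers, wanted));
        } catch (Exception e) {
            // Could not perform the computation: default to 1.
            return 1;
        }
    }

    public static void main(String[] args) {
        // Made-up paths and sizes, for demonstration only.
        Map<URI, Long> fakeSizes = new HashMap<>();
        fakeSizes.put(URI.create("file:///data/testStorer1"), 1500L);
        fakeSizes.put(URI.create("file:///data/testStorer2"), 700L);
        SizeLookup lookup = uri -> {
            Long size = fakeSizes.get(uri);
            if (size == null) {
                throw new FileNotFoundException(uri.toString());
            }
            return size;
        };

        String joined = "file:///data/testStorer1,file:///data/testStorer2";
        System.out.println(estimateReducers(joined, 1000L, 999, lookup));
        System.out.println(estimateReducers("file:///data/missing", 1000L, 999, lookup));
    }
}
```

Catching every exception is a deliberately blunt choice in this sketch. It follows the suggestion that the estimate should never fail the job, at the cost of hiding the underlying error unless it is logged separately.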