[jira] Commented: (PIG-1645) Using both small split combination and temporary file compression on a query of ORDER BY may cause crash
[ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914541#action_12914541 ] Yan Zhou commented on PIG-1645: --- The possibility of failure also depends upon the block distribution since the split combination makes use of that info. Using both small split combination and temporary file compression on a query of ORDER BY may cause crash Key: PIG-1645 URL: https://issues.apache.org/jira/browse/PIG-1645 Project: Pig Issue Type: Bug Affects Versions: 0.8.0 Reporter: Yan Zhou Assignee: Yan Zhou Fix For: 0.8.0 Attachments: PIG-1645.patch The stack looks like the following: java.lang.NullPointerException at java.util.Arrays.binarySearch(Arrays.java:2043) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:72) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:52) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:565) at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:638) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:314) at org.apache.hadoop.mapred.Child$4.run(Child.java:217) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062) at org.apache.hadoop.mapred.Child.main(Child.java:211) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1645) Using both small split combination and temporary file compression on a query of ORDER BY may cause crash
[ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914556#action_12914556 ] Thejas M Nair commented on PIG-1645: +1 Using both small split combination and temporary file compression on a query of ORDER BY may cause crash Key: PIG-1645 URL: https://issues.apache.org/jira/browse/PIG-1645 Project: Pig Issue Type: Bug Affects Versions: 0.8.0 Reporter: Yan Zhou Assignee: Yan Zhou Fix For: 0.8.0 Attachments: PIG-1645.patch The stack looks like the following: java.lang.NullPointerException at java.util.Arrays.binarySearch(Arrays.java:2043) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:72) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:52) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:565) at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:638) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:314) at org.apache.hadoop.mapred.Child$4.run(Child.java:217) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062) at org.apache.hadoop.mapred.Child.main(Child.java:211) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1645) Using both small split combination and temporary file compression on a query of ORDER BY may cause crash
[ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914128#action_12914128 ] Yan Zhou commented on PIG-1645: --- The problem is that both RandomSampleLoader and PossionSampleLoader have internal states from the previous invocations that should be reset when a different underlying split is worked on under the same umbrella split when the split combination (PIG-1518) is on. When temporary file compression is disabled, Pig internal storage will create empty files which will be discarded by split combiner, making the only non-empty split as the only split to be worked on, so it is ok in this case. Using both small split combination and temporary file compression on a query of ORDER BY may cause crash Key: PIG-1645 URL: https://issues.apache.org/jira/browse/PIG-1645 Project: Pig Issue Type: Bug Affects Versions: 0.8.0 Reporter: Yan Zhou Assignee: Yan Zhou Fix For: 0.8.0 The stack looks like the following: java.lang.NullPointerException at java.util.Arrays.binarySearch(Arrays.java:2043) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:72) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:52) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:565) at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:638) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:314) at org.apache.hadoop.mapred.Child$4.run(Child.java:217) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062) at org.apache.hadoop.mapred.Child.main(Child.java:211) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.