[
https://issues.apache.org/jira/browse/CRUNCH-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583754#comment-13583754
]
Matthew Hayes commented on CRUNCH-166:
--------------------------------------
Hmm, perhaps. Let me dig through this some more.
> NullPointerException when attempting to use Sort.sortPairs
> ----------------------------------------------------------
>
> Key: CRUNCH-166
> URL: https://issues.apache.org/jira/browse/CRUNCH-166
> Project: Crunch
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.5.0
> Environment: Hadoop 1.0.4
> Reporter: Matthew Hayes
> Assignee: Josh Wills
> Attachments: WordSortIT.java
>
>
> I'm attempting to count some strings and then order by the count descending.
> My code effectively looks like this:
> {code}
> PCollection<SomeType> records = pipeline.read(...);
> PCollection<String> stringsToCount = records.parallelDo(
>   new DoFn<SomeType, String>() {
>     @Override
>     public void process(SomeType input, Emitter<String> emitter) {
>       if (input.getRecords() != null && input.getRecords().size() > 0) {
>         for (MyRecord record : input.getRecords()) {
>           emitter.emit(record.getValue().toString());
>         }
>       }
>     }
>   },
>   Writables.strings()
> );
> PTable<String, Long> stats = Aggregate.count(stringsToCount);
> PCollection<Pair<String, Long>> sortedStats = Sort.sortPairs(stats,
>   new ColumnOrder(2, Order.DESCENDING));
> pipeline.writeTextFile(sortedStats, "somewhere");
> {code}
> The error I get is:
> {code}
> java.lang.NullPointerException
>   at org.apache.crunch.lib.Sort$TupleWritableComparator.setConf(Sort.java:459)
>   at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>   at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>   at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:773)
>   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:959)
>   at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {code}
> Note that the line numbers are shifted because I added some debugging and
> recompiled. The NullPointerException is thrown in
> TupleWritableComparator.setConf() here:
> {code}
> String[] columnOrderNames = ordering.split(",");
> {code}
> I suppose "crunch.ordering" is not set, and therefore ordering is null. When
> I check the conf in job tracker I also don't see this property set.
> Am I doing something wrong?
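
The failure mode described above (a missing "crunch.ordering" property leading to a null `ordering` and then an NPE on `split`) can be illustrated with a minimal sketch. This is not the Crunch implementation: `OrderingGuard` and `readColumnOrderNames` are hypothetical names, and a plain `Map` stands in for Hadoop's `Configuration` so the example runs without Hadoop on the classpath. It only shows how a null check turns the bare NullPointerException into a message that names the missing property.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Crunch: a null-safe read of "crunch.ordering".
class OrderingGuard {
    static final String ORDERING_KEY = "crunch.ordering";

    // Assumption: a plain Map stands in for Hadoop's Configuration object.
    static String[] readColumnOrderNames(Map<String, String> conf) {
        String ordering = conf.get(ORDERING_KEY);
        if (ordering == null) {
            // Fail with a descriptive message instead of a bare NullPointerException.
            throw new IllegalStateException(
                "Property '" + ORDERING_KEY + "' is not set in the job configuration");
        }
        // Same split as the line quoted in the report.
        return ordering.split(",");
    }
}
```

With an empty map, `readColumnOrderNames` throws an `IllegalStateException` that names the missing key. With the key present, it returns the comma-separated entries unchanged.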