[
https://issues.apache.org/jira/browse/CRUNCH-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797716#comment-13797716
]
Chao Shi commented on CRUNCH-280:
---------------------------------
It is awkward that MR requires a RawComparator, which compares two buffers of
serialized records; that is not easy to use. It would be nice to
support:
1) RawComparator: this is the most efficient way, but users must know the
serialization format
2) normal Comparator class (with extra record serialization overhead)
3) a serializable Comparator object, whose in-memory state is serialized to MR
workers (with serialization overhead)
I found that 2) and 3) are not easy, as I don't know how to deserialize data at
runtime. Is it possible, [~jwills]?
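
A minimal sketch of option 2, assuming some decoder for the record type is
available at runtime. It uses only the JDK: RawCompare is a stand-in that
mirrors the six-argument compare method of Hadoop's RawComparator, and
Decoder, wrap and encode are hypothetical names for illustration. How such a
decoder would be obtained inside Crunch is exactly the open question above and
is not answered here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Comparator;

final class DeserializingComparatorSketch {

    /** Hypothetical decoder: turns serialized bytes back into a record. */
    interface Decoder<T> {
        T decode(DataInputStream in) throws IOException;
    }

    /** Stand-in with the same compare signature as Hadoop's RawComparator. */
    interface RawCompare {
        int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
    }

    /** Adapts an ordinary Comparator by deserializing both records first. */
    static <T> RawCompare wrap(Decoder<T> decoder, Comparator<T> cmp) {
        return (b1, s1, l1, b2, s2, l2) -> {
            try {
                T left = decoder.decode(
                    new DataInputStream(new ByteArrayInputStream(b1, s1, l1)));
                T right = decoder.decode(
                    new DataInputStream(new ByteArrayInputStream(b2, s2, l2)));
                // This is the extra per-comparison cost mentioned in option 2.
                return cmp.compare(left, right);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }

    /** Serializes a String with DataOutputStream.writeUTF (assumed format). */
    static byte[] encode(String s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(s);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Decoder<String> utf = in -> in.readUTF();
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        RawCompare raw = wrap(utf, byLength);

        byte[] a = encode("fig");
        byte[] b = encode("apple");
        // Negative: "fig" is shorter than "apple", although it sorts after
        // "apple" in natural String order.
        System.out.println(raw.compare(a, 0, a.length, b, 0, b.length));
    }
}
```

The wrapper keeps the user-facing API a plain Comparator and pays for two
deserializations per comparison, which is why option 1 stays the fastest path.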
> Specify Comparator for total order sort
> ---------------------------------------
>
> Key: CRUNCH-280
> URL: https://issues.apache.org/jira/browse/CRUNCH-280
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Reporter: Chao Shi
> Assignee: Chao Shi
>
> It seems that Sort#sort can only use the default comparator. It would be
> nice to let clients specify it.
--
This message was sent by Atlassian JIRA
(v6.1#6144)