[ https://issues.apache.org/jira/browse/HADOOP-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12550258 ]
arkady borkovsky commented on HADOOP-2302:
------------------------------------------
This issue should probably be generalized to
"Streaming should provide a more powerful way to specify the primary key comparator"
(meaning the comparator used for splits).
It should allow at least:
-- numeric comparison
-- reverse order comparison
-- multiple field comparison
One possible way to specify the comparison on the streaming command line is to
use the familiar syntax of the Unix sort command, for example
"-k2,2rn -k1,1"
for "compare the second field numerically, largest value first; if equal,
compare the first field alphabetically".
Note that this specification implicitly defines which part of the string is
the key for shuffling purposes.
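
To make the proposal concrete, here is a minimal Python sketch of how such a
spec could be turned into a comparator. It is not an existing Streaming API:
the names parse_spec, make_comparator and sort_records are hypothetical, only
the "-kM,N" form with the optional "n" and "r" flags is handled, fields are
assumed to be tab-separated, and a numeric key looks only at the first field
of its range (a field that does not parse as a number counts as 0).

```python
import re
from functools import cmp_to_key

# Accepts tokens such as "-k2,2rn" or "-k1,1" (a small subset of sort's syntax).
KEY_RE = re.compile(r"-k(\d+),(\d+)([nr]*)$")


def parse_spec(spec):
    """Parse '-k2,2rn -k1,1' into a list of (start, end, numeric, reverse)."""
    keys = []
    for token in spec.split():
        m = KEY_RE.match(token)
        if m is None:
            raise ValueError(f"unsupported key spec: {token!r}")
        flags = m.group(3)
        keys.append((int(m.group(1)), int(m.group(2)), "n" in flags, "r" in flags))
    return keys


def make_comparator(spec, sep="\t"):
    """Build a cmp-style function (returns -1, 0 or 1) from a key spec."""
    keys = parse_spec(spec)

    def extract(fields, start, end, numeric):
        part = fields[start - 1:end]  # field numbers are 1-based, inclusive
        if numeric:
            # Simplification: use the first field of the range as a float.
            try:
                return float(part[0])
            except (IndexError, ValueError):
                return 0.0
        return sep.join(part)

    def compare(a, b):
        fa, fb = a.split(sep), b.split(sep)
        for start, end, numeric, reverse in keys:
            ka = extract(fa, start, end, numeric)
            kb = extract(fb, start, end, numeric)
            result = (ka > kb) - (ka < kb)
            if result != 0:
                return -result if reverse else result
        return 0  # all listed keys are equal

    return compare


def sort_records(records, spec, sep="\t"):
    """Sort lines according to the key spec."""
    return sorted(records, key=cmp_to_key(make_comparator(spec, sep)))


if __name__ == "__main__":
    records = ["b\t10", "a\t9", "c\t10", "a\t100"]
    for line in sort_records(records, "-k2,2rn -k1,1"):
        print(line)
```

With the spec "-k2,2rn -k1,1" the record with 100 in the second field comes
first, the two records with 10 follow in alphabetical order of the first
field, and the record with 9 comes last. A plain string comparison would
instead place "9" after "100".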
> Streaming should provide an option for numerical sort of keys
> --------------------------------------------------------------
>
> Key: HADOOP-2302
> URL: https://issues.apache.org/jira/browse/HADOOP-2302
> Project: Hadoop
> Issue Type: Improvement
> Components: contrib/streaming
> Reporter: lohit vijayarenu
>
> It would be good to have an option for numerical sort of keys for streaming.