[ 
https://issues.apache.org/jira/browse/HADOOP-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538266
 ] 

Chris Douglas commented on HADOOP-2113:
---------------------------------------

(core tests failed HADOOP-2112; I assume the contrib tests are unrelated)

Each of those seems like a valuable operation, but piping the output of "-text" 
through one's favorite text-processing utility seems very usable. Unless the 
keys contain tabs, I would expect 1-4 in your list to be pretty 
straightforward. I agree that the framework could be far more efficient for 
most operations, particularly for sorted data, which is almost certainly the 
most common case. It could also help express "for keys matching this regexp 
in their string representation, emit them as their native type" (which this 
patch cannot do), but isn't mapred the correct tool for that job anyway? The 
intent was merely to provide an aid to people hoping to check the first few 
values, or some subset of values, from a given SequenceFile; it aspires to 
sanity checks, not processing.
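
As a rough illustration of the "pipe it through a text utility" approach, here 
is a minimal Python sketch of a key filter. It assumes each line of the decoded 
output has the form key<TAB>value, which is inferred from the caveat about tabs 
in keys rather than taken from the patch itself. The names split_record and 
filter_by_key are hypothetical helpers, not part of FsShell.

```python
import re
from typing import Iterable, Iterator, Tuple


def split_record(line: str) -> Tuple[str, str]:
    """Split one decoded line into (key, value) at the first tab.

    Assumption: each line is "key<TAB>value". A key that itself contains
    a tab would be split in the wrong place, which is the caveat above.
    """
    key, _, value = line.rstrip("\n").partition("\t")
    return key, value


def filter_by_key(lines: Iterable[str], pattern: str) -> Iterator[Tuple[str, str]]:
    """Yield (key, value) pairs whose key matches the regular expression.

    Both key and value stay plain strings; nothing here recovers the
    records' native types.
    """
    regex = re.compile(pattern)
    for line in lines:
        key, value = split_record(line)
        if regex.search(key):
            yield key, value
```

To use it as a filter, iterate over filter_by_key(sys.stdin, some_pattern) and 
print each pair, with the decoded SequenceFile piped in on standard input. This 
only covers quick inspection; anything heavier is better expressed as a mapred 
job.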

I could see extending -stat to support more info, re: (5), though. By "a more 
general set of tools", what did you have in mind?

> Add "-text" command to FsShell to decode SequenceFile to stdout
> ---------------------------------------------------------------
>
>                 Key: HADOOP-2113
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2113
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Chris Douglas
>            Assignee: Chris Douglas
>            Priority: Minor
>             Fix For: 0.16.0
>
>         Attachments: 2113-0.patch
>
>
> FsShell should provide a command to examine SequenceFiles.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
