On Tue, Apr 3, 2012 at 8:25 AM, Pedro Costa wrote:
> What I want to ask is:
>
> - how do I read the values from sequence files that are block, or record
> compressed, or uncompressed?
You use the SequenceFile.Reader class.
> - how do I know if the sequence file is block compressed, record
> comp
On Tue, Apr 3, 2012 at 8:01 AM, Pedro Costa wrote:
> If I want to compare 2 sequence files to see if they are the same, how do I
> compare?
From the command line, you can "textify" the files with:
hadoop fs -text myfile.seq
Of course, if you are using API you can iterate through the two
Sequen
What I want to ask is:
- how do I read the values from sequence files that are block, or record
compressed, or uncompressed?
- how do I know if the sequence file is block compressed, record
compressed, or uncompressed?
- how do I know if it's a sequence file or a Textfile?
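One way to approach the last two questions without going through the Hadoop API is to peek at the file header directly. The sketch below is only an illustration, under the assumption that the header starts with the bytes "SEQ", then a version byte, then the key and value class names (each with a one-byte length prefix when the name is shorter than 128 bytes), then two boolean flags for "compressed" and "block compressed". The class name SeqHeaderPeek, the Mode enum and the sampleHeader helper are made up for this example and are not part of Hadoop. It only handles header version 4 or later and reads from a local stream, not from HDFS.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class SeqHeaderPeek {

    /** What the header says about the file (hypothetical enum for this sketch). */
    public enum Mode { NOT_A_SEQUENCE_FILE, UNCOMPRESSED, RECORD_COMPRESSED, BLOCK_COMPRESSED }

    // Skips one class-name string: a length prefix followed by that many UTF-8 bytes.
    // Assumption: the name is shorter than 128 bytes, so the length fits in one byte.
    private static void skipShortString(DataInputStream in) throws IOException {
        int len = in.readByte();
        if (len < 0) {
            throw new IOException("multi-byte length prefix is not handled in this sketch");
        }
        in.readFully(new byte[len]);
    }

    public static Mode inspect(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] magic = new byte[3];
        try {
            in.readFully(magic);
        } catch (EOFException e) {
            return Mode.NOT_A_SEQUENCE_FILE; // shorter than the 3-byte tag
        }
        if (magic[0] != 'S' || magic[1] != 'E' || magic[2] != 'Q') {
            return Mode.NOT_A_SEQUENCE_FILE; // e.g. a plain text file
        }
        int version = in.readByte();
        if (version < 4) {
            throw new IOException("older header layout is not handled in this sketch");
        }
        skipShortString(in); // key class name
        skipShortString(in); // value class name
        boolean compressed = in.readBoolean();
        boolean blockCompressed = in.readBoolean();
        if (!compressed) {
            return Mode.UNCOMPRESSED;
        }
        return blockCompressed ? Mode.BLOCK_COMPRESSED : Mode.RECORD_COMPRESSED;
    }

    // Builds a synthetic header for trying the parser out; this is made-up data, not a real file.
    static byte[] sampleHeader(boolean compressed, boolean blockCompressed) {
        byte[] name = "org.apache.hadoop.io.Text".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write('S');
        out.write('E');
        out.write('Q');
        out.write(6); // header version
        for (int i = 0; i < 2; i++) { // key class, then value class
            out.write(name.length);
            out.write(name, 0, name.length);
        }
        out.write(compressed ? 1 : 0);
        out.write(blockCompressed ? 1 : 0);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Usage: java SeqHeaderPeek <local copy of the file>
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(inspect(in));
        }
    }
}
```

If you are already using the Hadoop API, SequenceFile.Reader has isCompressed() and isBlockCompressed() methods that report the same two flags, so the manual parsing above is mainly useful for a quick check on a local copy of the file.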
On 3 April 2012 16:
If I want to compare 2 sequence files to see if they are the same, how do I
compare?
On 19 December 2011 14:43, Robert Evans wrote:
> Oh I forgot to say that part of the Random Characters are actually random
> characters. Sequence files store a set of random characters as synch
> points withi
Oh I forgot to say that part of the Random Characters are actually random
characters. Sequence files store a set of random characters as synch points
within the file. This allows for splitting the file easily without a high risk
that the random sequence appears inside the data itself just by c
It looks mostly correct to me. I am not an expert on sequence files, and I
have not checked the text against the spec nor have I checked the binary
numbers in it to be sure they add up to the correct lengths etc, but it looks
good from a first glance. I can see the SEQ tag at the beginning to
Hi,
In Hadoop MapReduce, I've executed the webdatascan example, and the
reduce output is in a SequenceFile. The result is shown here (
http://paste.lisp.org/display/126572). What's the trash (random
characters), like "u 265
100 330 320 252 " \n # ; 374 5 211 V ' 340 376" in the output? Is t