If I want to check whether two sequence files contain the same data, how should I compare them?
On 19 December 2011 14:43, Robert Evans wrote:
> Oh I forgot to say that part of the Random Characters are actually random
> characters. Sequence files store a set of random characters as synch
> points within the file. This allows for splitting the file easily without
> a high risk that the random sequence appears inside the data itself just by
> chance.
>
> --Bobby Evans
>
> On 12/19/11 7:51 AM, "Pedro Costa" wrote:
>
> Hi,
>
> In Hadoop MapReduce, I've executed the webdatascan example, and the
> reduce output is in a SequenceFile. The result is shown here
> (http://paste.lisp.org/display/126572). What's the trash (random
> characters), like "u 265
> 0000100 330 320 252 " \n # ; 374 5 211 V ' 340 376" in the output? Is the
> output correct?
>
>
> 0000000 S E Q 006 031 o r g . a p a c h e .
> 0000020 h a d o o p . i o . T e x t 031 o
> 0000040 r g . a p a c h e . h a d o o p
> 0000060 . i o . T e x t \0 \0 \0 \0 \0 \0 u 265
> 0000100 330 320 252 " \n # ; 374 5 211 V ' 340 376 \0 \0
> 0000120 \0 X \0 \0 \0 037 a p p l e a p p
> 0000140 l e b a n a n a a p p l e
> 0000160 a p p l e 7 c a r r o t c a
> 0000200 r r o t c a r r o t c a r r
> 0000220 o t a p p l e b a n a n a
> 0000240 c a r r o t b a n a n a
> 0000256
>
>
> --
> Thanks,
>
>

--
Best regards,
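Since the sync marker is random, as Bobby explains in the quoted reply, two sequence files holding identical records will normally still differ byte for byte, so a plain binary diff is not a reliable test. One option is to decode the records and compare those instead. Below is a minimal Python sketch of that idea. It assumes uncompressed, version 6 files with no metadata entries, laid out as in the dump above: the magic SEQ, a version byte, two length-prefixed class names, two compression flags, a 4-byte metadata count and then the 16-byte sync marker. It also assumes each class name is shorter than 128 bytes, so its length prefix is a single byte. The names read_records and same_records are hypothetical helpers for illustration, not part of Hadoop.

```python
import struct

SYNC_SIZE = 16  # the random sync marker in the header is 16 bytes long


def _read_name(data, pos):
    # Class names are length-prefixed. Assumption: the prefix is one byte
    # (names shorter than 128 bytes), as in the dump above (031 octal = 25).
    n = data[pos]
    if n > 127:
        raise ValueError("multi-byte length prefix not handled in this sketch")
    return data[pos + 1:pos + 1 + n].decode("utf-8"), pos + 1 + n


def read_records(data):
    """Parse the bytes of an uncompressed SequenceFile.

    Returns (key_class, value_class, records), where records is a list of
    (raw_key_bytes, raw_value_bytes) with all sync markers skipped.
    """
    if data[:3] != b"SEQ" or data[3] != 6:
        raise ValueError("expected a version 6 SequenceFile")
    pos = 4
    key_class, pos = _read_name(data, pos)
    value_class, pos = _read_name(data, pos)
    if data[pos] or data[pos + 1]:
        raise ValueError("compressed files are not handled in this sketch")
    pos += 2
    (meta_count,) = struct.unpack_from(">i", data, pos)
    pos += 4
    if meta_count != 0:
        raise ValueError("metadata entries are not handled in this sketch")
    sync = data[pos:pos + SYNC_SIZE]
    pos += SYNC_SIZE

    records = []
    while pos < len(data):
        (rec_len,) = struct.unpack_from(">i", data, pos)
        pos += 4
        if rec_len == -1:
            # A record length of -1 is an escape: the next 16 bytes repeat
            # the sync marker from the header and carry no record data.
            if data[pos:pos + SYNC_SIZE] != sync:
                raise ValueError("sync marker does not match the header")
            pos += SYNC_SIZE
            continue
        (key_len,) = struct.unpack_from(">i", data, pos)
        pos += 4
        # rec_len covers the serialized key plus the serialized value.
        records.append((data[pos:pos + key_len], data[pos + key_len:pos + rec_len]))
        pos += rec_len
    return key_class, value_class, records


def same_records(a, b):
    # Equal when class names and the ordered record lists match,
    # regardless of which random sync marker each file uses.
    return read_records(a) == read_records(b)
```

To try it, the two files could first be copied out of HDFS (for example with hadoop fs -get), read with open(path, "rb").read(), and passed to same_records. For compressed files, or whenever Hadoop is at hand, reading both files with Hadoop's own SequenceFile.Reader and comparing the key/value pairs is probably the safer route.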
