Re: Reduce output is strange

Pedro Costa Tue, 03 Apr 2012 08:02:35 -0700

How can I compare two sequence files to check whether their contents are
the same?




On 19 December 2011 14:43, Robert Evans <[email protected]> wrote:

> Oh, I forgot to say that some of the "random characters" really are
> random.  Sequence files store a random byte sequence as sync points
> within the file.  This makes the file easy to split, with little risk
> that the same sequence appears inside the data itself just by chance.
>
> --Bobby Evans
>
> On 12/19/11 7:51 AM, "Pedro Costa" <[email protected]> wrote:
>
> Hi,
>
> In Hadoop MapReduce, I ran the webdatascan example, and the reduce
> output is a SequenceFile. The result is shown here (
> http://paste.lisp.org/display/126572). What is the trash (random
> characters), like "u 265
> 0000100 330 320 252 " \n # ; 374 5 211 V ' 340 376", in the output? Is the
> output correct?
>
>
> 0000000   S   E   Q 006 031   o   r   g   .   a   p   a   c   h   e   .
> 0000020   h   a   d   o   o   p   .   i   o   .   T   e   x   t 031   o
> 0000040   r   g   .   a   p   a   c   h   e   .   h   a   d   o   o   p
> 0000060   .   i   o   .   T   e   x   t  \0  \0  \0  \0  \0  \0   u 265
> 0000100 330 320 252   "  \n   #   ; 374   5 211   V   ' 340 376  \0  \0
> 0000120  \0   X  \0  \0  \0     037   a   p   p   l   e       a   p   p
> 0000140   l   e       b   a   n   a   n   a       a   p   p   l   e
> 0000160   a   p   p   l   e       7   c   a   r   r   o   t       c   a
> 0000200   r   r   o   t       c   a   r   r   o   t       c   a   r   r
> 0000220   o   t       a   p   p   l   e       b   a   n   a   n   a
> 0000240   c   a   r   r   o   t       b   a   n   a   n   a
> 0000256
>
>
> --
> Thanks,
>
>
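
For reference, the offsets in the dump above can be checked with a little
arithmetic. This is a sketch in Python, assuming the layout that the dump
itself suggests: "SEQ" plus a version byte, two length-prefixed class names,
two flag bytes, a four-byte metadata count, then the 16-byte sync marker,
followed by one record. The variable names are mine.

```python
import struct

CLASS_NAME = b"org.apache.hadoop.io.Text"  # 25 bytes, preceded by length byte 031

header_len = (
    4                            # "SEQ" plus the version byte 006
    + 2 * (1 + len(CLASS_NAME))  # key and value class names, each length-prefixed
    + 2                          # two flag bytes (both \0 in the dump)
    + 4                          # metadata entry count (zero in the dump)
)
sync_start = header_len          # the "u 265 330 ..." bytes start here
sync_end = sync_start + 16       # sync marker is 16 bytes long

# The eight bytes right after the sync marker in the dump: \0 \0 \0 X \0 \0 \0 ' '
record_len, key_len = struct.unpack(">ii", b"\0\0\0X\0\0\0 ")

# Header, sync marker, two length fields, then the record body
file_len = sync_end + 8 + record_len

print(oct(sync_start), oct(sync_end), record_len, key_len, oct(file_len))
# prints: 0o76 0o116 88 32 0o256
```

The last value matches the final offset 0000256 in the dump, so the 16 odd
bytes starting at offset 0000076 are the sync marker and everything after
them is a single record: a 32-byte key field (length byte 037 plus 31
characters) and a 56-byte value field (length byte "7", that is 55, plus 55
characters).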


-- 
Best regards,
