With Hadoop streaming and no reducer, I would expect the output written to
HDFS to be the exact STDOUT from the mapper. I noticed that tab characters
(0x9) are getting inserted before every newline character (0xa). This is
problematic for me because the output of my mapper is binary data which I
> > mkdir mypackage
> > mv mypackage/
> > jar cvf NLineRecordReader.jar mypackage
> > [Use this jar]
> >
> > On Thu, Oct 18, 2012 at 10:54 AM, Jason Wang
> wrote:
> >> 1. I did try using NLineInputFormat, but this causes the
> >> "stream.map.input.
nto the front-end too?
>
> $ export HADOOP_CLASSPATH=/path/to/your/jar
> $ command…
>
> 3. Does jar -tf carry a proper mypackage.NLineRecordReader?
>
> 4. Is your class marked public?
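
For check 3, a minimal sketch of how the listing could be tested from a script. The helper name has_class_entry is made up for illustration. The jar name and the entry path mypackage/NLineRecordReader.class are taken from the commands and class name mentioned in this thread, assuming the class really is declared in package mypackage.

```shell
# Hypothetical helper: reads a `jar -tf` listing on stdin and succeeds only
# if one line is exactly the given entry path (fixed string, whole line).
has_class_entry() {
    grep -Fxq "$1"
}

# Intended use (assumes the jar built earlier in the thread exists):
#   jar -tf NLineRecordReader.jar | has_class_entry mypackage/NLineRecordReader.class \
#       && echo "class entry found" || echo "class entry missing"
```

If the entry shows up without the mypackage/ prefix, the class file was probably added from the wrong directory level.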
>
> On Thu, Oct 18, 2012 at 9:32 AM, Jason Wang
> wrote:
> > Hi all,
> > I'm
Hi all,
I'm experimenting with Hadoop streaming on build 1.0.3.
To give background info, I'm streaming a text file into a mapper written in
C. Using the default settings, streaming uses TextInputFormat, which
creates one record from each line. The problem I am having is that I need
record boundarie