There must be some noise in your input that is getting interpreted
differently by Hive and Pig. Loading a bunch of newlines does generate
nulls, so I am not sure what's happening there. Are you loading using
PigStorage? Default delimiters? Can you upload a sample file and script that
reproduces the problem somewhere? Are you running this on Windows with its
weird newline delimiters?

grunt> cat tmp/nulltest
1
2

3
grunt> data = load 'tmp/nulltest' using PigStorage() as (num);
grunt> processed = foreach data generate (num is null OR num == 3 ? 'XXX' : num) as num;
grunt> dump processed;
(1)
(2)
(XXX)
(XXX)
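
One more thing worth ruling out: the flat file reportedly contains \N, which is two characters (a backslash followed by a capital N), and that is the marker Hive's default text format writes for NULL. It is not the newline '\n', so a comparison against '\n' would never match it. Here is a rough sketch of the distinction in plain Python (not Pig; the helper name nvl_fields and the sample rows are made up for illustration):

```python
# Hypothetical helper, not part of Pig or Hive: mimics NVL(field, default)
# for one tab-delimited line of a text file where NULL is stored as \N.
HIVE_NULL = "\\N"  # two characters: a backslash followed by capital N


def nvl_fields(line, default="U", delimiter="\t"):
    """Replace empty fields and literal \\N markers with `default`."""
    # Strip both Unix and Windows line endings before splitting.
    fields = line.rstrip("\r\n").split(delimiter)
    return [default if f in ("", HIVE_NULL) else f for f in fields]


# The marker is not a newline, so comparing against '\n' never matches it.
assert HIVE_NULL != "\n" and len(HIVE_NULL) == 2

print(nvl_fields("1\t\\N\tSeattle\n"))  # ['1', 'U', 'Seattle']
print(nvl_fields("2\t\tBoston\r\n"))    # ['2', 'U', 'Boston']
```

If that is what is in your file, the Pig-side equivalent would be to compare the field against that two-character string rather than against a newline.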

-Dmitriy

On Mon, May 17, 2010 at 12:44 PM, Syed Wasti <mdwa...@hotmail.com> wrote:

> Have tried both ways foo is null OR foo == '\n', doesn't work in pig.
> Why would null values be saved as \N in a file ? Is there a reason, is this
> hive or hadoop way which pig can't understand ?
>
>
> On 5/17/10 11:53 AM, "Dmitriy Ryaboy" <dvrya...@gmail.com> wrote:
>
> > Arguably, that's a Hive bug. What does hive do if you *want* to have a \n as
> > a value?
> >
> > For your case, I think it's as simple as foreach rel generate ( foo is null
> > OR foo == '\n' ? 'U' : foo);
> >
> > -D
> >
> > On Mon, May 17, 2010 at 11:42 AM, Syed Wasti <mdwa...@hotmail.com> wrote:
> >
> >> Well Dmitriy, my bad, I was looking at the data through a hive query and it
> >> shows as NULL, but when I looked into the flat file all the NULL values are
> >> seen as \N.
> >> Hive is able to understand \N as NULL but pig is not... How can I resolve
> >> this ?
> >>
> >> On 5/16/10 4:33 PM, "Dmitriy Ryaboy" <dvrya...@gmail.com> wrote:
> >>
> >>> In that case, maybe it's the data, and what you think is null is actually
> >>> '\n' ?
> >>>
> >>> -D
> >>>
> >>> On Sun, May 16, 2010 at 4:07 PM, Syed Wasti <mdwa...@hotmail.com> wrote:
> >>>
> >>>> Doing absolutely the same thing and I am using pig 6 too.
> >>>> Tried with the fake data on both local and mapreduce modes, works fine.
> >>>> But on my script against actual data in mapreduce mode, it fails to do the
> >>>> same thing, places \N instead of U.
> >>>>
> >>>> grunt> rel1 = LOAD '/user/swasti/data' USING PigStorage('\t') as (num);
> >>>> grunt> dump rel1;
> >>>> (1)
> >>>> (2)
> >>>> (3)
> >>>> ()
> >>>> (5)
> >>>> grunt> find_null = FOREACH rel1 GENERATE (num is null?'U':num);
> >>>> grunt> dump find_null;
> >>>> (1)
> >>>> (2)
> >>>> (3)
> >>>> (U)
> >>>> (5)
> >>>>
> >>>>
> >>>> On 5/16/10 2:23 PM, "Dmitriy Ryaboy" <dvrya...@gmail.com> wrote:
> >>>>
> >>>>> So what I am saying is, check that you are not inserting some weird
> >>>>> non-ascii quotes in your actual script.
> >>>>> I just ran this on Pig 6, it worked:
> >>>>>
> >>>>> grunt> data = load 'tmp/nulltest' using PigStorage() as (num);
> >>>>> grunt> dump data;
> >>>>> (1)
> >>>>> (2)
> >>>>> ()
> >>>>> (3)
> >>>>> grunt> find_nulls = foreach data generate ( num is null ? 'U' : num );
> >>>>> grunt> dump find_nulls;
> >>>>> (1)
> >>>>> (2)
> >>>>> (U)
> >>>>> (3)
> >>>>>
> >>>>> I double-checked just in case, and it works in both local and mapreduce
> >>>>> modes.
> >>>>>
> >>>>> -Dmitriy
> >>>>>
> >>>>> On Sun, May 16, 2010 at 1:49 PM, Syed Wasti <mdwa...@hotmail.com> wrote:
> >>>>>
> >>>>>> Hmm not sure why, I used quotes in this mail, let me rewrite,
> >>>>>> SQL(U is within single quotes): NVL(city,U) city
> >>>>>> Pig(U is within single quotes): (city is null?U:city) AS city
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 5/16/10 1:36 PM, "Dmitriy Ryaboy" <dvrya...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Syed,
> >>>>>>> The samples you pasted include all kinds of extraneous characters. Are you
> >>>>>>> sure your script is properly encoded?
> >>>>>>>
> >>>>>>>
> >>>>>>> On Sun, May 16, 2010 at 1:16 PM, Syed Wasti <mdwa...@hotmail.com> wrote:
> >>>>>>>
> >>>>>>>> I am trying the SQL ³NVL(city, ŒU¹) city² in pig I am using the bincond
> >>>>>>>> operator, ³(city is null?'U': city) AS city², which is of chararray type,
> >>>>>>>> the result file shows Œ\N¹ instead of U.  Any ideas ?
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>>
> >>
> >>
> >>
>
>
>
