There must be some noise in your input that is getting interpreted differently by Hive and Pig. Loading a bunch of newlines does generate nulls, so I am not sure what's happening there. Are you loading using PigStorage? Default delimiters? Can you upload a sample file and script that reproduces the problem somewhere? Are you running this on Windows with its weird newline delimiters?
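One way to look for that kind of noise outside of Pig is to count what the raw bytes of the input actually contain. The sketch below is Python rather than Pig. The function name `summarize_fields`, the tab delimiter, and the sample bytes are assumptions made up for illustration, not anything from the actual data. It counts fields that are the literal two characters `\N`, fields that are truly empty, and lines that end in a Windows carriage return.

```python
from collections import Counter

# Assumption: the null marker is the literal two bytes backslash + N,
# which is what the flat file was reported to contain.
HIVE_NULL_MARKER = b"\\N"


def summarize_fields(raw: bytes, delimiter: bytes = b"\t") -> Counter:
    """Count the kinds of field values found in delimited text (hypothetical helper)."""
    counts = Counter()
    lines = raw.split(b"\n")
    # A file that ends with a newline yields one trailing empty chunk; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    for line in lines:
        # A trailing carriage return means Windows-style (CRLF) line endings.
        if line.endswith(b"\r"):
            counts["crlf_lines"] += 1
            line = line[:-1]
        for field in line.split(delimiter):
            if field == HIVE_NULL_MARKER:
                counts["hive_null_marker"] += 1
            elif field == b"":
                counts["empty"] += 1
            else:
                counts["other"] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample: two values, a literal \N, an empty line, and a CRLF line.
    sample = b"1\n2\n\\N\n\n5\r\n"
    print(summarize_fields(sample))
```

On a real file you would pass in something like `open(path, "rb").read()`. A non-zero `hive_null_marker` count would suggest the field holds the text `\N` rather than a real null or a newline, and a non-zero `crlf_lines` count would point at Windows line endings.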
grunt> cat tmp/nulltest
1
2

3
grunt> data = load 'tmp/nulltest' using PigStorage() as (num);
grunt> processed = foreach data generate (num is null OR num == 3 ? 'XXX' : num) as num;
grunt> dump processed;
(1)
(2)
(XXX)
(XXX)

-Dmitriy

On Mon, May 17, 2010 at 12:44 PM, Syed Wasti <mdwa...@hotmail.com> wrote:

> Have tried both ways foo is null OR foo == '\n', doesn't work in pig.
> Why would null values be saved as \N in a file ? Is there a reason, is this
> hive or hadoop way which pig cant understand ?
>
>
> On 5/17/10 11:53 AM, "Dmitriy Ryaboy" <dvrya...@gmail.com> wrote:
>
> > Arguably, that's a Hive bug. What does hive do if you *want* to have a \n as
> > a value?
> >
> > For your case, I think it's as simple as foreach rel generate ( foo is null
> > OR foo == '\n' ? 'U' : foo);
> >
> > -D
> >
> > On Mon, May 17, 2010 at 11:42 AM, Syed Wasti <mdwa...@hotmail.com> wrote:
> >
> >> Well Dmitriy, my bad, I was looking at the data through a hive query and it
> >> shows as NULL, but when I looked into the flat file all the NULL values
> >> are seen as \N.
> >> Hive is able to understand \N as NULL but pig is not... How can I resolve
> >> this ?
> >>
> >> On 5/16/10 4:33 PM, "Dmitriy Ryaboy" <dvrya...@gmail.com> wrote:
> >>
> >>> In that case, maybe it's the data, and what you think is null is actually
> >>> '\n' ?
> >>>
> >>> -D
> >>>
> >>> On Sun, May 16, 2010 at 4:07 PM, Syed Wasti <mdwa...@hotmail.com> wrote:
> >>>
> >>>> Doing absolutely the same thing and I am using pig 6 too.
> >>>> Tried with the fake data on both local and mapreduce modes, works fine.
> >>>> But on my script against actual data in mapreduce mode, it fails to do the
> >>>> same thing, places \N instead of U.
> >>>>
> >>>> grunt> rel1 = LOAD '/user/swasti/data' USING PigStorage('\t') as (num);
> >>>> grunt> dump rel1;
> >>>> (1)
> >>>> (2)
> >>>> (3)
> >>>> ()
> >>>> (5)
> >>>> grunt> find_null = FOREACH rel1 GENERATE (num is null?'U':num);
> >>>> grunt> dump find_null;
> >>>> (1)
> >>>> (2)
> >>>> (3)
> >>>> (U)
> >>>> (5)
> >>>>
> >>>>
> >>>> On 5/16/10 2:23 PM, "Dmitriy Ryaboy" <dvrya...@gmail.com> wrote:
> >>>>
> >>>>> So what I am saying is, check that you are not inserting some weird
> >>>>> non-ascii quotes in your actual script.
> >>>>> I just ran this on Pig 6, it worked:
> >>>>>
> >>>>> grunt> data = load 'tmp/nulltest' using PigStorage() as (num);
> >>>>> grunt> dump data;
> >>>>> (1)
> >>>>> (2)
> >>>>> ()
> >>>>> (3)
> >>>>> grunt> find_nulls = foreach data generate ( num is null ? 'U' : num );
> >>>>> grunt> dump find_nulls;
> >>>>> (1)
> >>>>> (2)
> >>>>> (U)
> >>>>> (3)
> >>>>>
> >>>>> I double-checked just in case, and it works in both local and mapreduce
> >>>>> modes.
> >>>>>
> >>>>> -Dmitriy
> >>>>>
> >>>>> On Sun, May 16, 2010 at 1:49 PM, Syed Wasti <mdwa...@hotmail.com> wrote:
> >>>>>
> >>>>>> Hmm not sure why, I used quotes in this mail, let me rewrite,
> >>>>>> SQL(U is within single quotes): NVL(city,U) city
> >>>>>> Pig(U is within single quotes): (city is null?U:city) AS city
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 5/16/10 1:36 PM, "Dmitriy Ryaboy" <dvrya...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Syed,
> >>>>>>> The samples you pasted include all kinds of extraneous characters. Are you
> >>>>>>> sure your script is properly encoded?
> >>>>>>>
> >>>>>>>
> >>>>>>> On Sun, May 16, 2010 at 1:16 PM, Syed Wasti <mdwa...@hotmail.com> wrote:
> >>>>>>>
> >>>>>>>> I am trying the SQL ³NVL(city, ŒU¹) city² in pig I am using the bincond
> >>>>>>>> operator, ³(city is null?'U': city) AS city², which is of chararray type,
> >>>>>>>> the result file shows Œ\N¹ instead of U. Any ideas ?