Hi Andrew, I tried that too. Every field has got correct data.

Thanks,
Joel

On Wed, Nov 18, 2015 at 12:55 AM, Andrew Oliver <[email protected]> wrote:

> Project just screen_name. If it is blank or empty you have your answer.
> On Nov 17, 2015 23:47, "Sam Joe" <[email protected]> wrote:
>
> > Debug is on. I still have to try verbose.
> >
> > Thx.
> >
> > On Tue, Nov 17, 2015 at 11:45 PM, Arvind S <[email protected]> wrote:
> >
> > > have you tried
> > > grunt> set debug on;
> > > grunt> set verbose on;
> > >
> > > this gives some counters which might help ..
> > >
> > >
> > > *Cheers !!*
> > > Arvind
> > >
> > > On Wed, Nov 18, 2015 at 9:51 AM, Sam Joe <[email protected]> wrote:
> > >
> > > > Hi Arvind,
> > > >
> > > > Thanks, but I ensured that each element is populated into its respective
> > > > field. I also ensured that the data is clean, since the record that is
> > > > getting eliminated is processed fine when it is the only record processed.
> > > >
> > > > How do I find the root cause? I am not getting anything from the server logs
> > > > or from the application logs. Is there any other place I should look?
> > > >
> > > >
> > > > Thanks,
> > > > Joel
> > > >
> > > > On Tue, Nov 17, 2015 at 11:06 PM, Arvind S <[email protected]> wrote:
> > > >
> > > > > Hi ..
> > > > > if you are reading JSON then ensure that the file content is parsed
> > > > > correctly by Pig before you do the grouping.
> > > > > A simple dump sometimes does not show whether the JSON was parsed into
> > > > > multiple columns or the entire line was read as one string into the 1st
> > > > > column only.
> > > > >
> > > > >
> > > > >
> > > > > *Cheers !!*
> > > > > Arvind
> > > > >
> > > > > On Wed, Nov 18, 2015 at 4:59 AM, Sam Joe <[email protected]> wrote:
> > > > >
> > > > > > Hi Arvind,
> > > > > >
> > > > > > You are right. It works fine in local mode. No records are eliminated.
> > > > > >
> > > > > > I now need to find out why some records are getting eliminated in
> > > > > > mapreduce mode.
> > > > > >
> > > > > > Any suggestions on troubleshooting steps for finding the root cause in
> > > > > > mapreduce mode? Which logs should be checked, etc.?
> > > > > >
> > > > > > Appreciate any help!
> > > > > >
> > > > > > Thanks,
> > > > > > Joel
> > > > > >
> > > > > > On Mon, Nov 16, 2015 at 11:32 PM, Arvind S <[email protected]> wrote:
> > > > > >
> > > > > > > Tested on Pig 0.15 using your data in local mode .. could not
> > > > > > > reproduce the issue ..
> > > > > > > ==================================================
> > > > > > > final_by_lsn_g = GROUP final_by_lsn BY screen_name;
> > > > > > >
> > > > > > > (Ian_hoch,{(en,Ian_hoch)})
> > > > > > > (gwenshap,{(en,gwenshap)})
> > > > > > > (p2people,{(en,p2people)})
> > > > > > > (DoThisBest,{(en,DoThisBest)})
> > > > > > > (wesleyyuhn1,{(en,wesleyyuhn1)})
> > > > > > > (GuitartJosep,{(en,GuitartJosep)})
> > > > > > > (Komalmittal91,{(en,Komalmittal91)})
> > > > > > > (LornaGreenNWC,{(en,LornaGreenNWC)})
> > > > > > > (W4_Jobs_in_ARZ,{(en,W4_Jobs_in_ARZ)})
> > > > > > > (innovatesocialm,{(en,innovatesocialm)})
> > > > > > > ==================================================
> > > > > > > final_by_lsn_g = GROUP final_by_lsn BY language;
> > > > > > >
> > > > > > > (en,{(en,DoThisBest),(en,wesleyyuhn1),(en,W4_Jobs_in_ARZ),(en,p2people),(en,Ian_hoch),(en,Komalmittal91),(en,innovatesocialm),(en,gwenshap),(en,GuitartJosep),(en,LornaGreenNWC)})
> > > > > > > ==================================================
> > > > > > >
> > > > > > > suggestions ..
> > > > > > > > try in local mode to reproduce the issue .. (if you have not already done so)
> > > > > > > > close all old sessions and open a new one... (I know it's dumb.. but it has helped me sometimes)
> > > > > > >
> > > > > > >
> > > > > > > *Cheers !!*
> > > > > > > Arvind
> > > > > > >
> > > > > > > On Tue, Nov 17, 2015 at 8:09 AM, Sam Joe <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I reproduced the issue with fewer columns as well.
> > > > > > > >
> > > > > > > > dump final_by_lsn;
> > > > > > > >
> > > > > > > > (en,LornaGreenNWC)
> > > > > > > > (en,GuitartJosep)
> > > > > > > > (en,gwenshap)
> > > > > > > > (en,innovatesocialm)
> > > > > > > > (en,Komalmittal91)
> > > > > > > > (en,Ian_hoch)
> > > > > > > > (en,p2people)
> > > > > > > > (en,W4_Jobs_in_ARZ)
> > > > > > > > (en,wesleyyuhn1)
> > > > > > > > (en,DoThisBest)
> > > > > > > >
> > > > > > > > grunt> final_by_lsn_g = GROUP final_by_lsn BY screen_name;
> > > > > > > >
> > > > > > > >
> > > > > > > > grunt> dump final_by_lsn_g;
> > > > > > > >
> > > > > > > > (gwenshap,{(en,gwenshap)})
> > > > > > > > (p2people,{(en,p2people),(en,p2people),(en,p2people)})
> > > > > > > > (GuitartJosep,{(en,GuitartJosep),(en,GuitartJosep),(en,GuitartJosep)})
> > > > > > > > (W4_Jobs_in_ARZ,{(en,W4_Jobs_in_ARZ),(en,W4_Jobs_in_ARZ),(en,W4_Jobs_in_ARZ)})
> > > > > > > >
> > > > > > > >
> > > > > > > > Steps I tried to find the root-cause:
> > > > > > > > - Removing special characters from the data
> > > > > > > > - Setting the loglevel to 'Debug'
> > > > > > > > However, I couldn't find a clue about the problem.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > Can someone please help me troubleshoot the issue?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Joel
> > > > > > > >
> > > > > > > > On Fri, Nov 13, 2015 at 12:18 PM, Steve Terrell <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Please try reproducing the problem with the smallest amount of
> > > > > > > > > data possible.  Use as few rows and the smallest strings possible
> > > > > > > > > that still demonstrate the discrepancy.  And then repost your
> > > > > > > > > problem.  Doing so will make your request easier for the readers
> > > > > > > > > of the group to digest, and you might even discover a problem in
> > > > > > > > > your original data if you cannot reproduce it on a smaller scale.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >     Steve
> > > > > > > > >
> > > > > > > > > On Fri, Nov 13, 2015 at 10:28 AM, Sam Joe <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > I am trying to group a table (final) containing 10 records by
> > > > > > > > > > the column screen_name, using the following command.
> > > > > > > > > >
> > > > > > > > > > final_by_sn = GROUP final BY screen_name;
> > > > > > > > > >
> > > > > > > > > > When I dump the final_by_sn table, only 4 records are returned,
> > > > > > > > > > as shown below:
> > > > > > > > > >
> > > > > > > > > > grunt> dump final_by_sn;
> > > > > > > > > >
> > > > > > > > > > (gwenshap,{(.@bigdata used this photo in his blog post
> and
> > > made
> > > > > me
> > > > > > > > > realize
> > > > > > > > > > how much I miss Japan:
> > > > > > > > > https://t.co/XdglxbLBhN,en,gwenshap,,4992,1887,2943
> > > > > > > > > > )
> > > > > > > > > > })
> > > > > > > > > > (p2people,{(6 new @p2pLanguages jobs w/ #BigData #Hadoop
> > > skills
> > > > > > > > > > http://t.co/UBAni5DPrw
> > > > > > > > > http://t.co/IhKNWMc5fy,en,p2people,,1899,1916,2437
> > > > > > > > > > ),(6
> > > > > > > > > > new @p2pLanguages jobs w/ #BigData #Hadoop skills
> > > > > > > > http://t.co/UBAni5DPrw
> > > > > > > > > > http://t.co/IhKNWMc5fy,en,p2people,,1899,1916,2437),(6
> new
> > > > > > > > @p2pLanguages
> > > > > > > > > > jobs w/ #BigData #Hadoop skills http://t.co/UBAni5DPrw
> > > > > > > > > > http://t.co/IhKNWMc5fy,en,p2people,,1899,1916,2437)})
> > > > > > > > > > (GuitartJosep,{(#BigData: What it can and can't do!
> > > > > > > > > > http://t.co/LrO4NBZE4J,en,GuitartJosep,,61,218,140
> > > ),(#BigData:
> > > > > > What
> > > > > > > it
> > > > > > > > > can
> > > > > > > > > > and can't do!
> > > > http://t.co/LrO4NBZE4J,en,GuitartJosep,,61,218,140
> > > > > > > > > > ),(#BigData:
> > > > > > > > > > What it can and can't do!
> > > > > > > > > > http://t.co/LrO4NBZE4J,en,GuitartJosep,,61,218,140)})
> > > > > > > > > > (W4_Jobs_in_ARZ,{(Big #Data #Lead Phoenix AZ (#job)
> wanted
> > in
> > > > > > > #Arizona.
> > > > > > > > > > #TechFetch
> > http://t.co/v82R4WmWMC,en,W4_Jobs_in_ARZ,,7,9,433
> > > > > ),(Big
> > > > > > > > #Data
> > > > > > > > > > #Lead Phoenix AZ (#job) wanted in #Arizona. #TechFetch
> > > > > > > > > > http://t.co/v82R4WmWMC,en,W4_Jobs_in_ARZ,,7,9,433),(Big
> > > #Data
> > > > > > #Lead
> > > > > > > > > > Phoenix
> > > > > > > > > > AZ (#job) wanted in #Arizona. #TechFetch
> > > > > > > > > > http://t.co/v82R4WmWMC,en,W4_Jobs_in_ARZ,,7,9,433)})
> > > > > > > > > >
> > > > > > > > > > dump final;
> > > > > > > > > >
> > > > > > > > > > (RT @lordlancaster: Absolutely blown away by @SciTecDaresbury! 'Proper' Big Data, Smart Cities, Internet of Things &amp; more! #TechNorth http:/…,en,LornaGreenNWC,8,166,188,Mon May 12 10:19:39 +0000 2014,654395184428515332)
> > > > > > > > > > (#BigData: What it can and can't do! http://t.co/LrO4NBZE4J,en,GuitartJosep,,61,218,Thu Jun 18 10:20:02 +0000 2015,654395189595869184)
> > > > > > > > > > (.@bigdata used this photo in his blog post and made me realize how much I miss Japan: https://t.co/XdglxbLBhN,en,gwenshap,,4992,1887,Mon Oct 15 20:49:39 +0000 2007,654395195581009920)
> > > > > > > > > > ("Global Release [Big Data Book] Profit From Science" on @LinkedIn http://t.co/WnJ2HwthYF Congrats to George Danner!,en,innovatesocialm,,1517,1712,Wed Sep 12 13:46:43 +0000 2012,654395207065034752)
> > > > > > > > > > (Hi, BesPardon Don't Forget to follow --&gt;&gt; http://t.co/Dahu964w5U Thanks.. http://t.co/9kKXJ0GQcT,en,Komalmittal91,,51,0,Thu Feb 12 16:44:50 +0000 2015,654395216208752641)
> > > > > > > > > > (On Google Books, language, and the possible limits of big data https://t.co/OEebZSK952,en,Ian_hoch,,63,107,Fri Aug 31 16:25:09 +0000 2012,654395216057659392)
> > > > > > > > > > (6 new @p2pLanguages jobs w/ #BigData #Hadoop skills http://t.co/UBAni5DPrw http://t.co/IhKNWMc5fy,en,p2people,,1899,1916,Wed Mar 04 06:17:09 +0000 2009,654395220373729280)
> > > > > > > > > > (Big #Data #Lead Phoenix AZ (#job) wanted in #Arizona. #TechFetch http://t.co/v82R4WmWMC,en,W4_Jobs_in_ARZ,,7,9,Fri Aug 29 09:32:31 +0000 2014,654395236718911488)
> > > > > > > > > > (#Appboy expands suite of #mobile #analytics @venturebeat @wesleyyuhn1 http://t.co/85P6vEJg08 #MarTech #automation http://t.co/rWqzNNt1vW,en,wesleyyuhn1,,1531,1927,Mon Jul 21 12:35:12 +0000 2014,654395243975065600)
> > > > > > > > > > (Best Cloud Hosting and CDN services for Web Developers http://t.co/9uf6IaUIlM #cdn #cloudcomputing #cloudhosting #webmasters #websites,en,DoThisBest,,816,1092,Mon Nov 26 18:34:20 +0000 2012,654395246025904128)
> > > > > > > > > > grunt>
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Could you please help me understand why 6 records are
> > > > > > > > > > eliminated while doing a group by?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Joel
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
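
For readers following along, the checks suggested in this thread (confirm
the schema, project only the key, compare row counts) might look roughly
like the sketch below. It assumes the relation final and the field
screen_name from the original post. The aliases sn_only, final_all and
row_count are made up for illustration, and the cast assumes the field can
be read as a chararray.

```pig
-- Sketch only: relation "final" and field "screen_name" come from the thread;
-- sn_only, final_all and row_count are hypothetical aliases.

-- Confirm the loader produced separate, typed columns rather than one string.
DESCRIBE final;

-- Project just the grouping key. SIZE and TRIM help expose empty values
-- or stray whitespace that a plain dump would hide.
sn_only = FOREACH final GENERATE
    (chararray)screen_name AS sn,
    SIZE((chararray)screen_name) AS sn_len,
    TRIM((chararray)screen_name) AS sn_trimmed;
DUMP sn_only;

-- Count the rows before grouping, to compare with the grouped output.
final_all = GROUP final ALL;
row_count = FOREACH final_all GENERATE COUNT_STAR(final);
DUMP row_count;
```

Running the same statements in local mode and in mapreduce mode, and
comparing the output, is one way to narrow down where the records diverge.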
