Hi Andrew,

I tried that too. Every field has the correct data.

Thanks,
Joel
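
For anyone following the thread, here is a minimal sketch of the kind of check Andrew describes. It assumes the two-column relation final_by_lsn (language, screen_name) from the earlier messages and that screen_name can be cast to chararray. The aliases names, blank, keys, key_grp and key_cnt are made up for illustration, and this is not necessarily the exact script that was run.

```pig
-- Assumption: final_by_lsn has the fields (language, screen_name) as quoted in the thread.
-- Project only the grouping key so that it can be inspected on its own.
names = FOREACH final_by_lsn GENERATE screen_name;
DUMP names;

-- Keep only rows whose key is null or blank; an empty result means every key is populated.
blank = FILTER names BY screen_name IS NULL OR TRIM((chararray)screen_name) == '';
DUMP blank;

-- Count the distinct keys; GROUP ... BY screen_name should produce this many groups.
keys = DISTINCT names;
key_grp = GROUP keys ALL;
key_cnt = FOREACH key_grp GENERATE COUNT(keys);
DUMP key_cnt;
```

Running the same statements in local mode and in mapreduce mode and comparing the outputs would show whether the keys already differ before the GROUP.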

On Wed, Nov 18, 2015 at 12:55 AM, Andrew Oliver <[email protected]> wrote:

> Project just screen_name. If it is blank or empty you have your answer.
>
> On Nov 17, 2015 23:47, "Sam Joe" <[email protected]> wrote:
>
> > debug is on. verbose have to try.
> >
> > Thx.
> >
> > On Tue, Nov 17, 2015 at 11:45 PM, Arvind S <[email protected]> wrote:
> >
> > > have you tried
> > > grunt> set debug on;
> > > grunt> set verbose on;
> > >
> > > this gives some counters which might help ..
> > >
> > > *Cheers !!*
> > > Arvind
> > >
> > > On Wed, Nov 18, 2015 at 9:51 AM, Sam Joe <[email protected]> wrote:
> > >
> > > > Hi Arvind,
> > > >
> > > > Thanks but I ensured that each element is populated to their respective
> > > > fields. I also ensured that the data is clean since the record which is
> > > > getting eliminated is getting processed fine if only one record is
> > > > processed.
> > > >
> > > > How to find the root-cause? I am not getting anything from the server logs
> > > > or from the application logs. Is there any place I should look?
> > > >
> > > > Thanks,
> > > > Joel
> > > >
> > > > On Tue, Nov 17, 2015 at 11:06 PM, Arvind S <[email protected]> wrote:
> > > >
> > > > > Hi ..
> > > > > if you are reading json then ensure that the file content is parsed correct
> > > > > by pig before you do grouping.
> > > > > Simple dump sometimes does not show if the json was parsed into multiple
> > > > > columns or entire line was read as one string into the 1st column only.
> > > > >
> > > > > *Cheers !!*
> > > > > Arvind
> > > > >
> > > > > On Wed, Nov 18, 2015 at 4:59 AM, Sam Joe <[email protected]> wrote:
> > > > >
> > > > > > Hi Arvind,
> > > > > >
> > > > > > You are right. It works fine in local mode. No records eliminated.
> > > > > >
> > > > > > I need to now find out why while using mapreduce mode some records are
> > > > > > getting eliminated.
> > > > > >
> > > > > > Any suggestions on troubleshooting steps for finding out the root-cause in
> > > > > > mapreduce mode? Which logs to be checked, etc.
> > > > > >
> > > > > > Appreciate any help!
> > > > > >
> > > > > > Thanks,
> > > > > > Joel
> > > > > >
> > > > > > On Mon, Nov 16, 2015 at 11:32 PM, Arvind S <[email protected]> wrote:
> > > > > >
> > > > > > > tested on pig .15 using your data and in local mode .. could not reproduce
> > > > > > > issue ..
> > > > > > > ==================================================
> > > > > > > final_by_lsn_g = GROUP final_by_lsn BY screen_name;
> > > > > > >
> > > > > > > (Ian_hoch,{(en,Ian_hoch)})
> > > > > > > (gwenshap,{(en,gwenshap)})
> > > > > > > (p2people,{(en,p2people)})
> > > > > > > (DoThisBest,{(en,DoThisBest)})
> > > > > > > (wesleyyuhn1,{(en,wesleyyuhn1)})
> > > > > > > (GuitartJosep,{(en,GuitartJosep)})
> > > > > > > (Komalmittal91,{(en,Komalmittal91)})
> > > > > > > (LornaGreenNWC,{(en,LornaGreenNWC)})
> > > > > > > (W4_Jobs_in_ARZ,{(en,W4_Jobs_in_ARZ)})
> > > > > > > (innovatesocialm,{(en,innovatesocialm)})
> > > > > > > ==================================================
> > > > > > > final_by_lsn_g = GROUP final_by_lsn BY language;
> > > > > > >
> > > > > > > (en,{(en,DoThisBest),(en,wesleyyuhn1),(en,W4_Jobs_in_ARZ),(en,p2people),(en,Ian_hoch),(en,Komalmittal91),(en,innovatesocialm),(en,gwenshap),(en,GuitartJosep),(en,LornaGreenNWC)})
> > > > > > > ==================================================
> > > > > > >
> > > > > > > suggestions ..
> > > > > > > > try in local mode to reporduce issue .. (if you have not already done so)
> > > > > > > > close all old sessions and open a new one... (i know its dumb..but helped me some times)
> > > > > > >
> > > > > > > *Cheers !!*
> > > > > > > Arvind
> > > > > > >
> > > > > > > On Tue, Nov 17, 2015 at 8:09 AM, Sam Joe <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I reproduced the issue with less columns as well.
> > > > > > > >
> > > > > > > > dump final_by_lsn;
> > > > > > > >
> > > > > > > > (en,LornaGreenNWC)
> > > > > > > > (en,GuitartJosep)
> > > > > > > > (en,gwenshap)
> > > > > > > > (en,innovatesocialm)
> > > > > > > > (en,Komalmittal91)
> > > > > > > > (en,Ian_hoch)
> > > > > > > > (en,p2people)
> > > > > > > > (en,W4_Jobs_in_ARZ)
> > > > > > > > (en,wesleyyuhn1)
> > > > > > > > (en,DoThisBest)
> > > > > > > >
> > > > > > > > grunt> final_by_lsn_g = GROUP final_by_lsn BY screen_name;
> > > > > > > >
> > > > > > > > grunt> dump final_by_lsn_g;
> > > > > > > >
> > > > > > > > (gwenshap,{(en,gwenshap)})
> > > > > > > > (p2people,{(en,p2people),(en,p2people),(en,p2people)})
> > > > > > > > (GuitartJosep,{(en,GuitartJosep),(en,GuitartJosep),(en,GuitartJosep)})
> > > > > > > > (W4_Jobs_in_ARZ,{(en,W4_Jobs_in_ARZ),(en,W4_Jobs_in_ARZ),(en,W4_Jobs_in_ARZ)})
> > > > > > > >
> > > > > > > > Steps I tried to find the root-cause:
> > > > > > > > - Removing special characters from the data
> > > > > > > > - Setting the loglevel to 'Debug'
> > > > > > > > However, I couldn't find a clue about the problem.
> > > > > > > >
> > > > > > > > Can someone please help me troubleshoot the issue?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Joel
> > > > > > > >
> > > > > > > > On Fri, Nov 13, 2015 at 12:18 PM, Steve Terrell <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Please try reproducing the problem with the smallest amount of data
> > > > > > > > > possible. Use as few rows and the smallest strings possible that still
> > > > > > > > > demonstrate the discrepancy. And then repost your problem. In doing so,
> > > > > > > > > it will make your request easier to digest by the readers of group, and you
> > > > > > > > > might even discover a problem in your original data if you can not
> > > > > > > > > reproduce it on a smaller scale.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Steve
> > > > > > > > >
> > > > > > > > > On Fri, Nov 13, 2015 at 10:28 AM, Sam Joe <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > I am trying to group a table (final) containing 10 records, by a
> > > > > > > > > > column screen_name using the following command.
> > > > > > > > > >
> > > > > > > > > > final_by_sn = GROUP final BY screen_name;
> > > > > > > > > >
> > > > > > > > > > When I dump final_by_sn table, only 4 records are returned as shown below:
> > > > > > > > > >
> > > > > > > > > > grunt> dump final_by_sn;
> > > > > > > > > >
> > > > > > > > > > (gwenshap,{(.@bigdata used this photo in his blog post and made me realize how much I miss Japan: https://t.co/XdglxbLBhN,en,gwenshap,,4992,1887,2943 ) })
> > > > > > > > > > (p2people,{(6 new @p2pLanguages jobs w/ #BigData #Hadoop skills http://t.co/UBAni5DPrw http://t.co/IhKNWMc5fy,en,p2people,,1899,1916,2437 ),(6 new @p2pLanguages jobs w/ #BigData #Hadoop skills http://t.co/UBAni5DPrw http://t.co/IhKNWMc5fy,en,p2people,,1899,1916,2437),(6 new @p2pLanguages jobs w/ #BigData #Hadoop skills http://t.co/UBAni5DPrw http://t.co/IhKNWMc5fy,en,p2people,,1899,1916,2437)})
> > > > > > > > > > (GuitartJosep,{(#BigData: What it can and can't do! http://t.co/LrO4NBZE4J,en,GuitartJosep,,61,218,140 ),(#BigData: What it can and can't do! http://t.co/LrO4NBZE4J,en,GuitartJosep,,61,218,140 ),(#BigData: What it can and can't do! http://t.co/LrO4NBZE4J,en,GuitartJosep,,61,218,140)})
> > > > > > > > > > (W4_Jobs_in_ARZ,{(Big #Data #Lead Phoenix AZ (#job) wanted in #Arizona. #TechFetch http://t.co/v82R4WmWMC,en,W4_Jobs_in_ARZ,,7,9,433 ),(Big #Data #Lead Phoenix AZ (#job) wanted in #Arizona. #TechFetch http://t.co/v82R4WmWMC,en,W4_Jobs_in_ARZ,,7,9,433),(Big #Data #Lead Phoenix AZ (#job) wanted in #Arizona. #TechFetch http://t.co/v82R4WmWMC,en,W4_Jobs_in_ARZ,,7,9,433)})
> > > > > > > > > >
> > > > > > > > > > dump final;
> > > > > > > > > >
> > > > > > > > > > (RT @lordlancaster: Absolutely blown away by @SciTecDaresbury! 'Proper' Big Data, Smart Cities, Internet of Things & more! #TechNorth http:/…,en,LornaGreenNWC,8,166,188,Mon May 12 10:19:39 +0000 2014,654395184428515332)
> > > > > > > > > > (#BigData: What it can and can't do! http://t.co/LrO4NBZE4J,en,GuitartJosep,,61,218,Thu Jun 18 10:20:02 +0000 2015,654395189595869184)
> > > > > > > > > > (.@bigdata used this photo in his blog post and made me realize how much I miss Japan: https://t.co/XdglxbLBhN,en,gwenshap,,4992,1887,Mon Oct 15 20:49:39 +0000 2007,654395195581009920)
> > > > > > > > > > ("Global Release [Big Data Book] Profit From Science" on @LinkedIn http://t.co/WnJ2HwthYF Congrats to George Danner!,en,innovatesocialm,,1517,1712,Wed Sep 12 13:46:43 +0000 2012,654395207065034752)
> > > > > > > > > > (Hi, BesPardon Don't Forget to follow -->> http://t.co/Dahu964w5U Thanks.. http://t.co/9kKXJ0GQcT,en,Komalmittal91,,51,0,Thu Feb 12 16:44:50 +0000 2015,654395216208752641)
> > > > > > > > > > (On Google Books, language, and the possible limits of big data https://t.co/OEebZSK952,en,Ian_hoch,,63,107,Fri Aug 31 16:25:09 +0000 2012,654395216057659392)
> > > > > > > > > > (6 new @p2pLanguages jobs w/ #BigData #Hadoop skills http://t.co/UBAni5DPrw http://t.co/IhKNWMc5fy,en,p2people,,1899,1916,Wed Mar 04 06:17:09 +0000 2009,654395220373729280)
> > > > > > > > > > (Big #Data #Lead Phoenix AZ (#job) wanted in #Arizona. #TechFetch http://t.co/v82R4WmWMC,en,W4_Jobs_in_ARZ,,7,9,Fri Aug 29 09:32:31 +0000 2014,654395236718911488)
> > > > > > > > > > (#Appboy expands suite of #mobile #analytics @venturebeat @wesleyyuhn1 http://t.co/85P6vEJg08 #MarTech #automation http://t.co/rWqzNNt1vW,en,wesleyyuhn1,,1531,1927,Mon Jul 21 12:35:12 +0000 2014,654395243975065600)
> > > > > > > > > > (Best Cloud Hosting and CDN services for Web Developers http://t.co/9uf6IaUIlM #cdn #cloudcomputing #cloudhosting #webmasters #websites,en,DoThisBest,,816,1092,Mon Nov 26 18:34:20 +0000 2012,654395246025904128)
> > > > > > > > > > grunt>
> > > > > > > > > >
> > > > > > > > > > Could you please help me understand why 6 records are eliminated while
> > > > > > > > > > doing a group by?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Joel
