Hi,

I reproduced the issue with fewer columns as well.
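
For context, the reduced relation was built along these lines (a sketch; the load path and file name are placeholders for my actual input):

    -- load path is a placeholder; schema trimmed to the two columns involved
    final_by_lsn = LOAD '/tmp/tweets_small.tsv' USING PigStorage('\t')
        AS (lang:chararray, screen_name:chararray);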

grunt> dump final_by_lsn;

(en,LornaGreenNWC)
(en,GuitartJosep)
(en,gwenshap)
(en,innovatesocialm)
(en,Komalmittal91)
(en,Ian_hoch)
(en,p2people)
(en,W4_Jobs_in_ARZ)
(en,wesleyyuhn1)
(en,DoThisBest)

grunt> final_by_lsn_g = GROUP final_by_lsn BY screen_name;


grunt> dump final_by_lsn_g;

(gwenshap,{(en,gwenshap)})
(p2people,{(en,p2people),(en,p2people),(en,p2people)})
(GuitartJosep,{(en,GuitartJosep),(en,GuitartJosep),(en,GuitartJosep)})
(W4_Jobs_in_ARZ,{(en,W4_Jobs_in_ARZ),(en,W4_Jobs_in_ARZ),(en,W4_Jobs_in_ARZ)})
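
A per-group count makes the mismatch easy to see: for this data it should return one row per distinct screen_name, i.e. 10 rows, not the 4 groups above. A sketch of that check:

    -- count the tuples in each group's bag; expect one row per distinct name
    counts = FOREACH final_by_lsn_g GENERATE group AS screen_name,
        COUNT(final_by_lsn) AS n;
    dump counts;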


Steps I tried to find the root cause:
- Removing special characters from the data
- Setting the log level to DEBUG (see the snippet after this list)
However, neither turned up any clue about the problem.
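
For reference, this is roughly how I raised the log level. From the Grunt shell:

    set debug on;

and when launching Pig directly (the script name is a placeholder):

    pig -d DEBUG repro.pig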



Can someone please help me troubleshoot the issue?

Thanks,
Joel

On Fri, Nov 13, 2015 at 12:18 PM, Steve Terrell <[email protected]>
wrote:

> Please try reproducing the problem with the smallest amount of data
> possible.  Use as few rows and the smallest strings possible that still
> demonstrate the discrepancy, and then repost your problem.  Doing so
> will make your request easier to digest for the readers of the group,
> and you might even discover a problem in your original data if you
> cannot reproduce it on a smaller scale.
>
> Thanks,
>     Steve
>
> On Fri, Nov 13, 2015 at 10:28 AM, Sam Joe <[email protected]> wrote:
>
> > Hi,
> >
> > I am trying to group a table (final) containing 10 records by the
> > column screen_name, using the following command:
> >
> > final_by_sn = GROUP final BY screen_name;
> >
> > When I dump the final_by_sn table, only 4 records are returned, as
> > shown below:
> >
> > grunt> dump final_by_sn;
> >
> > (gwenshap,{(.@bigdata used this photo in his blog post and made me realize how much I miss Japan: https://t.co/XdglxbLBhN,en,gwenshap,,4992,1887,2943)})
> > (p2people,{(6 new @p2pLanguages jobs w/ #BigData #Hadoop skills http://t.co/UBAni5DPrw http://t.co/IhKNWMc5fy,en,p2people,,1899,1916,2437),(6 new @p2pLanguages jobs w/ #BigData #Hadoop skills http://t.co/UBAni5DPrw http://t.co/IhKNWMc5fy,en,p2people,,1899,1916,2437),(6 new @p2pLanguages jobs w/ #BigData #Hadoop skills http://t.co/UBAni5DPrw http://t.co/IhKNWMc5fy,en,p2people,,1899,1916,2437)})
> > (GuitartJosep,{(#BigData: What it can and can't do! http://t.co/LrO4NBZE4J,en,GuitartJosep,,61,218,140),(#BigData: What it can and can't do! http://t.co/LrO4NBZE4J,en,GuitartJosep,,61,218,140),(#BigData: What it can and can't do! http://t.co/LrO4NBZE4J,en,GuitartJosep,,61,218,140)})
> > (W4_Jobs_in_ARZ,{(Big #Data #Lead Phoenix AZ (#job) wanted in #Arizona. #TechFetch http://t.co/v82R4WmWMC,en,W4_Jobs_in_ARZ,,7,9,433),(Big #Data #Lead Phoenix AZ (#job) wanted in #Arizona. #TechFetch http://t.co/v82R4WmWMC,en,W4_Jobs_in_ARZ,,7,9,433),(Big #Data #Lead Phoenix AZ (#job) wanted in #Arizona. #TechFetch http://t.co/v82R4WmWMC,en,W4_Jobs_in_ARZ,,7,9,433)})
> >
> > dump final;
> >
> > (RT @lordlancaster: Absolutely blown away by @SciTecDaresbury! 'Proper' Big Data, Smart Cities, Internet of Things &amp; more! #TechNorth http:/…,en,LornaGreenNWC,8,166,188,Mon May 12 10:19:39 +0000 2014,654395184428515332)
> > (#BigData: What it can and can't do! http://t.co/LrO4NBZE4J,en,GuitartJosep,,61,218,Thu Jun 18 10:20:02 +0000 2015,654395189595869184)
> > (.@bigdata used this photo in his blog post and made me realize how much I miss Japan: https://t.co/XdglxbLBhN,en,gwenshap,,4992,1887,Mon Oct 15 20:49:39 +0000 2007,654395195581009920)
> > ("Global Release [Big Data Book] Profit From Science" on @LinkedIn http://t.co/WnJ2HwthYF Congrats to George Danner!,en,innovatesocialm,,1517,1712,Wed Sep 12 13:46:43 +0000 2012,654395207065034752)
> > (Hi, BesPardon Don't Forget to follow --&gt;&gt; http://t.co/Dahu964w5U Thanks.. http://t.co/9kKXJ0GQcT,en,Komalmittal91,,51,0,Thu Feb 12 16:44:50 +0000 2015,654395216208752641)
> > (On Google Books, language, and the possible limits of big data https://t.co/OEebZSK952,en,Ian_hoch,,63,107,Fri Aug 31 16:25:09 +0000 2012,654395216057659392)
> > (6 new @p2pLanguages jobs w/ #BigData #Hadoop skills http://t.co/UBAni5DPrw http://t.co/IhKNWMc5fy,en,p2people,,1899,1916,Wed Mar 04 06:17:09 +0000 2009,654395220373729280)
> > (Big #Data #Lead Phoenix AZ (#job) wanted in #Arizona. #TechFetch http://t.co/v82R4WmWMC,en,W4_Jobs_in_ARZ,,7,9,Fri Aug 29 09:32:31 +0000 2014,654395236718911488)
> > (#Appboy expands suite of #mobile #analytics @venturebeat @wesleyyuhn1 http://t.co/85P6vEJg08 #MarTech #automation http://t.co/rWqzNNt1vW,en,wesleyyuhn1,,1531,1927,Mon Jul 21 12:35:12 +0000 2014,654395243975065600)
> > (Best Cloud Hosting and CDN services for Web Developers http://t.co/9uf6IaUIlM #cdn #cloudcomputing #cloudhosting #webmasters #websites,en,DoThisBest,,816,1092,Mon Nov 26 18:34:20 +0000 2012,654395246025904128)
> > grunt>
> >
> >
> > Could you please help me understand why 6 of the records are
> > eliminated when doing the GROUP BY?
> >
> > Thanks,
> > Joel
> >
>
