Never mind, I see in the docs that it is rows PER SPLIT.

-b
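
For anyone who finds this thread later: because the N in TABLESAMPLE(N ROWS) is applied to each input split, a count of 100000 would be consistent with the table being read as two splits. Below is a minimal sketch of one way to cap the total, reusing the table names from the script quoted below. It assumes the goal is simply "at most 50000 rows" and that it does not matter which rows are kept, since LIMIT is not a random sample.

```sql
-- Sketch only, under the assumptions stated above.
-- TABLESAMPLE(50000 ROWS) takes up to 50000 rows from EACH input split;
-- the LIMIT then keeps at most 50000 rows overall.
DROP TABLE IF EXISTS sparse_features_small;

CREATE TABLE sparse_features_small
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
AS
SELECT *
FROM sparse_features TABLESAMPLE(50000 ROWS)
LIMIT 50000;

-- Sanity check: the count should now be at most 50000.
SELECT COUNT(*) FROM sparse_features_small;
```

With the LIMIT in place the TABLESAMPLE clause no longer determines the final row count; it only reduces how much of each split is read.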
On Mon, Jul 29, 2013 at 9:52 PM, j.barrett Strausser <j.barrett.straus...@gmail.com> wrote:

> SELECT COUNT(*) FROM sparse_features_small;
>
> And I receive back:
>
> Total MapReduce CPU Time Spent: 3 seconds 330 msec
> OK
> 100000
>
> Rather than the expected 50000
>
> I am running hive 11.2
>
> On Mon, Jul 29, 2013 at 9:51 PM, j.barrett Strausser <j.barrett.straus...@gmail.com> wrote:
>
>> Hello All,
>>
>> Why does TABLESAMPLE(N rows) produce output with 2*N rows?
>>
>> I have the following script:
>>
>> DROP TABLE IF EXISTS sparse_features_small;
>>
>> CREATE TABLE sparse_features_small ROW FORMAT DELIMITED FIELDS TERMINATED
>> BY ',' LINES TERMINATED BY '\n' as
>>
>> SELECT
>> *
>> FROM
>> sparse_features
>> TABLESAMPLE(50000 ROWS)
>>
>> After I execute this by sourcing the file, I can then execute:

--
https://github.com/bearrito
@deepbearrito