+1 for documentation. Sometimes it surprises you. :)
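
For anyone who finds this thread later, here is a toy Python model of the "rows per split" behavior. It is only a sketch: `tablesample_rows` and the split layout are made up for illustration and are not Hive internals. The two splits are an assumption that happens to match the 100000 count reported in the quoted message.

```python
def tablesample_rows(splits, n):
    """Toy model of a per-split row limit: take up to n rows from EACH split."""
    sampled = []
    for split in splits:
        # The limit applies to every split independently, not to the table as a whole.
        sampled.extend(split[:n])
    return sampled


# Hypothetical layout (not from the thread): two input splits of 80,000 rows each.
splits = [list(range(80_000)), list(range(80_000, 160_000))]

rows = tablesample_rows(splits, 50_000)
print(len(rows))  # 100000, i.e. 2 splits * 50000 rows
```

In this model a table read as a single split would return at most 50,000 rows for the same call, so the total depends on how many splits the input produces.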

On Mon, Jul 29, 2013 at 7:11 PM, j.barrett Strausser <j.barrett.straus...@gmail.com> wrote:

> Never mind, I see in the docs: it is rows PER SPLIT.
>
> -b
>
> On Mon, Jul 29, 2013 at 9:52 PM, j.barrett Strausser <j.barrett.straus...@gmail.com> wrote:
>
>> SELECT COUNT(*) FROM sparse_features_small;
>>
>> And I receive back:
>>
>> Total MapReduce CPU Time Spent: 3 seconds 330 msec
>> OK
>> 100000
>>
>> Rather than the expected 50000.
>>
>> I am running hive 11.2
>>
>> On Mon, Jul 29, 2013 at 9:51 PM, j.barrett Strausser <j.barrett.straus...@gmail.com> wrote:
>>
>>> Hello All,
>>>
>>> Why does TABLESAMPLE(N rows) produce output with 2*N rows?
>>>
>>> I have the following script:
>>>
>>> DROP TABLE IF EXISTS sparse_features_small;
>>>
>>> CREATE TABLE sparse_features_small ROW FORMAT DELIMITED FIELDS
>>> TERMINATED BY ',' LINES TERMINATED BY '\n' as
>>>
>>> SELECT
>>> *
>>> FROM
>>> sparse_features
>>> TABLESAMPLE(50000 ROWS)
>>>
>>> After I execute this by sourcing the file, I can then execute:
>>>
>>> --
>>> https://github.com/bearrito
>>> @deepbearrito