Yes, it looks like they were not unique, hence the reduction in count. Thanks!
On Fri, Apr 8, 2016 at 9:46 PM, Steve Terrell <[email protected]> wrote:

> Are the primary keys in the .csv file all unique? (no rows overwriting
> other rows)
>
> On Fri, Apr 8, 2016 at 10:21 AM, Amit Shah <[email protected]> wrote:
>
>> Hi,
>>
>> I am using Phoenix 4.6 and HBase 1.0. After bulk loading 10 million
>> records into a table using the psql.py utility, I tried querying the
>> table using the sqlline.py utility with a select count(*) query. I see
>> only about 0.18 million records.
>>
>> What could be missing?
>>
>> The psql.py logs are:
>>
>> python psql.py localhost -t TRANSACTIONS_TEST ../examples/Transactions_big.csv
>> csv columns from database.
>> Table row timestamp column position: -1
>> Table name: SYSTEM.CATALOG
>> CSV Upsert complete. 10000000 rows upserted
>> Time: 4679.317 sec(s)
>>
>>
>> 0: jdbc:phoenix:localhost> select count(*) from TRANSACTIONS_TEST;
>> Table row timestamp column position: -1
>> Table name: TRANSACTIONS_TEST
>> Table row timestamp column position: -1
>> Table name: SYSTEM.CATALOG
>> +------------------------------------------+
>> |                 COUNT(1)                 |
>> +------------------------------------------+
>> | 184402                                   |
>> +------------------------------------------+
>> 1 row selected (2.173 seconds)
>>
>> Thanks,
>> Amit
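For anyone hitting the same discrepancy: since Phoenix's psql.py loader performs upserts, rows sharing a primary key silently overwrite each other, and the final count equals the number of distinct keys, not the number of CSV rows. A quick sanity check is to compare total rows against distinct key values in the input file before loading. Below is a minimal sketch, assuming (hypothetically) that the primary key is a single column at a known index; adapt the column index or combine columns for a composite key.

```python
import csv
import io

def distinct_key_count(csv_text, key_col=0):
    """Return (total_rows, distinct_keys) for CSV text.

    Rows sharing a key collapse into one row on upsert, so
    distinct_keys is roughly what select count(*) would report
    after a psql.py bulk load.
    """
    total = 0
    keys = set()
    for row in csv.reader(io.StringIO(csv_text)):
        total += 1
        keys.add(row[key_col])
    return total, len(keys)

# Three rows, but two share key "1" -- only two distinct keys survive.
sample = "1,alice\n2,bob\n1,carol\n"
print(distinct_key_count(sample))  # (3, 2)
```

For a large file on disk, the same loop can stream line by line (pass the file object straight to `csv.reader`) instead of holding the text in memory; only the key set needs to fit in RAM.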
