Great, thanks. Is this a server-side-only / requires-restart parameter?

2015-02-23 22:36 GMT-08:00 Gopal Vijayaraghavan <[email protected]>:
> Hi,
>
> Are you sure you have
>
> hive.optimize.metadataonly=true ?
>
> I’m not saying it will complete instantaneously (possibly even be very
> slow, due to the lack of a temp-table optimization of that), but it won’t
> read any part of the actual table.
>
> Cheers,
> Gopal
>
> From: Stephen Boesch <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Monday, February 23, 2015 at 10:26 PM
> To: "[email protected]" <[email protected]>
> Subject: Select distinct on partitioned column requires reading all the
> files?
>
>
> When querying a hive table according to a partitioning column, it would be
> logical that a simple
>
> select count(distinct partitioned_column_name) from my_partitioned_table
>
> would complete almost instantaneously.
>
> But we are seeing that both hive and impala are unable to execute this
> query properly: they just read the entire table!
>
> What do we need to do to ensure the above command executes rapidly?
>
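
For reference, a minimal sketch of how the suggestion above might be tried from a Hive session. It assumes the property can be set per session with SET, which this thread does not confirm; the table and column names are the placeholders from the original question.

```sql
-- Assumption: hive.optimize.metadataonly can be changed at session level.
SET hive.optimize.metadataonly=true;

-- Print the value the current session is actually using.
SET hive.optimize.metadataonly;

-- The query from the original question (placeholder names).
SELECT COUNT(DISTINCT partitioned_column_name)
FROM my_partitioned_table;
```

Running EXPLAIN on the same query before and after the SET is one way to check whether the plan changes.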
