Could you bucket the table and use the TABLESAMPLE clause?

york
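A minimal sketch of what bucketing plus TABLESAMPLE could look like. The column names (user_id, url, ts), the table name mytable_bucketed, and the bucket count of 32 are assumptions for illustration only; the thread does not give the real schema. Note that populating the bucketed copy still reads mytable once, so this pays off only if you sample repeatedly.

```sql
-- Hypothetical schema: user_id, url and ts are assumed columns, not taken from the thread.

-- Needed on older Hive releases so that inserts actually honor the bucket definition.
SET hive.enforce.bucketing = true;

-- Bucketed copy of the table; 32 is an arbitrary bucket count chosen for the example.
CREATE TABLE mytable_bucketed (
  user_id BIGINT,
  url     STRING,
  ts      STRING
)
CLUSTERED BY (user_id) INTO 32 BUCKETS;

-- One-time population. This step still scans all of mytable.
INSERT OVERWRITE TABLE mytable_bucketed
SELECT user_id, url, ts
FROM mytable;

-- Sample one bucket out of 32, i.e. roughly 1/32 of the rows.
SELECT *
FROM mytable_bucketed TABLESAMPLE (BUCKET 1 OUT OF 32 ON user_id) s
LIMIT 10;
```

The sample avoids a full scan only when the ON column matches the CLUSTERED BY column of the table; sampling on any other column (or on a table that is not bucketed) makes Hive read the whole table and filter afterwards.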
From: Yang <teddyyyy...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Sunday, September 29, 2013 3:39 PM
To: "hive-u...@hadoop.apache.org" <hive-u...@hadoop.apache.org>
Subject: how to treat an existing partition data file as a table?

We have a huge table containing browsing data for, let's say, the past 5 years. Now I want to take a few samples to play around with, so I ran

select * from mytable limit 10;

but it went all out and tried to scan the entire table.

Is there a way to create a kind of "view" pointing to only one of the data files used by the original table mytable? That way the total number of files to be scanned would be much smaller.

Thanks!
Yang