Hi,

Is it possible that SORTED BY in CREATE TABLE can only be used together with buckets?
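As far as I can tell, the CREATE TABLE syntax in the Hive DDL manual lists SORTED BY only as an optional part of the CLUSTERED BY ... INTO num_buckets BUCKETS clause, which would explain the parse error. A minimal sketch of that form follows. The bucket count (32) and the choice of clustering column are my own assumptions, not anything from your table:

```sql
-- Sketch only: SORTED BY nested inside CLUSTERED BY ... INTO n BUCKETS.
-- The bucket count (32) and clustering on the timestamp column are assumed
-- for illustration. The other columns are left out, as in the original.
CREATE EXTERNAL TABLE IF NOT EXISTS store.testing (
  `timestamp` bigint  -- backticks guard against the name being parsed as a keyword
)
CLUSTERED BY (`timestamp`) SORTED BY (`timestamp` ASC) INTO 32 BUCKETS
LOCATION '/project/db/table';
```

This only declares the layout in the metastore; it does not check or change how the existing files are organized.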
I found the following statement in https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL:

"The CLUSTERED BY and SORTED BY creation commands do not affect how data is inserted into a table – only how it is read. This means that users must be careful to insert data correctly by specifying the number of reducers to be equal to the number of buckets, and using CLUSTER BY and SORT BY commands in their query."

On Thu, Jul 30, 2015 at 7:22 PM, David Capwell <[email protected]> wrote:

> We are trying to create a external table in hive. This data is sorted,
> so wanted to tell hive about this. When I do, it complains about
> parsing the create.
>
> CREATE EXTERNAL TABLE IF NOT EXISTS store.testing (
> ...
> . . . . . . . . . . . . . . . . . . .> timestamp bigint,
> ...)
> . . . . . . . . . . . . . . . . . . .> SORTED BY (timestamp)
> ...
> . . . . . . . . . . . . . . . . . . .> LOCATION '/project/db/table'
> . . . . . . . . . . . . . . . . . . .> ;
> Error: Error while compiling statement: FAILED: ParseException line
> 1:507 missing EOF at 'SORTED' near ')' (state=42000,code=40000)
> 2: jdbc:hive2://localhost:10000/store>
>
> What can I do to let hive know that my data is sorted? Every example
> online of sorted by is grouped with buckets, but we really don't want
> to add bucketing.
>
> Hive version: 0.14.0
>
> Thanks for your help!

--
Takahiko Saito
