Hi,

Could it be that SORTED BY in a CREATE TABLE statement is only accepted
together with buckets (CLUSTERED BY ... INTO n BUCKETS)?

I found the following statement in
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL:

"The CLUSTERED BY and SORTED BY creation commands do not affect how data is
inserted into a table – only how it is read. This means that users must be
careful to insert data correctly by specifying the number of reducers to be
equal to the number of buckets, and using CLUSTER BY and SORT BY commands
in their query."
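
If I read the CREATE TABLE grammar on that page correctly, SORTED BY only
appears nested inside the bucketing clause:
[CLUSTERED BY (...) [SORTED BY (...)] INTO num_buckets BUCKETS]. That would
explain the ParseException at 'SORTED'. Below is a rough sketch of a
statement that I would expect to parse. The column list is cut down to the
one column visible in your mail, the bucket count of 1 is only an assumption
for illustration, and I have not run this against 0.14.0:

```sql
-- Sketch only: SORTED BY placed inside the CLUSTERED BY clause.
-- The bucket count (1) is an illustrative assumption, not a recommendation.
CREATE EXTERNAL TABLE IF NOT EXISTS store.testing (
  `timestamp` bigint   -- remaining columns omitted here
)
CLUSTERED BY (`timestamp`) SORTED BY (`timestamp` ASC) INTO 1 BUCKETS
LOCATION '/project/db/table';
```

As the quoted paragraph says, this only records metadata. Hive does not
check that the files under LOCATION really are bucketed and sorted this way,
so declaring it for data that is laid out differently may give wrong results
if bucket-based optimizations are enabled.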

On Thu, Jul 30, 2015 at 7:22 PM, David Capwell <[email protected]> wrote:

> We are trying to create an external table in Hive. The data is already
> sorted, so we wanted to tell Hive about this. When I do, it fails to
> parse the CREATE statement.
>
> > CREATE EXTERNAL TABLE IF NOT EXISTS store.testing (
> ...
> . . . . . . . . . . . . . . . . . . .>   timestamp bigint,
> ...)
> . . . . . . . . . . . . . . . . . . .>   SORTED BY (timestamp)
> ...
> . . . . . . . . . . . . . . . . . . .>   LOCATION '/project/db/table'
> . . . . . . . . . . . . . . . . . . .> ;
> Error: Error while compiling statement: FAILED: ParseException line
> 1:507 missing EOF at 'SORTED' near ')' (state=42000,code=40000)
> 2: jdbc:hive2://localhost:10000/store>
>
> What can I do to let hive know that my data is sorted? Every example
> online of sorted by is grouped with buckets, but we really don't want
> to add bucketing.
>
>
> Hive version: 0.14.0
>
> Thanks for your help!
>



-- 
Takahiko Saito
