My comments inline From: Stephen Sprague [mailto:sprag...@gmail.com] Sent: Friday, March 14, 2014 12:23 AM To: user@hive.apache.org Subject: Re: Hive - Sorting on the Partition Column data type Int . Output is Alphabetic Sort
wow. its still sorting based on string context. ok, some followups. 1. did you start clean? ie. did you do a "drop table moiz_partition_test" before you started? Yes I did 2. lets see the output of "show create table moiz_partition_test" (if that doesn't work [its hive v0.11 i think] lets see "desc moiz_partition_test" ) hive> desc moiz_partition_test; OK event_dt string partition_hr int Time taken: 1.967 seconds 3. what version of hive are you running? 0.7.1 pretty bizarre. On Thu, Mar 13, 2014 at 3:48 AM, Arafat, Moiz <moiz.ara...@teamaol.com<mailto:moiz.ara...@teamaol.com>> wrote: Hi Stephen, I followed your approach and still got the same result 1) hive> CREATE TABLE moiz_partition_test > (EVENT_DT STRING) partitioned by > ( > PARTITION_HR INT > ) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '09' > location '/user/moiztcs/moiz_partition_test' > ; 2) hive> alter table moiz_partition_test add IF NOT EXISTS partition (partition_hr=0) ; OK Time taken: 2.421 seconds hive> alter table moiz_partition_test add IF NOT EXISTS partition (partition_hr=1) ; OK Time taken: 0.132 seconds hive> alter table moiz_partition_test add IF NOT EXISTS partition (partition_hr=2) ; OK Time taken: 0.226 seconds hive> alter table moiz_partition_test add IF NOT EXISTS partition (partition_hr=10) ; OK Time taken: 0.177 seconds 3) $ hadoop fs -ls /user/moiztcs/moiz_partition_test Found 4 items drwxr-xr-x - cdidw aolmis 0 2014-03-13 06:40 /user/moiztcs/moiz_partition_test/partition_hr=0 drwxr-xr-x - cdidw aolmis 0 2014-03-13 06:41 /user/moiztcs/moiz_partition_test/partition_hr=1 drwxr-xr-x - cdidw aolmis 0 2014-03-13 06:42 /user/moiztcs/moiz_partition_test/partition_hr=10 drwxr-xr-x - cdidw aolmis 0 2014-03-13 06:41 /user/moiztcs/moiz_partition_test/partition_hr=2 4) $ hadoop fs -copyFromLocal test.dat /user/moiztcs/moiz_partition_test/partition_hr=0 $ hadoop fs -copyFromLocal test.dat /user/moiztcs/moiz_partition_test/partition_hr=1 $ hadoop fs -copyFromLocal test.dat /user/moiztcs/moiz_partition_test/partition_hr=10 $ hadoop fs -copyFromLocal test.dat /user/moiztcs/moiz_partition_test/partition_hr=2 5) hive> select distinct partition_hr from moiz_partition_test order by partition_hr; OK 0 1 10 2 Thanks, Moiz From: Stephen Sprague [mailto:sprag...@gmail.com<mailto:sprag...@gmail.com>] Sent: Wednesday, March 12, 2014 9:58 PM To: user@hive.apache.org<mailto:user@hive.apache.org> Subject: Re: Hive - Sorting on the Partition Column data type Int . Output is Alphabetic Sort there you go. I think you're inflicting too much of your own will onto hive with specifying the partition directories as 00, 01, 02. In my experience hive expects the partition name followed by an equal sign followed by the value. I'd stick with this kind of hdfs topology: /user/moiztcs/moiz_partition_test/partition_hr=00/ /user/moiztcs/moiz_partition_test/partition_hr=01/ /user/moiztcs/moiz_partition_test/partition_hr=10/ By omitting the location clause on your alter table statements you should get above layout which can be confirmed by issuing the following command: $ hdfs dfs -ls /user/moiztc/moiz_partition_test Can you try this? On Wed, Mar 12, 2014 at 12:10 AM, Arafat, Moiz <moiz.ara...@teamaol.com<mailto:moiz.ara...@teamaol.com>> wrote: Hi, Here are the steps I followed . Please let me know If I did something wrong. 1) Create table hive> CREATE TABLE moiz_partition_test > (EVENT_DT STRING) partitioned by > ( > PARTITION_HR INT > ) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '09' > location '/user/moiztcs/moiz_partition_test' > ; 2) Add partitions hive> alter table moiz_partition_test add IF NOT EXISTS partition (partition_hr=0) location '/user/moiztcs/moiz_partition_test/00'; OK Time taken: 0.411 seconds hive> alter table moiz_partition_test add IF NOT EXISTS partition (partition_hr=1) location '/user/moiztcs/moiz_partition_test/01'; OK Time taken: 0.193 seconds hive> alter table moiz_partition_test add IF NOT EXISTS partition (partition_hr=2) location '/user/moiztcs/moiz_partition_test/02'; OK Time taken: 0.182 seconds hive> alter table moiz_partition_test add IF NOT EXISTS partition (partition_hr=10) location '/user/moiztcs/moiz_partition_test/10'; OK Time taken: 0.235 seconds 3) Copy data into the directories hadoop fs -copyFromLocal test.dat /user/moiztcs/moiz_partition_test/00 hadoop fs -copyFromLocal test.dat /user/moiztcs/moiz_partition_test/01 hadoop fs -copyFromLocal test.dat /user/moiztcs/moiz_partition_test/02 hadoop fs -copyFromLocal test.dat /user/moiztcs/moiz_partition_test/10 4) Ran the sql hive> select distinct partition_hr from moiz_partition_test order by partition_hr; Ended Job OK 0 1 10 2 Thanks, Moiz From: Stephen Sprague [mailto:sprag...@gmail.com<mailto:sprag...@gmail.com>] Sent: Wednesday, March 12, 2014 12:55 AM To: user@hive.apache.org<mailto:user@hive.apache.org> Subject: Re: Hive - Sorting on the Partition Column data type Int . Output is Alphabetic Sort that makes no sense. if the column is an int it isn't going to sort like a string. I smell a user error somewhere. On Tue, Mar 11, 2014 at 6:21 AM, Arafat, Moiz <moiz.ara...@teamaol.com<mailto:moiz.ara...@teamaol.com>> wrote: Hi , I have a table that has a partition column partition_hr . Data Type is int (partition_hr int) . When i run a sort on this column the output is like this. 0 1 10 11 12 13 14 15 16 17 18 19 2 20 21 22 23 3 4 5 6 7 8 9 I expected the output like this . 0 1 2 3 4 5 6 7 8 9 10 . . and so on. It works fine for non-partition columns. Please advise. Thanks, Moiz