If the input column value is NULL or empty string, the row will be put into a special partition, whose name is controlled by the hive parameter hive.exec.default.dynamic.partition.name. The default value is `__HIVE_DEFAULT_PARTITION__`. Basically this partition will contain all "bad" rows whose value are not valid partition names.
so basically you do following things when you create a partitioned table, your partitioned column is normally at the end of the table, so when you are inserting data into this partitioned table, I would recommend using the column names in place select * from so your insert query should look like set hive.exec.dynamic.partition=true; set hive.exec.dynamic.partition.mode=nonstrict; insert overwrite table new_table partition(event_date) select col1, col2 .... coln, event_date from old_table; On Thu, Jun 13, 2013 at 5:24 PM, Hamza Asad <[email protected]> wrote: > when i browse it in browser, all the data is in * > event_date=__HIVE_DEFAULT_PARTITION__<http://10.0.0.14:50075/browseDirectory.jsp?dir=%2Fvar%2Flog%2Fpring%2Fhive%2Fwarehouse%2Fnydus.db%2Fnew_rc_partition_cluster_table%2Fevent_date%3D__HIVE_DEFAULT_PARTITION__&namenodeInfoPort=50070> > *, rest of the files does not contains data > > > On Thu, Jun 13, 2013 at 4:52 PM, Nitin Pawar <[email protected]>wrote: > >> what do you mean when you say "it wont split correctly" ? >> >> >> On Thu, Jun 13, 2013 at 5:19 PM, Hamza Asad <[email protected]>wrote: >> >>> what if i have data of more then 500 days then how can i create >>> partition on date column by specifying each and every date? (i knw that >>> does not happens in dynamic partition but on dynamic partition, it wont >>> splits correctly). >>> >>> >>> On Thu, Jun 13, 2013 at 4:12 PM, Nitin Pawar <[email protected]>wrote: >>> >>>> you can partition existing table unless the hdfs data is laid out in >>>> partitioned fashion. >>>> your best bet is create a new partitioned table >>>> enable dynamic paritionining >>>> read from old table and write into new table >>>> >>>> you can verify the new partitions by using command "show partitions" >>>> >>>> >>>> On Thu, Jun 13, 2013 at 4:40 PM, Hamza Asad <[email protected]>wrote: >>>> >>>>> now i created partition table like >>>>> *CREATE TABLE new_rc_partition_cluster_table( >>>>> >>>>> id int, >>>>> event_id int, >>>>> user_id BIGINT, >>>>> >>>>> intval_1 int , >>>>> intval_2 int, >>>>> intval_3 int, >>>>> intval_4 int, >>>>> intval_5 int, >>>>> intval_6 int, >>>>> intval_7 int, >>>>> intval_8 int, >>>>> intval_9 int, >>>>> intval_10 int, >>>>> intval_11 int, >>>>> intval_12 int, >>>>> intval_13 int, >>>>> intval_14 int, >>>>> intval_15 int, >>>>> intval_16 int, >>>>> intval_17 int, >>>>> intval_18 int, >>>>> intval_19 int, >>>>> intval_20 int, >>>>> intval_21 int, >>>>> intval_22 int, >>>>> intval_23 int, >>>>> intval_24 int, >>>>> intval_25 int, >>>>> intval_26 int) >>>>> PARTITIONED BY (event_date string) >>>>> >>>>> CLUSTERED BY(id) INTO 256 BUCKETS >>>>> ROW FORMAT DELIMITED >>>>> FIELDS TERMINATED BY ',' >>>>> STORED AS RCFile; >>>>> * >>>>> >>>>> rest of the commands are same. But this time INSERT OVERWRITE query >>>>> executed and took time but when i queries from that table, it results none >>>>> as it does not contains data. why is this so? and also please tell me how >>>>> can i partition my existing table dynamicaly on date so that data splits >>>>> equally without mentioning date explicitly? >>>>> >>>>> >>>>> On Wed, Jun 12, 2013 at 6:50 PM, Nitin Pawar >>>>> <[email protected]>wrote: >>>>> >>>>>> you did not create partitioned table. You just created a bucketed >>>>>> table. >>>>>> >>>>>> refer to partitioned table created >>>>>> something like >>>>>> partitioned by (event_date string) >>>>>> >>>>>> >>>>>> On Wed, Jun 12, 2013 at 7:17 PM, Hamza Asad >>>>>> <[email protected]>wrote: >>>>>> >>>>>>> i have created table after enabling dynamic partition. i partitioned >>>>>>> it on date but it is not splitting data datewise. Below is the query of >>>>>>> table created and data insert >>>>>>> CREATE TABLE rc_partition_cluster_table( >>>>>>> id int, >>>>>>> event_id int, >>>>>>> user_id BIGINT, >>>>>>> event_date string, >>>>>>> intval_1 int ) >>>>>>> CLUSTERED BY(id) INTO 256 BUCKETS >>>>>>> ROW FORMAT DELIMITED >>>>>>> FIELDS TERMINATED BY ',' >>>>>>> STORED AS RCFile; >>>>>>> >>>>>>> set hive.exec.dynamic.partition=true; >>>>>>> set hive.exec.dynamic.partition.mode=nonstrict; >>>>>>> set hive.exec.max.dynamic.partitions=1000; >>>>>>> set hive.exec.max.dynamic.partitions.pernode=1000; >>>>>>> >>>>>>> INSERT OVERWRITE TABLE rc_partition_cluster_table Partition >>>>>>> (event_date) >>>>>>> SELECT * FROM events_details; >>>>>>> >>>>>>> why it is not working fine? >>>>>>> >>>>>>> -- >>>>>>> *Muhammad Hamza Asad* >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Nitin Pawar >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> *Muhammad Hamza Asad* >>>>> >>>> >>>> >>>> >>>> -- >>>> Nitin Pawar >>>> >>> >>> >>> >>> -- >>> *Muhammad Hamza Asad* >>> >> >> >> >> -- >> Nitin Pawar >> > > > > -- > *Muhammad Hamza Asad* > -- Nitin Pawar
