Hi Prabhu, Be careful when going into the direction of calendar dimensions. While strictly speaking this is a cleaner dwh design you will for sure run into issues you might not expect. Consider this is probably what you would want to do (roughly) to query a day:
select count(*) from fact f join dim_date d on (d.date_id = f.date_id) where ddate = '2020-12-22' That won't trigger partition pruning and the query will walk over all records in the fact table (I doubt that's what you would want). Pruning happens during the creation of the query plan and at that time it doesn't know how many records the dim_date table will return so can't do any partition pruning for you. If you would want partitioning to work in this case you would have to do: select count(*) from fact f where f.dateid =7662 Which kind of defeats the purpose of the dim_date table :( At this point in time I would simply point the date in the fact table and use functions to get things like month. It's annoying but it works so: select count(*) from fact f where date = '2020-12-22' Bennie. From: prabhu k [mailto:prabhu.h...@gmail.com] Sent: Wednesday, July 25, 2012 1:59 PM To: user@hive.apache.org; bejoy...@yahoo.com Subject: Re: Loading data into data_dim table Thanks for your help :) it's data has been loaded fine now, select * from dim_date; 7662 2020-12-22 00:00:00.000 2020 4 12 3 52 13 4 357 83 22 3 December Dec Tuesday Tue 7663 2020-12-23 00:00:00.000 2020 4 12 3 52 13 4 358 84 23 4 December Dec Wednesday Wed 7664 2020-12-24 00:00:00.000 2020 4 12 3 52 13 4 359 85 24 5 December Dec Thursday Thu 7665 2020-12-25 00:00:00.000 2020 4 12 3 52 13 4 360 86 25 6 December Dec Friday Fri 7666 2020-12-26 00:00:00.000 2020 4 12 3 52 13 4 361 87 26 7 December Dec Saturday Sat 7667 2020-12-27 00:00:00.000 2020 4 12 3 53 14 5 362 88 27 1 December Dec Sunday Sun 7668 2020-12-28 00:00:00.000 2020 4 12 3 53 14 5 363 89 28 2 December Dec Monday Mon 7669 2020-12-29 00:00:00.000 2020 4 12 3 53 14 5 364 90 29 3 December Dec Tuesday Tue 7670 2020-12-30 00:00:00.000 2020 4 12 3 53 14 5 365 91 30 4 December Dec Wednesday Wed 7671 2020-12-31 00:00:00.000 2020 4 12 3 53 14 5 366 92 31 5 December Dec Thursday Thu Time taken: 0.401 seconds Thanks, Prabhu. On Wed, Jul 25, 2012 at 5:20 PM, Bejoy KS <bejoy...@yahoo.com<mailto:bejoy...@yahoo.com>> wrote: Hi Prabhu Your data is tab delimited use /t as the delimiter while creating table. fields terminated by '/t' Not sure this is the right / or not. If this doesn't work try the other one. Regards Bejoy KS Sent from handheld, please excuse typos. ________________________________ From: prabhu k <prabhu.h...@gmail.com<mailto:prabhu.h...@gmail.com>> Date: Wed, 25 Jul 2012 17:10:09 +0530 To: <user@hive.apache.org<mailto:user@hive.apache.org>> ReplyTo: user@hive.apache.org<mailto:user@hive.apache.org> Subject: Re: Loading data into data_dim table Thanks for the reply. I have tried the with delimited fields terminated by '|' and delimited fields terminated by ',' while selecting the table both Im getting null . when i see the HDFS file looks like below. bin/hadoop fs -cat /user/hive/warehoure/time.txt 7666 2020-12-26 00:00:00.000 2020 4 12 3 52 13 4 361 87 26 7 December Dec Saturday Sat 20201226 2020/12/26 Dec 26 2020 2020-12-26 7667 2020-12-27 00:00:00.000 2020 4 12 3 53 14 5 362 88 27 1 December Dec Sunday Sun 20201227 2020/12/27 Dec 27 2020 2020-12-27 7668 2020-12-28 00:00:00.000 2020 4 12 3 53 14 5 363 89 28 2 December Dec Monday Mon 20201228 2020/12/28 Dec 28 2020 2020-12-28 7669 2020-12-29 00:00:00.000 2020 4 12 3 53 14 5 364 90 29 3 December Dec Tuesday Tue 20201229 2020/12/29 Dec 29 2020 2020-12-29 7670 2020-12-30 00:00:00.000 2020 4 12 3 53 14 5 365 91 30 4 December Dec Wednesday Wed 20201230 2020/12/30 Dec 30 2020 2020-12-30 7671 2020-12-31 00:00:00.000 2020 4 12 3 53 14 5 366 92 31 5 December Dec Thursday Thu 20201231 2020/12/31 Dec 31 2020 2020-12-31 Please suggest and help me. Thanks, Prabhu. On Wed, Jul 25, 2012 at 4:58 PM, Bertrand Dechoux <decho...@gmail.com<mailto:decho...@gmail.com>> wrote: What Bejoy is saying implicitly, is that the format is not verified by the load command. If it does not match, you will get NULL. And it would be curious that your comma separated value (csv) file is using pipe (|) but why not. Bertrand On Wed, Jul 25, 2012 at 12:45 PM, Bejoy KS <bejoy...@yahoo.com<mailto:bejoy...@yahoo.com>> wrote: Hi Prabhu Can you cat the file in hdfs and ensure that the fields are delimited by '|' character. hadoop fs -text user/hive/warehouse/dim_date/time.csv Regards Bejoy KS Sent from handheld, please excuse typos. ________________________________ From: prabhu k <prabhu.h...@gmail.com<mailto:prabhu.h...@gmail.com>> Date: Wed, 25 Jul 2012 16:05:42 +0530 To: <user@hive.apache.org<mailto:user@hive.apache.org>> ReplyTo: user@hive.apache.org<mailto:user@hive.apache.org> Subject: Loading data into data_dim table Hi Users, I have created dim_date table like below. table created successfully and i then load the data into the dim_date table while i am selecting the table, getting null values.my input file is time.csv file hive> create table dim_date(DateId int,ddate string,Year int,Quarter int,Month_Number_Year int,Month_Number_Of_Quarter int,Week_Number_Of_Year int,Week_Number_Of_Quarter int,Week_Number_Of_Month int,Day_Number_Of_Year int,Day_Number_Of_Quarter int,Day_Number_Of_Month int,Day_Number_Of_Week int,Month_Name string,Month_Name_Abbreviation string,Day_Name string,Day_Name_Abbreviation string)row format delimited fields terminated by '|' stored as textfile; OK Time taken: 0.123 seconds hive> load data local inpath '/usr/local/time.csv' overwrite into table dim_date; Copying data from file:/usr/local/time.csv Copying file: file:/usr/local/time.csv Loading data to table default.dim_date Deleted hdfs://md-aaa1:54310/user/hive/warehouse/dim_date OK Time taken: 0.234 seconds hive> select * from dim_date; NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL Please help me on this issue, I have missed anything wrong? Thanks, Prabhu. -- Bertrand Dechoux