Thanks Mich. This date is informational for now. Will see once I need it. On Sat, Jun 4, 2016 at 5:23 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> or just create an internal table and do insert/select from external table > to that table as Dudu mentioned > > hive> use test; > OK > hive> desc mytime; > OK > adddate timestamp > > hive> insert into > > test.mytime > > select cast(concat_ws(' ',substring > ("2016-05-17T02:10:44.527",1,10),substring ("2016-05-17T02:10:44.527",12)) > as timestamp) as adddate; > > HTH > > Dr Mich Talebzadeh > > > > LinkedIn * > https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw > <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* > > > > http://talebzadehmich.wordpress.com > > > > On 4 June 2016 at 21:31, Igor Kravzov <igork.ine...@gmail.com> wrote: > >> Thanks Dudu. >> So if I need actual date I will use view. >> Regarding partition column: I can create 2 external tables based on the >> same data with integer or string column partition and see which one is more >> convenient for our use. >> >> On Sat, Jun 4, 2016 at 2:20 PM, Markovitz, Dudu <dmarkov...@paypal.com> >> wrote: >> >>> I’m not aware of an option to do what you request in the external table >>> definition but you might want to that using a view. >>> >>> >>> >>> P.s. >>> >>> I seems to me that defining the partition column as a string would be >>> more user friendly than integer, e.g. – >>> >>> >>> >>> select * from threads_test where yyyymmdd like ‘2016%’ – year 2016; >>> >>> select * from threads_test where yyyymmdd like ‘201603%’ –- March 2016; >>> >>> select * from threads_test where yyyymmdd like ‘______01’ -- first of >>> every month; >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> $ hdfs dfs -ls -R /tmp/threads_test >>> >>> drwxr-xr-x - cloudera supergroup 0 2016-06-04 10:45 >>> /tmp/threads_test/20160604 >>> >>> -rw-r--r-- 1 cloudera supergroup 136 2016-06-04 10:45 >>> /tmp/threads_test/20160604/data.txt >>> >>> >>> >>> $ hdfs dfs -cat /tmp/threads_test/20160604/data.txt >>> >>> {"url":"www.blablabla.com >>> ","pageType":"pg1","addDate":"2016-05-17T02:10:44.527","postDate":"2016-05-16T02:08:55","postText":"YadaYada"} >>> >>> >>> >>> >>> ---------------------------------------------------------------------------------------------------- >>> >>> >>> >>> >>> >>> hive> add jar >>> /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar; >>> >>> >>> >>> hive> >>> >>> create external table threads_test >>> >>> ( >>> >>> url string >>> >>> ,pagetype string >>> >>> ,adddate string >>> >>> ,postdate string >>> >>> ,posttext string >>> >>> ) >>> >>> partitioned by (yyyymmdd string) >>> >>> row format serde 'org.apache.hive.hcatalog.data.JsonSerDe' >>> >>> location '/tmp/threads_test' >>> >>> ; >>> >>> >>> >>> hive> alter table threads_test add partition (yyyymmdd=20160604) >>> location '/tmp/threads_test/20160604'; >>> >>> >>> >>> hive> select * from threads_test; >>> >>> >>> >>> www.blablabla.com pg1 2016-05-17T02:10:44.527 >>> 2016-05-16T02:08:55 YadaYada 20160604 >>> >>> >>> >>> hive> >>> >>> create view threads_test_v >>> >>> as >>> >>> select url >>> >>> ,pagetype >>> >>> ,cast (concat_ws(' ',substr (adddate ,1,10),substr (adddate >>> ,12)) as timestamp) as adddate >>> >>> ,cast (concat_ws(' ',substr (postdate,1,10),substr >>> (postdate,12)) as timestamp) as postdate >>> >>> ,posttext >>> >>> >>> >>> from threads_test >>> >>> ; >>> >>> >>> >>> hive> select * from threads_test_v; >>> >>> >>> >>> www.blablabla.com pg1 2016-05-17 02:10:44.527 2016-05-16 >>> 02:08:55 YadaYada >>> >>> >>> >>> >>> >>> *From:* Igor Kravzov [mailto:igork.ine...@gmail.com] >>> *Sent:* Saturday, June 04, 2016 8:13 PM >>> *To:* user@hive.apache.org >>> *Subject:* Convert date in string format to timestamp in table >>> definition >>> >>> >>> >>> Hi, >>> >>> >>> >>> I have 2 dates in Json file defined like this >>> >>> "addDate": "2016-05-17T02:10:44.527", >>> >>> "postDate": "2016-05-16T02:08:55", >>> >>> >>> >>> Right now I define external table based on this file like this: >>> >>> CREATE external TABLE threads_test >>> >>> (url string, >>> >>> pagetype string, >>> >>> adddate string, >>> >>> postdate string, >>> >>> posttext string) >>> >>> partitioned by (yyyymmdd int) >>> >>> ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' >>> >>> location 'my location'; >>> >>> >>> >>> is it possible to define these 2 dates as timestamp? >>> >>> Do I need to change date format in the file? is it possible to specify >>> date format in table definition? >>> >>> Or I better off with string? >>> >>> >>> >>> Thanks in advance. >>> >> >> >