Re: Table creation for logfile data

2013-11-24 Thread John Omernik
Put the logfile into a location on HDFS, and create an external table pointing to that location. The external table should have just one column, a string: CREATE EXTERNAL TABLE logfile_etl (message STRING) LOCATION '/etl/logfile'; I think that should work. Then create another table: CREATE TABLE

Re: Table creation for logfile data

2013-11-24 Thread Baahu
Hi John, Thanks for the reply. I have been given a new format of data, and now the logs aren't as messy as they were earlier, but yes, your mail gave me pointers which helped me in handling the new data. Now I am stuck on handling a date format: I am getting dates in the form 22/11/13 which is

Re: Table creation for logfile data

2013-11-24 Thread John Omernik
I wouldn't worry about efficiency too much: concat('20', split(date_field, '\\/')[2], '-', split(date_field, '\\/')[1], '-', split(date_field, '\\/')[0]) as proper_date -- YYYY-MM-DD On Sun, Nov 24, 2013 at 12:13 PM, Baahu bahub...@gmail.com wrote: Hi John, Thanks for the reply, I have been
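
The index order in the expression above is easy to get backwards, so it may help to check it outside Hive. Below is a minimal Python sketch of the same split-and-concatenate logic, assuming the input is always DD/MM/YY and every year falls in 2000-2099. The function name to_proper_date is hypothetical and is not part of Hive.

```python
def to_proper_date(date_field: str) -> str:
    """Mirror concat('20', parts[2], '-', parts[1], '-', parts[0]) from the Hive expression."""
    parts = date_field.split("/")  # '22/11/13' -> ['22', '11', '13']
    if len(parts) != 3:
        raise ValueError(f"expected DD/MM/YY, got {date_field!r}")
    day, month, year = parts
    # Prefixing '20' is an assumption: it is only right for years 2000-2099.
    return f"20{year}-{month}-{day}"


print(to_proper_date("22/11/13"))  # 2013-11-22
```

Like the Hive expression, this does no validation of the day or month values. It only reorders the pieces.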

Re: Table creation for logfile data

2013-11-24 Thread 郭士伟
Or use the date functions, such as unix_timestamp to convert a date string to a Unix timestamp and from_unixtime to convert a Unix timestamp back to a string? 2013/11/25 John Omernik j...@omernik.com I wouldn't worry about efficiency too much: concat('20', split(date_field, '\\/')[2], '-', split(date_field,
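
The parse-then-format idea suggested here can be sketched in Python with the standard datetime module, assuming the same DD/MM/YY input. The function name reformat_date is hypothetical. In Hive the analogous call would presumably be from_unixtime(unix_timestamp(date_field, 'dd/MM/yy'), 'yyyy-MM-dd'), which is worth checking against the Hive date function documentation before relying on it.

```python
from datetime import datetime


def reformat_date(date_field: str) -> str:
    """Parse a DD/MM/YY string with a format pattern, then emit YYYY-MM-DD."""
    # %y maps two-digit years 00-68 to 2000-2068 and 69-99 to 1969-1999.
    parsed = datetime.strptime(date_field, "%d/%m/%y")
    return parsed.strftime("%Y-%m-%d")


print(reformat_date("22/11/13"))  # 2013-11-22
```

Unlike the split-and-concatenate approach, parsing with a pattern rejects impossible dates such as 32/11/13. In Python this raises ValueError.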

Re: Table creation for logfile data

2013-11-24 Thread 郭士伟
Here is the documentation: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions 2013/11/25 郭士伟 guoshi...@gmail.com Or use the date functions, such as unix_timestamp to convert a date string to a Unix timestamp and from_unixtime to convert a Unix timestamp to

Table creation for logfile data

2013-11-23 Thread Baahu
Hi, I have a messy log file which I want to use to create a table. I am only interested in retrieving 3 columns (time, ADD, files), which are in bold. Sample entry from log file *: 13-11-23 06:23:45 [ADD] [file1.zip|file2.zip] * junkjunk|2013-11-23 06:23:44:592 EST|file3.zip xyz|2013-11-23
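
As an illustration of pulling the three fields out of one entry, here is a minimal Python sketch. It assumes every entry contains a two-digit-year timestamp followed by two bracketed fields, as in the sample above. The pattern and the names LOG_PATTERN and parse_entry are hypothetical, and the full log format is not shown in the message.

```python
import re

# Assumed layout: "<YY-MM-DD HH:MM:SS> [<action>] [<file>|<file>|...]"
LOG_PATTERN = re.compile(
    r"(\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"  # time, e.g. 13-11-23 06:23:45
    r" \[(\w+)\]"                             # action, e.g. ADD
    r" \[([^\]]*)\]"                          # pipe-separated file list
)


def parse_entry(line):
    """Return (time, action, files) for a log line, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    time, action, files = match.groups()
    return time, action, files.split("|")


sample = ("13-11-23 06:23:45 [ADD] [file1.zip|file2.zip] "
          "junkjunk|2013-11-23 06:23:44:592 EST|file3.zip")
print(parse_entry(sample))
```

The trailing junk is ignored because the pattern only requires the timestamp and the two bracketed fields to appear next to each other.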