Put the logfile into a location on HDFS, and create an external table
pointing to that location. The external table should have just one column,
a string:

CREATE EXTERNAL TABLE logfile_etl (message STRING) LOCATION '/etl/logfile';

I think that should work.
Then create another table:
CREATE TABLE
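(The message is cut off here. A minimal sketch of how that second step might continue, assuming a hypothetical table name, hypothetical column names, and an illustrative regexp_extract-based parse; the regex is a guess against the sample line, not the poster's actual pattern:)

```sql
-- Hypothetical managed table holding the three columns of interest.
CREATE TABLE logfile_parsed (
  event_time STRING,
  action     STRING,
  files      STRING
);

-- Parse each raw line with regexp_extract and load the results.
INSERT OVERWRITE TABLE logfile_parsed
SELECT
  regexp_extract(message, '(\\d{2}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2})', 1) AS event_time,
  regexp_extract(message, '\\[(ADD)\\]', 1)                                 AS action,
  regexp_extract(message, '\\[ADD\\] \\[([^\\]]+)\\]', 1)                   AS files
FROM logfile_etl;
```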
Hi John,
Thanks for the reply. I have been given a new format of data, and now the
logs aren't as messy as they were earlier, but yes, your mail gave me
pointers which helped me in handling the new data.
Now I am stuck while handling a date format: I am getting dates in the
form 22/11/13, which is
I wouldn't worry about efficiency too much:

concat('20', split(date_field, '\\/')[2], '-', split(date_field, '\\/')[1],
'-', split(date_field, '\\/')[0]) as proper_date -- YYYY-MM-DD
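(Plugging the sample value in, to show how the pieces fit; the literal here stands in for your date_field column:)

```sql
-- split('22/11/13', '\\/') yields ['22', '11', '13'];
-- indices [2], [1], [0] give year, month, day, and '20' prefixes the century.
SELECT concat('20', split('22/11/13', '\\/')[2], '-',
              split('22/11/13', '\\/')[1], '-',
              split('22/11/13', '\\/')[0]) AS proper_date;
-- proper_date = '2013-11-22'
```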
On Sun, Nov 24, 2013 at 12:13 PM, Baahu bahub...@gmail.com wrote:
Or use the date functions? Such as unix_timestamp to convert a date string
to a Unix timestamp, and from_unixtime to convert a Unix timestamp back to a
string.
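(A one-liner along those lines, assuming the input really is DD/MM/YY; the format patterns follow Java's SimpleDateFormat, as Hive's date functions do:)

```sql
-- Parse '22/11/13' as dd/MM/yy, then re-render it as yyyy-MM-dd.
SELECT from_unixtime(unix_timestamp('22/11/13', 'dd/MM/yy'), 'yyyy-MM-dd');
-- returns '2013-11-22'
```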
2013/11/25 John Omernik j...@omernik.com
Here is the document:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions
2013/11/25 郭士伟 guoshi...@gmail.com
Hi,
I have a messy log file which I want to use to create a table. I am only
interested in retrieving 3 columns (time, ADD, files), which are in bold.
Sample entry from the log file:

*: 13-11-23 06:23:45 [ADD] [file1.zip|file2.zip] * junkjunk|2013-11-23
06:23:44:592 EST|file3.zip xyz|2013-11-23
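(For reference, one way to pull those three fields out of the sample line with regexp_extract; the patterns are a sketch against this one example line, not a general parser:)

```sql
SELECT
  regexp_extract(line, '^\\*: (\\S+ \\S+) \\[', 1)     AS time,   -- '13-11-23 06:23:45'
  regexp_extract(line, '\\[(ADD)\\]', 1)               AS op,     -- 'ADD'
  regexp_extract(line, '\\[ADD\\] \\[([^\\]]+)\\]', 1) AS files   -- 'file1.zip|file2.zip'
FROM (
  SELECT '*: 13-11-23 06:23:45 [ADD] [file1.zip|file2.zip] * junkjunk' AS line
) t;
```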