Pig has a log loader in Piggybank. You can use it to generate the columns of that table and point the table at the parsed output.
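To make that concrete, here is a minimal sketch of the Hive side, assuming a Pig job has already run the Piggybank log loader and stored the parsed fields tab-delimited in HDFS. The table name, column names, and the HDFS path are all hypothetical:

```sql
-- Hypothetical external table over the output of a Pig/Piggybank parsing job.
-- The path and schema below are assumptions, not taken from the thread.
CREATE EXTERNAL TABLE parsed_logs (
  log_time STRING,
  level    STRING,
  class    STRING,
  message  STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/user/etl/parsed_logs/';
```

Because the table is EXTERNAL and points at the Pig output directory, no separate load step is needed; dropping the table later leaves the data in place.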
Take a look: https://github.com/apache/pig/tree/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/apachelog

Thanks,
Aniket

On Tue, Dec 6, 2011 at 10:19 AM, Abhishek Pratap Singh <manu.i...@gmail.com> wrote:
> Hi Sangeetha,
>
> One easier option is to use Flume decorators to put a delimiter into your
> stream of data and then load the data into the table.
>
> For example, the data below can be converted to, say, pipe-delimited data
> (you can code for any delimiter) by using Flume decorators.
>
> [2011-10-17 16:30:57,281] [ INFO] [33157362@qtp-28456974-0]
> [net.hp.tr.webservice.referenceimplcustomer.resource.CustomersResource]
> [Organization: Travelocity] [Client: AA] [Location of device: DFW] [User:
> 550393] [user_role: ] [CorelationId: 248] [Component: Crossplane] [Server:
> server01] [Request: seats=5] [Response: yes] [Status: pass] - Entering
> Method = getKey()
>
> Pipe-delimited:
>
> 2011-10-17 16:30:57,281 | INFO |33157362@qtp-28456974-0|
> net.hp.tr.webservice.referenceimplcustomer.resource.CustomersResource|
> Organization: Travelocity|Client: AA|Location of device: DFW|User: 550393|
> user_role: |CorelationId: 248|Component: Crossplane|Server: server01|
> Request: seats=5|Response: yes|Status: pass| - Entering Method = getKey()
>
> Once you have this pipe-delimited data, you can create a table with a pipe
> delimiter and load this file.
>
> You can choose any delimiter, as well as remove some data in the Flume
> decorator, and finally load into a Hive table with the same schema and
> delimiter. Hope it helps.
>
> ~Abhishek P Singh
>
> On Tue, Dec 6, 2011 at 7:58 AM, alo alt <wget.n...@googlemail.com> wrote:
>> Hi Sangeetha,
>>
>> Sorry, I was on the road and the answer took a while.
>>
>> As Mark wrote, a SerDe will be a good start. If it's useful for you, take a
>> look at http://code.google.com/p/hive-json-serde/wiki/GettingStarted.
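The pipe-delimited approach described above pairs with a straightforward Hive table definition. A minimal sketch follows; the table name, column names, and HDFS path are assumptions for illustration, and the columns mirror the fields of the sample log line:

```sql
-- Hypothetical table matching the pipe-delimited stream produced by the
-- Flume decorator; all names and the load path below are assumptions.
CREATE TABLE app_logs (
  log_time       STRING,
  level          STRING,
  thread         STRING,
  class          STRING,
  organization   STRING,
  client         STRING,
  device_location STRING,
  user_id        STRING,
  user_role      STRING,
  correlation_id STRING,
  component      STRING,
  server         STRING,
  request        STRING,
  response       STRING,
  status         STRING,
  message        STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;

-- Load the delimited file that Flume wrote to HDFS (example path):
LOAD DATA INPATH '/flume/logs/app.log' INTO TABLE app_logs;
```

With FIELDS TERMINATED BY '|', Hive splits each line on the pipe character at read time, so each log attribute lands in its own column as asked for in the original question.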
>>
>> - alex
>>
>> On Tue, Dec 6, 2011 at 10:26 AM, sangeetha k <get2sa...@yahoo.com> wrote:
>>> Hi,
>>>
>>> Thanks for the response. Yes, you got my question.
>>>
>>> An example line of my log will be as below:
>>>
>>> [2011-10-17 16:30:57,281] [ INFO] [33157362@qtp-28456974-0]
>>> [net.hp.tr.webservice.referenceimplcustomer.resource.CustomersResource]
>>> [Organization: Travelocity] [Client: AA] [Location of device: DFW] [User:
>>> 550393] [user_role: ] [CorelationId: 248] [Component: Crossplane] [Server:
>>> server01] [Request: seats=5] [Response: yes] [Status: pass] - Entering
>>> Method = getKey()
>>>
>>> How do I specify the delimiter while describing the table?
>>>
>>> Thanks,
>>> Sangeetha
>>>
>>> From: alo alt <wget.n...@googlemail.com>
>>> To: user@hive.apache.org; sangeetha k <get2sa...@yahoo.com>
>>> Sent: Tuesday, December 6, 2011 2:01 PM
>>> Subject: Re: log4j format logs in Hive table
>>>
>>> Hi,
>>>
>>> I hope I understood your question correctly: did you describe your table?
>>> Something like
>>>
>>> "CREATE TABLE YOURTABLE (col1 STRING, col2 STRING, col3 STRING)
>>> ROW FORMAT DELIMITED FIELDS TERMINATED BY 'YOUR TERMINATOR'
>>> STORED AS TEXTFILE;"
>>>
>>> col* = names of your choosing; for datatypes, look at the documentation.
>>>
>>> After that, import via "INSERT (OVERWRITE) TABLE YOURTABLE ...".
>>>
>>> - alex
>>>
>>> On Tue, Dec 6, 2011 at 8:56 AM, sangeetha k <get2sa...@yahoo.com> wrote:
>>>
>>> Hi,
>>>
>>> I am new to Hive.
>>>
>>> I am using a Flume agent to collect log4j logs and send them to HDFS.
>>> Now I want to load the log4j-format logs from HDFS into Hive tables.
>>> Each of the attributes in the log statements, like timestamp, level,
>>> classname etc., should be loaded into a separate column in the Hive table.
>>>
>>> I tried creating a table in Hive and loaded the entire log into one
>>> column, but I don't know how to load the above-mentioned data into
>>> separate columns.
>>>
>>> Please send me your suggestions and any links or tutorials on this.
>>>
>>> Thanks,
>>> Sangeetha
>>>
>>> --
>>> Alexander Lorenz
>>> http://mapredit.blogspot.com
>>>
>>> P Think of the environment: please don't print this email unless you
>>> really need to.

--
"...:::Aniket:::... Quetzalco@tl"
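As a footnote to the SerDe suggestion in the thread: if reshaping the stream in Flume is not an option, the bracketed log4j lines can also be split at query time with Hive's contrib RegexSerDe. A rough sketch follows; the regex is simplified (it captures only the first four bracketed fields plus the remainder), and the table name, column names, and location are hypothetical:

```sql
-- May require the hive-contrib jar on the classpath first, e.g. via ADD JAR.
-- The pattern and all names below are illustrative assumptions and would
-- need tuning to cover all fourteen fields of the real log format.
CREATE EXTERNAL TABLE raw_logs (
  log_time STRING,
  level    STRING,
  thread   STRING,
  class    STRING,
  rest     STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "\\[([^\\]]*)\\] \\[ *([^\\]]*)\\] \\[([^\\]]*)\\] \\[([^\\]]*)\\] (.*)",
  "output.format.string" = "%1$s %2$s %3$s %4$s %5$s"
)
STORED AS TEXTFILE
LOCATION '/flume/logs/';
```

Each capturing group in input.regex maps to one column in order, so this avoids any preprocessing step: Hive applies the regex to every line as it reads the raw files Flume delivered.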