So the regex has to match every piece of the line completely. I wrote the
regex so that it just takes a few helpful things out of the log line.

Thanks for your help
Jonathan


On Sat, Oct 1, 2011 at 12:14 AM, Vijay <tec...@gmail.com> wrote:

> The log lines are in some kind of JSON format though. The regex needs
> to literally match the log lines.
>
> On Fri, Sep 30, 2011 at 8:52 PM, Jonathan <jonny2...@gmail.com> wrote:
> > Hi,
> > I am  trying to parse an apache2 log using
> > the 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'. I am able to load
> > the tables using the script below but its showing up each of the 3 rows
> as
> > null for every entry.
> > CREATE TABLE apachelog4 (
> >   ip STRING,
> >   time STRING,
> >   beacon STRING)
> > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
> > WITH SERDEPROPERTIES (
> >   "input.regex" = "([0-9]*?\.[0-9]*?\.[0-9]*?\.[0-9]*?) (\[.*\])
> (beacon.*
> > )",
> >   "output.regex" = "%1$s %2$s %3$s"
> > )
> > STORED AS TEXTFILE;
> >
> > As an example here is a few rows of the input data that I am trying to
> get
> > Hive to parse.
> >
> > {"body":"68.40.84.98 - - [29/Sep/2011:17:38:15 -0400] \"GET
> >
> /beacon?action=imp&pos=2&PlacementId=&AdType=1&hmGUID=e9a51cd2-14ec-48d5-8652-b7607af26962&advertiserPostcode=WA12+0HE&lat=53.466&lon=-2.6327&searchRadius=&DistanceModified=&sort=priceasc&vMake=&vModel=&vMaximumPrice=&vMaximumPriceModified=&vMinimumPrice=&vMinimumPriceModified=&vFuelType=&vMileage=&vAge=&vBodyType=&vTransmission=&vEngineCc=&vColour=&sellertype=&vNoOfDoors=&channel=cars&pgN=1&vMinAge=&vMaxAge=&vMinMileage=&vMaxMileage=&vMinEngineCc=&vMaxEngineCc=&Platform=&PlatformVersion=&cUserID=&cid=70&pid=7fc9e5b6-6dfc-4c45-8cd9-8709b7c8e0fb&advertisementId=201113383357957&advertiserId=0
> > HTTP/1.1\" 304
> >
> 148","timestamp":1317332295456,"pri":"INFO","nanos":10720819422720,"host":"domU-12-31-39-0F-1D-C3.compute-1.internal","fields":{"rolltag":"20110929-173823752-0400.7672766572025.00000058"}}
> >  {"body":"68.40.84.98 - - [29/Sep/2011:17:38:15 -0400] \"GET /favicon.ico
> > HTTP/1.1\" 404
> >
> 467","timestamp":1317332295693,"pri":"INFO","nanos":10721056284876,"host":"domU-12-31-39-0F-1D-C3.compute-1.internal","fields":{"rolltag":"20110929-173823752-0400.7672766572025.00000058"}}
> >  {"body":"68.40.84.98 - - [29/Sep/2011:17:38:16 -0400] \"GET
> >
> /beacon?action=imp&pos=2&PlacementId=&AdType=1&hmGUID=e9a51cd2-14ec-48d5-8652-b7607af26962&advertiserPostcode=WA12+0HE&lat=53.466&lon=-2.6327&searchRadius=&DistanceModified=&sort=priceasc&vMake=&vModel=&vMaximumPrice=&vMaximumPriceModified=&vMinimumPrice=&vMinimumPriceModified=&vFuelType=&vMileage=&vAge=&vBodyType=&vTransmission=&vEngineCc=&vColour=&sellertype=&vNoOfDoors=&channel=cars&pgN=1&vMinAge=&vMaxAge=&vMinMileage=&vMaxMileage=&vMinEngineCc=&vMaxEngineCc=&Platform=&PlatformVersion=&cUserID=&cid=70&pid=7fc9e5b6-6dfc-4c45-8cd9-8709b7c8e0fb&advertisementId=201113383357957&advertiserId=0
> > HTTP/1.1\" 304
> >
> 148","timestamp":1317332296116,"pri":"INFO","nanos":10721479503001,"host":"domU-12-31-39-0F-1D-C3.compute-1.internal","fields":{"rolltag":"20110929-173823752-0400.7672766572025.00000058"}}
> >  {"body":"68.40.84.98 - - [29/Sep/2011:17:38:16 -0400] \"GET /favicon.ico
> > HTTP/1.1\" 404
> >
> 467","timestamp":1317332296353,"pri":"INFO","nanos":10721716299339,"host":"domU-12-31-39-0F-1D-C3.compute-1.internal","fields":{"rolltag":"20110929-173823752-0400.7672766572025.00000058"}}
> >  {"body":"68.40.84.98 - - [29/Sep/2011:17:38:16 -0400] \"GET
> >
> /beacon?action=imp&pos=2&PlacementId=&AdType=1&hmGUID=e9a51cd2-14ec-48d5-8652-b7607af26962&advertiserPostcode=WA12+0HE&lat=53.466&lon=-2.6327&searchRadius=&DistanceModified=&sort=priceasc&vMake=&vModel=&vMaximumPrice=&vMaximumPriceModified=&vMinimumPrice=&vMinimumPriceModified=&vFuelType=&vMileage=&vAge=&vBodyType=&vTransmission=&vEngineCc=&vColour=&sellertype=&vNoOfDoors=&channel=cars&pgN=1&vMinAge=&vMaxAge=&vMinMileage=&vMaxMileage=&vMinEngineCc=&vMaxEngineCc=&Platform=&PlatformVersion=&cUserID=&cid=70&pid=7fc9e5b6-6dfc-4c45-8cd9-8709b7c8e0fb&advertisementId=201113383357957&advertiserId=0
> > HTTP/1.1\" 304
> >
> 148","timestamp":1317332296533,"pri":"INFO","nanos":10721896540315,"host":"domU-12-31-39-0F-1D-C3.compute-1.internal","fields":{"rolltag":"20110929-173823752-0400.7672766572025.00000058"}}
> >  {"body":"68.40.84.98 - - [29/Sep/2011:17:38:16 -0400] \"GET /favicon.ico
> > HTTP/1.1\" 404
> >
> 467","timestamp":1317332296833,"pri":"INFO","nanos":10722196180417,"host":"domU-12-31-39-0F-1D-C3.compute-1.internal","fields":{"rolltag":"20110929-173823752-0400.7672766572025.00000058"}}
> >  {"body":"68.40.84.98 - - [29/Sep/2011:17:38:17 -0400] \"GET
> >
> /beacon?action=imp&pos=2&PlacementId=&AdType=1&hmGUID=e9a51cd2-14ec-48d5-8652-b7607af26962&advertiserPostcode=WA12+0HE&lat=53.466&lon=-2.6327&searchRadius=&DistanceModified=&sort=priceasc&vMake=&vModel=&vMaximumPrice=&vMaximumPriceModified=&vMinimumPrice=&vMinimumPriceModified=&vFuelType=&vMileage=&vAge=&vBodyType=&vTransmission=&vEngineCc=&vColour=&sellertype=&vNoOfDoors=&channel=cars&pgN=1&vMinAge=&vMaxAge=&vMinMileage=&vMaxMileage=&vMinEngineCc=&vMaxEngineCc=&Platform=&PlatformVersion=&cUserID=&cid=70&pid=7fc9e5b6-6dfc-4c45-8cd9-8709b7c8e0fb&advertisementId=201113383357957&advertiserId=0
> > HTTP/1.1\" 304
> >
> > Thanks in advance for all you help.
>

Reply via email to