So the regex has to match every piece of the line completely. I wrote the regex so that it just takes a few helpful things out of the log line.
Thanks for your help Jonathan On Sat, Oct 1, 2011 at 12:14 AM, Vijay <tec...@gmail.com> wrote: > The log lines are in some kind of JSON format though. The regex needs > to literally match the log lines. > > On Fri, Sep 30, 2011 at 8:52 PM, Jonathan <jonny2...@gmail.com> wrote: > > Hi, > > I am trying to parse an apache2 log using > > the 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'. I am able to load > > the tables using the script below but its showing up each of the 3 rows > as > > null for every entry. > > CREATE TABLE apachelog4 ( > > ip STRING, > > time STRING, > > beacon STRING) > > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' > > WITH SERDEPROPERTIES ( > > "input.regex" = "([0-9]*?\.[0-9]*?\.[0-9]*?\.[0-9]*?) (\[.*\]) > (beacon.* > > )", > > "output.regex" = "%1$s %2$s %3$s" > > ) > > STORED AS TEXTFILE; > > > > As an example here is a few rows of the input data that I am trying to > get > > Hive to parse. > > > > {"body":"68.40.84.98 - - [29/Sep/2011:17:38:15 -0400] \"GET > > > /beacon?action=imp&pos=2&PlacementId=&AdType=1&hmGUID=e9a51cd2-14ec-48d5-8652-b7607af26962&advertiserPostcode=WA12+0HE&lat=53.466&lon=-2.6327&searchRadius=&DistanceModified=&sort=priceasc&vMake=&vModel=&vMaximumPrice=&vMaximumPriceModified=&vMinimumPrice=&vMinimumPriceModified=&vFuelType=&vMileage=&vAge=&vBodyType=&vTransmission=&vEngineCc=&vColour=&sellertype=&vNoOfDoors=&channel=cars&pgN=1&vMinAge=&vMaxAge=&vMinMileage=&vMaxMileage=&vMinEngineCc=&vMaxEngineCc=&Platform=&PlatformVersion=&cUserID=&cid=70&pid=7fc9e5b6-6dfc-4c45-8cd9-8709b7c8e0fb&advertisementId=201113383357957&advertiserId=0 > > HTTP/1.1\" 304 > > > 148","timestamp":1317332295456,"pri":"INFO","nanos":10720819422720,"host":"domU-12-31-39-0F-1D-C3.compute-1.internal","fields":{"rolltag":"20110929-173823752-0400.7672766572025.00000058"}} > > {"body":"68.40.84.98 - - [29/Sep/2011:17:38:15 -0400] \"GET /favicon.ico > > HTTP/1.1\" 404 > > > 467","timestamp":1317332295693,"pri":"INFO","nanos":10721056284876,"host":"domU-12-31-39-0F-1D-C3.compute-1.internal","fields":{"rolltag":"20110929-173823752-0400.7672766572025.00000058"}} > > {"body":"68.40.84.98 - - [29/Sep/2011:17:38:16 -0400] \"GET > > > /beacon?action=imp&pos=2&PlacementId=&AdType=1&hmGUID=e9a51cd2-14ec-48d5-8652-b7607af26962&advertiserPostcode=WA12+0HE&lat=53.466&lon=-2.6327&searchRadius=&DistanceModified=&sort=priceasc&vMake=&vModel=&vMaximumPrice=&vMaximumPriceModified=&vMinimumPrice=&vMinimumPriceModified=&vFuelType=&vMileage=&vAge=&vBodyType=&vTransmission=&vEngineCc=&vColour=&sellertype=&vNoOfDoors=&channel=cars&pgN=1&vMinAge=&vMaxAge=&vMinMileage=&vMaxMileage=&vMinEngineCc=&vMaxEngineCc=&Platform=&PlatformVersion=&cUserID=&cid=70&pid=7fc9e5b6-6dfc-4c45-8cd9-8709b7c8e0fb&advertisementId=201113383357957&advertiserId=0 > > HTTP/1.1\" 304 > > > 148","timestamp":1317332296116,"pri":"INFO","nanos":10721479503001,"host":"domU-12-31-39-0F-1D-C3.compute-1.internal","fields":{"rolltag":"20110929-173823752-0400.7672766572025.00000058"}} > > {"body":"68.40.84.98 - - [29/Sep/2011:17:38:16 -0400] \"GET /favicon.ico > > HTTP/1.1\" 404 > > > 467","timestamp":1317332296353,"pri":"INFO","nanos":10721716299339,"host":"domU-12-31-39-0F-1D-C3.compute-1.internal","fields":{"rolltag":"20110929-173823752-0400.7672766572025.00000058"}} > > {"body":"68.40.84.98 - - [29/Sep/2011:17:38:16 -0400] \"GET > > > /beacon?action=imp&pos=2&PlacementId=&AdType=1&hmGUID=e9a51cd2-14ec-48d5-8652-b7607af26962&advertiserPostcode=WA12+0HE&lat=53.466&lon=-2.6327&searchRadius=&DistanceModified=&sort=priceasc&vMake=&vModel=&vMaximumPrice=&vMaximumPriceModified=&vMinimumPrice=&vMinimumPriceModified=&vFuelType=&vMileage=&vAge=&vBodyType=&vTransmission=&vEngineCc=&vColour=&sellertype=&vNoOfDoors=&channel=cars&pgN=1&vMinAge=&vMaxAge=&vMinMileage=&vMaxMileage=&vMinEngineCc=&vMaxEngineCc=&Platform=&PlatformVersion=&cUserID=&cid=70&pid=7fc9e5b6-6dfc-4c45-8cd9-8709b7c8e0fb&advertisementId=201113383357957&advertiserId=0 > > HTTP/1.1\" 304 > > > 148","timestamp":1317332296533,"pri":"INFO","nanos":10721896540315,"host":"domU-12-31-39-0F-1D-C3.compute-1.internal","fields":{"rolltag":"20110929-173823752-0400.7672766572025.00000058"}} > > {"body":"68.40.84.98 - - [29/Sep/2011:17:38:16 -0400] \"GET /favicon.ico > > HTTP/1.1\" 404 > > > 467","timestamp":1317332296833,"pri":"INFO","nanos":10722196180417,"host":"domU-12-31-39-0F-1D-C3.compute-1.internal","fields":{"rolltag":"20110929-173823752-0400.7672766572025.00000058"}} > > {"body":"68.40.84.98 - - [29/Sep/2011:17:38:17 -0400] \"GET > > > /beacon?action=imp&pos=2&PlacementId=&AdType=1&hmGUID=e9a51cd2-14ec-48d5-8652-b7607af26962&advertiserPostcode=WA12+0HE&lat=53.466&lon=-2.6327&searchRadius=&DistanceModified=&sort=priceasc&vMake=&vModel=&vMaximumPrice=&vMaximumPriceModified=&vMinimumPrice=&vMinimumPriceModified=&vFuelType=&vMileage=&vAge=&vBodyType=&vTransmission=&vEngineCc=&vColour=&sellertype=&vNoOfDoors=&channel=cars&pgN=1&vMinAge=&vMaxAge=&vMinMileage=&vMaxMileage=&vMinEngineCc=&vMaxEngineCc=&Platform=&PlatformVersion=&cUserID=&cid=70&pid=7fc9e5b6-6dfc-4c45-8cd9-8709b7c8e0fb&advertisementId=201113383357957&advertiserId=0 > > HTTP/1.1\" 304 > > > > Thanks in advance for all you help. >