Hi Ravi Kiran,

       Thank you very much for your reply; it worked well. I can now see the Apache
data in Phoenix.
Do you have any idea when Apache Phoenix will add support for the UNION
statement? We are eagerly waiting for it. Apache Phoenix is a really good tool
and very useful.

Thanks a lot!!

Divya N

On Mon, Dec 22, 2014 at 11:33 AM, Ravi Kiran <maghamraviki...@gmail.com>
wrote:

> Hi Divya,
>
>   Based on the logs you have shared, can you please change the following
> entries:
>
> agent.sinks.phoenix-sink.serializer.regex=^([\\d.]+) (\\S+) (\\S+)
> \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\d+) \"([^\"]+)\"
> \"([^\"]+)\"
>
> agent.sinks.phoenix-sink.serializer.columns=host,identity,user,time,request,status,size,referer,agent
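A minimal sketch to sanity-check the proposed pattern against one of the sample access_log lines quoted further down the thread (using Python's `re`, whose syntax is close enough to Java's `java.util.regex` for these constructs; the doubled backslashes in the properties file collapse to single ones once the file is parsed):

```python
import re

# Ravi's proposed serializer regex, after properties-file unescaping.
APACHE_RE = re.compile(
    r'^([\d.]+) (\S+) (\S+) \[([\w:/]+\s[+\-]\d{4})\] '
    r'"(.+?)" (\d{3}) (\d+) "([^"]+)" "([^"]+)"'
)

# One of the access_log lines shared later in the thread.
line = ('127.0.0.1 - - [20/Dec/2014:17:11:06 +0530] '
        '"GET / HTTP/1.0" 403 4954 "-" '
        '"check_http/v2.0.3 (nagios-plugins 2.0.3)"')

m = APACHE_RE.match(line)
# Nine capture groups, one per entry in serializer.columns.
columns = 'host,identity,user,time,request,status,size,referer,agent'.split(',')
print(dict(zip(columns, m.groups())))
```

If the pattern matches, the nine capture groups line up one-to-one with the nine names in `serializer.columns`, which is what the REGEX serializer needs.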
>
> Regarding the logging level, try changing the corresponding entry in
> log4j.properties.
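For example (a sketch; the stock Flume conf/log4j.properties keys off the `flume.root.logger` property, and exact logger names may differ by version):

```properties
# conf/log4j.properties — raise the root level, or just the Phoenix sink's
flume.root.logger=DEBUG,console
log4j.logger.org.apache.phoenix.flume=DEBUG
```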
>
> Regards
> Ravi
>
> On Sat, Dec 20, 2014 at 4:34 AM, Divya Nagarajan <divya.se2...@gmail.com>
> wrote:
>
>> Hi,
>>
>> This is my Flume Configuration File
>>
>> agent.sources = tail
>> agent.channels = memoryChannel
>> agent.sinks = loggerSink
>> agent.sinks = phoenix-sink
>>
>> agent.sources.tail.type = exec
>> agent.sources.tail.command = tail -f /var/log/httpd/access_log
>> agent.sources.tail.channels = memoryChannel
>>
>> agent.sinks.loggerSink.channel = memoryChannel
>> agent.sinks.loggerSink.type = logger
>>
>> agent.channels.memoryChannel.type = memory
>> agent.channels.memoryChannel.capacity = 100
>>
>> agent.sinks.phoenix-sink.type=org.apache.phoenix.flume.sink.PhoenixSink
>> agent.sinks.phoenix-sink.channel=memoryChannel
>> agent.sinks.phoenix-sink.batchSize=5
>> agent.sinks.phoenix-sink.table=S1.APACHE
>>
>> agent.sinks.phoenix-sink.zookeeperQuorum=nn01
>> agent.sinks.phoenix-sink.serializer=REGEX
>> agent.sinks.phoenix-sink.serializer.rowkeyType=uuid
>> agent.sinks.phoenix-sink.ddl=CREATE TABLE IF NOT EXISTS S1.APACHE (uid
>> varchar NOT NULL,host varchar,identity varchar,user varchar,time
>> varchar,method varchar,request varchar,protocol varchar,status INTEGER,size
>> INTEGER,referer varchar,agent varchar,f_host varchar CONSTRAINT pk PRIMARY
>> KEY (uid))
>>
>> #agent.sinks.phoenix-sink.serializer.regex="([^ ]*) ([^ ]*) ([^ ]*)
>> (-|\\[[^\\]]*\\]) \"([^ ]+) ([^ ]+) ([^\"]+)\" (-|[0-9]*) (-|[0-9]*)(?: ([^
>> \"]*|\"[^\"]*\") ([^ \"]$
>> #agent.sinks.phoenix-sink.serializer.regex="([^ ]*) ([^ ]*) ([^ ]*)
>> (-|\\[[^\\]]*\\]) \"([^ ]+) ([^ ]+) ([^\"]+)\" (-|[0-9]*) (-|[0-9]*)(?: ([^
>> \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\"))?"
>>
>>
>> agent.sinks.phoenix-sink.serializer.regex=([^ ]*) ([^ ]*) ([^ ]*)  ([^ ]*
>> [^ ]*) "([^\"]+)\" (-|[0-9]*) (-|[0-9]*) "([^ ]*)" "([^\"]+)\"
>>
>> agent.sinks.phoenix-sink.serializer.columns=host,identity,user,time,method,request,protocol,status,size,referer,agent
>> agent.sinks.phoenix-sink.serializer.headers=f_host
>>
>>
>> This Is my Apache log File Structure
>>
>> 127.0.0.1 - - [20/Dec/2014:17:11:06 +0530] "GET / HTTP/1.0" 403 4954 "-"
>> "check_http/v2.0.3 (nagios-plugins 2.0.3)"
>> 127.0.0.1 - - [20/Dec/2014:17:16:06 +0530] "GET / HTTP/1.0" 403 4954 "-"
>> "check_http/v2.0.3 (nagios-plugins 2.0.3)"
>> 127.0.0.1 - - [20/Dec/2014:17:21:06 +0530] "GET / HTTP/1.0" 403 4954 "-"
>> "check_http/v2.0.3 (nagios-plugins 2.0.3)"
>> 127.0.0.1 - - [20/Dec/2014:17:26:06 +0530] "GET / HTTP/1.0" 403 4954 "-"
>> "check_http/v2.0.3 (nagios-plugins 2.0.3)"
>> 127.0.0.1 - - [20/Dec/2014:17:31:06 +0530] "GET / HTTP/1.0" 403 4954 "-"
>> "check_http/v2.0.3 (nagios-plugins 2.0.3)"
>> 127.0.0.1 - - [20/Dec/2014:17:36:06 +0530] "GET / HTTP/1.0" 403 4954 "-"
>> "check_http/v2.0.3 (nagios-plugins 2.0.3)"
>> 127.0.0.1 - - [20/Dec/2014:17:41:06 +0530] "GET / HTTP/1.0" 403 4954 "-"
>> "check_http/v2.0.3 (nagios-plugins 2.0.3)"
>> 127.0.0.1 - - [20/Dec/2014:17:46:06 +0530] "GET / HTTP/1.0" 403 4954 "-"
>> "check_http/v2.0.3 (nagios-plugins 2.0.3)"
>> 127.0.0.1 - - [20/Dec/2014:17:51:06 +0530] "GET / HTTP/1.0" 403 4954 "-"
>> "check_http/v2.0.3 (nagios-plugins 2.0.3)"
>> 127.0.0.1 - - [20/Dec/2014:17:56:06 +0530] "GET / HTTP/1.0" 403 4954 "-"
>> "check_http/v2.0.3 (nagios-plugins 2.0.3)"
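The active `serializer.regex` in the config above can be checked offline the same way (again a Python sketch, assuming the properties value reaches the serializer with `\"` unescaped to a plain quote). Note the two consecutive spaces after the third group in the configured pattern, while the log lines above only ever have single spaces between fields:

```python
import re

# The active serializer.regex from the config above, after properties unescaping.
CONFIGURED_RE = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*)  ([^ ]* [^ ]*) '
    r'"([^"]+)" (-|[0-9]*) (-|[0-9]*) "([^ ]*)" "([^"]+)"'
)

line = ('127.0.0.1 - - [20/Dec/2014:17:11:06 +0530] '
        '"GET / HTTP/1.0" 403 4954 "-" '
        '"check_http/v2.0.3 (nagios-plugins 2.0.3)"')

print(CONFIGURED_RE.match(line))  # None: the double space never occurs in the log
```

Even with the spacing fixed, this pattern has nine capture groups while the `serializer.columns` above lists eleven names (method, request and protocol are separate entries), which is presumably why the column list also changes in the later reply.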
>>
>>
>> I am using
>> Phoenix 4.2.1
>> HBase 0.98.8
>>
>> and sorry, I did enable DEBUG mode in Flume, but it still shows only INFO as
>> usual when executing this:
>> flume-ng agent -c conf -f /opt/flume/conf/apache.conf -n agent
>> -Dflume.root.looger=DEBUG,console
>>
>> Thanks
>> Divya N
>>
>>
>>
>> On Sat, Dec 20, 2014 at 2:14 AM, Ravi Kiran <maghamraviki...@gmail.com>
>> wrote:
>>
>>> Hi Divya,
>>>
>>>    Also, can you confirm that the regex given in the configuration matches
>>> the access log? To confirm, is it possible to set the logging level to
>>> DEBUG? There is a debug log entry written when an event doesn't match the
>>> regex given in the configuration.
>>>   We have a test case for processing Apache logs,
>>> https://github.com/apache/phoenix/blob/master/phoenix-flume/src/it/java/org/apache/phoenix/flume/RegexEventSerializerIT.java#testApacheLogRegex
>>> which can help you with the regex.
>>>   Happy to help!!
>>>
>>> Regards
>>> Ravi
>>>
>>> On Fri, Dec 19, 2014 at 11:19 AM, Ravi Kiran <maghamraviki...@gmail.com>
>>> wrote:
>>>>
>>>> Hi Nagarajan,
>>>>
>>>>     Do you see any exceptions in the logs? Can you please try ingesting
>>>> more than 100 records and see if that works? Also, can you please share
>>>> the version of Phoenix you are using.
>>>>
>>>> Regards
>>>> Ravi
>>>>
>>>> On Thu, Dec 18, 2014 at 10:36 PM, Divya Nagarajan <
>>>> divya.se2...@gmail.com> wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>    I tried with 5 as the batch size; the data is still not upserted into
>>>>> Phoenix.
>>>>>
>>>>>
>>>>> 14/12/19 12:03:51 INFO zookeeper.ClientCnxn: EventThread shut down
>>>>> 14/12/19 12:03:51 INFO zookeeper.ZooKeeper: Session: 0x14a60b4631e003d
>>>>> closed
>>>>> 14/12/19 12:03:55 INFO serializer.BaseEventSerializer:  the upsert
>>>>> statement is UPSERT INTO APACHE_LOG4 ("HOST", "IDENTITY", "USER", "TIME",
>>>>> "METHOD", "REQUEST", "PROTOCOL", "STATUS", "SIZE", "REFERER", "AGENT",
>>>>> "F_HOST", "UID") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
>>>>> 14/12/19 12:03:55 INFO sink.PhoenixSink: Time taken to process [3]
>>>>> events was [0] seconds
>>>>> 14/12/19 12:03:55 INFO sink.PhoenixSink: Time taken to process [3]
>>>>> events was [0] seconds
>>>>> 14/12/19 12:03:55 INFO sink.PhoenixSink: Time taken to process [3]
>>>>> events was [0] seconds
>>>>> 14/12/19 12:03:58 INFO sink.PhoenixSink: Time taken to process [1]
>>>>> events was [3] seconds
>>>>> 14/12/19 12:04:02 INFO sink.PhoenixSink: Time taken to process [0]
>>>>> events was [3] seconds
>>>>> 14/12/19 12
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Divya N
>>>>>
>>>>>
>>>>> On Fri, Dec 19, 2014 at 6:10 AM, Ravi Kiran <maghamraviki...@gmail.com
>>>>> > wrote:
>>>>>>
>>>>>>
>>>>>> ​Hi Nagarajan,
>>>>>>
>>>>>>     Apparently, we do batches of 100 by default for each commit. You
>>>>>> can decrease that number if you would like to:
>>>>>> http://phoenix.apache.org/flume.html
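For example, lowering the sink's batch size in the agent configuration (the same `batchSize` property that appears in the configs elsewhere in this thread):

```properties
# flush to Phoenix after every 10 events instead of the default 100
agent.sinks.phoenix-sink.batchSize=10
```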
>>>>>>
>>>>>> Regards
>>>>>> Ravi​
>>>>>>
>>>>>> Hi,
>>>>>>>
>>>>>>>     We are working with Flume to store the Apache access log into
>>>>>>> HBase/Phoenix using the phoenix-flume jars. Using the Flume conf file we
>>>>>>> could create the table in Phoenix and see the events get processed, but
>>>>>>> when querying through Phoenix there are no records stored. Please
>>>>>>> suggest the steps to process the Apache access log as the source and
>>>>>>> store the data in Phoenix as the sink.
>>>>>>>
>>>>>>> Hbase : 0.98.8
>>>>>>> Phoenix:4.2.1
>>>>>>>
>>>>>>> Flume Configuration:
>>>>>>> ===============
>>>>>>>
>>>>>>> agent.channels.memory-channel.type = memory
>>>>>>> agent.sources.tail-source.type = exec
>>>>>>> agent.sources.tail-source.command = tail -F /var/log/httpd/access_log
>>>>>>> agent.sources.tail-source.channels = memory-channel
>>>>>>>
>>>>>>> agent.channels.memory-channel.type=memory
>>>>>>> agent.channels.memory-channel.transactionCapacity=100
>>>>>>> agent.channels.memory-channel.byteCapacityBufferPercentage=20
>>>>>>>
>>>>>>>
>>>>>>> agent.channels = memory-channel
>>>>>>> agent.sources = tail-source
>>>>>>> agent.sinks = phoenix-sink
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> agent.sinks.phoenix-sink.type=org.apache.phoenix.flume.sink.PhoenixSink
>>>>>>> agent.sinks.phoenix-sink.channel=memory-channel
>>>>>>> agent.sinks.phoenix-sink.batchSize=100
>>>>>>> agent.sinks.phoenix-sink.table=APACHE_LOGS
>>>>>>> agent.sinks.phoenix-sink.ddl=CREATE TABLE IF NOT EXISTS
>>>>>>> APACHE_LOGS(uid VARCHAR NOT NULL,host VARCHAR,identity VARCHAR,user
>>>>>>> VARCHAR,time VARCHAR,method VARCHAR,request VARCHAR,protocol
>>>>>>> VARCHAR,status INTEGER,size INTEGER,referer VARCHAR,agent
>>>>>>> VARCHAR,f_host VARCHAR CONSTRAINT pk PRIMARY KEY(uid))
>>>>>>> agent.sinks.phoenix-sink.zookeeperQuorum=dn02
>>>>>>> agent.sinks.phoenix-sink.serializer=REGEX
>>>>>>> agent.sinks.phoenix-sink.serializer.regex="^([\\d.]+) (\\S+) (\\S+)
>>>>>>> \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\d+)
>>>>>>> \"([^\"]+)\"
>>>>>>> \"([^\"]+)\""
>>>>>>> agent.sinks.phoenix-sink.serializer.rowkeyType=uuid
>>>>>>>
>>>>>>> agent.sinks.phoenix-sink.serializer.columns=host,identity,user,time,method,request,protocol,status,size,referer,agent
>>>>>>> agent.sinks.phoenix-sink.serializer.headers=f_host
>>>>>>>
>>>>>>> flume-ng agent -c conf -f /opt/flume/conf/phoenix.conf  -n agent
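One thing worth checking offline about the config above (a Python sketch, assuming `java.util.regex` behaves the same for these constructs): the `serializer.regex` value is wrapped in double quotes, and a Java properties file does not strip them, so the leading quote becomes part of the pattern and the `^` anchor that follows it can then never match:

```python
import re

line = ('127.0.0.1 - - [20/Dec/2014:17:11:06 +0530] '
        '"GET / HTTP/1.0" 403 4954 "-" '
        '"check_http/v2.0.3 (nagios-plugins 2.0.3)"')

core = (r'^([\d.]+) (\S+) (\S+) \[([\w:/]+\s[+\-]\d{4})\] '
        r'"(.+?)" (\d{3}) (\d+) "([^"]+)" "([^"]+)"')

quoted = '"' + core + '"'   # what the quoted properties value above amounts to

print(re.match(quoted, line))       # None: the literal quote precedes the ^ anchor
print(bool(re.match(core, line)))   # True once the surrounding quotes are dropped
```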
>>>>>>>
>>>>>>>
>>>>>>> 14/12/18 12:53:09 INFO
>>>>>>> client.HConnectionManager$HConnectionImplementation: Closing master
>>>>>>> protocol: MasterService
>>>>>>> 14/12/18 12:53:09 INFO
>>>>>>> client.HConnectionManager$HConnectionImplementation: Closing
>>>>>>> zookeeper
>>>>>>> sessionid=0x14a5b888d18003e
>>>>>>> 14/12/18 12:53:09 INFO zookeeper.ZooKeeper: Session:
>>>>>>> 0x14a5b888d18003e closed
>>>>>>> 14/12/18 12:53:09 INFO zookeeper.ClientCnxn: EventThread shut down
>>>>>>> 14/12/18 12:53:09 INFO query.ConnectionQueryServicesImpl: Found
>>>>>>> quorum: nn01:2181
>>>>>>> 14/12/18 12:53:09 INFO zookeeper.RecoverableZooKeeper: Process
>>>>>>> identifier=hconnection-0x843735 connecting to ZooKeeper
>>>>>>> ensemble=nn01:2181
>>>>>>> 14/12/18 12:53:09 INFO zookeeper.ZooKeeper: Initiating client
>>>>>>> connection, connectString=nn01:2181 sessionTimeout=180000
>>>>>>> watcher=hconnection-0x843735, quorum=nn01:2181, baseZNode=/hbase
>>>>>>> 14/12/18 12:53:09 INFO zookeeper.ClientCnxn: Opening socket
>>>>>>> connection
>>>>>>> to server nn01/10.10.10.25:2181. Will not attempt to authenticate
>>>>>>> using SASL (unknown error)
>>>>>>> 14/12/18 12:53:09 INFO zookeeper.ClientCnxn: Socket connection
>>>>>>> established to nn01/10.10.10.25:2181, initiating session
>>>>>>> 14/12/18 12:53:09 INFO zookeeper.ClientCnxn: Session establishment
>>>>>>> complete on server nn01/10.10.10.25:2181, sessionid =
>>>>>>> 0x14a5b888d18003f, negotiated timeout = 40000
>>>>>>> 14/12/18 12:53:09 INFO
>>>>>>> client.HConnectionManager$HConnectionImplementation: Closing master
>>>>>>> protocol: MasterService
>>>>>>> 14/12/18 12:53:09 INFO
>>>>>>> client.HConnectionManager$HConnectionImplementation: Closing
>>>>>>> zookeeper
>>>>>>> sessionid=0x14a5b888d18003f
>>>>>>> 14/12/18 12:53:09 INFO zookeeper.ZooKeeper: Session:
>>>>>>> 0x14a5b888d18003f closed
>>>>>>> 14/12/18 12:53:09 INFO zookeeper.ClientCnxn: EventThread shut down
>>>>>>> 14/12/18 12:53:15 INFO serializer.BaseEventSerializer:  the upsert
>>>>>>> statement is UPSERT INTO APACHE_LOG4 ("HOST", "IDENTITY", "USER",
>>>>>>> "TIME", "METHOD", "REQUEST", "PROTOCOL", "STATUS", "SIZE", "REFERER",
>>>>>>> "AGENT", "F_HOST", "UID") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,
>>>>>>> ?)
>>>>>>> 14/12/18 12:53:18 INFO sink.PhoenixSink: Time taken to process [10]
>>>>>>> events was [3] seconds
>>>>>>> 14/12/18 12:53:22 INFO sink.PhoenixSink: Time taken to process [0]
>>>>>>> events was [3] seconds
>>>>>>> 14/12/18 12:53:27 INFO sink.PhoenixSink: Time taken to process [0]
>>>>>>> events was [3] seconds
>>>>>>> 14/12/18 12:53:33 INFO sink.PhoenixSink: Time taken to process [0]
>>>>>>> events was [3] seconds
>>>>>>> 14/12/18 12:53:40 INFO sink.PhoenixSink: Time taken to process [0]
>>>>>>> events was [3] seconds
>>>>>>> 14/12/18 12:53:48 INFO sink.PhoenixSink: Time taken to process [0]
>>>>>>> events was [3] seconds
>>>>>>> 14/12/18 12:53:56 INFO sink.PhoenixSink: Time taken to process [0]
>>>>>>> events was [3] seconds
>>>>>>> 14/12/18 12:54:04 INFO sink.PhoenixSink: Time taken to process [0]
>>>>>>> events was [3] seconds
>>>>>>> 14/12/18 12:54:12 INFO sink.PhoenixSink: Time taken to process [0]
>>>>>>> events was [3] seconds
>>>>>>> 14/12/18 12:54:20 INFO sink.PhoenixSink: Time taken to process [0]
>>>>>>> events was [3] seconds
>>>>>>> 14/12/18 12:54:28 INFO sink.PhoenixSink: Time taken to process [1]
>>>>>>> events was [3] seconds
>>>>>>> 14/12/18 12:54:36 INFO sink.PhoenixSink: Time taken to process [0]
>>>>>>> events was [3] seconds
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Divya N
>>>>>>>
>>>>>>
>>
>
