Hi Nagarajan,

    Apparently, the sink commits events in batches of 100 by default. You
can decrease that number through the batchSize property if you would like to.
http://phoenix.apache.org/flume.html
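
As a rough sketch, assuming the sink is named phoenix-sink as in your
configuration, lowering the batch size would look something like the
line below. The value 10 is only an illustration, not a recommendation;
pick whatever suits your event rate.

```properties
# Illustrative value only: use a smaller batch than the default of 100
agent.sinks.phoenix-sink.batchSize=10
```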

Regards
Ravi

Hi,
>
>     We are using Flume with the phoenix-flume jars to store Apache access
> logs in HBase/Phoenix. With the Flume conf file below, the table is
> created in Phoenix and we can see the events being processed, but
> querying through Phoenix returns no records. Please suggest the steps
> to process an Apache access log as the source and store the data in
> Phoenix as the sink.
>
> HBase: 0.98.8
> Phoenix: 4.2.1
>
> Flume Configuration:
> ===============
>
> agent.channels.memory-channel.type = memory
> agent.sources.tail-source.type = exec
> agent.sources.tail-source.command = tail -F /var/log/httpd/access_log
> agent.sources.tail-source.channels = memory-channel
>
> agent.channels.memory-channel.type=memory
> agent.channels.memory-channel.transactionCapacity=100
> agent.channels.memory-channel.byteCapacityBufferPercentage=20
>
>
> agent.channels = memory-channel
> agent.sources = tail-source
> agent.sinks = phoenix-sink
>
>
> agent.sinks.phoenix-sink.type=org.apache.phoenix.flume.sink.PhoenixSink
> agent.sinks.phoenix-sink.channel=memory-channel
> agent.sinks.phoenix-sink.batchSize=100
> agent.sinks.phoenix-sink.table=APACHE_LOGS
> agent.sinks.phoenix-sink.ddl=CREATE TABLE IF NOT EXISTS
> APACHE_LOGS(uid VARCHAR NOT NULL,host VARCHAR,identity VARCHAR,user
> VARCHAR,time VARCHAR,method VARCHAR,request VARCHAR,protocol
> VARCHAR,status INTEGER,size INTEGER,referer VARCHAR,agent
> VARCHAR,f_host VARCHAR CONSTRAINT pk PRIMARY KEY(uid))
> agent.sinks.phoenix-sink.zookeeperQuorum=dn02
> agent.sinks.phoenix-sink.serializer=REGEX
> agent.sinks.phoenix-sink.serializer.regex="^([\\d.]+) (\\S+) (\\S+)
> \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\d+) \"([^\"]+)\"
> \"([^\"]+)\""
> agent.sinks.phoenix-sink.serializer.rowkeyType=uuid
>
> agent.sinks.phoenix-sink.serializer.columns=host,identity,user,time,method,request,protocol,status,size,referer,agent
> agent.sinks.phoenix-sink.serializer.headers=f_host
>
> flume-ng agent -c conf -f /opt/flume/conf/phoenix.conf  -n agent
>
> *
> *
> *
> *
> *
> *
>
> 14/12/18 12:53:09 INFO
> client.HConnectionManager$HConnectionImplementation: Closing master
> protocol: MasterService
> 14/12/18 12:53:09 INFO
> client.HConnectionManager$HConnectionImplementation: Closing zookeeper
> sessionid=0x14a5b888d18003e
> 14/12/18 12:53:09 INFO zookeeper.ZooKeeper: Session: 0x14a5b888d18003e
> closed
> 14/12/18 12:53:09 INFO zookeeper.ClientCnxn: EventThread shut down
> 14/12/18 12:53:09 INFO query.ConnectionQueryServicesImpl: Found
> quorum: nn01:2181
> 14/12/18 12:53:09 INFO zookeeper.RecoverableZooKeeper: Process
> identifier=hconnection-0x843735 connecting to ZooKeeper
> ensemble=nn01:2181
> 14/12/18 12:53:09 INFO zookeeper.ZooKeeper: Initiating client
> connection, connectString=nn01:2181 sessionTimeout=180000
> watcher=hconnection-0x843735, quorum=nn01:2181, baseZNode=/hbase
> 14/12/18 12:53:09 INFO zookeeper.ClientCnxn: Opening socket connection
> to server nn01/10.10.10.25:2181. Will not attempt to authenticate
> using SASL (unknown error)
> 14/12/18 12:53:09 INFO zookeeper.ClientCnxn: Socket connection
> established to nn01/10.10.10.25:2181, initiating session
> 14/12/18 12:53:09 INFO zookeeper.ClientCnxn: Session establishment
> complete on server nn01/10.10.10.25:2181, sessionid =
> 0x14a5b888d18003f, negotiated timeout = 40000
> 14/12/18 12:53:09 INFO
> client.HConnectionManager$HConnectionImplementation: Closing master
> protocol: MasterService
> 14/12/18 12:53:09 INFO
> client.HConnectionManager$HConnectionImplementation: Closing zookeeper
> sessionid=0x14a5b888d18003f
> 14/12/18 12:53:09 INFO zookeeper.ZooKeeper: Session: 0x14a5b888d18003f
> closed
> 14/12/18 12:53:09 INFO zookeeper.ClientCnxn: EventThread shut down
> 14/12/18 12:53:15 INFO serializer.BaseEventSerializer:  the upsert
> statement is UPSERT INTO APACHE_LOG4 ("HOST", "IDENTITY", "USER",
> "TIME", "METHOD", "REQUEST", "PROTOCOL", "STATUS", "SIZE", "REFERER",
> "AGENT", "F_HOST", "UID") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,
> ?)
> 14/12/18 12:53:18 INFO sink.PhoenixSink: Time taken to process [10]
> events was [3] seconds
> 14/12/18 12:53:22 INFO sink.PhoenixSink: Time taken to process [0]
> events was [3] seconds
> 14/12/18 12:53:27 INFO sink.PhoenixSink: Time taken to process [0]
> events was [3] seconds
> 14/12/18 12:53:33 INFO sink.PhoenixSink: Time taken to process [0]
> events was [3] seconds
> 14/12/18 12:53:40 INFO sink.PhoenixSink: Time taken to process [0]
> events was [3] seconds
> 14/12/18 12:53:48 INFO sink.PhoenixSink: Time taken to process [0]
> events was [3] seconds
> 14/12/18 12:53:56 INFO sink.PhoenixSink: Time taken to process [0]
> events was [3] seconds
> 14/12/18 12:54:04 INFO sink.PhoenixSink: Time taken to process [0]
> events was [3] seconds
> 14/12/18 12:54:12 INFO sink.PhoenixSink: Time taken to process [0]
> events was [3] seconds
> 14/12/18 12:54:20 INFO sink.PhoenixSink: Time taken to process [0]
> events was [3] seconds
> 14/12/18 12:54:28 INFO sink.PhoenixSink: Time taken to process [1]
> events was [3] seconds
> 14/12/18 12:54:36 INFO sink.PhoenixSink: Time taken to process [0]
> events was [3] seconds
>
>
> Thanks
>
> Divya N
>
