Increasing the batch size to 10k didn't make any difference: 24.6k events/sec.
Subsequently increasing the number of sources from 1 to 8 improved it
a bit, to 33k events/sec.

Yes, I will try to take a deeper look using a profiler.

Here is another issue that comes up occasionally with the HDFS sink. Any
thoughts?

15 Dec 2013 11:13:26,689 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:82)  - Unexpected error while checking replication factor
java.lang.reflect.InvocationTargetException
        at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:147)
        at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:68)
        at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
        at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): Lease mismatch on /flume/bidder/_flume_tx_agent01_sink01_web06-east.manage.com_2013121422.1387093314310.snappy.tmp owned by DFSClient_NONMAPREDUCE_-1742177356_32 but is accessed by DFSClient_NONMAPREDUCE_643393114_25
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2770)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2567)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2480)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
