Hello!

We are in the process of migrating from an older version of Flume to version 
1.6. We are using the ThriftSource with the new KafkaSink. Here's what our 
config looks like:

```
agent1.channels = ch1
agent1.sources = thriftSrc
agent1.sinks = kafka

agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000
agent1.channels.ch1.transactionCapacity = 500

# THRIFT
agent1.sources.thriftSrc.type = thrift
agent1.sources.thriftSrc.channels = ch1
agent1.sources.thriftSrc.bind = 0.0.0.0
agent1.sources.thriftSrc.port = 4042
# If we don't set this option, the source keeps creating more and more threads
# until all memory is used up, and then it crashes.
agent1.sources.thriftSrc.threads = 150

# KAFKA
agent1.sinks.kafka.channel = ch1
agent1.sinks.kafka.type = org.apache.flume.sink.kafka.KafkaSink
agent1.sinks.kafka.batchSize = 50
agent1.sinks.kafka.brokerList = broker.example.com:9092
agent1.sinks.kafka.requiredAcks = 1
agent1.sinks.kafka.topic = topic1
```

Monitoring the agent over JMX, we have noticed some bad behavior in the Thrift 
source/Thrift server. If we don't restrict the number of threads, it spawns 
thousands of new threads, apparently one for every message it receives. These 
threads are all named "Flume Thrift IPC Thread [number]", and according to the 
jvisualvm console they are always idle. At some point, creating new threads 
exhausts the memory available to the JVM and Flume crashes with the following 
exception:
```
12 Aug 2015 16:56:11,721 ERROR [Thread-1] (org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.run:544)  - run() exiting due to uncaught error
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:714)
        at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1360)
        at org.apache.thrift.server.TThreadedSelectorServer.requestInvoke(TThreadedSelectorServer.java:310)
        at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleRead(AbstractNonblockingServer.java:209)
        at org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.select(TThreadedSelectorServer.java:576)
        at org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.run(TThreadedSelectorServer.java:536)
```
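
To illustrate what we suspect is going on: the trace passes through 
ThreadPoolExecutor.addWorker, so our guess (an assumption, we have not 
confirmed it in the Flume source) is that without the threads option the 
source hands requests to an unbounded pool. The sketch below is plain JDK 
code, not Flume code; the class and method names are made up for the example. 
It only shows how an unbounded cached pool grows with the number of 
outstanding requests, while a fixed pool stays at its limit. It does not 
explain why our threads show up as idle.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

// Hypothetical demo class, unrelated to the actual ThriftSource implementation.
class PoolGrowthDemo {

    // Submits `tasks` jobs that all block until released, then reports the
    // largest number of threads the pool ever held at once.
    static int peakThreads(ExecutorService pool, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // stands in for a request that has not completed yet
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        // Both factory methods used below return a ThreadPoolExecutor in the JDK.
        int peak = ((ThreadPoolExecutor) pool).getLargestPoolSize();
        release.countDown(); // let every task finish
        pool.shutdown();
        return peak;
    }

    // Unbounded pool: a new thread whenever no idle thread is available.
    static int cachedPeak(int tasks) {
        return peakThreads(Executors.newCachedThreadPool(), tasks);
    }

    // Bounded pool: at most `threads` threads, extra tasks wait in a queue.
    static int fixedPeak(int threads, int tasks) {
        return peakThreads(Executors.newFixedThreadPool(threads), tasks);
    }

    public static void main(String[] args) {
        System.out.println("cached pool peak threads: " + cachedPeak(50));
        System.out.println("fixed pool peak threads:  " + fixedPeak(4, 50));
    }
}
```

With 50 outstanding tasks the cached pool reaches 50 threads and the fixed 
pool of 4 stays at 4, which would match what we see with and without the 
threads option if our guess about the pool type is right.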

When we set the option to restrict the number of threads, the server sticks to 
that number and runs smoothly. However, it occasionally drops messages (this 
may have a different cause).

I am wondering whether this is a bug or in some way expected behavior. What are 
the best practices for using a ThriftSource? Are there further parameters we 
could tune (like channel.capacity)?

Thanks a lot


--
Tobias Heintz
