Hi,

I currently have only a single sink. I actually did not know that I
could have several sinks attached to the same channel. I will try this,
as well as increasing the channel size, and see if that gets rid of
these ChannelFullExceptions. A rough sketch of what I plan to try is
below.
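This is a minimal sketch only: a larger memory channel to absorb
bursts, plus two Kafka sinks draining the same channel for extra
parallelism. The Kafka sink property names (brokerList, topic) are
assumptions from the sink's documentation as I remember it and may
differ between Flume versions, so treat them as illustrative:

    a1.sources = r1
    a1.channels = c1
    a1.sinks = k1 k2

    # Multiport syslog TCP source feeding the shared channel
    a1.sources.r1.type = multiport_syslogtcp
    a1.sources.r1.host = 0.0.0.0
    a1.sources.r1.ports = 5140
    a1.sources.r1.channels = c1

    # Larger memory channel to ride out spikes from the syslog client
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 100000
    a1.channels.c1.transactionCapacity = 1000

    # Two sinks attached to the same channel, so it drains faster,
    # provided the Kafka cluster can keep up with the extra load
    a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
    a1.sinks.k1.brokerList = localhost:9092
    a1.sinks.k1.topic = syslog-events
    a1.sinks.k1.channel = c1

    a1.sinks.k2.type = org.apache.flume.sink.kafka.KafkaSink
    a1.sinks.k2.brokerList = localhost:9092
    a1.sinks.k2.topic = syslog-events
    a1.sinks.k2.channel = c1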
On Thu, Jan 29, 2015 at 8:45 PM, Hari Shreedharan <[email protected]> wrote:
> How many sinks do you have? Adding more sinks increases parallelism and
> will clear the channel faster, provided the downstream system can handle
> the load.
>
> Thanks,
> Hari
>
>
> On Thu, Jan 29, 2015 at 9:41 AM, Sverre Bakke <[email protected]> wrote:
>>
>> Hi,
>>
>> Thanks for your feedback. I can of course switch to the multiport one
>> if the plain one is not maintained.
>>
>> Back to the ChannelFullException issue. I can increase the channel
>> size, but the basic problem remains: as long as the syslog client is
>> faster than the Flume sink, this exception will occur and data will be
>> lost. I really believe that blocking, so that the syslog client must
>> wait before sending more data, is the way to go for a robust solution.
>>
>> Let's assume that the syslog client reads batches of events, e.g. from
>> a file, and sends these as fast as possible to the Flume multiport TCP
>> syslog source. In such cases the average events-per-second rate would
>> be moderate, but in practice there would be huge spikes where the
>> client delivers as fast as possible. Instead of asking the client to
>> "slow down", Flume would accept the events and then drop them. This
>> forces me as an admin to monitor the logs and try to guess which
>> events were dropped. I can have a reliable and persistent channel
>> configured, but events will still be dropped, undermining the entire
>> solution.
>>
>>
>>
>> On Thu, Jan 29, 2015 at 4:56 PM, Jeff Lord <[email protected]> wrote:
>> > Have you considered increasing the size of the memory channel? I
>> > haven't played with the Kafka sink much, but with HDFS we often add
>> > sinks, which can help to increase the flow of the channel.
>> > The multiport syslog source is the way to go here, as it will give
>> > better performance. We should probably go ahead and deprecate the
>> > vanilla syslog source.
>> >
>> >
>> > On Thursday, January 29, 2015, Sverre Bakke <[email protected]> wrote:
>> >>
>> >> Hi,
>> >>
>> >> I have a syslogtcp source using a default memory channel and a
>> >> Kafka sink. When producing data as fast as possible (3000 syslog
>> >> events in a second), the source seems to accept all the data, but
>> >> then crashes with a ChannelFullException when adding the events to
>> >> the channel.
>> >>
>> >> Is there any way to throttle, or otherwise wait to receive more
>> >> syslog events until the channel has room again, rather than
>> >> crashing because the channel is full? I would prefer that Flume
>> >> accept syslog events more slowly rather than crash and drop events.
>> >>
>> >> 29 Jan 2015 16:26:56,721 ERROR [New I/O worker #2]
>> >> (org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived:94)
>> >> - Error writting to channel, event dropped
>> >>
>> >> Also, the syslogtcp source seems to keep the syslog headers
>> >> regardless of the keepFields setting. Is there any common reason
>> >> why this might happen? In contrast, the multiport syslog TCP
>> >> source works as expected with this particular setting.
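PS: For the keepFields issue mentioned below, this is roughly the
comparison I was testing. A sketch only; the property names follow the
user guide as I read it, and keepFields handling may differ between
Flume versions:

    # Plain syslog TCP source: headers were kept despite this setting
    a1.sources.r1.type = syslogtcp
    a1.sources.r1.host = 0.0.0.0
    a1.sources.r1.port = 5140
    a1.sources.r1.keepFields = false
    a1.sources.r1.channels = c1

    # Multiport source: keepFields behaved as documented here
    a1.sources.r2.type = multiport_syslogtcp
    a1.sources.r2.host = 0.0.0.0
    a1.sources.r2.ports = 5141
    a1.sources.r2.keepFields = false
    a1.sources.r2.channels = c1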
