Re: Guarantees of the memory channel for delivering to sink
Ping on the below questions about the new Spool Directory source:

If we choose to use the memory channel with this source, to an Avro sink on a remote box, do we risk data loss in the event of a network partition/slow network, or if the flume-agent on the source box dies?

If we choose to use the file channel with this source, we will end up with double writes to disk, correct? (One write for the legacy log files which will be ingested by the Spool Directory source, and the other for the WAL.)

From: Rahul Ravindran rahu...@yahoo.com
To: user@flume.apache.org
Sent: Tuesday, November 6, 2012 3:40 PM
Subject: Re: Guarantees of the memory channel for delivering to sink

This is awesome. This may be perfect for our use case :) When is the 1.3 release expected?

A couple of questions on the choice of channel for the new source:

If we choose to use the memory channel with this source, to an Avro sink on a remote box, do we risk data loss in the event of a network partition/slow network, or if the flume-agent on the source box dies?

If we choose to use the file channel with this source, we will end up with double writes to disk, correct? (One write for the legacy log files which will be ingested by the Spool Directory source, and the other for the WAL.)

Thanks,
~Rahul.

From: Brock Noland br...@cloudera.com
To: user@flume.apache.org; Rahul Ravindran rahu...@yahoo.com
Sent: Tuesday, November 6, 2012 3:05 PM
Subject: Re: Guarantees of the memory channel for delivering to sink

This use case sounds like a perfect use of the Spool Directory source, which will be in the upcoming 1.3 release.

Brock

On Tue, Nov 6, 2012 at 4:53 PM, Rahul Ravindran rahu...@yahoo.com wrote:

We will update the checkpoint each time (we may tune this to be periodic), but the contents of the memory channel will be in the legacy logs which are currently being generated. Additionally, the sink for the memory channel will be an Avro source on another machine. Does that clear things up?

From: Brock Noland br...@cloudera.com
To: user@flume.apache.org; Rahul Ravindran rahu...@yahoo.com
Sent: Tuesday, November 6, 2012 1:44 PM
Subject: Re: Guarantees of the memory channel for delivering to sink

But in your architecture you are going to write the contents of the memory channel out? Or did I miss something?

> The checkpoint will be updated each time we perform a successive insertion into the memory channel.

On Tue, Nov 6, 2012 at 3:43 PM, Rahul Ravindran rahu...@yahoo.com wrote:

We have a legacy system which writes events to a file (an existing log file). This will continue. If I used a file channel, I would double the number of IO operations (writes to the legacy log file, plus writes to the WAL).

From: Brock Noland br...@cloudera.com
To: user@flume.apache.org; Rahul Ravindran rahu...@yahoo.com
Sent: Tuesday, November 6, 2012 1:38 PM
Subject: Re: Guarantees of the memory channel for delivering to sink

You're still going to be writing out all events, no? So how would the file channel do more IO than that?

On Tue, Nov 6, 2012 at 3:32 PM, Rahul Ravindran rahu...@yahoo.com wrote:

Hi,

I am very new to Flume, and we are hoping to use it for our log aggregation into HDFS. I have a few questions below:

FileChannel will double our disk IO, which will affect IO performance on certain performance-sensitive machines. Hence, I was hoping to write a custom Flume source which will use a memory channel and which will perform checkpointing. The checkpoint will be updated each time we perform a successive insertion into the memory channel. (I realize that this results in a risk of data loss, the maximum size of which is the capacity of the memory channel.)

As long as there is capacity in the memory channel buffers, does the memory channel guarantee delivery to a sink (does it wait for acknowledgements, and retry failed packets)? This would mean that we need to ensure that we do not exceed the channel capacity. I am writing a custom source which will use the memory channel and which will catch a ChannelException to identify any channel-capacity issues (i.e., the buffer used in the memory channel is full because of lagging sinks/network issues, etc.). Is that a reasonable assumption to make?

Thanks,
~Rahul.

--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
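For reference, the memory-channel topology under discussion could be sketched as a Flume agent configuration roughly like the one below. This is only an illustrative sketch: the agent name, directories, hostname, and port are invented, and the values would need tuning for a real deployment.

```properties
# Hypothetical agent: Spool Directory source -> memory channel -> Avro sink.
# Events sit only in RAM between source and sink, so anything still in the
# channel is lost if the flume-agent process dies.
agent1.sources  = spool-src
agent1.channels = mem-ch
agent1.sinks    = avro-sink

agent1.sources.spool-src.type     = spooldir
agent1.sources.spool-src.spoolDir = /var/log/legacy-done
agent1.sources.spool-src.channels = mem-ch

# capacity = max events buffered in RAM;
# transactionCapacity = max events per put/take transaction.
agent1.channels.mem-ch.type                = memory
agent1.channels.mem-ch.capacity            = 100000
agent1.channels.mem-ch.transactionCapacity = 1000

agent1.sinks.avro-sink.type     = avro
agent1.sinks.avro-sink.hostname = collector.example.com
agent1.sinks.avro-sink.port     = 4141
agent1.sinks.avro-sink.channel  = mem-ch
```

Swapping `mem-ch` for a file channel in this layout is what produces the second disk write raised in the questions above: the events already exist on disk in the legacy logs, and the file channel's WAL writes them again.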
Re: Guarantees of the memory channel for delivering to sink
Hi,

Yes, if you use the memory channel, you can lose data. To not lose data, the file channel needs to write to disk...

Brock
Re: Guarantees of the memory channel for delivering to sink
Hi,

Thanks for the response. Does the memory channel provide transactional guarantees? In the event of a network packet loss, does it retry sending the packet? If we ensure that we do not exceed the capacity of the memory channel, does it continue retrying to send an event to the remote source on failure?

Thanks,
~Rahul.
Re: Guarantees of the memory channel for delivering to sink
The memory channel doesn't know about networks. The sources and sinks, such as the Avro source/Avro sink, do. They operate over TCP/IP, and when there is an error sending data downstream they roll the transaction back so that no data is lost. I believe the docs cover this here: http://flume.apache.org/FlumeUserGuide.html

Brock
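One practical consequence of this rollback behavior, sketched below with invented names and rates: because a failed send is rolled back into the channel, the memory channel's capacity determines how long a downstream outage can be absorbed before puts start failing.

```properties
# While the downstream Avro endpoint is unreachable, the sink keeps rolling
# its transactions back and the channel fills at the ingest rate. Sizing the
# capacity therefore bounds the outage you can ride out; for example, at
# roughly 1000 events/sec, 600000 events is about 10 minutes of headroom.
agent1.channels.mem-ch.type                = memory
agent1.channels.mem-ch.capacity            = 600000
agent1.channels.mem-ch.transactionCapacity = 1000
```

Once the channel is full, a source's put fails with a ChannelException, which is exactly the signal the custom source proposed earlier in the thread would catch.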
Guarantees of the memory channel for delivering to sink
Hi,

I am very new to Flume, and we are hoping to use it for our log aggregation into HDFS. I have a few questions below:

FileChannel will double our disk IO, which will affect IO performance on certain performance-sensitive machines. Hence, I was hoping to write a custom Flume source which will use a memory channel and which will perform checkpointing. The checkpoint will be updated each time we perform a successive insertion into the memory channel. (I realize that this results in a risk of data loss, the maximum size of which is the capacity of the memory channel.)

As long as there is capacity in the memory channel buffers, does the memory channel guarantee delivery to a sink (does it wait for acknowledgements, and retry failed packets)? This would mean that we need to ensure that we do not exceed the channel capacity. I am writing a custom source which will use the memory channel and which will catch a ChannelException to identify any channel-capacity issues (i.e., the buffer used in the memory channel is full because of lagging sinks/network issues, etc.). Is that a reasonable assumption to make?

Thanks,
~Rahul.
Re: Guarantees of the memory channel for delivering to sink
You're still going to be writing out all events, no? So how would the file channel do more IO than that?
Re: Guarantees of the memory channel for delivering to sink
But in your architecture you are going to write the contents of the memory channel out? Or did I miss something?

> The checkpoint will be updated each time we perform a successive insertion into the memory channel.
Re: Guarantees of the memory channel for delivering to sink
We will update the checkpoint each time (we may tune this to be periodic), but the contents of the memory channel will be in the legacy logs which are currently being generated. Additionally, the sink for the memory channel will be an Avro source on another machine. Does that clear things up?
Re: Guarantees of the memory channel for delivering to sink
This use case sounds like a perfect use of the Spool Directory source, which will be in the upcoming 1.3 release.

Brock
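For comparison, the durable variant of the same topology, with the Spool Directory source feeding a file channel, might look like the sketch below. As before, the agent name, paths, hostname, and port are invented; `checkpointDir` and `dataDirs` are the standard file channel properties. The trade-off is the one debated in this thread: events survive an agent crash, but each event is written to disk a second time in the channel's WAL.

```properties
# Hypothetical agent: Spool Directory source -> file channel -> Avro sink.
agent1.sources  = spool-src
agent1.channels = file-ch
agent1.sinks    = avro-sink

agent1.sources.spool-src.type     = spooldir
agent1.sources.spool-src.spoolDir = /var/log/legacy-done
agent1.sources.spool-src.channels = file-ch

# The WAL lives under dataDirs; checkpointDir holds the channel's own
# checkpoint of which events have been taken and committed.
agent1.channels.file-ch.type          = file
agent1.channels.file-ch.checkpointDir = /var/lib/flume/checkpoint
agent1.channels.file-ch.dataDirs      = /var/lib/flume/data

agent1.sinks.avro-sink.type     = avro
agent1.sinks.avro-sink.hostname = collector.example.com
agent1.sinks.avro-sink.port     = 4141
agent1.sinks.avro-sink.channel  = file-ch
```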