> I am saying that replication quotas will mitigate one of the potential
> downsides of setting an infinite retention policy.

I was just interested in all of the potential downsides! Could you please
point me to documentation that has more information on this?

On Tue, Mar 14, 2017 at 7:07 PM, Hans Jespersen <h...@confluent.io> wrote:

> I am saying that replication quotas will mitigate one of the potential
> downsides of setting an infinite retention policy.
>
> There is no clear yes/no best-practice rule for setting an extremely
> large retention policy. It is clearly a valid configuration and there are
> people who run this way.
>
> The issues have more to do with the amount of data you expect to be stored
> over the life of the system. If you have a Kafka cluster with petabytes of
> data in it and a consumer comes along and blindly consumes from the
> beginning, they will be getting a lot of data. So much so that this might
> be considered an anti-pattern, because their apps might not behave as they
> expect, and the network bandwidth used by lots of clients operating this
> way may be considered bad practice.
>
> Another way to avoid collecting too much data is to use compacted topics,
> which are a special kind of topic that keeps the latest value for each key
> forever, but removes the older messages with the same key in order to
> reduce the total amount of messages stored.
>
> How much data do you expect to store in your largest topic over the life
> of the cluster?
>
> -hans
>
> /**
>  * Hans Jespersen, Principal Systems Engineer, Confluent Inc.
>  * h...@confluent.io (650)924-2670
>  */
>
> On Tue, Mar 14, 2017 at 10:36 AM, Joe San <codeintheo...@gmail.com> wrote:
>
> > So that means with replication quotas, I can set the retention policy
> > to be infinite?
> >
> > On Tue, Mar 14, 2017 at 6:25 PM, Hans Jespersen <h...@confluent.io>
> > wrote:
> >
> > > You might want to use the new replication quotas mechanism (i.e.
> > > network throttling) to make sure that replication traffic doesn't
> > > negatively impact your production traffic.
> > >
> > > For details, see:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-73+Replication+Quotas
> > >
> > > This feature was added in 0.10.1.
> > >
> > > -hans
> > >
> > > /**
> > >  * Hans Jespersen, Principal Systems Engineer, Confluent Inc.
> > >  * h...@confluent.io (650)924-2670
> > >  */
> > >
> > > On Tue, Mar 14, 2017 at 10:09 AM, Joe San <codeintheo...@gmail.com>
> > > wrote:
> > >
> > > > Dear Kafka Users,
> > > >
> > > > What are the arguments against setting the retention policy on a
> > > > Kafka topic to infinite? I was in an interesting discussion with one
> > > > of my colleagues, where he was suggesting that we set the retention
> > > > policy for a topic to be indefinite.
> > > >
> > > > So how does this play out when adding new broker partitions? Say I
> > > > have accumulated some gigabytes of data in my topic and now I realize
> > > > that I have to scale up by adding another partition. Is this going to
> > > > pose a problem for me? The partition rebalance has to happen, and I'm
> > > > not sure what the implications are of rebalancing a partition that
> > > > has gigabytes of data.
> > > >
> > > > Any thoughts on this?
> > > >
> > > > Thanks and Regards,
> > > > Jothi
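
For anyone following the compacted-topics suggestion in this thread, here is a
rough sketch of what "keeps the latest value for each key" means in practice.
It is a toy in-memory model in plain Python, not Kafka code: the function name
`compact` and the sample keys are made up for illustration, and it ignores
details of the real log cleaner such as tombstones and when cleaning actually
runs. On a real topic, compaction is selected with the topic-level setting
`cleanup.policy=compact`.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str]  # (key, value)


def compact(log: Iterable[Record]) -> List[Record]:
    """Toy model of compaction: keep only the newest record for each key.

    Surviving records stay in their original relative order, mirroring the
    fact that compaction removes older records but does not reorder the log.
    """
    latest: Dict[str, Tuple[int, str]] = {}
    for offset, (key, value) in enumerate(log):
        # A later record with the same key replaces the earlier one.
        latest[key] = (offset, value)
    # Order the survivors by the offset at which they were written.
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_offset, value) in survivors]


# Hypothetical sample data: five writes across three keys.
log = [
    ("user-1", "a"),
    ("user-2", "b"),
    ("user-1", "c"),
    ("user-3", "d"),
    ("user-2", "e"),
]
compacted = compact(log)
```

With the sample above, five records shrink to three, one per key. The stored
size of a compacted topic therefore tracks the number of distinct keys rather
than the total number of writes, which is why it was suggested as an
alternative to keeping every message forever.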