Re: Query about LogConfigs
This also came up recently when Geoff and I were discussing KAFKA-2257, which needs to add a timeout option to a tool that KAFKA-2276 is introducing. For that timeout, the question is whether we should strive for consistency (use ms), the unit that is most likely convenient (use s), or finest granularity (use ms). In the case of retention, I like having at least minutes and hours as options. Admittedly it's not ideal to have two settings for the same config, but both units are reasonable and having both makes it convenient for the user. If you only support, for example, the minutes unit, having to mentally divide by 60 to do the conversion is annoying. If you only support hours, you might not provide fine enough granularity for some use cases. I think the only real candidate for removal would be log.retention.ms, but it seems that was added for consistency. I actually think the ideal solution is to make ConfigDef.Types contain a DURATION type. It could either make it so that adding a setting foo of type TIME allows you to specify foo. = duration where unit is ms,seconds,minutes,hours, or just parse strings with the appropriate suffixes (ms, s, m, hr) into int + TimeUnit. This gives users flexibility to use whatever is most convenient and standardizes on supporting any format for any duration config. In practice, the first version is backwards compatible and probably feasible but annoying to add to the ConfigDef/AbstractConfig classes, and the second option is not backwards compatible. -Ewen On Fri, Jul 24, 2015 at 7:29 PM, Grant Henke wrote: > They take precedence in order of granularity (ms > minutes > hours). Below > is the relevant snippet of code. > > I vote for keeping all time related configs in milliseconds and deprecating > the others. It gives you the most control in a single config. The minutes > and hours ones offer no new functionality that can't be achieved with > the milliseconds one. > > private def getLogRetentionTimeMillis(): Long = { > val millisInMinute = 60L * 1000L > val millisInHour = 60L * millisInMinute > > if(props.containsKey("log.retention.ms")){ >props.getIntInRange("log.retention.ms", (1, Int.MaxValue)) > } > else if(props.containsKey("log.retention.minutes")){ >millisInMinute * props.getIntInRange("log.retention.minutes", (1, > Int.MaxValue)) > } > else { >millisInHour * props.getIntInRange("log.retention.hours", 24*7, (1, > Int.MaxValue)) > } > } > > On Fri, Jul 24, 2015 at 9:18 PM, Mayuresh Gharat < > gharatmayures...@gmail.com > > wrote: > > > Hmm. > > This creates confusion for people who might be new to kafka or have never > > used those properties. > > Does deprecating the older ones make sense here? > > > > Thanks, > > > > Mayuresh > > > > > > On Fri, Jul 24, 2015 at 7:15 PM, Gwen Shapira > > wrote: > > > > > Agree. All I found on this was KAFKA-1325, which is more about > > > inconsistency than real use-case. > > > > > > Anyway, I'd argue that if we added it in 0.8.2.0, we can't take it out > > now > > > :) > > > > > > On Fri, Jul 24, 2015 at 7:04 PM, Aditya Auradkar > > > wrote: > > > > Is there actually a use case where we need "log.retention.ms"? In > most > > > > cases, people would want to retain their logs for at least a few > > minutes > > > > I'd think. > > > > > > > > Aditya > > > > > > > > On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira > > > > wrote: > > > > > > > >> Backward compatibility, I think. > > > >> > > > >> At least the "ms" one is fairly new, and I think we left the others > to > > > >> avoid break configuration during upgrade. > > > >> > > > >> On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat > > > >> wrote: > > > >> > I was thinking why we have 3 different configs for the same > property > > > (log > > > >> > retention) : > > > >> > > > > >> > "log.retention.ms" > > > >> > "log.retention.minutes" > > > >> > "log.retention.hours" > > > >> > > > > >> > Why don't we only use the Milliseconds? > > > >> > There are other properties as well like log Jitter, LogRollTime > > which > > > >> raise > > > >> > the same question in my mind. > > > >> > > > > >> > The reason I am asking this is that a user might misconfigure them > > > like > > > >> > setting the log.retention.ms = 1 and log.retention.hours = 1. > > > >> > > > > >> > -- > > > >> > -Regards, > > > >> > Mayuresh R. Gharat > > > >> > (862) 250-7125 > > > >> > > > > > > > > > > > -- > > -Regards, > > Mayuresh R. Gharat > > (862) 250-7125 > > > > > > -- > Grant Henke > Solutions Consultant | Cloudera > ghe...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke > -- Thanks, Ewen
Re: Query about LogConfigs
Cool. Yeah +1 on make it consistent. Thanks, Mayuresh On Fri, Jul 24, 2015 at 7:29 PM, Grant Henke wrote: > They take precedence in order of granularity (ms > minutes > hours). Below > is the relevant snippet of code. > > I vote for keeping all time related configs in milliseconds and deprecating > the others. It gives you the most control in a single config. The minutes > and hours ones offer no new functionality that can't be achieved with > the milliseconds one. > > private def getLogRetentionTimeMillis(): Long = { > val millisInMinute = 60L * 1000L > val millisInHour = 60L * millisInMinute > > if(props.containsKey("log.retention.ms")){ >props.getIntInRange("log.retention.ms", (1, Int.MaxValue)) > } > else if(props.containsKey("log.retention.minutes")){ >millisInMinute * props.getIntInRange("log.retention.minutes", (1, > Int.MaxValue)) > } > else { >millisInHour * props.getIntInRange("log.retention.hours", 24*7, (1, > Int.MaxValue)) > } > } > > On Fri, Jul 24, 2015 at 9:18 PM, Mayuresh Gharat < > gharatmayures...@gmail.com > > wrote: > > > Hmm. > > This creates confusion for people who might be new to kafka or have never > > used those properties. > > Does deprecating the older ones make sense here? > > > > Thanks, > > > > Mayuresh > > > > > > On Fri, Jul 24, 2015 at 7:15 PM, Gwen Shapira > > wrote: > > > > > Agree. All I found on this was KAFKA-1325, which is more about > > > inconsistency than real use-case. > > > > > > Anyway, I'd argue that if we added it in 0.8.2.0, we can't take it out > > now > > > :) > > > > > > On Fri, Jul 24, 2015 at 7:04 PM, Aditya Auradkar > > > wrote: > > > > Is there actually a use case where we need "log.retention.ms"? In > most > > > > cases, people would want to retain their logs for at least a few > > minutes > > > > I'd think. > > > > > > > > Aditya > > > > > > > > On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira > > > > wrote: > > > > > > > >> Backward compatibility, I think. > > > >> > > > >> At least the "ms" one is fairly new, and I think we left the others > to > > > >> avoid break configuration during upgrade. > > > >> > > > >> On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat > > > >> wrote: > > > >> > I was thinking why we have 3 different configs for the same > property > > > (log > > > >> > retention) : > > > >> > > > > >> > "log.retention.ms" > > > >> > "log.retention.minutes" > > > >> > "log.retention.hours" > > > >> > > > > >> > Why don't we only use the Milliseconds? > > > >> > There are other properties as well like log Jitter, LogRollTime > > which > > > >> raise > > > >> > the same question in my mind. > > > >> > > > > >> > The reason I am asking this is that a user might misconfigure them > > > like > > > >> > setting the log.retention.ms = 1 and log.retention.hours = 1. > > > >> > > > > >> > -- > > > >> > -Regards, > > > >> > Mayuresh R. Gharat > > > >> > (862) 250-7125 > > > >> > > > > > > > > > > > -- > > -Regards, > > Mayuresh R. Gharat > > (862) 250-7125 > > > > > > -- > Grant Henke > Solutions Consultant | Cloudera > ghe...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke > -- -Regards, Mayuresh R. Gharat (862) 250-7125
Re: Query about LogConfigs
They take precedence in order of granularity (ms > minutes > hours). Below is the relevant snippet of code. I vote for keeping all time related configs in milliseconds and deprecating the others. It gives you the most control in a single config. The minutes and hours ones offer no new functionality that can't be achieved with the milliseconds one. private def getLogRetentionTimeMillis(): Long = { val millisInMinute = 60L * 1000L val millisInHour = 60L * millisInMinute if(props.containsKey("log.retention.ms")){ props.getIntInRange("log.retention.ms", (1, Int.MaxValue)) } else if(props.containsKey("log.retention.minutes")){ millisInMinute * props.getIntInRange("log.retention.minutes", (1, Int.MaxValue)) } else { millisInHour * props.getIntInRange("log.retention.hours", 24*7, (1, Int.MaxValue)) } } On Fri, Jul 24, 2015 at 9:18 PM, Mayuresh Gharat wrote: > Hmm. > This creates confusion for people who might be new to kafka or have never > used those properties. > Does deprecating the older ones make sense here? > > Thanks, > > Mayuresh > > > On Fri, Jul 24, 2015 at 7:15 PM, Gwen Shapira > wrote: > > > Agree. All I found on this was KAFKA-1325, which is more about > > inconsistency than real use-case. > > > > Anyway, I'd argue that if we added it in 0.8.2.0, we can't take it out > now > > :) > > > > On Fri, Jul 24, 2015 at 7:04 PM, Aditya Auradkar > > wrote: > > > Is there actually a use case where we need "log.retention.ms"? In most > > > cases, people would want to retain their logs for at least a few > minutes > > > I'd think. > > > > > > Aditya > > > > > > On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira > > wrote: > > > > > >> Backward compatibility, I think. > > >> > > >> At least the "ms" one is fairly new, and I think we left the others to > > >> avoid break configuration during upgrade. > > >> > > >> On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat > > >> wrote: > > >> > I was thinking why we have 3 different configs for the same property > > (log > > >> > retention) : > > >> > > > >> > "log.retention.ms" > > >> > "log.retention.minutes" > > >> > "log.retention.hours" > > >> > > > >> > Why don't we only use the Milliseconds? > > >> > There are other properties as well like log Jitter, LogRollTime > which > > >> raise > > >> > the same question in my mind. > > >> > > > >> > The reason I am asking this is that a user might misconfigure them > > like > > >> > setting the log.retention.ms = 1 and log.retention.hours = 1. > > >> > > > >> > -- > > >> > -Regards, > > >> > Mayuresh R. Gharat > > >> > (862) 250-7125 > > >> > > > > > > -- > -Regards, > Mayuresh R. Gharat > (862) 250-7125 > -- Grant Henke Solutions Consultant | Cloudera ghe...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
Re: Query about LogConfigs
Hmm. This creates confusion for people who might be new to kafka or have never used those properties. Does deprecating the older ones make sense here? Thanks, Mayuresh On Fri, Jul 24, 2015 at 7:15 PM, Gwen Shapira wrote: > Agree. All I found on this was KAFKA-1325, which is more about > inconsistency than real use-case. > > Anyway, I'd argue that if we added it in 0.8.2.0, we can't take it out now > :) > > On Fri, Jul 24, 2015 at 7:04 PM, Aditya Auradkar > wrote: > > Is there actually a use case where we need "log.retention.ms"? In most > > cases, people would want to retain their logs for at least a few minutes > > I'd think. > > > > Aditya > > > > On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira > wrote: > > > >> Backward compatibility, I think. > >> > >> At least the "ms" one is fairly new, and I think we left the others to > >> avoid break configuration during upgrade. > >> > >> On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat > >> wrote: > >> > I was thinking why we have 3 different configs for the same property > (log > >> > retention) : > >> > > >> > "log.retention.ms" > >> > "log.retention.minutes" > >> > "log.retention.hours" > >> > > >> > Why don't we only use the Milliseconds? > >> > There are other properties as well like log Jitter, LogRollTime which > >> raise > >> > the same question in my mind. > >> > > >> > The reason I am asking this is that a user might misconfigure them > like > >> > setting the log.retention.ms = 1 and log.retention.hours = 1. > >> > > >> > -- > >> > -Regards, > >> > Mayuresh R. Gharat > >> > (862) 250-7125 > >> > -- -Regards, Mayuresh R. Gharat (862) 250-7125
Re: Query about LogConfigs
Oh ok. Is there a plan that we should deprecate the older ones. This is because as I said some users might misconfigure it like by using both the log.retention.ms" and "log.retention.minutes" by mistake. We can probably set a warning that only one of this should be used. Also looking at the KafkaConfig, it looks like the LogRetentionTimeHours is a required config while the LogRetentionTimeMillis and LogRetentionTimeMinutes are not. Also the doc says this : val LogRetentionTimeMillisDoc = "The number of milliseconds to keep a log file before deleting it (in milliseconds)" val LogRetentionTimeMinsDoc = "The number of minutes to keep a log file before deleting it (in minutes), secondary to " + LogRetentionTimeMillisProp + " property" val LogRetentionTimeHoursDoc = "The number of hours to keep a log file before deleting it (in hours), tertiary to " + LogRetentionTimeMillisProp + " property" So if my understanding is correct, LogRetentionTimeHours takes precedence over all of the other. Am I right? Thanks, Mayuresh On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira wrote: > Backward compatibility, I think. > > At least the "ms" one is fairly new, and I think we left the others to > avoid break configuration during upgrade. > > On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat > wrote: > > I was thinking why we have 3 different configs for the same property (log > > retention) : > > > > "log.retention.ms" > > "log.retention.minutes" > > "log.retention.hours" > > > > Why don't we only use the Milliseconds? > > There are other properties as well like log Jitter, LogRollTime which > raise > > the same question in my mind. > > > > The reason I am asking this is that a user might misconfigure them like > > setting the log.retention.ms = 1 and log.retention.hours = 1. > > > > -- > > -Regards, > > Mayuresh R. Gharat > > (862) 250-7125 > -- -Regards, Mayuresh R. Gharat (862) 250-7125
Re: Query about LogConfigs
Agree. All I found on this was KAFKA-1325, which is more about inconsistency than real use-case. Anyway, I'd argue that if we added it in 0.8.2.0, we can't take it out now :) On Fri, Jul 24, 2015 at 7:04 PM, Aditya Auradkar wrote: > Is there actually a use case where we need "log.retention.ms"? In most > cases, people would want to retain their logs for at least a few minutes > I'd think. > > Aditya > > On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira wrote: > >> Backward compatibility, I think. >> >> At least the "ms" one is fairly new, and I think we left the others to >> avoid break configuration during upgrade. >> >> On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat >> wrote: >> > I was thinking why we have 3 different configs for the same property (log >> > retention) : >> > >> > "log.retention.ms" >> > "log.retention.minutes" >> > "log.retention.hours" >> > >> > Why don't we only use the Milliseconds? >> > There are other properties as well like log Jitter, LogRollTime which >> raise >> > the same question in my mind. >> > >> > The reason I am asking this is that a user might misconfigure them like >> > setting the log.retention.ms = 1 and log.retention.hours = 1. >> > >> > -- >> > -Regards, >> > Mayuresh R. Gharat >> > (862) 250-7125 >>
Re: Query about LogConfigs
Is there actually a use case where we need "log.retention.ms"? In most cases, people would want to retain their logs for at least a few minutes I'd think. Aditya On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira wrote: > Backward compatibility, I think. > > At least the "ms" one is fairly new, and I think we left the others to > avoid break configuration during upgrade. > > On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat > wrote: > > I was thinking why we have 3 different configs for the same property (log > > retention) : > > > > "log.retention.ms" > > "log.retention.minutes" > > "log.retention.hours" > > > > Why don't we only use the Milliseconds? > > There are other properties as well like log Jitter, LogRollTime which > raise > > the same question in my mind. > > > > The reason I am asking this is that a user might misconfigure them like > > setting the log.retention.ms = 1 and log.retention.hours = 1. > > > > -- > > -Regards, > > Mayuresh R. Gharat > > (862) 250-7125 >
Re: Query about LogConfigs
Backward compatibility, I think. At least the "ms" one is fairly new, and I think we left the others to avoid break configuration during upgrade. On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat wrote: > I was thinking why we have 3 different configs for the same property (log > retention) : > > "log.retention.ms" > "log.retention.minutes" > "log.retention.hours" > > Why don't we only use the Milliseconds? > There are other properties as well like log Jitter, LogRollTime which raise > the same question in my mind. > > The reason I am asking this is that a user might misconfigure them like > setting the log.retention.ms = 1 and log.retention.hours = 1. > > -- > -Regards, > Mayuresh R. Gharat > (862) 250-7125
Query about LogConfigs
I was thinking why we have 3 different configs for the same property (log retention) : "log.retention.ms" "log.retention.minutes" "log.retention.hours" Why don't we only use the Milliseconds? There are other properties as well like log Jitter, LogRollTime which raise the same question in my mind. The reason I am asking this is that a user might misconfigure them like setting the log.retention.ms = 1 and log.retention.hours = 1. -- -Regards, Mayuresh R. Gharat (862) 250-7125