Re: How to parse some of JMX Bean's names.
Hi Neha,

The patch is in https://issues.apache.org/jira/browse/KAFKA-1481 . It's super-simple. Does it require a review or can one submit it to Jenkins or something else?

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Wed, Jun 4, 2014 at 9:47 PM, Neha Narkhede neha.narkh...@gmail.com wrote:

Agree that 0.9 is a few months out. 0.8.2, on the other hand, is probably less than a month away.

On one hand, we are trying our best to minimize the set of changes to the existing consumer in the interest of saving time to work on the new consumer. Since the old consumer is very complex and has been stable for some time now, the motivation is to make fewer changes to it to maintain that stability. On the other hand, if there are critical bug fixes, it makes sense to patch the old consumer and do a point release.

We would be happy to take a patch from you. How about we look at the size of the proposed changes and discuss a release timeline on the JIRA?

Thanks,
Neha
Re: How to parse some of JMX Bean's names.
Yes, it will require a review and I see that Jun reviewed it.

On Fri, Jun 20, 2014 at 5:03 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

Hi Neha,

The patch is in https://issues.apache.org/jira/browse/KAFKA-1481 . It's super-simple. Does it require a review or can one submit it to Jenkins or something else?

Thanks,
Otis
Re: How to parse some of JMX Bean's names.
Hello Vladimir, comments in-lined.

On Wed, Jun 4, 2014 at 6:08 AM, Vladimir Tretyakov vladimir.tretya...@sematext.com wrote:

Hi again, a few more questions from me:

*1.* What I see in JMX:

kafka.consumer:type=ZookeeperConsumerConnector,name=af_servers-af_servers-spm_new_cluster_topic-af_servers_wawanawna-Dell-1401353748289-fcaaea29-0-FetchQueueSize

From the code:

    newGauge(
      config.clientId + "-" + config.groupId + "-" + topicThreadId._1 + "-" + topicThreadId._2 + "-FetchQueueSize",
      new Gauge[Int] {
        def value = q.size
      }
    )

I've tried to parse the parts as I understood them:

config.clientId     af_servers
topicThreadId._1    af_servers-spm_new_cluster_topic
topicThreadId._2    af_servers_wawanawna-Dell-1401353748289-fcaaea29-0

Yes, I can suppose that topicThreadId._1 will always look like GROUP_ID+TOPIC and topicThreadId._2 will contain the CONSUMER HOST, but will that always be true?

With the new consumer coming soon, the metrics naming schemes are very likely to be refined, so I cannot say this will always be true.

*2.* From the code I see that sometimes Kafka uses _ as a separator, not only -:

    val consumerIdString = {
      var consumerUuid : String = null
      config.consumerId match {
        case Some(consumerId) // for testing only
          => consumerUuid = consumerId
        case None // generate unique consumerId automatically
          => val uuid = UUID.randomUUID()
          consumerUuid = "%s-%d-%s".format(
            InetAddress.getLocalHost.getHostName, System.currentTimeMillis,
            uuid.getMostSignificantBits().toHexString.substring(0,8))
      }
      config.groupId + "_" + consumerUuid
    }

That means that if a user uses _ as part of a host/topic/groupId name, it may be a problem to parse a string like:

kafka.consumer:type=ZookeeperConsumerConnector,name=af_servers-af_servers-spm_new_cluster_topic-af_servers_wawanawna-Dell-1401353748289-fcaaea29-0-FetchQueueSize

Look at the part spm_new_cluster_topic-af_servers_wawanawna-Dell: what is the host name here, servers_wawanawna-Dell or wawanawna-Dell?

So on the one hand, if we want to be able to parse the name without any problems, we have to avoid using - and _ in host/topic/groupId/clientId, but at the same time I see (from http://grokbase.com/t/kafka/users/133xfsnpdh/cant-use-in-client-name):

*Client id is used for registering jmx beans for monitoring. Because of the restrictions in bean names, we limit the client id to be only alpha-numeric plus - and _.*

Does that mean a user can use only camelCase in host/topic/groupId/clientId to distinguish one part of the name from another? Is this a problem? Or did I misunderstand something?

Yeah I agree this is a problem, and we should fix it in the new consumer.

Best regards from Sematext.
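To make the second question concrete, here is a minimal Scala sketch (not Kafka code) of why the group/host boundary cannot be recovered from the consumer id. It assumes the id has the shape groupId + "_" + host-millis-uuidPrefix, as in the quoted consumerIdString snippet. The object and method names are made up for illustration, and the host name, timestamp and UUID prefix are passed in as parameters so the result is repeatable.

```scala
object ConsumerIdAmbiguity {
  // Mirrors the concatenation in the quoted consumerIdString snippet, but with
  // host, timestamp and UUID prefix supplied by the caller (an assumption made
  // here so the example is deterministic).
  def consumerIdString(groupId: String, host: String, millis: Long, uuidPrefix: String): String =
    groupId + "_" + "%s-%d-%s".format(host, millis, uuidPrefix)

  // Every (groupId, host) pair that could have produced `id`, assuming the id
  // ends with "-<millis>-<uuidPrefix>" as in the sample bean name.
  def candidateGroupAndHost(id: String): Seq[(String, String)] = {
    // Drop the last two dash-separated tokens (millis and UUID prefix).
    val head = id.split("-").dropRight(2).mkString("-")
    // Any underscore in what is left may be the group/host separator.
    head.indices
      .filter(i => head(i) == '_')
      .map(i => (head.take(i), head.drop(i + 1)))
  }
}
```

For the sample id af_servers_wawanawna-Dell-1401353748289-fcaaea29, candidateGroupAndHost returns both (af, servers_wawanawna-Dell) and (af_servers, wawanawna-Dell), so a parser cannot pick one without already knowing the group id.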
Re: How to parse some of JMX Bean's names.
I think we will make this change in the new consumer, which may be released in 0.9.

Guozhang

On Wed, Jun 4, 2014 at 12:13 PM, Vladimir Tretyakov vladimir.tretya...@sematext.com wrote:

Hi, thx Guozhang Wang, looking forward. When do you think these changes will be available? 0.8.2 (Jul 2014, https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan)? Or later?

Best regards, Vladimir.
Re: How to parse some of JMX Bean's names.
Hi Guozhang,

Since the new consumer is 2-3 months out ... hm, no, looks like 4 months out - October - https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan

Any way you could change this in 0.8.1.2 or 0.8.2? We can submit a patch.

Thanks,
Otis

On Wed, Jun 4, 2014 at 4:20 PM, Guozhang Wang wangg...@gmail.com wrote:

I think we will make this change in the new consumer, which may be released in 0.9.

Guozhang
Re: How to parse some of JMX Bean's names.
Agree that 0.9 is a few months out. 0.8.2, on the other hand, is probably less than a month away.

On one hand, we are trying our best to minimize the set of changes to the existing consumer in the interest of saving time to work on the new consumer. Since the old consumer is very complex and has been stable for some time now, the motivation is to make fewer changes to it to maintain that stability. On the other hand, if there are critical bug fixes, it makes sense to patch the old consumer and do a point release.

We would be happy to take a patch from you. How about we look at the size of the proposed changes and discuss a release timeline on the JIRA?

Thanks,
Neha

On Wed, Jun 4, 2014 at 3:09 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

Hi Guozhang,

Since the new consumer is 2-3 months out ... hm, no, looks like 4 months out - October - https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan

Any way you could change this in 0.8.1.2 or 0.8.2? We can submit a patch.

Thanks,
Otis
Re: How to parse some of JMX Bean's names.
That is indeed a problem. For now, we recommend that group and topic names use _ where there is a need for -, but this should be fixed systematically. For your use case, could you change your topic/group name to use _? Also, do you mind filing a JIRA ticket to keep track of this issue?

Guozhang

On Mon, Jun 2, 2014 at 5:18 AM, Vladimir Tretyakov vladimir.tretya...@sematext.com wrote:

Hello everyone,

We are adding Kafka 0.8.x monitoring support to SPM (http://sematext.com/spm/) here at Sematext. Unfortunately, we quickly hit an issue caused by the new bean naming convention that embeds things like topic and host names in the beans along with metrics, separated by dashes, making it hard to parse these beans.

To put it simply: it is hard/impossible to automatically figure out which part of the bean name is e.g. the consumer group, which is the topic, which is the host name, and which is the name of the metric.

Let me show you what I mean:

kafka.consumer:type=ConsumerTopicMetrics, name=af_servers-spm_topic-BytesPerSec

Here we actually CAN extract:
* consumer group ('af_servers')
* topic ('spm_topic')
* metric ('BytesPerSec')

BUT what if the consumer group id and/or topic name contain '-'? Then how would we extract the consumer group and topic? Here is a concrete example of this problem:

kafka.consumer:type=ConsumerTopicMetrics, name=af-servers-spm-topic-BytesPerSec

How can we know what is the group id or the topic name here? This looks like a problem to me, but maybe I'm missing something?

Is it possible to have all these values (group id, topic name) as separate attributes inside the JMX bean? Or maybe the problem could be solved if a different delimiter was used, such as the pipe ("|")?

This is really needed and would be nice to have in order to build a good monitoring tool.

Thx and best regards from Sematext.

--
-- Guozhang
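The ambiguity described in this message can be sketched in a few lines of Scala. This is only an illustration, not anything from Kafka or SPM: it assumes the name value has the form group-topic-metric and that the metric is the last dash-separated token and contains no dash itself. DashSplit and readings are hypothetical names.

```scala
object DashSplit {
  // All (group, topic, metric) readings of a "group-topic-metric" name,
  // assuming the metric is the last dash-separated token.
  def readings(name: String): Seq[(String, String, String)] = {
    val tokens = name.split("-").toVector
    val metric = tokens.last
    val rest = tokens.init
    // Any dash between the remaining tokens may be the group/topic boundary.
    (1 until rest.length).map { i =>
      (rest.take(i).mkString("-"), rest.drop(i).mkString("-"), metric)
    }
  }
}
```

For af_servers-spm_topic-BytesPerSec there is exactly one reading. For af-servers-spm-topic-BytesPerSec there are three (group af, group af-servers, or group af-servers-spm), and nothing in the name says which one is right.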
Re: How to parse some of JMX Bean's names.
Hi Guozhang,

On Mon, Jun 2, 2014 at 7:18 PM, Guozhang Wang wangg...@gmail.com wrote:

That is indeed a problem. For now, we recommend that group and topic names use _ where there is a need for -, but this should be fixed systematically.

Right!

For your use case, could you change your topic/group name to use _?

Our own Kafka doesn't use topics with - characters, so we don't have a problem. The problem, in our case, is that we have a general (Kafka) monitoring tool that other people use to monitor Kafka - see http://sematext.com/spm/ . So we can't really tell people "hey, our tool will work but only if you don't have a dash in your topic names and hosts and ..." because if you use dashes we won't know how to parse your Kafka's MBean names :)

Also, do you mind filing a JIRA ticket to keep track of this issue?

Here it is: https://issues.apache.org/jira/browse/KAFKA-1481

Otis
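One option raised in this thread is to carry the group id and topic as separate attributes rather than inside one dash-joined name. A minimal Scala sketch of that idea using the standard javax.management.ObjectName class follows. The key names groupId and topic, and the KeyPropertyNames object, are assumptions for illustration only; the thread does not say what layout the KAFKA-1481 patch uses.

```scala
import javax.management.ObjectName

object KeyPropertyNames {
  // Hypothetical layout: one key property per field instead of a single
  // dash-joined "name" value. The keys "groupId" and "topic" are illustrative.
  def build(groupId: String, topic: String, metric: String): ObjectName =
    new ObjectName(
      "kafka.consumer:type=ConsumerTopicMetrics,name=%s,groupId=%s,topic=%s"
        .format(metric, groupId, topic))

  // Reading the fields back needs no delimiter guessing.
  def fields(on: ObjectName): (String, String, String) =
    (on.getKeyProperty("groupId"), on.getKeyProperty("topic"), on.getKeyProperty("name"))
}
```

With this layout, build("af-servers", "spm-topic", "BytesPerSec") round-trips through fields unchanged even though both values contain dashes. Values containing characters that are special in object names, such as a comma, equals sign, colon or double quote, would need ObjectName.quote first; dashes and underscores do not.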