Re: How to parse some of JMX Bean's names.

2014-06-20 Thread Otis Gospodnetic
Hi Neha,

The patch is in https://issues.apache.org/jira/browse/KAFKA-1481 . It's
super-simple.  Does it require a review or can one submit it to Jenkins or
something else?

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Wed, Jun 4, 2014 at 9:47 PM, Neha Narkhede neha.narkh...@gmail.com
wrote:

 Agree that 0.9 is a few months out. 0.8.2, on the other hand, is probably
 less than a month away. On one hand, we are trying our best to minimize the
 set of changes to the existing consumer in the interest of saving time to
 work on the new consumer. Since the old consumer is very complex and has been
 stable for some time, the motivation is to make fewer changes to it to
 maintain that stability. On the other hand, if there are critical bug
 fixes, it makes sense to patch the old consumer and do a point release.

 We would be happy to take a patch from you. How about we look at the size
 of the proposed changes and discuss a release timeline on the JIRA?

 Thanks,
 Neha



Re: How to parse some of JMX Bean's names.

2014-06-20 Thread Neha Narkhede
Yes, it will require a review and I see that Jun reviewed it.


On Fri, Jun 20, 2014 at 5:03 AM, Otis Gospodnetic 
otis.gospodne...@gmail.com wrote:

 Hi Neha,

 The patch is in https://issues.apache.org/jira/browse/KAFKA-1481 . It's
 super-simple.  Does it require a review or can one submit it to Jenkins or
 something else?

 Thanks,
 Otis
 --
 Performance Monitoring * Log Analytics * Search Analytics
 Solr & Elasticsearch Support * http://sematext.com/



Re: How to parse some of JMX Bean's names.

2014-06-04 Thread Guozhang Wang
Hello Vladimir, comments in-lined.


On Wed, Jun 4, 2014 at 6:08 AM, Vladimir Tretyakov 
vladimir.tretya...@sematext.com wrote:

 Hi again, a few more questions from me:

 *1.*

 What I see in JMX:


 kafka.consumer:type=ZookeeperConsumerConnector,name=af_servers-af_servers-spm_new_cluster_topic-af_servers_wawanawna-Dell-1401353748289-fcaaea29-0-FetchQueueSize

 From code:

 newGauge(
   config.clientId + "-" + config.groupId + "-" + topicThreadId._1 +
   "-" + topicThreadId._2 + "-FetchQueueSize",
   new Gauge[Int] {
     def value = q.size
   }
 )

 I've tried to parse the parts as I understand them:

 config.clientId   af_servers
 topicThreadId._1  af_servers-spm_new_cluster_topic
 topicThreadId._2  af_servers_wawanawna-Dell-1401353748289-fcaaea29-0

 I can suppose that topicThreadId._1 will always look like
 GROUP_ID+TOPIC and that topicThreadId._2 will contain the CONSUMER HOST,
 but will that always be true?



With the new consumer coming soon, the metrics naming scheme will very
likely be refined, so I cannot say this will always be true.
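
To make the layout of the example name concrete, here is a minimal sketch that rebuilds it from the pieces the quoted newGauge call concatenates. The component values are inferred from the example bean name in this thread, and FetchQueueName is a hypothetical helper for illustration, not a Kafka API.

```scala
// Hypothetical helper: mirrors the string concatenation in the quoted newGauge call.
object FetchQueueName {
  def build(clientId: String, groupId: String, topic: String, consumerThreadId: String): String =
    clientId + "-" + groupId + "-" + topic + "-" + consumerThreadId + "-FetchQueueSize"

  // Assumed values, read off the example bean name quoted in this thread.
  val example: String = build(
    clientId = "af_servers",
    groupId = "af_servers",
    topic = "spm_new_cluster_topic",
    consumerThreadId = "af_servers_wawanawna-Dell-1401353748289-fcaaea29-0")
}
```

Read this way, topicThreadId._1 would be just the topic, and the group id would appear twice: once as config.groupId and once as the prefix of the consumer id. That reading is an inference from the quoted code, not something confirmed in the thread.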




 *2.*

 From the code I see that Kafka sometimes uses "_" as a separator, not
 only "-":

 val consumerIdString = {
   var consumerUuid : String = null
   config.consumerId match {
     case Some(consumerId) // for testing only
     => consumerUuid = consumerId
     case None // generate unique consumerId automatically
     => val uuid = UUID.randomUUID()
     consumerUuid = "%s-%d-%s".format(
       InetAddress.getLocalHost.getHostName, System.currentTimeMillis,
       uuid.getMostSignificantBits().toHexString.substring(0,8))
   }
   config.groupId + "_" + consumerUuid
 }

 That means that if a user has "_" in a host/topic/groupId name, it
 may be a problem to parse a string like:


 kafka.consumer:type=ZookeeperConsumerConnector,name=af_servers-af_servers-spm_new_cluster_topic-af_servers_wawanawna-Dell-1401353748289-fcaaea29-0-FetchQueueSize

 Look at the part "spm_new_cluster_topic-af_servers_wawanawna-Dell": what is
 the host name here, "servers_wawanawna-Dell" or "wawanawna-Dell"?

 So on the one hand, if we want to be able to parse the name without any
 problems, we have to avoid using "-" and "_" in host/topic/groupId/clientId,
 but at the same time I see (from
 http://grokbase.com/t/kafka/users/133xfsnpdh/cant-use-in-client-name):

 *Client id is used for registering jmx beans for monitoring. Because of the
 restrictions in bean names, we limit the client id to be only alpha-numeric
 plus - and _.*

 Does that mean a user can use only camelCase in host/topic/groupId/clientId
 to distinguish one part of the name from another?

 Is this a problem, or did I misunderstand something?


Yeah, I agree this is a problem, and we should fix it in the new consumer.
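
The underscore ambiguity can be reproduced with a short sketch. It strips the generated tail of the consumer id (milliseconds and 8 hex characters, per the quoted consumerIdString code) and then lists every way to cut what remains at a single "_". The trailing thread number is an assumption read off the example name, and ConsumerIdSplit is a hypothetical helper, not part of Kafka.

```scala
// Hypothetical helper for illustration only.
object ConsumerIdSplit {
  // Tail per the quoted code: -<millis>-<8 hex chars>, followed by a
  // -<thread number> that is assumed from the example name in this thread.
  private val Tail = "^(.*)-(\\d+)-([0-9a-f]{8})-(\\d+)$".r

  /** Returns the "groupId_host" part if the id ends in the expected tail. */
  def prefix(consumerThreadId: String): Option[String] = consumerThreadId match {
    case Tail(head, _, _, _) => Some(head)
    case _                   => None
  }

  /** Every (groupId, host) pair obtained by cutting s at exactly one "_". */
  def candidates(s: String): Seq[(String, String)] =
    s.indices.filter(i => s(i) == '_')
      .map(i => (s.substring(0, i), s.substring(i + 1)))
}
```

For the id in the example, prefix yields "af_servers_wawanawna-Dell" and candidates returns two pairs, ("af", "servers_wawanawna-Dell") and ("af_servers", "wawanawna-Dell"). Nothing in the name itself says which pair is correct.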



 Best regards from Sematext.







Re: How to parse some of JMX Bean's names.

2014-06-04 Thread Guozhang Wang
I think we will make this change in the new consumer, which may be released
in 0.9.

Guozhang


On Wed, Jun 4, 2014 at 12:13 PM, Vladimir Tretyakov 
vladimir.tretya...@sematext.com wrote:

 Hi, thx Guozhang Wang, looking forward.

 When do you think these changes will be available? 0.8.2? Jul 2014
 (https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan)? Or
 later?

 Best regards, Vladimir.



Re: How to parse some of JMX Bean's names.

2014-06-04 Thread Otis Gospodnetic
Hi Guozhang,

Since the new consumer is 2-3 months out ... hm, no, looks like 4 months
out - October -
https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan

Any way you could change this in 0.8.1.2 or 0.8.2?  We can submit a patch.

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Wed, Jun 4, 2014 at 4:20 PM, Guozhang Wang wangg...@gmail.com wrote:

 I think we will make this change in the new consumer, which may be released
 in 0.9.

 Guozhang



Re: How to parse some of JMX Bean's names.

2014-06-04 Thread Neha Narkhede
Agree that 0.9 is a few months out. 0.8.2, on the other hand, is probably
less than a month away. On one hand, we are trying our best to minimize the
set of changes to the existing consumer in the interest of saving time to
work on the new consumer. Since the old consumer is very complex and has been
stable for some time, the motivation is to make fewer changes to it to
maintain that stability. On the other hand, if there are critical bug
fixes, it makes sense to patch the old consumer and do a point release.

We would be happy to take a patch from you. How about we look at the size
of the proposed changes and discuss a release timeline on the JIRA?

Thanks,
Neha


On Wed, Jun 4, 2014 at 3:09 PM, Otis Gospodnetic otis.gospodne...@gmail.com
 wrote:

 Hi Guozhang,

 Since the new consumer is 2-3 months out ... hm, no, looks like 4 months
 out - October -
 https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan

 Any way you could change this in 0.8.1.2 or 0.8.2?  We can submit a patch.

 Thanks,
 Otis
 --
 Performance Monitoring * Log Analytics * Search Analytics
 Solr & Elasticsearch Support * http://sematext.com/



Re: How to parse some of JMX Bean's names.

2014-06-02 Thread Guozhang Wang
That is indeed a problem. For now, we recommend using "_" in group and topic
names wherever "-" would otherwise be needed, but this should be fixed
systematically.

For your use case, could you change your topic/group names to use "_"? Also,
would you mind filing a JIRA ticket to keep track of this issue?

Guozhang


On Mon, Jun 2, 2014 at 5:18 AM, Vladimir Tretyakov 
vladimir.tretya...@sematext.com wrote:

 Hello everyone,

 We are adding Kafka 0.8.x monitoring support to SPM
 http://sematext.com/spm/ here at Sematext. Unfortunately, we quickly hit
 an issue caused by the new bean naming convention that embeds things like
 topic and host names in the beans along with metrics, separated by dashes,
 making it hard to parse these beans.

 To put it simply: it is hard/impossible to automatically figure out which
 part of the bean name is e.g. consumer group, which is the topic, which is
 the host name, and which is the name of the metric.

 Let me show you what I mean:

 kafka.consumer:type=ConsumerTopicMetrics,name=af_servers-spm_topic-BytesPerSec

 Here we actually CAN extract:

  * consumer group ('af_servers')
  * topic ('spm_topic')
  * metric ('BytesPerSec')

 BUT what if the consumer group id and/or topic name contain '-'?

 Then how would we extract consumer group and topic?

 Here is a concrete example of this problem:

 kafka.consumer:type=ConsumerTopicMetrics,name=af-servers-spm-topic-BytesPerSec

 How can we know which part is the group id and which is the topic name here?

 This looks like a problem to me, but maybe I’m missing something?

 Is it possible to have all these values (group id, topic name) as separate
 attributes inside the JMX bean?

 Or maybe the problem could be solved if a different delimiter was used,
 such as the pipe (“|”)?

 This is really needed, and it would be nice to have in order to build a good
 monitoring tool.

 Thx and best regards from Sematext.




-- 
-- Guozhang
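
On the question of separate attributes: a JMX ObjectName is already a set of key=value properties, so one possible layout carries each component under its own key instead of packing everything into name. The sketch below is only an illustration of that idea; the keys "groupId" and "topic" are hypothetical and are not taken from Kafka or from the JIRA ticket.

```scala
import javax.management.ObjectName

// Hypothetical bean name layout for illustration only.
object TaggedBeanName {
  val name = new ObjectName(
    "kafka.consumer:type=ConsumerTopicMetrics,groupId=af-servers,topic=spm-topic,name=BytesPerSec")

  // Each component is read back by key, so dashes inside values are harmless.
  val groupId: String = name.getKeyProperty("groupId")
  val topic: String   = name.getKeyProperty("topic")
  val metric: String  = name.getKeyProperty("name")
}
```

With this layout a monitoring tool does not need to split on any delimiter, so "-" and "_" inside a group id or topic no longer matter.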


Re: How to parse some of JMX Bean's names.

2014-06-02 Thread Otis Gospodnetic
Hi Guozhang,

On Mon, Jun 2, 2014 at 7:18 PM, Guozhang Wang wangg...@gmail.com wrote:

 That is indeed a problem. For now, we recommend using "_" in group and topic
 names wherever "-" would otherwise be needed, but this should be fixed
 systematically.


Right!

 For your use case, could you change your topic/group names to use "_"?


Our own Kafka doesn't use topics with "-" characters, so we don't have a
problem.

The problem, in our case, is that we have a general (Kafka) monitoring tool
that other people use to monitor Kafka - see http://sematext.com/spm/ .  So
we can't really tell people "hey, our tool will work but only if you don't
have a dash in your topic names and hosts and ... because if you use dashes
we won't know how to parse your Kafka's MBean names" :)
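
A minimal sketch of why a general-purpose tool cannot recover the parts: given a name of the form <groupId>-<topic>-<metric> and assuming the metric name is known in advance, it lists every (groupId, topic) pair that fits. DashSplits is a hypothetical helper for illustration, not part of Kafka or SPM.

```scala
// Hypothetical helper for illustration only.
object DashSplits {
  /** All (groupId, topic) pairs consistent with "<groupId>-<topic>-<metric>". */
  def candidates(beanName: String, metric: String): Seq[(String, String)] = {
    val suffix = "-" + metric
    if (!beanName.endsWith(suffix)) Seq.empty
    else {
      // Drop the known metric suffix, then cut the rest at each "-" in turn.
      val body = beanName.dropRight(suffix.length)
      body.indices.filter(i => body(i) == '-')
        .map(i => (body.substring(0, i), body.substring(i + 1)))
    }
  }
}
```

For "af_servers-spm_topic-BytesPerSec" there is exactly one candidate, but for "af-servers-spm-topic-BytesPerSec" there are three, and the name alone gives no way to choose between them.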


 Also, would you mind filing a JIRA ticket to keep track of this issue?


Here it is: https://issues.apache.org/jira/browse/KAFKA-1481

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/





