Re: install kafka and hadoop on the same cluster?

2016-02-29 Thread Banias H
I have had Kafka installed alongside Hadoop/Spark/etc. in a comparable
cluster, except that I didn't have Kafka brokers installed on the nodes
running ZooKeeper. As long as physical disk space on the rest of the nodes
is not a concern, you will be OK.

I don't have enough background with Cassandra, but I can speak to HBase: I
separated Kafka brokers and HBase region servers, as they are both disk-I/O
intensive.

-B



On Mon, Feb 29, 2016 at 9:45 PM, Sa Li  wrote:

> Hi, All
>
> I have a 9-node cluster where I already installed Cloudera Hadoop/Spark,
> and now I want to install Kafka on this cluster too. Is it a good idea to
> install Kafka on each of the 9 nodes? If so, are there any potential
> risks?
>
> I am also thinking of installing Cassandra on each of these nodes, so
> basically all components would sit on the same nodes. Would that be OK?
>
> thanks
>
> AL
>
>
>


install kafka and hadoop on the same cluster?

2016-02-29 Thread Sa Li
Hi, All

I have a 9-node cluster where I already installed Cloudera Hadoop/Spark, and
now I want to install Kafka on this cluster too. Is it a good idea to
install Kafka on each of the 9 nodes? If so, are there any potential risks?

I am also thinking of installing Cassandra on each of these nodes, so
basically all components would sit on the same nodes. Would that be OK?

thanks

AL




Need to understand the graph

2016-02-29 Thread awa...@pch.com
Hello,

I have an issue with Kafka timing: as the load test runs longer, we see
"request queue time" increase. What is "request queue time"?

Some basic config we are using:

2 Kafka nodes (CPU: 8 CPUs, 2 threads per core; memory: 32 GB RAM)
pkafkaapp01.pchoso.com
pkafkaapp02.pchoso.com

Test duration: 13:05 - 14:05
Messages published: 869156

Load average, CPU, and memory are all under control, so I'm not sure what
the issue is.

Below are some SPM graphs showing the state of my system.
Here's the 'Requests' graph:
  https://apps.sematext.com/spm-reports/s/lCOJULIKuJ

Re: Fetching meta from Kafka continuously.

2016-02-29 Thread Kim Chew
Problem solved. There was a DNS problem for one of the brokers.

Kim

On Mon, Feb 29, 2016 at 12:56 AM, Jens Rantil  wrote:

> Hi,
>
> Could it be that you need to rebalance your topics, perhaps?
>
> Cheers,
> Jens
>
> On Wed, Feb 24, 2016 at 7:25 PM, Kim Chew  wrote:
>
> > We have shut down some nodes from our cluster yesterday and now we are
> > seeing tons of these in the log,
> >
> >
> > 2016-02-24 18:11:23 INFO  SyncProducer:68 - Connected to
> > ip-172-30-198-64.us-west-2.compute.internal:9092 for producing
> > 2016-02-24 18:11:23 INFO  SyncProducer:68 - Disconnecting from
> > ip-172-30-198-64.us-west-2.compute.internal:9092
> > 2016-02-24 18:11:23 INFO  ConsumerFetcherManager:68 -
> > [ConsumerFetcherManager-1456295103918] Added fetcher for partitions
> > ArrayBuffer()
> > 2016-02-24 18:11:23 INFO  VerifiableProperties:68 - Verifying properties
> > 2016-02-24 18:11:23 INFO  VerifiableProperties:68 - Property client.id
> is
> > overridden to utils_backup
> > 2016-02-24 18:11:23 INFO  VerifiableProperties:68 - Property
> > metadata.broker.list is overridden to
> >
> >
> ip-172-30-198-19.us-west-2.compute.internal:9092,ip-172-30-198-64.us-west-2.compute.internal:9092,ip-172-30-200-37.us-west-2.compute.internal:9092
> > 2016-02-24 18:11:23 INFO  VerifiableProperties:68 - Property
> > request.timeout.ms is overridden to 3
> > 2016-02-24 18:11:23 INFO  ClientUtils$:68 - Fetching metadata from broker
> > id:8,host:ip-172-30-198-64.us-west-2.compute.internal,port:9092 with
> > correlation id 189686 for 2 topic(s) Set(container, profile-data-2)
> >
> > I am new to Kafka, so although I know it has something to do with the
> > brokers, I would like to know what happened and what the best way to
> > fix it is.
> >
> > TIA
> >
>
>
>
> --
> Jens Rantil
> Backend engineer
> Tink AB
>
> Email: jens.ran...@tink.se
> Phone: +46 708 84 18 32
> Web: www.tink.se
>
>


Connecting to secure Kafka

2016-02-29 Thread Oleg Zhurakousky
So, my Kafka is kerberized and all works well.
However, when trying to connect with missing SASL properties, Kafka doesn't
fail. It simply hangs indefinitely.
Here is the example code:

Properties props = new Properties();
props.put("bootstrap.servers", "ubuntu.oleg.com:9095");
props.put("group.id", "securing-kafka-group");
props.put("key.deserializer",
    "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer",
    "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.listTopics();

So, it hangs on listTopics(). I think one would expect it to fail. Am I wrong?
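For comparison, a consumer talking to a SASL/Kerberos-secured listener would
normally also carry security properties along these lines. This is a sketch
only: the exact values depend on the broker's listener configuration, and
the JAAS file path is a placeholder.

```properties
# Assumed listener security (adjust to the broker's actual listener config)
security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka
```

together with -Djava.security.auth.login.config=/path/to/client_jaas.conf
(placeholder path) on the client JVM.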

Thanks
Oleg



Re: Kafka node liveness check

2016-02-29 Thread Elias Abacioglu
Crap, forgot to remove my signature.. I guess my e-mail will now get
spammed forever :(





On Mon, Feb 29, 2016 at 3:14 PM, Elias Abacioglu <
elias.abacio...@deltaprojects.com> wrote:

> We've setup jmxtrans and use it to check these two values.
> UncleanLeaderElectionsPerSec
> UnderReplicatedPartitions
>
> Here is our shinken/nagios configuration:
>
> define command {
>   command_name check_kafka_underreplicated
>   command_line $USER1$/check_jmx -U service:jmx:rmi:///jndi/rmi://$HOSTADDRESS$:/jmxrmi -O "kafka.server":type="ReplicaManager",name="UnderReplicatedPartitions" -A Value -w $ARG1$ -c $ARG2$
> }
>
> define command {
>   command_name check_kafka_uncleanleader
>   command_line $USER1$/check_jmx -U service:jmx:rmi:///jndi/rmi://$HOSTADDRESS$:/jmxrmi -O "kafka.controller":type="ControllerStats",name="UncleanLeaderElectionsPerSec" -A Count -w $ARG1$ -c $ARG2$
> }
>
> define service {
>   hostgroup_name KafkaBroker
>   use generic-service
>   service_description Kafka Unclean Leader Elections per sec
>   check_command check_kafka_uncleanleader!1!10
>   check_interval 15
>   retry_interval 5
> }
> define service {
>   hostgroup_name KafkaBroker
>   use generic-service
>   service_description Kafka Under Replicated Partitions
>   check_command check_kafka_underreplicated!1!10
>   check_interval 15
>   retry_interval 5
> }
>
> On Mon, Feb 29, 2016 at 12:41 PM, tao xiao  wrote:
>
>> Thanks Jens. What I want to achieve is to check that every broker within
>> a cluster functions properly. The way you suggest can identify the
>> liveness of the cluster, but it doesn't necessarily mean every broker in
>> the cluster is alive. To achieve that, I could either create a topic with
>> as many partitions as brokers and min.insync.replicas equal to the number
>> of brokers, or one topic per broker, and then send a ping message to each
>> broker. But this approach is definitely not scalable as we expand the
>> cluster, so I am looking for a better way.
>>
>> On Mon, 29 Feb 2016 at 16:54 Jens Rantil  wrote:
>>
>> > Hi,
>> >
>> > I assume you first want to ask yourself what liveness you would like to
>> > check for. I guess the most realistic check is to put a "ping" message
>> on
>> > the broker and make sure that you can consume it.
>> >
>> > Cheers,
>> > Jens
>> >
>> > On Fri, Feb 26, 2016 at 12:38 PM, tao xiao 
>> wrote:
>> >
>> > > Hi team,
>> > >
>> > > What is the best way to verify that a specific Kafka node functions
>> > > properly? Telnetting the port is one approach, but I don't think it
>> > > tells me whether the broker can still receive/send traffic. I am
>> > > thinking of asking the broker for metadata using
>> > > consumer.partitionsFor; if it can return partition info, the broker
>> > > is considered live. Is this a good approach?
>> > >
>> >
>> >
>> >
>> > --
>> > Jens Rantil
>> > Backend engineer
>> > Tink AB
>> >
>> > Email: jens.ran...@tink.se
>> > Phone: +46 708 84 18 32
>> > Web: www.tink.se
>> >
>> >
>>
>
>


Re: Kafka node liveness check

2016-02-29 Thread Elias Abacioglu
We've setup jmxtrans and use it to check these two values.
UncleanLeaderElectionsPerSec
UnderReplicatedPartitions

Here is our shinken/nagios configuration:

define command {
  command_name check_kafka_underreplicated
  command_line $USER1$/check_jmx -U service:jmx:rmi:///jndi/rmi://$HOSTADDRESS$:/jmxrmi -O "kafka.server":type="ReplicaManager",name="UnderReplicatedPartitions" -A Value -w $ARG1$ -c $ARG2$
}

define command {
  command_name check_kafka_uncleanleader
  command_line $USER1$/check_jmx -U service:jmx:rmi:///jndi/rmi://$HOSTADDRESS$:/jmxrmi -O "kafka.controller":type="ControllerStats",name="UncleanLeaderElectionsPerSec" -A Count -w $ARG1$ -c $ARG2$
}

define service {
  hostgroup_name KafkaBroker
  use generic-service
  service_description Kafka Unclean Leader Elections per sec
  check_command check_kafka_uncleanleader!1!10
  check_interval 15
  retry_interval 5
}
define service {
  hostgroup_name KafkaBroker
  use generic-service
  service_description Kafka Under Replicated Partitions
  check_command check_kafka_underreplicated!1!10
  check_interval 15
  retry_interval 5
}
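The same MBean attributes that check_jmx polls can also be read from plain
Java using only the JDK's javax.management classes. Here is a minimal
sketch; it runs against the local platform MBeanServer with a dummy MBean so
it is self-contained, whereas reading a live broker would instead go through
a JMXConnectorFactory connection to the service:jmx:rmi URL above, and the
demo ObjectName is a stand-in for a real one such as
"kafka.server":type=ReplicaManager,name=UnderReplicatedPartitions.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxAttributeRead {
    // Dummy standard MBean standing in for a broker metric.
    public interface DemoMetricMBean {
        int getValue();
    }

    public static class DemoMetric implements DemoMetricMBean {
        public int getValue() { return 0; }
    }

    // Reads a single numeric attribute from an MBean by object name.
    public static Number readAttribute(MBeanServer server, String objectName,
                                       String attribute) throws Exception {
        return (Number) server.getAttribute(new ObjectName(objectName), attribute);
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        server.registerMBean(new DemoMetric(),
                new ObjectName("demo:type=DemoMetric"));
        // A non-zero value here would be what trips the -w/-c thresholds above.
        System.out.println(readAttribute(server, "demo:type=DemoMetric", "Value"));
    }
}
```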







*DELTA PROJECTS*

*Elias Abacioglu*
Infrastructure Specialist at Delta Projects AB

*E-mail*: elias.abacio...@deltaprojects.com
*Office*: +46 8 667 76 90 *Mobile*: +46 70 222 59 25
*Office*: Banérgatan 10, SE-115 23 Stockholm, Sweden




On Mon, Feb 29, 2016 at 12:41 PM, tao xiao  wrote:

> Thanks Jens. What I want to achieve is to check that every broker within a
> cluster functions properly. The way you suggest can identify the liveness
> of the cluster, but it doesn't necessarily mean every broker in the
> cluster is alive. To achieve that, I could either create a topic with as
> many partitions as brokers and min.insync.replicas equal to the number of
> brokers, or one topic per broker, and then send a ping message to each
> broker. But this approach is definitely not scalable as we expand the
> cluster, so I am looking for a better way.
>
> On Mon, 29 Feb 2016 at 16:54 Jens Rantil  wrote:
>
> > Hi,
> >
> > I assume you first want to ask yourself what liveness you would like to
> > check for. I guess the most realistic check is to put a "ping" message on
> > the broker and make sure that you can consume it.
> >
> > Cheers,
> > Jens
> >
> > On Fri, Feb 26, 2016 at 12:38 PM, tao xiao  wrote:
> >
> > > Hi team,
> > >
> > > What is the best way to verify that a specific Kafka node functions
> > > properly? Telnetting the port is one approach, but I don't think it
> > > tells me whether the broker can still receive/send traffic. I am
> > > thinking of asking the broker for metadata using
> > > consumer.partitionsFor; if it can return partition info, the broker
> > > is considered live. Is this a good approach?
> > >
> >
> >
> >
> > --
> > Jens Rantil
> > Backend engineer
> > Tink AB
> >
> > Email: jens.ran...@tink.se
> > Phone: +46 708 84 18 32
> > Web: www.tink.se
> >
> >
>


Re: Kafka node liveness check

2016-02-29 Thread tao xiao
Thanks Jens. What I want to achieve is to check that every broker within a
cluster functions properly. The way you suggest can identify the liveness
of the cluster, but it doesn't necessarily mean every broker in the cluster
is alive. To achieve that, I could either create a topic with as many
partitions as brokers and min.insync.replicas equal to the number of
brokers, or one topic per broker, and then send a ping message to each
broker. But this approach is definitely not scalable as we expand the
cluster, so I am looking for a better way.
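One way to make the per-broker check scale is to treat "fetch metadata from
broker X" as a pluggable probe and run it against every broker with a hard
timeout, so one wedged broker cannot stall the whole sweep. Below is a
stdlib-only sketch of that pattern; the probe itself (for example, a
consumer.partitionsFor call against a consumer bootstrapped to that single
broker) is left as a function argument and is an assumption here, not part
of any Kafka API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Predicate;

public class BrokerLivenessSweep {
    // Runs `probe` once per broker; any exception or timeout marks it dead.
    public static Map<String, Boolean> sweep(List<String> brokers,
                                             Predicate<String> probe,
                                             long timeoutMs) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, brokers.size()));
        Map<String, Boolean> live = new LinkedHashMap<>();
        try {
            for (String broker : brokers) {
                Future<Boolean> f = pool.submit(() -> probe.test(broker));
                boolean ok;
                try {
                    ok = f.get(timeoutMs, TimeUnit.MILLISECONDS);
                } catch (ExecutionException | TimeoutException e) {
                    ok = false;
                }
                live.put(broker, ok);
            }
        } finally {
            pool.shutdownNow();
        }
        return live;
    }

    public static void main(String[] args) throws InterruptedException {
        // Dummy probe that pretends broker2 is wedged.
        Map<String, Boolean> result = sweep(
                List.of("broker1:9092", "broker2:9092"),
                b -> !b.startsWith("broker2"),
                1000);
        System.out.println(result); // {broker1:9092=true, broker2:9092=false}
    }
}
```

This keeps the check O(number of brokers) with a fixed worst-case runtime,
without needing one topic per broker.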

On Mon, 29 Feb 2016 at 16:54 Jens Rantil  wrote:

> Hi,
>
> I assume you first want to ask yourself what liveness you would like to
> check for. I guess the most realistic check is to put a "ping" message on
> the broker and make sure that you can consume it.
>
> Cheers,
> Jens
>
> On Fri, Feb 26, 2016 at 12:38 PM, tao xiao  wrote:
>
> > Hi team,
> >
> > What is the best way to verify that a specific Kafka node functions
> > properly? Telnetting the port is one approach, but I don't think it
> > tells me whether the broker can still receive/send traffic. I am
> > thinking of asking the broker for metadata using consumer.partitionsFor;
> > if it can return partition info, the broker is considered live. Is this
> > a good approach?
> >
>
>
>
> --
> Jens Rantil
> Backend engineer
> Tink AB
>
> Email: jens.ran...@tink.se
> Phone: +46 708 84 18 32
> Web: www.tink.se
>
>


Re: Unable to start cluster after crash (0.8.2.2)

2016-02-29 Thread Jens Rantil
Double post. Please keep discussion in the other thread.

Cheers,
Jens

On Wed, Feb 24, 2016 at 4:39 PM, Anthony Sparks 
wrote:

> Hello,
>
> Our Kafka cluster (3 servers, each with Zookeeper and Kafka installed and
> running) crashed; out of the 6 processes, only one Zookeeper instance
> remained alive. The logs do not indicate much; the only errors shown were:
>
> 2016-02-21T12:21:36.881+: 27445381.013: [GC (Allocation Failure)
> 27445381.013: [ParNew: 136472K->159K(153344K), 0.0047077 secs]
> 139578K->3265K(507264K), 0.0048552 secs] [Times: user=0.01 sys=0.00,
> real=0.01 secs]
>
> These entries were in both the Zookeeper and Kafka logs, and it appears
> they have been happening every day (with no impact on Kafka, except
> perhaps now?).
>
> The crash is concerning, but not as concerning as what we are encountering
> right now.  I am unable to get the cluster back up.  Two of the three nodes
> halt with this fatal error:
>
> [2016-02-23 21:18:47,251] FATAL [ReplicaFetcherThread-0-0], Halting because
> log truncation is not allowed for topic audit_data, Current leader 0's
> latest offset 52844816 is less than replica 1's latest offset 52844835
> (kafka.server.ReplicaFetcherThread)
>
> The other node that manages to stay alive is unable to fulfill writes
> because we have min.ack set to 2 on the producers (requiring at least two
> nodes to be available).  We could change this, but that doesn't fix our
> overall problem.
>
> In browsing the Kafka code, in ReplicaFetcherThread.scala there is this
> little nugget:
>
> // Prior to truncating the follower's log, ensure that doing so is not
> // disallowed by the configuration for unclean leader election. This
> // situation could only happen if the unclean election configuration for
> // a topic changes while a replica is down. Otherwise, we should never
> // encounter this situation since a non-ISR leader cannot be elected if
> // disallowed by the broker configuration.
> if (!LogConfig.fromProps(brokerConfig.toProps,
>       AdminUtils.fetchTopicConfig(replicaMgr.zkClient,
>         topicAndPartition.topic)).uncleanLeaderElectionEnable) {
>   // Log a fatal error and shutdown the broker to ensure that data loss
>   // does not unexpectedly occur.
>   fatal("Halting because log truncation is not allowed for topic %s,"
>     .format(topicAndPartition.topic) +
>     " Current leader %d's latest offset %d is less than replica %d's latest offset %d"
>     .format(sourceBroker.id, leaderEndOffset, brokerConfig.brokerId,
>       replica.logEndOffset.messageOffset))
>   Runtime.getRuntime.halt(1)
> }
>
> For each of our Kafka instances we have
> unclean.leader.election.enable=false, which hasn't changed at all since we
> deployed the cluster (verified by file modification timestamps). This
> would indicate the above comment's assertion is incorrect: we have
> encountered a non-ISR leader being elected even though it is configured
> not to allow one.
>
> Any ideas on how to work around this?
>
> Thank you,
>
> Tony Sparks
>
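One escape hatch sometimes used for this particular halt, at the cost of
accepting exactly the data loss the check guards against, is to temporarily
allow unclean leader election for the affected topic, let the halted
follower truncate and rejoin, and then revert the override. A hedged sketch
against the 0.8.x-era tooling (the ZooKeeper address is a placeholder;
verify the exact kafka-topics.sh options for your version before running):

```shell
# Allow unclean election for the affected topic only (placeholder ZK address)
bin/kafka-topics.sh --zookeeper zk1:2181 --alter --topic audit_data \
  --config unclean.leader.election.enable=true

# Restart the halted brokers, wait for the ISR to recover, then revert:
bin/kafka-topics.sh --zookeeper zk1:2181 --alter --topic audit_data \
  --delete-config unclean.leader.election.enable
```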



-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se



Re: Is it possible to configure Kafka Mirror to specify fixed ports to connect to Remote DC

2016-02-29 Thread Jens Rantil
Hi Munir,

Are you referring to outbound or inbound ports from/to the Mirror tool?

Cheers,
Jens

On Wed, Feb 24, 2016 at 6:01 PM, Munir Khan (munkhan) 
wrote:

> Hi,
> I am trying out Kafka Mirror for moving Kafka messages between DCs. In our
> case we have to make it work through firewalls and use specific TCP ports
> permitted by ACLs. What I have seen so far is that each Mirror instance
> opens a number of TCP connections to the remote DC, and the port numbers
> are not fixed. Is there a way to configure Kafka Mirror so that it always
> uses specific ports?
>
> Best Regards
> Munir Khan
>
>


-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se



Re: Fetching meta from Kafka continuously.

2016-02-29 Thread Jens Rantil
Hi,

Could it be that you need to rebalance your topics, perhaps?

Cheers,
Jens

On Wed, Feb 24, 2016 at 7:25 PM, Kim Chew  wrote:

> We have shut down some nodes from our cluster yesterday and now we are
> seeing tons of these in the log,
>
>
> 2016-02-24 18:11:23 INFO  SyncProducer:68 - Connected to
> ip-172-30-198-64.us-west-2.compute.internal:9092 for producing
> 2016-02-24 18:11:23 INFO  SyncProducer:68 - Disconnecting from
> ip-172-30-198-64.us-west-2.compute.internal:9092
> 2016-02-24 18:11:23 INFO  ConsumerFetcherManager:68 -
> [ConsumerFetcherManager-1456295103918] Added fetcher for partitions
> ArrayBuffer()
> 2016-02-24 18:11:23 INFO  VerifiableProperties:68 - Verifying properties
> 2016-02-24 18:11:23 INFO  VerifiableProperties:68 - Property client.id is
> overridden to utils_backup
> 2016-02-24 18:11:23 INFO  VerifiableProperties:68 - Property
> metadata.broker.list is overridden to
>
> ip-172-30-198-19.us-west-2.compute.internal:9092,ip-172-30-198-64.us-west-2.compute.internal:9092,ip-172-30-200-37.us-west-2.compute.internal:9092
> 2016-02-24 18:11:23 INFO  VerifiableProperties:68 - Property
> request.timeout.ms is overridden to 3
> 2016-02-24 18:11:23 INFO  ClientUtils$:68 - Fetching metadata from broker
> id:8,host:ip-172-30-198-64.us-west-2.compute.internal,port:9092 with
> correlation id 189686 for 2 topic(s) Set(container, profile-data-2)
>
> I am new to Kafka, so although I know it has something to do with the
> brokers, I would like to know what happened and what the best way to fix
> it is.
>
> TIA
>



-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se



Re: Kafka node liveness check

2016-02-29 Thread Jens Rantil
Hi,

I assume you first want to ask yourself what liveness you would like to
check for. I guess the most realistic check is to put a "ping" message on
the broker and make sure that you can consume it.

Cheers,
Jens

On Fri, Feb 26, 2016 at 12:38 PM, tao xiao  wrote:

> Hi team,
>
What is the best way to verify that a specific Kafka node functions
properly? Telnetting the port is one approach, but I don't think it tells
me whether the broker can still receive/send traffic. I am thinking of
asking the broker for metadata using consumer.partitionsFor; if it can
return partition info, the broker is considered live. Is this a good
approach?
>



-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se
