MQTT Support (Mosquitto + Mosquitto Kafka Bridge + Kafka)

2016-03-03 Thread Sankar R M Pillai
Hi All,

I'm working on an IoT stack with Kafka. I am using a Mosquitto + Mosquitto
Kafka Bridge + Kafka stack as I need protocol support for MQTT.

I need to scale this stack too. Please advise on this in terms of
scalability, and whether there is a better approach to handle it. Thanks

Cheers,

Sankar



Re: Kafka over Satellite links

2016-03-03 Thread Mathias Herberts
Sat links induce roughly 250 ms of latency; you have to tweak the TCP
window sizes so you can saturate the link as much as possible, but besides
that sat links are rather common.
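For long fat links it also helps to raise the socket buffers on both the
brokers and the clients so the TCP window can actually grow. A minimal
sketch, with sizes picked only for illustration:

    # broker (server.properties)
    socket.send.buffer.bytes=1048576
    socket.receive.buffer.bytes=1048576

    # 0.9 producer/consumer client properties
    send.buffer.bytes=1048576
    receive.buffer.bytes=1048576

The OS-level TCP window limits (e.g. net.ipv4.tcp_rmem / net.ipv4.tcp_wmem
on Linux) have to allow these sizes as well.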
On Mar 4, 2016 7:26 AM, "Jan"  wrote:

> Thanks for the input.
> The IoT application could have an HTTP ingestion receiver to funnel data
> into Kafka. However, I am thinking about Kafka's ability to ingest
> traffic directly from IoT devices.
> Does anyone have any recommendations for using Kafka Connect between the
> producer and consumer sides over satellite links? Does a satellite
> link really make any difference vs. a slow Internet link?
> Input/suggestions would be much appreciated.
> Thanks, Jan
>
>
> On Thursday, March 3, 2016 1:16 AM, Christian Csar 
> wrote:
>
>
>  I would not do that. I admit I may be a bit biased due to working for
> Buddy Platform (IoT backend stuff including telemetry collection), but
> you want to send the data via some protocol (HTTP? MQTT? COAP?) to the
> central hub and then have those servers put the data into Kafka. Now
> if you want to use Kafka there are the various HTTP front ends that
> will basically put the data into Kafka for you without the client
> needing to deal with the partition management part. But putting data
> into Kafka directly really seems like a bad idea even if it's a large
> number of messages per second per node, even if the security parts
> work out for you.
>
> Christian
>
> On Wed, Mar 2, 2016 at 9:52 PM, Jan  wrote:
> > Hi folks;
> > does anyone know of Kafka's ability to work over satellite links? We
> have an IoT telemetry application that uses satellite communication to send
> data from remote sites to a central hub.
> > Any help/input/links/gotchas would be much appreciated.
> > Regards, Jan
>
>
>


Re: Writing a Producer from Scratch

2016-03-03 Thread James Cheng
Stephen,

There is a mailing list for kafka client developers that you may find useful: 
https://groups.google.com/forum/#!forum/kafka-clients

The d...@kafka.apache.org mailing list might also 
be a good resource: http://kafka.apache.org/contact.html

Lastly, do you have any way to do HTTP calls on your platform? There exist some 
REST servers that you speak HTTP to and then they will produce to Kafka on your 
behalf. Here is one: http://docs.confluent.io/2.0.1/kafka-rest/docs/index.html
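For example, with that REST Proxy running on its default port, producing is
a single HTTP call (host, port, and topic here are placeholders; the value
is base64-encoded per the v1 binary embedded format):

    curl -X POST \
      -H "Content-Type: application/vnd.kafka.binary.v1+json" \
      --data '{"records":[{"value":"aGVsbG8="}]}' \
      http://localhost:8082/topics/test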

-James

On Mar 3, 2016, at 2:47 AM, Hopson, Stephen
<stephen.hop...@gb.unisys.com> wrote:

Hi,
Not sure if this is the right forum for this question, but if it is not, I'm
sure someone will direct me to the proper one.
Also, I am new to Kafka (but not new to computers).

I want to write a kafka producer client for a Unisys OS 2200 mainframe. I need 
to write it in C, and since I have no access to Windows / Unix / Linux 
libraries, I have to develop the interface at the lowest level.

So far, I have downloaded a Kafka server with the associated ZooKeeper (kafka
_2.10-0.8.2.2). Note I have downloaded the Windows version and have it running
on my laptop, successfully tested on the same laptop with the provided producer
and consumer clients.

I have developed code to open a TCP session to the Kafka server, which appears
to work, and I have attempted to send a metadata request, which does not appear
to work. When I say it does not appear to work, I mean that I send the message
and then I sit on a retrieve, which eventually times out (I do seem to get one
character in the receive buffer, 0235 octal). The message format I am using
is the one described in the excellent document by Jay Kreps / Gwen Shapira
at https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
However, it is not clear what level of Kafka these message formats are
applicable for.

Can anybody offer me any advice or suggestions as to how to progress?

PS is the CRC mandatory in the Producer messages?
Many thanks in advance.

Stephen Hopson | Infrastructure Architect | Enterprise Solutions
Unisys | +44 (0)1908 805010 | +44 (0)7557 303321 | 
stephen.hop...@gb.unisys.com









Fwd: Kafka Security

2016-03-03 Thread sudeep mishra
Hi,

I am exploring the security capabilities of Kafka 0.9.1 but am unable to
use them successfully.

I have set the below configuration in my server.properties:

allow.everyone.if.no.acl.found=false
super.users=User:root;User:kafka

I created an ACL using the below command:

./kafka-acls.sh --authorizer-properties zookeeper.connect=<zookeeper-host:port>
--add --allow-principal User:imit --allow-host <host> --topic imit
--producer --consumer --group imit-consumer-group

and I see the below response for it:

Current ACLs for resource `Topic:imit`:
User:imit has Allow permission for operations: Describe from hosts: <host>
User:imit has Allow permission for operations: Read from hosts: <host>
User:imit has Allow permission for operations: Write from hosts: <host>

Note: values shown in <> are dummy placeholders for this question; the real
values were used while creating the ACL.

I have the following observations:

a) Though I defined the rule to allow access to the imit topic for a
particular user from a given host, I can still write to the topic from any
host using any user account.

b) I am unable to read messages from the topic from any host or any user
account (even the one for which I have defined the rules).

I am running Kafka on RHEL 6.7 and all the users are local.

I would appreciate guidance on whether I am missing any configuration
parameters or commands to manage authorization, or whether Kafka is
behaving in a weird way.

Also, where can I find authorization-related logs in Kafka?
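For reference, the ACL authorizer logs its allow/deny decisions through the
log4j logger named kafka.authorizer.logger, which the stock broker
log4j.properties routes to kafka-authorizer.log. A sketch of the relevant
entries, assuming the standard SimpleAclAuthorizer is enabled in
server.properties (if authorizer.class.name is not set at all, no ACLs are
enforced, which could explain observation (a)):

    # server.properties
    authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer

    # log4j.properties -- raise the level to DEBUG to log allowed operations too
    log4j.appender.authorizerAppender=org.apache.log4j.DailyRollingFileAppender
    log4j.appender.authorizerAppender.File=${kafka.logs.dir}/kafka-authorizer.log
    log4j.appender.authorizerAppender.layout=org.apache.log4j.PatternLayout
    log4j.appender.authorizerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
    log4j.logger.kafka.authorizer.logger=DEBUG, authorizerAppender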


Thanks & Regards,

Sudeep


Re: Kafka over Satellite links

2016-03-03 Thread Jan
Thanks for the input.
The IoT application could have an HTTP ingestion receiver to funnel data into
Kafka. However, I am thinking about Kafka's ability to ingest traffic directly
from IoT devices.
Does anyone have any recommendations for using Kafka Connect between the
producer and consumer sides over satellite links? Does the satellite link
really make any difference vs. a slow Internet link?
Input/suggestions would be much appreciated.
Thanks, Jan
 

On Thursday, March 3, 2016 1:16 AM, Christian Csar  
wrote:
 

 I would not do that. I admit I may be a bit biased due to working for
Buddy Platform (IoT backend stuff including telemetry collection), but
you want to send the data via some protocol (HTTP? MQTT? COAP?) to the
central hub and then have those servers put the data into Kafka. Now
if you want to use Kafka there are the various HTTP front ends that
will basically put the data into Kafka for you without the client
needing to deal with the partition management part. But putting data
into Kafka directly really seems like a bad idea even if it's a large
number of messages per second per node, even if the security parts
work out for you.

Christian

On Wed, Mar 2, 2016 at 9:52 PM, Jan  wrote:
> Hi folks;
> does anyone know of Kafka's ability to work over satellite links? We have an
> IoT telemetry application that uses satellite communication to send data from
> remote sites to a central hub.
> Any help/input/links/gotchas would be much appreciated.
> Regards, Jan


  

Re: KafkaConsumer poll() problems

2016-03-03 Thread Jason Gustafson
Hi Robin,

Sorry for the late reply. I'm a little puzzled with your consumer code.
Once the "end" flag is set to true, you won't ever hit the poll() call
again, or am I missing something? Do you even need that inner loop?
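Something like a single loop may be all that is needed; a minimal sketch
(the generic types and the processRecords helper are just placeholders):

    consumer.subscribe(Collections.singletonList(topicName));
    try {
        while (!closed.get()) {
            final ConsumerRecords<Long, ITupleApi> records = consumer.poll(1000);
            if (records.count() == 0) {
                System.err.println("consumer.poll received nothing on " + topicName);
            } else {
                processRecords(records); // placeholder for the per-record work
            }
        }
    } catch (final WakeupException e) {
        if (!closed.get()) {
            throw e;
        }
    } finally {
        consumer.close();
    }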

-Jason

On Tue, Mar 1, 2016 at 6:06 AM, Péricé Robin  wrote:

> Producer : https://gist.github.com/r0perice/9ce2bece76dd4113a44a
> Consumer : https://gist.github.com/r0perice/8dcee160017ccd779d59
> Console : https://gist.github.com/r0perice/5a8e2b2939651b1ac893
>
> 2016-03-01 14:50 GMT+01:00 craig w :
>
> > Can you try posting your code into a Gist (gist.github.com) or Pastebin,
> > so
> > it's formatted and easier to read?
> >
> > On Tue, Mar 1, 2016 at 8:49 AM, Péricé Robin 
> > wrote:
> >
> > > Hello everybody,
> > >
> > > I'm having trouble using the KafkaConsumer 0.9.0.0 API. My Consumer class
> > > doesn't consume messages properly.
> > >
> > > -
> > > | Consumer.java |
> > > -
> > >
> > > public final void run() {
> > >     try {
> > >         consumer.subscribe(Collections.singletonList(topicName));
> > >         boolean end = false;
> > >         while (!closed.get()) {
> > >             while (!end) {
> > >                 final ConsumerRecords<Long, ITupleApi> records = consumer.poll(1000);
> > >                 if (records == null || records.count() == 0) {
> > >                     System.err.println("In ConsumerThread consumer.poll received nothing on " + topicName);
> > >                 } else {
> > >                     /* some processing on records here */
> > >                     end = true;
> > >                 }
> > >             }
> > >         }
> > >     } catch (final WakeupException e) {
> > >         if (!closed.get()) {
> > >             throw e;
> > >         }
> > >     } finally {
> > >         consumer.close();
> > >     }
> > > }
> > >
> > > 
> > > | Producer.java |
> > > 
> > >
> > > public final void run() {
> > >     while (true) {
> > >         producer.send(new ProducerRecord<Long, ITupleApi>(topicName, timems, tuple), callback);
> > >     }
> > > }
> > >
> > > --
> > >
> > > I run this exactly the same way as in Github example :
> > >
> > >
> >
> https://github.com/apache/kafka/blob/trunk/examples/src/main/java/kafka/examples/KafkaConsumerProducerDemo.java
> > >
> > > But I only get "In ConsumerThread consumer.poll received nothing".
> > > Poll() never sends me messages... But when I use the command line
> > > tools I can see my messages on the topic.
> > >
> > > When I run the basic example from GitHub everything works fine... So it
> > > seems like I'm missing something.
> > >
> > >
> > >
> > >
> > >
> > >
> > > CONSOLE
> > >
> > > [2016-03-01 14:42:14,280] INFO [GroupCoordinator 0]: Preparing to restabilize group KafkaChannelBasicTestConsumer with old generation 1 (kafka.coordinator.GroupCoordinator)
> > > [2016-03-01 14:42:14,281] INFO [GroupCoordinator 0]: Group KafkaChannelBasicTestConsumer generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> > > [2016-03-01 14:42:22,788] INFO [GroupCoordinator 0]: Preparing to restabilize group KafkaChannelBasicTestConsumer with old generation 0 (kafka.coordinator.GroupCoordinator)
> > > [2016-03-01 14:42:22,788] INFO [GroupCoordinator 0]: Stabilized group KafkaChannelBasicTestConsumer generation 1 (kafka.coordinator.GroupCoordinator)
> > > [2016-03-01 14:42:22,797] INFO [GroupCoordinator 0]: Assignment received from leader for group KafkaChannelBasicTestConsumer for generation 1 (kafka.coordinator.GroupCoordinator)
> > > [2016-03-01 14:42:55,808] INFO [GroupCoordinator 0]: Preparing to restabilize group KafkaChannelBasicTestConsumer with old generation 1 (kafka.coordinator.GroupCoordinator)
> > > [2016-03-01 14:42:55,809] INFO [GroupCoordinator 0]: Group KafkaChannelBasicTestConsumer generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> > >
> > >
> > > I really need help on this !
> > >
> > > Regards,
> > >
> > > Robin
> > >
> >
> >
> >
> > --
> >
> > https://github.com/mindscratch
> > https://www.google.com/+CraigWickesser
> > https://twitter.com/mind_scratch
> > https://twitter.com/craig_links
> >
>


Re: Consumer deadlock

2016-03-03 Thread Jason Gustafson
Can you post some logs from the consumer? That should tell us what it's
busy doing while hanging. You may have to enable DEBUG level.
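For example, a minimal log4j.properties sketch for the client side (the
appender details are just one option):

    log4j.rootLogger=INFO, stdout
    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
    # show group coordination and fetch activity from the 0.9 consumer
    log4j.logger.org.apache.kafka.clients.consumer=DEBUG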

-Jason

On Thu, Mar 3, 2016 at 5:02 PM, Muthukumaran K 
wrote:

> Hi Jason,
>
> I am using 0.9 broker.
>
> One more observation. I had written producer code with 0.9 - Even with
> Producer code, I had hanging issue where send method was hanging requesting
> metadata. Thread-dump below
>
> "main" prio=6 tid=0x02238000 nid=0x1390 in Object.wait()
> [0x025bf000]
>java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x0007aecacea0> (a
> org.apache.kafka.clients.Metadata)
> at
> org.apache.kafka.clients.Metadata.awaitUpdate(Metadata.java:121)
> - locked <0x0007aecacea0> (a
> org.apache.kafka.clients.Metadata)
> at
> org.apache.kafka.clients.producer.KafkaProducer.waitOnMetadata(KafkaProducer.java:483)
> at
> org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:412)
> at
> org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:339)
> at kafkaexperiments.ProducerDriver.main(ProducerDriver.java:41)
>
> Then I included metadata.fetch.timeout.ms=1 and then producer started
> working. But when I poll the same topic using kafka-console-consumer.sh,
> console-consumer also hangs.
>
>
>
> Regards
> Muthu
>
>
> -Original Message-
> From: Jason Gustafson [mailto:ja...@confluent.io]
> Sent: Friday, March 04, 2016 5:33 AM
> To: users@kafka.apache.org
> Subject: Re: Consumer deadlock
>
> Hi there,
>
> Just to clarify, is the broker still on 0.8? Unfortunately, the new
> consumer needs 0.9. That probably would explain the hanging.
>
> -Jason
>
> On Thu, Mar 3, 2016 at 2:28 AM, Muthukumaran K <
> muthukumara...@ericsson.com>
> wrote:
>
> > Ewen,
> >
> > By new Consumer API, you mean KafkaConsumer ? I have an issue with a
> > poll in 0.9.0.1. poll hangs indefinitely even with the timeout
> >
> > Following is the consumer code which I am using. Any pointers would be
> > helpful
> >
> > public class ConsumerLoop implements Runnable {
> >
> >
> > private final KafkaConsumer consumer;
> > private final List topics;
> > private final int id;
> >
> > public ConsumerLoop(int id,
> >   String groupId,
> >   List topics) {
> > this.id = id;
> > this.topics = topics;
> > Properties props = new Properties();
> > props.put("bootstrap.servers", "192.168.56.101:9092");
> > props.put("group.id", groupId);
> > props.put("auto.offset.reset", "earliest");
> > props.put("key.deserializer",
> > "org.apache.kafka.common.serialization.StringDeserializer");
> > props.put("value.deserializer",
> > "org.apache.kafka.common.serialization.StringDeserializer");
> > props.put("metadata.fetch.timeout.ms", 1);
> >
> > this.consumer = new KafkaConsumer<>(props);
> >
> > }
> >
> > @Override
> > public void run() {
> > try {
> >
> > System.out.println("Starting consumer ID : " + id +
> > " Thread : " + Thread.currentThread().getName() +
> > " Topic : " + topics.toString() +
> > " ... ");
> > long startTime = System.currentTimeMillis();
> > int recordCount = 0;
> >
> >   consumer.subscribe(topics);
> >
> >   System.out.println("Consumer-ID " + id + " after
> > subscribe...");
> >
> >   while (true) {
> > ConsumerRecords records =
> > consumer.poll(1);
> >
> > System.out.println("Consumer-ID " + id + " after
> > poll...");
> >
> >
> > for (ConsumerRecord record : records) {
> >   Map data = new HashMap<>();
> >   data.put("partition", record.partition());
> >   data.put("offset", record.offset());
> >   data.put("value", record.value());
> >   System.out.println(
> >   "Consumer-ID : " + this.id +
> >   ": " + data +
> >   " Thread_name : " +
> > Thread.currentThread().getName());
> >   recordCount++;
> >
> > }
> > long endTime = System.currentTimeMillis();
> > long duration = (endTime - startTime)/1000;
> > System.out.println("## rate : " + recordCount/duration +
> "
> > msgs/sec on Consumer ID " + id);
> >
> >   }
> > } catch (WakeupException e) {
> >   // ignore for shutdown
> > } finally {
> >   consumer.close();
> > }
> > }
> >
> > public void shutdown() {
> >
> > consumer.wakeup();
> > }
> >
> > Regards
> > Muthu
> >
> >
> > -Original Message-
> > From: Ewen Cheslack-Postava [mailto:e...@confluent.io]
> > Sent: Thursd

Re: About bootstrap.servers

2016-03-03 Thread Jason Gustafson
Oh, one more thing. There has actually been some discussion about making
broker discovery pluggable so that you can integrate with frameworks like
etcd and consul. If we go down that route, it might make sense to provide a
plugin for Zookeeper, although I think you'd still want to keep the client
from accessing the paths used by the brokers directly. See here for one
JIRA: https://issues.apache.org/jira/browse/KAFKA-1793.
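As an aside, bootstrap.servers only needs a subset of the brokers for the
initial metadata request; the client then learns the full, current broker
list from the cluster itself. A sketch with placeholder hosts:

    bootstrap.servers=broker1.example.com:9092,broker2.example.com:9092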

-Jason

On Thu, Mar 3, 2016 at 4:51 PM, Jason Gustafson  wrote:

> Hi Tian,
>
> Removing the client dependency on Zookeeper has been one of the main goals
> of the Kafka team for a while now. It simplifies client development, since
> it's one less dependency and one less remote system they have to manage
> interaction with. It also makes a lot of sense with the security features
> released in 0.9. Fewer systems for the client to talk to makes its security
> story a whole lot easier to tell. And if the broker is the only thing
> talking to Zookeeper, then you can isolate it better.
>
> And I'm not sure it actually makes sense to try to decouple kafka clients
> from kafka servers ;-) I consider the Zookeeper state as part of the
> server's internals. Exposing it actually couples clients to those internals
> and makes the server harder to change.
>
> -Jason
>
> On Tue, Mar 1, 2016 at 6:51 PM, 田守枝  wrote:
>
>> Hi All:
>> I want to know why "bootstrap.servers" is used to establish the initial
>> connection to the Kafka cluster when I initialize a Producer or Consumer.
>> Why not let the producer or consumer connect to ZooKeeper to get the
>> brokers' IPs and ports? I think this would be one way to decouple the
>> client and brokers!
>>
>>
>>
>>   Tian
>> Shouzhi
>>
>>
>>
>> 2016/3/2
>>
>>
>>
>
>
>


RE: Consumer deadlock

2016-03-03 Thread Muthukumaran K
Hi Jason, 

I am using 0.9 broker. 

One more observation. I had written producer code with 0.9 - Even with Producer 
code, I had hanging issue where send method was hanging requesting metadata. 
Thread-dump below 

"main" prio=6 tid=0x02238000 nid=0x1390 in Object.wait() 
[0x025bf000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0007aecacea0> (a 
org.apache.kafka.clients.Metadata)
at org.apache.kafka.clients.Metadata.awaitUpdate(Metadata.java:121)
- locked <0x0007aecacea0> (a org.apache.kafka.clients.Metadata)
at 
org.apache.kafka.clients.producer.KafkaProducer.waitOnMetadata(KafkaProducer.java:483)
at 
org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:412)
at 
org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:339)
at kafkaexperiments.ProducerDriver.main(ProducerDriver.java:41)

Then I included metadata.fetch.timeout.ms=1 and then producer started working. 
But when I poll the same topic using kafka-console-consumer.sh, 
console-consumer also hangs. 



Regards
Muthu


-Original Message-
From: Jason Gustafson [mailto:ja...@confluent.io] 
Sent: Friday, March 04, 2016 5:33 AM
To: users@kafka.apache.org
Subject: Re: Consumer deadlock

Hi there,

Just to clarify, is the broker still on 0.8? Unfortunately, the new consumer 
needs 0.9. That probably would explain the hanging.

-Jason

On Thu, Mar 3, 2016 at 2:28 AM, Muthukumaran K 
wrote:

> Ewen,
>
> By new Consumer API, you mean KafkaConsumer ? I have an issue with a 
> poll in 0.9.0.1. poll hangs indefinitely even with the timeout
>
> Following is the consumer code which I am using. Any pointers would be 
> helpful
>
> public class ConsumerLoop implements Runnable {
>
>
> private final KafkaConsumer consumer;
> private final List topics;
> private final int id;
>
> public ConsumerLoop(int id,
>   String groupId,
>   List topics) {
> this.id = id;
> this.topics = topics;
> Properties props = new Properties();
> props.put("bootstrap.servers", "192.168.56.101:9092");
> props.put("group.id", groupId);
> props.put("auto.offset.reset", "earliest");
> props.put("key.deserializer",
> "org.apache.kafka.common.serialization.StringDeserializer");
> props.put("value.deserializer", 
> "org.apache.kafka.common.serialization.StringDeserializer");
> props.put("metadata.fetch.timeout.ms", 1);
>
> this.consumer = new KafkaConsumer<>(props);
>
> }
>
> @Override
> public void run() {
> try {
>
> System.out.println("Starting consumer ID : " + id +
> " Thread : " + Thread.currentThread().getName() +
> " Topic : " + topics.toString() +
> " ... ");
> long startTime = System.currentTimeMillis();
> int recordCount = 0;
>
>   consumer.subscribe(topics);
>
>   System.out.println("Consumer-ID " + id + " after 
> subscribe...");
>
>   while (true) {
> ConsumerRecords records = 
> consumer.poll(1);
>
> System.out.println("Consumer-ID " + id + " after 
> poll...");
>
>
> for (ConsumerRecord record : records) {
>   Map data = new HashMap<>();
>   data.put("partition", record.partition());
>   data.put("offset", record.offset());
>   data.put("value", record.value());
>   System.out.println(
>   "Consumer-ID : " + this.id +
>   ": " + data +
>   " Thread_name : " + 
> Thread.currentThread().getName());
>   recordCount++;
>
> }
> long endTime = System.currentTimeMillis();
> long duration = (endTime - startTime)/1000;
> System.out.println("## rate : " + recordCount/duration + "
> msgs/sec on Consumer ID " + id);
>
>   }
> } catch (WakeupException e) {
>   // ignore for shutdown
> } finally {
>   consumer.close();
> }
> }
>
> public void shutdown() {
>
> consumer.wakeup();
> }
>
> Regards
> Muthu
>
>
> -Original Message-
> From: Ewen Cheslack-Postava [mailto:e...@confluent.io]
> Sent: Thursday, March 03, 2016 2:21 PM
> To: users@kafka.apache.org
> Subject: Re: Consumer deadlock
>
> Take a look at the consumer.timeout.ms setting if you don't want the 
> iterator to block indefinitely.
>
> And a better long term solution is to switch to the new consumer, but 
> that obviously requires much more significant code changes. The new 
> consumer API is a single-threaded poll-based API where you can always 
> specify timeouts to the consumer's poll() method (though it currently 
> ha

Re: About bootstrap.servers

2016-03-03 Thread Jason Gustafson
Hi Tian,

Removing the client dependency on Zookeeper has been one of the main goals
of the Kafka team for a while now. It simplifies client development, since
it's one less dependency and one less remote system they have to manage
interaction with. It also makes a lot of sense with the security features
released in 0.9. Fewer systems for the client to talk to makes its security
story a whole lot easier to tell. And if the broker is the only thing
talking to Zookeeper, then you can isolate it better.

And I'm not sure it actually makes sense to try to decouple kafka clients
from kafka servers ;-) I consider the Zookeeper state as part of the
server's internals. Exposing it actually couples clients to those internals
and makes the server harder to change.

-Jason

On Tue, Mar 1, 2016 at 6:51 PM, 田守枝  wrote:

> Hi All:
> I want to know why "bootstrap.servers" is used to establish the initial
> connection to the Kafka cluster when I initialize a Producer or Consumer.
> Why not let the producer or consumer connect to ZooKeeper to get the
> brokers' IPs and ports? I think this would be one way to decouple the
> client and brokers!
>
>
>
>   Tian
> Shouzhi
>
>
>
> 2016/3/2
>
>
>


Re: Consumer deadlock

2016-03-03 Thread Jason Gustafson
Hi there,

Just to clarify, is the broker still on 0.8? Unfortunately, the new
consumer needs 0.9. That probably would explain the hanging.

-Jason

On Thu, Mar 3, 2016 at 2:28 AM, Muthukumaran K 
wrote:

> Ewen,
>
> By new Consumer API, you mean KafkaConsumer ? I have an issue with a poll
> in 0.9.0.1. poll hangs indefinitely even with the timeout
>
> Following is the consumer code which I am using. Any pointers would be
> helpful
>
> public class ConsumerLoop implements Runnable {
>
>
> private final KafkaConsumer consumer;
> private final List topics;
> private final int id;
>
> public ConsumerLoop(int id,
>   String groupId,
>   List topics) {
> this.id = id;
> this.topics = topics;
> Properties props = new Properties();
> props.put("bootstrap.servers", "192.168.56.101:9092");
> props.put("group.id", groupId);
> props.put("auto.offset.reset", "earliest");
> props.put("key.deserializer",
> "org.apache.kafka.common.serialization.StringDeserializer");
> props.put("value.deserializer",
> "org.apache.kafka.common.serialization.StringDeserializer");
> props.put("metadata.fetch.timeout.ms", 1);
>
> this.consumer = new KafkaConsumer<>(props);
>
> }
>
> @Override
> public void run() {
> try {
>
> System.out.println("Starting consumer ID : " + id +
> " Thread : " + Thread.currentThread().getName() +
> " Topic : " + topics.toString() +
> " ... ");
> long startTime = System.currentTimeMillis();
> int recordCount = 0;
>
>   consumer.subscribe(topics);
>
>   System.out.println("Consumer-ID " + id + " after subscribe...");
>
>   while (true) {
> ConsumerRecords records = consumer.poll(1);
>
> System.out.println("Consumer-ID " + id + " after poll...");
>
>
> for (ConsumerRecord record : records) {
>   Map data = new HashMap<>();
>   data.put("partition", record.partition());
>   data.put("offset", record.offset());
>   data.put("value", record.value());
>   System.out.println(
>   "Consumer-ID : " + this.id +
>   ": " + data +
>   " Thread_name : " +
> Thread.currentThread().getName());
>   recordCount++;
>
> }
> long endTime = System.currentTimeMillis();
> long duration = (endTime - startTime)/1000;
> System.out.println("## rate : " + recordCount/duration + "
> msgs/sec on Consumer ID " + id);
>
>   }
> } catch (WakeupException e) {
>   // ignore for shutdown
> } finally {
>   consumer.close();
> }
> }
>
> public void shutdown() {
>
> consumer.wakeup();
> }
>
> Regards
> Muthu
>
>
> -Original Message-
> From: Ewen Cheslack-Postava [mailto:e...@confluent.io]
> Sent: Thursday, March 03, 2016 2:21 PM
> To: users@kafka.apache.org
> Subject: Re: Consumer deadlock
>
> Take a look at the consumer.timeout.ms setting if you don't want the
> iterator to block indefinitely.
>
> And a better long term solution is to switch to the new consumer, but that
> obviously requires much more significant code changes. The new consumer API
> is a single-threaded poll-based API where you can always specify timeouts
> to the consumer's poll() method (though it currently has some limitations
> to how it enforces that timeout).
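> In the old high-level consumer that setting goes in the consumer
> properties; a minimal sketch:
>
>     # ConsumerIterator throws ConsumerTimeoutException after 5 seconds
>     # with no message, instead of blocking forever
>     consumer.timeout.ms=5000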
>
> -Ewen
>
> On Wed, Mar 2, 2016 at 11:24 AM, Oleg Zhurakousky <
> ozhurakou...@hortonworks.com> wrote:
>
> > Guys
> >
> > We have a consumer deadlock and here is the relevant dump:
> >
> > "Timer-Driven Process Thread-10" Id=76 TIMED_WAITING  on
> >
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@39775787
> > at sun.misc.Unsafe.park(Native Method)
> > at
> > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> > at
> >
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> > at
> >
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
> > at
> > kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:65)
> > at
> > kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:33)
> > at
> > kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
> > at
> kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
> > . . . . .
> >
> > What worries me is the fact that ‘hasNext’ is essentially a blocking
> > operation. I can’t seem to find a single reason when it would be
> > useful, hence I am calling it a bug, but hopefully someone can clarify.
> > Kafka version is 0.8.*
> >
> > Cheers
> > Oleg
> >
> >

Re: Kafka | Unable to publish data to broker - ClosedChannelException

2016-03-03 Thread Banias H
Try changing the port as below; on HDP, the Ambari-managed Kafka broker
listens on 6667 by default rather than 9092.

./kafka-console-producer.sh --broker-list sandbox.hortonworks.com:6667 --topic page_visits

-B

On Thu, Mar 3, 2016 at 12:45 PM, Shashi Vishwakarma <
shashi.vish...@gmail.com> wrote:

> Hi
>
> I am trying to run simple kafka producer consumer example on HDP but facing
> below exception.
>
> [2016-03-03 18:26:38,683] WARN Fetching topic metadata with
> correlation id 0 for topics [Set(page_visits)] from broker
> [BrokerEndPoint(0,sandbox.hortonworks.com,9092)] failed
> (kafka.client.ClientUtils$)
> java.nio.channels.ClosedChannelException
> at kafka.network.BlockingChannel.send(BlockingChannel.scala:120)
> at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:75)
> at
> kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:74)
> at kafka.producer.SyncProducer.send(SyncProducer.scala:115)
> at
> kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
> at
> kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
> at
> kafka.producer.async.DefaultEventHandler$$anonfun$handle$1.apply$mcV$sp(DefaultEventHandler.scala:68)
> at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:89)
> at kafka.utils.Logging$class.swallowError(Logging.scala:106)
> at kafka.utils.CoreUtils$.swallowError(CoreUtils.scala:51)
> at
> kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:68)
> at
> kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
> at
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
> at
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
> at scala.collection.immutable.Stream.foreach(Stream.scala:547)
> at
> kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
> at
> kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
> [2016-03-03 18:26:38,688] ERROR fetching topic metadata for topics
> [Set(page_visits)] from broker
> [ArrayBuffer(BrokerEndPoint(0,sandbox.hortonworks.com,9092))] failed
> (kafka.utils.CoreUtils$)
> kafka.common.KafkaException: fetching topic metadata for topics
> [Set(page_visits)] from broker
> [ArrayBuffer(BrokerEndPoint(0,sandbox.hortonworks.com,9092))] failed
> at
> kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:73)
> at
> kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
> at
> kafka.producer.async.DefaultEventHandler$$anonfun$handle$1.apply$mcV$sp(DefaultEventHandler.scala:68)
> at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:89)
> at kafka.utils.Logging$class.swallowError(Logging.scala:106)
> at kafka.utils.CoreUtils$.swallowError(CoreUtils.scala:51)
> at
> kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:68)
> at
> kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
> at
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
> at
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
> at scala.collection.immutable.Stream.foreach(Stream.scala:547)
> at
> kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
> at
> kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
> Caused by: java.nio.channels.ClosedChannelException
> at kafka.network.BlockingChannel.send(BlockingChannel.scala:120)
> at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:75)
> at
> kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:74)
> at kafka.producer.SyncProducer.send(SyncProducer.scala:115)
> at
> kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
> ... 12 more
> [2016-03-03 18:26:38,693] WARN Fetching topic metadata with
> correlation id 1 for topics [Set(page_visits)] from broker
> [BrokerEndPoint(0,sandbox.hortonworks.com,9092)] failed
> (kafka.client.ClientUtils$)
> java.nio.channels.ClosedChannelException
>
> Here is command that I am using for producer.
>
> ./kafka-console-producer.sh --broker-list sandbox.hortonworks.com:9092
>  --topic page_visits
>
> After doing a bit of googling, I found that I need to add the
> advertised.host.name property in the server.properties file. Here is my
> server.properties file.
>
> # Generated by Apache Ambari. Thu Mar  3 18:12:50 2016
> advertised.host.name=sandbox.hortonworks.com
> auto.create.topics.enable=true
> auto.leader.rebalance.enable=true
> broker.id=0
> compression.type=producer
> controlled.shutdown.enable=true
> controlled.shutdown.max.retries=3
> controlled.s

Larger Size Error Message

2016-03-03 Thread Fang Wong
Got the following error message with Kafka 0.8.2.1:

[2016-02-26 20:33:43,025] INFO Closing socket connection to /x due to
invalid request: Request of length 1937006964 is not valid, it is larger
than the maximum size of 104857600 bytes. (kafka.network.Processor)

We didn't send a large message at all; it looks like an encoding issue or a
partial request. Any suggestion on how to fix it?

The code is like below:

ByteArrayOutputStream bos = new ByteArrayOutputStream();

DataOutputStream dos = new DataOutputStream(bos);

dos.writeLong(System.currentTimeMillis());

OutputStreamWriter byteWriter = new OutputStreamWriter(bos,
com.force.commons.text.EncodingUtil.UTF_ENCODING);

gson.toJson(obj, byteWriter);

byteWriter.flush(); // without this flush the writer can buffer part of the
                    // JSON, so bos.toByteArray() returns a truncated payload

byte[] payload = bos.toByteArray();

ProducerRecord data = new ProducerRecord("Topic", 0, null, payload);

kafkaProducer.send(data);
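It may also be worth decoding the rejected length as bytes: 1937006964 is
0x73746174, the ASCII string "stat", which hints that something other than a
Kafka client (for example a Zookeeper four-letter-word probe, or some other
plaintext protocol) connected to the broker port and had its first four
bytes read as a request size. A quick check, sketched in Java:

    // 1937006964 == 0x73746174 == the ASCII bytes "stat"
    byte[] b = java.nio.ByteBuffer.allocate(4).putInt(1937006964).array();
    System.out.println(new String(b, java.nio.charset.StandardCharsets.US_ASCII)); // prints "stat"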


Re: [sdc-user] Re: Having trouble to connect StreamSets to Kafka with Kerberos authentication

2016-03-03 Thread Ismael Juma
Hi Harikiran,

One comment: `advertised.host.name` is not used if `advertised.listeners`
is set and similarly `host.name` is not used if `listeners` is set. In
general, the use of those properties is now discouraged in favour of
listeners. There is a PR to make the documentation clearer:

https://github.com/apache/kafka/pull/793
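For example, a listeners-based sketch for a SASL broker (the host name is a
placeholder); once these are set, host.name and advertised.host.name are
ignored:

    listeners=SASL_PLAINTEXT://0.0.0.0:9093
    advertised.listeners=SASL_PLAINTEXT://kafka.example.com:9093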

Ismael

On Thu, Mar 3, 2016 at 4:48 PM, Harikiran Nayak  wrote:

> Hi Michal,
>
> Can you please add the *advertised.listeners* and *advertised.host.name
> * properties in your kafka server config file
> 'server.properties_krb5'?
>
> For example, I have the following configuration in my working setup
>
> listeners=SASL_PLAINTEXT://:9092
> advertised.listeners=SASL_PLAINTEXT://:9092
> host.name=kafka
> advertised.host.name=kafka
>
> 'kafka' is the hostname on which the Kafka broker is running in my setup.
> There is an entry for this host in '/etc/hosts' on the node where
> StreamSets is running.
>
> Thanks
> Hari.
>
> On Thu, Mar 3, 2016 at 8:19 AM Harikiran Nayak 
> wrote:
>
> > Hi Michal,
> >
> > Are you able to write and read from the kerberized Kafka setup using the
> > Kafka Console Producer and Consumer?
> >
> > I am taking a look at your configuration files.
> >
> > Thanks
> > Hari.
> >
> > On Thu, Mar 3, 2016 at 8:09 AM Jonathan Natkins 
> > wrote:
> >
> >> Hey Michal,
> >>
> >> I'm cc'ing the StreamSets user list, which might be able to get you some
> >> better StreamSets-specific answers.
> >>
> >> Thanks!
> >> Natty
> >>
> >> On Thursday, March 3, 2016, Michał Kabocik 
> >> wrote:
> >>
> >>> Dears,
> >>>
> >>> I’m Middleware Engineer and I’m trying to configure secure Kafka
> Cluster
> >>> with SSL and Kerberos authentication with StreamSets, which will be
> used
> >>> for data injection to HDP.
> >>>
> >>> I have two Kafka Clusters; one with SSL enabled and there I
> successfully
> >>> connected StreamSets to Kafka with SSL authentication, and second one
> with
> >>> Kerberos authentication and here I’m facing with the problem:
> >>>
> >>> Both Kafka (with Zookeeper) and StreamSets are configured to
> >>> authenticate via Kerberos. When starting all of them, I see in the
> logs,
> >>> that they are successfully authenticated (TGT granted etc.)
> >>>
> >>> I have two listeners defined in Kafka:
> >>> listeners=PLAINTEXT://:9092,SASL_PLAINTEXT://:9093. When starting
> Kafka, I
> >>> see Kafka listens on both, 9092 and 9093.
> >>>
> >>> When I connect StreamSets to Kafka on port 9092, everything works
> >>> smooth. But when I try to connect to port 9093, error occurs:
> >>>
> >>> KAFKA_41 - Could not get partition count for topic 'streamsets5' :
> >>> com.streamsets.pipeline.api.StageException: KAFKA_41 - Could not get
> >>> partition count for topic 'streamsets5' :
> >>> org.apache.kafka.common.KafkaException: Failed to construct kafka
> consumer
> >>>
> >>> I see no errors in Kafka, in the log of StreamSets, there is only above
> >>> error visible. I attached major config files of Kafka, Zookeeper and
> >>> StreamSets.
> >>>
> >>> Will greatly appreciate your help in solving this case!
> >>>
> >>> Kind regards,
> >>>
> >>
> >>
> >> --
> >> Jonathan "Natty" Natkins
> >> StreamSets | Field Engineer
> >> mobile: 609.577.1600 | linkedin 
> >>
> >>
> >> --
> >> You received this message because you are subscribed to the Google
> Groups
> >> "sdc-user" group.
> >> To unsubscribe from this group and stop receiving emails from it, send
> an
> >> email to sdc-user+unsubscr...@streamsets.com.
> >> Visit this group at
> >> https://groups.google.com/a/streamsets.com/group/sdc-user/.
> >>
> >
>


Kafka | Unable to publish data to broker - ClosedChannelException

2016-03-03 Thread Shashi Vishwakarma
Hi

I am trying to run a simple Kafka producer/consumer example on HDP but am
facing the below exception.

[2016-03-03 18:26:38,683] WARN Fetching topic metadata with
correlation id 0 for topics [Set(page_visits)] from broker
[BrokerEndPoint(0,sandbox.hortonworks.com,9092)] failed
(kafka.client.ClientUtils$)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:120)
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:75)
at 
kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:74)
at kafka.producer.SyncProducer.send(SyncProducer.scala:115)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
at 
kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
at 
kafka.producer.async.DefaultEventHandler$$anonfun$handle$1.apply$mcV$sp(DefaultEventHandler.scala:68)
at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:89)
at kafka.utils.Logging$class.swallowError(Logging.scala:106)
at kafka.utils.CoreUtils$.swallowError(CoreUtils.scala:51)
at 
kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:68)
at 
kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
at 
kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
at 
kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
at scala.collection.immutable.Stream.foreach(Stream.scala:547)
at 
kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
at 
kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
[2016-03-03 18:26:38,688] ERROR fetching topic metadata for topics
[Set(page_visits)] from broker
[ArrayBuffer(BrokerEndPoint(0,sandbox.hortonworks.com,9092))] failed
(kafka.utils.CoreUtils$)
kafka.common.KafkaException: fetching topic metadata for topics
[Set(page_visits)] from broker
[ArrayBuffer(BrokerEndPoint(0,sandbox.hortonworks.com,9092))] failed
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:73)
at 
kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
at 
kafka.producer.async.DefaultEventHandler$$anonfun$handle$1.apply$mcV$sp(DefaultEventHandler.scala:68)
at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:89)
at kafka.utils.Logging$class.swallowError(Logging.scala:106)
at kafka.utils.CoreUtils$.swallowError(CoreUtils.scala:51)
at 
kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:68)
at 
kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
at 
kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
at 
kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
at scala.collection.immutable.Stream.foreach(Stream.scala:547)
at 
kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
at 
kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
Caused by: java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:120)
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:75)
at 
kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:74)
at kafka.producer.SyncProducer.send(SyncProducer.scala:115)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
... 12 more
[2016-03-03 18:26:38,693] WARN Fetching topic metadata with
correlation id 1 for topics [Set(page_visits)] from broker
[BrokerEndPoint(0,sandbox.hortonworks.com,9092)] failed
(kafka.client.ClientUtils$)
java.nio.channels.ClosedChannelException

Here is command that I am using for producer.

./kafka-console-producer.sh --broker-list sandbox.hortonworks.com:9092
 --topic page_visits

After doing a bit of googling, I found that I need to add the
advertised.host.name property in the server.properties file. Here is my
server.properties file.

# Generated by Apache Ambari. Thu Mar  3 18:12:50 2016
advertised.host.name=sandbox.hortonworks.com
auto.create.topics.enable=true
auto.leader.rebalance.enable=true
broker.id=0
compression.type=producer
controlled.shutdown.enable=true
controlled.shutdown.max.retries=3
controlled.shutdown.retry.backoff.ms=5000
controller.message.queue.size=10
controller.socket.timeout.ms=3
default.replication.factor=1
delete.topic.enable=false
fetch.purgatory.purge.interval.requests=1
host.name=sandbox.hortonworks.com
kafka.ganglia.metrics.group=kafka
kafka.ganglia.metrics.host=localhost
kafka.ganglia.metrics.port=8671
kafka.ganglia.metrics.reporter.enabled=true
kafka.metrics.reporters=org.apache.hadoop.metrics2.s

Re: Having trouble to connect StreamSets to Kafka with Kerberos authentication

2016-03-03 Thread Gwen Shapira
Hi Michal,

Can you successfully connect to the SASL port without StreamSets?
For example, using the console consumer as explained here:
http://www.confluent.io/blog/apache-kafka-security-authorization-authentication-encryption
(the end-to-end example is all the way at the end of the blog)

This can help us decide if the issue is with the client or server
configuration...
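For reference, a sketch of such a test with the 0.9 console consumer (host,
topic, and file name are placeholders):

    ./kafka-console-consumer.sh --new-consumer \
      --bootstrap-server kafka.example.com:9093 \
      --topic streamsets5 --from-beginning \
      --consumer.config client-sasl.properties

    # client-sasl.properties
    security.protocol=SASL_PLAINTEXT
    sasl.kerberos.service.name=kafka

A client JAAS file (passed with -Djava.security.auth.login.config) is also
needed for the Kerberos login.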

Gwen


On Thu, Mar 3, 2016 at 3:29 AM, Michał Kabocik  wrote:
> Dears,
>
> I’m Middleware Engineer and I’m trying to configure secure Kafka Cluster
> with SSL and Kerberos authentication with StreamSets, which will be used for
> data injection to HDP.
>
> I have two Kafka Clusters; one with SSL enabled and there I successfully
> connected StreamSets to Kafka with SSL authentication, and second one with
> Kerberos authentication and here I’m facing with the problem:
>
> Both Kafka (with Zookeeper) and StreamSets are configured to authenticate
> via Kerberos. When starting all of them, I see in the logs, that they are
> successfully authenticated (TGT granted etc.)
>
> I have two listeners defined in Kafka:
> listeners=PLAINTEXT://:9092,SASL_PLAINTEXT://:9093. When starting Kafka, I
> see Kafka listens on both, 9092 and 9093.
>
> When I connect StreamSets to Kafka on port 9092, everything works smooth.
> But when I try to connect to port 9093, error occurs:
>
> KAFKA_41 - Could not get partition count for topic 'streamsets5' :
> com.streamsets.pipeline.api.StageException: KAFKA_41 - Could not get
> partition count for topic 'streamsets5' :
> org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
>
> I see no errors in Kafka, in the log of StreamSets, there is only above
> error visible. I attached major config files of Kafka, Zookeeper and
> StreamSets.
>
> Will greatly appreciate your help in solving this case!
>
> Kind regards,


Re: [sdc-user] Re: Having trouble to connect StreamSets to Kafka with Kerberos authentication

2016-03-03 Thread Harikiran Nayak
Hi Michal,

Can you please add the *advertised.listeners* and *advertised.host.name
* properties in your kafka server config file
'server.properties_krb5'?

For example, I have the following configuration in my working setup

listeners=SASL_PLAINTEXT://:9092
advertised.listeners=SASL_PLAINTEXT://:9092
host.name=kafka
advertised.host.name=kafka

'kafka' is the hostname on which the Kafka broker is running in my setup.
There is an entry for this host in '/etc/hosts' on the node where
StreamSets is running.

Thanks
Hari.

On Thu, Mar 3, 2016 at 8:19 AM Harikiran Nayak  wrote:

> Hi Michal,
>
> Are you able to write and read from the kerberized Kafka setup using the
> Kafka Console Producer and Consumer?
>
> I am taking a look at your configuration files.
>
> Thanks
> Hari.
>
> On Thu, Mar 3, 2016 at 8:09 AM Jonathan Natkins 
> wrote:
>
>> Hey Michal,
>>
>> I'm cc'ing the StreamSets user list, which might be able to get you some
>> better StreamSets-specific answers.
>>
>> Thanks!
>> Natty
>>
>> On Thursday, March 3, 2016, Michał Kabocik 
>> wrote:
>>
>>> Dears,
>>>
>>> I’m Middleware Engineer and I’m trying to configure secure Kafka Cluster
>>> with SSL and Kerberos authentication with StreamSets, which will be used
>>> for data injection to HDP.
>>>
>>> I have two Kafka Clusters; one with SSL enabled and there I successfully
>>> connected StreamSets to Kafka with SSL authentication, and second one with
>>> Kerberos authentication and here I’m facing with the problem:
>>>
>>> Both Kafka (with Zookeeper) and StreamSets are configured to
>>> authenticate via Kerberos. When starting all of them, I see in the logs,
>>> that they are successfully authenticated (TGT granted etc.)
>>>
>>> I have two listeners defined in Kafka:
>>> listeners=PLAINTEXT://:9092,SASL_PLAINTEXT://:9093. When starting Kafka, I
>>> see Kafka listens on both, 9092 and 9093.
>>>
>>> When I connect StreamSets to Kafka on port 9092, everything works
>>> smooth. But when I try to connect to port 9093, error occurs:
>>>
>>> KAFKA_41 - Could not get partition count for topic 'streamsets5' :
>>> com.streamsets.pipeline.api.StageException: KAFKA_41 - Could not get
> >>> partition count for topic 'streamsets5' :
>>> org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
>>>
>>> I see no errors in Kafka, in the log of StreamSets, there is only above
>>> error visible. I attached major config files of Kafka, Zookeeper and
>>> StreamSets.
>>>
>>> Will greatly appreciate your help in solving this case!
>>>
>>> Kind regards,
>>>
>>
>>
>> --
>> Jonathan "Natty" Natkins
>> StreamSets | Field Engineer
>> mobile: 609.577.1600 | linkedin 
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "sdc-user" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to sdc-user+unsubscr...@streamsets.com.
>> Visit this group at
>> https://groups.google.com/a/streamsets.com/group/sdc-user/.
>>
>


Re: Fetch Response V1 incorrectly documented

2016-03-03 Thread Oleksiy Krivoshey
Just the correct documentation. To be sure it won't change with the next
minor release.

On 3 March 2016 at 17:53, Grant Henke  wrote:

> It looks like you are right, the throttle time does come first. I have a
> WIP implementation (https://github.com/apache/kafka/pull/970) that
> generates the protocol docs based on the protocol specification in the code
> and the output for fetch response v2 is:
>
> > Fetch Response (Version: 2) => throttle_time_ms [responses]
> >   responses => topic [partition_responses]
> > partition_responses => partition error_code high_watermark record_set
> >   partition => INT32
> >   error_code => INT16
> >   high_watermark => INT64
> >   record_set => BYTES
> > topic => STRING
> >   throttle_time_ms => INT32
> >
> > I don't think that was intentional though, because the similar produce
> response puts it on the end, as documented:
>
> > Produce Response (Version: 2) => [responses] throttle_time_ms
> >   responses => topic [partition_responses]
> > partition_responses => partition error_code base_offset timestamp
> >   partition => INT32
> >   error_code => INT16
> >   base_offset => INT64
> >   timestamp => INT64
> > topic => STRING
> >   throttle_time_ms => INT32
> >
> > However, even if it was a mistake, it won't change until at least the next
> protocol bump for fetch. Is it important to you that it be at the end
> functionally? Or just that the documentation is correct?
>
> Thanks,
> Grant
>
>
>
> On Thu, Mar 3, 2016 at 2:49 AM, Oleksiy Krivoshey 
> wrote:
>
> > It seems that Fetch Response V1 is not correctly documented:
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
> >
> > It says the response should be:
> >
> > FetchResponse => [TopicName [Partition ErrorCode HighwaterMarkOffset
> > MessageSetSize MessageSet]] ThrottleTime
> >
> > But it actually is (as of Kafka 0.9.0.1):
> >
> > FetchResponse => ThrottleTime [TopicName [Partition ErrorCode
> > HighwaterMarkOffset MessageSetSize MessageSet]]
> >
> > e.g. ThrottleTime comes first after the response header, not last.
> >
> > As a client library developer (https://github.com/oleksiyk/kafka) I
> would
> > like to know if it's an error in the documentation or in the Kafka server?
> >
> > Thanks!
> >
> > --
> > Oleksiy Krivoshey
> >
>
>
>
> --
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>


Mirrormaker only publishing to a single partition on destination cluster

2016-03-03 Thread Stephen Powis
Hey!

I'm using kafka 0.9.0.1 and trying to replicate a cluster from one
datacenter to another.  mirror-maker properly connects to my source cluster
and consumes messages, but for some reason is only publishing to a single
partition for my topic in the destination cluster.  So all of my partitions
for the topic are empty, except one, which contains everything from the
source cluster.

Has anyone seen this behavior before?  I must have something misconfigured,
but am unable to figure it out from reviewing the online docs.

Thanks!
Stephen


Re: [sdc-user] Re: Having trouble to connect StreamSets to Kafka with Kerberos authentication

2016-03-03 Thread Harikiran Nayak
Hi Michal,

Are you able to write and read from the kerberized Kafka setup using the
Kafka Console Producer and Consumer?

I am taking a look at your configuration files.

Thanks
Hari.

On Thu, Mar 3, 2016 at 8:09 AM Jonathan Natkins 
wrote:

> Hey Michal,
>
> I'm cc'ing the StreamSets user list, which might be able to get you some
> better StreamSets-specific answers.
>
> Thanks!
> Natty
>
> On Thursday, March 3, 2016, Michał Kabocik 
> wrote:
>
>> Dears,
>>
>> I’m Middleware Engineer and I’m trying to configure secure Kafka Cluster
>> with SSL and Kerberos authentication with StreamSets, which will be used
>> for data injection to HDP.
>>
>> I have two Kafka Clusters; one with SSL enabled and there I successfully
>> connected StreamSets to Kafka with SSL authentication, and second one with
>> Kerberos authentication and here I’m facing with the problem:
>>
>> Both Kafka (with Zookeeper) and StreamSets are configured to authenticate
>> via Kerberos. When starting all of them, I see in the logs, that they are
>> successfully authenticated (TGT granted etc.)
>>
>> I have two listeners defined in Kafka:
>> listeners=PLAINTEXT://:9092,SASL_PLAINTEXT://:9093. When starting Kafka, I
>> see Kafka listens on both, 9092 and 9093.
>>
>> When I connect StreamSets to Kafka on port 9092, everything works smooth.
>> But when I try to connect to port 9093, error occurs:
>>
>> KAFKA_41 - Could not get partition count for topic 'streamsets5' :
>> com.streamsets.pipeline.api.StageException: KAFKA_41 - Could not get
>> partition count for topic 'streamsets5' :
>> org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
>>
>> I see no errors in Kafka, in the log of StreamSets, there is only above
>> error visible. I attached major config files of Kafka, Zookeeper and
>> StreamSets.
>>
>> Will greatly appreciate your help in solving this case!
>>
>> Kind regards,
>>
>
>
> --
> Jonathan "Natty" Natkins
> StreamSets | Field Engineer
> mobile: 609.577.1600 | linkedin 
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "sdc-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sdc-user+unsubscr...@streamsets.com.
> Visit this group at
> https://groups.google.com/a/streamsets.com/group/sdc-user/.
>


Re: Writing a Producer from Scratch

2016-03-03 Thread Péricé Robin
Oops, didn't see your response.

Sorry

2016-03-03 17:23 GMT+01:00 Péricé Robin :

> Hi,
>
> Maybe you should look at this : https://github.com/edenhill/librdkafka
>
> Regards,
>
> Robin
>
> 2016-03-03 11:47 GMT+01:00 Hopson, Stephen :
>
>> Hi,
>>
>> Not sure if this is the right forum for this question, but if it not I’m
>> sure someone will direct me to the proper one.
>>
>> Also, I am new to Kafka (but not new to computers).
>>
>>
>>
>> I want to write a kafka producer client for a Unisys OS 2200 mainframe. I
>> need to write it in C, and since I have no access to Windows / Unix / Linux
>> libraries, I have to develop the interface at the lowest level.
>>
>>
>>
>> So far, I have downloaded a kafka server with associated zookeeper (kafka
>> _2.10-0.8.2.2). Note I have downloaded the Windows version and have it
>> running on my laptop, successfully tested on the same laptop with the
>> provided provider and consumer clients.
>>
>>
>>
>> I have developed code to open a TCP session to the kafka server which
>> appears to work and I have attempted to send a metadata request which does
>> not appear to work. When I say it does not appear to work, I mean that I
>> send the message and then I sit on a retrieve, which eventually times out (
>> I do seem to get one character in the receive buffer of 0235 octal). The
>> message format I am using is the one described by the excellent document by
>> Jay Kreps / Gwen Shapira at
>> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
>> However, it is not clear what level of kafka these message formats are
>> applicable for.
>>
>>
>>
>> Can anybody offer me any advice or suggestions as to how to progress?
>>
>>
>>
>> PS is the CRC mandatory in the Producer messages?
>>
>> Many thanks in advance.
>>
>>
>>
>> *Stephen Hopson* | Infrastructure Architect | Enterprise Solutions
>>
>> Unisys | +44 (0)1908 805010 | +44 (0)7557 303321 |
>> stephen.hop...@gb.unisys.com
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>


Re: Writing a Producer from Scratch

2016-03-03 Thread Péricé Robin
Hi,

Maybe you should look at this : https://github.com/edenhill/librdkafka

Regards,

Robin



RE: Writing a Producer from Scratch

2016-03-03 Thread Hopson, Stephen
Hi Ben,
 Yes, I have looked at that code. It relies heavily on either Windows or Linux/
Unix libraries, and even the C libraries are not compatible with the C compiler
on my system (my mainframe is 36-bit word based), so there are a number of
size-related issues that I am aware of and have to deal with.

Thanks for the suggestion

Steve

Stephen Hopson | Infrastructure Architect | Enterprise Solutions
Unisys | +44 (0)1908 805010 | +44 (0)7557 303321 | stephen.hop...@gb.unisys.com 



THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY 
MATERIAL and is for use only by the intended recipient. If you received this in 
error, please contact the sender and delete the e-mail and its attachments from 
all devices.




Re: Writing a Producer from Scratch

2016-03-03 Thread Ben Davison
Hi Stephen,

Have you checked out https://github.com/edenhill/librdkafka? It might be
what you need (I don't do C, so it might not be right for you).

Regards,

Ben


-- 


This email, including attachments, is private and confidential. If you have 
received this email in error please notify the sender and delete it from 
your system. Emails are not secure and may contain viruses. No liability 
can be accepted for viruses that might be transferred by this email or any 
attachment. Any unauthorised copying of this message or unauthorised 
distribution and publication of the information contained herein are 
prohibited.

7digital Limited. Registered office: 69 Wilson Street, London EC2A 2BB.
Registered in England and Wales. Registered No. 04843573.


Re: Having trouble to connect StreamSets to Kafka with Kerberos authentication

2016-03-03 Thread Jonathan Natkins
Hey Michal,

I'm cc'ing the StreamSets user list, which might be able to get you some
better StreamSets-specific answers.

Thanks!
Natty

On Thursday, March 3, 2016, Michał Kabocik  wrote:

> Dears,
>
> I'm a Middleware Engineer, and I'm trying to configure a secure Kafka cluster
> with SSL and Kerberos authentication with StreamSets, which will be used
> for data ingestion into HDP.
>
> I have two Kafka clusters: one with SSL enabled, where I successfully
> connected StreamSets to Kafka with SSL authentication, and a second one with
> Kerberos authentication, where I'm facing a problem:
>
> Both Kafka (with Zookeeper) and StreamSets are configured to authenticate
> via Kerberos. When starting all of them, I see in the logs that they are
> successfully authenticated (TGT granted, etc.).
>
> I have two listeners defined in Kafka:
> listeners=PLAINTEXT://:9092,SASL_PLAINTEXT://:9093. When starting Kafka, I
> see Kafka listens on both, 9092 and 9093.
>
> When I connect StreamSets to Kafka on port 9092, everything works smoothly.
> But when I try to connect to port 9093, an error occurs:
>
> KAFKA_41 - Could not get partition count for topic 'streamsets5' :
> com.streamsets.pipeline.api.StageException: KAFKA_41 - Could not get
> partition count for topic 'streamsets5' :
> org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
>
> I see no errors in Kafka; in the StreamSets log, only the above error is
> visible. I attached the major config files of Kafka, Zookeeper and
> StreamSets.
>
> Will greatly appreciate your help in solving this case!
>
> Kind regards,
>


-- 
Jonathan "Natty" Natkins
StreamSets | Field Engineer
mobile: 609.577.1600 | linkedin 
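
For reference, the client side of a Kerberos connection to the SASL_PLAINTEXT listener usually needs the settings sketched below, going by the Kafka 0.9 security docs. This is a minimal sketch only: the JAAS file path, keytab and principal are illustrative assumptions, not values from Michał's setup.

import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SaslConsumerSketch {
    public static void main(String[] args) {
        // The JVM must be pointed at a JAAS file before the client starts, e.g.
        //   -Djava.security.auth.login.config=/etc/kafka/kafka_client_jaas.conf
        // containing a section such as (keytab and principal are placeholders):
        //   KafkaClient {
        //     com.sun.security.auth.module.Krb5LoginModule required
        //     useKeyTab=true
        //     keyTab="/etc/security/keytabs/client.keytab"
        //     principal="client@EXAMPLE.COM";
        //   };
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9093");    // the SASL listener, not 9092
        props.put("security.protocol", "SASL_PLAINTEXT");  // must match the listener type
        props.put("sasl.kerberos.service.name", "kafka");  // primary of the broker principal
        props.put("group.id", "sasl-test");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Arrays.asList("streamsets5"));
            // the call that fails in the report above; with a working JAAS setup
            // it should return partition metadata instead of throwing
            System.out.println(consumer.partitionsFor("streamsets5"));
        }
    }
}

A missing java.security.auth.login.config system property is a common cause of the opaque "Failed to construct kafka consumer" error when security.protocol is SASL_PLAINTEXT, so that is worth ruling out first.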


Re: Fetch Response V1 incorrectly documented

2016-03-03 Thread Grant Henke
It looks like you are right: the throttle time does come first. I have a
WIP implementation (https://github.com/apache/kafka/pull/970) that
generates the protocol docs based on the protocol specification in the code,
and the output for fetch response v2 is:

> Fetch Response (Version: 2) => throttle_time_ms [responses]
>   responses => topic [partition_responses]
> partition_responses => partition error_code high_watermark record_set
>   partition => INT32
>   error_code => INT16
>   high_watermark => INT64
>   record_set => BYTES
> topic => STRING
>   throttle_time_ms => INT32
>
I don't think that was intentional, though, because the similar produce
response puts it at the end, as documented:

> Produce Response (Version: 2) => [responses] throttle_time_ms
>   responses => topic [partition_responses]
> partition_responses => partition error_code base_offset timestamp
>   partition => INT32
>   error_code => INT16
>   base_offset => INT64
>   timestamp => INT64
> topic => STRING
>   throttle_time_ms => INT32
>
However, even if it was a mistake, it won't change until at least the next
protocol bump for fetch. Is it important to you that it be at the end
functionally? Or just that the documentation is correct?

Thanks,
Grant






-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
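
To make the ordering concrete, below is a minimal decoding sketch that follows the generated spec above. It is illustrative only, not the official client's parser, and it assumes the int32 size prefix and the int32 correlation id have already been consumed from the wire.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class FetchResponseV1Sketch {
    static void decode(ByteBuffer buf) {
        int throttleTimeMs = buf.getInt();     // throttle_time_ms comes first, not last
        int topicCount = buf.getInt();         // arrays are an int32 count plus elements
        for (int t = 0; t < topicCount; t++) {
            short nameLen = buf.getShort();    // STRING is an int16 length plus UTF-8 bytes
            byte[] name = new byte[nameLen];
            buf.get(name);
            String topic = new String(name, StandardCharsets.UTF_8);
            int partitionCount = buf.getInt();
            for (int p = 0; p < partitionCount; p++) {
                int partition = buf.getInt();
                short errorCode = buf.getShort();
                long highWatermark = buf.getLong();
                int messageSetSize = buf.getInt();
                buf.position(buf.position() + messageSetSize); // skip the message set here
                System.out.printf("%s-%d error=%d hw=%d throttle=%dms%n",
                        topic, partition, errorCode, highWatermark, throttleTimeMs);
            }
        }
    }
}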


Writing a Producer from Scratch

2016-03-03 Thread Hopson, Stephen
Hi,
Not sure if this is the right forum for this question, but if it is not, I'm sure
someone will direct me to the proper one.
Also, I am new to Kafka (but not new to computers).

I want to write a kafka producer client for a Unisys OS 2200 mainframe. I need 
to write it in C, and since I have no access to Windows / Unix / Linux 
libraries, I have to develop the interface at the lowest level.

So far, I have downloaded a Kafka server with associated ZooKeeper (kafka
_2.10-0.8.2.2). Note I have downloaded the Windows version and have it running
on my laptop, successfully tested on the same laptop with the provided producer
and consumer clients.

I have developed code to open a TCP session to the Kafka server, which appears
to work, and I have attempted to send a metadata request, which does not appear
to work. When I say it does not appear to work, I mean that I send the message
and then I sit on a retrieve, which eventually times out (I do seem to get one
character in the receive buffer, 0235 octal). The message format I am using
is the one described in the excellent document by Jay Kreps / Gwen Shapira at
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
However, it is not clear which versions of Kafka these message formats apply to.

Can anybody offer me any advice or suggestions as to how to progress?

PS: is the CRC mandatory in the Producer messages?
Many thanks in advance.

Stephen Hopson | Infrastructure Architect | Enterprise Solutions
Unisys | +44 (0)1908 805010 | +44 (0)7557 303321 | 
stephen.hop...@gb.unisys.com


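
For anyone framing requests by hand like this, a minimal sketch of a TopicMetadataRequest (api_key 3, version 0) as the wiki guide describes it follows; the host, port, correlation id and client id are placeholders. A missing or miscounted 4-byte size prefix is a classic reason a broker appears to never answer. On the CRC question: per the same guide, each Message inside a produce request's MessageSet begins with a mandatory CRC32 of the rest of that message, which the broker uses to check integrity.

import java.io.DataOutputStream;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class MetadataRequestSketch {
    public static void main(String[] args) throws Exception {
        byte[] clientId = "test-client".getBytes(StandardCharsets.UTF_8);
        ByteBuffer body = ByteBuffer.allocate(14 + clientId.length);
        body.putShort((short) 3);              // api_key: TopicMetadataRequest
        body.putShort((short) 0);              // api_version
        body.putInt(1);                        // correlation_id, echoed in the response
        body.putShort((short) clientId.length);
        body.put(clientId);                    // client_id string
        body.putInt(0);                        // topic array of size 0 = all topics
        try (Socket socket = new Socket("localhost", 9092);
             DataOutputStream out = new DataOutputStream(socket.getOutputStream())) {
            out.writeInt(body.position());     // size prefix: count of the bytes that follow
            out.write(body.array(), 0, body.position());
            out.flush();
            // the response is framed the same way: int32 size, int32 correlation_id,
            // then the broker and topic metadata arrays
        }
    }
}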



Re: migrating the main-page docs to gitbook format

2016-03-03 Thread Marcos Luis Ortiz Valmaseda
I love that too
+1

2016-03-02 21:15 GMT-05:00 Christian Posta :

> For sure! Will take a look!
>
> On Wednesday, March 2, 2016, Gwen Shapira  wrote:
>
> > Hey!
> >
> > Yes! We'd love that too! Maybe you want to help us out with
> > https://issues.apache.org/jira/browse/KAFKA-2967 ?
> >
> > Gwen
> >
> > On Wed, Mar 2, 2016 at 2:39 PM, Christian Posta
> > > wrote:
> > > Would love to have the docs in gitbook/markdown format so they can
> easily
> > > be viewed from the source repo (or mirror, technically) on github.com.
> > They
> > > can also be easily converted to HTML, have a side-navigation ToC, and
> > still
> > > be versioned along with the src code.
> > >
> > > Thoughts?
> > >
> > > --
> > > *Christian Posta*
> > > twitter: @christianposta
> > > http://www.christianposta.com/blog
> > > http://fabric8.io
> >
>
>
> --
> *Christian Posta*
> twitter: @christianposta
> http://www.christianposta.com/blog
> http://fabric8.io
>


RE: Consumer deadlock

2016-03-03 Thread Muthukumaran K
Ewen, 

By the new Consumer API, do you mean KafkaConsumer? I have an issue with poll in
0.9.0.1: poll hangs indefinitely even with the timeout.

Following is the consumer code I am using. Any pointers would be helpful.

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;

public class ConsumerLoop implements Runnable {

    // typed as <String, String> to match the String deserializers configured below
    private final KafkaConsumer<String, String> consumer;
    private final List<String> topics;
    private final int id;

    public ConsumerLoop(int id, String groupId, List<String> topics) {
        this.id = id;
        this.topics = topics;
        Properties props = new Properties();
        props.put("bootstrap.servers", "192.168.56.101:9092");
        props.put("group.id", groupId);
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // note: metadata.fetch.timeout.ms is a producer-side setting;
        // the 0.9 consumer does not recognize it
        props.put("metadata.fetch.timeout.ms", 1);
        this.consumer = new KafkaConsumer<>(props);
    }

    @Override
    public void run() {
        try {
            System.out.println("Starting consumer ID : " + id +
                    " Thread : " + Thread.currentThread().getName() +
                    " Topic : " + topics.toString() + " ... ");
            long startTime = System.currentTimeMillis();
            int recordCount = 0;

            consumer.subscribe(topics);
            System.out.println("Consumer-ID " + id + " after subscribe...");

            while (true) {
                // 1 ms timeout; in 0.9 poll() can still block far longer than this
                // while it waits for metadata or the group coordinator
                ConsumerRecords<String, String> records = consumer.poll(1);
                System.out.println("Consumer-ID " + id + " after poll...");

                for (ConsumerRecord<String, String> record : records) {
                    Map<String, Object> data = new HashMap<>();
                    data.put("partition", record.partition());
                    data.put("offset", record.offset());
                    data.put("value", record.value());
                    System.out.println("Consumer-ID : " + this.id + ": " + data +
                            " Thread_name : " + Thread.currentThread().getName());
                    recordCount++;
                }
                long duration = (System.currentTimeMillis() - startTime) / 1000;
                if (duration > 0) { // avoid dividing by zero in the first second
                    System.out.println("## rate : " + recordCount / duration +
                            " msgs/sec on Consumer ID " + id);
                }
            }
        } catch (WakeupException e) {
            // ignore for shutdown
        } finally {
            consumer.close();
        }
    }

    public void shutdown() {
        consumer.wakeup();
    }
}

Regards
Muthu


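
One caveat behind the behaviour Muthu describes: in 0.9, poll() can block well past its timeout while it fetches metadata or looks for the group coordinator (for example, when bootstrap.servers is unreachable). A hedged sketch of a common workaround, a watchdog thread that calls wakeup() (the one thread-safe KafkaConsumer method) if poll() stalls, follows; the names and timeouts are illustrative.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;

public class BoundedPollSketch {
    private final ScheduledExecutorService watchdog =
            Executors.newSingleThreadScheduledExecutor();

    ConsumerRecords<String, String> boundedPoll(KafkaConsumer<String, String> consumer,
                                                long timeoutMs) {
        // interrupt poll() if it overruns; wakeup() may be called from any thread
        ScheduledFuture<?> kick =
                watchdog.schedule(consumer::wakeup, timeoutMs, TimeUnit.MILLISECONDS);
        try {
            return consumer.poll(timeoutMs);
        } catch (WakeupException e) {
            return ConsumerRecords.empty();    // poll() overran the deadline
        } finally {
            kick.cancel(false);
            // note: a race remains; a wakeup() that fires after poll() returns but
            // before the cancel will make the *next* poll() throw, so production
            // code would need to track and swallow that spurious WakeupException
        }
    }
}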


Re: Consumer deadlock

2016-03-03 Thread Ewen Cheslack-Postava
Yes, mapping an iterator interface to a networked service can have some
drawbacks :) Providing a simple, familiar interface to Kafka is challenging
-- if you didn't block and hasNext() returned false, what would you do
after that? That return value doesn't indicate that the iterable for a
Kafka topic ended, just that there wasn't anything available *immediately*.
But once hasNext() returns false, it's expected that it will always return
false and that there won't be more elements from the Iterator. With a Kafka
topic, there isn't really an end to the collection, there just might not be
any data available immediately.

When it comes down to it, there's a mismatch between the Java Iterator
interface and what Kafka is trying to provide. That's one of the reasons
the consumer interface has been rethought and looks significantly different
in the new consumer:
http://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html

-Ewen




-- 
Thanks,
Ewen


Re: Consumer deadlock

2016-03-03 Thread Oleg Zhurakousky
Thank you Ewen, I'll give it a shot, but I am still puzzled as to the reasoning
behind implementing hasNext() in a manner that it can block. Isn't that counter
to the whole point of the method? You either have it or you don't. Blocking on
the off chance one may have it at some time in the future simply means it's not
there at the moment, so why not return false and let the user retry?

Sent from my iPhone



Re: Announcing rdkafka-dotnet - C# Apache Kafka client

2016-03-03 Thread Ewen Cheslack-Postava
Andreas,

Thanks, this looks great, and I think leveraging the existing
functionality available in librdkafka is a great choice. I've added this to
the wiki page for clients:
https://cwiki.apache.org/confluence/display/KAFKA/Clients

-Ewen

On Wed, Mar 2, 2016 at 12:07 PM, Andreas Heider  wrote:

> Hi,
>
> I was really missing a high-quality Kafka client for C#/F#, so I built
> one: https://github.com/ah-/rdkafka-dotnet
> It’s based on the fantastic librdkafka, so it supports pretty much
> everything you might want:
>
> - High performance (I'm getting ~1 million msgs/second producing/consuming
> on my laptop with a single process)
> - An API close to the new Java Consumer / Producer
> - Kafka 0.9 consumer groups and broker offset storage
> - Working failover
> - Auto-committing of offsets
> - Compression with snappy and gzip
> - Metadata API to query for topics and offsets
> - Custom message partitioners
> - No zookeeper dependency
>
> It runs on .NET Core on Linux, OS X and Windows, and classic .NET 4.5 on
> Windows. Mono should work, but I haven’t tested it outside dnx.
>
> Cheers,
> Andreas




-- 
Thanks,
Ewen


Re: Consumer deadlock

2016-03-03 Thread Ewen Cheslack-Postava
Take a look at the consumer.timeout.ms setting if you don't want the
iterator to block indefinitely.

And a better long term solution is to switch to the new consumer, but that
obviously requires much more significant code changes. The new consumer API
is a single-threaded poll-based API where you can always specify timeouts
to the consumer's poll() method (though it currently has some limitations
to how it enforces that timeout).

-Ewen

On Wed, Mar 2, 2016 at 11:24 AM, Oleg Zhurakousky <
ozhurakou...@hortonworks.com> wrote:

> Guys
>
> We have a consumer deadlock and here is the relevant dump:
>
> "Timer-Driven Process Thread-10" Id=76 TIMED_WAITING  on
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@39775787
> at sun.misc.Unsafe.park(Native Method)
> at
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> at
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
> at
> kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:65)
> at
> kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:33)
> at
> kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
> at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
> . . . . .
>
> What worries me is the fact that ‘hasNext’ is essentially a blocking
> operation. I can’t seem to find a single reason when it would be useful,
> hence I am calling it a bug, but hopefully someone can clarify.
> Kafka version is 0.8.*
>
> Cheers
> Oleg
>
>


-- 
Thanks,
Ewen
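
Concretely, the consumer.timeout.ms suggestion for the old high-level consumer looks roughly like the sketch below; the ZooKeeper address, group and topic are placeholders. With the property set, hasNext() throws ConsumerTimeoutException instead of parking forever in that LinkedBlockingQueue.poll.

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.ConsumerTimeoutException;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class TimeoutConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181");
        props.put("group.id", "timeout-demo");
        props.put("consumer.timeout.ms", "5000");   // default -1 means block forever
        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap("my-topic", 1));
        ConsumerIterator<byte[], byte[]> it = streams.get("my-topic").get(0).iterator();
        try {
            while (it.hasNext()) {                  // now bounded by consumer.timeout.ms
                System.out.println(new String(it.next().message()));
            }
        } catch (ConsumerTimeoutException e) {
            // nothing arrived within 5s; treat it as "no data right now" and decide
            // whether to shut down or start over, rather than parking indefinitely
        } finally {
            connector.shutdown();
        }
    }
}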


Fetch Response V1 incorrectly documented

2016-03-03 Thread Oleksiy Krivoshey
It seems that Fetch Response V1 is not correctly documented:

https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol

It says the response should be:

FetchResponse => [TopicName [Partition ErrorCode HighwaterMarkOffset
MessageSetSize MessageSet]] ThrottleTime

But it actually is (as of Kafka 0.9.0.1):

FetchResponse => ThrottleTime [TopicName [Partition ErrorCode
HighwaterMarkOffset MessageSetSize MessageSet]]

i.e. ThrottleTime comes first after the response header, not last.

As a client library developer (https://github.com/oleksiyk/kafka) I would
like to know if it's an error in the documentation or in the Kafka server?

Thanks!

-- 
Oleksiy Krivoshey


Re: session.timeout.ms limit - Kafka Consumer

2016-03-03 Thread Ewen Cheslack-Postava
In fact, KIP-41 has been implemented in trunk -- see
https://issues.apache.org/jira/browse/KAFKA-3007. Testing against a version
including that change would be greatly appreciated to ensure it fully
addresses the problems you're seeing.

-Ewen

On Wed, Mar 2, 2016 at 7:00 AM, Olson,Andrew  wrote:

> This topic is currently being discussed at
> https://issues.apache.org/jira/browse/KAFKA-2986 and
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-41%3A+KafkaConsumer+Max+Records
>
>
> On 3/2/16, 8:11 AM, "Vanessa Gligor"  wrote:
>
> >Hello,
> >
> >I am using the Kafka high-level consumer 0.9.0. I am not using auto commit
> >for the offsets, so after I consume the messages (poll from Kafka) I have
> >to commit the offsets manually.
> >
> >The issue I have is actually that the processing of the messages takes
> >longer than 30s (and I cannot call poll again before these messages are
> >processed), and when I try to commit the offsets an exception is thrown:
> >ERROR o.a.k.c.c.i.ConsumerCoordinator - Error ILLEGAL_GENERATION occurred
> >while committing offsets for group MetadataConsumerSpout.
> >(I have found this explanation on Stack Overflow: if you wait longer than
> >the session timeout, then the coordinator for the group will kick out the
> >consumer, because it will think it is dead, and it will rebalance the
> >group.)
> >
> >In order to get rid of this I have thought about a couple of solutions:
> >
> >1. The configuration session.timeout.ms has a maximum value, so if I try to
> >set it to 60 seconds, I also get an exception, because this value is not in
> >the valid interval.
> >
> >2. I have tried to find a way to get a paginated result when the
> >polling method is called - no success.
> >
> >3. I have tried to send a heartbeat from outside of the poll (because
> >this method sends the heartbeats) - no success.
> >
> >
> >Thank you.
>
> CONFIDENTIALITY NOTICE This message and any included attachments are from
> Cerner Corporation and are intended only for the addressee. The information
> contained in this message is confidential and may constitute inside or
> non-public information under international, federal, or state securities
> laws. Unauthorized forwarding, printing, copying, distribution, or use of
> such information is strictly prohibited and may be unlawful. If you are not
> the addressee, please promptly delete this message and notify the sender of
> the delivery error by e-mail or you may call Cerner's corporate offices in
> Kansas City, Missouri, U.S.A at (+1) (816)221-1024.
>



-- 
Thanks,
Ewen
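
Until a build with that change is available, a common 0.9-era workaround for this situation is to pause all assigned partitions and keep calling poll() purely so that heartbeats go out while the slow batch is processed, then resume. The sketch below assumes the 0.9 varargs pause()/resume() signatures; processRecord() stands in for the slow work.

import java.util.Set;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SlowProcessingSketch {
    static void handleSlowBatch(KafkaConsumer<String, String> consumer,
                                ConsumerRecords<String, String> records) {
        Set<TopicPartition> assignment = consumer.assignment();
        TopicPartition[] parts = assignment.toArray(new TopicPartition[0]);
        consumer.pause(parts);                 // poll() now fetches nothing for us
        try {
            for (ConsumerRecord<String, String> record : records) {
                processRecord(record);         // may take well over session.timeout.ms
                consumer.poll(0);              // returns immediately, but heartbeats so
                                               // the coordinator does not evict us
            }
        } finally {
            consumer.resume(parts);            // start fetching again
        }
        consumer.commitSync();                 // commit succeeds: still the same generation
    }

    static void processRecord(ConsumerRecord<String, String> r) {
        // placeholder for the >30s processing described above
    }
}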