Re: kafka stream to new topic based on message key

2016-10-06 Thread Guozhang Wang
If you can create a ZK client inside your processor implementation, then you
can definitely create topics by talking to ZK directly; it's just
that the Kafka Streams public interface does not expose a more convenient
way to do this for now.
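
For illustration, a rough sketch of that ZK-based approach, assuming the internal
0.10.x ZkUtils / AdminUtils Scala classes called from Java (these are not part of
the public Streams API, and exact signatures can differ between broker versions):

import java.util.Properties;

import kafka.admin.AdminUtils;
import kafka.admin.RackAwareMode;
import kafka.utils.ZkUtils;

public class TopicCreator {

    // Create the topic if it does not exist yet; partition / replication values are illustrative.
    public static void ensureTopic(String zkConnect, String topic) {
        ZkUtils zkUtils = ZkUtils.apply(zkConnect, 30000, 30000, false);
        try {
            if (!AdminUtils.topicExists(zkUtils, topic)) {
                AdminUtils.createTopic(zkUtils, topic, 4, 2,
                        new Properties(), RackAwareMode.Enforced$.MODULE$);
            }
        } finally {
            zkUtils.close();
        }
    }
}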

Note that in KIP-4 we are trying to introduce an admin client for tasks
such as creating / deleting topics. The corresponding protocol requests have
been added in the upcoming 0.10.1.0 release, but the full implementation is
yet to be completed.


Guozhang


On Thu, Oct 6, 2016 at 12:48 PM, Gary Ogden  wrote:

> Thanks Guozhang. I've gotten an example to work using your tips.
>
> So, is there no other way in Streams to create a topic if the broker config
> "auto.create.topics.enable"
> is set to false?  Maybe by creating a record in ZooKeeper for that topic?
>
>
>
> On 5 October 2016 at 17:20, Guozhang Wang  wrote:
>
> > Hello Gary,
> >
> >
> > 1. The InternalTopicManager is only used by the Streams-instantiated
> > PartitionAssignor to create internal topics for auto-repartitioning and
> > changelog.
> >
> > 2. About "RecordCollector.Supplier": you are right, and as I wrote in the
> > above email you have to force casting it to RecordCollector.Supplier,
> > theoretically this is not safe but the internal Impl is always used.
> >
> >
> > If you know beforehand all the possible topics that you would want to
> > send to based on the key-value pair, you can then use KStream#branch() to
> > split the source stream into multiple streams based on the content, and send
> > each branched stream to a different topic.
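
A minimal sketch of that branch-and-forward idea in the 0.10.x DSL (topic names
and predicates are illustrative, not from the original mail):

import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class BranchExample {

    // Split one source stream into per-destination streams and write each to its own topic.
    static void buildTopology(KStreamBuilder builder) {
        KStream<String, String> source = builder.stream("source-topic");

        KStream<String, String>[] branches = source.branch(
                (key, value) -> key.startsWith("123456"),  // records for the first destination
                (key, value) -> key.startsWith("654321"),  // records for the second destination
                (key, value) -> true);                     // everything else

        branches[0].to("123456_lv2");
        branches[1].to("654321_lv2");
        branches[2].to("other-events");
    }
}

Note that all the destination topics still have to exist up front (or be
auto-created), which is exactly the limitation discussed above.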
> >
> >
> > Guozhang
> >
> >
> > On Wed, Oct 5, 2016 at 7:48 AM, Gary Ogden  wrote:
> >
> > > Guozhang. I was just looking at the source for this, and it looks like
> > the
> > > RecordCollector.Supplier is part of the internal ProcessorContextImpl
> > > class.  I don't think that's exposed to me, is it?
> > >
> > > If I create a processor class that extends AbstractProcess, it only has
> > > access to the ProcessorContext interface, which doesn't expose the
> > > Supplier.
> > >
> > > On 5 October 2016 at 09:42, Gary Ogden  wrote:
> > >
> > > > What if we were to use kafka connect instead of streams? Does it have
> > the
> > > > ability to specify partitions, rf, segment size etc?
> > > >
> > > > On 5 October 2016 at 09:42, Gary Ogden  wrote:
> > > >
> > > >> Thanks Guozhang.
> > > >>
> > > >> So there's no way we could also use InternalTopicManager to specify
> > the
> > > >> number of partitions and RF?
> > > >>
> > > >> https://github.com/apache/kafka/blob/0.10.1/streams/src/main
> > > >> /java/org/apache/kafka/streams/processor/internals/InternalT
> > > >> opicManager.java
> > > >>
> > > >> On 4 October 2016 at 19:34, Guozhang Wang 
> wrote:
> > > >>
> > > >>> Hello Gary,
> > > >>>
> > > >>> This is also doable in the Processor API, you can use the record
> > > >>> collector
> > > >>> from ProcessorContext to send data to arbitrary topics, i.e.:
> > > >>>
> > > >>> RecordCollector collector = ((RecordCollector.Supplier)
> > > >>> context).recordCollector();
> > > >>> collector.send(new ProducerRecord<>(topic, *...*), keySerializer,
> > > >>> valSerializer, partitioner);
> > > >>>
> > > >>>
> > > >>> But note that if the new topic, e.g. "123456_lv2", is not created, then
> > > >>> the send call will throw an exception unless the broker-side config
> > > >>> "auto.create.topics.enable" is set to true; and even in this case, the
> > > >>> topic will be auto-created with the pre-defined number of partitions,
> > > >>> i.e. you cannot control the configs the topic is created with, such as
> > > >>> compaction policy, num.partitions, segment sizes, etc.
> > > >>> If that works for you then I think it should be fine.
> > > >>>
> > > >>>
> > > >>> Guozhang
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Tue, Oct 4, 2016 at 12:51 PM, Gary Ogden 
> > wrote:
> > > >>>
> > > >>> > Is it possible, in a kafka streaming job, to write to another
> topic
> > > >>> based
> > > >>> > on the key in the messages?
> > > >>> >
> > > >>> > For example, say the message is:
> > > >>> >
> > > >>> > 123456#{"id":56789,"type":1}
> > > >>> >
> > > >>> > where the key is 123456, # is the delimiter, and the {} is the json
> > > >>> > data.
> > > >>> >
> > > >>> > And I want to push the json data to another topic that will have
> > the
> > > >>> name
> > > >>> > 123456_lv2.
> > > >>> >
> > > >>> > Is this possible with kafka streaming?
> > > >>> >
> > > >>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>> -- Guozhang
> > > >>>
> > > >>
> > > >>
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang


kafka-plugin in Ranger didn't succeed

2016-10-06 Thread Duan Xiong

Dear Sir:
 I'm sorry to bother you. I am a student and I have a great interest in
Apache Kafka. I found this email address at http://kafka.apache.org/contact, and
this seems to be a very professional community, so I am writing to ask for help.
From the CLI my Kafka works fine, so I am very confused. I am not using a
distribution like HDP; I just installed the Kafka plugin in Ranger. I really need
your help and hope to receive your reply. Thank you very much.

Question:

The Kafka plugin in Ranger didn't succeed (it does not show up in Ranger).

software version:

ranger-0.6.2
ranger-0.6.2-SNAPSHOT-kafka-plugin

I tried to install the Kafka plugin in Ranger. Kafka itself connects successfully, but
under Ranger's Plugins page I can't see Kafka. Did I miss any Kafka configuration
setting, or am I misunderstanding something else?

it_duanxi...@outlook.com


Re: Printing to stdout from KStreams?

2016-10-06 Thread Matthias J. Sax

If you restart your application, it will resume where it left off
(the same as any other Kafka consumer that uses group management and
commits offsets).

If you want to reprocess data from scratch, you need to reset your
application using bin/kafka-streams-application-reset.sh

See also
http://docs.confluent.io/3.0.1/streams/developer-guide.html#application-reset-tool

and
http://www.confluent.io/blog/data-reprocessing-with-kafka-streams-resetting-a-streams-application/


About the timestamp issue: it seems that your Go client does not
assign valid timestamps when writing the data. As you already said,
you need to provide a custom TimestampExtractor (or use the
WallclockTimestampExtractor if your semantics permit it) instead of the
default ConsumerRecordTimestampExtractor.
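
For illustration, a minimal extractor along those lines, assuming the 0.10.x
TimestampExtractor interface and a JSON value whose "timestamp" field holds epoch
milliseconds (the field name and the fallback behaviour are assumptions, not part
of the original mail):

import com.fasterxml.jackson.databind.ObjectMapper;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.processor.TimestampExtractor;

public class JsonPayloadTimestampExtractor implements TimestampExtractor {

    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    public long extract(ConsumerRecord<Object, Object> record) {
        try {
            // assumes the value is (or prints as) a JSON string with a "timestamp" field
            return mapper.readTree(record.value().toString()).get("timestamp").asLong();
        } catch (Exception e) {
            // fall back to wall-clock time if the payload carries no usable timestamp
            return System.currentTimeMillis();
        }
    }
}

It would then be registered via StreamsConfig.TIMESTAMP_EXTRACTOR_CLASS_CONFIG in
the streams configuration.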


- -Matthias

On 10/6/16 7:53 PM, Ali Akhtar wrote:
> Thanks.
> 
> I'm encountering a strange issue.
> 
> If I create messages thru console-producer.sh on a new topic,
> things work fine.
> 
> But on the topic that I need to consume, the messages are being
> produced via the go kafka plugin.
> 
> On this topic, at first, nothing happens when the stream starts
> (i.e it doesn't process the messages which are already in there)
> 
> Then, if I produce new messages, then my exception handler is
> called with the exception that timestamp is negative.
> 
> I'm pretty sure I'm on kafka 0.10. I downloaded it 2-3 weeks ago.
> 
> I'm going to write a new timestamp extractor, but any ideas why
> nothing happens with the old messages which are in the topic, it
> only responds if i push new messages to this topic?
> 
> On Fri, Oct 7, 2016 at 7:25 AM, Matthias J. Sax
>  wrote:
> 
> Sure.
> 
> Just use #print() or #writeAsText()
> 
> 
> -Matthias
> 
> On 10/6/16 6:25 PM, Ali Akhtar wrote:
>> What the subject says. For dev, it would be a lot easier if
>> debugging info could be printed to stdout instead of another
>> topic, where it will persist.
>>
>> Any ideas if this is possible?
>> 
> 


Re: Printing to stdout from KStreams?

2016-10-06 Thread Ali Akhtar
Thanks.

I'm encountering a strange issue.

If I create messages thru console-producer.sh on a new topic, things work
fine.

But on the topic that I need to consume, the messages are being produced
via the go kafka plugin.

On this topic, at first, nothing happens when the stream starts (i.e it
doesn't process the messages which are already in there)

Then, if I produce new messages, then my exception handler is called with
the exception that timestamp is negative.

I'm pretty sure I'm on kafka 0.10. I downloaded it 2-3 weeks ago.

I'm going to write a new timestamp extractor, but any ideas why nothing
happens with the old messages which are in the topic, it only responds if i
push new messages to this topic?

On Fri, Oct 7, 2016 at 7:25 AM, Matthias J. Sax 
wrote:

>
> Sure.
>
> Just use #print() or #writeAsText()
>
>
> - -Matthias
>
> On 10/6/16 6:25 PM, Ali Akhtar wrote:
> > What the subject says. For dev, it would be a lot easier if
> > debugging info could be printed to stdout instead of another topic,
> > where it will persist.
> >
> > Any ideas if this is possible?
> >
>


Re: Printing to stdout from KStreams?

2016-10-06 Thread Matthias J. Sax

Sure.

Just use #print() or #writeAsText()
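
For example, with the 0.10.x DSL (the topic name and file path are illustrative):

import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class DebugPrintExample {

    static void buildTopology(KStreamBuilder builder) {
        KStream<String, String> stream = builder.stream("some-topic");

        stream.print();                        // dump each record to stdout for debugging
        stream.writeAsText("/tmp/debug.txt");  // or write the records to a local text file
    }
}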


- -Matthias

On 10/6/16 6:25 PM, Ali Akhtar wrote:
> What the subject says. For dev, it would be a lot easier if
> debugging info could be printed to stdout instead of another topic,
> where it will persist.
> 
> Any ideas if this is possible?
> 


Printing to stdout from KStreams?

2016-10-06 Thread Ali Akhtar
What the subject says. For dev, it would be a lot easier if debugging info
could be printed to stdout instead of another topic, where it will persist.

Any ideas if this is possible?


Re: [VOTE] 0.10.1.0 RC0

2016-10-06 Thread Henry Cai
Why is this feature in the release note?


   - [KAFKA-264 ] - Change
   the consumer side load balancing and distributed co-ordination to use a
   consumer co-ordinator

I thought this was already done in 2015.

On Thu, Oct 6, 2016 at 4:55 PM, Vahid S Hashemian  wrote:

> Jason,
>
> Thanks a lot for managing this release.
>
> I ran the quick start (Steps 2-8) with this release candidate on Ubuntu,
> Windows, and Mac and they mostly look great.
> These are some, hopefully, minor items and gaps I noticed with respect to
> the existing quick start documentation (and the updated quick start that
> leverages the new consumer).
> They may very well be carryovers from previous releases, or perhaps
> specific to my local environments.
> Hopefully others can confirm.
>
>
> Windows
>
> Since there are separate scripts on Windows platform, it probably would
> help if that is clarified in the quick start section. E.g. "On Windows
> platform replace `bin/` with `bin\windows\`". Or even have a separate
> quick start for Windows since a number of commands will be different on
> Windows.
> There is no `connect-standalone.sh` equivalent for Windows under
> bin\windows folder (Step 7).
> Step 8 is also not tailored for Windows terminals. I skipped this step.
> When I try to consume messages using the new consumer (Step 5) I get an
> exception on the broker side. The old consumer works fine.
>
> java.io.IOException: Map failed
> at sun.nio.ch.FileChannelImpl.map(Unknown Source)
> at kafka.log.AbstractIndex.<init>(AbstractIndex.scala:61)
> at kafka.log.OffsetIndex.<init>(OffsetIndex.scala:51)
> at kafka.log.LogSegment.<init>(LogSegment.scala:67)
> at kafka.log.Log.loadSegments(Log.scala:255)
> at kafka.log.Log.<init>(Log.scala:108)
> at kafka.log.LogManager.createLog(LogManager.scala:362)
> at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:94)
> at
> kafka.cluster.Partition$$anonfun$4$$anonfun$apply$2.
> apply(Partition.scala:174)
> at
> kafka.cluster.Partition$$anonfun$4$$anonfun$apply$2.
> apply(Partition.scala:174)
> at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
> at kafka.cluster.Partition$$anonfun$4.apply(Partition.scala:174)
> at kafka.cluster.Partition$$anonfun$4.apply(Partition.scala:168)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
> at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:242)
> at kafka.cluster.Partition.makeLeader(Partition.scala:168)
> at
> kafka.server.ReplicaManager$$anonfun$makeLeaders$4.apply(
> ReplicaManager.scala:740)
> at
> kafka.server.ReplicaManager$$anonfun$makeLeaders$4.apply(
> ReplicaManager.scala:739)
> at
> scala.collection.mutable.HashMap$$anonfun$foreach$1.
> apply(HashMap.scala:98)
> at
> scala.collection.mutable.HashMap$$anonfun$foreach$1.
> apply(HashMap.scala:98)
> at
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
> at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
> at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
> at
> kafka.server.ReplicaManager.makeLeaders(ReplicaManager.scala:739)
> at
> kafka.server.ReplicaManager.becomeLeaderOrFollower(
> ReplicaManager.scala:685)
> at
> kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:148)
> at kafka.server.KafkaApis.handle(KafkaApis.scala:82)
> at
> kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
> at java.lang.Thread.run(Unknown Source)
> Caused by: java.lang.OutOfMemoryError: Map failed
> at sun.nio.ch.FileChannelImpl.map0(Native Method)
> ... 29 more
>
> This issue seems to break the broker and I have to clear out the logs so I
> can bring the broker back up again.
>
>
> Ubuntu / Mac
>
> At Step 8, the output I'm seeing after going through the instructions in
> sequence is this (with unique words)
>
> all     1
> lead    1
> to      1
> hello   1
> streams 2
> join    1
> kafka   3
> summit  1
>
> which is different from what I see in the documentation (with repeating words).
>
>
> --Vahid
>
>
>
>
> From:   Jason Gustafson 
> To: users@kafka.apache.org, d...@kafka.apache.org, kafka-clients
> 
> Date:   10/04/2016 04:13 PM
> Subject:Re: [VOTE] 0.10.1.0 RC0
>
>
>
> One clarification: this is a minor release, not a major one.
>
> -Jason
>
> On Tue, Oct 4, 2016 at 4:01 PM, Jason Gustafson 
> wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the first candidate for release of Apache Kafka 0.10.1.0. This
> is
> > a major release that includes great new features including throttled
> > replication, secure quotas, time-based log searching, and queryable
> state
> > for Kafka Streams. A full list of the content can be found here:
> > https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+0.10.1.

Re: [VOTE] 0.10.1.0 RC0

2016-10-06 Thread Vahid S Hashemian
Jason,

Thanks a lot for managing this release.

I ran the quick start (Steps 2-8) with this release candidate on Ubuntu, 
Windows, and Mac and they mostly look great.
These are some, hopefully, minor items and gaps I noticed with respect to 
the existing quick start documentation (and the updated quick start that 
leverages the new consumer).
They may very well be carryovers from previous releases, or perhaps 
specific to my local environments.
Hopefully others can confirm.


Windows

Since there are separate scripts on Windows platform, it probably would 
help if that is clarified in the quick start section. E.g. "On Windows 
platform replace `bin/` with `bin\windows\`". Or even have a separate 
quick start for Windows since a number of commands will be different on 
Windows.
There is no `connect-standalone.sh` equivalent for Windows under 
bin\windows folder (Step 7).
Step 8 is also not tailored for Windows terminals. I skipped this step.
When I try to consume messages using the new consumer (Step 5) I get an
exception on the broker side. The old consumer works fine.

java.io.IOException: Map failed
at sun.nio.ch.FileChannelImpl.map(Unknown Source)
at kafka.log.AbstractIndex.<init>(AbstractIndex.scala:61)
at kafka.log.OffsetIndex.<init>(OffsetIndex.scala:51)
at kafka.log.LogSegment.<init>(LogSegment.scala:67)
at kafka.log.Log.loadSegments(Log.scala:255)
at kafka.log.Log.<init>(Log.scala:108)
at kafka.log.LogManager.createLog(LogManager.scala:362)
at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:94)
at 
kafka.cluster.Partition$$anonfun$4$$anonfun$apply$2.apply(Partition.scala:174)
at 
kafka.cluster.Partition$$anonfun$4$$anonfun$apply$2.apply(Partition.scala:174)
at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
at kafka.cluster.Partition$$anonfun$4.apply(Partition.scala:174)
at kafka.cluster.Partition$$anonfun$4.apply(Partition.scala:168)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:242)
at kafka.cluster.Partition.makeLeader(Partition.scala:168)
at 
kafka.server.ReplicaManager$$anonfun$makeLeaders$4.apply(ReplicaManager.scala:740)
at 
kafka.server.ReplicaManager$$anonfun$makeLeaders$4.apply(ReplicaManager.scala:739)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at 
scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
at 
kafka.server.ReplicaManager.makeLeaders(ReplicaManager.scala:739)
at 
kafka.server.ReplicaManager.becomeLeaderOrFollower(ReplicaManager.scala:685)
at 
kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:148)
at kafka.server.KafkaApis.handle(KafkaApis.scala:82)
at 
kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.OutOfMemoryError: Map failed
at sun.nio.ch.FileChannelImpl.map0(Native Method)
... 29 more

This issue seems to break the broker and I have to clear out the logs so I 
can bring the broker back up again.


Ubuntu / Mac

At Step 8, the output I'm seeing after going through the instructions in 
sequence is this (with unique words)

all     1
lead    1
to      1
hello   1
streams 2
join    1
kafka   3
summit  1

which is different from what I see in the documentation (with repeating words).


--Vahid




From:   Jason Gustafson 
To: users@kafka.apache.org, d...@kafka.apache.org, kafka-clients 

Date:   10/04/2016 04:13 PM
Subject:Re: [VOTE] 0.10.1.0 RC0



One clarification: this is a minor release, not a major one.

-Jason

On Tue, Oct 4, 2016 at 4:01 PM, Jason Gustafson  
wrote:

> Hello Kafka users, developers and client-developers,
>
> This is the first candidate for release of Apache Kafka 0.10.1.0. This 
is
> a major release that includes great new features including throttled
> replication, secure quotas, time-based log searching, and queryable 
state
> for Kafka Streams. A full list of the content can be found here:
> https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+0.10.1.
> Since this is a major release, we will give people more time to try it out and
> give feedback.
>
> Release notes for the 0.10.1.0 release:
> http://home.apache.org/~jgus/kafka-0.10.1.0-rc0/RELEASE_NOTES.html
>
> *** Please download, test and vote by Monday, Oct 10, 9am PT
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> http://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> http://home.apache.org/~jgus/kafka-0.10.1.0-rc0/
>
> * Maven artifacts to be voted upon:
> h

Re: Handling out of order messages without KTables

2016-10-06 Thread Matthias J. Sax

Yes, that should work.

On 10/6/16 3:54 PM, Ali Akhtar wrote:
> Thanks! That looks perfect.
> 
> Last q.. is there any shortcut to having the json string messages
> automatically get deserialized into their equivalent Java class via
> Jackson, or such?
> 
> Perhaps I can write a Serde impl which takes the java.lang.Class
> of the class to be mapped, and maps it via Jackson?
> 
> On Fri, Oct 7, 2016 at 3:39 AM, Matthias J. Sax
>  wrote:
> 
> Exactly. You need to set the key using KStream#selectKey() and 
> re-distribute data via #through().
> 
> About timestamps: you can provide a custom TimestampExtractor that 
> returns the JSON embedded TS instead of record TS (as 
> DefaultTimestampExtractor does)
> 
> See
> http://docs.confluent.io/3.0.1/streams/developer-guide.html#timestamp-extractor
> 
> 
> -Matthias
> 
> On 10/6/16 2:59 PM, Ali Akhtar wrote:
 Sorry, to be clear:
 
 - Producers post to topic A - Consumers of topic A receive
 the data, parse it to find the keys, and post the correct key
 + message to Topic B - Topic B is treated as a KTable by 2nd
 consumer layer, and its this layer which does the writes to
 ensure 'last one wins' (Assuming 'last one' can be determined
 using the timestamp in the json of the message)
 
 On Fri, Oct 7, 2016 at 2:54 AM, Ali Akhtar
  wrote:
 
> Thanks for the reply.
> 
> Its not possible to provide keys, unfortunately. (Producer
> is written by a colleague, and said colleague just wants to
> provide whatever data the API gives, and leave all
> processing of the data to me).
> 
> Perhaps I can use an intermediate kafka topic, and have
> producers post to that topic w/ whatever data they receive.
> Then, another consumer can listen to that topic, and use it
> as a KTable to process data in the order of 'last one
> winning'.
> 
> However, the source of truth on the time of the messages,
> is embedded in the message itself, its not the Kafka
> internal timestamp.
> 
> The message is a json string, which contains a 'timestamp' 
> field, containing a string timestamp, and that string
> timestamp is the source of truth on when this message was
> generated.
> 
> So, is it possible to use a KTable which lets me parse the 
> message and return the time  which is contained inside the 
> message, and use THAT time for sorting the messages?
> 
> On Fri, Oct 7, 2016 at 2:33 AM, Matthias J. Sax 
>  wrote:
> 
 It is not global in this sense.
 
 Thus, you need to ensure that records updating the same
 product, go to the same instance. You can ensure this, by
 given all records of the same product the same key and
 "groupByKey" before processing the data.
 
 -Matthias
 
 On 10/6/16 10:55 AM, Ali Akhtar wrote:
 Thank you, State Store seems promising. But, is it 
 distributed, or limited to the particular instance of
 my application?
 
 I.e if there are 3 messages, setting product 1's
 price to $1, $3, and $5, and all 3 of them go to a
 different instance of my application, will they be
 able to correctly identify the latest received
 message using State Store?
 
 On Thu, Oct 6, 2016 at 10:48 PM, Matthias J. Sax 
  wrote:
 
 What do you mean by "message keys are random" -- do
 you effectively have no keys and want all messages to
 be processed as if they all have the same key?
 
 To access record TS in general, you need to use
 Processor API. The given ProcessorContext object
 given by Processor#init() always return the timestamp
 of the currently processed on #timestamp().
 
 Thus, you can attach a state store to your processor
 and compare the timestamps of the current record with
 the timestamp of the one in your store.
 
 -Matthias
 
 On 10/6/16 8:52 AM, Ali Akhtar wrote:
>>> Heya,
>>> 
>>> I have some Kafka producers, which are
>>> listening to webhook events, and for each
>>> webhook event, they post its payload to a Kafka
>>> topic.
>>> 
>>> Each payload contains a timestamp from the
>>> webhook source.
>>> 
>>> This timestamp is the source of truth about
>>> which events happened first, which happened
>>> last, etc.
>>> 
>>> I need to ensure that the last arrival of a 
>>> particular type of message wins.
>>> 
>>> E.g, if there are 5 messages, saying the price
>>> of a product with id 1, was set to $1, then $3,
>>> then something else, etc, before finally being
>>> set to $10, then I need to make sure that the
>>> final price fo

Re: Handling out of order messages without KTables

2016-10-06 Thread Ali Akhtar
Thanks! That looks perfect.

Last q.. is there any shortcut to having the json string messages
automatically get deserialized into their equivalent Java class via Jackson, or
such?

Perhaps I can write a Serde impl which takes the java.lang.Class of the
class to be mapped, and maps it via Jackson?
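
Something along those lines is sketched below: a generic Jackson-backed Serde
parameterized by the target class. The class name and the error handling are
illustrative, not an existing Kafka API:

import java.util.Map;

import com.fasterxml.jackson.databind.ObjectMapper;

import org.apache.kafka.common.serialization.Deserializer;
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serializer;

public class JsonPojoSerde<T> implements Serde<T> {

    private final ObjectMapper mapper = new ObjectMapper();
    private final Class<T> type;

    public JsonPojoSerde(Class<T> type) {
        this.type = type;
    }

    @Override
    public Serializer<T> serializer() {
        return new Serializer<T>() {
            @Override public void configure(Map<String, ?> configs, boolean isKey) { }
            @Override public byte[] serialize(String topic, T data) {
                try {
                    return data == null ? null : mapper.writeValueAsBytes(data);
                } catch (Exception e) {
                    throw new RuntimeException("JSON serialization failed", e);
                }
            }
            @Override public void close() { }
        };
    }

    @Override
    public Deserializer<T> deserializer() {
        return new Deserializer<T>() {
            @Override public void configure(Map<String, ?> configs, boolean isKey) { }
            @Override public T deserialize(String topic, byte[] data) {
                try {
                    return data == null ? null : mapper.readValue(data, type);
                } catch (Exception e) {
                    throw new RuntimeException("JSON deserialization failed", e);
                }
            }
            @Override public void close() { }
        };
    }

    @Override public void configure(Map<String, ?> configs, boolean isKey) { }
    @Override public void close() { }
}

It could then be passed wherever the DSL accepts a Serde, e.g.
builder.stream(Serdes.String(), new JsonPojoSerde<>(MyEvent.class), "some-topic"),
where MyEvent is whatever POJO the JSON maps to.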

On Fri, Oct 7, 2016 at 3:39 AM, Matthias J. Sax 
wrote:

>
> Exactly. You need to set the key using KStream#selectKey() and
> re-distribute data via #through().
>
> About timestamps: you can provide a custom TimestampExtractor that
> returns the JSON embedded TS instead of record TS (as
> DefaultTimestampExtractor does)
>
> See
> http://docs.confluent.io/3.0.1/streams/developer-guide.html#timestamp-extractor
>
>
> - -Matthias
>
> On 10/6/16 2:59 PM, Ali Akhtar wrote:
> > Sorry, to be clear:
> >
> > - Producers post to topic A - Consumers of topic A receive the
> > data, parse it to find the keys, and post the correct key + message
> > to Topic B - Topic B is treated as a KTable by 2nd consumer layer,
> > and its this layer which does the writes to ensure 'last one wins'
> > (Assuming 'last one' can be determined using the timestamp in the
> > json of the message)
> >
> > On Fri, Oct 7, 2016 at 2:54 AM, Ali Akhtar 
> > wrote:
> >
> >> Thanks for the reply.
> >>
> >> Its not possible to provide keys, unfortunately. (Producer is
> >> written by a colleague, and said colleague just wants to provide
> >> whatever data the API gives, and leave all processing of the data
> >> to me).
> >>
> >> Perhaps I can use an intermediate kafka topic, and have producers
> >> post to that topic w/ whatever data they receive. Then, another
> >> consumer can listen to that topic, and use it as a KTable to
> >> process data in the order of 'last one winning'.
> >>
> >> However, the source of truth on the time of the messages, is
> >> embedded in the message itself, its not the Kafka internal
> >> timestamp.
> >>
> >> The message is a json string, which contains a 'timestamp'
> >> field, containing a string timestamp, and that string timestamp
> >> is the source of truth on when this message was generated.
> >>
> >> So, is it possible to use a KTable which lets me parse the
> >> message and return the time  which is contained inside the
> >> message, and use THAT time for sorting the messages?
> >>
> >> On Fri, Oct 7, 2016 at 2:33 AM, Matthias J. Sax
> >>  wrote:
> >>
> > It is not global in this sense.
> >
> > Thus, you need to ensure that records updating the same product, go
> > to the same instance. You can ensure this, by given all records of
> > the same product the same key and "groupByKey" before processing
> > the data.
> >
> > -Matthias
> >
> > On 10/6/16 10:55 AM, Ali Akhtar wrote:
> > Thank you, State Store seems promising. But, is it
> > distributed, or limited to the particular instance of my
> > application?
> >
> > I.e if there are 3 messages, setting product 1's price to
> > $1, $3, and $5, and all 3 of them go to a different
> > instance of my application, will they be able to correctly
> > identify the latest received message using State Store?
> >
> > On Thu, Oct 6, 2016 at 10:48 PM, Matthias J. Sax
> >  wrote:
> >
> > What do you mean by "message keys are random" -- do you
> > effectively have no keys and want all messages to be
> > processed as if they all have the same key?
> >
> > To access record TS in general, you need to use Processor
> > API. The given ProcessorContext object given by
> > Processor#init() always return the timestamp of the
> > currently processed on #timestamp().
> >
> > Thus, you can attach a state store to your processor and
> > compare the timestamps of the current record with the
> > timestamp of the one in your store.
> >
> > -Matthias
> >
> > On 10/6/16 8:52 AM, Ali Akhtar wrote:
>  Heya,
> 
>  I have some Kafka producers, which are listening to
>  webhook events, and for each webhook event, they post
>  its payload to a Kafka topic.
> 
>  Each payload contains a timestamp from the webhook
>  source.
> 
>  This timestamp is the source of truth about which
>  events happened first, which happened last, etc.
> 
>  I need to ensure that the last arrival of a
>  particular type of message wins.
> 
>  E.g, if there are 5 messages, saying the price of a
>  product with id 1, was set to $1, then $3, then
>  something else, etc, before finally being set to $10,
>  then I need to make sure that the final price for
>  that product is $10.
> 
>  These messages can be out of order, and I need to
>  determine the latest arrival based on the timestamp
>  from the webhook source. (Its atm in a string format
>  which can be parsed)
> 
> 

Re: Handling out of order messages without KTables

2016-10-06 Thread Matthias J. Sax

Exactly. You need to set the key using KStream#selectKey() and
re-distribute data via #through().
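
A minimal sketch of that re-keying step in the 0.10.x DSL (topic names and the
key-extraction helper are illustrative):

import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class RekeyExample {

    static KStream<String, String> rekeyByProduct(KStreamBuilder builder) {
        KStream<String, String> source = builder.stream("raw-events");

        return source
                .selectKey((key, value) -> extractProductId(value)) // derive the key from the JSON payload
                .through("events-by-product");                      // pre-created topic; re-distributes by the new key
    }

    private static String extractProductId(String json) {
        // hypothetical helper: parse the JSON (e.g. with Jackson) and return its product id
        return json;
    }
}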

About timestamps: you can provide a custom TimestampExtractor that
returns the JSON embedded TS instead of record TS (as
DefaultTimestampExtractor does)

See
http://docs.confluent.io/3.0.1/streams/developer-guide.html#timestamp-extractor


- -Matthias

On 10/6/16 2:59 PM, Ali Akhtar wrote:
> Sorry, to be clear:
> 
> - Producers post to topic A - Consumers of topic A receive the
> data, parse it to find the keys, and post the correct key + message
> to Topic B - Topic B is treated as a KTable by 2nd consumer layer,
> and its this layer which does the writes to ensure 'last one wins'
> (Assuming 'last one' can be determined using the timestamp in the
> json of the message)
> 
> On Fri, Oct 7, 2016 at 2:54 AM, Ali Akhtar 
> wrote:
> 
>> Thanks for the reply.
>> 
>> Its not possible to provide keys, unfortunately. (Producer is
>> written by a colleague, and said colleague just wants to provide
>> whatever data the API gives, and leave all processing of the data
>> to me).
>> 
>> Perhaps I can use an intermediate kafka topic, and have producers
>> post to that topic w/ whatever data they receive. Then, another
>> consumer can listen to that topic, and use it as a KTable to
>> process data in the order of 'last one winning'.
>> 
>> However, the source of truth on the time of the messages, is
>> embedded in the message itself, its not the Kafka internal
>> timestamp.
>> 
>> The message is a json string, which contains a 'timestamp'
>> field, containing a string timestamp, and that string timestamp
>> is the source of truth on when this message was generated.
>> 
>> So, is it possible to use a KTable which lets me parse the
>> message and return the time  which is contained inside the
>> message, and use THAT time for sorting the messages?
>> 
>> On Fri, Oct 7, 2016 at 2:33 AM, Matthias J. Sax
>>  wrote:
>> 
> It is not global in this sense.
> 
> Thus, you need to ensure that records updating the same product, go
> to the same instance. You can ensure this, by given all records of
> the same product the same key and "groupByKey" before processing
> the data.
> 
> -Matthias
> 
> On 10/6/16 10:55 AM, Ali Akhtar wrote:
> Thank you, State Store seems promising. But, is it
> distributed, or limited to the particular instance of my
> application?
> 
> I.e if there are 3 messages, setting product 1's price to
> $1, $3, and $5, and all 3 of them go to a different
> instance of my application, will they be able to correctly
> identify the latest received message using State Store?
> 
> On Thu, Oct 6, 2016 at 10:48 PM, Matthias J. Sax 
>  wrote:
> 
> What do you mean by "message keys are random" -- do you 
> effectively have no keys and want all messages to be
> processed as if they all have the same key?
> 
> To access record TS in general, you need to use Processor
> API. The given ProcessorContext object given by
> Processor#init() always return the timestamp of the
> currently processed on #timestamp().
> 
> Thus, you can attach a state store to your processor and
> compare the timestamps of the current record with the
> timestamp of the one in your store.
> 
> -Matthias
> 
> On 10/6/16 8:52 AM, Ali Akhtar wrote:
 Heya,
 
 I have some Kafka producers, which are listening to
 webhook events, and for each webhook event, they post
 its payload to a Kafka topic.
 
 Each payload contains a timestamp from the webhook
 source.
 
 This timestamp is the source of truth about which
 events happened first, which happened last, etc.
 
 I need to ensure that the last arrival of a
 particular type of message wins.
 
 E.g, if there are 5 messages, saying the price of a
 product with id 1, was set to $1, then $3, then
 something else, etc, before finally being set to $10,
 then I need to make sure that the final price for
 that product is $10.
 
 These messages can be out of order, and I need to
 determine the latest arrival based on the timestamp
 from the webhook source. (Its atm in a string format
 which can be parsed)
 
 Since KTable looks like it uses message keys to
 determine what happens - and in this case, the
 message keys are random, and the timestamp contained
 in the value of the message is what determines the
 order of the events - any pointers on what the best
 way to do this is?
 
 I'm using kafka streaming, latest version, Java.
 
>> 
> 
>>> 
>> 
>> 
> 

Re: Handling out of order messages without KTables

2016-10-06 Thread Ali Akhtar
Sorry, to be clear:

- Producers post to topic A
- Consumers of topic A receive the data, parse it to find the keys, and
post the correct key + message to Topic B
- Topic B is treated as a KTable by 2nd consumer layer, and its this layer
which does the writes to ensure 'last one wins' (Assuming 'last one' can be
determined using the timestamp in the json of the message)

On Fri, Oct 7, 2016 at 2:54 AM, Ali Akhtar  wrote:

> Thanks for the reply.
>
> Its not possible to provide keys, unfortunately. (Producer is written by a
> colleague, and said colleague just wants to provide whatever data the API
> gives, and leave all processing of the data to me).
>
> Perhaps I can use an intermediate kafka topic, and have producers post to
> that topic w/ whatever data they receive. Then, another consumer can listen
> to that topic, and use it as a KTable to process data in the order of 'last
> one winning'.
>
> However, the source of truth on the time of the messages, is embedded in
> the message itself, its not the Kafka internal timestamp.
>
> The message is a json string, which contains a 'timestamp' field,
> containing a string timestamp, and that string timestamp is the source of
> truth on when this message was generated.
>
> So, is it possible to use a KTable which lets me parse the message and
> return the time  which is contained inside the message, and use THAT time
> for sorting the messages?
>
> On Fri, Oct 7, 2016 at 2:33 AM, Matthias J. Sax 
> wrote:
>
>>
>> It is not global in this sense.
>>
>> Thus, you need to ensure that records updating the same product, go to
>> the same instance. You can ensure this, by given all records of the
>> same product the same key and "groupByKey" before processing the data.
>>
>> - -Matthias
>>
>> On 10/6/16 10:55 AM, Ali Akhtar wrote:
>> > Thank you, State Store seems promising. But, is it distributed, or
>> > limited to the particular instance of my application?
>> >
>> > I.e if there are 3 messages, setting product 1's price to $1, $3,
>> > and $5, and all 3 of them go to a different instance of my
>> > application, will they be able to correctly identify the latest
>> > received message using State Store?
>> >
>> > On Thu, Oct 6, 2016 at 10:48 PM, Matthias J. Sax
>> >  wrote:
>> >
>> > What do you mean by "message keys are random" -- do you
>> > effectively have no keys and want all messages to be processed as
>> > if they all have the same key?
>> >
>> > To access record TS in general, you need to use Processor API. The
>> > given ProcessorContext object given by Processor#init() always
>> > return the timestamp of the currently processed on #timestamp().
>> >
>> > Thus, you can attach a state store to your processor and compare
>> > the timestamps of the current record with the timestamp of the one
>> > in your store.
>> >
>> > -Matthias
>> >
>> > On 10/6/16 8:52 AM, Ali Akhtar wrote:
>>  Heya,
>> 
>>  I have some Kafka producers, which are listening to webhook
>>  events, and for each webhook event, they post its payload to
>>  a Kafka topic.
>> 
>>  Each payload contains a timestamp from the webhook source.
>> 
>>  This timestamp is the source of truth about which events
>>  happened first, which happened last, etc.
>> 
>>  I need to ensure that the last arrival of a particular type
>>  of message wins.
>> 
>>  E.g, if there are 5 messages, saying the price of a product
>>  with id 1, was set to $1, then $3, then something else, etc,
>>  before finally being set to $10, then I need to make sure
>>  that the final price for that product is $10.
>> 
>>  These messages can be out of order, and I need to determine
>>  the latest arrival based on the timestamp from the webhook
>>  source. (Its atm in a string format which can be parsed)
>> 
>>  Since KTable looks like it uses message keys to determine
>>  what happens - and in this case, the message keys are random,
>>  and the timestamp contained in the value of the message is
>>  what determines the order of the events - any pointers on
>>  what the best way to do this is?
>> 
>>  I'm using kafka streaming, latest version, Java.
>> 
>> >>
>> >

Re: Handling out of order messages without KTables

2016-10-06 Thread Ali Akhtar
Thanks for the reply.

Its not possible to provide keys, unfortunately. (Producer is written by a
colleague, and said colleague just wants to provide whatever data the API
gives, and leave all processing of the data to me).

Perhaps I can use an intermediate kafka topic, and have producers post to
that topic w/ whatever data they receive. Then, another consumer can listen
to that topic, and use it as a KTable to process data in the order of 'last
one winning'.

However, the source of truth on the time of the messages, is embedded in
the message itself, its not the Kafka internal timestamp.

The message is a json string, which contains a 'timestamp' field,
containing a string timestamp, and that string timestamp is the source of
truth on when this message was generated.

So, is it possible to use a KTable which lets me parse the message and
return the time  which is contained inside the message, and use THAT time
for sorting the messages?

On Fri, Oct 7, 2016 at 2:33 AM, Matthias J. Sax 
wrote:

>
> It is not global in this sense.
>
> Thus, you need to ensure that records updating the same product, go to
> the same instance. You can ensure this, by given all records of the
> same product the same key and "groupByKey" before processing the data.
>
> - -Matthias
>
> On 10/6/16 10:55 AM, Ali Akhtar wrote:
> > Thank you, State Store seems promising. But, is it distributed, or
> > limited to the particular instance of my application?
> >
> > I.e if there are 3 messages, setting product 1's price to $1, $3,
> > and $5, and all 3 of them go to a different instance of my
> > application, will they be able to correctly identify the latest
> > received message using State Store?
> >
> > On Thu, Oct 6, 2016 at 10:48 PM, Matthias J. Sax
> >  wrote:
> >
> > What do you mean by "message keys are random" -- do you
> > effectively have no keys and want all messages to be processed as
> > if they all have the same key?
> >
> > To access record TS in general, you need to use Processor API. The
> > given ProcessorContext object given by Processor#init() always
> > return the timestamp of the currently processed on #timestamp().
> >
> > Thus, you can attach a state store to your processor and compare
> > the timestamps of the current record with the timestamp of the one
> > in your store.
> >
> > -Matthias
> >
> > On 10/6/16 8:52 AM, Ali Akhtar wrote:
>  Heya,
> 
>  I have some Kafka producers, which are listening to webhook
>  events, and for each webhook event, they post its payload to
>  a Kafka topic.
> 
>  Each payload contains a timestamp from the webhook source.
> 
>  This timestamp is the source of truth about which events
>  happened first, which happened last, etc.
> 
>  I need to ensure that the last arrival of a particular type
>  of message wins.
> 
>  E.g, if there are 5 messages, saying the price of a product
>  with id 1, was set to $1, then $3, then something else, etc,
>  before finally being set to $10, then I need to make sure
>  that the final price for that product is $10.
> 
>  These messages can be out of order, and I need to determine
>  the latest arrival based on the timestamp from the webhook
>  source. (Its atm in a string format which can be parsed)
> 
>  Since KTable looks like it uses message keys to determine
>  what happens - and in this case, the message keys are random,
>  and the timestamp contained in the value of the message is
>  what determines the order of the events - any pointers on
>  what the best way to do this is?
> 
>  I'm using kafka streaming, latest version, Java.
> 
> >>
> >
>


Re: Kafka 10 Consumer Reading from Kafka 8 Cluster?

2016-10-06 Thread Craig Swift
Ok great - thanks for the clarification. Exactly what I needed. :)

Craig

On Thu, Oct 6, 2016 at 2:09 PM, Scott Reynolds  wrote:

> you cannot use a k10 client with a k8 cluster. The protocol changed
>
> You CAN use a k8 client with a k10 cluster.
>
> On Thu, Oct 6, 2016 at 12:00 PM Craig Swift
>  wrote:
>
> > We're doing some fairly intensive data transformations in the current
> > workers so it's not as straight forward as just reading/producing to
> > another topic. However, if you mean can we mirror the source topic to the
> > kafka 10 cluster and then update the worker to read/write to 10 - that
> > could be an option. I'd still like to know if any of the k10 client
> > consumer (old or new consumer) code can work with a k8 cluster though.
> >
> > Craig
> >
> > On Thu, Oct 6, 2016 at 12:38 PM, David Garcia 
> > wrote:
> >
> > > Any reason you can’t use mirror maker?
> > >
> > https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=27846330
> > >
> > > -David
> > >
> > > On 10/6/16, 1:32 PM, "Craig Swift"  INVALID>
> > > wrote:
> > >
> > > Hello,
> > >
> > > We're in the process of upgrading several of our clusters to Kafka
> > 10.
> > > I
> > > was wondering if it's possible to use the Kafka 10 client code (old
> > or
> > > new)
> > > to read from a source Kafka 8 cluster and then use the new 10
> > producer
> > > to
> > > write to a destination Kafka 10 cluster? I know there's a
> recommended
> > > upgrade path per the documentation but we're unfortunately not able
> > to
> > > update the source cluster quite yet. Thoughts?
> > >
> > > Craig
> > >
> > >
> > >
> >
>


Re: Handling out of order messages without KTables

2016-10-06 Thread Matthias J. Sax

It is not global in this sense.

Thus, you need to ensure that records updating the same product go to
the same instance. You can ensure this by giving all records of the
same product the same key and using "groupByKey" before processing the data.
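
Putting that together, a rough "last update wins" sketch over the re-keyed
stream, assuming the 0.10.1 groupByKey()/reduce() API (topic and store names, and
the timestamp parsing, are illustrative):

import com.fasterxml.jackson.databind.ObjectMapper;

import java.time.Instant;

import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;
import org.apache.kafka.streams.kstream.KTable;

public class LatestWinsExample {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    // "Last update wins" per product, judged by the timestamp embedded in the JSON value.
    static KTable<String, String> latestByProduct(KStreamBuilder builder) {
        KStream<String, String> updates = builder.stream("events-by-product"); // already keyed by product id

        return updates
                .groupByKey()
                .reduce((currentLatest, candidate) ->
                                embeddedTimestamp(candidate) >= embeddedTimestamp(currentLatest)
                                        ? candidate : currentLatest,
                        "latest-by-product-store");
    }

    private static long embeddedTimestamp(String json) {
        try {
            // the "timestamp" field name and ISO-8601 format are assumptions about the payload
            return Instant.parse(MAPPER.readTree(json).get("timestamp").asText()).toEpochMilli();
        } catch (Exception e) {
            return -1L; // treat unparseable payloads as oldest
        }
    }
}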

- -Matthias

On 10/6/16 10:55 AM, Ali Akhtar wrote:
> Thank you, State Store seems promising. But, is it distributed, or
> limited to the particular instance of my application?
> 
> I.e if there are 3 messages, setting product 1's price to $1, $3,
> and $5, and all 3 of them go to a different instance of my
> application, will they be able to correctly identify the latest
> received message using State Store?
> 
> On Thu, Oct 6, 2016 at 10:48 PM, Matthias J. Sax
>  wrote:
> 
> What do you mean by "message keys are random" -- do you
> effectively have no keys and want all messages to be processed as
> if they all have the same key?
> 
> To access the record TS in general, you need to use the Processor API. The
> ProcessorContext object given by Processor#init() always returns the
> timestamp of the currently processed record via #timestamp().
> 
> Thus, you can attach a state store to your processor and compare
> the timestamps of the current record with the timestamp of the one
> in your store.
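
A small sketch of that Processor-API pattern (the state store must be created and
connected to the processor when the topology is built; all names here are
illustrative):

import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

public class LatestWinsProcessor extends AbstractProcessor<String, String> {

    private KeyValueStore<String, Long> latestTimestamps;

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        super.init(context);
        // "latest-ts-store" must be registered on the topology and connected to this processor
        latestTimestamps = (KeyValueStore<String, Long>) context.getStateStore("latest-ts-store");
    }

    @Override
    public void process(String key, String value) {
        long ts = context().timestamp(); // timestamp of the record currently being processed
        Long lastSeen = latestTimestamps.get(key);
        if (lastSeen == null || ts >= lastSeen) {
            latestTimestamps.put(key, ts);
            context().forward(key, value); // only newer-than-seen updates flow downstream
        }
    }
}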
> 
> -Matthias
> 
> On 10/6/16 8:52 AM, Ali Akhtar wrote:
 Heya,
 
 I have some Kafka producers, which are listening to webhook
 events, and for each webhook event, they post its payload to
 a Kafka topic.
 
 Each payload contains a timestamp from the webhook source.
 
 This timestamp is the source of truth about which events
 happened first, which happened last, etc.
 
 I need to ensure that the last arrival of a particular type
 of message wins.
 
 E.g, if there are 5 messages, saying the price of a product
 with id 1, was set to $1, then $3, then something else, etc,
 before finally being set to $10, then I need to make sure
 that the final price for that product is $10.
 
 These messages can be out of order, and I need to determine
 the latest arrival based on the timestamp from the webhook
 source. (Its atm in a string format which can be parsed)
 
 Since KTable looks like it uses message keys to determine
 what happens - and in this case, the message keys are random,
 and the timestamp contained in the value of the message is
 what determines the order of the events - any pointers on
 what the best way to do this is?
 
 I'm using kafka streaming, latest version, Java.
 
>> 
> 


Re: Snazzy new look to our website

2016-10-06 Thread Jason Gustafson
Thanks Mickael and Jonathan for reporting the problem with the javadocs. The
links should be fixed now.

-Jason

On Thu, Oct 6, 2016 at 10:59 AM, Jonathan Bond 
wrote:

> Hi,
>
> I'm having a problem with the new website.  Whenever I try to follow a link
> to the Kafka javadoc, either from within the website or a google link - it
> always takes me to the top of the Kafka 0.10 documentation page.  I can't
> figure out how to get to the javadoc.
>
> Thanks, Jonathan
>
> On Tue, Oct 4, 2016 at 6:46 PM Jason Gustafson  wrote:
>
> > Huge improvement. Thanks Derrick and Gwen!
> >
> > On Tue, Oct 4, 2016 at 5:54 PM, Becket Qin  wrote:
> >
> > > Much fancier now :)
> > >
> > > On Tue, Oct 4, 2016 at 5:51 PM, Ali Akhtar 
> wrote:
> > >
> > > > Just noticed this on pulling up the documentation. Oh yeah! This new
> > look
> > > > is fantastic.
> > > >
> > > > On Wed, Oct 5, 2016 at 4:31 AM, Vahid S Hashemian <
> > > > vahidhashem...@us.ibm.com
> > > > > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > Thank you for the much needed new design.
> > > > > At first glance, it looks great, and more professional.
> > > > >
> > > > > --Vahid
> > > > >
> > > > >
> > > > >
> > > > > From:   Gwen Shapira 
> > > > > To: d...@kafka.apache.org, Users 
> > > > > Cc: Derrick Or 
> > > > > Date:   10/04/2016 04:13 PM
> > > > > Subject:Snazzy new look to our website
> > > > >
> > > > >
> > > > >
> > > > > Hi Team Kafka,
> > > > >
> > > > > I just merged PR 20 to our website - which gives it a new (and IMO
> > > > > pretty snazzy) look and feel. Thanks to Derrick Or for contributing
> > > > > the update.
> > > > >
> > > > > I had to do a hard-refresh (shift-f5 on my mac) to get the new look
> > to
> > > > > load properly - so if stuff looks off, try it.
> > > > >
> > > > > Comments and contributions to the site are welcome.
> > > > >
> > > > > Gwen
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>


Re: Kafka 10 Consumer Reading from Kafka 8 Cluster?

2016-10-06 Thread Scott Reynolds
you cannot use a k10 client with a k8 cluster. The protocol changed

You CAN use a k8 client with a k10 cluster.

On Thu, Oct 6, 2016 at 12:00 PM Craig Swift
 wrote:

> We're doing some fairly intensive data transformations in the current
> workers so it's not as straight forward as just reading/producing to
> another topic. However, if you mean can we mirror the source topic to the
> kafka 10 cluster and then update the worker to read/write to 10 - that
> could be an option. I'd still like to know if any of the k10 client
> consumer (old or new consumer) code can work with a k8 cluster though.
>
> Craig
>
> On Thu, Oct 6, 2016 at 12:38 PM, David Garcia 
> wrote:
>
> > Any reason you can’t use mirror maker?
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330
> >
> > -David
> >
> > On 10/6/16, 1:32 PM, "Craig Swift" 
> > wrote:
> >
> > Hello,
> >
> > We're in the process of upgrading several of our clusters to Kafka
> 10.
> > I
> > was wondering if it's possible to use the Kafka 10 client code (old
> or
> > new)
> > to read from a source Kafka 8 cluster and then use the new 10
> producer
> > to
> > write to a destination Kafka 10 cluster? I know there's a recommended
> > upgrade path per the documentation but we're unfortunately not able
> to
> > update the source cluster quite yet. Thoughts?
> >
> > Craig
> >
> >
> >
>


Re: kafka stream to new topic based on message key

2016-10-06 Thread Gary Ogden
Thanks Guozhang. I've gotten an example to work using your tips.

So, is there no other way in Streams to create a topic if the broker config
"auto.create.topics.enable"
is set to false?  Maybe by creating a record in ZooKeeper for that topic?



On 5 October 2016 at 17:20, Guozhang Wang  wrote:

> Hello Gary,
>
>
> 1. The InternalTopicManager is only used by the Streams-instantiated
> PartitionAssignor to create internal topics for auto-repartitioning and
> changelog.
>
> 2. About "RecordCollector.Supplier": you are right, and as I wrote in the
> above email you have to force casting it to RecordCollector.Supplier,
> theoretically this is not safe but the internal Impl is always used.
>
>
> If you know beforehand all the possible topics that you would want to
> send to based on the key-value pair, you can then use KStream#branch() to
> split the source stream into multiple streams based on the content, and send
> each branched stream to a different topic.
>
>
> Guozhang
>
>
> On Wed, Oct 5, 2016 at 7:48 AM, Gary Ogden  wrote:
>
> > Guozhang. I was just looking at the source for this, and it looks like
> the
> > RecordCollector.Supplier is part of the internal ProcessorContextImpl
> > class.  I don't think that's exposed to me, is it?
> >
> > If I create a processor class that extends AbstractProcess, it only has
> > access to the ProcessorContext interface, which doesn't expose the
> > Supplier.
> >
> > On 5 October 2016 at 09:42, Gary Ogden  wrote:
> >
> > > What if we were to use kafka connect instead of streams? Does it have
> the
> > > ability to specify partitions, rf, segment size etc?
> > >
> > > On 5 October 2016 at 09:42, Gary Ogden  wrote:
> > >
> > >> Thanks Guozhang.
> > >>
> > >> So there's no way we could also use InternalTopicManager to specify
> the
> > >> number of partitions and RF?
> > >>
> > >> https://github.com/apache/kafka/blob/0.10.1/streams/src/main
> > >> /java/org/apache/kafka/streams/processor/internals/InternalT
> > >> opicManager.java
> > >>
> > >> On 4 October 2016 at 19:34, Guozhang Wang  wrote:
> > >>
> > >>> Hello Gary,
> > >>>
> > >>> This is also doable in the Processor API, you can use the record
> > >>> collector
> > >>> from ProcessorContext to send data to arbitrary topics, i.e.:
> > >>>
> > >>> RecordCollector collector = ((RecordCollector.Supplier)
> > >>> context).recordCollector();
> > >>> collector.send(new ProducerRecord<>(topic, *...*), keySerializer,
> > >>> valSerializer, partitioner);
> > >>>
> > >>>
> > >>> But note that if the new topic, e.g. "123456_lv2", is not created, then
> > >>> the send call will throw an exception unless the broker-side config
> > >>> "auto.create.topics.enable" is set to true; and even in this case, the
> > >>> topic will be auto-created with the pre-defined number of partitions,
> > >>> i.e. you cannot control the configs the topic is created with, such as
> > >>> compaction policy, num.partitions, segment sizes, etc.
> > >>> If that works for you then I think it should be fine.
> > >>>
> > >>>
> > >>> Guozhang
> > >>>
> > >>>
> > >>>
> > >>> On Tue, Oct 4, 2016 at 12:51 PM, Gary Ogden 
> wrote:
> > >>>
> > >>> > Is it possible, in a kafka streaming job, to write to another topic
> > >>> based
> > >>> > on the key in the messages?
> > >>> >
> > >>> > For example, say the message is:
> > >>> >
> > >>> > 123456#{"id":56789,"type":1}
> > >>> >
> > >>> > where the key is 123456, # is the delimeter, and the {} is the json
> > >>> data.
> > >>> >
> > >>> > And I want to push the json data to another topic that will have
> the
> > >>> name
> > >>> > 123456_lv2.
> > >>> >
> > >>> > Is this possible with kafka streaming?
> > >>> >
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> -- Guozhang
> > >>>
> > >>
> > >>
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: Kafka 10 Consumer Reading from Kafka 8 Cluster?

2016-10-06 Thread Craig Swift
We're doing some fairly intensive data transformations in the current
workers, so it's not as straightforward as just reading from/producing to
another topic. However, if you mean mirroring the source topic to the
Kafka 10 cluster and then updating the worker to read/write to 10 - that
could be an option. I'd still like to know whether any of the Kafka 10
client (old or new consumer) code can work with a Kafka 8 cluster, though.

Craig

On Thu, Oct 6, 2016 at 12:38 PM, David Garcia  wrote:

> Any reason you can’t use mirror maker?
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330
>
> -David
>
> On 10/6/16, 1:32 PM, "Craig Swift" 
> wrote:
>
> Hello,
>
> We're in the process of upgrading several of our clusters to Kafka 10.
> I
> was wondering if it's possible to use the Kafka 10 client code (old or
> new)
> to read from a source Kafka 8 cluster and then use the new 10 producer
> to
> write to a destination Kafka 10 cluster? I know there's a recommended
> upgrade path per the documentation but we're unfortunately not able to
> update the source cluster quite yet. Thoughts?
>
> Craig
>
>
>


Re: Kafka 10 Consumer Reading from Kafka 8 Cluster?

2016-10-06 Thread David Garcia
Any reason you can’t use mirror maker?
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330

-David

On 10/6/16, 1:32 PM, "Craig Swift"  wrote:

Hello,

We're in the process of upgrading several of our clusters to Kafka 10. I
was wondering if it's possible to use the Kafka 10 client code (old or new)
to read from a source Kafka 8 cluster and then use the new 10 producer to
write to a destination Kafka 10 cluster? I know there's a recommended
upgrade path per the documentation but we're unfortunately not able to
update the source cluster quite yet. Thoughts?

Craig




Kafka 10 Consumer Reading from Kafka 8 Cluster?

2016-10-06 Thread Craig Swift
Hello,

We're in the process of upgrading several of our clusters to Kafka 10. I
was wondering if it's possible to use the Kafka 10 client code (old or new)
to read from a source Kafka 8 cluster and then use the new 10 producer to
write to a destination Kafka 10 cluster? I know there's a recommended
upgrade path per the documentation but we're unfortunately not able to
update the source cluster quite yet. Thoughts?

Craig


Re: Kafka Streams dynamic partitioning

2016-10-06 Thread Michael Noll
> I think this should be ' pick number of partitions that matches max number
> of possible keys in stream to be partitioned '.
> At least in my usecase , in which I am trying to partition stream by key
> and make windowed aggregations, if there are less number of topic
> partitions than possible keys,  then application will not work correctly.

As I said above, this is actually not needed -- which (I hope) means good
news for you. :-)
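
To make that concrete, here is a minimal sketch, assuming the 0.10.1 DSL, an
input topic "events" with String keys and values, and made-up application and
store names: even if many distinct keys hash to the same partition, the
windowed count below is still maintained separately per key, because Streams
keys its state by the record key rather than by the partition.

import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.kstream.Windowed;

public class PerKeyWindowedCounts {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "per-key-windowed-counts");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        KStreamBuilder builder = new KStreamBuilder();
        KStream<String, String> events = builder.stream("events");

        // One count per key and per one-minute window, regardless of how keys map to partitions
        KTable<Windowed<String>, Long> counts = events
                .groupByKey()
                .count(TimeWindows.of(60 * 1000L), "per-key-counts");

        // counts could be written to an output topic or exposed via interactive queries
        new KafkaStreams(builder, props).start();
    }
}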



On Wed, Oct 5, 2016 at 11:27 PM, Adrienne Kole 
wrote:

> Thanks, I got the point. That solves my problem.
>
>
>
> On Wed, Oct 5, 2016 at 10:58 PM, Matthias J. Sax 
> wrote:
>
> > Hi,
> >
> > even if you have more distinct keys than partitions (i.e., different keys
> > go to the same partition), if you do "aggregate by key" Streams will
> > automatically separate the keys and compute an aggregate per key.
> > Thus, you do not need to worry about which key is hashed to what
> > partition.
> >
> > - -Matthias
> >
> > On 10/5/16 1:37 PM, Adrienne Kole wrote:
> > > Hi,
> > >
> > > @Ali IMO, Yes. That is the job of kafka server to assign kafka
> > > instances partition(s) to process. Each instance can process more
> > > than one partition but one partition cannot be processed by more
> > > than one instance.
> > >
> > > @Michael, Thanks for reply.
> > >> Rather, pick the number of partitions in a way that matches your
> > >> needs to
> > > process the data in parallel I think this should be ' pick number
> > > of partitions that matches max number of possible keys in stream to
> > > be partitioned '. At least in my usecase , in which I am trying to
> > > partition stream by key and make windowed aggregations, if there
> > > are less number of topic partitions than possible keys,  then
> > > application will not work correctly.
> > >
> > > That is, if the number of topic partitions is less than possible
> > > stream keys, then different keyed stream tuples will be assigned to
> > > same topic. That was the problem that I was trying to solve and it
> > > seems the only solution is to estimate max number of possible keys
> > > and assign accordingly.
> > >
> > > Thanks Adrienne
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Oct 5, 2016 at 5:55 PM, Ali Akhtar 
> > > wrote:
> > >
> > >>> It's often a good
> > >> idea to over-partition your topics.  For example, even if today
> > >> 10 machines (and thus 10 partitions) would be sufficient, pick a
> > >> higher number of partitions (say, 50) so you have some wiggle
> > >> room to add more machines (11...50) later if need be.
> > >>
> > >> If you create e.g 30 partitions, but only have e.g 5 instances of
> > >> your program, all on the same consumer group, all using kafka
> > >> streams to consume the topic, do you still receive all the data
> > >> posted to the topic, or will you need to have the same instances
> > >> of the program as there are partitions?
> > >>
> > >> (If you have 1 instance, 30 partitions, will the same rules
> > >> apply, i.e it will receive all data?)
> > >>
> > >> On Wed, Oct 5, 2016 at 8:52 PM, Michael Noll
> > >>  wrote:
> > >>
> >  So, in this case I should know the max number of possible
> >  keys so that I can create that number of partitions.
> > >>>
> > >>> Assuming I understand your original question correctly, then
> > >>> you would
> > >> not
> > >>> need to do/know this.  Rather, pick the number of partitions in
> > >>> a way
> > >> that
> > >>> matches your needs to process the data in parallel (e.g. if you
> > >>> expect
> > >> that
> > >>> you require 10 machines in order to process the incoming data,
> > >>> then you'd need 10 partitions).  Also, as a general
> > >>> recommendation:  It's often a
> > >> good
> > >>> idea to over-partition your topics.  For example, even if today
> > >>> 10
> > >> machines
> > >>> (and thus 10 partitions) would be sufficient, pick a higher
> > >>> number of partitions (say, 50) so you have some wiggle room to
> > >>> add more machines (11...50) later if need be.
> > >>>
> > >>>
> > >>>
> > >>> On Wed, Oct 5, 2016 at 9:34 AM, Adrienne Kole
> > >>>  wrote:
> > >>>
> >  Hi Guozhang,
> > 
> >  So, in this case I should know the max number of possible
> >  keys so that
> > >> I
> >  can create that number of partitions.
> > 
> >  Thanks
> > 
> >  Adrienne
> > 
> >  On Wed, Oct 5, 2016 at 1:00 AM, Guozhang Wang
> >  
> > >>> wrote:
> > 
> > > By default the partitioner will use murmur hash on the key
> > > and mod
> > >> on
> > > current num.partitions to determine which partitions to go
> > > to, so
> > >>> records
> > > with the same key will be assigned to the same partition.
> > > Would that
> > >> be
> >  OK
> > > for your case?
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Tue, Oct 4, 2016 at 3:00 PM, Adrienne Kole <
> > >> adrienneko...@gmail.com
> > 
> > > wrote:
> > >
> > >>

Re: Snazzy new look to our website

2016-10-06 Thread Jonathan Bond
Hi,

I'm having a problem with the new website.  Whenever I try to follow a link
to the Kafka javadoc, either from within the website or a google link - it
always takes me to the top of the Kafka 0.10 documentation page.  I can't
figure out how to get to the javadoc.

Thanks, Jonathan

On Tue, Oct 4, 2016 at 6:46 PM Jason Gustafson  wrote:

> Huge improvement. Thanks Derrick and Gwen!
>
> On Tue, Oct 4, 2016 at 5:54 PM, Becket Qin  wrote:
>
> > Much fancier now :)
> >
> > On Tue, Oct 4, 2016 at 5:51 PM, Ali Akhtar  wrote:
> >
> > > Just noticed this on pulling up the documentation. Oh yeah! This new
> look
> > > is fantastic.
> > >
> > > On Wed, Oct 5, 2016 at 4:31 AM, Vahid S Hashemian <
> > > vahidhashem...@us.ibm.com
> > > > wrote:
> > >
> > > > +1
> > > >
> > > > Thank you for the much needed new design.
> > > > At first glance, it looks great, and more professional.
> > > >
> > > > --Vahid
> > > >
> > > >
> > > >
> > > > From:   Gwen Shapira 
> > > > To: d...@kafka.apache.org, Users 
> > > > Cc: Derrick Or 
> > > > Date:   10/04/2016 04:13 PM
> > > > Subject:Snazzy new look to our website
> > > >
> > > >
> > > >
> > > > Hi Team Kafka,
> > > >
> > > > I just merged PR 20 to our website - which gives it a new (and IMO
> > > > pretty snazzy) look and feel. Thanks to Derrick Or for contributing
> > > > the update.
> > > >
> > > > I had to do a hard-refresh (shift-f5 on my mac) to get the new look
> to
> > > > load properly - so if stuff looks off, try it.
> > > >
> > > > Comments and contributions to the site are welcome.
> > > >
> > > > Gwen
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
>


Re: Handling out of order messages without KTables

2016-10-06 Thread Ali Akhtar
Thank you, the state store seems promising. But is it distributed, or limited
to the particular instance of my application?

I.e., if there are 3 messages, setting product 1's price to $1, $3, and $5,
and all 3 of them go to different instances of my application, will those
instances be able to correctly identify the latest received message using the
state store?

On Thu, Oct 6, 2016 at 10:48 PM, Matthias J. Sax 
wrote:

>
> What do you mean by "message keys are random" -- do you effectively
> have no keys and want all messages to be processed as if they all have
> the same key?
>
> To access the record timestamp in general, you need to use the Processor
> API. The ProcessorContext object passed to Processor#init() always returns
> the timestamp of the record currently being processed via #timestamp().
>
> Thus, you can attach a state store to your processor and compare the
> timestamps of the current record with the timestamp of the one in your
> store.
>
> - -Matthias
>
> On 10/6/16 8:52 AM, Ali Akhtar wrote:
> > Heya,
> >
> > I have some Kafka producers, which are listening to webhook events,
> > and for each webhook event, they post its payload to a Kafka
> > topic.
> >
> > Each payload contains a timestamp from the webhook source.
> >
> > This timestamp is the source of truth about which events happened
> > first, which happened last, etc.
> >
> > I need to ensure that the last arrival of a particular type of
> > message wins.
> >
> > E.g, if there are 5 messages, saying the price of a product with id
> > 1, was set to $1, then $3, then something else, etc, before finally
> > being set to $10, then I need to make sure that the final price for
> > that product is $10.
> >
> > These messages can be out of order, and I need to determine the
> > latest arrival based on the timestamp from the webhook source. (Its
> > atm in a string format which can be parsed)
> >
> > Since KTable looks like it uses message keys to determine what
> > happens - and in this case, the message keys are random, and the
> > timestamp contained in the value of the message is what determines
> > the order of the events - any pointers on what the best way to do
> > this is?
> >
> > I'm using kafka streaming, latest version, Java.
> >
>


Re: Handling out of order messages without KTables

2016-10-06 Thread Matthias J. Sax

What do you mean by "message keys are random" -- do you effectively
have no keys and want all messages to be processed as if they all have
the same key?

To access the record timestamp in general, you need to use the Processor
API. The ProcessorContext object passed to Processor#init() always returns
the timestamp of the record currently being processed via #timestamp().

Thus, you can attach a state store to your processor and compare the
timestamps of the current record with the timestamp of the one in your
store.
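
A minimal sketch of that pattern; the class name, the "latest-by-product"
store name, and the payload-parsing helpers are made up for illustration:

import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

public class LatestWinsProcessor extends AbstractProcessor<String, String> {

    private KeyValueStore<String, String> latest;

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        super.init(context);
        // the store must be added to the topology and connected to this processor
        latest = (KeyValueStore<String, String>) context.getStateStore("latest-by-product");
    }

    @Override
    public void process(String key, String value) {
        String productId = extractProductId(value);  // parse the id out of the JSON payload
        long eventTime = extractTimestamp(value);    // parse the webhook timestamp out of the payload

        String previous = latest.get(productId);
        if (previous == null || extractTimestamp(previous) < eventTime) {
            latest.put(productId, value);            // the later webhook timestamp wins
            context().forward(productId, value);     // emit the current winner downstream
        }
    }

    // placeholder parsers; replace with real JSON handling
    private String extractProductId(String json) { return json; }
    private long extractTimestamp(String json) { return 0L; }
}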

-Matthias

On 10/6/16 8:52 AM, Ali Akhtar wrote:
> Heya,
> 
> I have some Kafka producers, which are listening to webhook events,
> and for each webhook event, they post its payload to a Kafka
> topic.
> 
> Each payload contains a timestamp from the webhook source.
> 
> This timestamp is the source of truth about which events happened
> first, which happened last, etc.
> 
> I need to ensure that the last arrival of a particular type of
> message wins.
> 
> E.g, if there are 5 messages, saying the price of a product with id
> 1, was set to $1, then $3, then something else, etc, before finally
> being set to $10, then I need to make sure that the final price for
> that product is $10.
> 
> These messages can be out of order, and I need to determine the
> latest arrival based on the timestamp from the webhook source. (Its
> atm in a string format which can be parsed)
> 
> Since KTable looks like it uses message keys to determine what
> happens - and in this case, the message keys are random, and the
> timestamp contained in the value of the message is what determines
> the order of the events - any pointers on what the best way to do
> this is?
> 
> I'm using kafka streaming, latest version, Java.
> 


Handling out of order messages without KTables

2016-10-06 Thread Ali Akhtar
Heya,

I have some Kafka producers, which are listening to webhook events, and for
each webhook event, they post its payload to a Kafka topic.

Each payload contains a timestamp from the webhook source.

This timestamp is the source of truth about which events happened first,
which happened last, etc.

I need to ensure that the last arrival of a particular type of message wins.

E.g, if there are 5 messages, saying the price of a product with id 1, was
set to $1, then $3, then something else, etc, before finally being set to
$10, then I need to make sure that the final price for that product is $10.

These messages can be out of order, and I need to determine the latest
arrival based on the timestamp from the webhook source. (It's currently in a
string format which can be parsed.)

Since KTable looks like it uses message keys to determine what happens -
and in this case, the message keys are random, and the timestamp contained
in the value of the message is what determines the order of the events -
any pointers on what the best way to do this is?

I'm using kafka streaming, latest version, Java.


Re: difficulty to delete a topic because of its syntax

2016-10-06 Thread Ismael Juma
On Thu, Oct 6, 2016 at 2:51 PM, Avi Flax  wrote:
>
> Does this mean that the next release (after 0.10.1.0, maybe ~Feb?) might
> remove altogether the requirement that Streams apps be able to access
> ZooKeeper directly?


That's the plan. See the following PR for details:

https://github.com/apache/kafka/pull/1884

Ismael


Re: difficulty to delete a topic because of its syntax

2016-10-06 Thread Avi Flax
> On Oct 6, 2016, at 06:24, Ismael Juma  wrote:
> 
> It's worth mentioning that Streams is in the process of transitioning from
> updating ZooKeeper directly to using the newly introduced create topics and
> delete topics protocol requests. It was too late for 0.10.1.0, but should
> land in trunk soonish.


That’s great to hear!

Does this mean that the next release (after 0.10.1.0, maybe ~Feb?) might remove 
altogether the requirement that Streams apps be able to access ZooKeeper 
directly? Or is this simply a solid step in that direction? It’s good news 
either way; really I’m just curious.

Thanks!
Avi


Software Architect @ Park Assist
We’re hiring! http://tech.parkassist.com/jobs/

RE: difficulty to delete a topic because of its syntax

2016-10-06 Thread Hamza HACHANI
Yes, in fact,

the topic in question was the name of a store.

OK, I will raise the JIRA.


De : isma...@gmail.com  de la part de Ismael Juma 

Envoyé : mercredi 5 octobre 2016 22:24:53
À : users@kafka.apache.org
Objet : Re: difficulty to delete a topic because of its syntax

It's worth mentioning that Streams is in the process of transitioning from
updating ZooKeeper directly to using the newly introduced create topics and
delete topics protocol requests. It was too late for 0.10.1.0, but should
land in trunk soonish.

Ismael

On Thu, Oct 6, 2016 at 11:15 AM, Yuto KAWAMURA 
wrote:

> I guess this topic is created by Kafka Streams.
> Kafka Streams has it's own topic creation(zookeeper node creation)
> implementation and not using core's AdminUtils to create internal use
> topics such as XX-changelog:
> https://github.com/apache/kafka/blob/trunk/streams/src/main/
> java/org/apache/kafka/streams/processor/internals/InternalTo
> picManager.java#L208
> In AdminUtils it has topic name
> validation(https://github.com/apache/kafka/blob/trunk/core/s
> rc/main/scala/kafka/common/Topic.scala#L33-L47)
> logic but I don't see similar thing in Kafka Streams version as I read
> the code briefly.
> Since the topic name is created by the store name which is an
> arbitrary string given by users, this could happen if you give a
> string that contains whitespace as a name of state store.
>
>
> 2016-10-06 18:40 GMT+09:00 Rajini Sivaram :
> > Hamza,
> >
> > Can you raise a JIRA with details on how the topic was created by Kafka
> > with an invalid name? Sounds like there might be a missing validation
> > somewhere.
> >
> > Regards,
> >
> > Rajini
> >
> > On Thu, Oct 6, 2016 at 10:12 AM, Hamza HACHANI 
> > wrote:
> >
> >> Thanks Todd,
> >>
> >>
> >> I've resolved it by suing what you told me.
> >>
> >> Thanks very much. But i think that there is a problem with kafka by
> >> letting the saving names of topic and logs where there is a space as i
> >> showes in the images.
> >>
> >> Have a good day to you all.
> >>
> >>
> >> Hamza
> >>
> >> 
> >> De : Hamza HACHANI 
> >> Envoyé : mercredi 5 octobre 2016 19:23:00
> >> À : users@kafka.apache.org
> >> Objet : RE: difficulty to delete a topic because of its syntax
> >>
> >>
> >> Hi,
> >>
> >> Attached the files showing what i'm talking about.
> >>
> >>
> >> Hamza
> >>
> >> 
> >> De : Todd S 
> >> Envoyé : mercredi 5 octobre 2016 07:25:48
> >> À : users@kafka.apache.org
> >> Objet : Re: difficulty to delete a topic because of its syntax
> >>
> >> You *could* go in to zookeeper and nuke the topic, then delete the
> files on
> >> disk
> >>
> >> Slightly more risky but it should work
> >>
> >> On Wednesday, 5 October 2016, Manikumar 
> wrote:
> >>
> >> > Kafka doesn't support white spaces in topic names.  Only support '.',
> '_'
> >> > and '-' these are allowed.
> >> > Not sure how you got white space in topic name.
> >> >
> >> > On Wed, Oct 5, 2016 at 8:19 PM, Hamza HACHANI <
> hamza.hach...@supcom.tn
> >> > >
> >> > wrote:
> >> >
> >> > > Well ackwardly when i list the topics i find it but when i do
> delete it
> >> > it
> >> > > says that this topic does not exist.
> >> > >
> >> > > 
> >> > > De : Ben Davison >
> >> > > Envoyé : mercredi 5 octobre 2016 02:37:14
> >> > > À : users@kafka.apache.org 
> >> > > Objet : Re: difficulty to delete a topic because of its syntax
> >> > >
> >> > > Try putting "" or '' around the string when running the command.
> >> > >
> >> > > On Wed, Oct 5, 2016 at 3:29 PM, Hamza HACHANI <
> hamza.hach...@supcom.tn
> >> > >
> >> > > wrote:
> >> > >
> >> > > > It's between "the" and "metric"
> >> > > >
> >> > > > 
> >> > > > De : Ali Akhtar >
> >> > > > Envoyé : mercredi 5 octobre 2016 02:16:33
> >> > > > À : users@kafka.apache.org 
> >> > > > Objet : Re: difficulty to delete a topic because of its syntax
> >> > > >
> >> > > > I don't see a space in that topic name
> >> > > >
> >> > > > On Wed, Oct 5, 2016 at 6:42 PM, Hamza HACHANI <
> >> hamza.hach...@supcom.tn
> >> > >
> >> > > > wrote:
> >> > > >
> >> > > > > Hi,
> >> > > > >
> >> > > > > I created a topic called device-connection-invert-key-value-the
> >> > > > > metric-changelog.
> >> > > > >
> >> > > > > I insit that there is a space in it.
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > Now that i want to delete it because my  cluster can no longer
> work
> >> > > > > correctly i can't do it as it  only reads the first part of it
> : (
> >> > > > > device-connection-invert-key-value-the) which obviously it
> doesn't
> >> > > find.
> >> > > > >
> >> > > > > Does some body have a wolution to delete it ?
> >> > > > >
> >> > > > > Thanks in advance.
> >> > > > >
> >> > > > >
> >> > > > > Hamza
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> > > --
> >> > >
> >> > >
> >> > > This email, including attachments, is private and confidential.

Text error in documentation.html web page

2016-10-06 Thread Mazhar Shaikh
Visit: http://kafka.apache.org/documentation.html#quickstart

and scroll down.

Please find the image attached.
[image: Inline image 1]

Browser : IE & Firefox


Re: difficulty to delete a topic because of its syntax

2016-10-06 Thread Ismael Juma
It's worth mentioning that Streams is in the process of transitioning from
updating ZooKeeper directly to using the newly introduced create topics and
delete topics protocol requests. It was too late for 0.10.1.0, but should
land in trunk soonish.

Ismael

On Thu, Oct 6, 2016 at 11:15 AM, Yuto KAWAMURA 
wrote:

> I guess this topic is created by Kafka Streams.
> Kafka Streams has it's own topic creation(zookeeper node creation)
> implementation and not using core's AdminUtils to create internal use
> topics such as XX-changelog:
> https://github.com/apache/kafka/blob/trunk/streams/src/main/
> java/org/apache/kafka/streams/processor/internals/InternalTo
> picManager.java#L208
> In AdminUtils it has topic name
> validation(https://github.com/apache/kafka/blob/trunk/core/s
> rc/main/scala/kafka/common/Topic.scala#L33-L47)
> logic but I don't see similar thing in Kafka Streams version as I read
> the code briefly.
> Since the topic name is created by the store name which is an
> arbitrary string given by users, this could happen if you give a
> string that contains whitespace as a name of state store.
>
>
> 2016-10-06 18:40 GMT+09:00 Rajini Sivaram :
> > Hamza,
> >
> > Can you raise a JIRA with details on how the topic was created by Kafka
> > with an invalid name? Sounds like there might be a missing validation
> > somewhere.
> >
> > Regards,
> >
> > Rajini
> >
> > On Thu, Oct 6, 2016 at 10:12 AM, Hamza HACHANI 
> > wrote:
> >
> >> Thanks Todd,
> >>
> >>
> >> I've resolved it by suing what you told me.
> >>
> >> Thanks very much. But i think that there is a problem with kafka by
> >> letting the saving names of topic and logs where there is a space as i
> >> showes in the images.
> >>
> >> Have a good day to you all.
> >>
> >>
> >> Hamza
> >>
> >> 
> >> De : Hamza HACHANI 
> >> Envoyé : mercredi 5 octobre 2016 19:23:00
> >> À : users@kafka.apache.org
> >> Objet : RE: difficulty to delete a topic because of its syntax
> >>
> >>
> >> Hi,
> >>
> >> Attached the files showing what i'm talking about.
> >>
> >>
> >> Hamza
> >>
> >> 
> >> De : Todd S 
> >> Envoyé : mercredi 5 octobre 2016 07:25:48
> >> À : users@kafka.apache.org
> >> Objet : Re: difficulty to delete a topic because of its syntax
> >>
> >> You *could* go in to zookeeper and nuke the topic, then delete the
> files on
> >> disk
> >>
> >> Slightly more risky but it should work
> >>
> >> On Wednesday, 5 October 2016, Manikumar 
> wrote:
> >>
> >> > Kafka doesn't support white spaces in topic names.  Only support '.',
> '_'
> >> > and '-' these are allowed.
> >> > Not sure how you got white space in topic name.
> >> >
> >> > On Wed, Oct 5, 2016 at 8:19 PM, Hamza HACHANI <
> hamza.hach...@supcom.tn
> >> > >
> >> > wrote:
> >> >
> >> > > Well ackwardly when i list the topics i find it but when i do
> delete it
> >> > it
> >> > > says that this topic does not exist.
> >> > >
> >> > > 
> >> > > De : Ben Davison >
> >> > > Envoyé : mercredi 5 octobre 2016 02:37:14
> >> > > À : users@kafka.apache.org 
> >> > > Objet : Re: difficulty to delete a topic because of its syntax
> >> > >
> >> > > Try putting "" or '' around the string when running the command.
> >> > >
> >> > > On Wed, Oct 5, 2016 at 3:29 PM, Hamza HACHANI <
> hamza.hach...@supcom.tn
> >> > >
> >> > > wrote:
> >> > >
> >> > > > It's between "the" and "metric"
> >> > > >
> >> > > > 
> >> > > > De : Ali Akhtar >
> >> > > > Envoyé : mercredi 5 octobre 2016 02:16:33
> >> > > > À : users@kafka.apache.org 
> >> > > > Objet : Re: difficulty to delete a topic because of its syntax
> >> > > >
> >> > > > I don't see a space in that topic name
> >> > > >
> >> > > > On Wed, Oct 5, 2016 at 6:42 PM, Hamza HACHANI <
> >> hamza.hach...@supcom.tn
> >> > >
> >> > > > wrote:
> >> > > >
> >> > > > > Hi,
> >> > > > >
> >> > > > > I created a topic called device-connection-invert-key-value-the
> >> > > > > metric-changelog.
> >> > > > >
> >> > > > > I insit that there is a space in it.
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > Now that i want to delete it because my  cluster can no longer
> work
> >> > > > > correctly i can't do it as it  only reads the first part of it
> : (
> >> > > > > device-connection-invert-key-value-the) which obviously it
> doesn't
> >> > > find.
> >> > > > >
> >> > > > > Does some body have a wolution to delete it ?
> >> > > > >
> >> > > > > Thanks in advance.
> >> > > > >
> >> > > > >
> >> > > > > Hamza
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> > > --
> >> > >
> >> > >
> >> > > This email, including attachments, is private and confidential. If
> you
> >> > have
> >> > > received this email in error please notify the sender and delete it
> >> from
> >> > > your system. Emails are not secure and may contain viruses. No
> >> liability
> >> > > can be accepted for viruses that might be transferred by this email
> or
> >> > any
> >> > > attachment. Any unau

Re: difficulty to delete a topic because of its syntax

2016-10-06 Thread Yuto KAWAMURA
I guess this topic was created by Kafka Streams.
Kafka Streams has its own topic creation (ZooKeeper node creation)
implementation and does not use core's AdminUtils to create internal
topics such as XX-changelog:
https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java#L208
AdminUtils has topic name
validation (https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/common/Topic.scala#L33-L47)
logic, but from a brief read of the code I don't see a similar check in
the Kafka Streams version.
Since the topic name is derived from the store name, which is an
arbitrary user-supplied string, this can happen if you give a string
that contains whitespace as the name of a state store.
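
Until Streams itself validates store names, a caller-side guard is one
option. A minimal sketch (not anything Kafka ships) that mirrors the
legal-character rule from the Topic.scala link above:

import java.util.regex.Pattern;

public final class StoreNames {
    // ASCII letters, digits, '.', '_' and '-', as in kafka.common.Topic
    private static final Pattern LEGAL = Pattern.compile("[a-zA-Z0-9\\._\\-]+");

    public static String requireValid(String storeName) {
        if (!LEGAL.matcher(storeName).matches()) {
            throw new IllegalArgumentException("Illegal state store name: '" + storeName + "'");
        }
        return storeName;
    }

    private StoreNames() { }
}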


2016-10-06 18:40 GMT+09:00 Rajini Sivaram :
> Hamza,
>
> Can you raise a JIRA with details on how the topic was created by Kafka
> with an invalid name? Sounds like there might be a missing validation
> somewhere.
>
> Regards,
>
> Rajini
>
> On Thu, Oct 6, 2016 at 10:12 AM, Hamza HACHANI 
> wrote:
>
>> Thanks Todd,
>>
>>
>> I've resolved it by suing what you told me.
>>
>> Thanks very much. But i think that there is a problem with kafka by
>> letting the saving names of topic and logs where there is a space as i
>> showes in the images.
>>
>> Have a good day to you all.
>>
>>
>> Hamza
>>
>> 
>> De : Hamza HACHANI 
>> Envoyé : mercredi 5 octobre 2016 19:23:00
>> À : users@kafka.apache.org
>> Objet : RE: difficulty to delete a topic because of its syntax
>>
>>
>> Hi,
>>
>> Attached the files showing what i'm talking about.
>>
>>
>> Hamza
>>
>> 
>> De : Todd S 
>> Envoyé : mercredi 5 octobre 2016 07:25:48
>> À : users@kafka.apache.org
>> Objet : Re: difficulty to delete a topic because of its syntax
>>
>> You *could* go in to zookeeper and nuke the topic, then delete the files on
>> disk
>>
>> Slightly more risky but it should work
>>
>> On Wednesday, 5 October 2016, Manikumar  wrote:
>>
>> > Kafka doesn't support white spaces in topic names.  Only support '.', '_'
>> > and '-' these are allowed.
>> > Not sure how you got white space in topic name.
>> >
>> > On Wed, Oct 5, 2016 at 8:19 PM, Hamza HACHANI > > >
>> > wrote:
>> >
>> > > Well ackwardly when i list the topics i find it but when i do delete it
>> > it
>> > > says that this topic does not exist.
>> > >
>> > > 
>> > > De : Ben Davison >
>> > > Envoyé : mercredi 5 octobre 2016 02:37:14
>> > > À : users@kafka.apache.org 
>> > > Objet : Re: difficulty to delete a topic because of its syntax
>> > >
>> > > Try putting "" or '' around the string when running the command.
>> > >
>> > > On Wed, Oct 5, 2016 at 3:29 PM, Hamza HACHANI > > >
>> > > wrote:
>> > >
>> > > > It's between "the" and "metric"
>> > > >
>> > > > 
>> > > > De : Ali Akhtar >
>> > > > Envoyé : mercredi 5 octobre 2016 02:16:33
>> > > > À : users@kafka.apache.org 
>> > > > Objet : Re: difficulty to delete a topic because of its syntax
>> > > >
>> > > > I don't see a space in that topic name
>> > > >
>> > > > On Wed, Oct 5, 2016 at 6:42 PM, Hamza HACHANI <
>> hamza.hach...@supcom.tn
>> > >
>> > > > wrote:
>> > > >
>> > > > > Hi,
>> > > > >
>> > > > > I created a topic called device-connection-invert-key-value-the
>> > > > > metric-changelog.
>> > > > >
>> > > > > I insit that there is a space in it.
>> > > > >
>> > > > >
>> > > > >
>> > > > > Now that i want to delete it because my  cluster can no longer work
>> > > > > correctly i can't do it as it  only reads the first part of it : (
>> > > > > device-connection-invert-key-value-the) which obviously it doesn't
>> > > find.
>> > > > >
>> > > > > Does some body have a wolution to delete it ?
>> > > > >
>> > > > > Thanks in advance.
>> > > > >
>> > > > >
>> > > > > Hamza
>> > > > >
>> > > > >
>> > > >
>> > >
>> > > --
>> > >
>> > >
>> > > This email, including attachments, is private and confidential. If you
>> > have
>> > > received this email in error please notify the sender and delete it
>> from
>> > > your system. Emails are not secure and may contain viruses. No
>> liability
>> > > can be accepted for viruses that might be transferred by this email or
>> > any
>> > > attachment. Any unauthorised copying of this message or unauthorised
>> > > distribution and publication of the information contained herein are
>> > > prohibited.
>> > >
>> > > 7digital Limited. Registered office: 69 Wilson Street, London EC2A 2BB.
>> > > Registered in England and Wales. Registered No. 04843573.
>> > >
>> >
>>
>
>
>
> --
> Regards,
>
> Rajini


Re: difficulty to delete a topic because of its syntax

2016-10-06 Thread Rajini Sivaram
Hamza,

Can you raise a JIRA with details on how the topic was created by Kafka
with an invalid name? Sounds like there might be a missing validation
somewhere.

Regards,

Rajini

On Thu, Oct 6, 2016 at 10:12 AM, Hamza HACHANI 
wrote:

> Thanks Todd,
>
>
> I've resolved it by suing what you told me.
>
> Thanks very much. But i think that there is a problem with kafka by
> letting the saving names of topic and logs where there is a space as i
> showes in the images.
>
> Have a good day to you all.
>
>
> Hamza
>
> 
> De : Hamza HACHANI 
> Envoyé : mercredi 5 octobre 2016 19:23:00
> À : users@kafka.apache.org
> Objet : RE: difficulty to delete a topic because of its syntax
>
>
> Hi,
>
> Attached the files showing what i'm talking about.
>
>
> Hamza
>
> 
> De : Todd S 
> Envoyé : mercredi 5 octobre 2016 07:25:48
> À : users@kafka.apache.org
> Objet : Re: difficulty to delete a topic because of its syntax
>
> You *could* go in to zookeeper and nuke the topic, then delete the files on
> disk
>
> Slightly more risky but it should work
>
> On Wednesday, 5 October 2016, Manikumar  wrote:
>
> > Kafka doesn't support white spaces in topic names.  Only support '.', '_'
> > and '-' these are allowed.
> > Not sure how you got white space in topic name.
> >
> > On Wed, Oct 5, 2016 at 8:19 PM, Hamza HACHANI  > >
> > wrote:
> >
> > > Well ackwardly when i list the topics i find it but when i do delete it
> > it
> > > says that this topic does not exist.
> > >
> > > 
> > > De : Ben Davison >
> > > Envoyé : mercredi 5 octobre 2016 02:37:14
> > > À : users@kafka.apache.org 
> > > Objet : Re: difficulty to delete a topic because of its syntax
> > >
> > > Try putting "" or '' around the string when running the command.
> > >
> > > On Wed, Oct 5, 2016 at 3:29 PM, Hamza HACHANI  > >
> > > wrote:
> > >
> > > > It's between "the" and "metric"
> > > >
> > > > 
> > > > De : Ali Akhtar >
> > > > Envoyé : mercredi 5 octobre 2016 02:16:33
> > > > À : users@kafka.apache.org 
> > > > Objet : Re: difficulty to delete a topic because of its syntax
> > > >
> > > > I don't see a space in that topic name
> > > >
> > > > On Wed, Oct 5, 2016 at 6:42 PM, Hamza HACHANI <
> hamza.hach...@supcom.tn
> > >
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I created a topic called device-connection-invert-key-value-the
> > > > > metric-changelog.
> > > > >
> > > > > I insit that there is a space in it.
> > > > >
> > > > >
> > > > >
> > > > > Now that i want to delete it because my  cluster can no longer work
> > > > > correctly i can't do it as it  only reads the first part of it : (
> > > > > device-connection-invert-key-value-the) which obviously it doesn't
> > > find.
> > > > >
> > > > > Does some body have a wolution to delete it ?
> > > > >
> > > > > Thanks in advance.
> > > > >
> > > > >
> > > > > Hamza
> > > > >
> > > > >
> > > >
> > >
> > > --
> > >
> > >
> > > This email, including attachments, is private and confidential. If you
> > have
> > > received this email in error please notify the sender and delete it
> from
> > > your system. Emails are not secure and may contain viruses. No
> liability
> > > can be accepted for viruses that might be transferred by this email or
> > any
> > > attachment. Any unauthorised copying of this message or unauthorised
> > > distribution and publication of the information contained herein are
> > > prohibited.
> > >
> > > 7digital Limited. Registered office: 69 Wilson Street, London EC2A 2BB.
> > > Registered in England and Wales. Registered No. 04843573.
> > >
> >
>



-- 
Regards,

Rajini


RE: difficulty to delete a topic because of its syntax

2016-10-06 Thread Hamza HACHANI
Thanks Todd,


I've resolved it by using what you told me.
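
For reference, the ZooKeeper-level cleanup Todd described looks roughly like
the sketch below; the connect string, timeouts and znode paths are
assumptions for a default setup, and the matching partition directories under
each broker's log.dirs still have to be removed by hand afterwards:

import org.I0Itec.zkclient.ZkClient;

public class DeleteBadTopicZnodes {
    public static void main(String[] args) {
        // the embedded space is accepted at the ZooKeeper level, even though the CLI chokes on it
        String topic = "device-connection-invert-key-value-the metric-changelog";
        ZkClient zk = new ZkClient("localhost:2181", 10000, 10000);
        try {
            zk.deleteRecursive("/brokers/topics/" + topic);
            zk.deleteRecursive("/config/topics/" + topic);
            zk.deleteRecursive("/admin/delete_topics/" + topic);  // in case a delete was already queued
        } finally {
            zk.close();
        }
    }
}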

Thanks very much. But I think there is a problem with Kafka in that it lets
topic and log names containing a space be saved at all, as I showed in the images.

Have a good day to you all.


Hamza


De : Hamza HACHANI 
Envoyé : mercredi 5 octobre 2016 19:23:00
À : users@kafka.apache.org
Objet : RE: difficulty to delete a topic because of its syntax


Hi,

Attached the files showing what i'm talking about.


Hamza


De : Todd S 
Envoyé : mercredi 5 octobre 2016 07:25:48
À : users@kafka.apache.org
Objet : Re: difficulty to delete a topic because of its syntax

You *could* go in to zookeeper and nuke the topic, then delete the files on
disk

Slightly more risky but it should work

On Wednesday, 5 October 2016, Manikumar  wrote:

> Kafka doesn't support white spaces in topic names.  Only support '.', '_'
> and '-' these are allowed.
> Not sure how you got white space in topic name.
>
> On Wed, Oct 5, 2016 at 8:19 PM, Hamza HACHANI  >
> wrote:
>
> > Well ackwardly when i list the topics i find it but when i do delete it
> it
> > says that this topic does not exist.
> >
> > 
> > De : Ben Davison >
> > Envoyé : mercredi 5 octobre 2016 02:37:14
> > À : users@kafka.apache.org 
> > Objet : Re: difficulty to delete a topic because of its syntax
> >
> > Try putting "" or '' around the string when running the command.
> >
> > On Wed, Oct 5, 2016 at 3:29 PM, Hamza HACHANI  >
> > wrote:
> >
> > > It's between "the" and "metric"
> > >
> > > 
> > > De : Ali Akhtar >
> > > Envoyé : mercredi 5 octobre 2016 02:16:33
> > > À : users@kafka.apache.org 
> > > Objet : Re: difficulty to delete a topic because of its syntax
> > >
> > > I don't see a space in that topic name
> > >
> > > On Wed, Oct 5, 2016 at 6:42 PM, Hamza HACHANI  >
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I created a topic called device-connection-invert-key-value-the
> > > > metric-changelog.
> > > >
> > > > I insit that there is a space in it.
> > > >
> > > >
> > > >
> > > > Now that i want to delete it because my  cluster can no longer work
> > > > correctly i can't do it as it  only reads the first part of it : (
> > > > device-connection-invert-key-value-the) which obviously it doesn't
> > find.
> > > >
> > > > Does some body have a wolution to delete it ?
> > > >
> > > > Thanks in advance.
> > > >
> > > >
> > > > Hamza
> > > >
> > > >
> > >
> >
> > --
> >
> >
> > This email, including attachments, is private and confidential. If you
> have
> > received this email in error please notify the sender and delete it from
> > your system. Emails are not secure and may contain viruses. No liability
> > can be accepted for viruses that might be transferred by this email or
> any
> > attachment. Any unauthorised copying of this message or unauthorised
> > distribution and publication of the information contained herein are
> > prohibited.
> >
> > 7digital Limited. Registered office: 69 Wilson Street, London EC2A 2BB.
> > Registered in England and Wales. Registered No. 04843573.
> >
>


RE: difficulty to delete a topic because of its syntax

2016-10-06 Thread Hamza HACHANI
Hi,

Attached the files showing what i'm talking about.


Hamza


De : Todd S 
Envoyé : mercredi 5 octobre 2016 07:25:48
À : users@kafka.apache.org
Objet : Re: difficulty to delete a topic because of its syntax

You *could* go in to zookeeper and nuke the topic, then delete the files on
disk

Slightly more risky but it should work

On Wednesday, 5 October 2016, Manikumar  wrote:

> Kafka doesn't support white spaces in topic names.  Only support '.', '_'
> and '-' these are allowed.
> Not sure how you got white space in topic name.
>
> On Wed, Oct 5, 2016 at 8:19 PM, Hamza HACHANI  >
> wrote:
>
> > Well ackwardly when i list the topics i find it but when i do delete it
> it
> > says that this topic does not exist.
> >
> > 
> > De : Ben Davison >
> > Envoyé : mercredi 5 octobre 2016 02:37:14
> > À : users@kafka.apache.org 
> > Objet : Re: difficulty to delete a topic because of its syntax
> >
> > Try putting "" or '' around the string when running the command.
> >
> > On Wed, Oct 5, 2016 at 3:29 PM, Hamza HACHANI  >
> > wrote:
> >
> > > It's between "the" and "metric"
> > >
> > > 
> > > De : Ali Akhtar >
> > > Envoyé : mercredi 5 octobre 2016 02:16:33
> > > À : users@kafka.apache.org 
> > > Objet : Re: difficulty to delete a topic because of its syntax
> > >
> > > I don't see a space in that topic name
> > >
> > > On Wed, Oct 5, 2016 at 6:42 PM, Hamza HACHANI  >
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I created a topic called device-connection-invert-key-value-the
> > > > metric-changelog.
> > > >
> > > > I insit that there is a space in it.
> > > >
> > > >
> > > >
> > > > Now that i want to delete it because my  cluster can no longer work
> > > > correctly i can't do it as it  only reads the first part of it : (
> > > > device-connection-invert-key-value-the) which obviously it doesn't
> > find.
> > > >
> > > > Does some body have a wolution to delete it ?
> > > >
> > > > Thanks in advance.
> > > >
> > > >
> > > > Hamza
> > > >
> > > >
> > >
> >
> > --
> >
> >
> > This email, including attachments, is private and confidential. If you
> have
> > received this email in error please notify the sender and delete it from
> > your system. Emails are not secure and may contain viruses. No liability
> > can be accepted for viruses that might be transferred by this email or
> any
> > attachment. Any unauthorised copying of this message or unauthorised
> > distribution and publication of the information contained herein are
> > prohibited.
> >
> > 7digital Limited. Registered office: 69 Wilson Street, London EC2A 2BB.
> > Registered in England and Wales. Registered No. 04843573.
> >
>