session.timeout.ms was supplied but isn't a known config

2016-06-12 Thread Barry Kaplan
In connect-distributed.properties I have values like:

session.timeout.ms=12


But at startup I get warnings like:

The configuration session.timeout.ms = 12 was supplied but isn't a known config.


I'm getting these warnings for most settings (e.g., internal.key.converter,
key.converter), and I can't tell from the docs whether most of these
configs are really needed.


Re: Any restrictions on consumer group name?

2016-06-12 Thread Jaikiran Pai

Adding the Kafka dev list to cc, hoping they would answer this question.

-Jaikiran
On Friday 10 June 2016 11:18 AM, Jaikiran Pai wrote:
We are using 0.9.0.1 of Kafka server and (Java) clients. Our (Java) 
consumers are assigned to dynamic runtime generated groups i.e. the 
consumer group name is generated dynamically at runtime, using some 
application specific logic. I have been looking at the docs but
haven't yet found anything that says whether there is any restriction on
the length and/or characters that make up the consumer group name. Can
anyone confirm or point me to a doc which states whether or not there
are any restrictions on it?



-Jaikiran




Re: error: ... protocols are incompatible with those of existing members ??

2016-06-12 Thread Barry Kaplan
Thanks Gwen, that was exactly my problem. I had changed the value of the
group.id to the same value as the config name. But rereading the docs I did
not find any hint that this would occur. Did I miss it somewhere?

-barry


Storing results of a stream

2016-06-12 Thread Kanagha
Hi,

I'm building a topology where I am connecting Twitter with another
application (e.g., appA).

appA also consists of the following graph model (similar to
facebook/twitter) where a user can have followers and follow other users.

Ex: UserA follows UserB, UserC, UserD.

And UserB/C/D can have any number of followers.
This information is currently stored in an Oracle table.

I am retrieving the corresponding twitter ids for users B,C and D and
retrieving the latest (n) tweets posted by them.

1) I have a Kafka Spout where I am streaming the tweets for a specific set
of userIds.
2) I do a join of the Kafka Spout with the records in oracle table in
another Bolt, so that each tweet would be joined with all the users who
follow the user who posted the particular tweet.
3) After the join, I'll be using a RollingCountBolt to capture the
latest n tweets posted by all the followers for a given user.
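
The three steps above can be sketched outside of Storm as a plain fan-out
join. This is only an illustration under stated assumptions, not Storm or
RollingCountBolt code: the function names, the (follower, followed) edge
format and the (tweet_id, author) tuples are all hypothetical stand-ins for
the Oracle table and the Kafka Spout output.

```python
from collections import defaultdict, deque


def build_follower_index(follow_edges):
    """Map each followed user to the set of users who follow them.

    follow_edges is an iterable of (follower, followed) pairs, standing in
    for the rows of the follow table (hypothetical format).
    """
    followers_of = defaultdict(set)
    for follower, followed in follow_edges:
        followers_of[followed].add(follower)
    return followers_of


def latest_tweets_per_follower(tweets, followers_of, n):
    """Fan each tweet out to the author's followers, keeping the newest n ids.

    tweets is an iterable of (tweet_id, author) pairs in arrival order,
    standing in for the stream coming from the spout.
    """
    # A deque with maxlen=n silently drops the oldest id once it is full.
    latest = defaultdict(lambda: deque(maxlen=n))
    for tweet_id, author in tweets:
        for follower in followers_of.get(author, ()):
            latest[follower].append(tweet_id)
    return {follower: list(ids) for follower, ids in latest.items()}
```

Only tweet ids are fanned out here, so the per-follower state stays small
even when one author has many followers.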


My question is: what is the best way to store the results of
RollingCountBolt while avoiding duplication?
I can use a Redis instance to capture the information.
But if, say, userA is followed by 100 users, a tweet posted by userA will
be duplicated 100 times.

To avoid duplication I can just store tweetIds in the outputField of
RollingCountBolt and tweets can be stored in a separate table. But since
tweets are streaming, each record must be associated with an expiration
period while being stored (similar to cache).
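
A minimal in-memory sketch of the idea in the paragraph above: each tweet
body is stored once with an expiration time, and each follower only keeps
a list of tweet ids. The class and method names are hypothetical, and the
dictionaries merely stand in for an external store; Redis, for example,
supports per-key expiration, which could take the place of the manual
expiry check here.

```python
import time


class TweetStore:
    """Stores each tweet body once; followers hold only tweet ids."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self.tweets = {}            # tweet_id -> (text, expires_at)
        self.timelines = {}         # follower_id -> [tweet_id, ...], newest last

    def add(self, tweet_id, text, follower_ids):
        """Save the tweet body once and reference it from every follower."""
        self.tweets[tweet_id] = (text, self.clock() + self.ttl_seconds)
        for follower_id in follower_ids:
            self.timelines.setdefault(follower_id, []).append(tweet_id)

    def purge_expired(self):
        """Drop tweet bodies whose expiration time has passed."""
        now = self.clock()
        expired = [t for t, (_, exp) in self.tweets.items() if exp <= now]
        for tweet_id in expired:
            del self.tweets[tweet_id]

    def latest(self, follower_id, n):
        """Return the texts of the newest n unexpired tweets for a follower."""
        self.purge_expired()
        live = [t for t in self.timelines.get(follower_id, []) if t in self.tweets]
        self.timelines[follower_id] = live
        return [self.tweets[t][0] for t in live[-n:]]
```

With this layout a tweet from a user with 100 followers costs one stored
body plus 100 short ids, rather than 100 copies of the text.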

How are such scenarios usually dealt with in a streaming
application? Suggestions would be helpful.

Thanks
Kanagha


Re: Introducing Dory

2016-06-12 Thread Arya Ketan
Hi Dave,
Dory looks pretty interesting. I have a few further questions about it:
a) How does Dory handle kernel panics?
b) What kind of message guarantees does Dory provide? Could you also
share some of the design decisions taken to enable those guarantees?

Thanks
Arya


On Sun, Jun 12, 2016 at 8:10 PM, Dave Peterson wrote:

> Thanks!  Enjoy :-)
>
>
>
> On 6/12/2016 12:24 AM, Gwen Shapira wrote:
>
>> Dory is pretty cool (even though it is named after a somewhat dorky
>> fish). Thank you for sharing :)
>>
>> On Sun, Jun 12, 2016 at 1:24 AM, Dave Peterson wrote:
>>
>>> Hello Kafka users,
>>>
>>> Version 1.1.0 of Dory is now available.  See
>>> https://github.com/dspeterson/dory for details.  Dory is the successor
>>> to Bruce (https://github.com/tagged/bruce), a Kafka producer daemon I
>>> created while working at if(we) (http://www.ifwe.co/).  The code has
>>> seen a number of improvements since its initial release in September
>>> 2014.  The list of example clients for various programming languages
>>> has also been extended.  Dory maintains full backward compatibility
>>> with Bruce, so existing users can easily switch.
>>>
>>> The latest release adds support for receiving messages from clients by
>>> UNIX domain stream socket or local TCP.  Although UNIX domain
>>> datagrams are still the preferred means of sending messages in most
>>> cases, the option of using stream sockets facilitates sending messages
>>> too large to fit in a single datagram.  The local TCP option
>>> facilitates adding support for clients written in programming
>>> languages that do not provide easy access to UNIX domain sockets.
>>>
>>> Dory's wiki page http://dory.wikidot.com/start contains a list of
>>> ideas for additional features and other improvements.  Community
>>> feedback is welcomed and appreciated.  If you have ideas for things
>>> you would like to see in future releases, please add them to the list.
>>> Also, please contribute code if you can afford the time.
>>>
>>>
>>> Thanks,
>>> Dave Peterson
>>>
>>>
>>>
>


Re: Introducing Dory

2016-06-12 Thread Dave Peterson

Thanks!  Enjoy :-)


Kafka controller replica state docs outdated?

2016-06-12 Thread Stevo Slavić
Hello Apache Kafka community,

Is it intentional that not all states (like ReplicaDeletionIneligible) are
documented on
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Controller+Internals
or is the page just outdated?

Btw I see replica states documented in javadoc
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ReplicaStateMachine.scala#L30

Kind regards,
Stevo Slavic.


Re: Introducing Dory

2016-06-12 Thread Gwen Shapira
Dory is pretty cool (even though it is named after a somewhat dorky
fish). Thank you for sharing :)
