Re: [Streams] TimeWindows ignores gracePeriodMs in windowsFor(timestamp)

2019-04-30 Thread John Roesler
Hi Ashok,

I think some people may be able to give you advice, but please start a new
thread instead of replying to an existing message. This just helps keep all
the messages organized.

Thanks!
-John

On Thu, Apr 25, 2019 at 6:12 AM ASHOK MACHERLA  wrote:

> Hi,
>
> What I am asking:
>
> I want to know about Kafka partitions.
>
> We are getting about 200GB+ of data per day from our sources into Kafka.
>
> I need to know how many partitions are required to pull the data from the
> source without a pileup.
>
> Please suggest how we can fix this issue.
>
> Is there any mathematical rule for choosing a specific number of
> partitions for a topic?
>
> Please help me.
>
> Sent from Outlook
> 
> From: Jose Lopez 
> Sent: 25 April 2019 16:34
> To: users@kafka.apache.org
> Subject: [Streams] TimeWindows ignores gracePeriodMs in
> windowsFor(timestamp)
>
> Hi all,
>
> Given that gracePeriodMs is "the time to admit late-arriving events after
> the end of the window", I'd expect it to be taken into account in
> windowsFor(timestamp). E.g.:
>
> sizeMs = 5
> gracePeriodMs = 2
> advanceMs = 3
> timestamp = 6
>
> | window | windowStart | windowEnd | windowEnd + gracePeriod |
> | 1      | 0           | 5         | 7                       |
> | 2      | 5           | 10        | 12                      |
> ...
>
> Current output:
> windowsFor(timestamp) returns window 2 only.
>
> Expected output:
> windowsFor(timestamp) returns both window 1 and window 2
>
> Do you agree with the expected output? Am I missing something?
>
> Regards,
> Jose
>


Re: [Streams] TimeWindows ignores gracePeriodMs in windowsFor(timestamp)

2019-04-30 Thread John Roesler
Hey, Jose,

This is an interesting thought that I hadn't considered before. I think
(tentatively) that windowsFor should *not* take the grace period into
account.

What I'm thinking is that the method is supposed to return "all windows
that contain the provided timestamp". When we keep window1 open until
stream time 7, it's because we're waiting to see if some record with a
timestamp in the range [0,5) arrives before the overall stream time ticks
past 7. But if/when we get that event, its own timestamp is still in the
range [0,5). For example, its timestamp is *not* 6 (because then it would
belong in window2, not window1). Thus, window1 does not "contain" the
timestamp 6, and therefore, windowsFor(6) is not required to return window 1.
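
A quick way to see the current behavior (a minimal sketch, not from the
Kafka codebase; it assumes the Duration-based overloads available since
Kafka Streams 2.1 and uses tumbling 5 ms windows, i.e. advance == size,
for simplicity):

    import java.time.Duration;
    import org.apache.kafka.streams.kstream.TimeWindows;

    public class WindowsForCheck {
        public static void main(String[] args) {
            // Tumbling 5 ms windows with a 2 ms grace period.
            TimeWindows windows = TimeWindows.of(Duration.ofMillis(5))
                                             .grace(Duration.ofMillis(2));

            // windowsFor(t) returns only the windows whose [start, end) interval
            // contains t, keyed by window start; grace plays no part in the lookup.
            windows.windowsFor(6L).forEach((start, window) ->
                System.out.printf("timestamp 6 falls in window [%d, %d)%n",
                                  window.start(), window.end()));

            // Prints only [5, 10): window [0, 5) is still open for late records
            // until stream time 7, but it does not *contain* the timestamp 6.
        }
    }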

Does that seem right to you?
-John

On Thu, Apr 25, 2019 at 6:04 AM Jose Lopez  wrote:

> Hi all,
>
> Given that gracePeriodMs is "the time to admit late-arriving events after
> the end of the window", I'd expect it to be taken into account in
> windowsFor(timestamp). E.g.:
>
> sizeMs = 5
> gracePeriodMs = 2
> advanceMs = 3
> timestamp = 6
>
> | window | windowStart | windowEnd | windowEnd + gracePeriod |
> | 1      | 0           | 5         | 7                       |
> | 2      | 5           | 10        | 12                      |
> ...
>
> Current output:
> windowsFor(timestamp) returns window 2 only.
>
> Expected output:
> windowsFor(timestamp) returns both window 1 and window 2
>
> Do you agree with the expected output? Am I missing something?
>
> Regards,
> Jose
>


My IDE, which is sitting outside of the kube cluster, creates a producer that attempts to connect to Kafka using the cluster DNS name of the headless service

2019-04-30 Thread Janitha Jayaweera
Hi!

The Kafka chart being used is here 
https://github.com/bitnami/charts/tree/master/bitnami/kafka

| chart version | app version |
| kafka-1.10.1  | 2.2.0       |


$ kubectl get svc -n kafka

NAME                               TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
kafka-janitha                      ClusterIP   10.103.152.8     <none>        9092/TCP                     5d19h
kafka-janitha-headless             ClusterIP   None             <none>        9092/TCP                     5d19h
kafka-janitha-zookeeper            ClusterIP   10.108.191.161   <none>        2181/TCP,2888/TCP,3888/TCP   5d19h
kafka-janitha-zookeeper-headless   ClusterIP   None             <none>        2181/TCP,2888/TCP,3888/TCP   5d19h

I am then port-forwarding like so

kubectl port-forward --namespace postgresql svc/postgres-janitha-postgresql 5432:5432 --address 0.0.0.0 &
kubectl port-forward --namespace kafka svc/kafka-janitha 9092:9092 --address 0.0.0.0 &
kubectl port-forward --namespace activemq svc/activemq-tideworks-activemq 61616:61616 --address 0.0.0.0 &

I have my application properties set like so, where the IP is the private IP
of the (master) node. My IDE is running outside the cluster.

tc.db.jdbc.url=jdbc:postgresql://172.31.31.192:5432/tcb_configuration
#
activemq.base.url=tcp://172.31.31.192:61616?wireFormat.maxInactivityDuration=-1
#
kafka.bootstrap.servers=172.31.31.192:9092

The application log states the following

17:36:45.617 [kafka-producer-network-thread | producer-3] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-3] Give up sending metadata request since no node is available
17:36:45.618 [kafka-producer-network-thread | producer-5] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-5] Give up sending metadata request since no node is available
17:36:45.667 [kafka-producer-network-thread | producer-3] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-3] Give up sending metadata request since no node is available
17:36:45.668 [kafka-producer-network-thread | producer-5] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-5] Give up sending metadata request since no node is available
17:36:45.717 [kafka-producer-network-thread | producer-3] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-3] Give up sending metadata request since no node is available
17:36:45.718 [kafka-producer-network-thread | producer-5] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-5] Give up sending metadata request since no node is available
17:36:45.768 [kafka-producer-network-thread | producer-3] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-3] Give up sending metadata request since no node is available
17:36:45.768 [kafka-producer-network-thread | producer-5] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-5] Give up sending metadata request since no node is available
17:36:45.818 [kafka-producer-network-thread | producer-3] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-3] Give up sending metadata request since no node is available
17:36:45.818 [kafka-producer-network-thread | producer-5] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-5] Give up sending metadata request since no node is available
17:36:45.868 [kafka-producer-network-thread | producer-3] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-3] Give up sending metadata request since no node is available
17:36:45.869 [kafka-producer-network-thread | producer-5] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-5] Give up sending metadata request since no node is available
17:36:45.918 [kafka-producer-network-thread | producer-3] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-3] Give up sending metadata request since no node is available
17:36:45.919 [kafka-producer-network-thread | producer-5] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-5] Give up sending metadata request since no node is available
17:36:45.969 [kafka-producer-network-thread | producer-3] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-3] Initialize connection to node kafka-janitha-0.kafka-janitha-headless.kafka.svc.cluster.local:9092 (id: 1001 rack: null) for sending metadata request
17:36:45.969 [kafka-producer-network-thread | producer-3] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-3] Initiating connection to node kafka-janitha-0.kafka-janitha-headless.kafka.svc.cluster.local:9092 (id: 1001 rack: null)
17:36:45.969 [kafka-producer-network-thread | producer-5] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-5] Give up sending metadata request since no node is available
17:36:45.969 [kafka-producer-network-
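
For context, this is how the client side works: bootstrap.servers (here the
port-forwarded 172.31.31.192:9092) is only used for the initial metadata
request; the metadata response identifies each broker by its advertised
listener, which in this deployment is the in-cluster name
kafka-janitha-0.kafka-janitha-headless.kafka.svc.cluster.local:9092 shown in
the log above, and every subsequent connection goes to that address. A
minimal producer sketch illustrating this (the topic name and serializers
are hypothetical, not taken from the original application):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class PortForwardProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Only the first metadata request goes to this address (the port-forward).
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "172.31.31.192:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // The produce connection is made to the broker's advertised listener
                // (the headless-service DNS name above), not to the bootstrap address,
                // which is why the send never completes from outside the cluster.
                producer.send(new ProducerRecord<>("test-topic", "key", "value"));
                producer.flush();
            }
        }
    }

So the fix is generally either to advertise a listener that is resolvable and
reachable from outside the cluster, or to make the in-cluster DNS names
resolve (and route) on the client machine; port-forwarding the ClusterIP
service alone is not enough.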

Re: KSQL Question

2019-04-30 Thread Vahid Hashemian
Hi Shalom,

This is the GitHub repo for KSQL: https://github.com/confluentinc/ksql
However, in order to get that running you have to download a few libraries
that KSQL depends on, and you'll need Kafka.
For the sake of experimentation, you are probably better off using the
all-in-one Confluent Platform. It will save you some time.

I hope that helps.
--Vahid

On Tue, Apr 30, 2019 at 7:38 AM shalom sagges 
wrote:

> Hi All,
>
> I'm new to Kafka and wanted to experiment with KSQL.
> However, I can't find a place to download it without downloading the entire
> Confluent Platform package.
> Please correct me if I'm wrong, but I understand that KSQL is free, so
> there must be a place where I can download only KSQL.
> Can someone please assist me with this?
>
> Thanks a lot! :)
>


-- 

Thanks!
--Vahid


KSQL Question

2019-04-30 Thread shalom sagges
Hi All,

I'm new to Kafka and wanted to experiment with KSQL.
However, I can't find a place to download it without downloading the entire
Confluent Platform package.
Please correct me if I'm wrong, but I understand that KSQL is free, so
there must be a place where I can download only KSQL.
Can someone please assist me with this?

Thanks a lot! :)


RE: Required guidelines for kafka upgrade

2019-04-30 Thread ASHOK MACHERLA
Dear Team Members



Currently we are using Kafka 0.10.1 and Zookeeper 3.4.5, and we are planning
to upgrade to Kafka 2.2.0.

We have a 3-node Kafka cluster (3 Zookeeper nodes, 3 Kafka nodes).

As suggested by you, can I follow these steps?

  1.  First, download the new Kafka 2.2.0 tar file and untar it. In the config
      directory, in server.properties, add the two new parameters below:

      inter.broker.protocol.version=0.10.1
      log.message.format.version=0.10.1

  2.  After that, stop the old Kafka broker (0.10.1) and start the new Kafka
      broker (2.2.0) on one node. At this point the 1st node is running the
      latest version and the remaining two nodes are running the old version.
      Then stop the old Kafka on the second node and start the new Kafka
      broker (2.2.0) there, and do the same on the remaining third node.

      Now all three Kafka nodes are running the new version 2.2.0.

  3.  After that, in server.properties, update the parameter below:

      inter.broker.protocol.version=2.2.0

      Do the same on the remaining nodes, and then do a rolling restart of all
      three nodes. Are all three nodes then fully upgraded to the new version?

  4.  What should I set for the parameter below? Right now we have not set
      this parameter in server.properties. Can I add it?

      ssl.endpoint.identification.algorithm

      What value should I use for this parameter?

Please tell me if I am missing anything or if any extra steps are required.

Please suggest.







Sent from Mail for Windows 10




From: SenthilKumar K 
Sent: Friday, April 26, 2019 7:49:33 PM
To: users@kafka.apache.org
Cc: Senthil kumar
Subject: Re: Required guidelines for kafka upgrade

Answers inline.

First, what do we upgrade: Zookeeper or Kafka?
 - I can't really comment on the ZK upgrade. Recently I upgraded only a
10-node Kafka cluster in production.
   Some key notes:
1) You need to set inter.broker.protocol.version = 1.1.0 (your current
broker version).
2) Start upgrading your brokers one by one and verify the overall
functionality of your cluster. Don't forget to verify the clients too.
3) After completing the upgrade, set the version to 2.2.0 (if you are
upgrading to the latest version) and restart the brokers one by one.
Please be aware you can't downgrade after this step.
4) Please keep in mind: the default value for
ssl.endpoint.identification.algorithm was changed to https, so you need to
set ssl.endpoint.identification.algorithm to an empty value to make it
behave like the previous broker version (see the sketch below).
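
As a hypothetical illustration of note 4 (it only matters if SSL/TLS
listeners are in use; on a plaintext-only cluster the property can simply be
left unset), the same key/value pair can go into server.properties or into a
client configuration, e.g.:

    import java.util.Properties;
    import org.apache.kafka.clients.CommonClientConfigs;
    import org.apache.kafka.common.config.SslConfigs;

    public class SslHostnameVerificationConfig {
        public static Properties sslClientProps() {
            Properties props = new Properties();
            props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SSL");
            // Since Kafka 2.0 the default is "https", i.e. hostname verification is on.
            // An empty string restores the pre-2.0 behaviour (no hostname verification).
            props.put(SslConfigs.SSL_ENDPOINT_IDENTIFICATION_ALGORITHM_CONFIG, "");
            return props;
        }
    }

In server.properties the equivalent is a line with an empty value, i.e.
"ssl.endpoint.identification.algorithm=".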

Go through the documentation and you will find more information (1.5
Upgrading From Previous Versions): https://kafka.apache.org/documentation/#upgrade

During the upgrade, can we point to the old logs directory?
 - Not sure I understand this question. In general, you don't need to
change any Kafka log directory for a broker upgrade.

--Senthil

On Fri, Apr 26, 2019 at 6:28 PM M. Manna  wrote:

> Ashok,
>
> The guideline is well-explained, so please follow that.
>
> https://kafka.apache.org/documentation/#upgrade
>
> The process works, so try and follow what it recommends.
>
> Thanks,
>
> On Fri, 26 Apr 2019 at 12:36, ASHOK MACHERLA  wrote:
>
> > Dear Senthil,
> >
> > That I know. Could you please explain in detail? Are there any
> > documents or step-by-step guides?
> >
> > First, what do we upgrade: Zookeeper or Kafka?
> > During the upgrade, can we point to the old logs directory?
> >
> > Could you please explain it like that.
> >
> > Sent from Outlook
> > 
> > From: SenthilKumar K 
> > Sent: 26 April 2019 16:03
> > To: users@kafka.apache.org
> > Subject: Re: Required guidelines for kafka upgrade
> >
> > Hi, you can refer to the official documentation to upgrade a Kafka cluster.
> > Section: 1.5 Upgrading From Previous Versions
> >
> > Last week we did a broker upgrade from 1.1.0 to 2.2.0. I think the current
> > stable version is 2.2.0.
> >
> > --Senthil
> >
> > On Fri, Apr 26, 2019, 3:54 PM ASHOK MACHERLA 
> wrote:
> >
> > > Dear Team,
> > >
> > > Right now we are using Kafka 0.10.1 and Zookeeper 3.4.6, and we are
> > > planning to upgrade to a new Kafka & Zookeeper version.
> > >
> > > Please suggest: what are the current stable versions of Kafka and
> > > Zookeeper?
> > >
> > > Could you please explain the steps to upgrade Kafka and Zookeeper?
> > >
> > > Is there any documentation for that upgrade guide?
> > >
> > > Thanks a lot!!
> > >
> > >
> > > Sent from Outlook
> > >
> >
>


Mirror Maker tool is not running

2019-04-30 Thread ASHOK MACHERLA
Dear Team

Please help us, our MirrorMaker tool is not running properly.


Please look into the MirrorMaker log file exceptions below:


**

[2019-03-11 18:34:31,906] ERROR Error when sending message to topic audit-logs with key: null, value: 304134 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.RecordTooLargeException: The request included a message larger than the max message size the server will accept.
[2019-03-11 18:34:31,909] FATAL [mirrormaker-thread-15] Mirror maker thread failure due to  (kafka.tools.MirrorMaker$MirrorMakerThread)
java.lang.IllegalStateException: Cannot send after the producer is closed.
        at org.apache.kafka.clients.producer.internals.RecordAccumulator.append(RecordAccumulator.java:185)
        at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:474)
        at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:436)
        at kafka.tools.MirrorMaker$MirrorMakerProducer.send(MirrorMaker.scala:657)
        at kafka.tools.MirrorMaker$MirrorMakerThread$$anonfun$run$6.apply(MirrorMaker.scala:434)
        at kafka.tools.MirrorMaker$MirrorMakerThread$$anonfun$run$6.apply(MirrorMaker.scala:434)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at kafka.tools.MirrorMaker$MirrorMakerThread.run(MirrorMaker.scala:434)
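
The RecordTooLargeException above means the destination cluster rejected the
~304 KB record on topic audit-logs because it is larger than what the broker
(message.max.bytes) or the topic (max.message.bytes) allows; the later
"Cannot send after the producer is closed" is just a consequence, since
MirrorMaker aborts on send failures by default (abort.on.send.failure) and
closes its producer. Below is a hedged sketch of raising the topic limit on
the target cluster with the AdminClient; this API needs brokers and clients
on 2.3+, on older clusters the same change is made with the kafka-configs.sh
tool, and the bootstrap address and the 1 MB limit are hypothetical values:

    import java.util.Collection;
    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.AlterConfigOp;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class RaiseTopicMessageSize {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // Hypothetical bootstrap address of the *target* cluster.
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "target-broker:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                ConfigResource topic =
                    new ConfigResource(ConfigResource.Type.TOPIC, "audit-logs");
                // Allow records comfortably larger than the ~304 KB one that was rejected.
                AlterConfigOp raiseLimit = new AlterConfigOp(
                    new ConfigEntry("max.message.bytes", "1048576"),
                    AlterConfigOp.OpType.SET);
                Map<ConfigResource, Collection<AlterConfigOp>> update =
                    Collections.singletonMap(topic, Collections.singletonList(raiseLimit));
                admin.incrementalAlterConfigs(update).all().get();
            }
        }
    }

If the whole cluster should accept larger records, message.max.bytes on the
brokers (and the matching consumer/replica fetch sizes) need the same review.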

Sent from Mail for Windows 10