partition creation using python api

2017-02-26 Thread VIVEK KUMAR MISHRA 13BIT0066
There is one partition class in pykafka package in partition.py .Could you please tell me how to use that class. Thank you.

Re: Creating topic partitions automatically using python

2017-02-26 Thread VIVEK KUMAR MISHRA 13BIT0066
There is one partition class in pykafka package in partition.py .Could you please tell me how to use that class. On Fri, Feb 24, 2017 at 12:36 AM, Jeff Widman wrote: > This is probably a better fit for the pykafka issue tracker. > > AFAIK, there's no public kafka API for creating partitions righ

Re: creating partitions programmatically

2017-02-26 Thread VIVEK KUMAR MISHRA 13BIT0066
There is one partition class in pykafka package in partition.py .Could you please tell me how to use that class. On Sun, Feb 26, 2017 at 11:44 PM, Hans Jespersen wrote: > The Confluent python client does not support this today. If you find a > python api that does create topic partitions don't e

Re: Lock Exception - Failed to lock the state directory

2017-02-26 Thread Matthias J. Sax
In case you use 0.10.0.2 please have a look into this FAQ https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Igetalockingexceptionsimilarto"Causedby:java.io.IOException:Failedtolockthestatedirectory:/tmp/kafka-streams//0_0".HowcanIresolvethis? However, if possible I would recommend to upgr

Re: Kafka Streams vs Spark Streaming

2017-02-26 Thread Guozhang Wang
Hello Kohki, Given your data traffic and the state volume I cannot think of a better solution but suggest using large number of partitioned local states. I'm wondering how would "per partition watermark" can help with your traffic? Guozhang On Sun, Feb 26, 2017 at 10:45 AM, Kohki Nishio wrote:

Re: Kafka Streams vs Spark Streaming

2017-02-26 Thread Guozhang Wang
Hello Tianji, As Kohki mentioned, in Streams joins and aggregations are always done pre-partitioned, and hence locally. So there won't be any inter-node communications needed to execute the join / aggregations. Also they can be hosted as persistent local state stores so you don't need to keep them

Re: Immutable Record with Kafka Stream

2017-02-26 Thread Guozhang Wang
Kohki, >From your use case it seems that you'd like to have an "explicit" trigger on when aggregate data can be forwarded to down stream. But before we dig deep into that feature, I'm wondering for your monitoring cases, do you want to completely ignore any future updates when a record has been s

Re: Kafka SASL and custom LoginModule and Authorizer

2017-02-26 Thread Christian
Thank you Harsha! On Sun, Feb 26, 2017 at 10:27 AM, Harsha Chintalapani wrote: > Hi Christian, > Kafka client connections are long-llving connections, > hence the authentication part comes up during connection establishment and > once we authenticate regular kafka protocols can

Re: Kafka Streams vs Spark Streaming

2017-02-26 Thread Kohki Nishio
Tianji, KStream is indeed Append mode as long as I do stateless processing, but when you do aggregation that is a stateful operation and it turns to KTable and that does Update mode. In regard to your aggregation, I believe Kafka's aggregation works for a single partition not over multiple partiti

Re: Does kafka send the acks response to the producer after flush the messages to the disk or just keep them in the memory

2017-02-26 Thread Guozhang Wang
Hello Chen, Kafka flushes data to disk (i.e. fsync) asynchronously. Based on the ack.mode it will return the response of the produce request to producer after it has been replicated (likely in memory) on N partition replicas. Guozhang On Sun, Feb 26, 2017 at 1:39 AM, Jiecxy <253441...@qq.com> w

Re: Kafka Streams vs Spark Streaming

2017-02-26 Thread Kohki Nishio
Guozhang, Let me explain what I'm trying to do. The message volume is large (TB per Day) and that is coming to a topic. Now I want to do per minute aggregation(Windowed) and send the output to the downstream (a topic) (Topic1 - Large Volume) -> [Stream App] -> (Topic2 - Large Volume) I assume th

Re: Immutable Record with Kafka Stream

2017-02-26 Thread Kohki Nishio
I'm planning to do aggregation over metrics, and when 'flush' happens, it emits an aggregation to the downstream (e.g. alarming) Let say the first message saying some average number is very high and it triggers an alarm and later on user comes to the system and checks the number, it might have alr

Re: creating partitions programmatically

2017-02-26 Thread Hans Jespersen
The Confluent python client does not support this today. If you find a python api that does create topic partitions don't expect it to work with a secure Kafka cluster and expect it to have to be completely re-written in the near future when the new Kafka Admin API is implemented under the covers.

Re: Lock Exception - Failed to lock the state directory

2017-02-26 Thread Eno Thereska
Hi Dan, Just checking on the version: you are using 0.10.0.2 or 0.10.2.0 (i.e., latest)? I ask because state locking was something that was fixed in 0.10.2. Thanks Eno > On 26 Feb 2017, at 13:37, Dan Ofir wrote: > > Hi, > > I encountered an exception while using kafka stream, which I cannot

Re: creating partitions programmatically

2017-02-26 Thread VIVEK KUMAR MISHRA 13BIT0066
My question is can we create partitions in topic using any pythonic API? On Sun, Feb 26, 2017 at 8:24 PM, Hans Jespersen wrote: > The current Java AdminUtils are older code that talks to zookeeper and > does not support a secured Kafka cluster. > There will be a new Java Admin API in the future

Re: Kafka SASL and custom LoginModule and Authorizer

2017-02-26 Thread Harsha Chintalapani
Hi Christian, Kafka client connections are long-llving connections, hence the authentication part comes up during connection establishment and once we authenticate regular kafka protocols can be exchanged. Doing heartbeat to keep the token alive in a Authorizer is not a good idea.

Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-26 Thread yohann jardin
You should read (again?) the Spark documentation about submitting an application: http://spark.apache.org/docs/latest/submitting-applications.html Try with the Pi computation example available with Spark. For example: ./bin/spark-submit --class org.apache.spark.examples.SparkPi examples/jars/sp

Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-26 Thread Anahita Talebi
You're welcome. You need to specify the class. I meant like that: spark-submit /usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0. 0-1245-hadoop2.7.3.2.5.0.0-1245.jar --class "give the name of the class" On Saturday, February 25, 2017, Raymond Xie wrote: > Thank you, it is still not

Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-26 Thread Marco Mistroni
Try to use --packages to include the jars. From error it seems it's looking for main class in jars but u r running a python script... On 25 Feb 2017 10:36 pm, "Raymond Xie" wrote: That's right Anahita, however, the class name is not indicated in the original github project so I don't know wh

Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-26 Thread Anahita Talebi
Hi, I think if you remove --jars, it will work. Like: spark-submit /usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5. 0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar I had the same problem before and solved it by removing --jars. Cheers, Anahita On Saturday, February 25, 2017, Raymond Xie wrot

Re: Getting NotLeaderForPartitionException in kafka broker

2017-02-26 Thread 宋鑫/NemoVisky
Force on This。 宋鑫/NemoVisky

Lock Exception - Failed to lock the state directory

2017-02-26 Thread Dan Ofir
Hi, I encountered an exception while using kafka stream, which I cannot find any open bug or any related documentations. I am using Kafka stream 0.10.0.2, with 3 nodes kafka server running also 0.10.0.2. I am trying to process huge amount of data ~24800 events. my input topic have 100 partit

Re: creating partitions programmatically

2017-02-26 Thread Hans Jespersen
The current Java AdminUtils are older code that talks to zookeeper and does not support a secured Kafka cluster. There will be a new Java Admin API in the future that talks to the Kafka brokers directly using the admin extensions to the Kafka protocol which are already in the 0.10.2 brokers. I w

creating partitions programmatically

2017-02-26 Thread VIVEK KUMAR MISHRA 13BIT0066
Hi All, In kafka java driver, we have kafka.admin.AdminUtils class which has methods like createTopic and addPartitions(). Do we have these type of class and methods in any of kafka python drivers.If there is please do suggest. Thank you .

Does kafka send the acks response to the producer after flush the messages to the disk or just keep them in the memory

2017-02-26 Thread Jiecxy
Hi guys, Does kafka send the acks response to the producer after flush the messages to the disk or just keep them in the memory? How does Kafka flush the messages? By calling the system call, like fsync()? Thanks Chen

Re: kafka streams locking issue in 0.10.20.0

2017-02-26 Thread Eno Thereska
Hi Ara, There are some instructions here for monitoring whether RocksDb is having its writes stalled: https://github.com/facebook/rocksdb/wiki/Write-Stalls , together with some instructions for tuning. Could you share RocksDb's LOG file?