Re: KIP Wiki

2015-06-01 Thread Neha Narkhede
+1. Thanks Aditya!

On Mon, Jun 1, 2015 at 7:34 PM, Mayuresh Gharat 
wrote:

> +1.
>
> Thanks,
>
> Mayuresh
>
> On Mon, Jun 1, 2015 at 6:51 PM, Joe Stein  wrote:
>
> > We should probably have some release/vXYZ section so that over time we
> can
> > keep track of which KIPs were approved for which release, etc.
> >
> > Anything not in a release folder (we could do now release/v0.8.3.0 for
> > everything already approved) would be deemed under
> discussion,
> > or such.
> >
> > ~ Joe Stein
> > - - - - - - - - - - - - - - - - -
> >
> >   http://www.stealth.ly
> > - - - - - - - - - - - - - - - - -
> >
> > On Mon, Jun 1, 2015 at 9:46 PM, Guozhang Wang 
> wrote:
> >
> > > +1
> > >
> > > On Mon, Jun 1, 2015 at 12:00 PM, Jiangjie Qin
>  > >
> > > wrote:
> > >
> > > > +1
> > > >
> > > > On 6/1/15, 11:53 AM, "Ashish Singh"  wrote:
> > > >
> > > > >I like the idea!
> > > > >
> > > > >
> > > > >On Mon, Jun 1, 2015 at 9:51 AM, Aditya Auradkar <
> > > > >aaurad...@linkedin.com.invalid> wrote:
> > > > >
> > > > >> Hey everyone,
> > > > >>
> > > > >> We have enough KIPs now (25) that it's a bit hard to tell which
> > ones
> > > > >>are
> > > > >> adopted or under discussion by glancing at the wiki. Any concerns
> > if I
> > > > >> split it into 3 tables (adopted, discarded, and KIPs under
> > > discussion)?
> > > > >>
> > > > >>
> > > > >>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Propo
> > > > >>sals
> > > > >>
> > > > >> Aditya
> > > > >>
> > > > >>
> > > > >
> > > > >
> > > > >--
> > > > >
> > > > >Regards,
> > > > >Ashish
> > > >
> > > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>



-- 
Thanks,
Neha


Re: [DISCUSS] KIP-19 Add a request timeout to NetworkClient

2015-06-01 Thread Jiangjie Qin
Bump up this thread.

After several discussions at LinkedIn, we came up with three options. I
have updated the KIP-19 wiki page to summarize the three options and
stated our preference. We can discuss them in tomorrow’s KIP hangout.
Please let us know what you think.

Thanks,

Jiangjie (Becket) Qin

On 5/21/15, 5:54 PM, "Jiangjie Qin"  wrote:

>Based on the discussion we have had, I just updated the KIP with the
>following proposal and want to see if there are further comments.
>
>The proposal is to have the following four timeouts as the end state.
>
>1. max.buffer.full.block.ms   - To replace block.on.buffer.full. The max
>time to block when the buffer is full.
>2. metadata.fetch.timeout.ms  - reuse the metadata timeout as batch.timeout.ms,
>because an expired batch is essentially a case of metadata not being available.
>3. replication.timeout.ms - It defines how long a server will wait for
>the records to be replicated to followers.
>4. network.request.timeout.ms - This timeout is used when the producer sends
>requests to brokers over TCP connections. It specifies how long the
>producer should wait for the response.
>
>With the above approach, we can achieve the following.
>* We can have bounded blocking time for send() = (1) + (2).
>* The time from send() until the response is received is generally bounded
>by linger.ms + (2) + (4), not taking retries into consideration.
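A minimal sketch of a producer configuration using the four proposed names above (the names are proposals from this thread, not released settings, and the values are arbitrary):

{code}
import java.util.Properties;

public class ProposedTimeoutConfigs {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        // (1) Max time send() may block when the record buffer is full.
        props.put("max.buffer.full.block.ms", "3000");
        // (2) Metadata fetch timeout, reused as the batch timeout.
        props.put("metadata.fetch.timeout.ms", "5000");
        // (3) How long the broker waits for follower replication.
        props.put("replication.timeout.ms", "10000");
        // (4) How long the producer waits for a broker response.
        props.put("network.request.timeout.ms", "30000");
        // Worst-case blocking in send() = (1) + (2) = 8s with these values.
        System.out.println(props);
    }
}
{code}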
>
>So from the user’s perspective, send() depends on the metadata of a topic and
>buffer space. I am not sure if users would really care about how long it
>takes to receive the response, because it is async anyway and there are so
>many things to consider (retries, linger.ms, retry backoff time, request
>timeout, etc).
>
>I think these configurations are clear enough to let users understand at
>first glance. Please let me know what you think.
>
>Thanks.
>
>Jiangjie (Becket) Qin
>
>
>
>On 5/20/15, 9:55 AM, "Joel Koshy"  wrote:
>
>>> The fact that I understand the producer internals and am still
>>>struggling
>>> to understand the implications of the different settings, how I would
>>>set
>>> them, and how they potentially interact such that I could set invalid
>>> combinations seems like a red flag to me... Being able to say "I want
>>> produce requests to time out in 5s" shouldn't require adjusting 3 or 4
>>> configs if the defaults would normally time out in something like
>>>30s.
>>> 
>>> Setting aside compatibility issues and focusing on the best set of
>>>configs,
>>> I agree with Jay that there are two things I actually want out of the
>>>API.
>>> The key thing is a per-request timeout, which should be enforced client
>>> side. I would just expect this to follow the request through any
>>>internals
>>> so it can be enforced no matter where in the pipeline the request is.
>>> Within each component in the pipeline we might have to compute how much
>>> time we have left for the request in order to create a timeout within
>>>that
>>> setting. The second setting is to bound the amount of time spent
>>>blocking
>>> on send(). This is really an implementation detail, but one that people
>>>are
>>> complaining about enough that it seems worthwhile to provide control
>>>over
>>> it (and fixing it would just make that setting superfluous, not break
>>> anything).
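The per-request timeout described above is essentially deadline propagation; a minimal sketch of the pattern (illustrative only, not actual producer internals):

{code}
// Illustrative deadline-propagation helper, not actual producer code.
public final class Deadline {
    private final long deadlineMs;

    public Deadline(long timeoutMs) {
        this.deadlineMs = System.currentTimeMillis() + timeoutMs;
    }

    // Each pipeline stage (accumulator, in-flight queue, network) waits
    // at most this long, so the request-level timeout can be enforced
    // no matter where in the pipeline the request currently sits.
    public long remainingMs() {
        return Math.max(0, deadlineMs - System.currentTimeMillis());
    }

    public boolean expired() {
        return remainingMs() == 0;
    }
}
{code}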
>>>
>>> Exposing a lot more settings also exposes a lot about the
>>>implementation
>>> and makes it harder to improve the implementation in the future, but I
>>> don't think we have listed good use cases for setting each of them
>>> individually. Why would the user specifically care about how much time
>>>the
>>> request spends in the accumulator vs. some other component (assuming
>>>they
>>> have the overall timeout)? Same for requests in flight, as long as I
>>>have
>>> that client side timeout? And if they care about what component is the
>>> bottleneck, could that be better exposed by the exceptions that are
>>> returned rather than a ton of different settings?
>>
>>Agreed with the above. I'm also extremely wary of configs that are
>>inherently unintuitive, or can interact to yield unintuitive behavior.
>>OTOH I think it is okay if a config is categorized as "advanced" or if
>>it requires deeper knowledge of the internals of the producer (or the
>>configured system in general). i.e., as long as we think long and hard
>>and agree on necessity (driven by clear use cases) before adding such
>>configs. We should also consider how we can simplify or even eliminate
>>existing configs.
>>
>>Re: requests in flight may be a good example: Becket had given a valid
>>use-case, i.e., supporting strict ordering. Maybe we can replace it with an
>>"enable.strict.ordering" config which is clearer in intent and would
>>internally ensure only one in-flight request per partition and default
>>to a fixed number of in-flight requests (say, five or ten) if set to false. If we
>>implement idempotence then we won't even need that.
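A sketch of what that hypothetical config could translate to ("enable.strict.ordering" is only an idea from this thread; the in-flight setting it maps onto does exist):

{code}
import java.util.Properties;

public class StrictOrderingShim {
    // Hypothetical: "enable.strict.ordering" is not an existing config.
    static void apply(Properties props) {
        boolean strict = Boolean.parseBoolean(
                props.getProperty("enable.strict.ordering", "false"));
        // One in-flight request per connection preserves ordering across
        // retries; otherwise fall back to a fixed default such as five.
        props.put("max.in.flight.requests.per.connection", strict ? "1" : "5");
    }
}
{code}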
>>
>>> On Tue, May 19, 2015 at 7:13 PM, Jiangjie Qin
>>>
>>> wrote:
>>> 
>>> > Hi Jay,
>>> >
>>> > I updated what I think in the KIP wiki

[jira] [Updated] (KAFKA-2236) offset request reply racing with segment rolling

2015-06-01 Thread Alfred Landrum (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alfred Landrum updated KAFKA-2236:
--
Description: 
My use case with kafka involves an aggressive retention policy that rolls 
segment files frequently. My librdkafka-based client sees occasional errors in 
response to offset requests, showing up in the broker log like:

[2015-06-02 02:33:38,047] INFO Rolled new log segment for 
'receiver-93b40462-3850-47c1-bcda-8a3e221328ca-50' in 1 ms. (kafka.log.Log)
[2015-06-02 02:33:38,049] WARN [KafkaApi-0] Error while responding to offset 
request (kafka.server.KafkaApis)
java.lang.ArrayIndexOutOfBoundsException: 3
at kafka.server.KafkaApis.fetchOffsetsBefore(KafkaApis.scala:469)
at kafka.server.KafkaApis.fetchOffsets(KafkaApis.scala:449)
at kafka.server.KafkaApis$$anonfun$17.apply(KafkaApis.scala:411)
at kafka.server.KafkaApis$$anonfun$17.apply(KafkaApis.scala:402)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.Map$Map1.foreach(Map.scala:109)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at kafka.server.KafkaApis.handleOffsetRequest(KafkaApis.scala:402)
at kafka.server.KafkaApis.handle(KafkaApis.scala:61)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:59)
at java.lang.Thread.run(Thread.java:745)




quoting Guozhang Wang's reply to my query on the users list:

"I check the 0.8.2 code and may probably find a bug related to your issue.
Basically, segsArray.last.size is called multiple times during handling
offset requests, while segsArray.last could get concurrent appends. Hence
it is possible that in line 461, if(segsArray.last.size > 0) returns false
while later in line 468, if(segsArray.last.size > 0) could return true."


http://mail-archives.apache.org/mod_mbox/kafka-users/201506.mbox/%3CCAHwHRrUK-3wdoEAaFbsD0E859Ea0gXixfxgDzF8E3%3D_8r7K%2Bpw%40mail.gmail.com%3E


  was:
My use case with kafka involves an aggressive retention policy that rolls 
segment files frequently. My librdkafka-based client sees occasional errors in 
response to offset requests, showing up in the broker log like:

[2015-06-02 02:33:38,047] INFO Rolled new log segment for 
'receiver-93b40462-3850-47c1-bcda-8a3e221328ca-50' in 1 ms. (kafka.log.Log)
[2015-06-02 02:33:38,049] WARN [KafkaApi-0] Error while responding to offset 
request (kafka.server.KafkaApis)
java.lang.ArrayIndexOutOfBoundsException: 3
at kafka.server.KafkaApis.fetchOffsetsBefore(KafkaApis.scala:469)
at kafka.server.KafkaApis.fetchOffsets(KafkaApis.scala:449)
at kafka.server.KafkaApis$$anonfun$17.apply(KafkaApis.scala:411)
at kafka.server.KafkaApis$$anonfun$17.apply(KafkaApis.scala:402)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.Map$Map1.foreach(Map.scala:109)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at kafka.server.KafkaApis.handleOffsetRequest(KafkaApis.scala:402)
at kafka.server.KafkaApis.handle(KafkaApis.scala:61)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:59)
at java.lang.Thread.run(Thread.java:745)


quoting Guozhang Wang's reply to my query on the users list:

"I check the 0.8.2 code and may probably find a bug related to your issue.
Basically, segsArray.last.size is called multiple times during handling
offset requests, while segsArray.last could get concurrent appends. Hence
it is possible that in line 461, if(segsArray.last.size > 0) returns false
while later in line 468, if(segsArray.last.size > 0) could return true."


http://mail-archives.apache.org/mod_mbox/kafka-users/201506.mbox/%3CCAHwHRrUK-3wdoEAaFbsD0E859Ea0gXixfxgDzF8E3%3D_8r7K%2Bpw%40mail.gmail.com%3E



> offset request reply racing with segment rolling
> 
>
> Key: KAFKA-2236
> URL: https://issues.apache.org/jira/browse/KAFKA-2236
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.2.0
> Environment: Linux x86_64, java.1.7.0_72, discovered using librdkafka 
> based client.
>Reporter: Alfred Landrum
>Priority: Critical
>
> My use case with kafka involves an aggressive retention policy that rolls 
> segment files frequently. My librdkafka-based client sees occasional errors 
> in response to offset requests, showing up in the

[jira] [Created] (KAFKA-2236) offset request reply racing with segment rolling

2015-06-01 Thread Alfred Landrum (JIRA)
Alfred Landrum created KAFKA-2236:
-

 Summary: offset request reply racing with segment rolling
 Key: KAFKA-2236
 URL: https://issues.apache.org/jira/browse/KAFKA-2236
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.2.0
 Environment: Linux x86_64, java.1.7.0_72, discovered using librdkafka 
based client.
Reporter: Alfred Landrum
Priority: Critical


My use case with kafka involves an aggressive retention policy that rolls 
segment files frequently. My librdkafka-based client sees occasional errors in 
response to offset requests, showing up in the broker log like:

[2015-06-02 02:33:38,047] INFO Rolled new log segment for 
'receiver-93b40462-3850-47c1-bcda-8a3e221328ca-50' in 1 ms. (kafka.log.Log)
[2015-06-02 02:33:38,049] WARN [KafkaApi-0] Error while responding to offset 
request (kafka.server.KafkaApis)
java.lang.ArrayIndexOutOfBoundsException: 3
at kafka.server.KafkaApis.fetchOffsetsBefore(KafkaApis.scala:469)
at kafka.server.KafkaApis.fetchOffsets(KafkaApis.scala:449)
at kafka.server.KafkaApis$$anonfun$17.apply(KafkaApis.scala:411)
at kafka.server.KafkaApis$$anonfun$17.apply(KafkaApis.scala:402)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.Map$Map1.foreach(Map.scala:109)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at kafka.server.KafkaApis.handleOffsetRequest(KafkaApis.scala:402)
at kafka.server.KafkaApis.handle(KafkaApis.scala:61)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:59)
at java.lang.Thread.run(Thread.java:745)


quoting Guozhang Wang's reply to my query on the users list:

"I check the 0.8.2 code and may probably find a bug related to your issue.
Basically, segsArray.last.size is called multiple times during handling
offset requests, while segsArray.last could get concurrent appends. Hence
it is possible that in line 461, if(segsArray.last.size > 0) returns false
while later in line 468, if(segsArray.last.size > 0) could return true."


http://mail-archives.apache.org/mod_mbox/kafka-users/201506.mbox/%3CCAHwHRrUK-3wdoEAaFbsD0E859Ea0gXixfxgDzF8E3%3D_8r7K%2Bpw%40mail.gmail.com%3E
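The bug described above is a classic check-then-act race; a minimal sketch of the fix, assuming only what the quote says (read the racy size once and reuse the snapshot), not the actual KafkaApis code:

{code}
import java.util.List;

public class OffsetLookupSketch {
    // Stand-in for the broker's segment list; the last segment may
    // receive concurrent appends while we read it. Assumes at least
    // one segment exists.
    interface Segment {
        long size();       // bytes in the segment, grows concurrently
        long baseOffset();
    }

    static long[] offsetsBefore(List<Segment> segments, long logEndOffset) {
        Segment last = segments.get(segments.size() - 1);
        // Read the racy value exactly once and use the snapshot throughout.
        // Re-evaluating last.size() later (as the 0.8.2 code effectively
        // did on lines 461 and 468) can size the array with one answer and
        // index it with another, producing the ArrayIndexOutOfBoundsException.
        boolean lastHasData = last.size() > 0;
        int n = segments.size() + (lastHasData ? 1 : 0);
        long[] offsets = new long[n];
        for (int i = 0; i < segments.size(); i++)
            offsets[i] = segments.get(i).baseOffset();
        if (lastHasData)
            offsets[n - 1] = logEndOffset;
        return offsets;
    }
}
{code}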




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2222) Write "Input/output error" did not result in broker shutdown

2015-06-01 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568476#comment-14568476
 ] 

Guozhang Wang commented on KAFKA-2222:
--

Today we have IO error handling in KafkaApis' request handling logic, but this 
ticket and KAFKA-1860 expose two other places that should have error handling 
but do not yet:

1. background threads like the log compactor / log cleaner (KAFKA-1860).
2. response writing to the socket (this ticket).

The reason 2) is not covered is that writing the response back to the 
socket is not wrapped in KafkaApis, but in SocketServer.Processor.run(), which 
simply closes the key but does not cause the broker to shut down.
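A rough sketch of the shape the missing handling in 2) could take (illustrative only; the real SocketServer code differs):

{code}
import java.io.IOException;
import java.nio.channels.GatheringByteChannel;

public class ResponseWriterSketch {
    interface Send {
        long writeTo(GatheringByteChannel channel) throws IOException;
    }

    // Today an IOException here (e.g. sendfile from a failing disk) only
    // closes the selection key; the idea is to escalate log-level IO
    // errors the same way KafkaApis does.
    void writeResponse(Send send, GatheringByteChannel channel) {
        try {
            send.writeTo(channel);
        } catch (IOException e) {
            // A real fix must distinguish ordinary client disconnects
            // (just close the connection) from disk IO errors (halt the
            // broker); this escalation hook is hypothetical.
            haltBrokerOnFatalIoError(e);
        }
    }

    void haltBrokerOnFatalIoError(IOException e) {
        throw new RuntimeException("Halting broker on IO error", e);
    }
}
{code}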

> Write "Input/output error" did not result in broker shutdown
> 
>
> Key: KAFKA-2222
> URL: https://issues.apache.org/jira/browse/KAFKA-2222
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.2
>Reporter: Jason Rosenberg
>
> We had a disk start failing intermittently, and began seeing errors like this 
> in the broker.  Usually, IOExceptions during a file write result in the 
> broker shutting down immediately.  This is with version 0.8.2.1.
> {code}
> 2015-05-21 23:59:57,841 ERROR [kafka-network-thread-27330-2] 
> network.Processor - Closing socket for /1.2.3.4 because of error
> java.io.IOException: Input/output error
> at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
> at 
> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:443)
> at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:575)
> at kafka.log.FileMessageSet.writeTo(FileMessageSet.scala:147)
> at kafka.api.PartitionDataSend.writeTo(FetchResponse.scala:70)
> at kafka.network.MultiSend.writeTo(Transmission.scala:101)
> at kafka.api.TopicDataSend.writeTo(FetchResponse.scala:125)
> at kafka.network.MultiSend.writeTo(Transmission.scala:101)
> at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:231)
> at kafka.network.Processor.write(SocketServer.scala:472)
> at kafka.network.Processor.run(SocketServer.scala:342)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> This resulted in producers intermittently failing to send messages 
> successfully, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: KIP Wiki

2015-06-01 Thread Mayuresh Gharat
+1.

Thanks,

Mayuresh

On Mon, Jun 1, 2015 at 6:51 PM, Joe Stein  wrote:

> We should probably have some release/vXYZ section so that over time we can
> keep track of which KIPs were approved for which release, etc.
>
> Anything not in a release folder (we could do now release/v0.8.3.0 for
> everything already approved) would be deemed under discussion,
> or such.
>
> ~ Joe Stein
> - - - - - - - - - - - - - - - - -
>
>   http://www.stealth.ly
> - - - - - - - - - - - - - - - - -
>
> On Mon, Jun 1, 2015 at 9:46 PM, Guozhang Wang  wrote:
>
> > +1
> >
> > On Mon, Jun 1, 2015 at 12:00 PM, Jiangjie Qin  >
> > wrote:
> >
> > > +1
> > >
> > > On 6/1/15, 11:53 AM, "Ashish Singh"  wrote:
> > >
> > > >I like the idea!
> > > >
> > > >
> > > >On Mon, Jun 1, 2015 at 9:51 AM, Aditya Auradkar <
> > > >aaurad...@linkedin.com.invalid> wrote:
> > > >
> > > >> Hey everyone,
> > > >>
> > > >> We have enough KIPs now (25) that it's a bit hard to tell which
> ones
> > > >>are
> > > >> adopted or under discussion by glancing at the wiki. Any concerns
> if I
> > > >> split it into 3 tables (adopted, discarded, and KIPs under
> > discussion)?
> > > >>
> > > >>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Propo
> > > >>sals
> > > >>
> > > >> Aditya
> > > >>
> > > >>
> > > >
> > > >
> > > >--
> > > >
> > > >Regards,
> > > >Ashish
> > >
> > >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: KIP Wiki

2015-06-01 Thread Joe Stein
We should probably have some release/vXYZ section so that over time we can
keep track of which KIPs were approved for which release, etc.

Anything not in a release folder (we could do now release/v0.8.3.0 for
everything already approved) would be deemed under discussion,
or such.

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Mon, Jun 1, 2015 at 9:46 PM, Guozhang Wang  wrote:

> +1
>
> On Mon, Jun 1, 2015 at 12:00 PM, Jiangjie Qin 
> wrote:
>
> > +1
> >
> > On 6/1/15, 11:53 AM, "Ashish Singh"  wrote:
> >
> > >I like the idea!
> > >
> > >
> > >On Mon, Jun 1, 2015 at 9:51 AM, Aditya Auradkar <
> > >aaurad...@linkedin.com.invalid> wrote:
> > >
> > >> Hey everyone,
> > >>
> > >> We have enough KIPs now (25) that it's a bit hard to tell which ones
> > >>are
> > >> adopted or under discussion by glancing at the wiki. Any concerns if I
> > >> split it into 3 tables (adopted, discarded, and KIPs under
> discussion)?
> > >>
> > >>
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Propo
> > >>sals
> > >>
> > >> Aditya
> > >>
> > >>
> > >
> > >
> > >--
> > >
> > >Regards,
> > >Ashish
> >
> >
>
>
> --
> -- Guozhang
>


Re: KIP Wiki

2015-06-01 Thread Guozhang Wang
+1

On Mon, Jun 1, 2015 at 12:00 PM, Jiangjie Qin 
wrote:

> +1
>
> On 6/1/15, 11:53 AM, "Ashish Singh"  wrote:
>
> >I like the idea!
> >
> >
> >On Mon, Jun 1, 2015 at 9:51 AM, Aditya Auradkar <
> >aaurad...@linkedin.com.invalid> wrote:
> >
> >> Hey everyone,
> >>
> >> We have enough KIPs now (25) that it's a bit hard to tell which ones
> >>are
> >> adopted or under discussion by glancing at the wiki. Any concerns if I
> >> split it into 3 tables (adopted, discarded, and KIPs under discussion)?
> >>
> >>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Propo
> >>sals
> >>
> >> Aditya
> >>
> >>
> >
> >
> >--
> >
> >Regards,
> >Ashish
>
>


-- 
-- Guozhang


Re: [DISCUSSION] Partition Selection and Coordination By Brokers for Producers

2015-06-01 Thread Jun Rao
Bhavesh,

I am not sure if load balancing based on the consumption rate (1.b) makes
sense. Each consumer typically consumes all partitions from a topic. So, as
long as the data in each partition is balanced, the consumption rate will
be balanced too. Selecting a partition based on the size of each partition
could be useful, but I am not sure if it's going to be significantly better
than just having the clients pick a random partition. Also, implementing
this on the broker side has downside. First, having the broker forward each
produce request increases the network traffic on the broker. Second, this
likely will make the broker code more complicated since we probably have to
put every forwarded produce request in a purgatory. Third, we currently
don't maintain the size of each partition on every broker.

Given these, I think your best bet is probably to just fix those non-java
clients to send data in a round robin way.
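For reference, the round-robin selection Jun suggests is a few lines in any client; a sketch of the idea (the real Java producer hashes keyed messages with murmur2; Arrays.hashCode here is just for illustration):

{code}
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinPartitioner {
    private final AtomicInteger counter = new AtomicInteger(0);

    public int partition(byte[] key, int numPartitions) {
        if (key != null)
            return (Arrays.hashCode(key) & 0x7fffffff) % numPartitions;
        // Non-keyed messages: cycling a counter gives the uniform
        // distribution the new Java producer already provides.
        return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
    }
}
{code}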

Thanks,

Jun

On Fri, May 29, 2015 at 1:22 PM, Bhavesh Mistry 
wrote:

> Hi Kafka Dev Team,
>
> I would appreciate your feedback on moving producer partition selection
> from the producer to the broker.  Also, please do let me know the correct
> process for collecting feedback from the Kafka dev team and/or community.
>
> Thanks,
>
> Bhavesh
>
> On Tue, May 26, 2015 at 11:54 AM, Bhavesh Mistry <
> mistry.p.bhav...@gmail.com
> > wrote:
>
> > Hi Kafka Dev Team,
> >
> > I am sorry, I am new to the process of discussion and/or KIPs, so I had
> > commented on another email voting thread.  Please do let me know the correct
> > process for collecting feedback and starting a discussion with the Kafka
> > dev group.
> >
> > Here is original message:
> >
> > I have had experience with both the producer and consumer sides.  I have a
> > different use case for this partition selection strategy.
> >
> >
> >
> > Problem :
> >
> >
> > We have a heterogeneous environment of producers (by that I mean we have
> > Node.js, Python, new Java & old Scala based producers for the same topic).  I
> > have seen that not all producers employ round-robin strategies for
> > non-keyed messages like the new producer does.  Hence, it creates non-uniform
> > data ingestion across partitions and delays in overall message processing.
> >
> > How do we address a uniform distribution/message injection rate across all
> > partitions?
> >
> >
> >
> > Propose Solution:
> >
> >
> > Let the broker cluster, with more intelligence, decide the next partition
> > for a topic to send data to, rather than the producer itself.
> >
> > 1)   When sending data to brokers, the Kafka protocol over the wire (in the
> > ProduceResponse) sends the client a hint about which partition to send to,
> > based on the following logic (or it can be customizable):
> >
> > a. Based on the overall data injection rate for the topic and the current
> > producer injection rate
> >
> > b. The ability to rank partitions based on consumer rate (an advanced use
> > case, as there may be many consumers, so a weighted average, etc.)
> >
> >
> >
> > Ultimately, brokers will coordinate among thousands of producers and
> > balance the data injection rate (an out-of-the-box feature) and the
> > consumption rate (a pluggable interface implementation on the brokers’
> > side).  The goal here is to attain uniformity and/or reduce the delivery
> > delay to consumers.  This is similar to consumer coordination moving to
> > brokers. The producer-side partition selection would also move to
> > brokers.  This will benefit both Java and non-Java clients.
> >
> >
> >
> > Please let me know your feedback on this subject matter.  I am sure lots
> > of you run Kafka in an enterprise environment where you may have different
> > types of producers for the same topic (e.g. logging clients in JavaScript,
> > PHP, Java, and Python, etc., sending to a log topic).  I would really
> > appreciate your feedback on this.
> >
> >
> >
> >
> >
> > Thanks,
> >
> >
> > Bhavesh
> >
>


Re: Review Request 34524: Fix KAFKA-2208

2015-06-01 Thread Onur Karaman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34524/#review86101
---


I'd rather avoid mixing coordinator failover optimization logic into this RB. 
Can you undo the changes in ConsumerCoordinator.scala from line 214 down to the 
bottom of the file?


clients/src/main/java/org/apache/kafka/common/requests/JoinGroupResponse.java


This is missing:
INCONSISTENT_PARTITION_ASSIGNMENT_STRATEGY (23)



core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala


Not sure, but I think Kafka's convention is Errors.foo.code for Scala. 
ReplicaManager and the rest of ConsumerCoordinator do Errors.foo.code, not 
Errors.foo.code().


- Onur Karaman


On June 1, 2015, 12:05 a.m., Guozhang Wang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34524/
> ---
> 
> (Updated June 1, 2015, 12:05 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2208
> https://issues.apache.org/jira/browse/KAFKA-2208
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Incorporate Onur's comments; add logic for removing the whole group from 
> consumer.
> 
> 
> Diffs
> -
> 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
>  b2764df11afa7a99fce46d1ff48960d889032d14 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
>  ef9dd5238fbc771496029866ece1d85db6d7b7a5 
>   
> clients/src/main/java/org/apache/kafka/common/requests/HeartbeatResponse.java 
> f548cd0ef70929b35ac887f8fccb7b24c3e2c11a 
>   
> clients/src/main/java/org/apache/kafka/common/requests/JoinGroupResponse.java 
> fd9c545c99058ad3fbe3b2c55ea8b6ea002f5a51 
>   core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala 
> af06ad45cdc46ac3bc27898ebc1a5bd5b1c7b19e 
>   core/src/main/scala/kafka/coordinator/ConsumerGroupMetadata.scala 
> 47bdfa7cc86fd4e841e2b1d6bfd40f1508e643bd 
>   core/src/main/scala/kafka/network/RequestChannel.scala 
> 1d0024c8f0c2ab0efa6d8cfca6455877a6ed8026 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> 9efa15ca5567b295ab412ee9eea7c03eb4cdc18b 
>   core/src/test/scala/integration/kafka/api/ConsumerBounceTest.scala 
> 5c4cca653b3801df3494003cc40a56ae60a789a6 
>   core/src/test/scala/integration/kafka/api/ConsumerTest.scala 
> a1eed965a148eb19d9a6cefbfce131f58aaffc24 
>   core/src/test/scala/unit/kafka/server/KafkaConfigConfigDefTest.scala 
> 8014a5a6c362785539f24eb03d77278434614fe6 
> 
> Diff: https://reviews.apache.org/r/34524/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>



[jira] [Updated] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-1282:
---
Status: In Progress  (was: Patch Available)

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order_+_test_(same_class).patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch, 
> access_order_+_test_waiting_from_350ms_to_1100ms.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568257#comment-14568257
 ] 

Jun Rao commented on KAFKA-1282:


[~nmarasoi], thanks for the patch. We are changing SocketServer to reuse 
Selector right now in KAFKA-1928. Once that's done, the idle connection logic 
will be moved into Selector and should be easier to test since Selector 
supports mock time. That patch is almost ready. Perhaps you can wait until it's 
committed and submit a new patch. 
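For reference, the access-order idea in the attached patches can be sketched like this (illustrative only, not the committed code):

{code}
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

public class IdleConnectionTracker {
    private final long maxIdleMs;
    // accessOrder=true: iteration starts at the least recently used entry,
    // so the longest-idle connection is always first.
    private final LinkedHashMap<String, Long> lastUsed =
            new LinkedHashMap<String, Long>(16, 0.75f, true);

    public IdleConnectionTracker(long maxIdleMs) {
        this.maxIdleMs = maxIdleMs;
    }

    public void touch(String connectionId, long nowMs) {
        lastUsed.put(connectionId, nowMs); // moves the entry to the tail
    }

    // Returns a connection idle longer than maxIdleMs, or null; called
    // from the poll loop (a mock clock makes this trivial to unit test).
    public String pollExpired(long nowMs) {
        Iterator<Map.Entry<String, Long>> it = lastUsed.entrySet().iterator();
        if (!it.hasNext())
            return null;
        Map.Entry<String, Long> oldest = it.next();
        if (nowMs - oldest.getValue() > maxIdleMs) {
            it.remove();
            return oldest.getKey();
        }
        return null;
    }
}
{code}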

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order_+_test_(same_class).patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch, 
> access_order_+_test_waiting_from_350ms_to_1100ms.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33065: Patch for KAFKA-1928

2015-06-01 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33065/#review86095
---


Thanks for the latest patch. A few more minor comments below.


clients/src/main/java/org/apache/kafka/common/network/Selector.java


We probably should include the metricsTag in the per-node sensor name too. 
It may be worth maintaining a connectionId->sensorName map to avoid the 
overhead of reconstructing the sensorName every time during recording.
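A small sketch of the suggested cache (names are illustrative, not the actual Selector fields):

{code}
import java.util.concurrent.ConcurrentHashMap;

public class SensorNameCache {
    private final ConcurrentHashMap<String, String> names =
            new ConcurrentHashMap<String, String>();

    // Build each per-node sensor name once instead of concatenating
    // strings on every record() call.
    public String forConnection(String connectionId, String metricsTag) {
        String name = names.get(connectionId);
        if (name == null) {
            name = "node-" + connectionId + "." + metricsTag;
            String prev = names.putIfAbsent(connectionId, name);
            if (prev != null)
                name = prev;
        }
        return name;
    }
}
{code}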



core/src/main/scala/kafka/network/RequestChannel.scala


This defines a method with a void return type. We need to define it to return a 
String.



core/src/main/scala/kafka/network/SocketServer.scala


Our current convention is to not wrap single-line statements with {}.



core/src/main/scala/kafka/network/SocketServer.scala


Is this change needed? Isn't it simpler to just use port?



core/src/main/scala/kafka/server/KafkaApis.scala


Could we just use consumerId instead of joinGroupRequest.consumerId?



core/src/main/scala/kafka/server/KafkaConfig.scala


It's better to define it as type LIST, in the same way as the client. Then, 
we can use config.getConfiguredInstances to get the reporter instances, which 
calls configure() for you.



core/src/main/scala/kafka/server/KafkaServer.scala


Incorrect prefix.



core/src/main/scala/kafka/server/KafkaServer.scala


config.metricReporterClasses is only used here. I am wondering if it's 
better to define it as a Java List to avoid the back-and-forth conversion 
between Scala and Java.


- Jun Rao


On May 31, 2015, 9:49 p.m., Gwen Shapira wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33065/
> ---
> 
> (Updated May 31, 2015, 9:49 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: 1928 and KAFKA-1928
> https://issues.apache.org/jira/browse/1928
> https://issues.apache.org/jira/browse/KAFKA-1928
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> first pass on replacing Send
> 
> 
> implement maxSize and improved docs
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> Conflicts:
>   core/src/main/scala/kafka/network/RequestChannel.scala
> 
> moved selector out of abstract thread
> 
> 
> mid-way through putting selector in SocketServer
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> Also, SocketServer is now using Selector. Stil a bit messy - but all tests 
> pass.
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> renamed requestKey to connectionId to reflect new use and changed type from 
> Any to String
> 
> 
> Following Jun's comments - moved MultiSend to client. Cleaned up destinations 
> as well
> 
> 
> removed reify and remaining from send/recieve API, per Jun. moved 
> maybeCloseOldest() to Selector per Jay
> 
> 
> added idString to node API, changed written to int in Send API
> 
> 
> cleaning up MultiSend, added size() to Send interface
> 
> 
> fixed some issues with multisend
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> fixed metric thingies
> 
> 
> fixed response order bug
> 
> 
> error handling for illegal selector state and fix metrics bug
> 
> 
> optimized selection key lookup with identity hash
> 
> 
> fix accidental change
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> addressing Jun's comments
> 
> 
> removed connection-aging for clients
> 
> 
> fix issues with exception handling and other cleanup
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> Revert "removed connection-aging for clients"
> 
> This reverts commit 016669123a370b561b5ac78f8f1cf7bdd958e7d1.
> 
> improving exception handling and other minor fixes
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> fixes based on Jun and Guozhang comments. Exposed idle metrics as Gauge, 
> changed Send size to long, and fixed an existing issue where Reporters are 
> not actually loaded
> 
> 
> Merge branch 'trunk' of 

Re: Review Request 33065: Patch for KAFKA-1928

2015-06-01 Thread Jay Kreps

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33065/#review86118
---

Ship it!


Looks good to me.

- Jay Kreps


On May 31, 2015, 9:49 p.m., Gwen Shapira wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33065/
> ---
> 
> (Updated May 31, 2015, 9:49 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: 1928 and KAFKA-1928
> https://issues.apache.org/jira/browse/1928
> https://issues.apache.org/jira/browse/KAFKA-1928
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> first pass on replacing Send
> 
> 
> implement maxSize and improved docs
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> Conflicts:
>   core/src/main/scala/kafka/network/RequestChannel.scala
> 
> moved selector out of abstract thread
> 
> 
> mid-way through putting selector in SocketServer
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> Also, SocketServer is now using Selector. Stil a bit messy - but all tests 
> pass.
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> renamed requestKey to connectionId to reflect new use and changed type from 
> Any to String
> 
> 
> Following Jun's comments - moved MultiSend to client. Cleaned up destinations 
> as well
> 
> 
> removed reify and remaining from send/recieve API, per Jun. moved 
> maybeCloseOldest() to Selector per Jay
> 
> 
> added idString to node API, changed written to int in Send API
> 
> 
> cleaning up MultiSend, added size() to Send interface
> 
> 
> fixed some issues with multisend
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> fixed metric thingies
> 
> 
> fixed response order bug
> 
> 
> error handling for illegal selector state and fix metrics bug
> 
> 
> optimized selection key lookup with identity hash
> 
> 
> fix accidental change
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> addressing Jun's comments
> 
> 
> removed connection-aging for clients
> 
> 
> fix issues with exception handling and other cleanup
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> Revert "removed connection-aging for clients"
> 
> This reverts commit 016669123a370b561b5ac78f8f1cf7bdd958e7d1.
> 
> improving exception handling and other minor fixes
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> fixes based on Jun and Guozhang comments. Exposed idle metrics as Gauge, 
> changed Send size to long, and fixed an existing issue where Reporters are 
> not actually loaded
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> move to single metrics object
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/clients/ClusterConnectionStates.java 
> da76cc257b4cfe3c4bce7120a1f14c7f31ef8587 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> cf32e4e7c40738fe6d8adc36ae0cfad459ac5b0b 
>   clients/src/main/java/org/apache/kafka/clients/InFlightRequests.java 
> 936487b16e7ac566f8bdcd39a7240ceb619fd30e 
>   clients/src/main/java/org/apache/kafka/clients/KafkaClient.java 
> 1311f85847b022efec8cb05c450bb18231db6979 
>   clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
> 435fbb5116e80302eba11ed1d3069cb577dbdcbd 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> bdff518b732105823058e6182f445248b45dc388 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
>  b2764df11afa7a99fce46d1ff48960d889032d14 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
>  ef9dd5238fbc771496029866ece1d85db6d7b7a5 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> ded19d85c4605c0ecac0ddca3dc7779d77589ccc 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> 023bd2eb6a772d80fe54cbef0182b1b0ad5ef2b3 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
> 1e943d621732889a1c005b243920dc32cea7af66 
>   clients/src/main/java/org/apache/kafka/common/Node.java 
> f4e4186c7602787e58e304a2f1c293

[jira] [Commented] (KAFKA-2168) New consumer poll() can block other calls like position(), commit(), and close() indefinitely

2015-06-01 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568212#comment-14568212
 ] 

Jason Gustafson commented on KAFKA-2168:


The most recent patch attempts to follow [~guozhang]'s overall advice above. 
Most of the calls are still blocking, but I have moved the blocking code out of 
Coordinator/Fetcher and into KafkaConsumer. This makes it possible to use 
wakeup() from the consumer without splitting the logic across multiple classes. 
The consumer is no longer synchronized, which makes it unsafe for 
multi-threaded access, but wakeup() can be safely used from other threads. This 
should also resolve KAFKA-2230. Note also that this patch will likely have to 
be updated if KAFKA-2123 is accepted.
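A sketch of the usage this enables, with class names as they appear in this patch (e.g. ConsumerWakeupException, which may still change):

{code}
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.ConsumerWakeupException;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class WakeupExample {
    public static void runUntilShutdown(final KafkaConsumer<byte[], byte[]> consumer) {
        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                consumer.wakeup(); // the one call that is safe from another thread
            }
        }));
        try {
            while (true) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(1000);
                // process records ...
            }
        } catch (ConsumerWakeupException e) {
            // wakeup() aborted a blocking poll(); fall through and close.
        } finally {
            consumer.close();
        }
    }
}
{code}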

> New consumer poll() can block other calls like position(), commit(), and 
> close() indefinitely
> -
>
> Key: KAFKA-2168
> URL: https://issues.apache.org/jira/browse/KAFKA-2168
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Ewen Cheslack-Postava
>Assignee: Jason Gustafson
> Attachments: KAFKA-2168.patch, KAFKA-2168_2015-06-01_16:03:38.patch
>
>
> The new consumer is currently using very coarse-grained synchronization. For 
> most methods this isn't a problem since they finish quickly once the lock is 
> acquired, but poll() might run for a long time (and commonly will since 
> polling with long timeouts is a normal use case). This means any operations 
> invoked from another thread may block until the poll() call completes.
> Some example use cases where this can be a problem:
> * A shutdown hook is registered to trigger shutdown and invokes close(). It 
> gets invoked from another thread and blocks indefinitely.
> * Users want to manage offset commits themselves in a background thread. If 
> the commit policy is not purely time based, it's not currently possible to 
> make sure the call to commit() will be processed promptly.
> Two possible solutions to this:
> 1. Make sure a lock is not held during the actual select call. Since we have 
> multiple layers (KafkaConsumer -> NetworkClient -> Selector -> nio Selector) 
> this is probably hard to make work cleanly since locking is currently only 
> performed at the KafkaConsumer level and we'd want it unlocked around a 
> single line of code in Selector.
> 2. Wake up the selector before synchronizing for certain operations. This 
> would require some additional coordination to make sure the caller of 
> wakeup() is able to acquire the lock promptly (instead of, e.g., the poll() 
> thread being woken up and then promptly reacquiring the lock with a 
> subsequent long poll() call).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33065: Patch for KAFKA-1928

2015-06-01 Thread Jay Kreps

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33065/#review86117
---



clients/src/main/java/org/apache/kafka/common/network/Selector.java


This is less of a "connection id" and more of a formatted connection, 
right? Something like formatConnection?


- Jay Kreps


On May 31, 2015, 9:49 p.m., Gwen Shapira wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33065/
> ---
> 
> (Updated May 31, 2015, 9:49 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: 1928 and KAFKA-1928
> https://issues.apache.org/jira/browse/1928
> https://issues.apache.org/jira/browse/KAFKA-1928
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> first pass on replacing Send
> 
> 
> implement maxSize and improved docs
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> Conflicts:
>   core/src/main/scala/kafka/network/RequestChannel.scala
> 
> moved selector out of abstract thread
> 
> 
> mid-way through putting selector in SocketServer
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> Also, SocketServer is now using Selector. Stil a bit messy - but all tests 
> pass.
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> renamed requestKey to connectionId to reflect new use and changed type from 
> Any to String
> 
> 
> Following Jun's comments - moved MultiSend to client. Cleaned up destinations 
> as well
> 
> 
> removed reify and remaining from send/recieve API, per Jun. moved 
> maybeCloseOldest() to Selector per Jay
> 
> 
> added idString to node API, changed written to int in Send API
> 
> 
> cleaning up MultiSend, added size() to Send interface
> 
> 
> fixed some issues with multisend
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> fixed metric thingies
> 
> 
> fixed response order bug
> 
> 
> error handling for illegal selector state and fix metrics bug
> 
> 
> optimized selection key lookup with identity hash
> 
> 
> fix accidental change
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> addressing Jun's comments
> 
> 
> removed connection-aging for clients
> 
> 
> fix issues with exception handling and other cleanup
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> Revert "removed connection-aging for clients"
> 
> This reverts commit 016669123a370b561b5ac78f8f1cf7bdd958e7d1.
> 
> improving exception handling and other minor fixes
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> fixes based on Jun and Guozhang comments. Exposed idle metrics as Gauge, 
> changed Send size to long, and fixed an existing issue where Reporters are 
> not actually loaded
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> move to single metrics object
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> KAFKA-1928-v2
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/clients/ClusterConnectionStates.java 
> da76cc257b4cfe3c4bce7120a1f14c7f31ef8587 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> cf32e4e7c40738fe6d8adc36ae0cfad459ac5b0b 
>   clients/src/main/java/org/apache/kafka/clients/InFlightRequests.java 
> 936487b16e7ac566f8bdcd39a7240ceb619fd30e 
>   clients/src/main/java/org/apache/kafka/clients/KafkaClient.java 
> 1311f85847b022efec8cb05c450bb18231db6979 
>   clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
> 435fbb5116e80302eba11ed1d3069cb577dbdcbd 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> bdff518b732105823058e6182f445248b45dc388 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
>  b2764df11afa7a99fce46d1ff48960d889032d14 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
>  ef9dd5238fbc771496029866ece1d85db6d7b7a5 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> ded19d85c4605c0ecac0ddca3dc7779d77589ccc 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> 023bd2eb6a772d80fe54cbef0182b1b0ad5ef2b3 
>   
> client

[jira] [Updated] (KAFKA-2168) New consumer poll() can block other calls like position(), commit(), and close() indefinitely

2015-06-01 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-2168:
---
Attachment: KAFKA-2168_2015-06-01_16:03:38.patch

> New consumer poll() can block other calls like position(), commit(), and 
> close() indefinitely
> -
>
> Key: KAFKA-2168
> URL: https://issues.apache.org/jira/browse/KAFKA-2168
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Ewen Cheslack-Postava
>Assignee: Jason Gustafson
> Attachments: KAFKA-2168.patch, KAFKA-2168_2015-06-01_16:03:38.patch
>
>
> The new consumer is currently using very coarse-grained synchronization. For 
> most methods this isn't a problem since they finish quickly once the lock is 
> acquired, but poll() might run for a long time (and commonly will since 
> polling with long timeouts is a normal use case). This means any operations 
> invoked from another thread may block until the poll() call completes.
> Some example use cases where this can be a problem:
> * A shutdown hook is registered to trigger shutdown and invokes close(). It 
> gets invoked from another thread and blocks indefinitely.
> * Users want to manage offset commits themselves in a background thread. If 
> the commit policy is not purely time based, it's not currently possible to 
> make sure the call to commit() will be processed promptly.
> Two possible solutions to this:
> 1. Make sure a lock is not held during the actual select call. Since we have 
> multiple layers (KafkaConsumer -> NetworkClient -> Selector -> nio Selector) 
> this is probably hard to make work cleanly since locking is currently only 
> performed at the KafkaConsumer level and we'd want it unlocked around a 
> single line of code in Selector.
> 2. Wake up the selector before synchronizing for certain operations. This 
> would require some additional coordination to make sure the caller of 
> wakeup() is able to acquire the lock promptly (instead of, e.g., the poll() 
> thread being woken up and then promptly reacquiring the lock with a 
> subsequent long poll() call).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34789: Patch for KAFKA-2168

2015-06-01 Thread Jason Gustafson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34789/
---

(Updated June 1, 2015, 11:04 p.m.)


Review request for kafka.


Bugs: KAFKA-2168
https://issues.apache.org/jira/browse/KAFKA-2168


Repository: kafka


Description (updated)
---

KAFKA-2168; move blocking calls into KafkaConsumer to enable async wakeup()


Diffs (updated)
-

  clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 
8f587bc0705b65b3ef37c86e0c25bb43ab8803de 
  
clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerWakeupException.java
 PRE-CREATION 
  clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
  clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java 
f50da825756938c193d7f07bee953e000e2627d9 
  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
 b2764df11afa7a99fce46d1ff48960d889032d14 
  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java 
ef9dd5238fbc771496029866ece1d85db6d7b7a5 
  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/OffsetFetchResult.java
 PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/SubscriptionState.java
 cee75410127dd1b86c1156563003216d93a086b3 
  clients/src/test/java/org/apache/kafka/clients/consumer/MockConsumerTest.java 
677edd385f35d4262342b567262c0b874876d25b 
  
clients/src/test/java/org/apache/kafka/clients/consumer/internals/CoordinatorTest.java
 b06c4a73e2b4e9472cd772c8bc32bf4a29f431bb 
  
clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
 419541011d652becf0cda7a5e62ce813cddb1732 
  
clients/src/test/java/org/apache/kafka/clients/consumer/internals/SubscriptionStateTest.java
 e000cf8e10ebfacd6c9ee68d7b88ff8c157f73c6 

Diff: https://reviews.apache.org/r/34789/diff/


Testing
---


Thanks,

Jason Gustafson



[jira] [Commented] (KAFKA-2168) New consumer poll() can block other calls like position(), commit(), and close() indefinitely

2015-06-01 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568198#comment-14568198
 ] 

Jason Gustafson commented on KAFKA-2168:


Updated reviewboard https://reviews.apache.org/r/34789/diff/
 against branch upstream/trunk

> New consumer poll() can block other calls like position(), commit(), and 
> close() indefinitely
> -
>
> Key: KAFKA-2168
> URL: https://issues.apache.org/jira/browse/KAFKA-2168
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Ewen Cheslack-Postava
>Assignee: Jason Gustafson
> Attachments: KAFKA-2168.patch, KAFKA-2168_2015-06-01_16:03:38.patch
>
>
> The new consumer is currently using very coarse-grained synchronization. For 
> most methods this isn't a problem since they finish quickly once the lock is 
> acquired, but poll() might run for a long time (and commonly will since 
> polling with long timeouts is a normal use case). This means any operations 
> invoked from another thread may block until the poll() call completes.
> Some example use cases where this can be a problem:
> * A shutdown hook is registered to trigger shutdown and invokes close(). It 
> gets invoked from another thread and blocks indefinitely.
> * Users want to manage offset commits themselves in a background thread. If 
> the commit policy is not purely time based, it's not currently possible to 
> make sure the call to commit() will be processed promptly.
> Two possible solutions to this:
> 1. Make sure a lock is not held during the actual select call. Since we have 
> multiple layers (KafkaConsumer -> NetworkClient -> Selector -> nio Selector) 
> this is probably hard to make work cleanly since locking is currently only 
> performed at the KafkaConsumer level and we'd want it unlocked around a 
> single line of code in Selector.
> 2. Wake up the selector before synchronizing for certain operations. This 
> would require some additional coordination to make sure the caller of 
> wakeup() is able to acquire the lock promptly (instead of, e.g., the poll() 
> thread being woken up and then promptly reacquiring the lock with a 
> subsequent long poll() call).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2231) Deleting a topic fails

2015-06-01 Thread James G. Haberly (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568122#comment-14568122
 ] 

James G. Haberly commented on KAFKA-2231:
-

I'm not certain what information this would provide that would help in 
resolving the problem on 8.2.1.

Are there no standard regression tests that are run before a release is 
published?




> Deleting a topic fails
> --
>
> Key: KAFKA-2231
> URL: https://issues.apache.org/jira/browse/KAFKA-2231
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
> Environment: Windows 8.1
>Reporter: James G. Haberly
>Priority: Minor
>
> delete.topic.enable=true is in config\server.properties.
> Using --list shows the topic "marked for deletion".
> Stopping and restarting kafka and zookeeper does not delete the topic; it 
> remains "marked for deletion".
> Trying to recreate the topic fails with "Topic XXX already exists".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2231) Deleting a topic fails

2015-06-01 Thread James G. Haberly (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568118#comment-14568118
 ] 

James G. Haberly commented on KAFKA-2231:
-

It is a test system.  1 ZooKeeper instance, 1 Kafka instance, 1 
partition for each of the 2 topics I had created, and no active 
producers or consumers.




> Deleting a topic fails
> --
>
> Key: KAFKA-2231
> URL: https://issues.apache.org/jira/browse/KAFKA-2231
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
> Environment: Windows 8.1
>Reporter: James G. Haberly
>Priority: Minor
>
> delete.topic.enable=true is in config\server.properties.
> Using --list shows the topic "marked for deletion".
> Stopping and restarting kafka and zookeeper does not delete the topic; it 
> remains "marked for deletion".
> Trying to recreate the topic fails with "Topic XXX already exists".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Anyone else sees SocketServerTest failing on trunk?

2015-06-01 Thread Jun Rao
Hmm, don't see the failure myself.

Thanks,

Jun

On Sun, May 31, 2015 at 2:30 PM, Gwen Shapira  wrote:

> Hi,
>
> I'm running:
> ./gradlew cleanTest test
>
> on trunk and all of SocketServerTest tests are failing with:
> java.net.SocketException: Socket closed
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at
>
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> at
>
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> at
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:579)
> at java.net.Socket.connect(Socket.java:528)
> at java.net.Socket.<init>(Socket.java:425)
> at java.net.Socket.<init>(Socket.java:208)
> at
> kafka.network.SocketServerTest.connect(SocketServerTest.scala:79)
> at
> kafka.network.SocketServerTest.simpleRequest(SocketServerTest.scala:89)
>
> This error does not happen when I just run SocketServerTest, and I don't
> see it happening on Jenkins either.
>
> I started seeing it somewhere after commit bb133c6 - KAFKA-1374, but I have
> not run a full bisect yet, so I can't point to the specific commit. I ran a
> diff, but I don't see any obvious change that could cause it.
>
> Anyone else ran into the same issue?
>
> Gwen
>


[jira] [Updated] (KAFKA-2226) NullPointerException in TestPurgatoryPerformance

2015-06-01 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-2226:
---
   Resolution: Fixed
Fix Version/s: 0.8.3
   Status: Resolved  (was: Patch Available)

Thanks for the latest patch. +1 and committed to trunk.

> NullPointerException in TestPurgatoryPerformance
> 
>
> Key: KAFKA-2226
> URL: https://issues.apache.org/jira/browse/KAFKA-2226
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Yasuhiro Matsuda
> Fix For: 0.8.3
>
> Attachments: KAFKA-2226.patch, KAFKA-2226_2015-05-28_17:18:55.patch, 
> KAFKA-2226_2015-05-29_10:49:34.patch, KAFKA-2226_2015-05-29_15:04:35.patch, 
> KAFKA-2226_2015-05-29_15:10:24.patch
>
>
> A NullPointerException sometimes shows up in TimerTaskList.remove while 
> running TestPurgatoryPerformance. I’m on the HEAD of trunk.
> {code}
> > ./bin/kafka-run-class.sh kafka.TestPurgatoryPerformance --key-space-size 
> > 10 --keys 3 --num 10 --pct50 50 --pct75 75 --rate 1000 --size 50 
> > --timeout 20
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/Users/okaraman/code/kafka/core/build/dependant-libs-2.10.5/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/Users/okaraman/code/kafka/core/build/dependant-libs-2.10.5/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> [2015-05-27 10:02:14,782] ERROR [completion thread], Error due to  
> (kafka.TestPurgatoryPerformance$CompletionQueue$$anon$1)
> java.lang.NullPointerException
>   at kafka.utils.timer.TimerTaskList.remove(TimerTaskList.scala:80)
>   at kafka.utils.timer.TimerTaskEntry.remove(TimerTaskList.scala:128)
>   at kafka.utils.timer.TimerTask$class.cancel(TimerTask.scala:27)
>   at kafka.server.DelayedOperation.cancel(DelayedOperation.scala:50)
>   at kafka.server.DelayedOperation.forceComplete(DelayedOperation.scala:71)
>   at 
> kafka.TestPurgatoryPerformance$CompletionQueue$$anon$1.doWork(TestPurgatoryPerformance.scala:263)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2235) LogCleaner offset map overflow

2015-06-01 Thread Ivan Simoneko (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Simoneko updated KAFKA-2235:
-
Attachment: KAFKA-2235_v1.patch

> LogCleaner offset map overflow
> --
>
> Key: KAFKA-2235
> URL: https://issues.apache.org/jira/browse/KAFKA-2235
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.8.1, 0.8.2.0
>Reporter: Ivan Simoneko
>Assignee: Jay Kreps
> Attachments: KAFKA-2235_v1.patch
>
>
> We've seen log cleaning generate an error for a topic with lots of small 
> messages. It seems that an overflow of the cleaner's offset map is possible 
> if a log segment contains more unique keys than there are empty slots in the 
> offsetMap. Checking baseOffset and map utilization before processing a 
> segment seems insufficient because it doesn't take the segment size (the 
> number of unique messages in the segment) into account.
> I suggest estimating the upper bound of keys in a segment as the number of 
> messages in the segment and comparing it with the number of available slots 
> in the map (keeping the desired load factor in mind). This works whenever an 
> empty map can hold all the keys of a single segment. If even a single segment 
> cannot fit into an empty map, the cleanup process will still fail. Probably 
> there should be a limit on the log segment entry count?
> Here is the stack trace for this error:
> 2015-05-19 16:52:48,758 ERROR [kafka-log-cleaner-thread-0] 
> kafka.log.LogCleaner - [kafka-log-cleaner-thread-0], Error due to
> java.lang.IllegalArgumentException: requirement failed: Attempt to add a new 
> entry to a full offset map.
>at scala.Predef$.require(Predef.scala:233)
>at kafka.log.SkimpyOffsetMap.put(OffsetMap.scala:79)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:543)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:538)
>at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:32)
>at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>at kafka.message.MessageSet.foreach(MessageSet.scala:67)
>at 
> kafka.log.Cleaner.kafka$log$Cleaner$$buildOffsetMapForSegment(LogCleaner.scala:538)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:515)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:512)
>at scala.collection.immutable.Stream.foreach(Stream.scala:547)
>at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:512)
>at kafka.log.Cleaner.clean(LogCleaner.scala:307)
>at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:221)
>at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:199)
>at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2235) LogCleaner offset map overflow

2015-06-01 Thread Ivan Simoneko (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Simoneko updated KAFKA-2235:
-
Status: Patch Available  (was: Open)

> LogCleaner offset map overflow
> --
>
> Key: KAFKA-2235
> URL: https://issues.apache.org/jira/browse/KAFKA-2235
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.8.1, 0.8.2.0
>Reporter: Ivan Simoneko
>Assignee: Jay Kreps
>
> We've seen log cleaning generate an error for a topic with lots of small 
> messages. It seems that an overflow of the cleaner's offset map is possible 
> if a log segment contains more unique keys than there are empty slots in the 
> offsetMap. Checking baseOffset and map utilization before processing a 
> segment seems insufficient because it doesn't take the segment size (the 
> number of unique messages in the segment) into account.
> I suggest estimating the upper bound of keys in a segment as the number of 
> messages in the segment and comparing it with the number of available slots 
> in the map (keeping the desired load factor in mind). This works whenever an 
> empty map can hold all the keys of a single segment. If even a single segment 
> cannot fit into an empty map, the cleanup process will still fail. Probably 
> there should be a limit on the log segment entry count?
> Here is the stack trace for this error:
> 2015-05-19 16:52:48,758 ERROR [kafka-log-cleaner-thread-0] 
> kafka.log.LogCleaner - [kafka-log-cleaner-thread-0], Error due to
> java.lang.IllegalArgumentException: requirement failed: Attempt to add a new 
> entry to a full offset map.
>at scala.Predef$.require(Predef.scala:233)
>at kafka.log.SkimpyOffsetMap.put(OffsetMap.scala:79)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:543)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:538)
>at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:32)
>at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>at kafka.message.MessageSet.foreach(MessageSet.scala:67)
>at 
> kafka.log.Cleaner.kafka$log$Cleaner$$buildOffsetMapForSegment(LogCleaner.scala:538)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:515)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:512)
>at scala.collection.immutable.Stream.foreach(Stream.scala:547)
>at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:512)
>at kafka.log.Cleaner.clean(LogCleaner.scala:307)
>at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:221)
>at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:199)
>at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-2235) LogCleaner offset map overflow

2015-06-01 Thread Ivan Simoneko (JIRA)
Ivan Simoneko created KAFKA-2235:


 Summary: LogCleaner offset map overflow
 Key: KAFKA-2235
 URL: https://issues.apache.org/jira/browse/KAFKA-2235
 Project: Kafka
  Issue Type: Bug
  Components: core, log
Affects Versions: 0.8.1, 0.8.2.0
Reporter: Ivan Simoneko
Assignee: Jay Kreps


We've seen log cleaning generate an error for a topic with lots of small 
messages. It seems that an overflow of the cleaner's offset map is possible if 
a log segment contains more unique keys than there are empty slots in the 
offsetMap. Checking baseOffset and map utilization before processing a segment 
seems insufficient because it doesn't take the segment size (the number of 
unique messages in the segment) into account.

I suggest estimating the upper bound of keys in a segment as the number of 
messages in the segment and comparing it with the number of available slots in 
the map (keeping the desired load factor in mind). This works whenever an empty 
map can hold all the keys of a single segment. If even a single segment cannot 
fit into an empty map, the cleanup process will still fail. Probably there 
should be a limit on the log segment entry count?
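A purely illustrative, hedged sketch of that pre-check (OffsetMapLike, slots, 
and desiredLoadFactor are made-up names for illustration, not the actual 
SkimpyOffsetMap API):

{code}
// Hedged sketch: skip (or defer) a segment whose worst case would overflow the map.
trait OffsetMapLike {
  def slots: Int                // total capacity of the map
  def size: Int                 // slots currently occupied
  def desiredLoadFactor: Double // e.g. 0.75
}

// Worst case, every message in the segment carries a previously unseen key,
// so the segment is only safe to process if that many slots are still free.
def canProcessSegment(segmentMessageCount: Long, map: OffsetMapLike): Boolean = {
  val freeSlots = (map.slots * map.desiredLoadFactor).toLong - map.size
  segmentMessageCount <= freeSlots
}
{code}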

Here is the stack trace for this error:
2015-05-19 16:52:48,758 ERROR [kafka-log-cleaner-thread-0] kafka.log.LogCleaner 
- [kafka-log-cleaner-thread-0], Error due to
java.lang.IllegalArgumentException: requirement failed: Attempt to add a new 
entry to a full offset map.
   at scala.Predef$.require(Predef.scala:233)
   at kafka.log.SkimpyOffsetMap.put(OffsetMap.scala:79)
   at 
kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:543)
   at 
kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:538)
   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
   at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:32)
   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
   at kafka.message.MessageSet.foreach(MessageSet.scala:67)
   at 
kafka.log.Cleaner.kafka$log$Cleaner$$buildOffsetMapForSegment(LogCleaner.scala:538)
   at 
kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:515)
   at 
kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:512)
   at scala.collection.immutable.Stream.foreach(Stream.scala:547)
   at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:512)
   at kafka.log.Cleaner.clean(LogCleaner.scala:307)
   at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:221)
   at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:199)
   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34734: Patch for KAFKA-2226

2015-06-01 Thread Onur Karaman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34734/#review86047
---


Ran the patch with the following command 60+ times and didn't see an NPE 
(earlier attempts before this patch took anywhere from 5 - 30 attempts to hit 
the NPE):
```
./bin/kafka-run-class.sh kafka.TestPurgatoryPerformance --key-space-size 10 
--keys 3 --num 10 --pct50 50 --pct75 75 --rate 1000 --size 50 --timeout 20
```

As a side note, I think the timing wheel design would be simpler if:
1. we allow a TimerTask to run at most once
2. we force a TimerTask to have exactly one TimerTaskEntry and a TimerTaskEntry 
to only ever belong to exactly one TimerTask (just create the TimerTaskEntry in 
the TimerTask's constructor). A hedged sketch of both points follows below.
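
A hedged Scala sketch of that simplification (illustrative shapes only, not the 
actual kafka.utils.timer classes; remove() retries while the entry migrates 
between buckets, which is the same flavor of race the patch fixes):

```
class TimerTaskList {
  private val entries = new java.util.concurrent.ConcurrentLinkedQueue[TimerTaskEntry]()
  def add(e: TimerTaskEntry): Unit = synchronized { entries.add(e); e.list = this }
  def remove(e: TimerTaskEntry): Unit = synchronized {
    // Only detach the entry if it still belongs to this bucket.
    if (entries.remove(e) && (e.list eq this)) e.list = null
  }
}

class TimerTaskEntry(val task: TimerTask) {
  @volatile var list: TimerTaskList = null // bucket currently holding this entry
  // Retry until detached: the entry may be moved between buckets concurrently.
  def remove(): Unit = {
    var current = list
    while (current != null) {
      current.remove(this)
      current = list
    }
  }
}

// Exactly one entry per task, created in the constructor, so a task can be
// scheduled (and therefore run) at most once.
abstract class TimerTask(val delayMs: Long) extends Runnable {
  val entry: TimerTaskEntry = new TimerTaskEntry(this)
  def cancel(): Unit = entry.remove()
}
```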

- Onur Karaman


On May 29, 2015, 10:10 p.m., Yasuhiro Matsuda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34734/
> ---
> 
> (Updated May 29, 2015, 10:10 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2226
> https://issues.apache.org/jira/browse/KAFKA-2226
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> fix a race condition in TimerTaskEntry.remove
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/utils/timer/Timer.scala 
> b8cde820a770a4e894804f1c268b24b529940650 
>   core/src/main/scala/kafka/utils/timer/TimerTask.scala 
> 3407138115d579339ffb6b00e32e38c984ac5d6e 
>   core/src/main/scala/kafka/utils/timer/TimerTaskList.scala 
> e7a96570ddc2367583d6d5590628db7e7f6ba39b 
>   core/src/main/scala/kafka/utils/timer/TimingWheel.scala 
> e92aba3844dbf3372182e14536a5d98cf3366d73 
> 
> Diff: https://reviews.apache.org/r/34734/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Yasuhiro Matsuda
> 
>



Re: KIP Wiki

2015-06-01 Thread Jiangjie Qin
+1

On 6/1/15, 11:53 AM, "Ashish Singh"  wrote:

>I like the idea!
>
>
>On Mon, Jun 1, 2015 at 9:51 AM, Aditya Auradkar <
>aaurad...@linkedin.com.invalid> wrote:
>
>> Hey everyone,
>>
>> We have enough KIP's now (25) that it's a bit hard to tell which ones
>>are
>> adopted or under discussion by glancing at the wiki. Any concerns if I
>> split it into 3 tables (adopted, discarded and KIP's under discussion)?
>>
>> 
>>https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Propo
>>sals
>>
>> Aditya
>>
>>
>
>
>-- 
>
>Regards,
>Ashish



Re: KIP Wiki

2015-06-01 Thread Ashish Singh
I like the idea!


On Mon, Jun 1, 2015 at 9:51 AM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Hey everyone,
>
> We have enough KIP's now (25) that it's a bit hard to tell which ones are
> adopted or under discussion by glancing at the wiki. Any concerns if I
> split it into 3 tables (adopted, discarded and KIP's under discussion)?
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
>
> Aditya
>
>


-- 

Regards,
Ashish


[jira] [Commented] (KAFKA-2231) Deleting a topic fails

2015-06-01 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567640#comment-14567640
 ] 

Onur Karaman commented on KAFKA-2231:
-

Can you provide the number of partitions for the topic being deleted, as well 
as the value of the controller.message.queue.size broker config?

Maybe it's related to a controller deadlock described here: 
[KAFKA-2122|https://issues.apache.org/jira/browse/KAFKA-2122]
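
For anyone debugging a stuck delete, the deletion marker is kept under 
/admin/delete_topics in zookeeper, so it can be inspected with zkCli (a hedged 
example; adjust the host and any chroot to your setup):

$ zkCli
[zk: localhost:2181(CONNECTED) 0] ls /admin/delete_topics
[XXX]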

> Deleting a topic fails
> --
>
> Key: KAFKA-2231
> URL: https://issues.apache.org/jira/browse/KAFKA-2231
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
> Environment: Windows 8.1
>Reporter: James G. Haberly
>Priority: Minor
>
> delete.topic.enable=true is in config\server.properties.
> Using --list shows the topic "marked for deletion".
> Stopping and restarting kafka and zookeeper does not delete the topic; it 
> remains "marked for deletion".
> Trying to recreate the topic fails with "Topic XXX already exists".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


KIP Wiki

2015-06-01 Thread Aditya Auradkar
Hey everyone,

We have enough KIP's now (25) that it's a bit hard to tell which ones are 
adopted or under discussion by glancing at the wiki. Any concerns if I split it 
into 3 tables (adopted, discarded and KIP's under discussion)?
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals

Aditya



RE: [VOTE] KIP-21 Dynamic Configuration

2015-06-01 Thread Aditya Auradkar
Thanks Jay. Since we have enough votes now, I'll mark this as adopted.

Aditya
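
P.S. Purely as an illustration of the snake_case convention settled in point 2 
of the quoted thread below, a hypothetical notification payload might look like 
this (hedged: the authoritative format is whatever the KIP page specifies):

```
{
  "version": 1,
  "entity_type": "topics",
  "entity_name": "my-topic"
}
```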


From: Jay Kreps [jay.kr...@gmail.com]
Sent: Monday, June 01, 2015 8:42 AM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-21 Dynamic Configuration

Awesome. +1

On Sunday, May 31, 2015, Aditya Auradkar 
wrote:

> 2. There was a typo in my previous email. I meant to say that we should
> use snake case because it's more consistent. I couldn't find any examples
> of camel case but did find some snake case (jmx_port). Beyond that, most
> entries are single-word keys.
>
> 3. The purge frequency is short (15 minutes). So it should be safe to
> ignore older notifications. Added a story here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration#KIP-21-DynamicConfiguration-Migrationplanfornotifications
>
> Thanks,
> Aditya
> 
> From: Jay Kreps [jay.kr...@gmail.com ]
> Sent: Saturday, May 30, 2015 9:51 AM
> To: dev@kafka.apache.org 
> Subject: Re: [VOTE] KIP-21 Dynamic Configuration
>
> 1. Great.
> 2. I don't have a preference as to the casing, but I really appreciate
> consistency. Is everything using underscores today? If so let's stick with
> that. If we are already inconsistent then I guess it's too late and we can
> do whatever. Let me know and I'll update the coding standard.
> 3. Not sure what the default purge frequency is. I don't think we need to
> work the details of this out in the KIP, but we need a story for the
> upgrade path so people don't get bitten.
>
> -Jay
>
> On Thu, May 28, 2015 at 11:22 PM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Yeah, the same cleaning mechanism will be carried over.
> >
> > > 1. Are we introducing a new Java API for the config change protocol and
> > if
> > > so where will that appear? Is that going to be part of the java api in
> > the
> > > admin api kip? Let's document that.
> > Yeah, we need to introduce a new Java API for the config change protocol.
> > It should be a part of the AdminClient API. I'll alter KIP-4 to reflect
> > that since the API is being introduced there.
> >
> > > 2. The proposed JSON format uses camel case for field names, is that
> what
> > > we've used for other JSON in zookeeper?
> > I think camel case is more appropriate for the JSON format. For example,
> > under the "brokers" znode, I found "jmx_port".
> >
> > > 3. This changes the format of the notifications, right? How will we
> > > grandfather in the old format? Clusters will have existing change
> > > notifications in the old format so I think the new code will need to be
> > > able to read those?
> > Interesting, I figured the existing notifications were purged by a
> cleaner
> > thread frequently. In that case, we wouldn't need to grandfather any
> > notifications since we would only need to not make any config changes
> for X
> > minutes for there to be no changes in ZK. But the old notifications are
> > actually removed when a new notification is received or the broker is
> > bounced. So we do need to handle notifications in the old format. Should
> we
> > simply ignore older changes since they are only valid for a short period
> of
> > time?
> >
> > Thanks,
> > Aditya
> > 
> > From: Jay Kreps [jay.kr...@gmail.com ]
> > Sent: Thursday, May 28, 2015 5:25 PM
> > To: dev@kafka.apache.org 
> > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> >
> > That is handled now so I am assuming the same mechanism carries over?
> >
> > -Jay
> >
> > On Thu, May 28, 2015 at 5:12 PM, Guozhang Wang wrote:
> >
> > > For the sequential config/changes/config_change_XX znode, do we have
> any
> > > manners to do cleaning in order to avoid the change-log from growing
> > > indefinitely?
> > >
> > > Guozhang
> > >
> > > On Thu, May 28, 2015 at 5:02 PM, Jay Kreps wrote:
> > >
> > > > I still have a couple of questions:
> > > > 1. Are we introducing a new Java API for the config change protocol
> and
> > > if
> > > > so where will that appear? Is that going to be part of the java api
> in
> > > the
> > > > admin api kip? Let's document that.
> > > > 2. The proposed JSON format uses camel case for field names, is that
> > what
> > > > we've used for other JSON in zookeeper?
> > > > 3. This changes the format of the notifications, right? How will we
> > > > grandfather in the old format? Clusters will have existing change
> > > > notifications in the old format so I think the new code will need to
> be
> > > > able to read those?
> > > >
> > > > -Jay
> > > >
> > > > On Thu, May 28, 2015 at 11:41 AM, Aditya Auradkar <
> > > > aaurad...@linkedin.com.invalid> wrote:
> > > >
> > > > > bump
> > > > >
> > > > > 
> > > > > From: Aditya Auradkar
> > > > > Sent: Tuesday, May 26, 2015 1:16 PM
> > > > > To: dev@kafka.apache.org 
> > > > > Subject: RE: [VOTE] KIP-21 Dynamic Configuration
> > > > >
> > > > > Hey everyo

Re: [VOTE] KIP-21 Dynamic Configuration

2015-06-01 Thread Jay Kreps
Awesome. +1

On Sunday, May 31, 2015, Aditya Auradkar 
wrote:

> 2. There was a typo in my previous email. I meant to say that we should
> use snake case because it's more consistent. I couldn't find any examples
> of camel case but did find some snake case (jmx_port). Beyond that, most
> entries are single-word keys.
>
> 3. The purge frequency is short (15 minutes). So it should be safe to
> ignore older notifications. Added a story here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration#KIP-21-DynamicConfiguration-Migrationplanfornotifications
>
> Thanks,
> Aditya
> 
> From: Jay Kreps [jay.kr...@gmail.com ]
> Sent: Saturday, May 30, 2015 9:51 AM
> To: dev@kafka.apache.org 
> Subject: Re: [VOTE] KIP-21 Dynamic Configuration
>
> 1. Great.
> 2. I don't have a preference as to the casing, but I really appreciate
> consistency. Is everything using underscores today? If so let's stick with
> that. If we are already inconsistent then I guess it's too late and we can
> do whatever. Let me know and I'll update the coding standard.
> 3. Not sure what the default purge frequency is. I don't think we need to
> work the details of this out in the KIP, but we need a story for the
> upgrade path so people don't get bitten.
>
> -Jay
>
> On Thu, May 28, 2015 at 11:22 PM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Yeah, the same cleaning mechanism will be carried over.
> >
> > > 1. Are we introducing a new Java API for the config change protocol and
> > if
> > > so where will that appear? Is that going to be part of the java api in
> > the
> > > admin api kip? Let's document that.
> > Yeah, we need to introduce a new Java API for the config change protocol.
> > It should be a part of the AdminClient API. I'll alter KIP-4 to reflect
> > that since the API is being introduced there.
> >
> > > 2. The proposed JSON format uses camel case for field names, is that
> what
> > > we've used for other JSON in zookeeper?
> > I think camel case is more appropriate for the JSON format. For example,
> > under the "brokers" znode, I found "jmx_port".
> >
> > > 3. This changes the format of the notifications, right? How will we
> > > grandfather in the old format? Clusters will have existing change
> > > notifications in the old format so I think the new code will need to be
> > > able to read those?
> > Interesting, I figured the existing notifications were purged by a
> cleaner
> > thread frequently. In that case, we wouldn't need to grandfather any
> > notifications since we would only need to not make any config changes
> for X
> > minutes for there to be no changes in ZK. But the old notifications are
> > actually removed when a new notification is received or the broker is
> > bounced. So we do need to handle notifications in the old format. Should
> we
> > simply ignore older changes since they are only valid for a short period
> of
> > time?
> >
> > Thanks,
> > Aditya
> > 
> > From: Jay Kreps [jay.kr...@gmail.com ]
> > Sent: Thursday, May 28, 2015 5:25 PM
> > To: dev@kafka.apache.org 
> > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> >
> > That is handled now so I am assuming the same mechanism carries over?
> >
> > -Jay
> >
> > On Thu, May 28, 2015 at 5:12 PM, Guozhang Wang wrote:
> >
> > > For the sequential config/changes/config_change_XX znode, do we have
> any
> > > manners to do cleaning in order to avoid the change-log from growing
> > > indefinitely?
> > >
> > > Guozhang
> > >
> > > On Thu, May 28, 2015 at 5:02 PM, Jay Kreps wrote:
> > >
> > > > I still have a couple of questions:
> > > > 1. Are we introducing a new Java API for the config change protocol
> and
> > > if
> > > > so where will that appear? Is that going to be part of the java api
> in
> > > the
> > > > admin api kip? Let's document that.
> > > > 2. The proposed JSON format uses camel case for field names, is that
> > what
> > > > we've used for other JSON in zookeeper?
> > > > 3. This changes the format of the notifications, right? How will we
> > > > grandfather in the old format? Clusters will have existing change
> > > > notifications in the old format so I think the new code will need to
> be
> > > > able to read those?
> > > >
> > > > -Jay
> > > >
> > > > On Thu, May 28, 2015 at 11:41 AM, Aditya Auradkar <
> > > > aaurad...@linkedin.com.invalid> wrote:
> > > >
> > > > > bump
> > > > >
> > > > > 
> > > > > From: Aditya Auradkar
> > > > > Sent: Tuesday, May 26, 2015 1:16 PM
> > > > > To: dev@kafka.apache.org 
> > > > > Subject: RE: [VOTE] KIP-21 Dynamic Configuration
> > > > >
> > > > > Hey everyone,
> > > > >
> > > > > Completed the changes to KIP-4. After today's hangout, there
> doesn't
> > > > > appear to be anything remaining to discuss on this KIP.
> > > > > Please vote so we can formally close this.
> > > > >
> > > > > Thanks,
> > > > > Aditya
> > > > >
>

[jira] [Comment Edited] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567322#comment-14567322
 ] 

nicu marasoiu edited comment on KAFKA-1282 at 6/1/15 2:24 PM:
--

Hi [~junrao], [~nehanarkhede], I added a test, please review. The patch has 2 
variations (latest 2 patches), explained at point 2 below, while the latest 
implements 1' below.

1. I wanted to sleep on MockTime, but here we actually need to physically wait 
at least one epoll/select cycle. Since I have put a 10ms idle time & it works, 
mocked time would not bring benefits, i.e. only the select time needs to be 
waited out. 

1'. Because of potentially large & not deterministically bounded select times, 
I implemented a mechanism to try a few times, waiting 50% more time every time 
(a minimal sketch of the idea follows after point 3).

2. Seems to work with a low (10ms) idle timeout for all current test methods. 
However, I attach a patch with a separate test class for this (and yet another 
utils class for reuse), to isolate configuration between groups of test methods.

3. Shall I do a multiple connections test?
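
A minimal sketch of the retry mechanism from 1' (hypothetical names, not the 
attached patch itself):

{code}
// Poll a condition, waiting 50% longer between attempts.
def retryWithBackoff(firstWaitMs: Long, maxAttempts: Int)(condition: => Boolean): Boolean = {
  var waitMs = firstWaitMs
  var attempts = 0
  while (attempts < maxAttempts) {
    if (condition) return true
    Thread.sleep(waitMs)
    waitMs = waitMs * 3 / 2 // wait 50% more time every time
    attempts += 1
  }
  condition
}

// e.g. retryWithBackoff(firstWaitMs = 350, maxAttempts = 4)(connectionIsClosed(socket))
{code}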


was (Author: nmarasoi):
Hi [~junrao], [~nehanarkhede], I added a test, please review. The patch has 2 
variations (latest 2 patches), explained at point 2 below.

1. I wanted to sleep on MockTime, but here we actually need to physically wait 
at least one epoll/select cycle. Since I have put a 10ms idle time & it works, 
mocked time would not bring benefits, i.e. only the select time needs to be 
waited out. 

1'. Because of potentially large & not deterministically bounded select times, 
I will implement a mechanism to try a few times, waiting 50% more time every 
time.

2. Seems to work with a low (10ms) idle timeout for all current test methods. 
However, I attach a patch with a separate test class for this (and yet another 
utils class for reuse), to isolate configuration between groups of test methods.

3. Shall I do a multiple connections test?

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order_+_test_(same_class).patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch, 
> access_order_+_test_waiting_from_350ms_to_1100ms.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.
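
The "access-order" in the attached patch names suggests an LRU structure; a 
hedged sketch of that idea (illustrative only, not the patch itself): a 
java.util.LinkedHashMap with accessOrder = true keeps entries in 
least-recently-used order, so the idlest connection surfaces at the head.

{code}
class IdleConnectionTracker(idleTimeoutMs: Long) {
  // accessOrder = true: put()/get() moves the key to the tail.
  private val lru = new java.util.LinkedHashMap[String, java.lang.Long](16, 0.75f, true)

  def touch(connectionId: String, nowMs: Long): Unit = { lru.put(connectionId, nowMs); () }

  // The least recently used connection, if it has been idle long enough.
  def oldestIdle(nowMs: Long): Option[String] = {
    val it = lru.entrySet().iterator()
    if (!it.hasNext) return None
    val eldest = it.next()
    if (nowMs - eldest.getValue > idleTimeoutMs) Some(eldest.getKey) else None
  }

  def remove(connectionId: String): Unit = { lru.remove(connectionId); () }
}
{code}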



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1282:
-
Attachment: access_order_+_test_waiting_from_350ms_to_1100ms.patch

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order_+_test_(same_class).patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch, 
> access_order_+_test_waiting_from_350ms_to_1100ms.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567322#comment-14567322
 ] 

nicu marasoiu edited comment on KAFKA-1282 at 6/1/15 2:14 PM:
--

Hi [~junrao], [~nehanarkhede], I added a test, please review. The patch has 2 
variations (latest 2 patches), explained at point 2 below.

1. I wanted to sleep on MockTime, but here we actually need to physically wait 
at least one epoll/select cycle. Since I have put a 10ms idle time & it works, 
mocked time would not bring benefits, i.e. only the select time needs to be 
waited out. 

1'. Because of potentially large & not deterministically bounded select times, 
I will implement a mechanism to try a few times, waiting 50% more time every 
time.

2. Seems to work with a low (10ms) idle timeout for all current test methods. 
However, I attach a patch with a separate test class for this (and yet another 
utils class for reuse), to isolate configuration between groups of test methods.

3. Shall I do a multiple connections test?


was (Author: nmarasoi):
Hi [~junrao], [~nehanarkhede], I added a test, please review. The patch has 2 
variations (latest 2 patches), explained at point 2 below.

1. I wanted to sleep on MockTime, but here we actually need to physically wait 
at least one epoll/select cycle. Since I have put a 10ms idle time & it works, 
mocked time would not bring benefits.

2. Seems to work with a low (10ms) idle timeout for all current test methods. 
However, I attach a patch with a separate test class for this (and yet another 
utils class for reuse), to isolate configuration between groups of test methods.

3. Shall I do a multiple connections test?

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order_+_test_(same_class).patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567322#comment-14567322
 ] 

nicu marasoiu edited comment on KAFKA-1282 at 6/1/15 2:12 PM:
--

Hi [~junrao], [~nehanarkhede], I added a test, please review. The patch has 2 
variations (latest 2 patches), explained at point 2 below.

1. I wanted to sleep on MockTime, but here we actually need to physically wait 
at least one epoll/select cycle. Since I have put a 10ms idle time & it works, 
mocked time would not bring benefits.

2. Seems to work with a low (10ms) idle timeout for all current test methods. 
However, I attach a patch with a separate test class for this (and yet another 
utils class for reuse), to isolate configuration between groups of test methods.

3. Shall I do a multiple connections test?


was (Author: nmarasoi):
Hi [~junrao], [~nehanarkhede], I added a test, please review. The patch has 2 
variations (latest 2 patches), explained at point 2 below.

1. I wanted to sleep on MockTime, but here we actually need to physically wait 
at least one epoll/select cycle. Since I have put a 10ms idle time, there is no 
need for mocked time either.

2. Seems to work with a low (10ms) idle timeout for all current test methods. 
However, I attach a patch with a separate test class for this (and yet another 
utils class for reuse), to isolate configuration between groups of test methods.

3. Shall I do a multiple connections test?

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order_+_test_(same_class).patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567322#comment-14567322
 ] 

nicu marasoiu edited comment on KAFKA-1282 at 6/1/15 2:03 PM:
--

Hi [~junrao], [~nehanarkhede], I added a test, please review. The patch has 2 
variations (latest 2 patches), explained at point 2 below.

1. I wanted to sleep on MockTime, but here we actually need to physically wait 
at least one epoll/select cycle. Since I have put a 10ms idle time, there is no 
need for mocked time either.

2. Seems to work with a low (10ms) idle timeout for all current test methods. 
However, I attach a patch with a separate test class for this (and yet another 
utils class for reuse), to isolate configuration between groups of test methods.

3. Shall I do a multiple connections test?


was (Author: nmarasoi):
Hi [~junrao], [~nehanarkhede], I added a test, please review.

1. I wanted to sleep on MockTime, but here we actually need to physically wait 
at least one epoll/select cycle. Since I have put a 10ms idle time, there is no 
need for mocked time either.

2. Seems to work with a low (10ms) idle timeout for all current test methods. 
However, I attach a patch with a separate test class for this (and yet another 
utils class for reuse), to isolate configuration between groups of test methods.

3. Shall I do a multiple connections test?

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order_+_test_(same_class).patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1282:
-
Attachment: (was: 1282_brush.patch)

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order_+_test_(same_class).patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1282:
-
Attachment: (was: 1282_access-order.patch)

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order_+_test_(same_class).patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567322#comment-14567322
 ] 

nicu marasoiu edited comment on KAFKA-1282 at 6/1/15 2:01 PM:
--

Hi [~junrao], [~nehanarkhede], I added a test, please review.

1. I wanted to sleep on MockTime, but here we actually need to physically wait 
at least one epoll/select cycle. Since I have put a 10ms idle time, there is no 
need for mocked time either.

2. Seems to work with a low (10ms) idle timeout for all current test methods. 
However, I attach a patch with a separate test class for this (and yet another 
utils class for reuse), to isolate configuration between groups of test methods.

3. Shall I do a multiple connections test?


was (Author: nmarasoi):
Added a test. I wanted to sleep on MockTime, but here we actually need to 
physically wait at least one epoll/select cycle. Since I have put a 10ms idle 
time, there is no need for mocked time either.

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 
> 1282_access-order_+_test_(same_class).patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1282:
-
Attachment: (was: 1282_brushed_up.patch)

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order_+_test_(same_class).patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1282:
-
Attachment: access-order_+_test.patch

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 
> 1282_access-order_+_test_(same_class).patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2234) Partition reassignment of a nonexistent topic prevents future reassignments

2015-06-01 Thread Bob Halley (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Halley updated KAFKA-2234:
--
Summary: Partition reassignment of a nonexistent topic prevents future 
reassignments  (was: Partition reassignment of an empty topic prevents future 
reassignments)

> Partition reassignment of a nonexistent topic prevents future reassignments
> ---
>
> Key: KAFKA-2234
> URL: https://issues.apache.org/jira/browse/KAFKA-2234
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
>Reporter: Bob Halley
>Priority: Blocker
>
> The results of this bug are like those of KAFKA-822.  If I erroneously list a 
> non-existent topic in a partition reassignment request, then it will never 
> complete and it becomes impossible to do reassignments until the 
> admin/reassign-partitions node is deleted by hand from zookeeper.
> Note too the incoherent messaging from the bad command.  First it says ERROR, 
> what I'm trying to do is bad, and then it says it has successfully started 
> the reassignment (which indeed it has, at least in the sense of writing an 
> empty list to zookeeper :)).
> # reassignment.json is bad, it refers to the non-existent topic "bad-foo"
> $ cat reassignment.json
>  {"partitions": 
>   [{"topic": "bad-foo", 
> "partition": 0, 
> "replicas": [2] }], 
>   "version":1
>  }
> $ kafka-reassign-partitions.sh --reassignment-json-file reassignment.json 
> --zookeeper localhost:2181/kafka --execute
> Current partition replica assignment
> {"version":1,"partitions":[]}
> Save this to use as the --reassignment-json-file option during rollback
> [2015-06-01 06:34:26,275] ERROR Skipping reassignment of partition 
> [bad-foo,0] since it doesn't exist (kafka.admin.ReassignPartitionsCommand)
> Successfully started reassignment of partitions 
> {"version":1,"partitions":[{"topic":"bad-foo","partition":0,"replicas":[2]}]}
> $ zkCli
> Connecting to localhost:2181
> Welcome to ZooKeeper!
> JLine support is enabled
> WATCHER::
> WatchedEvent state:SyncConnected type:None path:null
> [zk: localhost:2181(CONNECTED) 2] get /kafka/admin/reassign_partitions
> {"version":1,"partitions":[]}
> cZxid = 0x5d
> ctime = Mon Jun 01 06:34:26 PDT 2015
> mZxid = 0x5d
> mtime = Mon Jun 01 06:34:26 PDT 2015
> pZxid = 0x5d
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 29
> numChildren = 0
> ^C
> # Fix reassignment.json
> $ kafka-reassign-partitions.sh --reassignment-json-file reassignment.json 
> --zookeeper localhost:2181/kafka --execute
> Current partition replica assignment
> {"version":1,"partitions":[{"topic":"good-foo","partition":0,"replicas":[2]}]}
> Save this to use as the --reassignment-json-file option during rollback
> Partitions reassignment failed due to Partition reassignment currently in 
> progress for Map(). Aborting operation
> kafka.common.AdminCommandFailedException: Partition reassignment currently in 
> progress for Map(). Aborting operation
>   at 
> kafka.admin.ReassignPartitionsCommand.reassignPartitions(ReassignPartitionsCommand.scala:216)
>   at 
> kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:133)
>   at 
> kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:47)
>   at 
> kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1282:
-
Attachment: 1282_access-order_+_test_(same_class).patch

Added a test. I wanted to sleep on MockTime, but here we actually need to 
physically wait at least one epoll/select cycle. Since I have put a 10ms idle 
time, there is no need for mocked time either.

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 
> 1282_access-order_+_test_(same_class).patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-2234) Partition reassignment of an empty topic prevents future reassignments

2015-06-01 Thread Bob Halley (JIRA)
Bob Halley created KAFKA-2234:
-

 Summary: Partition reassignment of an empty topic prevents future 
reassignments
 Key: KAFKA-2234
 URL: https://issues.apache.org/jira/browse/KAFKA-2234
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.2.1
Reporter: Bob Halley
Priority: Blocker


The results of this bug are like those of KAFKA-822.  If I erroneously list a 
non-existent topic in a partition reassignment request, then it will never 
complete and it becomes impossible to do reassignments until the 
admin/reassign-partitions node is deleted by hand from zookeeper.

Note too the incoherent messaging from the bad command.  First it says ERROR, 
what I'm trying to do is bad, and then it says it has successfully started the 
reassignment (which indeed it has, at least in the sense of writing an empty 
list to zookeeper :)).

# reassignment.json is bad, it refers to the non-existent topic "bad-foo"

$ cat reassignment.json
 {"partitions": 
  [{"topic": "bad-foo", 
"partition": 0, 
"replicas": [2] }], 
  "version":1
 }

$ kafka-reassign-partitions.sh --reassignment-json-file reassignment.json 
--zookeeper localhost:2181/kafka --execute
Current partition replica assignment

{"version":1,"partitions":[]}

Save this to use as the --reassignment-json-file option during rollback
[2015-06-01 06:34:26,275] ERROR Skipping reassignment of partition [bad-foo,0] 
since it doesn't exist (kafka.admin.ReassignPartitionsCommand)
Successfully started reassignment of partitions 
{"version":1,"partitions":[{"topic":"bad-foo","partition":0,"replicas":[2]}]}

$ zkCli
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 2] get /kafka/admin/reassign_partitions
{"version":1,"partitions":[]}
cZxid = 0x5d
ctime = Mon Jun 01 06:34:26 PDT 2015
mZxid = 0x5d
mtime = Mon Jun 01 06:34:26 PDT 2015
pZxid = 0x5d
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 29
numChildren = 0

^C

# Fix reassignment.json

$ kafka-reassign-partitions.sh --reassignment-json-file reassignment.json 
--zookeeper localhost:2181/kafka --execute
Current partition replica assignment

{"version":1,"partitions":[{"topic":"good-foo","partition":0,"replicas":[2]}]}

Save this to use as the --reassignment-json-file option during rollback
Partitions reassignment failed due to Partition reassignment currently in 
progress for Map(). Aborting operation
kafka.common.AdminCommandFailedException: Partition reassignment currently in 
progress for Map(). Aborting operation
at 
kafka.admin.ReassignPartitionsCommand.reassignPartitions(ReassignPartitionsCommand.scala:216)
at 
kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:133)
at 
kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:47)
at 
kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
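
For reference, the by-hand cleanup mentioned above is just deleting the stuck 
znode in zkCli (a hedged example assuming the same /kafka chroot used in this 
report; a plain delete suffices because the node has no children):

$ zkCli
[zk: localhost:2181(CONNECTED) 0] delete /kafka/admin/reassign_partitions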



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1282:
-
Attachment: (was: 1282_access-order_+_test_(same_class).patch)

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1282:
-
Comment: was deleted

(was: Hi [~junrao], [~nehanarkhede],

I added a straight test. 

I saw on my mac that an epoll can take 300ms on low/no traffic. Consequently I 
have put a slightly longer wait, but clearly it is a coin flip. I see the 
following possible solutions to deterministic testing here:
- mock time: inject a Time instance in the constructor, SystemTime in prod 
code, and a mocked/stubbed one from the test (I am on this currently)
- mock selector => select time will be deterministic and close to zero
- try more than once, waiting 50% more time each time (until n attempts and/or 
a max wait time)
- simulate time (e.g. clock change) and "wait" more

I can also try a test with multiple connections, if you think it is worth it. 

I have 2 versions of the same patch: they differ in the way the new test is 
inserted: in the same or a new test class (which allows independent 
parametrizations of the server).
I extracted some methods out in order to parametrize the server differently 
but still reuse them. 
However I also attach a version without extracting.)

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1282:
-
Attachment: (was: access-order_+_test.patch)

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567222#comment-14567222
 ] 

nicu marasoiu edited comment on KAFKA-1282 at 6/1/15 1:27 PM:
--

Hi [~junrao], [~nehanarkhede],

I added a straight test. 

I saw on my mac that an epoll can take 300ms on low/no traffic. Consequently I 
have put a slightly longer wait, but clearly it is a coin flip. I see the 
following possible solutions to deterministic testing here:
- mock time: inject a Time instance in the constructor, SystemTime in prod 
code, and a mocked/stubbed one from the test (I am on this currently; a hedged 
sketch follows below)
- mock selector => select time will be deterministic and close to zero
- try more than once, waiting 50% more time each time (until n attempts and/or 
a max wait time)
- simulate time (e.g. clock change) and "wait" more

I can also try a test with multiple connections, if you think it is worth it. 

I have 2 versions of the same patch: they differ in the way the new test is 
inserted: in the same or a new test class (which allows independent 
parametrizations of the server).
I extracted some methods out in order to parametrize the server differently 
but still reuse them. 
However I also attach a version without extracting.
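
A hedged sketch of the "mock time" option above (Kafka does have 
kafka.utils.Time/SystemTime/MockTime, but the code below is illustrative, not 
their exact implementation):

{code}
trait Time { def milliseconds: Long }

object SystemTime extends Time { def milliseconds: Long = System.currentTimeMillis() }

// Test clock: advancing it is explicit, so no real waiting is needed.
class MockTime(var nowMs: Long = 0L) extends Time {
  def milliseconds: Long = nowMs
  def sleep(ms: Long): Unit = nowMs += ms
}

// Production code takes the clock as a constructor argument.
class IdleChecker(time: Time, idleTimeoutMs: Long) {
  def isIdle(lastActivityMs: Long): Boolean =
    time.milliseconds - lastActivityMs > idleTimeoutMs
}
{code}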


was (Author: nmarasoi):
Hi [~junrao], [~nehanarkhede],

I added a straight test. 

I saw on my mac that an epoll can take 300ms on low/no traffic. Consequently I 
have put a slightly longer wait, but clearly it is a coin flip. I see the 
following possible solutions to deterministic testing here:
- mock selector => select time will be deterministic and close to zero
- try more than once, waiting 50% more time each time (until n attempts and/or 
a max wait time)
- simulate time (e.g. clock change; do you suggest another?) and "wait" more

I can also try a test with multiple connections, if you think it is worth it. 

I have 2 versions of the same patch: they differ in the way the new test is 
inserted: in the same or a new test class (which allows independent 
parametrizations of the server).
I extracted some methods out in order to parametrize the server differently 
but still reuse them. 
However I also attach a version without extracting.

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 
> 1282_access-order_+_test_(same_class).patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1282:
-
Attachment: 1282_access-order_+_test_(same_class).patch

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 
> 1282_access-order_+_test_(same_class).patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567222#comment-14567222
 ] 

nicu marasoiu edited comment on KAFKA-1282 at 6/1/15 1:18 PM:
--

Hi [~junrao], [~nehanarkhede],

I added a straightforward test.

I saw on my mac that an epoll call can take 300ms under low/no traffic.
Consequently I made the wait slightly longer, but the test is still
effectively a coin flip. I see the following possible ways to make the test
deterministic:
- mock the selector => select time will be deterministic and close to zero
- try more than once, waiting 50% more each time (up to n attempts and/or a
max wait time)
- simulate time (e.g. a clock change; do you suggest another?) and "wait" more

I can also try a test with multiple connections, if you think it is worth it. 

I have 2 versions of the same patch; they differ in where the new test is
inserted: in the same test class or in a new one (which allows independent
parametrizations of the server).
I extracted some methods in order to parametrize the server differently while
still reusing code.
However, I also attach a version without the extraction.


was (Author: nmarasoi):
Hi [~junrao], [~nehanarkhede],

I added a straightforward test. I extracted some methods in order to
parametrize the server differently while still reusing code. I saw on my mac
that an epoll call can take 300ms under low/no traffic. Consequently I made
the wait slightly longer, but the test is still effectively a coin flip,
unless I mock the selector. Currently the selector is not among the
constructor args; maybe I will try this. Also, do you have any idea how to
simulate time without waiting? I can also try a test with multiple
connections, if you think it is worth it.

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567222#comment-14567222
 ] 

nicu marasoiu edited comment on KAFKA-1282 at 6/1/15 12:55 PM:
---

Hi [~junrao], [~nehanarkhede],

I added a straightforward test. I extracted some methods in order to
parametrize the server differently while still reusing code. I saw on my mac
that an epoll call can take 300ms under low/no traffic. Consequently I made
the wait slightly longer, but the test is still effectively a coin flip,
unless I mock the selector. Currently the selector is not among the
constructor args; maybe I will try this. Also, do you have any idea how to
simulate time without waiting? I can also try a test with multiple
connections, if you think it is worth it.


was (Author: nmarasoi):
Hi [~junrao], [~nehanarkhede],

I added a straightforward test. I extracted some methods in order to
parametrize the server differently while still reusing code. I think the test
can be tweaked to wait for less time, but experiments with smaller waits (in
millis) failed. I can also try a test with multiple connections, if you think
it is worth it.

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1282:
-
Attachment: access-order_+_test.patch

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1282:
-
Attachment: (was: 1282_access-order_+_test_single_conn.patch)

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch, 
> access-order_+_test.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567222#comment-14567222
 ] 

nicu marasoiu edited comment on KAFKA-1282 at 6/1/15 12:15 PM:
---

Hi [~junrao], [~nehanarkhede],

I added a straightforward test. I extracted some methods in order to
parametrize the server differently while still reusing code. I think the test
can be tweaked to wait for less time, but experiments with smaller waits (in
millis) failed. I can also try a test with multiple connections, if you think
it is worth it.


was (Author: nmarasoi):
Hi,

I added a straightforward test. I extracted some methods in order to
parametrize the server differently while still reusing code. I think the test
can be tweaked to wait for less time, but experiments with smaller waits (in
millis) failed. I can also try a test with multiple connections, if you think
it is worth it.

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 
> 1282_access-order_+_test_single_conn.patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567222#comment-14567222
 ] 

nicu marasoiu edited comment on KAFKA-1282 at 6/1/15 12:14 PM:
---

Hi,

I added a straightforward test. I extracted some methods in order to
parametrize the server differently while still reusing code. I think the test
can be tweaked to wait for less time, but experiments with smaller waits (in
millis) failed. I can also try a test with multiple connections, if you think
it is worth it.


was (Author: nmarasoi):
Hi,

Added a test. I think it can be tweaked to wait for less time, but
experiments with really short waits (millis) failed. I can also try a test
with multiple connections, if you think it is worth it.

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 
> 1282_access-order_+_test_single_conn.patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1282) Disconnect idle socket connection in Selector

2015-06-01 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1282:
-
Attachment: 1282_access-order_+_test_single_conn.patch

Hi,

Added a test. I think it can be tweaked to wait for less time, but
experiments with really short waits (millis) failed. I can also try a test
with multiple connections, if you think it is worth it.

> Disconnect idle socket connection in Selector
> -
>
> Key: KAFKA-1282
> URL: https://issues.apache.org/jira/browse/KAFKA-1282
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: nicu marasoiu
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: 1282_access-order.patch, 
> 1282_access-order_+_test_single_conn.patch, 1282_brush.patch, 
> 1282_brushed_up.patch, 
> KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch
>
>
> To reduce # socket connections, it would be useful for the new producer to 
> close socket connections that are idle. We can introduce a new producer 
> config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2187) Introduce merge-kafka-pr.py script

2015-06-01 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567032#comment-14567032
 ] 

Ismael Juma commented on KAFKA-2187:


Sure, that was indeed my plan. Will do it soon.

> Introduce merge-kafka-pr.py script
> --
>
> Key: KAFKA-2187
> URL: https://issues.apache.org/jira/browse/KAFKA-2187
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Minor
> Attachments: KAFKA-2187.patch, KAFKA-2187_2015-05-20_23:14:05.patch
>
>
> This script will be used to merge GitHub pull requests. It will pull from 
> the Apache Git repo to the current branch, squash and merge the PR, push 
> the commit to trunk, close the PR (via the commit message) and close the 
> relevant JIRA issue (via the JIRA API).
> Spark has a script that does most (if not all) of this and that will be used 
> as the starting point:
> https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py
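
As a rough illustration of the squash-and-merge step such a script automates
(the real script is Python, modeled on Spark's; this Java sketch is
hypothetical and relies on GitHub exposing PR heads as pull/<id>/head refs):

    import java.io.IOException;

    // Sketch of the git portion of a PR merge script: fetch the PR head,
    // squash-merge it onto trunk and commit with a message that references
    // and closes the PR. Pushing to the Apache remote and closing the JIRA
    // issue (via its REST API) are omitted.
    public class MergePrSketch {
        static void git(String... args)
                throws IOException, InterruptedException {
            String[] cmd = new String[args.length + 1];
            cmd[0] = "git";
            System.arraycopy(args, 0, cmd, 1, args.length);
            Process p = new ProcessBuilder(cmd).inheritIO().start();
            if (p.waitFor() != 0)
                throw new IOException(
                        "git " + String.join(" ", args) + " failed");
        }

        public static void main(String[] args) throws Exception {
            String pr = args[0]; // PR number, e.g. "42"
            git("fetch", "https://github.com/apache/kafka.git",
                "pull/" + pr + "/head");
            git("checkout", "trunk");
            git("merge", "--squash", "FETCH_HEAD");
            // "Closes #<pr>" in the message lets the PR be closed on merge.
            git("commit", "-m", "KAFKA-XXXX: <title>; Closes #" + pr);
        }
    }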



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)