[jira] [Created] (KAFKA-5340) Add additional test cases for batch splitting to ensure idempotent/transactional metadata is preserved

2017-05-27 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-5340:
--

 Summary: Add additional test cases for batch splitting to ensure 
idempotent/transactional metadata is preserved
 Key: KAFKA-5340
 URL: https://issues.apache.org/jira/browse/KAFKA-5340
 Project: Kafka
  Issue Type: Sub-task
Reporter: Jason Gustafson
Assignee: Jason Gustafson


We noticed in the patch for KAFKA-5316 that the transactional flag was not 
being preserved after batch splitting. We should make sure that we have all the 
cases covered in unit tests.
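
For concreteness, here is a minimal, self-contained Java sketch of the invariants such unit tests should assert. It uses a simplified stand-in for the producer's batch type rather than the real ProducerBatch/MemoryRecords internals, so treat it as an illustration of the invariant, not the actual test code:

{code}
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a producer batch; the real tests would exercise the
// client's actual batch/records types instead of this toy class.
class Batch {
    final long producerId;
    final short producerEpoch;
    final int baseSequence;
    final boolean isTransactional;
    final List<byte[]> records;

    Batch(long pid, short epoch, int baseSeq, boolean txn, List<byte[]> records) {
        this.producerId = pid;
        this.producerEpoch = epoch;
        this.baseSequence = baseSeq;
        this.isTransactional = txn;
        this.records = records;
    }

    // Split in half: each half must keep the idempotent/transactional metadata,
    // and sequence numbers must stay contiguous across the halves.
    List<Batch> split() {
        int mid = records.size() / 2;
        List<Batch> halves = new ArrayList<>();
        halves.add(new Batch(producerId, producerEpoch, baseSequence,
                isTransactional, records.subList(0, mid)));
        halves.add(new Batch(producerId, producerEpoch, baseSequence + mid,
                isTransactional, records.subList(mid, records.size())));
        return halves;
    }
}

public class BatchSplitInvariantSketch {
    public static void main(String[] args) { // run with -ea to enable asserts
        List<byte[]> recs = new ArrayList<>();
        for (int i = 0; i < 10; i++) recs.add(new byte[] {(byte) i});
        Batch original = new Batch(42L, (short) 0, 100, true, recs);
        int nextExpectedSeq = original.baseSequence;
        for (Batch half : original.split()) {
            // The transactional flag and producer identity must survive the split.
            assert half.isTransactional == original.isTransactional;
            assert half.producerId == original.producerId;
            assert half.producerEpoch == original.producerEpoch;
            // Sequence numbers must remain contiguous across split batches.
            assert half.baseSequence == nextExpectedSeq;
            nextExpectedSeq += half.records.size();
        }
        System.out.println("split preserved idempotent/transactional metadata");
    }
}
{code}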



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5093) Load only batch header when rebuilding producer ID map

2017-05-27 Thread Umesh Chaudhary (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027696#comment-16027696
 ] 

Umesh Chaudhary commented on KAFKA-5093:


No worries at all [~hachikuji] :)

> Load only batch header when rebuilding producer ID map
> --
>
> Key: KAFKA-5093
> URL: https://issues.apache.org/jira/browse/KAFKA-5093
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> When rebuilding the producer ID map for KIP-98, we unnecessarily load the 
> full record data into memory when scanning through the log. It would be 
> better to only load the batch header since it is all that is needed.
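
The batch format introduced for KIP-98 keeps the producer ID, epoch, and sequence range in the batch header, so the rebuild only needs per-batch metadata. Below is a rough sketch of the idea against the 0.11 `org.apache.kafka.common.record` API; it is illustrative only, not the actual patch:

{code}
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.common.record.FileRecords;
import org.apache.kafka.common.record.RecordBatch;

public class ProducerIdScanSketch {
    // Rebuild a producerId -> last sequence map from a segment file while
    // touching only batch-level metadata, never the record payloads.
    static Map<Long, Integer> scan(File segment) throws IOException {
        Map<Long, Integer> lastSequence = new HashMap<>();
        try (FileRecords records = FileRecords.open(segment)) {
            for (RecordBatch batch : records.batches()) {
                if (batch.hasProducerId()) {
                    // Header fields only; iterating the batch's records would
                    // force the full batch to be read into memory.
                    lastSequence.put(batch.producerId(), batch.lastSequence());
                }
            }
        }
        return lastSequence;
    }
}
{code}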



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5338) There is a Misspell in ResetIntegrationTest

2017-05-27 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027695#comment-16027695
 ] 

Matthias J. Sax commented on KAFKA-5338:


[~guozhang] Can you add [~hejiefang] to the contributor list, so we can assign this 
Jira? Thx.

> There is a Misspell in ResetIntegrationTest
> ---
>
> Key: KAFKA-5338
> URL: https://issues.apache.org/jira/browse/KAFKA-5338
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.1
>Reporter: hejiefang
> Fix For: 0.11.0.0
>
>
> There is a Misspell in Annotations of ResetIntegrationTest.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5338) There is a Misspell in ResetIntegrationTest

2017-05-27 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5338:
---
Fix Version/s: 0.11.0.0

> There is a Misspell in ResetIntegrationTest
> ---
>
> Key: KAFKA-5338
> URL: https://issues.apache.org/jira/browse/KAFKA-5338
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.1
>Reporter: hejiefang
> Fix For: 0.11.0.0
>
>
> There is a Misspell in Annotations of ResetIntegrationTest.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5338) There is a Misspell in ResetIntegrationTest

2017-05-27 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5338:
---
Status: Patch Available  (was: Open)

> There is a Misspell in ResetIntegrationTest
> ---
>
> Key: KAFKA-5338
> URL: https://issues.apache.org/jira/browse/KAFKA-5338
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.1
>Reporter: hejiefang
> Fix For: 0.11.0.0
>
>
> There is a Misspell in Annotations of ResetIntegrationTest.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5338) There is a Misspell in ResetIntegrationTest

2017-05-27 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5338:
---
Affects Version/s: (was: 0.11.1.0)
   0.10.2.1

> There is a Misspell in ResetIntegrationTest
> ---
>
> Key: KAFKA-5338
> URL: https://issues.apache.org/jira/browse/KAFKA-5338
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.1
>Reporter: hejiefang
> Fix For: 0.11.0.0
>
>
> There is a Misspell in Annotations of ResetIntegrationTest.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS]: KIP-149: Enabling key access in ValueTransformer, ValueMapper, and ValueJoiner

2017-05-27 Thread Matthias J. Sax
Thanks Jeyhun.

About ValueTransformer: I don't think we can remove it. Note that
ValueTransformer allows attaching state and registering
punctuations. Neither feature will be available via the withKey()
interfaces.

-Matthias
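
For readers following the thread, here is a compact sketch of the distinction: the withKey() interfaces stay single-method, lambda-friendly functions, while ValueTransformer carries a lifecycle for state and punctuation. The interface shapes follow the KIP-149 proposal but are defined locally and simplified, so this is a sketch rather than the final API:

{code}
// Single-abstract-method, so it can be a lambda; the key is read-only.
interface ValueMapperWithKey<K, V, VR> {
    VR apply(K readOnlyKey, V value);
}

// ValueTransformer, by contrast, has a lifecycle: init() is where state
// stores are fetched and punctuations registered, which is why it cannot
// simply be replaced by a withKey() function. (Context type simplified here;
// the real API takes a ProcessorContext.)
interface ValueTransformer<V, VR> {
    void init(Object context);
    VR transform(V value);
    void close();
}

public class WithKeySketch {
    public static void main(String[] args) {
        ValueMapperWithKey<String, Integer, String> mapper =
                (key, value) -> key + ":" + (value * 2);
        System.out.println(mapper.apply("k1", 21)); // prints k1:42
    }
}
{code}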


On 5/27/17 1:25 PM, Jeyhun Karimov wrote:
> Hi Matthias,
> 
> Thanks for your comments.
> 
> I tested the deep copy approach. It has significant overhead. Especially
> for "light" and stateless operators it slows down the topology
> significantly (> 20%). I think "warning" users about not changing the key
> is better than warning them about possible performance loss.
> 
> About the interfaces, additionally I considered adding InitializerWithKey,
> AggregatorWithKey and ValueTransformerWithKey. I think they are included in
> PR but not in KIP. I will also include them in KIP, sorry my bad.
> Including ReducerWithKey definitely makes sense. Thanks.
> 
> One thing I want to mention is that, maybe we should deprecate methods with
> argument type ValueTransformerSupplier (KStream.transformValues(...)) and,
> as a whole, the ValueTransformerSupplier interface.
> We can use ValueTransformer/ValueTransformerWithKey type instead without
> additional supplier layer.
> 
> 
> Cheers,
> Jeyhun
> 
> 
> On Thu, May 25, 2017 at 1:07 AM Matthias J. Sax 
> wrote:
> 
>> One more question:
>>
>> Should we add any of
>>  - InitializerWithKey
>>  - ReducerWithKey
>>  - ValueTransformerWithKey
>>
>> To get consistent/complete API, it might be a good idea. Any thoughts?
>>
>>
>> -Matthias
>>
>>
>> On 5/24/17 3:47 PM, Matthias J. Sax wrote:
>>> Jeyhun,
>>>
>>> I was just wondering if you did look into the key-deep-copy idea we
>>> discussed. I am curious to see what the impact might be.
>>>
>>>
>>> -Matthias
>>>
>>> On 5/20/17 2:03 AM, Jeyhun Karimov wrote:
 Hi,

 Thanks for your comments. I rethink about including rich functions into
 this KIP.
 I think once we include rich functions in this KIP and then fix
 ProcessorContext in another KIP and incorporate with existing rich
 functions, the code will not be backwards compatible.

 I see Damian's and your point more clearly now.

 Conclusion: we include only withKey interfaces in this KIP (I updated
>> the
 KIP), I will try also initiate another KIP for rich functions.

 Cheers,
 Jeyhun

 On Fri, May 19, 2017 at 10:50 PM Matthias J. Sax >>
 wrote:

> With the current KIP, using ValueMapper and ValueMapperWithKey
> interfaces, RichFunction seems to be an independent add-on. To fix the
> original issue to allow key access, RichFunctions are not required
>> IMHO.
>
> I initially put the RichFunction idea on the table, because I was
>> hoping
>> to get a uniform API. And I think it was good to consider them --
> however, the discussion showed that they are not necessary for key
> access. Thus, it seems to be better to handle RichFunctions in their own
> KIP. The ProcessorContext/RecordContext issues seem to be the main
> challenge for this. And introducing RichFunctions with a parameter-less
>> init() method seems not to help too much. We would get an
>> "intermediate"
> API that we want to change anyway later on...
>
> As you already put much effort into RichFunction, feel free to push
>> this
> further and start a new KIP (we can do this even in parallel) -- we
> don't want to slow you down :) But it makes the discussion and code
> review easier, if we separate both IMHO.
>
>
> -Matthias
>
>
> On 5/19/17 2:25 AM, Jeyhun Karimov wrote:
>> Hi Damian,
>>
>> Thanks for your comments. I think providing users an *interface*
>> rather
>> than an *abstract class* should be preferred (Matthias also raised this
> issue
>> ), anyway I changed the corresponding parts of KIP.
>>
>> Regarding passing additional contextual information, I think it
>> is a
>> tradeoff,
>> 1) first, we fix the context parameter for *init() *method in another
>> PR
>> and solve Rich functions afterwards
>> 2) first, we fix the requested issues on jira ([1-3]) with providing
>> (not-complete) Rich functions and integrate the context parameters to
> this
>> afterwards (like improvement)
>>
>> To me, the second approach seems more incremental. However you are
>> right,
>> the names might confuse the users.
>>
>>
>>
>> [1] https://issues.apache.org/jira/browse/KAFKA-4218
>> [2] https://issues.apache.org/jira/browse/KAFKA-4726
>> [3] https://issues.apache.org/jira/browse/KAFKA-3745
>>
>>
>> Cheers,
>> Jeyhun
>>
>>
>> On Fri, May 19, 2017 at 10:42 AM Damian Guy 
> wrote:
>>
>>> Hi,
>>>
>>> I see you've removed the `ProcessorContext` from the RichFunction
>> which
> is
>>> good, but why is it a `RichFunction`? I'd have expected it to pass
>> some
>>> additional contextual information, i.e., the `RecordContext` that
>>> contains just the topic, partition, timestamp, offset.

[jira] [Issue Comment Deleted] (KAFKA-4660) Improve test coverage KafkaStreams

2017-05-27 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-4660:
---
Comment: was deleted

(was: [~damianguy] Should we just close this as `toString()` is going to be 
removed anyway?)

> Improve test coverage KafkaStreams
> --
>
> Key: KAFKA-4660
> URL: https://issues.apache.org/jira/browse/KAFKA-4660
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Damian Guy
>Assignee: Umesh Chaudhary
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> {{toString}} is used to print the topology, so probably should have a unit 
> test.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4660) Improve test coverage KafkaStreams

2017-05-27 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027693#comment-16027693
 ] 

Matthias J. Sax commented on KAFKA-4660:


[~damianguy] Should we just close this as `toString()` is going to be removed 
anyway?

> Improve test coverage KafkaStreams
> --
>
> Key: KAFKA-4660
> URL: https://issues.apache.org/jira/browse/KAFKA-4660
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Damian Guy
>Assignee: Umesh Chaudhary
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> {{toString}} is used to print the topology, so probably should have a unit 
> test.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: KIP-162: Enable topic deletion by default

2017-05-27 Thread Vahid S Hashemian
Sure, that sounds good.

I suggested that to keep command line behavior consistent.
Plus, removal of ACL access is something that can be easily undone, but 
topic deletion is not reversible.
So perhaps we can file a new follow-up JIRA to this KIP to add the confirmation for 
topic deletion.

Thanks.
--Vahid
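
For what it's worth, the suggested confirmation is easy to sketch on top of the AdminClient added in 0.11. The flow below is purely illustrative, not a proposed patch; the bootstrap server address is a placeholder:

{code}
import java.util.Collections;
import java.util.Properties;
import java.util.Scanner;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

public class DeleteTopicWithConfirmation {
    public static void main(String[] args) throws Exception {
        String topic = args[0];
        // Ask for explicit confirmation, since topic deletion is not reversible.
        System.out.printf("Delete topic '%s'? This cannot be undone (y/n): ", topic);
        String answer = new Scanner(System.in).nextLine().trim();
        if (!answer.equalsIgnoreCase("y")) {
            System.out.println("Aborted.");
            return;
        }
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (AdminClient admin = AdminClient.create(props)) {
            // Block until the deletion request completes.
            admin.deleteTopics(Collections.singleton(topic)).all().get();
            System.out.println("Topic deleted.");
        }
    }
}
{code}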



From:   Gwen Shapira 
To: dev@kafka.apache.org, us...@kafka.apache.org
Date:   05/27/2017 11:04 AM
Subject:Re: KIP-162: Enable topic deletion by default



Thanks Vahid,

Do you mind if we leave the command-line out of scope for this?

I can see why adding confirmations, options to bypass confirmations, etc
would be an improvement. However, I've seen no complaints about the 
current
behavior of the command-line and the KIP doesn't change it at all. So I'd
rather address things separately.

Gwen

On Fri, May 26, 2017 at 8:10 PM Vahid S Hashemian 

wrote:

> Gwen, thanks for the KIP.
> It looks good to me.
>
> Just a minor suggestion: It would be great if the command asks for a
> confirmation (y/n) before deleting the topic (similar to how removing 
ACLs
> works).
>
> Thanks.
> --Vahid
>
>
>
> From:   Gwen Shapira 
> To: "dev@kafka.apache.org" , Users
> 
> Date:   05/26/2017 07:04 AM
> Subject:KIP-162: Enable topic deletion by default
>
>
>
> Hi Kafka developers, users and friends,
>
> I've added a KIP to improve our out-of-the-box usability a bit:
> KIP-162: Enable topic deletion by default:
>
> 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-162+-+Enable+topic+deletion+by+default

>
>
> Pretty simple :) Discussion and feedback are welcome.
>
> Gwen
>
>
>
>
>






[jira] [Created] (KAFKA-5339) Transactions system test with hard broker bounces fails sporadically

2017-05-27 Thread Apurva Mehta (JIRA)
Apurva Mehta created KAFKA-5339:
---

 Summary: Transactions system test with hard broker bounces fails 
sporadically
 Key: KAFKA-5339
 URL: https://issues.apache.org/jira/browse/KAFKA-5339
 Project: Kafka
  Issue Type: Sub-task
Reporter: Apurva Mehta
Assignee: Apurva Mehta
Priority: Blocker


The transactions hard bounce test occasionally fails because the transactional 
message copy just seems to hang. In one of the client logs, I noticed: 

{noformat}

[2017-05-27 20:36:12,596] WARN Got error produce response with correlation id 
124 on topic-partition output-topic-0, retrying (2147483646 attempts left). 
Error: NOT_LEADER_FOR_PARTITION 
(org.apache.kafka.clients.producer.internals.Sender)
[2017-05-27 20:36:15,386] ERROR Uncaught error in kafka producer I/O thread:  
(org.apache.kafka.clients.producer.internals.Sender)
java.lang.NullPointerException
at 
org.apache.kafka.clients.producer.internals.TransactionManager$1.compare(TransactionManager.java:146)
at 
org.apache.kafka.clients.producer.internals.TransactionManager$1.compare(TransactionManager.java:143)
at 
java.util.PriorityQueue.siftDownUsingComparator(PriorityQueue.java:721)
at java.util.PriorityQueue.siftDown(PriorityQueue.java:687)
at java.util.PriorityQueue.poll(PriorityQueue.java:595)
at 
org.apache.kafka.clients.producer.internals.TransactionManager.nextRequestHandler(TransactionManager.java:351)
at 
org.apache.kafka.clients.producer.internals.Sender.maybeSendTransactionalRequest(Sender.java:303)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:193)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:154)
at java.lang.Thread.run(Thread.java:748)
[2017-05-27 20:36:52,007] INFO Closing the Kafka producer with timeoutMillis = 
9223372036854775807 ms. (org.apache.kafka.clients.producer.KafkaProducer)
[2017-05-27 20:36:52,036] INFO Marking the coordinator knode02:9092 (id: 
2147483645 rack: null) dead for group transactions-test-consumer-group 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
root@7dcd60017519:/opt/kafka-dev/results/latest/TransactionsTest/test_transactions/failure_mode=hard_bounce.bounce_target=brokers/1#
{noformat}

This suggests that the client has gotten into a bad state, which is why it stops 
processing messages, causing the tests to fail.
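
The trace points at the priority comparator running inside PriorityQueue.poll(). The following self-contained Java repro shows that failure shape -- it is not a claim about the actual root cause in TransactionManager, only an illustration of how a comparator over a nullable field blows up later, inside poll()/siftDown, when an element is mutated after being enqueued:

{code}
import java.util.PriorityQueue;

public class PriorityQueueNpeSketch {
    static class Handler {
        Integer priority;
        Handler(Integer priority) { this.priority = priority; }
    }

    public static void main(String[] args) {
        // The comparator unboxes 'priority'; fine as long as it is non-null.
        PriorityQueue<Handler> queue =
                new PriorityQueue<>((a, b) -> a.priority - b.priority);
        Handler h1 = new Handler(1);
        Handler h2 = new Handler(2);
        Handler h3 = new Handler(3);
        queue.add(h1);
        queue.add(h2);
        queue.add(h3);

        // If an element reaches a state the comparator cannot handle after it
        // was enqueued, the NPE surfaces later, inside poll()/siftDown -- the
        // same shape as the stack trace above.
        h3.priority = null;
        queue.poll(); // siftDown compares h3 against h2 -> NullPointerException
    }
}
{code}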



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4340) Change the default value of log.message.timestamp.difference.max.ms to the same as log.retention.ms

2017-05-27 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027616#comment-16027616
 ] 

Jun Rao commented on KAFKA-4340:


[~edenhill], I wasn't suggesting that this commit is a breaking change. In 
fact, it gives the user the same current behavior if 
log.message.timestamp.difference.max.ms is explicitly set to log.retention.ms. 
I am just saying that the current behavior of rejecting the whole batch without 
telling the clients which messages are in violation can be improved, which may need a 
separate KIP. So, I am not sure if we need to revert this commit since it 
avoids unnecessary log rolling.

> Change the default value of log.message.timestamp.difference.max.ms to the 
> same as log.retention.ms
> ---
>
> Key: KAFKA-4340
> URL: https://issues.apache.org/jira/browse/KAFKA-4340
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.10.1.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.11.0.0
>
>
> [~junrao] brought up the following scenario: 
> If users are pumping data with timestamps already older than log.retention.ms into 
> Kafka, the messages will be appended to the log but will be immediately 
> rolled out by the log retention thread when it kicks in, and the messages will be 
> deleted. 
> To avoid this produce-then-delete scenario, we can set the default value of 
> log.message.timestamp.difference.max.ms to be the same as log.retention.ms.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Efficient way of Searching Messages By Timestamp - Kafka

2017-05-27 Thread SenthilKumar K
Hi Team, any help here please?

Cheers,
Senthil

On Sat, May 27, 2017 at 8:25 PM, SenthilKumar K 
wrote:

> Hello Kafka Developers , Users ,
>
> We are exploring the SearchMessageByTimestamp feature in Kafka for our
> use case .
>
> Use Case: Kafka will be a realtime message bus; users should be able
> to pull logs by specifying start_date and end_date, or pull the last five
> minutes of data, etc.
>
> I did a POC on SearchMessageByTimestamp; here is the code:
> https://gist.github.com/senthilec566/16e8e28b32834666fea132afc3a4e2f9 .
> And I observed that searching messages is slow.
>
> Here is a small test I did:
> Query: Fetch logs of the last *5 minutes*:
> Result:
> No of Records fetched : *30*
> Fetch Time *6210* ms
>
> The above test was performed on a topic with 4 partitions. In each partition,
> search and query processing happened; in other words,
> consumer.offsetsForTimes()
> consumer.assign(Arrays.asList(partition))
> consumer.seek(this.partition, offsetTimestamp.offset())
> consumer.poll(100)
>
> are the API calls made for each partition. I realized that this was the
> reason Kafka was taking more time.
>
> What is an efficient way of implementing SearchMessageByTimestamp? Is Kafka
> the right candidate for our use case?
>
> Pls add your thoughts here ...
>
>
> Cheers,
> Senthil
>
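
One concrete optimization worth trying: KafkaConsumer.offsetsForTimes() accepts a map covering all partitions at once, so the offset lookup resolves in one round trip and a single poll loop can drain every partition together, instead of a per-partition search-and-poll loop. A sketch follows; the bootstrap servers, group id, and the topic name "logs" are placeholders:

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

public class SearchByTimestampSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "timestamp-search");        // placeholder
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        long startMs = System.currentTimeMillis() - 5 * 60 * 1000L; // last 5 minutes

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            // Build the timestamp query for ALL partitions in one call.
            List<TopicPartition> partitions = new ArrayList<>();
            consumer.partitionsFor("logs").forEach(p ->
                    partitions.add(new TopicPartition(p.topic(), p.partition())));
            Map<TopicPartition, Long> query = new HashMap<>();
            for (TopicPartition tp : partitions)
                query.put(tp, startMs);

            // One round trip resolves the starting offset for every partition.
            Map<TopicPartition, OffsetAndTimestamp> offsets = consumer.offsetsForTimes(query);

            consumer.assign(partitions);
            offsets.forEach((tp, oat) -> {
                if (oat != null) // null when no message is newer than startMs
                    consumer.seek(tp, oat.offset());
            });

            // A single poll loop then drains all partitions together.
            ConsumerRecords<byte[], byte[]> records = consumer.poll(1000);
            for (ConsumerRecord<byte[], byte[]> r : records)
                System.out.printf("%s-%d@%d ts=%d%n",
                        r.topic(), r.partition(), r.offset(), r.timestamp());
        }
    }
}
{code}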


Re: [DISCUSS]: KIP-149: Enabling key access in ValueTransformer, ValueMapper, and ValueJoiner

2017-05-27 Thread Jeyhun Karimov
Hi Matthias,

Thanks for your comments.

I tested the deep copy approach. It has significant overhead. Especially
for "light" and stateless operators it slows down the topology
significantly (> 20%). I think "warning" users about not changing the key
is better than warning them about possible performance loss.

About the interfaces, additionally I considered adding InitializerWithKey,
AggregatorWithKey and ValueTransformerWithKey. I think they are included in
PR but not in KIP. I will also include them in KIP, sorry my bad.
Including ReducerWithKey definitely makes sense. Thanks.

One thing I want to mention is that, maybe we should deprecate methods with
argument type ValueTransformerSupplier (KStream.transformValues(...)) and,
as a whole, the ValueTransformerSupplier interface.
We can use ValueTransformer/ValueTransformerWithKey type instead without
additional supplier layer.


Cheers,
Jeyhun


On Thu, May 25, 2017 at 1:07 AM Matthias J. Sax 
wrote:

> One more question:
>
> Should we add any of
>  - InitializerWithKey
>  - ReducerWithKey
>  - ValueTransformerWithKey
>
> To get consistent/complete API, it might be a good idea. Any thoughts?
>
>
> -Matthias
>
>
> On 5/24/17 3:47 PM, Matthias J. Sax wrote:
> > Jeyhun,
> >
> > I was just wondering if you did look into the key-deep-copy idea we
> > discussed. I am curious to see what the impact might be.
> >
> >
> > -Matthias
> >
> > On 5/20/17 2:03 AM, Jeyhun Karimov wrote:
> >> Hi,
> >>
> >> Thanks for your comments. I rethink about including rich functions into
> >> this KIP.
> >> I think once we include rich functions in this KIP and then fix
> >> ProcessorContext in another KIP and incorporate with existing rich
> >> functions, the code will not be backwards compatible.
> >>
> >> I see Damian's and your point more clearly now.
> >>
> >> Conclusion: we include only withKey interfaces in this KIP (I updated
> the
> >> KIP), I will try also initiate another KIP for rich functions.
> >>
> >> Cheers,
> >> Jeyhun
> >>
> >> On Fri, May 19, 2017 at 10:50 PM Matthias J. Sax  >
> >> wrote:
> >>
> >>> With the current KIP, using ValueMapper and ValueMapperWithKey
> >>> interfaces, RichFunction seems to be an independent add-on. To fix the
> >>> original issue to allow key access, RichFunctions are not required
> IMHO.
> >>>
> >>> I initially put the RichFunction idea on the table, because I was
> hoping
> >>> to get a uniform API. And I think it was good to consider them --
> >>> however, the discussion showed that they are not necessary for key
> >>> access. Thus, it seems to be better to handle RichFunctions in their own
> >>> KIP. The ProcessorContext/RecordContext issues seem to be the main
> >>> challenge for this. And introducing RichFunctions with a parameter-less
> >>> init() method seems not to help too much. We would get an
> "intermediate"
> >>> API that we want to change anyway later on...
> >>>
> >>> As you already put much effort into RichFunction, feel free to push
> this
> >>> further and start a new KIP (we can do this even in parallel) -- we
> >>> don't want to slow you down :) But it makes the discussion and code
> >>> review easier, if we separate both IMHO.
> >>>
> >>>
> >>> -Matthias
> >>>
> >>>
> >>> On 5/19/17 2:25 AM, Jeyhun Karimov wrote:
>  Hi Damian,
> 
>  Thanks for your comments. I think providing users an *interface*
> rather
>  than an *abstract class* should be preferred (Matthias also raised this
> >>> issue
>  ), anyway I changed the corresponding parts of KIP.
> 
>  Regarding passing additional contextual information, I think it
> is a
>  tradeoff,
>  1) first, we fix the context parameter for *init() *method in another
> PR
>  and solve Rich functions afterwards
>  2) first, we fix the requested issues on jira ([1-3]) with providing
>  (not-complete) Rich functions and integrate the context parameters to
> >>> this
>  afterwards (like improvement)
> 
>  To me, the second approach seems more incremental. However you are
> right,
>  the names might confuse the users.
> 
> 
> 
>  [1] https://issues.apache.org/jira/browse/KAFKA-4218
>  [2] https://issues.apache.org/jira/browse/KAFKA-4726
>  [3] https://issues.apache.org/jira/browse/KAFKA-3745
> 
> 
>  Cheers,
>  Jeyhun
> 
> 
>  On Fri, May 19, 2017 at 10:42 AM Damian Guy 
> >>> wrote:
> 
> > Hi,
> >
> > I see you've removed the `ProcessorContext` from the RichFunction
> which
> >>> is
> > good, but why is it a `RichFunction`? I'd have expected it to pass
> some
> > additional contextual information, i.e., the `RecordContext` that
> >>> contains
> > just the topic, partition, timestamp, offset.  I'm ok with it not
> >>> passing
> > this contextual information, but is the naming incorrect? I'm not
> sure,
> > tbh. I'm wondering if we should drop `RichFunctions` until we can do
> it
> > properly with the correct context?
> >
> 

[jira] [Updated] (KAFKA-5093) Load only batch header when rebuilding producer ID map

2017-05-27 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-5093:
---
Assignee: Jason Gustafson
  Status: Patch Available  (was: Open)

> Load only batch header when rebuilding producer ID map
> --
>
> Key: KAFKA-5093
> URL: https://issues.apache.org/jira/browse/KAFKA-5093
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> When rebuilding the producer ID map for KIP-98, we unnecessarily load the 
> full record data into memory when scanning through the log. It would be 
> better to only load the batch header since it is all that is needed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5093) Load only batch header when rebuilding producer ID map

2017-05-27 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027555#comment-16027555
 ] 

Jason Gustafson commented on KAFKA-5093:


[~umesh9...@gmail.com] Apologies. I decided to pick this up yesterday, but 
forgot to assign myself.

> Load only batch header when rebuilding producer ID map
> --
>
> Key: KAFKA-5093
> URL: https://issues.apache.org/jira/browse/KAFKA-5093
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> When rebuilding the producer ID map for KIP-98, we unnecessarily load the 
> full record data into memory when scanning through the log. It would be 
> better to only load the batch header since it is all that is needed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5093) Load only batch header when rebuilding producer ID map

2017-05-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027541#comment-16027541
 ] 

ASF GitHub Bot commented on KAFKA-5093:
---

GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/3160

KAFKA-5093: Avoid loading full batch data when possible when iterating 
FileRecords



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka KAFKA-5093

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3160.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3160


commit a453e37032a8e0db7dde69f2c76d22070cabe80a
Author: Jason Gustafson 
Date:   2017-05-27T07:56:55Z

KAFKA-5093: Avoid loading full batch data when possible when iterating 
FileRecords




> Load only batch header when rebuilding producer ID map
> --
>
> Key: KAFKA-5093
> URL: https://issues.apache.org/jira/browse/KAFKA-5093
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> When rebuilding the producer ID map for KIP-98, we unnecessarily load the 
> full record data into memory when scanning through the log. It would be 
> better to only load the batch header since it is all that is needed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3160: KAFKA-5093: Avoid loading full batch data when pos...

2017-05-27 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/3160

KAFKA-5093: Avoid loading full batch data when possible when iterating 
FileRecords



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka KAFKA-5093

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3160.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3160


commit a453e37032a8e0db7dde69f2c76d22070cabe80a
Author: Jason Gustafson 
Date:   2017-05-27T07:56:55Z

KAFKA-5093: Avoid loading full batch data when possible when iterating 
FileRecords




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5155) Messages can be deleted prematurely when some producers use timestamps and some not

2017-05-27 Thread Michal Borowiecki (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027521#comment-16027521
 ] 

Michal Borowiecki commented on KAFKA-5155:
--

Hi [~plavjanik], do you care to submit a pull request with the test and the fix?

> Messages can be deleted prematurely when some producers use timestamps and 
> some not
> ---
>
> Key: KAFKA-5155
> URL: https://issues.apache.org/jira/browse/KAFKA-5155
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.2.0
>Reporter: Petr Plavjaník
>
> Some messages can be deleted prematurely and never read in the following 
> scenario. A producer uses timestamps and produces messages that are appended 
> to the beginning of a log segment. Another producer produces messages without a 
> timestamp. In that case the largest timestamp is determined by the old messages 
> that carry a timestamp, the new messages without a timestamp do not influence it, and 
> the log segment with old and new messages can be deleted immediately after the 
> last new message with no timestamp is appended. When all appended messages 
> have no timestamp, they are not deleted, because the {{lastModified}} 
> attribute of the {{LogSegment}} is used instead.
> New test case to {{kafka.log.LogTest}} that fails:
> {code}
>   @Test
>   def 
> shouldNotDeleteTimeBasedSegmentsWhenTimestampIsNotProvidedForSomeMessages() {
> val retentionMs = 1000
> val old = TestUtils.singletonRecords("test".getBytes, timestamp = 0)
> val set = TestUtils.singletonRecords("test".getBytes, timestamp = -1, 
> magicValue = 0)
> val log = createLog(set.sizeInBytes, retentionMs = retentionMs)
> // append some messages to create some segments
> log.append(old)
> for (_ <- 0 until 12)
>   log.append(set)
> assertEquals("No segment should be deleted", 0, log.deleteOldSegments())
>   }
> {code}
> It can be prevented by using {{def largestTimestamp = 
> Math.max(maxTimestampSoFar, lastModified)}} in LogSegment, or by using 
> current timestamp when messages with timestamp {{-1}} are appended.
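
In other words, the proposed fix changes the retention decision roughly as follows. This is an illustrative Java sketch of the logic only; the real LogSegment code is Scala:

{code}
public class RetentionDecisionSketch {
    // Proposed fix: fall back to the segment file's mtime, so a single message
    // with an ancient timestamp cannot make the whole segment look expired.
    static long largestTimestamp(long maxTimestampSoFar, long lastModifiedMs) {
        return Math.max(maxTimestampSoFar, lastModifiedMs);
    }

    static boolean shouldDelete(long nowMs, long segmentLargestTimestamp, long retentionMs) {
        return nowMs - segmentLargestTimestamp > retentionMs;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long mtime = now;       // the segment was just written
        long oldMessageTs = 0L; // one producer supplied timestamp 0
        long retentionMs = 1000L;
        // Without the fix the segment looks ancient and is deleted immediately:
        System.out.println(shouldDelete(now, oldMessageTs, retentionMs)); // true
        // With the fix the fresh mtime keeps it alive:
        System.out.println(shouldDelete(now,
                largestTimestamp(oldMessageTs, mtime), retentionMs));     // false
    }
}
{code}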



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-147: Add missing type parameters to StateStoreSupplier factories and KGroupedStream/Table methods

2017-05-27 Thread Michal Borowiecki

Hi all,

I've updated the KIP to reflect the proposed backwards-compatible approach:

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=69408481


Given the vast area of APIs affected, I think the PR is easier to read 
than the code excerpts in the KIP itself:

https://github.com/apache/kafka/pull/2992/files

Thanks,
Michał

On 07/05/17 10:16, Eno Thereska wrote:

I like this KIP in general and I agree it’s needed. Perhaps Damian can comment 
on the session store issue?

Thanks
Eno

On May 6, 2017, at 10:32 PM, Michal Borowiecki  
wrote:

Hi Matthias,

Agreed. I tried your proposal and indeed it would work.

However, I think to maintain full backward compatibility we would also need to 
deprecate Stores.create() and leave it unchanged, while providing a new method 
that returns the more strongly typed Factories.

( This is because PersistentWindowFactory and PersistentSessionFactory cannot extend the existing 
PersistentKeyValueFactory interface, since their build() methods will be returning 
TypedStateStoreSupplier<WindowStore<K, V>> and TypedStateStoreSupplier<SessionStore<K, V>> 
respectively, which are NOT subclasses of TypedStateStoreSupplier<KeyValueStore<K, V>>. I do not see 
another way around it. Admittedly, my type covariance skills are rudimentary. Does anyone see a better way around 
this? )
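
The covariance problem described above is ordinary Java generic invariance. Here is a self-contained illustration, with local stand-ins mirroring the shape of the KIP's types (not the actual Kafka Streams interfaces):

{code}
// Minimal local stand-ins; not the actual Kafka Streams types.
interface StateStore {}
interface KeyValueStore<K, V> extends StateStore {}
interface WindowStore<K, V> extends StateStore {}

interface TypedStateStoreSupplier<T extends StateStore> {
    T get();
}

public class InvarianceSketch {
    public static void main(String[] args) {
        TypedStateStoreSupplier<WindowStore<String, Long>> windowSupplier = () -> null;

        // Does NOT compile: generics are invariant, so a supplier of a
        // WindowStore is not a supplier of a KeyValueStore, even though both
        // store types extend StateStore:
        // TypedStateStoreSupplier<KeyValueStore<String, Long>> kv = windowSupplier;

        // A wildcard-typed reference is the usual way out when only the
        // StateStore-ness matters:
        TypedStateStoreSupplier<? extends StateStore> any = windowSupplier;
        System.out.println(any != null);
    }
}
{code}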

Since create() takes only the store name as argument, and I don't see what we 
could overload it with, the new method would need to have a different name.

Alternatively, since create(String) is the only method in Stores, we could 
deprecate the entire class and provide a new one. That would be my preference. 
Any ideas what to call it?



All comments and suggestions appreciated.



Cheers,

Michał


On 04/05/17 21:48, Matthias J. Sax wrote:

I had a quick look into this.

With regard to backward compatibility, I think it would be required to
introduce a new type `TypedStateStoreSupplier` (that extends
`StateStoreSupplier`) and to overload all methods that take a
`StateStoreSupplier` so they accept the new type instead of the current one.

This would allow `.build` to return a `TypedStateStoreSupplier` and
thus, would not break any code. At least if I did not miss anything with
regard to some magic of type inference using generics (I am not an
expert in this field).


-Matthias
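
A compact sketch of the overloading approach described above, again with local stand-in types (purely illustrative, under the assumption that this is roughly the shape the KIP would take):

{code}
// Standalone sketch with local stand-ins; not the actual Kafka Streams types.
interface StateStore {}
interface StateStoreSupplier { StateStore get(); }

// Extending the old interface keeps every existing call site source-compatible.
interface TypedStateStoreSupplier<T extends StateStore> extends StateStoreSupplier {
    @Override
    T get(); // covariant return type refines the old signature
}

class KGroupedStreamSketch<K, V> {
    // The existing, untyped method stays as-is...
    void aggregate(StateStoreSupplier supplier) { /* old path */ }

    // ...and an overload accepts the typed supplier, so a `.build()` returning
    // TypedStateStoreSupplier does not break any existing code.
    <T extends StateStore> void aggregate(TypedStateStoreSupplier<T> supplier) { /* new path */ }
}
{code}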

On 5/4/17 11:32 AM, Matthias J. Sax wrote:

Did not have time to have a look. But backward compatibility is a must
from my point of view.

-Matthias


On 5/4/17 12:56 AM, Michal Borowiecki wrote:

Hello,

I've updated the KIP with missing information.

I would especially appreciate some comments on the compatibility aspects
of this as the proposed change is not fully backwards-compatible.

In the absence of comments I shall call for a vote in the next few days.

Thanks,

Michal


On 30/04/17 23:11, Michal Borowiecki wrote:

Hi community!

I have just drafted KIP-147: Add missing type parameters to
StateStoreSupplier factories and KGroupedStream/Table methods
 


Please let me know if this a step in the right direction.

All comments welcome.

Thanks,
Michal

Re: KIP-162: Enable topic deletion by default

2017-05-27 Thread Gwen Shapira
Agreed and updated the KIP accordingly. Thank you!

On Sat, May 27, 2017 at 1:25 AM Guozhang Wang  wrote:

> I'd say just remove those two lines.
>
> On Fri, May 26, 2017 at 7:55 AM, Gwen Shapira  wrote:
>
> > This was a discussion, not a vote (sorry for mangling the title), but
> > thanks for all the +1 anyway.
> >
> > Regarding Ismael's feedback:
> > The current server.properties includes the following:
> > # Switch to enable topic deletion or not, default value is false
> > #delete.topic.enable=true
> >
> > We can't leave it as is, obviously - since the KIP invalidates the
> > comment.  Let's just remove those two lines?
> >
> > Note that all our proposed changes may break a few community puppet/docker
> > scripts that use these lines for "sed" to enable topic deletion.
> >
> > Gwen
> >
> > On Fri, May 26, 2017 at 5:41 PM Tom Crayford 
> wrote:
> >
> > > +1 (non-binding)
> > >
> > > On Fri, May 26, 2017 at 3:38 PM, Damian Guy 
> > wrote:
> > >
> > > > +1
> > > > Also agree with what Ismael said.
> > > >
> > > > On Fri, 26 May 2017 at 15:26 Ismael Juma  wrote:
> > > >
> > > > > Thanks for the KIP, sounds good to me. One comment: not sure we
> need
> > to
> > > > add
> > > > > the config to server.properties. Do we expect people to change this
> > > > > default?
> > > > >
> > > > > On Fri, May 26, 2017 at 3:03 PM, Gwen Shapira 
> > > wrote:
> > > > >
> > > > > > Hi Kafka developers, users and friends,
> > > > > >
> > > > > > I've added a KIP to improve our out-of-the-box usability a bit:
> > > > > > KIP-162: Enable topic deletion by default:
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-162+-+
> > > > > > Enable+topic+deletion+by+default
> > > > > >
> > > > > > Pretty simple :) Discussion and feedback are welcome.
> > > > > >
> > > > > > Gwen
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: KIP-162: Enable topic deletion by default

2017-05-27 Thread Gwen Shapira
Thanks Vahid,

Do you mind if we leave the command-line out of scope for this?

I can see why adding confirmations, options to bypass confirmations, etc
would be an improvement. However, I've seen no complaints about the current
behavior of the command-line and the KIP doesn't change it at all. So I'd
rather address things separately.

Gwen

On Fri, May 26, 2017 at 8:10 PM Vahid S Hashemian 
wrote:

> Gwen, thanks for the KIP.
> It looks good to me.
>
> Just a minor suggestion: It would be great if the command asks for a
> confirmation (y/n) before deleting the topic (similar to how removing ACLs
> works).
>
> Thanks.
> --Vahid
>
>
>
> From:   Gwen Shapira 
> To: "dev@kafka.apache.org" , Users
> 
> Date:   05/26/2017 07:04 AM
> Subject:KIP-162: Enable topic deletion by default
>
>
>
> Hi Kafka developers, users and friends,
>
> I've added a KIP to improve our out-of-the-box usability a bit:
> KIP-162: Enable topic deletion by default:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-162+-+Enable+topic+deletion+by+default
>
>
> Pretty simple :) Discussion and feedback are welcome.
>
> Gwen
>
>
>
>
>


Re: KIP-162: Enable topic deletion by default

2017-05-27 Thread Gwen Shapira
This KIP only changes a configuration default, so the two changes seem
independent to me.

And since both changes will be included in the release following 0.11
(October) and no sooner, I don't know if the order matters that much.

Am I missing anything regarding KAFKA-4893 that means KIP-162 isn't as
simple as I assume?

Gwen

On Sat, May 27, 2017 at 4:25 AM Onur Karaman 
wrote:

> Would it make sense to resolve KAFKA-4893 before enabling it by default, as
> fixing the ticket would likely involve changing the log directory
> structure?
>
> On Fri, May 26, 2017 at 3:24 PM, Guozhang Wang  wrote:
>
> > I'd say just remove those two lines.
> >
> > On Fri, May 26, 2017 at 7:55 AM, Gwen Shapira  wrote:
> >
> > > This was a discussion, not a vote (sorry for mangling the title), but
> > > thanks for all the +1 anyway.
> > >
> > > Regarding Ismael's feedback:
> > > The current server.properties includes the following:
> > > # Switch to enable topic deletion or not, default value is false
> > > #delete.topic.enable=true
> > >
> > > We can't leave it as is, obviously - since the KIP invalidates the
> > > comment.  Let's just remove those two lines?
> > >
> > > Note that all our proposed changes may break a few community
> puppet/docker
> > > scripts that use these lines for "sed" to enable topic deletion.
> > >
> > > Gwen
> > >
> > > On Fri, May 26, 2017 at 5:41 PM Tom Crayford 
> > wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > On Fri, May 26, 2017 at 3:38 PM, Damian Guy 
> > > wrote:
> > > >
> > > > > +1
> > > > > Also agree with what Ismael said.
> > > > >
> > > > > On Fri, 26 May 2017 at 15:26 Ismael Juma 
> wrote:
> > > > >
> > > > > > Thanks for the KIP, sounds good to me. One comment: not sure we
> > need
> > > to
> > > > > add
> > > > > > the config to server.properties. Do we expect people to change
> this
> > > > > > default?
> > > > > >
> > > > > > On Fri, May 26, 2017 at 3:03 PM, Gwen Shapira  >
> > > > wrote:
> > > > > >
> > > > > > > Hi Kafka developers, users and friends,
> > > > > > >
> > > > > > > I've added a KIP to improve our out-of-the-box usability a bit:
> > > > > > > KIP-162: Enable topic deletion by default:
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-162+-+
> > > > > > > Enable+topic+deletion+by+default
> > > > > > >
> > > > > > > Pretty simple :) Discussion and feedback are welcome.
> > > > > > >
> > > > > > > Gwen
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>


Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-05-27 Thread Jeyhun Karimov
Hi,

Thanks for your comments. I will refer to the overall approach as rich
functions until we find a better name.

I think there are some pros and cons to the approach you described.

The pros are that it is simple, has clear boundaries, and avoids
misunderstanding of the term "function".
So you propose something like:
KStream.valueMapper (ValueMapper vm, RecordContext rc)
or
having rich functions with only a single init(RecordContext rc) method.

The cons are:
 - This will bring another set of overloads (if we use RecordContext as a
separate parameter). We should consider that the rich functions will be for
all main interfaces.
 - I don't think that we need lambdas in rich functions. It is by
definition "rich", so there is no single method in the interface and, as a
result, no lambdas.
 - I disagree that rich functions should only contain an init() method. This
depends on each interface. For example, for specific interfaces we can add
methods (like punctuate()) to their rich functions.


Cheers,
Jeyhun
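
To make the two shapes being weighed concrete, here is a sketch of a rich interface with init() versus a plain function that takes the context as a parameter (and so stays lambda-friendly). RecordContext is sketched locally from the fields mentioned earlier in the thread (topic, partition, timestamp, offset); none of this is final API:

{code}
// Local sketch of the record metadata the thread discusses exposing.
interface RecordContext {
    String topic();
    int partition();
    long offset();
    long timestamp();
}

// Option 1: a "rich" interface with a lifecycle; cannot be a lambda.
interface RichValueMapper<V, VR> {
    void init(RecordContext context);
    VR apply(V value);
    void close();
}

// Option 2: pass the context per invocation; single method, lambda-friendly.
interface ValueMapperWithContext<V, VR> {
    VR apply(V value, RecordContext context);
}

public class RichVsContextParamSketch {
    public static void main(String[] args) {
        ValueMapperWithContext<String, String> mapper =
                (value, ctx) -> ctx.topic() + "[" + ctx.partition() + "]:" + value;
        // A real runtime would supply the context; here we just show the shape.
        System.out.println(mapper.apply("hello", new RecordContext() {
            public String topic() { return "t"; }
            public int partition() { return 0; }
            public long offset() { return 42L; }
            public long timestamp() { return 0L; }
        }));
    }
}
{code}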



On Thu, May 25, 2017 at 1:02 AM Matthias J. Sax 
wrote:

> I confess, the term is borrowed from Flink :)
>
> Personally, I never thought about it, but I tend to agree with Michal. I
> also want to clarify that the main purpose is the ability to access
> record metadata. Thus, it might even be sufficient to only have "init".
>
> An alternative would of course be to pass in the RecordContext as a
> method parameter. This would allow us to drop "init()". This might even
> allow using lambdas, and we could keep the name RichFunction as we
> preserve the nature of being a function.
>
>
> -Matthias
>
> On 5/24/17 12:13 PM, Jeyhun Karimov wrote:
> > Hi Michal,
> >
> > Thanks for your comments. I see your point and I agree with it. However,
> > I don't have a better idea for naming. I checked the MR source code. There,
> > JobConfigurable and Closeable are used as two different interfaces. Maybe
> > we can rename RichFunction to Configurable?
> >
> >
> > Cheers,
> > Jeyhun
> >
> > On Tue, May 23, 2017 at 2:58 PM Michal Borowiecki
> > mailto:michal.borowie...@openbet.com>>
> > wrote:
> >
> > Hi Jeyhun,
> >
> > I understand your argument about "Rich" in RichFunctions. Perhaps
> > I'm just being too puritan here, but let me ask this anyway:
> >
> > What is it that makes something a function? To me a function is
> > something that takes zero or more arguments and possibly returns a
> > value and while it may have side-effects (as opposed to "pure
> > functions" which can't), it doesn't have any life-cycle of its own.
> > This is what, in my mind, distinguishes the concept of a "function"
> > from that of more vaguely defined concepts.
> >
> > So if we add a life-cycle to a function, in that understanding, it
> > doesn't become a rich function but instead stops being a function
> > altogether.
> >
> > You could say it's "just semantics" but to me precise use of
> > language in the given context is an important foundation for good
> > engineering. And in the context of programming "function" has a
> > precise meaning. Of course we can say that in the context of Kafka
> > Streams "function" has a different, looser meaning but I'd argue
> > that won't do anyone any good.
> >
> > On the other hand other frameworks such as Flink use this
> > terminology, so it could be that consistency is the reason. I'm
> > guessing that's why the name was proposed in the first place. My
> > point is simply that it's a poor choice of wording and Kafka Streams
> > don't have to follow that to the letter.
> >
> > Cheers,
> >
> > Michal
> >
> >
> > On 23/05/17 13:26, Jeyhun Karimov wrote:
> >> Hi Michal,
> >>
> >> Thanks for your comments.
> >>
> >>
> >> To me at least it feels strange that something is called a
> >> function yet doesn't follow the functional interface
> >> definition of having just one abstract method. I suppose init
> >> and close could be made default methods with empty bodies once
> >> Java 7 support is dropped to mitigate that concern. Still, I
> >> feel some resistance to consider something that requires
> >> initialisation and closing (which implies holding state) as
> >> being a function. Sounds more like the Processor/Transformer
> >> kind of thing semantically, rather than a function.
> >>
> >>
> >>  -  If we called the interface just Function, your assumptions
> >> would hold. However, the keyword Rich by definition implies that we
> >> have a function (as you described, with one abstract method,
> >> etc.) but a rich one. So, there are multiple methods in it.
> >> Ideally it should be:
> >>
> >> public interface RichFunction extends Function {  // the Function you described
> >>   void close();
> >>   void init(Some params);
> >>   ...
> >> }
> >>
> >>
> >> The KIP says there are multiple use-

Build failed in Jenkins: kafka-trunk-jdk8 #1604

2017-05-27 Thread Apache Jenkins Server
See 


Changes:

[ismael] MINOR: Remove unused method parameter in `SimpleAclAuthorizer`

--
[...truncated 2.90 MB...]

kafka.api.AuthorizerIntegrationTest > 
testTransactionalProducerTopicAuthorizationExceptionInSendCallback PASSED

kafka.api.AuthorizerIntegrationTest > 
testTransactionalProducerInitTransactionsNoWriteTransactionalIdAcl STARTED

kafka.api.AuthorizerIntegrationTest > 
testTransactionalProducerInitTransactionsNoWriteTransactionalIdAcl PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoGroupAccess STARTED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoGroupAccess PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoAccess STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoAccess PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithoutTopicDescribeAccess 
STARTED

kafka.api.AuthorizerIntegrationTest > testConsumeWithoutTopicDescribeAccess 
PASSED

kafka.api.AuthorizerIntegrationTest > 
shouldThrowTransactionalIdAuthorizationExceptionWhenNoTransactionAccessOnEndTransaction
 STARTED

kafka.api.AuthorizerIntegrationTest > 
shouldThrowTransactionalIdAuthorizationExceptionWhenNoTransactionAccessOnEndTransaction
 PASSED

kafka.api.AuthorizerIntegrationTest > 
shouldThrowTransactionalIdAuthorizationExceptionWhenNoTransactionAccessOnSendOffsetsToTxn
 STARTED

kafka.api.AuthorizerIntegrationTest > 
shouldThrowTransactionalIdAuthorizationExceptionWhenNoTransactionAccessOnSendOffsetsToTxn
 PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoGroupAccess STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoGroupAccess PASSED

kafka.api.AuthorizerIntegrationTest > 
testTransactionalProducerInitTransactionsNoDescribeTransactionalIdAcl STARTED

kafka.api.AuthorizerIntegrationTest > 
testTransactionalProducerInitTransactionsNoDescribeTransactionalIdAcl PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithNoAccess STARTED

kafka.api.AuthorizerIntegrationTest > testConsumeWithNoAccess PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithTopicAndGroupRead 
STARTED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithTopicAndGroupRead 
PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicDescribe STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testAuthorizationWithTopicExisting STARTED

kafka.api.AuthorizerIntegrationTest > testAuthorizationWithTopicExisting PASSED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicDescribe STARTED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > 
testPatternSubscriptionMatchingInternalTopic STARTED

kafka.api.AuthorizerIntegrationTest > 
testPatternSubscriptionMatchingInternalTopic PASSED

kafka.api.AuthorizerIntegrationTest > 
testSendOffsetsWithNoConsumerGroupDescribeAccess STARTED

kafka.api.AuthorizerIntegrationTest > 
testSendOffsetsWithNoConsumerGroupDescribeAccess PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchTopicDescribe STARTED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicAndGroupRead STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicAndGroupRead PASSED

kafka.api.AuthorizerIntegrationTest > 
testIdempotentProducerNoIdempotentWriteAclInInitProducerId STARTED

kafka.api.AuthorizerIntegrationTest > 
testIdempotentProducerNoIdempotentWriteAclInInitProducerId PASSED

kafka.api.AuthorizerIntegrationTest > 
testSimpleConsumeWithExplicitSeekAndNoGroupAccess STARTED

kafka.api.AuthorizerIntegrationTest > 
testSimpleConsumeWithExplicitSeekAndNoGroupAccess PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testAcls STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testAcls PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testTwoConsumersWithDifferentSaslCredentials STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testTwoConsumersWithDifferentSaslCredentials PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaSubscribe STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaSubscribe PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testProduceConsumeViaAssign 
STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testProduceConsumeViaAssign 
PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaAssign STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaAssign PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaSubscribe STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaSubscribe PASSED

kafka.api.SaslPlainSs

Efficient way of Searching Messages By Timestamp - Kafka

2017-05-27 Thread SenthilKumar K
Hello Kafka Developers , Users ,

We are exploring the SearchMessageByTimestamp feature in Kafka for our
use case .

Use Case: Kafka will be a realtime message bus; users should be able to
pull logs by specifying start_date and end_date, or pull the last five
minutes of data, etc.

I did a POC on SearchMessageByTimestamp; here is the code:
https://gist.github.com/senthilec566/16e8e28b32834666fea132afc3a4e2f9 . And
I observed that searching messages is slow.

Here is a small test I did:
Query: Fetch logs of the last *5 minutes*:
Result:
No of Records fetched : *30*
Fetch Time *6210* ms

The above test was performed on a topic with 4 partitions. In each partition,
search and query processing happened; in other words,
consumer.offsetsForTimes()
consumer.assign(Arrays.asList(partition))
consumer.seek(this.partition, offsetTimestamp.offset())
consumer.poll(100)

are the API calls made for each partition. I realized that this was the reason
Kafka was taking more time.

What is an efficient way of implementing SearchMessageByTimestamp? Is Kafka
the right candidate for our use case?

Pls add your thoughts here ...


Cheers,
Senthil


Jenkins build is back to normal : kafka-trunk-jdk8 #1603

2017-05-27 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-0.11.0-jdk7 #41

2017-05-27 Thread Apache Jenkins Server
See 


Changes:

[ismael] MINOR: Cleanup in tests to avoid threads being left behind

--
[...truncated 902.84 KB...]
kafka.controller.ControllerEventManagerTest > testSuccessfulEvent PASSED

kafka.controller.ControllerIntegrationTest > 
testPartitionReassignmentWithOfflineReplicaHaltingProgress STARTED

kafka.controller.ControllerIntegrationTest > 
testPartitionReassignmentWithOfflineReplicaHaltingProgress PASSED

kafka.controller.ControllerIntegrationTest > 
testControllerEpochPersistsWhenAllBrokersDown STARTED

kafka.controller.ControllerIntegrationTest > 
testControllerEpochPersistsWhenAllBrokersDown PASSED

kafka.controller.ControllerIntegrationTest > 
testTopicCreationWithOfflineReplica STARTED

kafka.controller.ControllerIntegrationTest > 
testTopicCreationWithOfflineReplica PASSED

kafka.controller.ControllerIntegrationTest > 
testPartitionReassignmentResumesAfterReplicaComesOnline STARTED

kafka.controller.ControllerIntegrationTest > 
testPartitionReassignmentResumesAfterReplicaComesOnline PASSED

kafka.controller.ControllerIntegrationTest > 
testLeaderAndIsrWhenEntireIsrOfflineAndUncleanLeaderElectionDisabled STARTED

kafka.controller.ControllerIntegrationTest > 
testLeaderAndIsrWhenEntireIsrOfflineAndUncleanLeaderElectionDisabled PASSED

kafka.controller.ControllerIntegrationTest > 
testTopicPartitionExpansionWithOfflineReplica STARTED

kafka.controller.ControllerIntegrationTest > 
testTopicPartitionExpansionWithOfflineReplica PASSED

kafka.controller.ControllerIntegrationTest > 
testPreferredReplicaLeaderElectionWithOfflinePreferredReplica STARTED

kafka.controller.ControllerIntegrationTest > 
testPreferredReplicaLeaderElectionWithOfflinePreferredReplica PASSED

kafka.controller.ControllerIntegrationTest > 
testAutoPreferredReplicaLeaderElection STARTED

kafka.controller.ControllerIntegrationTest > 
testAutoPreferredReplicaLeaderElection PASSED

kafka.controller.ControllerIntegrationTest > testTopicCreation STARTED

kafka.controller.ControllerIntegrationTest > testTopicCreation PASSED

kafka.controller.ControllerIntegrationTest > testPartitionReassignment STARTED

kafka.controller.ControllerIntegrationTest > testPartitionReassignment PASSED

kafka.controller.ControllerIntegrationTest > testTopicPartitionExpansion STARTED

kafka.controller.ControllerIntegrationTest > testTopicPartitionExpansion PASSED

kafka.controller.ControllerIntegrationTest > 
testControllerMoveIncrementsControllerEpoch STARTED

kafka.controller.ControllerIntegrationTest > 
testControllerMoveIncrementsControllerEpoch PASSED

kafka.controller.ControllerIntegrationTest > 
testLeaderAndIsrWhenEntireIsrOfflineAndUncleanLeaderElectionEnabled STARTED

kafka.controller.ControllerIntegrationTest > 
testLeaderAndIsrWhenEntireIsrOfflineAndUncleanLeaderElectionEnabled PASSED

kafka.controller.ControllerIntegrationTest > testEmptyCluster STARTED

kafka.controller.ControllerIntegrationTest > testEmptyCluster PASSED

kafka.controller.ControllerIntegrationTest > testPreferredReplicaLeaderElection 
STARTED

kafka.controller.ControllerIntegrationTest > testPreferredReplicaLeaderElection 
PASSED

kafka.controller.ControllerFailoverTest > testMetadataUpdate STARTED

kafka.controller.ControllerFailoverTest > testMetadataUpdate SKIPPED

kafka.tools.ConsoleProducerTest > testParseKeyProp STARTED

kafka.tools.ConsoleProducerTest > testParseKeyProp PASSED

kafka.tools.ConsoleProducerTest > testValidConfigsOldProducer STARTED

kafka.tools.ConsoleProducerTest > testValidConfigsOldProducer PASSED

kafka.tools.ConsoleProducerTest > testInvalidConfigs STARTED

kafka.tools.ConsoleProducerTest > testInvalidConfigs PASSED

kafka.tools.ConsoleProducerTest > testValidConfigsNewProducer STARTED

kafka.tools.ConsoleProducerTest > testValidConfigsNewProducer PASSED

kafka.tools.ConsoleConsumerTest > shouldLimitReadsToMaxMessageLimit STARTED

kafka.tools.ConsoleConsumerTest > shouldLimitReadsToMaxMessageLimit PASSED

kafka.tools.ConsoleConsumerTest > shouldParseValidNewConsumerValidConfig STARTED

kafka.tools.ConsoleConsumerTest > shouldParseValidNewConsumerValidConfig PASSED

kafka.tools.ConsoleConsumerTest > shouldStopWhenOutputCheckErrorFails STARTED

kafka.tools.ConsoleConsumerTest > shouldStopWhenOutputCheckErrorFails PASSED

kafka.tools.ConsoleConsumerTest > 
shouldParseValidNewSimpleConsumerValidConfigWithStringOffset STARTED

kafka.tools.ConsoleConsumerTest > 
shouldParseValidNewSimpleConsumerValidConfigWithStringOffset PASSED

kafka.tools.ConsoleConsumerTest > shouldParseConfigsFromFile STARTED

kafka.tools.ConsoleConsumerTest > shouldParseConfigsFromFile PASSED

kafka.tools.ConsoleConsumerTest > 
shouldParseValidNewSimpleConsumerValidConfigWithNumericOffset STARTED

kafka.tools.ConsoleConsumerTest > 
shouldParseValidNewSimpleConsumerValidConfigWithNumericOffset PASSED

kafka.tools.ConsoleConsumerTest > testDefaultConsumer STARTED

ka

[jira] [Commented] (KAFKA-5338) There is a Misspell in ResetIntegrationTest

2017-05-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027371#comment-16027371
 ] 

ASF GitHub Bot commented on KAFKA-5338:
---

GitHub user hejiefang opened a pull request:

https://github.com/apache/kafka/pull/3159

[KAFKA-5338]There is a Misspell in ResetIntegrationTest



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hejiefang/kafka KAFKA-5338

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3159.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3159


commit b74e6b332c0f5f827d2d5b4d3bbcacf4e3c2d523
Author: hejiefang 
Date:   2017-05-27T09:23:42Z

[KAFKA-5338]There is a Misspell in ResetIntegrationTest




> There is a Misspell in ResetIntegrationTest
> ---
>
> Key: KAFKA-5338
> URL: https://issues.apache.org/jira/browse/KAFKA-5338
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.1.0
>Reporter: hejiefang
>
> There is a misspelling in the annotations of ResetIntegrationTest.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3159: [KAFKA-5338]There is a Misspell in ResetIntegratio...

2017-05-27 Thread hejiefang
GitHub user hejiefang opened a pull request:

https://github.com/apache/kafka/pull/3159

[KAFKA-5338]There is a Misspell in ResetIntegrationTest



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hejiefang/kafka KAFKA-5338

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3159.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3159


commit b74e6b332c0f5f827d2d5b4d3bbcacf4e3c2d523
Author: hejiefang 
Date:   2017-05-27T09:23:42Z

[KAFKA-5338]There is a Misspell in ResetIntegrationTest




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3147: MINOR: Remove unused method parameter in `SimpleAc...

2017-05-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3147


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3146: MINOR: Cleanup in tests to avoid threads being lef...

2017-05-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3146


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4340) Change the default value of log.message.timestamp.difference.max.ms to the same as log.retention.ms

2017-05-27 Thread Magnus Edenhill (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027366#comment-16027366
 ] 

Magnus Edenhill commented on KAFKA-4340:


Since this is a change to the protocol API (a change of behaviour), I suspect 
that a proper KIP is required, and I would recommend reverting this commit for 
the upcoming release.


> Change the default value of log.message.timestamp.difference.max.ms to the 
> same as log.retention.ms
> ---
>
> Key: KAFKA-4340
> URL: https://issues.apache.org/jira/browse/KAFKA-4340
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.10.1.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.11.0.0
>
>
> [~junrao] brought up the following scenario: 
> If users are pumping data whose timestamps are already older than 
> log.retention.ms into Kafka, the messages will be appended to the log but will 
> be deleted as soon as the log retention thread kicks in. 
> To avoid this produce-and-delete scenario, we can set the default value of 
> log.message.timestamp.difference.max.ms to be the same as log.retention.ms.
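
To make the proposal concrete, here is a minimal Scala sketch of the kind of
broker-side check this default governs; the object, method name, and exception
are illustrative assumptions, not the actual LogValidator code:

    object TimestampCheck {
      // Hypothetical sketch: reject a record whose timestamp differs from broker
      // time by more than maxDiffMs (log.message.timestamp.difference.max.ms).
      def validate(recordTimestampMs: Long, nowMs: Long, maxDiffMs: Long): Unit =
        if (math.abs(nowMs - recordTimestampMs) > maxDiffMs)
          throw new IllegalArgumentException(
            s"timestamp $recordTimestampMs differs from broker time $nowMs " +
            s"by more than $maxDiffMs ms")
    }

With the proposed default, maxDiffMs would equal log.retention.ms, so a record
that retention would delete immediately is rejected at produce time instead.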



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5338) There is a Misspell in ResetIntegrationTest

2017-05-27 Thread hejiefang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hejiefang updated KAFKA-5338:
-
Affects Version/s: 0.11.1.0

> There is a Misspell in ResetIntegrationTest
> ---
>
> Key: KAFKA-5338
> URL: https://issues.apache.org/jira/browse/KAFKA-5338
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.1.0
>Reporter: hejiefang
>
> There is a misspelling in the annotations of ResetIntegrationTest.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5338) There is a Misspell in ResetIntegrationTest

2017-05-27 Thread hejiefang (JIRA)
hejiefang created KAFKA-5338:


 Summary: There is a Misspell in ResetIntegrationTest
 Key: KAFKA-5338
 URL: https://issues.apache.org/jira/browse/KAFKA-5338
 Project: Kafka
  Issue Type: Bug
Reporter: hejiefang


There is a misspelling in the annotations of ResetIntegrationTest.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5093) Load only batch header when rebuilding producer ID map

2017-05-27 Thread Umesh Chaudhary (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027362#comment-16027362
 ] 

Umesh Chaudhary commented on KAFKA-5093:


Tried to correlate it with the KIP but was unable to locate the intended piece 
of code to tweak. If possible, can you please point it out? 

> Load only batch header when rebuilding producer ID map
> --
>
> Key: KAFKA-5093
> URL: https://issues.apache.org/jira/browse/KAFKA-5093
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> When rebuilding the producer ID map for KIP-98, we unnecessarily load the 
> full record data into memory when scanning through the log. It would be 
> better to only load the batch header since it is all that is needed.
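
As a rough illustration of the optimization, here is a hedged Scala sketch that
skims a log segment batch by batch, reading only header fields and seeking past
the payload; the two-field layout is a simplification for the sketch, not the
real DefaultRecordBatch format:

    import java.io.RandomAccessFile

    object BatchHeaderScan {
      // Assumed toy layout per batch: [baseOffset: Long][batchLength: Int][records...]
      def scan(path: String): Unit = {
        val raf = new RandomAccessFile(path, "r")
        try {
          while (raf.getFilePointer < raf.length) {
            val baseOffset  = raf.readLong()   // header field we actually need
            val batchLength = raf.readInt()    // size of the remaining batch bytes
            raf.seek(raf.getFilePointer + batchLength) // skip the record data entirely
            println(s"batch baseOffset=$baseOffset length=$batchLength")
          }
        } finally raf.close()
      }
    }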



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (KAFKA-5078) PartitionRecords.fetchRecords(...) should defer exception to the next call if iterator has already moved across any valid record

2017-05-27 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin resolved KAFKA-5078.
-
Resolution: Fixed

> PartitionRecords.fetchRecords(...) should defer exception to the next call if 
> iterator has already moved across any valid record
> 
>
> Key: KAFKA-5078
> URL: https://issues.apache.org/jira/browse/KAFKA-5078
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
> Fix For: 0.11.0.0
>
>
> Suppose there are two valid records followed by one invalid record in the 
> FetchResponse.PartitionData(). In the current implementation, 
> PartitionRecords.fetchRecords(...) will throw an exception without returning 
> the two valid records. The next call to PartitionRecords.fetchRecords(...) 
> will not return those two valid records either, because the iterator has 
> already moved past them.
> We can fix this problem by deferring the exception to the next call of 
> PartitionRecords.fetchRecords(...) if the iterator has already moved past any 
> valid record.
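
A minimal Scala sketch of the defer-the-exception pattern described above; the
types and names are illustrative, not the actual consumer Fetcher code:

    object DeferredFetch {
      private var pendingException: Option[Exception] = None

      // Return the records parsed before a failure; rethrow the failure on the
      // next call instead of silently dropping the already-consumed records.
      def fetchRecords(parsed: Iterator[String]): List[String] = {
        pendingException.foreach { e => pendingException = None; throw e }
        val valid = scala.collection.mutable.ListBuffer.empty[String]
        try {
          while (parsed.hasNext) valid += parsed.next()
        } catch {
          case e: Exception =>
            if (valid.isEmpty) throw e   // nothing consumed yet: fail immediately
            pendingException = Some(e)   // otherwise defer so valid records survive
        }
        valid.toList
      }
    }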



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5078) PartitionRecords.fetchRecords(...) should defer exception to the next call if iterator has already moved across any valid record

2017-05-27 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-5078:

Fix Version/s: 0.11.0.0

> PartitionRecords.fetchRecords(...) should defer exception to the next call if 
> iterator has already moved across any valid record
> 
>
> Key: KAFKA-5078
> URL: https://issues.apache.org/jira/browse/KAFKA-5078
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
> Fix For: 0.11.0.0
>
>
> Suppose there are two valid records followed by one invalid record in the 
> FetchResponse.PartitionData(). In the current implementation, 
> PartitionRecords.fetchRecords(...) will throw an exception without returning 
> the two valid records. The next call to PartitionRecords.fetchRecords(...) 
> will not return those two valid records either, because the iterator has 
> already moved past them.
> We can fix this problem by deferring the exception to the next call of 
> PartitionRecords.fetchRecords(...) if the iterator has already moved past any 
> valid record.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5326) Log offset index resize cause empty index file

2017-05-27 Thread Ma Tianchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ma Tianchi updated KAFKA-5326:
--
Status: Patch Available  (was: Open)

> Log offset index  resize cause empty  index file
> 
>
> Key: KAFKA-5326
> URL: https://issues.apache.org/jira/browse/KAFKA-5326
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.10.2.1
>Reporter: Ma Tianchi
>
> After a log index file is lost or removed while the Kafka server is running, 
> a subsequent call to OffsetIndex.resize(newSize: Int) creates a new index 
> file with no values in it, even though its size matches the old one.
> It does the following:
> val raf = new RandomAccessFile(file, "rw")
> At this point the file does not exist, so a new empty file is created.
> Then at:
> mmap = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, roundedNewSize)
> the mmap is created from the new file, which is empty.
> And then:
> mmap.position(position)
> This makes the new index file the same size as the old index file, but with 
> nothing in it.
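
The failure mode can be reproduced outside Kafka with the same sequence of
calls; a self-contained Scala sketch, with an arbitrary file name and sizes:

    import java.io.{File, RandomAccessFile}
    import java.nio.channels.FileChannel

    object ResizeRepro {
      def main(args: Array[String]): Unit = {
        val file = new File("missing.index")        // stand-in for the lost index file
        file.delete()
        val raf  = new RandomAccessFile(file, "rw") // silently creates an empty file
        val mmap = raf.getChannel.map(FileChannel.MapMode.READ_WRITE, 0, 1024)
        mmap.position(512)  // same size and position as the old index, but every
        raf.close()         // entry that used to be in it is now zeroes
        println(s"recreated ${file.getName}: length=${file.length}, contents empty")
      }
    }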



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5319) Add a tool to make cluster replica and leader balance

2017-05-27 Thread Ma Tianchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ma Tianchi updated KAFKA-5319:
--
Attachment: KAFKA-5319.patch

> Add a tool to make cluster replica and leader balance
> -
>
> Key: KAFKA-5319
> URL: https://issues.apache.org/jira/browse/KAFKA-5319
> Project: Kafka
>  Issue Type: New Feature
>  Components: admin
>Affects Versions: 0.10.2.1
>Reporter: Ma Tianchi
>  Labels: patch
> Attachments: KAFKA-5319.patch
>
>
> When a new broker is added to the cluster, it does not host any topics yet. 
> When we use the console command to create a topic without 
> 'replicaAssignmentOpt', Kafka uses AdminUtils.assignReplicasToBrokers to get 
> the replica assignment. Even if this is balanced at creation time, as more 
> and more brokers are added to the cluster the replica balance degrades if the 
> assignments never change. We can also use 'kafka-reassign-partitions.sh' to 
> rebalance, but the large number of topics makes that tedious. At topic 
> creation time, Kafka also chooses a PreferredReplicaLeader, placed at the 
> first position of the AR, to keep leaders balanced, but that balance is 
> destroyed when the cluster changes. Using 'kafka-reassign-partitions.sh' for 
> partition reassignment may likewise destroy the leader balance, because the 
> user can change the AR of a partition. The cluster may then be unbalanced, 
> yet Kafka considers leader balance fine as long as every leader is the first 
> replica in its AR.
> So we created a tool to balance the number of replicas and the number of 
> leaders on every broker. It uses an algorithm to compute a balanced replica 
> and leader reassignment, then uses ReassignPartitionsCommand to actually 
> rebalance the cluster.
> It can be used when brokers are added to the cluster or when the cluster is 
> otherwise unbalanced. It does not handle moving replicas from a dead broker 
> to an alive one; it only balances replicas and leaders across alive brokers.
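
To make the goal concrete, here is a toy Scala sketch of the sort of
computation such a tool performs, spreading replica counts evenly across alive
brokers; this is purely illustrative and may differ from the algorithm in the
attached patch:

    object ReplicaBalance {
      // Given current replica counts per alive broker, compute target counts
      // so that no two brokers differ by more than one replica.
      def targetCounts(current: Map[Int, Int]): Map[Int, Int] = {
        val brokers = current.keys.toSeq.sorted
        val total   = current.values.sum
        val base    = total / brokers.size
        val extra   = total % brokers.size
        brokers.zipWithIndex.map { case (b, i) =>
          b -> (base + (if (i < extra) 1 else 0))
        }.toMap
      }

      def main(args: Array[String]): Unit =
        println(targetCounts(Map(1 -> 10, 2 -> 7, 3 -> 1))) // Map(1 -> 6, 2 -> 6, 3 -> 6)
    }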



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5319) Add a tool to make cluster replica and leader balance

2017-05-27 Thread Ma Tianchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ma Tianchi updated KAFKA-5319:
--
  Labels: patch  (was: )
Reviewer:   (was: xuzq)
  Status: Patch Available  (was: Reopened)

> Add a tool to make cluster replica and leader balance
> -
>
> Key: KAFKA-5319
> URL: https://issues.apache.org/jira/browse/KAFKA-5319
> Project: Kafka
>  Issue Type: New Feature
>  Components: admin
>Affects Versions: 0.10.2.1
>Reporter: Ma Tianchi
>  Labels: patch
>
> When a new broker is added to the cluster, it does not host any topics yet. 
> When we use the console command to create a topic without 
> 'replicaAssignmentOpt', Kafka uses AdminUtils.assignReplicasToBrokers to get 
> the replica assignment. Even if this is balanced at creation time, as more 
> and more brokers are added to the cluster the replica balance degrades if the 
> assignments never change. We can also use 'kafka-reassign-partitions.sh' to 
> rebalance, but the large number of topics makes that tedious. At topic 
> creation time, Kafka also chooses a PreferredReplicaLeader, placed at the 
> first position of the AR, to keep leaders balanced, but that balance is 
> destroyed when the cluster changes. Using 'kafka-reassign-partitions.sh' for 
> partition reassignment may likewise destroy the leader balance, because the 
> user can change the AR of a partition. The cluster may then be unbalanced, 
> yet Kafka considers leader balance fine as long as every leader is the first 
> replica in its AR.
> So we created a tool to balance the number of replicas and the number of 
> leaders on every broker. It uses an algorithm to compute a balanced replica 
> and leader reassignment, then uses ReassignPartitionsCommand to actually 
> rebalance the cluster.
> It can be used when brokers are added to the cluster or when the cluster is 
> otherwise unbalanced. It does not handle moving replicas from a dead broker 
> to an alive one; it only balances replicas and leaders across alive brokers.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)