[jira] [Commented] (KAFKA-1791) Corrupt index after safe shutdown and restart

2014-11-21 Thread Vamsi Subhash Achanta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221851#comment-14221851
 ] 

Vamsi Subhash Achanta commented on KAFKA-1791:
--

Attached the file.

> Corrupt index after safe shutdown and restart
> -
>
> Key: KAFKA-1791
> URL: https://issues.apache.org/jira/browse/KAFKA-1791
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.1
> Environment: Debian6 with Sun-Java6
>Reporter: Vamsi Subhash Achanta
>Priority: Critical
> Attachments: 0233.index
>
>
> We have 3 kafka brokers - all VMs. One of the brokers was stopped for around 
> 30 minutes to fix a problem with the bare metal. Some time after restart, the 
> broker ran out of file descriptors (FDs) and started throwing errors. Upon the 
> next restart, it started throwing these corrupted-index exceptions. I followed 
> the other JIRAs related to corrupted indices, but the solution mentioned there 
> (deleting the index and restarting) didn't work - the index gets created again. 
> The other JIRAs' solution of deleting those indexes which got wrongly 
> compacted (> 10MB) didn't work either.
> What is the error? How can I fix this and bring back the broker? Thanks.
> INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Found clean 
> shutdown file. Skipping recovery for all logs in data directory 
> '/var/lib/fk-3p-kafka/logs'
>  INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Loading log 
> 'kf.production.b2b.return_order.status-25'
> FATAL [2014-11-21 02:57:17,533] [main][] kafka.server.KafkaServerStartable - 
> Fatal error during KafkaServerStable startup. Prepare to shutdown
> java.lang.IllegalArgumentException: requirement failed: Corrupt index found, 
> index file 
> (/var/lib/fk-3p-kafka/logs/kf.production.b2b.return_order.status-25/0233.index)
>  has non-zero size but the last offset is 233 and the base offset is 233
>   at scala.Predef$.require(Predef.scala:145)
>   at kafka.log.OffsetIndex.sanityCheck(OffsetIndex.scala:352)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:159)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:158)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:631)
>   at 
> scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:474)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
>   at 
> scala.collection.JavaConversions$JCollectionWrapper.foreach(JavaConversions.scala:495)
>   at kafka.log.Log.loadSegments(Log.scala:158)
>   at kafka.log.Log.<init>(Log.scala:64)
>   at 
> kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:118)
>   at 
> kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:113)
>   at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:34)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:113)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:105)
>   at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:32)
>   at kafka.log.LogManager.loadLogs(LogManager.scala:105)
>   at kafka.log.LogManager.<init>(LogManager.scala:57)
>   at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:275)
>   at kafka.server.KafkaServer.startup(KafkaServer.scala:72)
>   at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
>   at kafka.Kafka$.main(Kafka.scala:46)
>   at kafka.Kafka.main(Kafka.scala)
>  INFO [2014-11-21 02:57:17,534] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shutting down
>  INFO [2014-11-21 02:57:17,538] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shut down completed
>  INFO [2014-11-21 02:57:17,539] [Thread-2][] kafka.server.KafkaServer - 
> [Kafka Server 2], shutting down
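
For context, the fatal requirement in the log above comes from 
kafka.log.OffsetIndex.sanityCheck. A minimal sketch of the invariant it 
enforces, reconstructed from the error message (illustrative, not the exact 
source):

    def sanityCheck(fileSizeBytes: Long, lastOffset: Long, baseOffset: Long): Unit =
      // A non-empty index whose last entry does not point past the segment's
      // base offset is treated as corrupt, which is exactly the reported case
      // (last offset 233 == base offset 233 with a non-zero file size).
      require(fileSizeBytes == 0 || lastOffset > baseOffset,
        s"Corrupt index found: non-zero size but the last offset is $lastOffset " +
        s"and the base offset is $baseOffset")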



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1791) Corrupt index after safe shutdown and restart

2014-11-21 Thread Vamsi Subhash Achanta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vamsi Subhash Achanta updated KAFKA-1791:
-
Attachment: 0233.index

This happened over a period of time, with the exception thrown again and again. 
The file size increased to 10MB when the file was copied. Was it compressed?

> Corrupt index after safe shutdown and restart
> -
>
> Key: KAFKA-1791
> URL: https://issues.apache.org/jira/browse/KAFKA-1791
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.1
> Environment: Debian6 with Sun-Java6
>Reporter: Vamsi Subhash Achanta
>Priority: Critical
> Attachments: 0233.index
>
>
> We have 3 kafka brokers - all VMs. One of the brokers was stopped for around 
> 30 minutes to fix a problem with the bare metal. Some time after restart, the 
> broker ran out of file descriptors (FDs) and started throwing errors. Upon the 
> next restart, it started throwing these corrupted-index exceptions. I followed 
> the other JIRAs related to corrupted indices, but the solution mentioned there 
> (deleting the index and restarting) didn't work - the index gets created again. 
> The other JIRAs' solution of deleting those indexes which got wrongly 
> compacted (> 10MB) didn't work either.
> What is the error? How can I fix this and bring back the broker? Thanks.
> INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Found clean 
> shutdown file. Skipping recovery for all logs in data directory 
> '/var/lib/fk-3p-kafka/logs'
>  INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Loading log 
> 'kf.production.b2b.return_order.status-25'
> FATAL [2014-11-21 02:57:17,533] [main][] kafka.server.KafkaServerStartable - 
> Fatal error during KafkaServerStable startup. Prepare to shutdown
> java.lang.IllegalArgumentException: requirement failed: Corrupt index found, 
> index file 
> (/var/lib/fk-3p-kafka/logs/kf.production.b2b.return_order.status-25/0233.index)
>  has non-zero size but the last offset is 233 and the base offset is 233
>   at scala.Predef$.require(Predef.scala:145)
>   at kafka.log.OffsetIndex.sanityCheck(OffsetIndex.scala:352)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:159)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:158)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:631)
>   at 
> scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:474)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
>   at 
> scala.collection.JavaConversions$JCollectionWrapper.foreach(JavaConversions.scala:495)
>   at kafka.log.Log.loadSegments(Log.scala:158)
>   at kafka.log.Log.<init>(Log.scala:64)
>   at 
> kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:118)
>   at 
> kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:113)
>   at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:34)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:113)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:105)
>   at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:32)
>   at kafka.log.LogManager.loadLogs(LogManager.scala:105)
>   at kafka.log.LogManager.<init>(LogManager.scala:57)
>   at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:275)
>   at kafka.server.KafkaServer.startup(KafkaServer.scala:72)
>   at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
>   at kafka.Kafka$.main(Kafka.scala:46)
>   at kafka.Kafka.main(Kafka.scala)
>  INFO [2014-11-21 02:57:17,534] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shutting down
>  INFO [2014-11-21 02:57:17,538] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shut down completed
>  INFO [2014-11-21 02:57:17,539] [Thread-2][] kafka.server.KafkaServer - 
> [Kafka Server 2], shutting down



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1789) Issue with Async producer

2014-11-21 Thread devendra tagare (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

devendra tagare updated KAFKA-1789:
---
Priority: Critical  (was: Major)

> Issue with Async producer
> -
>
> Key: KAFKA-1789
> URL: https://issues.apache.org/jira/browse/KAFKA-1789
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.0, 0.8.1
>Reporter: devendra tagare
>Priority: Critical
>
> Hi,
> We are using an async producer to send data to a kafka cluster. The event rate 
> at peak is around 250 events/second, of size 25KB each.
> In the producer code base we have added specific debug statements to capture 
> the time taken to create a producer, create a keyed message with a byte 
> payload, and send the message.
> We have added the below properties to the producerConfig:
> queue.enqueue.timeout.ms=20
> send.buffer.bytes=1024000
> topic.metadata.refresh.interval.ms=3
> Based on the documentation, producer.send() queues the message on the async 
> producer's queue.
> So, ideally, if the queue is full then the enqueue operation should result in 
> a kafka.common.QueueFullException within 20 ms.
> The logs indicate that the enqueue operation is taking more than 20ms (around 
> 250ms) without throwing any exceptions.
> Is there any other property that could conflict with queue.enqueue.timeout.ms 
> and cause this behavior?
> Or is it possible that the queue is not full and yet the producer.send() call 
> is still taking around 200ms under peak load?
> Also, could you suggest any other alternatives so that we can either enforce 
> a timeout or throw an exception in case the async producer takes more than a 
> specified amount of time.
> Regards,
> Dev
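
For reference, the reported expectation can be written down as a minimal sketch 
against the 0.8.x Scala producer API (broker address, topic name, and the 
serializer settings are placeholder assumptions, not taken from the reporter's 
setup):

    import java.util.Properties
    import kafka.common.QueueFullException
    import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

    val props = new Properties()
    props.put("metadata.broker.list", "broker1:9092")            // placeholder
    props.put("producer.type", "async")
    props.put("serializer.class", "kafka.serializer.DefaultEncoder")
    props.put("key.serializer.class", "kafka.serializer.StringEncoder")
    props.put("queue.enqueue.timeout.ms", "20")  // block at most 20 ms on a full queue
    props.put("send.buffer.bytes", "1024000")

    val producer = new Producer[String, Array[Byte]](new ProducerConfig(props))
    try {
      // With a full internal queue, send() should fail with QueueFullException
      // after ~20 ms rather than blocking for hundreds of milliseconds.
      producer.send(new KeyedMessage[String, Array[Byte]]("some-topic", "key",
        new Array[Byte](25 * 1024)))
    } catch {
      case _: QueueFullException => // drop, buffer elsewhere, or alert
    }

Timing the send() call around this block is a reasonable way to confirm whether 
the latency is in the enqueue itself or elsewhere.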



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 28268: Bump up default scala version from 2.10.1 to 2.10.4

2014-11-21 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28268/#review62721
---

Ship it!


Looks good, although the patch does not apply - can you rebase and check that 
it builds properly? Then we can close this out.

- Joel Koshy


On Nov. 20, 2014, 12:23 a.m., Guozhang Wang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28268/
> ---
> 
> (Updated Nov. 20, 2014, 12:23 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1624
> https://issues.apache.org/jira/browse/KAFKA-1624
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Compiles with Java 1.6/7/8
> 
> 
> Diffs
> -
> 
>   bin/kafka-run-class.sh 36c742b67a7259fa35c3ed862f7ccc4673b69d9f 
>   bin/windows/kafka-run-class.bat 8e9780e2eb74a35a726787155c09b151d0ba 
>   build.gradle 11eb11355efddacf62d61690ad13b9c82a200230 
>   gradle.properties 5d3155fd4461438d8b2ec4faa9534cc2383d4951 
>   scala.gradle 6adf9af7dbbe71e68a07b387c3854d8c9ad339e0 
> 
> Diff: https://reviews.apache.org/r/28268/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>



Re: [DISCUSSION] Message Metadata

2014-11-21 Thread Timothy Chen
Hi Guozhang,

I don't think that is publicly accessible - can you move it to the
Kafka wiki?

Tim

On Fri, Nov 21, 2014 at 5:24 PM, Guozhang Wang  wrote:
> Hi all,
>
> I have updated the wiki page (
> https://iwww.corp.linkedin.com/wiki/cf/display/ENGS/Kafka+Enriched+Message+Metadata)
> according to people's comments and discussions offline.
>
> Guozhang
>
> On Thu, Nov 13, 2014 at 9:43 AM, Guozhang Wang  wrote:
>
>> Hi Jun,
>>
>> Sorry for the delay on your comments in the wiki page as well as this
>> thread; quite swamped now. I will get back to you as soon as I find some
>> time.
>>
>> Guozhang
>>
>> On Tue, Nov 11, 2014 at 6:26 PM, Jun Rao  wrote:
>>
>>> Thinking about this a bit more. For adding the auditing support, I am not
>>> sure if we need to change the message format by adding the application
>>> tags. An alternative way to do that is to add it in the producer client.
>>> For example, for each message payload (doesn't matter what the
>>> serialization mechanism is) that a producer receives, the producer can
>>> just
>>> add a header before the original payload. The header will contain all
>>> needed fields (e.g. timestamp, host, etc) for the purpose of auditing.
>>> This
>>> way, we don't need to change the message format and the auditing info can
>>> be added independent of the serialization mechanism of the message. The
>>> header can use a different serialization mechanism for better efficiency.
>>> For example, if we use Avro to serialize the header, the encoded bytes
>>> won't include the field names in the header. This is potentially more
>>> efficient than representing those fields as application tags in the
>>> message
>>> where the tags have to be explicitly stored in every message.
>>>
>>> To make it easier for the client to add and make use of this kind of
>>> auditing support, I was imagining that we can add a ProducerFactory in the
>>> new java client. The ProducerFactory will create an instance of Producer
>>> based on a config property. By default, the current KafkaProducer will be
>>> returned. However, a user can plug in a different implementation of
>>> Producer that does auditing. For example, an implementation of an
>>> AuditProducer.send() can take the original ProducerRecord, add the header
>>> to the value byte array and then forward the record to an underlying
>>> KafkaProducer. We can add a similar ConsumerFactory to the new consumer
>>> client. If a user plugs in an implementation of the AuditingConsumer, the
>>> consumer will then be audited automatically.
>>>
>>> Thanks,
>>>
>>> Jun
>>>
>>> On Tue, Oct 21, 2014 at 4:06 PM, Guozhang Wang 
>>> wrote:
>>>
>>> > Hi Jun,
>>> >
>>> > Regarding 4) in your comment, after thinking about it for a while I
>>> > cannot come up with a way to do it along with log compaction without
>>> > adding new fields into the current format of the message set. Do you
>>> > have a better way that does not require protocol changes?
>>> >
>>> > Guozhang
>>> >
>>> > On Mon, Oct 20, 2014 at 9:53 AM, Guozhang Wang 
>>> wrote:
>>> >
>>> > > I have updated the wiki page incorporating received comments. We can
>>> > > discuss some more details on:
>>> > >
>>> > > 1. How do we want to do auditing? Whether we want to have in-built
>>> > > auditing on brokers or even MMs, or use an audit consumer to fetch
>>> > > all messages from just the brokers.
>>> > >
>>> > > 2. How we can avoid de-/re-compression on brokers and MMs with log
>>> > > compaction turned on.
>>> > >
>>> > > 3. How we can resolve the data inconsistency resulting from unclean
>>> > > leader election with control messages.
>>> > >
>>> > > Guozhang
>>> > >
>>> > > On Sun, Oct 19, 2014 at 11:41 PM, Guozhang Wang 
>>> > > wrote:
>>> > >
>>> > >> Thanks for the detailed comments Jun! Some replies inlined.
>>> > >>
>>> > >> On Sun, Oct 19, 2014 at 7:42 PM, Jun Rao  wrote:
>>> > >>
>>> > >>> Hi, Guozhang,
>>> > >>>
>>> > >>> Thanks for the writeup.
>>> > >>>
>>> > >>> A few high level comments.
>>> > >>>
>>> > >>> 1. Associating (versioned) schemas to a topic can be a good thing
>>> > >>> overall.
>>> > >>> Yes, this could add a bit more management overhead in Kafka.
>>> However,
>>> > it
>>> > >>> makes sure that the data format contract between a producer and a
>>> > >>> consumer
>>> > >>> is kept and managed in a central place, instead of in the
>>> application.
>>> > >>> The
>>> > >>> latter is probably easier to start with, but is likely to be
>>> brittle in
>>> > >>> the
>>> > >>> long run.
>>> > >>>
>>> > >>
>>> > >> I am actually not proposing that we not support associated versioned
>>> > >> schemas for topics, but that we not let some core Kafka
>>> > >> functionalities like auditing depend on schemas. I think this alone
>>> > >> can separate schema management from Kafka piping management (i.e.
>>> > >> making sure every single message is delivered, within some latency,
>>> > >> etc.). Adding additional auditing info into an existing schema will
>>> > >> force Kafka to be aware of th

Re: [DISCUSSION] Message Metadata

2014-11-21 Thread Guozhang Wang
Hi all,

I have updated the wiki page (
https://iwww.corp.linkedin.com/wiki/cf/display/ENGS/Kafka+Enriched+Message+Metadata)
according to people's comments and discussions offline.

Guozhang

On Thu, Nov 13, 2014 at 9:43 AM, Guozhang Wang  wrote:

> Hi Jun,
>
> Sorry for the delay on your comments in the wiki page as well as this
> thread; quite swamped now. I will get back to you as soon as I find some
> time.
>
> Guozhang
>
> On Tue, Nov 11, 2014 at 6:26 PM, Jun Rao  wrote:
>
>> Thinking about this a bit more. For adding the auditing support, I am not
>> sure if we need to change the message format by adding the application
>> tags. An alternative way to do that is to add it in the producer client.
>> For example, for each message payload (doesn't matter what the
>> serialization mechanism is) that a producer receives, the producer can
>> just
>> add a header before the original payload. The header will contain all
>> needed fields (e.g. timestamp, host, etc) for the purpose of auditing.
>> This
>> way, we don't need to change the message format and the auditing info can
>> be added independent of the serialization mechanism of the message. The
>> header can use a different serialization mechanism for better efficiency.
>> For example, if we use Avro to serialize the header, the encoded bytes
>> won't include the field names in the header. This is potentially more
>> efficient than representing those fields as application tags in the
>> message
>> where the tags have to be explicitly stored in every message.
>>
>> To make it easier for the client to add and make use of this kind of
>> auditing support, I was imagining that we can add a ProducerFactory in the
>> new java client. The ProducerFactory will create an instance of Producer
>> based on a config property. By default, the current KafkaProducer will be
>> returned. However, a user can plug in a different implementation of
>> Producer that does auditing. For example, an implementation of an
>> AuditProducer.send() can take the original ProducerRecord, add the header
>> to the value byte array and then forward the record to an underlying
>> KafkaProducer. We can add a similar ConsumerFactory to the new consumer
>> client. If a user plugs in an implementation of the AuditingConsumer, the
>> consumer will then be audited automatically.
>>
>> Thanks,
>>
>> Jun
>>
>> On Tue, Oct 21, 2014 at 4:06 PM, Guozhang Wang 
>> wrote:
>>
>> > Hi Jun,
>> >
>> > Regarding 4) in your comment, after thinking about it for a while I
>> > cannot come up with a way to do it along with log compaction without
>> > adding new fields into the current format of the message set. Do you
>> > have a better way that does not require protocol changes?
>> >
>> > Guozhang
>> >
>> > On Mon, Oct 20, 2014 at 9:53 AM, Guozhang Wang 
>> wrote:
>> >
>> > > I have updated the wiki page incorporating received comments. We can
>> > > discuss some more details on:
>> > >
>> > > 1. How do we want to do auditing? Whether we want to have in-built
>> > > auditing on brokers or even MMs, or use an audit consumer to fetch
>> > > all messages from just the brokers.
>> > >
>> > > 2. How we can avoid de-/re-compression on brokers and MMs with log
>> > > compaction turned on.
>> > >
>> > > 3. How we can resolve the data inconsistency resulting from unclean
>> > > leader election with control messages.
>> > >
>> > > Guozhang
>> > >
>> > > On Sun, Oct 19, 2014 at 11:41 PM, Guozhang Wang 
>> > > wrote:
>> > >
>> > >> Thanks for the detailed comments Jun! Some replies inlined.
>> > >>
>> > >> On Sun, Oct 19, 2014 at 7:42 PM, Jun Rao  wrote:
>> > >>
>> > >>> Hi, Guozhang,
>> > >>>
>> > >>> Thanks for the writeup.
>> > >>>
>> > >>> A few high level comments.
>> > >>>
>> > >>> 1. Associating (versioned) schemas to a topic can be a good thing
>> > >>> overall.
>> > >>> Yes, this could add a bit more management overhead in Kafka.
>> However,
>> > it
>> > >>> makes sure that the data format contract between a producer and a
>> > >>> consumer
>> > >>> is kept and managed in a central place, instead of in the
>> application.
>> > >>> The
>> > >>> latter is probably easier to start with, but is likely to be
>> brittle in
>> > >>> the
>> > >>> long run.
>> > >>>
>> > >>
>> > >> I am actually not proposing that we not support associated versioned
>> > >> schemas for topics, but that we not let some core Kafka
>> > >> functionalities like auditing depend on schemas. I think this alone
>> > >> can separate schema management from Kafka piping management (i.e.
>> > >> making sure every single message is delivered, within some latency,
>> > >> etc.). Adding additional auditing info into an existing schema will
>> > >> force Kafka to be aware of the schema systems (Avro, JSON, etc.).
>> > >>
>> > >>
>> > >>>
>> > >>> 2. Auditing can be a general feature that's useful for many
>> > applications.
>> > >>> Such a feature can be implemented by extending the low level message
>> > >>> format
>> > >>> with a header. However, it can al
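
Jun's header idea can be made concrete with a small sketch (the header layout, 
field set, and function names below are assumptions for illustration, not an 
actual Kafka format or API):

    import java.nio.ByteBuffer

    case class AuditHeader(timestampMs: Long, host: String)

    // Prepend a fixed-layout header to the payload bytes before handing the
    // record to the real producer; a consumer that knows the convention strips
    // it off again. Avro, as suggested above, could replace this hand-rolled
    // encoding for better efficiency.
    def withAuditHeader(h: AuditHeader, payload: Array[Byte]): Array[Byte] = {
      val hostBytes = h.host.getBytes("UTF-8")
      val buf = ByteBuffer.allocate(8 + 4 + hostBytes.length + payload.length)
      buf.putLong(h.timestampMs)
      buf.putInt(hostBytes.length)
      buf.put(hostBytes)
      buf.put(payload)
      buf.array()
    }

    def stripAuditHeader(bytes: Array[Byte]): (AuditHeader, Array[Byte]) = {
      val buf = ByteBuffer.wrap(bytes)
      val timestampMs = buf.getLong()
      val hostBytes = new Array[Byte](buf.getInt())
      buf.get(hostBytes)
      val payload = new Array[Byte](buf.remaining())
      buf.get(payload)
      (AuditHeader(timestampMs, new String(hostBytes, "UTF-8")), payload)
    }

A hypothetical AuditProducer.send() would apply withAuditHeader to the record 
value and forward the result to an underlying KafkaProducer, as described 
above.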

[jira] [Commented] (KAFKA-1737) Document required ZkSerializer for ZkClient used with AdminUtils

2014-11-21 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221526#comment-14221526
 ] 

Guozhang Wang commented on KAFKA-1737:
--

[~vivekpm] you should be able to assign the ticket to yourself now.

> Document required ZkSerializer for ZkClient used with AdminUtils
> 
>
> Key: KAFKA-1737
> URL: https://issues.apache.org/jira/browse/KAFKA-1737
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 0.8.1.1
>Reporter: Stevo Slavic
>Priority: Minor
>
> {{ZkClient}} instances passed to {{AdminUtils}} calls must have 
> {{kafka.utils.ZKStringSerializer}} set as {{ZkSerializer}}. Otherwise 
> commands executed via {{AdminUtils}} may not be seen/recognized by brokers, 
> producers, or consumers. E.g. a producer (with auto topic creation turned off) 
> will not be able to send messages to a topic created via {{AdminUtils}}; it 
> will throw {{UnknownTopicOrPartitionException}}.
> Please consider at least documenting this requirement in {{AdminUtils}} 
> scaladoc.
> For more info see [related discussion on Kafka user mailing 
> list|http://mail-archives.apache.org/mod_mbox/kafka-users/201410.mbox/%3CCAAUywg-oihNiXuQRYeS%3D8Z3ymsmEHo6ghLs%3Dru4nbm%2BdHVz6TA%40mail.gmail.com%3E].
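
A minimal sketch of the requirement (host, timeouts, topic name, and 
partition/replication values are placeholders):

    import java.util.Properties
    import kafka.admin.AdminUtils
    import kafka.utils.ZKStringSerializer
    import org.I0Itec.zkclient.ZkClient

    // Without ZKStringSerializer, the znodes written by AdminUtils are encoded
    // differently from what brokers expect, so the new topic is not recognized.
    val zkClient = new ZkClient("zk-host:2181", 30000, 30000, ZKStringSerializer)
    AdminUtils.createTopic(zkClient, "some-topic", 3, 1, new Properties())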



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1770) The description of UnknownTopicOrPartitionException in doc is not accurate.

2014-11-21 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1770:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> The description of UnknownTopicOrPartitionException in doc is not accurate.
> ---
>
> Key: KAFKA-1770
> URL: https://issues.apache.org/jira/browse/KAFKA-1770
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Attachments: KAFKA-1770.patch, KAFKA-1770_2014-11-19_16:37:11.patch
>
>
> It was "Indicates an unknown topic or a partition id not between 0 and 
> numPartitions-1", whereas should be
> "Indicates one of the following situation: 
> 1. Partition id is not between 0 - numPartitions-1
> 2. Partition id for the topic does not exist on the broker (This could happen 
> when partitions are reassigned)."



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 28040: Patch for KAFKA-1770

2014-11-21 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28040/#review62665
---

Ship it!


Ship It!

- Guozhang Wang


On Nov. 20, 2014, 12:37 a.m., Jiangjie Qin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28040/
> ---
> 
> (Updated Nov. 20, 2014, 12:37 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1770
> https://issues.apache.org/jira/browse/KAFKA-1770
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Addressed Guozhang's comments
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/common/UnknownTopicOrPartitionException.scala 
> 781e551e5b78b5f436431575c2961fe15acd1414 
> 
> Diff: https://reviews.apache.org/r/28040/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jiangjie Qin
> 
>



[jira] [Created] (KAFKA-1794) Make config and config defaults accessible to clients

2014-11-21 Thread Navina Ramesh (JIRA)
Navina Ramesh created KAFKA-1794:


 Summary: Make config and config defaults accessible to clients
 Key: KAFKA-1794
 URL: https://issues.apache.org/jira/browse/KAFKA-1794
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Navina Ramesh


In the new Kafka producer API, the ProducerConfig is not accessible to 
clients. Samza uses the ProducerConfig instance to access the default property 
values, which can then be used in various helper utils. The config instance 
should be accessible even without instantiating a Kafka producer.
With the new API, there is no way to instantiate a ProducerConfig, as the 
constructor is marked private. It also does not make the default config values 
accessible to the client without actually instantiating a KafkaProducer.

Changes suggested:
1. Make the ProducerConfig constructor public
2. Make ConfigDef in ProducerConfig accessible by the client
3. Use public static variables for kafka config default "values" 
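
A sketch of what change (3) could look like (names and values below are 
hypothetical, for illustration only):

    // Hypothetical: defaults exposed as public constants so a client such as
    // Samza can read them without constructing a KafkaProducer.
    object ProducerConfigDefaults {
      val Acks: String = "1"
      val BatchSizeBytes: Int = 16384
      val LingerMs: Long = 0L
    }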




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1634) Improve semantics of timestamp in OffsetCommitRequests and update documentation

2014-11-21 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1634:
-
Attachment: KAFKA-1634_2014-11-21_14:00:34.patch

> Improve semantics of timestamp in OffsetCommitRequests and update 
> documentation
> ---
>
> Key: KAFKA-1634
> URL: https://issues.apache.org/jira/browse/KAFKA-1634
> Project: Kafka
>  Issue Type: Bug
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
>Priority: Blocker
> Fix For: 0.8.3
>
> Attachments: KAFKA-1634.patch, KAFKA-1634_2014-11-06_15:35:46.patch, 
> KAFKA-1634_2014-11-07_16:54:33.patch, KAFKA-1634_2014-11-17_17:42:44.patch, 
> KAFKA-1634_2014-11-21_14:00:34.patch
>
>
> From the mailing list -
> following up on this -- I think the online API docs for OffsetCommitRequest
> still incorrectly refer to client-side timestamps:
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest
> Wasn't that removed and now always handled server-side now?  Would one of
> the devs mind updating the API spec wiki?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1634) Improve semantics of timestamp in OffsetCommitRequests and update documentation

2014-11-21 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221482#comment-14221482
 ] 

Guozhang Wang commented on KAFKA-1634:
--

Updated reviewboard https://reviews.apache.org/r/27391/diff/
 against branch origin/trunk

> Improve semantics of timestamp in OffsetCommitRequests and update 
> documentation
> ---
>
> Key: KAFKA-1634
> URL: https://issues.apache.org/jira/browse/KAFKA-1634
> Project: Kafka
>  Issue Type: Bug
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
>Priority: Blocker
> Fix For: 0.8.3
>
> Attachments: KAFKA-1634.patch, KAFKA-1634_2014-11-06_15:35:46.patch, 
> KAFKA-1634_2014-11-07_16:54:33.patch, KAFKA-1634_2014-11-17_17:42:44.patch, 
> KAFKA-1634_2014-11-21_14:00:34.patch
>
>
> From the mailing list -
> following up on this -- I think the online API docs for OffsetCommitRequest
> still incorrectly refer to client-side timestamps:
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest
> Wasn't that removed and now always handled server-side now?  Would one of
> the devs mind updating the API spec wiki?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27391: Fix KAFKA-1634

2014-11-21 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27391/
---

(Updated Nov. 21, 2014, 10 p.m.)


Review request for kafka.


Bugs: KAFKA-1634
https://issues.apache.org/jira/browse/KAFKA-1634


Repository: kafka


Description
---

The timestamp field of OffsetAndMetadata is preserved since we need to be 
backward compatible with older versions


Diffs (updated)
-

  clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
7517b879866fc5dad5f8d8ad30636da8bbe7784a 
  
clients/src/main/java/org/apache/kafka/common/requests/OffsetCommitRequest.java 
3ee5cbad55ce836fd04bb954dcf6ef2f9bc3288f 
  
clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java 
df37fc6d8f0db0b8192a948426af603be3444da4 
  core/src/main/scala/kafka/api/OffsetCommitRequest.scala 
050615c72efe7dbaa4634f53943bd73273d20ffb 
  core/src/main/scala/kafka/api/OffsetFetchRequest.scala 
c7604b9cdeb8f46507f04ed7d35fcb3d5f150713 
  core/src/main/scala/kafka/common/OffsetMetadataAndError.scala 
4cabffeacea09a49913505db19a96a55d58c0909 
  core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
3e1718bc7ca6c835a59ca7c6879f558ec06866ee 
  core/src/main/scala/kafka/server/KafkaApis.scala 
968b0c4f809ea9f9c3f322aef9db105f9d2e0e23 
  core/src/main/scala/kafka/server/KafkaServer.scala 
1bf7d10cef23a77e71eb16bf6d0e68bc4ebe 
  core/src/main/scala/kafka/server/OffsetManager.scala 
2957bc435102bc4004d8f100dbcdd56287c8ffae 
  core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
cd16ced5465d098be7a60498326b2a98c248f343 
  core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
8c5364fa97da1be09973c176d1baeb339455d319 

Diff: https://reviews.apache.org/r/27391/diff/


Testing
---


Thanks,

Guozhang Wang



Re: Review Request 27391: Fix KAFKA-1634

2014-11-21 Thread Guozhang Wang


> On Nov. 21, 2014, 2:41 p.m., Joel Koshy wrote:
> > core/src/main/scala/kafka/server/OffsetManager.scala, line 500
> > 
> >
> > This and the above use give compilation warnings. I think it is 
> > reasonable to "copy" those constants here to get rid of the warnings since 
> > we are anyway in an undesirable state right now of maintaining a mirror 
> > wire-protocol implementation in scala

Not sure what you mean by "copy those constants"? I did the copy inside the if 
block, and it is the condition itself that produces the compilation warning.


- Guozhang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27391/#review62553
---


On Nov. 18, 2014, 1:42 a.m., Guozhang Wang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27391/
> ---
> 
> (Updated Nov. 18, 2014, 1:42 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1634
> https://issues.apache.org/jira/browse/KAFKA-1634
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> The timestamp field of OffsetAndMetadata is preserved since we need to be 
> backward compatible with older versions
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
> 7517b879866fc5dad5f8d8ad30636da8bbe7784a 
>   
> clients/src/main/java/org/apache/kafka/common/requests/OffsetCommitRequest.java
>  3ee5cbad55ce836fd04bb954dcf6ef2f9bc3288f 
>   
> clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java
>  df37fc6d8f0db0b8192a948426af603be3444da4 
>   core/src/main/scala/kafka/api/OffsetCommitRequest.scala 
> 050615c72efe7dbaa4634f53943bd73273d20ffb 
>   core/src/main/scala/kafka/api/OffsetFetchRequest.scala 
> c7604b9cdeb8f46507f04ed7d35fcb3d5f150713 
>   core/src/main/scala/kafka/common/OffsetMetadataAndError.scala 
> 4cabffeacea09a49913505db19a96a55d58c0909 
>   core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
> f476973eeff653473a60c3ecf36e870e386536bc 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 968b0c4f809ea9f9c3f322aef9db105f9d2e0e23 
>   core/src/main/scala/kafka/server/KafkaServer.scala 
> 4de812374e8fb1fed834d2be3f9655f55b511a74 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 2957bc435102bc4004d8f100dbcdd56287c8ffae 
>   core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
> cd16ced5465d098be7a60498326b2a98c248f343 
>   core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
> 8c5364fa97da1be09973c176d1baeb339455d319 
> 
> Diff: https://reviews.apache.org/r/27391/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>



Re: Review Request 27391: Fix KAFKA-1634

2014-11-21 Thread Guozhang Wang


> On Nov. 21, 2014, 2:41 p.m., Joel Koshy wrote:
> > core/src/main/scala/kafka/server/KafkaApis.scala, line 147
> > 
> >
> > I think what you meant earlier was that with < v2 you could have different 
> > timestamps for each partition, so a global retention won't work with < v2.

In the OffsetManager, the timestamp will only be overridden by the global one if 
it is set to the default, which is -1; so for v0/1, although the retention is 
used, it will not override the timestamp.
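
A minimal sketch of that rule (illustrative only, not the actual OffsetManager 
code):

    val DefaultTimestamp: Long = -1L

    // A per-partition commit timestamp of -1 means "use the global value";
    // an explicit timestamp, as v0/v1 requests can carry, is kept as-is.
    def effectiveTimestamp(partitionTimestamp: Long, globalTimestamp: Long): Long =
      if (partitionTimestamp == DefaultTimestamp) globalTimestamp
      else partitionTimestamp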


> On Nov. 21, 2014, 2:41 p.m., Joel Koshy wrote:
> > core/src/main/scala/kafka/common/OffsetMetadataAndError.scala, line 17
> > 
> >
> > I thought we would be going with a separate format for on-disk storage?
> > 
> > E.g., one thing that is extremely useful (until we have timestamp as a 
> > first-class field of messages) is to have the receive time of the 
> > offsetcommit in the stored offset entries. This is very useful for 
> > debugging.

Yes, they are separate: for on-disk storage the timestamp will always be 
stored, while on the wire only v0/1 will contain that value; for v2 this value 
will be computed via retention. So the on-disk format is specified as 
OffsetAndMetadata, and when we deprecate v0/1 and add the timestamp to the 
message header we will replace this with OffsetMetadata.


- Guozhang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27391/#review62553
---


On Nov. 18, 2014, 1:42 a.m., Guozhang Wang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27391/
> ---
> 
> (Updated Nov. 18, 2014, 1:42 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1634
> https://issues.apache.org/jira/browse/KAFKA-1634
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> The timestamp field of OffsetAndMetadata is preserved since we need to be 
> backward compatible with older versions
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
> 7517b879866fc5dad5f8d8ad30636da8bbe7784a 
>   
> clients/src/main/java/org/apache/kafka/common/requests/OffsetCommitRequest.java
>  3ee5cbad55ce836fd04bb954dcf6ef2f9bc3288f 
>   
> clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java
>  df37fc6d8f0db0b8192a948426af603be3444da4 
>   core/src/main/scala/kafka/api/OffsetCommitRequest.scala 
> 050615c72efe7dbaa4634f53943bd73273d20ffb 
>   core/src/main/scala/kafka/api/OffsetFetchRequest.scala 
> c7604b9cdeb8f46507f04ed7d35fcb3d5f150713 
>   core/src/main/scala/kafka/common/OffsetMetadataAndError.scala 
> 4cabffeacea09a49913505db19a96a55d58c0909 
>   core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
> f476973eeff653473a60c3ecf36e870e386536bc 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 968b0c4f809ea9f9c3f322aef9db105f9d2e0e23 
>   core/src/main/scala/kafka/server/KafkaServer.scala 
> 4de812374e8fb1fed834d2be3f9655f55b511a74 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 2957bc435102bc4004d8f100dbcdd56287c8ffae 
>   core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
> cd16ced5465d098be7a60498326b2a98c248f343 
>   core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
> 8c5364fa97da1be09973c176d1baeb339455d319 
> 
> Diff: https://reviews.apache.org/r/27391/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>



Re: Review Request 28240: Follow-up KAFKA-1580

2014-11-21 Thread Guozhang Wang


> On Nov. 21, 2014, 1:48 p.m., Joel Koshy wrote:
> > core/src/main/scala/kafka/server/KafkaApis.scala, line 193
> > 
> >
> > I think it is more suitable to have this access control implemented 
> > inside ReplicaManager. One can argue both ways though...

I actually went back and forth quite a few times back then. The main reason I 
chose to do this in the Kafka API layer is that otherwise the offset manager / 
replica manager has to handle operations with clientId, which defeats the 
purpose of the refactoring effort.


- Guozhang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28240/#review62547
---


On Nov. 19, 2014, 7:01 p.m., Guozhang Wang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28240/
> ---
> 
> (Updated Nov. 19, 2014, 7:01 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1580
> https://issues.apache.org/jira/browse/KAFKA-1580
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Add the logic for checking internal topics in the replica manager layer
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/admin/AdminUtils.scala 
> 94c53320b768b83b0324336b73cc06537c0fe78d 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 968b0c4f809ea9f9c3f322aef9db105f9d2e0e23 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 2957bc435102bc4004d8f100dbcdd56287c8ffae 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 3007a6d89b637b93f71fdb7adab561a93d9c4c62 
>   core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
> 8531f533f3a6431f4f9fc8cb1ad3e9f1f1b110e0 
> 
> Diff: https://reviews.apache.org/r/28240/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>



Re: Review Request 27634: Patch for KAFKA-1667

2014-11-21 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27634/#review62608
---


Thanks for the new patch. A few more comments below.


core/src/main/scala/kafka/log/LogConfig.scala


This is a useful general check for the boolean type. Could we include the 
validation in ConfigDef.parseType() when parsing the boolean type? We probably 
want to make it case insensitive too. Then, we can make this a boolean type.
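
A sketch of the suggested strict, case-insensitive parsing (illustrative, not 
the actual ConfigDef code):

    def parseBoolean(value: String): Boolean =
      value.trim.toLowerCase match {
        case "true"  => true
        case "false" => false
        case other   => throw new IllegalArgumentException(
          s"Expected 'true' or 'false' (case insensitive), got '$other'")
      }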



core/src/test/scala/kafka/log/LogConfigTest.scala


Could this be private?



core/src/test/scala/kafka/log/LogConfigTest.scala


Could this be private?


- Jun Rao


On Nov. 16, 2014, 5:33 p.m., Dmytro Kostiuchenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27634/
> ---
> 
> (Updated Nov. 16, 2014, 5:33 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1667
> https://issues.apache.org/jira/browse/KAFKA-1667
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1667 Fixed bugs in LogConfig. Added test and documentation
> 
> 
> KAFKA-1667 Updated tests to reflect new boolean property parsing logic
> 
> 
> KAFKA-1667 renamed methods to match naming convention
> 
> 
> KAFKA-1667 Added unit test to cover invalid configuration case
> 
> 
> KAFKA-1667 Strict UncleanLeaderElection property parsing
> 
> 
> KAFKA-1667 minor non-functional re-factoring
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/common/config/ConfigDef.java 
> c4cea2cc072f4db4ce014b63d226431d3766bef1 
>   core/src/main/scala/kafka/admin/TopicCommand.scala 
> 0b2735e7fc42ef9894bef1997b1f06a8ebee5439 
>   core/src/main/scala/kafka/log/LogConfig.scala 
> e48922a97727dd0b98f3ae630ebb0af3bef2373d 
>   core/src/main/scala/kafka/utils/Utils.scala 
> 23aefb4715b177feae1d2f83e8b910653ea10c5f 
>   core/src/test/scala/kafka/log/LogConfigTest.scala PRE-CREATION 
>   core/src/test/scala/unit/kafka/integration/UncleanLeaderElectionTest.scala 
> f44568cb25edf25db857415119018fd4c9922f61 
> 
> Diff: https://reviews.apache.org/r/27634/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Dmytro Kostiuchenko
> 
>



[jira] [Created] (KAFKA-1793) Allow replacing ZooKeeper with a different distributed coordination system

2014-11-21 Thread Paul Lorenz (JIRA)
Paul Lorenz created KAFKA-1793:
--

 Summary: Allow replacing ZooKeeper with a different distributed 
coordination system
 Key: KAFKA-1793
 URL: https://issues.apache.org/jira/browse/KAFKA-1793
 Project: Kafka
  Issue Type: Wish
  Components: replication
Affects Versions: 0.8.1.1
Reporter: Paul Lorenz
Assignee: Neha Narkhede
Priority: Minor


It would simplify our deployment infrastructure if we could reuse our current 
service discovery and distributed configuration system (Consul) instead of 
requiring a separate ZK installation. 

I looked through the code and it looks like ZK code (specifically zkClient) is 
scattered throughout the codebase.

It would be nice if any distributed coordination code that used ZK was behind a 
single (or set of) API. It would then hopefully be straightforward to provide a 
replacement implementation for that API. The implementing class could be 
specified in the config.

One potential benefit would be the ability to provide implementations for 
different versions of ZK. 

Is there any interest in having this feature available?
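
The wished-for abstraction might look something like this (a hypothetical 
interface sketched for illustration; nothing like it exists in Kafka today):

    // A single coordination API the rest of the codebase would call, with
    // ZooKeeper- or Consul-backed implementations selected via config.
    trait CoordinationClient {
      def createEphemeral(path: String, data: String): Unit
      def createPersistent(path: String, data: String): Unit
      def readData(path: String): Option[String]
      def subscribeDataChanges(path: String)(onChange: Option[String] => Unit): Unit
      def close(): Unit
    }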



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1791) Corrupt index after safe shutdown and restart

2014-11-21 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221155#comment-14221155
 ] 

Jun Rao commented on KAFKA-1791:


Just to be clear. So, you deleted 0233.index and then restarted 
the broker. The broker then hit the IllegalArgumentException during restart and 
left a 40KB 0233.index?

If that's the case, could you attach 0233.log to the jira?

> Corrupt index after safe shutdown and restart
> -
>
> Key: KAFKA-1791
> URL: https://issues.apache.org/jira/browse/KAFKA-1791
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.1
> Environment: Debian6 with Sun-Java6
>Reporter: Vamsi Subhash Achanta
>Priority: Critical
>
> We have 3 kafka brokers - all VMs. One of the brokers was stopped for around 
> 30 minutes to fix a problem with the bare metal. Some time after restart, the 
> broker ran out of file descriptors (FDs) and started throwing errors. Upon the 
> next restart, it started throwing these corrupted-index exceptions. I followed 
> the other JIRAs related to corrupted indices, but the solution mentioned there 
> (deleting the index and restarting) didn't work - the index gets created again. 
> The other JIRAs' solution of deleting those indexes which got wrongly 
> compacted (> 10MB) didn't work either.
> What is the error? How can I fix this and bring back the broker? Thanks.
> INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Found clean 
> shutdown file. Skipping recovery for all logs in data directory 
> '/var/lib/fk-3p-kafka/logs'
>  INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Loading log 
> 'kf.production.b2b.return_order.status-25'
> FATAL [2014-11-21 02:57:17,533] [main][] kafka.server.KafkaServerStartable - 
> Fatal error during KafkaServerStable startup. Prepare to shutdown
> java.lang.IllegalArgumentException: requirement failed: Corrupt index found, 
> index file 
> (/var/lib/fk-3p-kafka/logs/kf.production.b2b.return_order.status-25/0233.index)
>  has non-zero size but the last offset is 233 and the base offset is 233
>   at scala.Predef$.require(Predef.scala:145)
>   at kafka.log.OffsetIndex.sanityCheck(OffsetIndex.scala:352)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:159)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:158)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:631)
>   at 
> scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:474)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
>   at 
> scala.collection.JavaConversions$JCollectionWrapper.foreach(JavaConversions.scala:495)
>   at kafka.log.Log.loadSegments(Log.scala:158)
>   at kafka.log.Log.<init>(Log.scala:64)
>   at 
> kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:118)
>   at 
> kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:113)
>   at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:34)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:113)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:105)
>   at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:32)
>   at kafka.log.LogManager.loadLogs(LogManager.scala:105)
>   at kafka.log.LogManager.<init>(LogManager.scala:57)
>   at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:275)
>   at kafka.server.KafkaServer.startup(KafkaServer.scala:72)
>   at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
>   at kafka.Kafka$.main(Kafka.scala:46)
>   at kafka.Kafka.main(Kafka.scala)
>  INFO [2014-11-21 02:57:17,534] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shutting down
>  INFO [2014-11-21 02:57:17,538] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shut down completed
>  INFO [2014-11-21 02:57:17,539] [Thread-2][] kafka.server.KafkaServer - 
> [Kafka Server 2], shutting down



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1173) Using Vagrant to get up and running with Apache Kafka

2014-11-21 Thread Joe Stein (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221118#comment-14221118
 ] 

Joe Stein commented on KAFKA-1173:
--

[~ewencp] I think we both agree we are not trying to deal with every use case, 
just making sure folks don't have a bad experience. Let me go through your 
latest patch; I should be in a place where I can commit it when I get some time 
for spin-ups/downs and such again. Thanks!

> Using Vagrant to get up and running with Apache Kafka
> -
>
> Key: KAFKA-1173
> URL: https://issues.apache.org/jira/browse/KAFKA-1173
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Joe Stein
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.8.3
>
> Attachments: KAFKA-1173.patch, KAFKA-1173_2013-12-07_12:07:55.patch, 
> KAFKA-1173_2014-11-11_13:50:55.patch, KAFKA-1173_2014-11-12_11:32:09.patch, 
> KAFKA-1173_2014-11-18_16:01:33.patch
>
>
> Vagrant has been getting a lot of pickup in the tech communities.  I have 
> found it very useful for development and testing and working with a few 
> clients now using it to help virtualize their environments in repeatable ways.
> Using Vagrant to get up and running.
> For 0.8.0 I have a patch on github https://github.com/stealthly/kafka
> 1) Install Vagrant [http://www.vagrantup.com/](http://www.vagrantup.com/)
> 2) Install Virtual Box 
> [https://www.virtualbox.org/](https://www.virtualbox.org/)
> In the main kafka folder
> 1) ./sbt update
> 2) ./sbt package
> 3) ./sbt assembly-package-dependency
> 4) vagrant up
> once this is done 
> * Zookeeper will be running 192.168.50.5
> * Broker 1 on 192.168.50.10
> * Broker 2 on 192.168.50.20
> * Broker 3 on 192.168.50.30
> When you are all up and running you will be back at a command prompt.  
> If you want you can log in to the machines using vagrant ssh, but 
> you don't need to.
> You can access the brokers and zookeeper by their IP
> e.g.
> bin/kafka-console-producer.sh --broker-list 
> 192.168.50.10:9092,192.168.50.20:9092,192.168.50.30:9092 --topic sandbox
> bin/kafka-console-consumer.sh --zookeeper 192.168.50.5:2181 --topic sandbox 
> --from-beginning



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27391: Fix KAFKA-1634

2014-11-21 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27391/#review62553
---



core/src/main/scala/kafka/api/OffsetCommitRequest.scala


`>=`



core/src/main/scala/kafka/api/OffsetCommitRequest.scala


`>=`



core/src/main/scala/kafka/common/OffsetMetadataAndError.scala


I thought we would be going with a separate format for on-disk storage?

E.g., one thing that is extremely useful (until we have timestamp as a 
first-class field of messages) is to have the receive time of the offsetcommit 
in the stored offset entries. This is very useful for debugging.



core/src/main/scala/kafka/server/KafkaApis.scala


I think what you meant earlier was that with < v2 you could have different 
timestamps for each partition, so a global retention won't work with < v2.



core/src/main/scala/kafka/server/OffsetManager.scala


use



core/src/main/scala/kafka/server/OffsetManager.scala


This and the above use give compilation warnings. I think it is reasonable 
to "copy" those constants here to get rid of the warnings since we are anyway 
in an undesirable state right now of maintaining a mirror wire-protocol 
implementation in scala


- Joel Koshy


On Nov. 18, 2014, 1:42 a.m., Guozhang Wang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27391/
> ---
> 
> (Updated Nov. 18, 2014, 1:42 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1634
> https://issues.apache.org/jira/browse/KAFKA-1634
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> The timestamp field of OffsetAndMetadata is preserved since we need to be 
> backward compatible with older versions
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
> 7517b879866fc5dad5f8d8ad30636da8bbe7784a 
>   
> clients/src/main/java/org/apache/kafka/common/requests/OffsetCommitRequest.java
>  3ee5cbad55ce836fd04bb954dcf6ef2f9bc3288f 
>   
> clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java
>  df37fc6d8f0db0b8192a948426af603be3444da4 
>   core/src/main/scala/kafka/api/OffsetCommitRequest.scala 
> 050615c72efe7dbaa4634f53943bd73273d20ffb 
>   core/src/main/scala/kafka/api/OffsetFetchRequest.scala 
> c7604b9cdeb8f46507f04ed7d35fcb3d5f150713 
>   core/src/main/scala/kafka/common/OffsetMetadataAndError.scala 
> 4cabffeacea09a49913505db19a96a55d58c0909 
>   core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
> f476973eeff653473a60c3ecf36e870e386536bc 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 968b0c4f809ea9f9c3f322aef9db105f9d2e0e23 
>   core/src/main/scala/kafka/server/KafkaServer.scala 
> 4de812374e8fb1fed834d2be3f9655f55b511a74 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 2957bc435102bc4004d8f100dbcdd56287c8ffae 
>   core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
> cd16ced5465d098be7a60498326b2a98c248f343 
>   core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
> 8c5364fa97da1be09973c176d1baeb339455d319 
> 
> Diff: https://reviews.apache.org/r/27391/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>



Re: Review Request 28240: Follow-up KAFKA-1580

2014-11-21 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28240/#review62547
---

Ship it!


Minor comment.


core/src/main/scala/kafka/server/KafkaApis.scala


I think it is more suitable to have this access control implemented inside 
ReplicaManager. One can argue both ways though...


- Joel Koshy


On Nov. 19, 2014, 7:01 p.m., Guozhang Wang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28240/
> ---
> 
> (Updated Nov. 19, 2014, 7:01 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1580
> https://issues.apache.org/jira/browse/KAFKA-1580
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Add the logic for checking internal topics in the replica manager layer
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/admin/AdminUtils.scala 
> 94c53320b768b83b0324336b73cc06537c0fe78d 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 968b0c4f809ea9f9c3f322aef9db105f9d2e0e23 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 2957bc435102bc4004d8f100dbcdd56287c8ffae 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 3007a6d89b637b93f71fdb7adab561a93d9c4c62 
>   core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
> 8531f533f3a6431f4f9fc8cb1ad3e9f1f1b110e0 
> 
> Diff: https://reviews.apache.org/r/28240/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>



[jira] [Work started] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2014-11-21 Thread Dmitry Pekar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-1792 started by Dmitry Pekar.
---
> change behavior of --generate to produce assignment config with fair replica 
> distribution and minimal number of reassignments
> -
>
> Key: KAFKA-1792
> URL: https://issues.apache.org/jira/browse/KAFKA-1792
> Project: Kafka
>  Issue Type: Sub-task
>  Components: tools
>Reporter: Dmitry Pekar
>Assignee: Dmitry Pekar
> Fix For: 0.8.3
>
>
> The current implementation produces a fair replica distribution between the 
> specified list of brokers. Unfortunately, it doesn't take into account the 
> current replica assignment.
> So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth 
> broker id=3, --generate will create an assignment config which will 
> redistribute replicas fairly across brokers [0..3], in the same way as if 
> those partitions were created from scratch. It will not take the current 
> replica assignment into consideration and accordingly will not try to 
> minimize the number of replica moves between brokers.
> As proposed by [~charmalloc] this should be improved. The new output of the 
> improved --generate algorithm should satisfy the following requirements:
> - fairness of replica distribution - every broker will have R or R+1 replicas 
> assigned;
> - minimum of reassignments - number of replica moves between brokers will be 
> minimal;
> Example.
> Consider following replica distribution per brokers [0..3] (we just added 
> brokers 2 and 3):
> - broker - 0, 1, 2, 3 
> - replicas - 7, 6, 0, 0
> The new algorithm will produce following assignment:
> - broker - 0, 1, 2, 3 
> - replicas - 4, 3, 3, 3
> - moves - -3, -3, +3, +3
> It will be fair, and the number of moves will be 6, which is minimal for the 
> specified initial distribution.
> The scope of this issue is:
> - design an algorithm matching the above requirements;
> - implement this algorithm and unit tests;
> - test it manually using different initial assignments;
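
The fairness target and move count described above can be written down as a 
toy model over per-broker replica counts (illustrative, not the actual tool):

    // Distribute the total replica count so every broker ends up with R or
    // R+1 replicas, handing the +1s to the brokers that already hold the most
    // replicas, which keeps the number of moves minimal.
    def fairTargets(current: Seq[Int]): Seq[Int] = {
      val total = current.sum
      val base = total / current.size
      val extra = total % current.size
      val byLoadDesc = current.zipWithIndex.sortBy(-_._1).map(_._2)
      val targets = Array.fill(current.size)(base)
      byLoadDesc.take(extra).foreach(i => targets(i) += 1)
      targets.toSeq
    }

    // Each move takes one replica off an over-target broker, so the move
    // count is the total surplus over the targets.
    def movesNeeded(current: Seq[Int], targets: Seq[Int]): Int =
      current.zip(targets).map { case (c, t) => math.max(c - t, 0) }.sum

    // Ticket's example: fairTargets(Seq(7, 6, 0, 0)) == Seq(4, 3, 3, 3)
    //                   movesNeeded(Seq(7, 6, 0, 0), Seq(4, 3, 3, 3)) == 6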



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2014-11-21 Thread Dmitry Pekar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Pekar reassigned KAFKA-1792:
---

Assignee: Dmitry Pekar

> change behavior of --generate to produce assignment config with fair replica 
> distribution and minimal number of reassignments
> -
>
> Key: KAFKA-1792
> URL: https://issues.apache.org/jira/browse/KAFKA-1792
> Project: Kafka
>  Issue Type: Sub-task
>  Components: tools
>Reporter: Dmitry Pekar
>Assignee: Dmitry Pekar
> Fix For: 0.8.3
>
>
> The current implementation produces a fair replica distribution across the 
> specified list of brokers. Unfortunately, it doesn't take the current replica 
> assignment into account.
> So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth 
> broker id=3, --generate will create an assignment config which redistributes 
> replicas fairly across brokers [0..3] in the same way as if those partitions 
> had been created from scratch. It will not take the current replica assignment 
> into consideration and accordingly will not try to minimize the number of 
> replica moves between brokers.
> As proposed by [~charmalloc] this should be improved. The output of the 
> improved --generate algorithm should satisfy the following requirements:
> - fairness of replica distribution - every broker will have R or R+1 replicas 
> assigned;
> - minimum of reassignments - the number of replica moves between brokers will 
> be minimal;
> Example.
> Consider the following replica distribution across brokers [0..3] (we just 
> added brokers 2 and 3):
> - broker - 0, 1, 2, 3
> - replicas - 7, 6, 0, 0
> The new algorithm will produce the following assignment:
> - broker - 0, 1, 2, 3
> - replicas - 4, 3, 3, 3
> - moves - -3, -3, +3, +3
> It will be fair and the number of moves will be 6, which is minimal for the 
> specified initial distribution.
> The scope of this issue is:
> - design an algorithm matching the above requirements;
> - implement this algorithm and unit tests;
> - test it manually using different initial assignments;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2014-11-21 Thread Dmitry Pekar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Pekar updated KAFKA-1792:

Description: 
The current implementation produces a fair replica distribution across the 
specified list of brokers. Unfortunately, it doesn't take the current replica 
assignment into account.

So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth 
broker id=3, --generate will create an assignment config which redistributes 
replicas fairly across brokers [0..3] in the same way as if those partitions 
had been created from scratch. It will not take the current replica assignment 
into consideration and accordingly will not try to minimize the number of 
replica moves between brokers.

As proposed by [~charmalloc] this should be improved. The output of the 
improved --generate algorithm should satisfy the following requirements:
- fairness of replica distribution - every broker will have R or R+1 replicas 
assigned;
- minimum of reassignments - the number of replica moves between brokers will 
be minimal;

Example.
Consider the following replica distribution across brokers [0..3] (we just 
added brokers 2 and 3):
- broker - 0, 1, 2, 3
- replicas - 7, 6, 0, 0

The new algorithm will produce the following assignment:
- broker - 0, 1, 2, 3
- replicas - 4, 3, 3, 3
- moves - -3, -3, +3, +3

It will be fair and the number of moves will be 6, which is minimal for the 
specified initial distribution.

The scope of this issue is:
- design an algorithm matching the above requirements;
- implement this algorithm and unit tests;
- test it manually using different initial assignments;

  was:
The current implementation produces a fair replica distribution across the 
specified list of brokers. Unfortunately, it doesn't take the current replica 
assignment into account.
So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth 
broker id=3, --generate will create an assignment config which redistributes 
replicas fairly across brokers [0..3] in the same way as if those partitions 
had been created from scratch.
It will not take the current replica assignment into consideration and 
accordingly will not try to minimize the number of replica moves between 
brokers.
As proposed by Joe Stein this should be improved. The output of the improved 
--generate algorithm should satisfy the following requirements:
fairness of replica distribution - every broker will have R or R+1 replicas 
assigned;
minimum of reassignments - the number of replica moves between brokers will be 
minimal;
Example.
Consider the following replica distribution across brokers [0..3] (we just 
added brokers 2 and 3):
broker - 0, 1, 2, 3
replicas - 7, 6, 0, 0
The new algorithm will produce the following assignment:
broker - 0, 1, 2, 3
replicas - 4, 3, 3, 3
moves - -3, -3, +3, +3
It will be fair and the number of moves will be 6, which is minimal for the 
specified initial distribution.
The scope of this issue is:
design an algorithm matching the above requirements;
implement this algorithm and unit tests;
test it manually using different initial assignments;


> change behavior of --generate to produce assignment config with fair replica 
> distribution and minimal number of reassignments
> -
>
> Key: KAFKA-1792
> URL: https://issues.apache.org/jira/browse/KAFKA-1792
> Project: Kafka
>  Issue Type: Sub-task
>  Components: tools
>Reporter: Dmitry Pekar
> Fix For: 0.8.3
>
>
> The current implementation produces a fair replica distribution across the 
> specified list of brokers. Unfortunately, it doesn't take the current replica 
> assignment into account.
> So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth 
> broker id=3, --generate will create an assignment config which redistributes 
> replicas fairly across brokers [0..3] in the same way as if those partitions 
> had been created from scratch. It will not take the current replica assignment 
> into consideration and accordingly will not try to minimize the number of 
> replica moves between brokers.
> As proposed by [~charmalloc] this should be improved. The output of the 
> improved --generate algorithm should satisfy the following requirements:
> - fairness of replica distribution - every broker will have R or R+1 replicas 
> assigned;
> - minimum of reassignments - the number of replica moves between brokers will 
> be minimal;
> Example.
> Consider the following replica distribution across brokers [0..3] (we just 
> added brokers 2 and 3):
> - broker - 0, 1, 2, 3
> - replicas - 7, 6, 0, 0
> The new algorithm will produce the following assignment:
> - broker - 0, 1, 2, 3
> - replicas - 4, 3, 3, 3
> - moves - -3, -3, +3, +3
> It will be fair and the number of moves will be 6, which is minimal for the 
> specified initial distribution.
> The scope of this issue is:
> - design an algorithm matching the above requirements;
> - implement this algorithm and unit tests;
> - test it manually using different initial assignments;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Work stopped] (KAFKA-1752) add --replace-broker option

2014-11-21 Thread Dmitry Pekar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-1752 stopped by Dmitry Pekar.
---
> add --replace-broker option
> ---
>
> Key: KAFKA-1752
> URL: https://issues.apache.org/jira/browse/KAFKA-1752
> Project: Kafka
>  Issue Type: Sub-task
>  Components: tools
>Reporter: Dmitry Pekar
>Assignee: Dmitry Pekar
> Fix For: 0.8.3
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-1752) add --replace-broker option

2014-11-21 Thread Dmitry Pekar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Pekar resolved KAFKA-1752.
-
Resolution: Won't Fix

> add --replace-broker option
> ---
>
> Key: KAFKA-1752
> URL: https://issues.apache.org/jira/browse/KAFKA-1752
> Project: Kafka
>  Issue Type: Sub-task
>  Components: tools
>Reporter: Dmitry Pekar
>Assignee: Dmitry Pekar
> Fix For: 0.8.3
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1752) add --replace-broker option

2014-11-21 Thread Dmitry Pekar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220877#comment-14220877
 ] 

Dmitry Pekar commented on KAFKA-1752:
-

As discussed, this functionality is not needed anymore; KAFKA-1792 should be 
implemented instead. Closing this as not required.

> add --replace-broker option
> ---
>
> Key: KAFKA-1752
> URL: https://issues.apache.org/jira/browse/KAFKA-1752
> Project: Kafka
>  Issue Type: Sub-task
>  Components: tools
>Reporter: Dmitry Pekar
>Assignee: Dmitry Pekar
> Fix For: 0.8.3
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (KAFKA-1752) add --replace-broker option

2014-11-21 Thread Dmitry Pekar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Pekar updated KAFKA-1752:

Comment: was deleted

(was: Changed title and description according to new requirements.)

> add --replace-broker option
> ---
>
> Key: KAFKA-1752
> URL: https://issues.apache.org/jira/browse/KAFKA-1752
> Project: Kafka
>  Issue Type: Sub-task
>  Components: tools
>Reporter: Dmitry Pekar
>Assignee: Dmitry Pekar
> Fix For: 0.8.3
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1752) add --replace-broker option

2014-11-21 Thread Dmitry Pekar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Pekar updated KAFKA-1752:

Description: (was: The current implementation produces a fair replica 
distribution across the specified list of brokers. Unfortunately, it doesn't 
take the current replica assignment into account.

So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth 
broker id=3, --generate will create an assignment config which redistributes 
replicas fairly across brokers [0..3] in the same way as if those partitions 
had been created from scratch.

It will not take the current replica assignment into consideration and 
accordingly will not try to minimize the number of replica moves between 
brokers.

As proposed by [~charmalloc] this should be improved. The output of the 
improved --generate algorithm should satisfy the following requirements:
- fairness of replica distribution - every broker will have R or R+1 replicas 
assigned;
- minimum of reassignments - the number of replica moves between brokers will 
be minimal;

Example.
Consider the following replica distribution across brokers [0..3] (we just 
added brokers 2 and 3):
broker   - 0, 1, 2, 3
replicas - 7, 6, 0, 0

The new algorithm will produce the following assignment:
broker   -  0,  1,  2,  3
replicas -  4,  3,  3,  3
moves    - -3, -3, +3, +3

It will be fair and the number of moves will be 6, which is minimal for the 
specified initial distribution.

The scope of this issue is:
- design an algorithm matching the above requirements;
- implement this algorithm and unit tests;
- test it manually using different initial assignments;)
Summary: add --replace-broker option  (was: change behavior of 
--generate to produce assignment config with fair replica distribution and 
minimal number of reassignments)

> add --replace-broker option
> ---
>
> Key: KAFKA-1752
> URL: https://issues.apache.org/jira/browse/KAFKA-1752
> Project: Kafka
>  Issue Type: Sub-task
>  Components: tools
>Reporter: Dmitry Pekar
>Assignee: Dmitry Pekar
> Fix For: 0.8.3
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2014-11-21 Thread Dmitry Pekar (JIRA)
Dmitry Pekar created KAFKA-1792:
---

 Summary: change behavior of --generate to produce assignment 
config with fair replica distribution and minimal number of reassignments
 Key: KAFKA-1792
 URL: https://issues.apache.org/jira/browse/KAFKA-1792
 Project: Kafka
  Issue Type: Sub-task
Reporter: Dmitry Pekar


The current implementation produces a fair replica distribution across the 
specified list of brokers. Unfortunately, it doesn't take the current replica 
assignment into account.
So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth 
broker id=3, --generate will create an assignment config which redistributes 
replicas fairly across brokers [0..3] in the same way as if those partitions 
had been created from scratch.
It will not take the current replica assignment into consideration and 
accordingly will not try to minimize the number of replica moves between 
brokers.
As proposed by Joe Stein this should be improved. The output of the improved 
--generate algorithm should satisfy the following requirements:
fairness of replica distribution - every broker will have R or R+1 replicas 
assigned;
minimum of reassignments - the number of replica moves between brokers will be 
minimal;
Example.
Consider the following replica distribution across brokers [0..3] (we just 
added brokers 2 and 3):
broker - 0, 1, 2, 3
replicas - 7, 6, 0, 0
The new algorithm will produce the following assignment:
broker - 0, 1, 2, 3
replicas - 4, 3, 3, 3
moves - -3, -3, +3, +3
It will be fair and the number of moves will be 6, which is minimal for the 
specified initial distribution.
The scope of this issue is:
design an algorithm matching the above requirements;
implement this algorithm and unit tests;
test it manually using different initial assignments (a sketch of the 
partition-level move selection follows below);
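As a companion to the target-count sketch earlier in this thread, here is an 
illustrative partition-level sketch of the "minimal reassignments" step. It is 
hypothetical code, not from the Kafka codebase (GreedyMoveSketch and its types 
are invented for illustration): given a partition -> replica-broker assignment 
and the fair per-broker targets, it moves replicas one at a time from 
overloaded to underloaded brokers. Each move reduces the total surplus by one, 
so the number of moves stays minimal.

object GreedyMoveSketch {
  type Partition = String

  def rebalance(assignment: Map[Partition, Seq[Int]],
                target: Map[Int, Int]): Map[Partition, Seq[Int]] = {
    // Current replica count per broker for a given assignment.
    def counts(a: Map[Partition, Seq[Int]]): Map[Int, Int] =
      target.keys.map(b => b -> a.values.count(_.contains(b))).toMap

    @annotation.tailrec
    def step(a: Map[Partition, Seq[Int]]): Map[Partition, Seq[Int]] = {
      val c     = counts(a)
      val over  = c.collect { case (b, n) if n > target(b) => b }.toSeq
      val under = c.collect { case (b, n) if n < target(b) => b }.toSeq
      // A candidate move: a replica sitting on an overloaded broker whose
      // partition is not yet hosted by some underloaded broker.
      val candidates = for {
        donor         <- over
        (p, replicas) <- a.toSeq if replicas.contains(donor)
        recv          <- under.find(b => !replicas.contains(b))
      } yield (p, donor, recv)

      candidates.headOption match {
        case Some((p, donor, recv)) =>
          step(a.updated(p, a(p).map(b => if (b == donor) recv else b)))
        case None => a   // no overloaded broker left: distribution is fair
      }
    }
    step(assignment)
  }
}

Because a replica is only ever moved to a broker that does not already host 
its partition, the replication factor of every partition is preserved, and 
every replica not involved in a move stays exactly where it was - which is the 
behaviour the current --generate lacks.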



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)