[jira] [Commented] (KAFKA-3432) Cluster.update() thread-safety

2016-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209148#comment-15209148
 ] 

ASF GitHub Bot commented on KAFKA-3432:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1118


> Cluster.update() thread-safety
> --
>
> Key: KAFKA-3432
> URL: https://issues.apache.org/jira/browse/KAFKA-3432
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> A `Cluster.update()` method was introduced during the development of 0.10.0 
> so that `StreamPartitionAssignor` can add internal topics on-the-fly and give 
> the augmented metadata to its underlying grouper.
> `Cluster` was supposed to be immutable after construction and all 
> synchronization happens via the `Metadata` instance. As far as I can see 
> `Cluster.update()` is not thread-safe even though `Cluster` is accessed by 
> multiple threads in some cases (I am not sure about the Streams case). Since 
> this is a public API, it is important to fix this in my opinion.
> A few options I can think of:
> * Since `PartitionAssignor` is an internal class, change 
> `PartitionAssignor.assign` to return a class containing the assignments and 
> optionally an updated cluster. This is straightforward, but I am not sure if 
> it's good enough for the Streams use-case. Can you please confirm [~guozhang]?
> * Pass `Metadata` instead of `Cluster` to `PartitionAssignor.assign`, giving 
> assignors the ability to update the metadata as needed.
> * Make `Cluster` thread-safe in the face of mutations (without relying on 
> synchronization at the `Metadata` level). This is not ideal, KAFKA-3428 shows 
> that the synchronization at `Metadata` level is already too costly for high 
> concurrency situations.
> Thoughts [~guozhang], [~hachikuji]?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3432) Cluster.update() thread-safety

2016-03-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15207715#comment-15207715
 ] 

ASF GitHub Bot commented on KAFKA-3432:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/1118

KAFKA-3432; Cluster.update() thread-safety

Replace `update` with `withPartitions`, which returns a
copy instead of mutating the instance.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-3432-cluster-update-thread-safety

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1118.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1118


commit c76566f4cfeb25a757c223f63e13d81e33806d68
Author: Ismael Juma 
Date:   2016-03-23T01:47:30Z

KAFKA-3432; Cluster.update() thread-safety

Replace `update` with `withPartitions` that returns a
copy instead of mutating the instance.




> Cluster.update() thread-safety
> --
>
> Key: KAFKA-3432
> URL: https://issues.apache.org/jira/browse/KAFKA-3432
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> A `Cluster.update()` method was introduced during the development of 0.10.0 
> so that `StreamPartitionAssignor` can add internal topics on-the-fly and give 
> the augmented metadata to its underlying grouper.
> `Cluster` was supposed to be immutable after construction and all 
> synchronization happens via the `Metadata` instance. As far as I can see 
> `Cluster.update()` is not thread-safe even though `Cluster` is accessed by 
> multiple threads in some cases (I am not sure about the Streams case). Since 
> this is a public API, it is important to fix this in my opinion.
> A few options I can think of:
> * Since `PartitionAssignor` is an internal class, change 
> `PartitionAssignor.assign` to return a class containing the assignments and 
> optionally an updated cluster. This is straightforward, but I am not sure if 
> it's good enough for the Streams use-case. Can you please confirm [~guozhang]?
> * Pass `Metadata` instead of `Cluster` to `PartitionAssignor.assign`, giving 
> assignors the ability to update the metadata as needed.
> * Make `Cluster` thread-safe in the face of mutations (without relying on 
> synchronization at the `Metadata` level). This is not ideal, KAFKA-3428 shows 
> that the synchronization at `Metadata` level is already too costly for high 
> concurrency situations.
> Thoughts [~guozhang], [~hachikuji]?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3432) Cluster.update() thread-safety

2016-03-22 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15207622#comment-15207622
 ] 

Ismael Juma commented on KAFKA-3432:


That sounds better [~guozhang], I'll give it a try.

> Cluster.update() thread-safety
> --
>
> Key: KAFKA-3432
> URL: https://issues.apache.org/jira/browse/KAFKA-3432
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> A `Cluster.update()` method was introduced during the development of 0.10.0 
> so that `StreamPartitionAssignor` can add internal topics on-the-fly and give 
> the augmented metadata to its underlying grouper.
> `Cluster` was supposed to be immutable after construction and all 
> synchronization happens via the `Metadata` instance. As far as I can see 
> `Cluster.update()` is not thread-safe even though `Cluster` is accessed by 
> multiple threads in some cases (I am not sure about the Streams case). Since 
> this is a public API, it is important to fix this in my opinion.
> A few options I can think of:
> * Since `PartitionAssignor` is an internal class, change 
> `PartitionAssignor.assign` to return a class containing the assignments and 
> optionally an updated cluster. This is straightforward, but I am not sure if 
> it's good enough for the Streams use-case. Can you please confirm [~guozhang]?
> * Pass `Metadata` instead of `Cluster` to `PartitionAssignor.assign`, giving 
> assignors the ability to update the metadata as needed.
> * Make `Cluster` thread-safe in the face of mutations (without relying on 
> synchronization at the `Metadata` level). This is not ideal, KAFKA-3428 shows 
> that the synchronization at `Metadata` level is already too costly for high 
> concurrency situations.
> Thoughts [~guozhang], [~hachikuji]?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3432) Cluster.update() thread-safety

2016-03-22 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206721#comment-15206721
 ] 

Guozhang Wang commented on KAFKA-3432:
--

I think an additional option is to change `update` to return a new `Cluster` 
object with the augmented partitions, and in `StreamPartitionAssignor` only 
pass this augmented `Cluster` object to the underlying grouper. Implementation 
wise it would require a `clone` function along with additions of the partition 
info. Thoughts [~ijuma][~hachikuji]?

> Cluster.update() thread-safety
> --
>
> Key: KAFKA-3432
> URL: https://issues.apache.org/jira/browse/KAFKA-3432
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> A `Cluster.update()` method was introduced during the development of 0.10.0 
> so that `StreamPartitionAssignor` can add internal topics on-the-fly and give 
> the augmented metadata to its underlying grouper.
> `Cluster` was supposed to be immutable after construction and all 
> synchronization happens via the `Metadata` instance. As far as I can see 
> `Cluster.update()` is not thread-safe even though `Cluster` is accessed by 
> multiple threads in some cases (I am not sure about the Streams case). Since 
> this is a public API, it is important to fix this in my opinion.
> A few options I can think of:
> * Since `PartitionAssignor` is an internal class, change 
> `PartitionAssignor.assign` to return a class containing the assignments and 
> optionally an updated cluster. This is straightforward, but I am not sure if 
> it's good enough for the Streams use-case. Can you please confirm [~guozhang]?
> * Pass `Metadata` instead of `Cluster` to `PartitionAssignor.assign`, giving 
> assignors the ability to update the metadata as needed.
> * Make `Cluster` thread-safe in the face of mutations (without relying on 
> synchronization at the `Metadata` level). This is not ideal, KAFKA-3428 shows 
> that the synchronization at `Metadata` level is already too costly for high 
> concurrency situations.
> Thoughts [~guozhang], [~hachikuji]?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)