[
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gianmarco De Francisci Morales updated KAFKA-2092:
--
Attachment: KAFKA-2092-v3.patch
Updated formatting to pass
AM, Gianmarco De Francisci Morales
g...@apache.org wrote:
Jason,
Thanks for starting the discussion and for your very concise (and correct)
summary.
Ewen, while what you say is true, those kinds of datasets (large number of
keys with skew) are very typical in the Web (think Twitter
in the context of KIP-28 which would provide some higher-level processing
capabilities (though it doesn't seem like the KStream abstraction would
provide a direct way to leverage this partitioner without custom logic).
Thanks,
Jason
On Wed, Jul 22, 2015 at 12:14 AM, Gianmarco De
[
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642427#comment-14642427
]
Gianmarco De Francisci Morales commented on KAFKA-2092:
---
[~hachikuji
Hello folks,
I'd like to ask the community about its opinion on the partitioning
functions in Kafka.
With KAFKA-2091 https://issues.apache.org/jira/browse/KAFKA-2091
integrated we are now able to have custom partitioners in the producer.
The question now becomes *which* partitioners should ship
[
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634927#comment-14634927
]
Gianmarco De Francisci Morales commented on KAFKA-2092:
---
[hachikuji
[
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gianmarco De Francisci Morales updated KAFKA-2092:
--
Attachment: KAFKA-2092-v2.patch
Added explanation and example
---
Thanks,
Gianmarco De Francisci Morales
[
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613028#comment-14613028
]
Gianmarco De Francisci Morales commented on KAFKA-2092:
---
[~hachikuji
[
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609701#comment-14609701
]
Gianmarco De Francisci Morales commented on KAFKA-2092
[
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14595510#comment-14595510
]
Gianmarco De Francisci Morales commented on KAFKA-2092:
---
Any more
[
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14589634#comment-14589634
]
Gianmarco De Francisci Morales edited comment on KAFKA-2092 at 6/22/15 8:42 AM
[
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14589634#comment-14589634
]
Gianmarco De Francisci Morales commented on KAFKA-2092:
---
Thanks
/internals/PKGPartitioner.java
PRE-CREATION
clients/src/main/java/org/apache/kafka/common/utils/Utils.java f73eedb
Diff: https://reviews.apache.org/r/35524/diff/
Testing
---
Thanks,
Gianmarco De Francisci Morales
[
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gianmarco De Francisci Morales updated KAFKA-2092:
--
Attachment: KAFKA-2092-v1.patch
New partitioning for better
[
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gianmarco De Francisci Morales updated KAFKA-2092:
--
Status: Patch Available (was: Open)
First attempt at a patch
, at 02:15 AM, Gianmarco De Francisci Morales
wrote:
Hi,
Here are the questions I think we should consider:
1. Do we need this at all given that we have the partition argument in
ProducerRecord which gives full control? I think we do need it because
AM, Gianmarco De Francisci Morales wrote:
Hi,
Here are the questions I think we should consider:
1. Do we need this at all given that we have the partition argument in
ProducerRecord which gives full control? I think we do need it because this
is a way to plug in a different
Hi,
Here are the questions I think we should consider:
1. Do we need this at all given that we have the partition argument in
ProducerRecord which gives full control? I think we do need it because this
is a way to plug in a different partitioning strategy at run time and do it
in a fairly
[
https://issues.apache.org/jira/browse/KAFKA-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508649#comment-14508649
]
Gianmarco De Francisci Morales commented on KAFKA-2091:
---
Hi,
I think
[
https://issues.apache.org/jira/browse/KAFKA-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484926#comment-14484926
]
Gianmarco De Francisci Morales commented on KAFKA-2091:
---
Looks good
framework?
Guozhang
On Sun, Apr 5, 2015 at 12:19 AM, Gianmarco De Francisci Morales
g...@apache.org wrote:
Hi Jay,
Thanks, that sounds like a necessary step. I guess I expected something like
that to be already there, at least internally.
I created KAFKA-2092 to track the PKG integration
Gianmarco De Francisci Morales created KAFKA-2092:
-
Summary: New partitioning for better load balancing
Key: KAFKA-2092
URL: https://issues.apache.org/jira/browse/KAFKA-2092
Project
,
I come from the Storm community. I think PKG is very
interesting and we can provide an implementation of Partitioner for PKG.
Can you open a JIRA for this?
--
Harsha
On April 3, 2015 at 4:49:15 AM, Gianmarco De Francisci Morales (
g...@apache.org) wrote
[
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gianmarco De Francisci Morales updated KAFKA-2092:
--
Description:
We have recently studied the problem of load
Hi,
We have recently studied the problem of load balancing in distributed
stream processing systems such as Samza [1].
In particular, we focused on what happens when the key distribution of the
stream is skewed when using key grouping.
We developed a new stream partitioning scheme (which we call