[ https://issues.apache.org/jira/browse/SAMZA-662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Guozhang Wang updated SAMZA-662:
--------------------------------
    Attachment: SAMZA-662.v1.patch

> Samza auto-creates changelog stream without sufficient partitions when container number > 1
> -------------------------------------------------------------------------------------------
>
>                 Key: SAMZA-662
>                 URL: https://issues.apache.org/jira/browse/SAMZA-662
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.9.0
>            Reporter: Yan Fang
>            Assignee: Guozhang Wang
>         Attachments: SAMZA-662.v1.patch
>
>
> We currently provide auto-create for changelog streams. However, when there
> is more than one container, Samza can create a changelog stream with too few
> partitions.
> Reason:
> Assume we have an input stream with 3 partitions and we assign 3 containers
> to this job. According to the
> [JobCoordinator|https://github.com/apache/samza/blob/master/samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala],
> we will get:
> ||Container(Model) || InputStream Partition || Changelog Partition ||
> |0 | 0 | 0|
> |1 | 1 | 1|
> |2 | 2 | 2|
> If Container 0 is brought up first, it calls
> {code}
> val maxChangeLogStreamPartitions = containerModel.getTasks.values
>   .max(Ordering.by { task:TaskModel => task.getChangelogPartition.getPartitionId })
>   .getChangelogPartition.getPartitionId + 1
> {code}
> in
> [SamzaContainer|https://github.com/apache/samza/blob/master/samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala].
> Each container sees only its own tasks, so for Container 0
> maxChangeLogStreamPartitions is 1 and we auto-create a changelog stream with
> only 1 partition.
> Similarly, if Container 1 is brought up first, we will get a stream with
> only 2 partitions.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
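
The arithmetic in the report can be reproduced with a minimal sketch. The `Task` and `Container` case classes and the `ChangelogPartitionCount` object below are hypothetical stand-ins, not Samza's `TaskModel` or `ContainerModel`, and `fromJob` is only one possible way to size the stream; the attached patch may take a different approach.

```scala
// Hypothetical stand-ins for illustration only; these are not Samza classes.
final case class Task(name: String, changelogPartition: Int)
final case class Container(id: Int, tasks: Seq[Task])

object ChangelogPartitionCount {
  // Same calculation as the snippet quoted in the issue, restricted to one
  // container's tasks: highest changelog partition id + 1.
  // Throws on a container with no tasks, because max is undefined there.
  def fromContainer(c: Container): Int =
    c.tasks.map(_.changelogPartition).max + 1

  // Assumed alternative: use the tasks of every container in the job, so the
  // result does not depend on which container starts first.
  def fromJob(containers: Seq[Container]): Int =
    containers.flatMap(_.tasks).map(_.changelogPartition).max + 1

  // The 3-partition, 3-container layout from the table in the issue.
  val containers: Seq[Container] = Seq(
    Container(0, Seq(Task("Partition 0", 0))),
    Container(1, Seq(Task("Partition 1", 1))),
    Container(2, Seq(Task("Partition 2", 2)))
  )
}
```

With this layout, `fromContainer` gives 1 for Container 0 and 2 for Container 1, while `fromJob` gives 3 regardless of startup order.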