xinyuiscool commented on a change in pull request #950: SAMZA-2126: Bug fixes
for batch-mode generated stream specs
URL: https://github.com/apache/samza/pull/950#discussion_r264769050
##########
File path: samza-core/src/main/java/org/apache/samza/execution/StreamEdge.java
##########
@@ -75,7 +75,9 @@ StreamSpec getStreamSpec() {
StreamSpec spec = (partitions == PARTITIONS_UNKNOWN) ?
streamSpec : streamSpec.copyWithPartitionCount(partitions);
- if (isIntermediate) {
+ // Append unique id to the batch intermediate streams
+ // Check the physical stream name is already generated first
+ if (isIntermediate && spec.getId().equals(spec.getPhysicalName())) {
Review comment:
Good question. Usually the id and the physical name are the same for
intermediate streams, since they are generated. If the user overrides the
physical name for some reason, then we won't append the unique id, in either
the stream or the batch case.
This check is not very obvious. I couldn't find a direct way to tell whether
the physical name has already been generated, because of the double planning
problem in Samza, so comparing the id with the physical name stands in for
that. If we only invoked the planner once during submission, we wouldn't need
this logic anymore :(.
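
To make the effect of the check concrete, here is a minimal standalone sketch.
It does not use the real Samza classes: `Spec`, `resolve`, `runId`, and the
"-" separator are hypothetical stand-ins chosen for illustration, and the
actual naming scheme for batch intermediate streams may differ.

```java
public class StreamNameSketch {

    // Hypothetical stand-in for StreamSpec; only the id and physical name are modeled.
    static final class Spec {
        final String id;
        final String physicalName;

        Spec(String id, String physicalName) {
            this.id = id;
            this.physicalName = physicalName;
        }
    }

    // Appends runId only while the physical name still equals the id.
    // After the first append the two differ, so a second planning pass
    // leaves the name alone. A user-overridden name is skipped as well.
    // The "-" separator is an assumption, not Samza's actual format.
    static Spec resolve(Spec spec, boolean isIntermediate, String runId) {
        if (isIntermediate && spec.id.equals(spec.physicalName)) {
            return new Spec(spec.id, spec.physicalName + "-" + runId);
        }
        return spec;
    }

    public static void main(String[] args) {
        Spec generated = new Spec("partition_by-p1", "partition_by-p1");
        Spec firstPass = resolve(generated, true, "run42");
        Spec secondPass = resolve(firstPass, true, "run42");
        System.out.println(firstPass.physicalName);
        System.out.println(secondPass.physicalName);

        Spec overridden = new Spec("partition_by-p1", "my-custom-topic");
        System.out.println(resolve(overridden, true, "run42").physicalName);
    }
}
```

In this sketch the second pass returns the spec unchanged, which is the
behavior the equality check is meant to give us under double planning.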