[ https://issues.apache.org/jira/browse/BEAM-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aviem Zur updated BEAM-1052:
----------------------------
    Description:
We use a "running-id" to identify source splits, but we reiterate for each source evaluated.
Spark already assigns a unique id per InputDStream, it would be unique enough if we replace {{MicrobatchSource}} hash code with one containing both the running-id and the InputDStream id.

  was:
We use a "running-id" to identify source splits, but we reiterate for each source evaluated.
Spark already assigns a unique id per InputDStream, it would be unique enough if we replace MicrobatchSource hash code with one containing both the running-id and the InputDStream id.

> UnboundedSource splitId uniqueness breaks if more than one source is used.
> --------------------------------------------------------------------------
>
>                 Key: BEAM-1052
>                 URL: https://issues.apache.org/jira/browse/BEAM-1052
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Aviem Zur
>
> We use a "running-id" to identify source splits, but we reiterate for each
> source evaluated.
> Spark already assigns a unique id per InputDStream, it would be unique enough
> if we replace {{MicrobatchSource}} hash code with one containing both the
> running-id and the InputDStream id.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
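
For illustration only, the collision described in the issue and the proposed fix might look roughly like the sketch below. It is not the Spark runner's code: `SplitKey`, `sourceId`, `splitId`, and `distinctKeys` are hypothetical names, with `sourceId` standing in for the id Spark assigns per InputDStream and `splitId` for the "running-id" that restarts for each source.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class SplitIdSketch {

    /** Hypothetical stand-in for a source split key; NOT Beam's MicrobatchSource. */
    static final class SplitKey {
        private final int sourceId; // assumed: the id Spark assigns per InputDStream
        private final int splitId;  // the "running-id", restarting at 0 for each source

        SplitKey(int sourceId, int splitId) {
            this.sourceId = sourceId;
            this.splitId = splitId;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof SplitKey)) {
                return false;
            }
            SplitKey other = (SplitKey) o;
            return sourceId == other.sourceId && splitId == other.splitId;
        }

        @Override
        public int hashCode() {
            // Combine both ids so splits of different sources do not collide.
            return Objects.hash(sourceId, splitId);
        }
    }

    /**
     * Counts distinct keys produced for the given number of sources and splits.
     * With includeSourceId == false every source reuses the same ids, which
     * mimics the bug; with true, the source id disambiguates them.
     */
    static int distinctKeys(int sources, int splitsPerSource, boolean includeSourceId) {
        Set<SplitKey> keys = new HashSet<>();
        for (int s = 0; s < sources; s++) {
            for (int i = 0; i < splitsPerSource; i++) {
                keys.add(new SplitKey(includeSourceId ? s : 0, i));
            }
        }
        return keys.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctKeys(2, 3, false)); // running-id only: 3
        System.out.println(distinctKeys(2, 3, true));  // running-id + source id: 6
    }
}
```

With two sources of three splits each, keying on the running-id alone yields only three distinct keys, while including the per-source id yields six.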