[jira] [Updated] (SAMZA-1126) Semantics of ProcessorId in Samza

Navina Ramesh (JIRA) Wed, 08 Mar 2017 00:19:52 -0800

     [ https://issues.apache.org/jira/browse/SAMZA-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Navina Ramesh updated SAMZA-1126:
---------------------------------
    Description: 
Until now, we have used "processorId" as a synonym for the logical 
"containerId" assigned by Samza. 
It is easy for Samza to generate a unique set of containerIds per job because 
the number of containers is expected to stay fixed throughout the job's 
lifecycle. However, with the new ZooKeeper-based model, we allow the number of 
processors to change while the job is executing. In other words, we want to 
make a Samza job "elastic" in nature. 
The proposal in SAMZA-1084 expects the user to assign a unique processorId to 
each StreamProcessor associated with the job. This is tedious for the user, 
since the processors will be distributed across one or more machines and the 
user must coordinate among these machines to guarantee the uniqueness of 
processorId within a job. 
The goal of this JIRA is to understand and define the semantics of processorId 
and investigate a solution that does not impose this requirement on the user. 

  was:
Until now, we have used "processorId" as a synonym for the logical 
"containerId" assigned by Samza. 
It is easy for Samza to generate a unique set of containerIds per job because 
the number of containers is expected to stay fixed throughout the job's 
lifecycle. However, with the new ZooKeeper-based model, we allow the number of 
processors to change while the job is executing. In other words, we want to 
make a Samza job "elastic" in nature. 
The proposal in SAMZA-1084 expects the user to assign a unique processorId to 
each StreamProcessor associated with the job. This is tedious for the user, 
since the processors will be distributed across one or more machines and the 
user must coordinate among these machines to guarantee the uniqueness of 
processorId within a job. 
The goal of this JIRA is to understand and describe the semantics of 
processorId and investigate a solution that does not impose this requirement 
on the user. 


> Semantics of ProcessorId in Samza 
> ----------------------------------
>
>                 Key: SAMZA-1126
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1126
>             Project: Samza
>          Issue Type: Sub-task
>            Reporter: Navina Ramesh
>            Assignee: Navina Ramesh
>             Fix For: 0.13.0
>
>
> Until now, we have used "processorId" as a synonym for the logical 
> "containerId" assigned by Samza. 
> It is easy for Samza to generate a unique set of containerIds per job because 
> the number of containers is expected to stay fixed throughout the job's 
> lifecycle. However, with the new ZooKeeper-based model, we allow the number 
> of processors to change while the job is executing. In other words, we want 
> to make a Samza job "elastic" in nature. 
> The proposal in SAMZA-1084 expects the user to assign a unique processorId to 
> each StreamProcessor associated with the job. This is tedious for the user, 
> since the processors will be distributed across one or more machines and the 
> user must coordinate among these machines to guarantee the uniqueness of 
> processorId within a job. 
> The goal of this JIRA is to understand and define the semantics of 
> processorId and investigate a solution that does not impose this requirement 
> on the user. 
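
To make the coordination problem concrete, here is a minimal sketch of one way
a processor could derive its own id locally, with no cross-machine agreement:
combine a host name with a random UUID. This is only an illustration under
stated assumptions, not the approach chosen in this JIRA or in SAMZA-1084. The
class ProcessorIdSketch and its methods are hypothetical and are not part of
Samza's API. Uniqueness here is probabilistic (a random UUID collision is
extremely unlikely, not impossible), and the id changes on every restart;
whether that is acceptable is exactly the semantics question raised above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.UUID;

/** Hypothetical helper for illustration only; not part of Samza. */
public final class ProcessorIdSketch {
    private ProcessorIdSketch() {}

    /**
     * Builds an id of the form "<host>-<random UUID>".
     * Assumption: a per-start random suffix is acceptable, i.e. the id
     * does not need to survive a processor restart.
     */
    public static String generate(String host) {
        return host + "-" + UUID.randomUUID();
    }

    /** Returns the local host name, or a placeholder if it cannot be resolved. */
    public static String localHostName() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "unknown-host";
        }
    }

    public static void main(String[] args) {
        // Each processor would call this once at startup, independently.
        String processorId = generate(localHostName());
        System.out.println("processorId = " + processorId);
    }
}
```

Two processors started on the same machine still get different ids because
the UUID suffix differs, so the user never has to hand out ids manually.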



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
