[ 
https://issues.apache.org/jira/browse/SAMZA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137909#comment-14137909
 ] 

Jon Bringhurst commented on SAMZA-348:
--------------------------------------

Regarding the client commands to create and modify Samza jobs (such as 
configure-job and run-job), it may be useful to review existing commands that 
perform a similar role:

My personal favorite is Slurm's set of client commands, of which sbatch is 
probably the most relevant (http://www.schedmd.com/slurmdocs/sbatch.html).

To go back a bit further in history, it might be a good idea to take a look at 
the POSIX qsub-style command from PBS/Torque 
(http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qsub.htm).
Moab's msub also follows this design.

Regarding a possible DSL for building configuration, it may be useful to look 
at Slurm's Lua callback for job configuration (*warning, GPLv2 code*): 
https://github.com/SchedMD/slurm/blob/master/contribs/lua/job_submit.lua

> Configure Samza jobs through a stream
> -------------------------------------
>
>                 Key: SAMZA-348
>                 URL: https://issues.apache.org/jira/browse/SAMZA-348
>             Project: Samza
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Chris Riccomini
>              Labels: project
>         Attachments: DESIGN-SAMZA-348-0.md, DESIGN-SAMZA-348-0.pdf
>
>
> Samza's existing config setup is problematic for a number of reasons:
> # It's completely immutable once a job starts. This prevents any dynamic 
> reconfiguration and auto-scaling. It is debatable whether we want these 
> features or not, but our existing implementation actively prevents them. See 
> SAMZA-334 for discussion.
> # We pass existing configuration through environment variables. YARN exports 
> environment variables in a shell script, which limits the size to the varargs 
> length on the machine. This is usually ~128KB. See SAMZA-333 and SAMZA-337 
> for details.
> # User-defined configuration (the Config object) and programmatic 
> configuration (checkpoints and TaskName:State mappings (see SAMZA-123)) are 
> handled differently. It's debatable whether this makes sense.
> In SAMZA-123, [~jghoman] and I propose implementing a ConfigLog. This log 
> would replace both the checkpoint topic and the existing config environment 
> variables in SamzaContainer and Samza's YARN AM.
> I'd like to keep this ticket's scope limited to just the implementation of 
> the ConfigLog, and not re-designing how Samza's config is used in the code 
> (SAMZA-40). We should, however, discuss how this feature would affect dynamic 
> reconfiguration/auto-scaling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
