David Chen created SAMZA-416:
--------------------------------

             Summary: Samza Configuration DSL
                 Key: SAMZA-416
                 URL: https://issues.apache.org/jira/browse/SAMZA-416
             Project: Samza
          Issue Type: New Feature
            Reporter: David Chen


The user-facing language for Samza configurations is currently Java Properties. 
While this works for the time being and is simple to implement, it is verbose and 
cumbersome for the end user.

Since SAMZA-40 has been opened to refactor the way Samza configuration is wired 
and SAMZA-348 has been opened to allow Samza jobs to be configured through a 
stream, it is natural to consider a scripting model for configuring Samza jobs.

One approach would be to implement a DSL in a scripting language like Python. 
Implementing a Python DSL is straightforward, and there are many examples of 
Python DSLs in non-trivial use cases, most notably [Google's build system, 
Blaze|http://google-engtools.blogspot.com/2011/08/build-in-cloud-how-build-system-works.html]. 
I found [a GitHub Gist containing a more detailed example of a Blaze BUILD 
file|https://gist.github.com/wiseman/3834928]. Blaze also inspired a number of 
open source clones, most notably [Twitter Pants|http://pantsbuild.github.io/].
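
To make the idea concrete, here is a rough sketch of what a BUILD-file-style 
Python front end could look like. It is only an illustration: the names 
samza_job, flatten and to_properties are hypothetical, the task class and 
stream names are made up, and the property keys are just examples of the 
dotted keys a job uses today. It simply flattens nested Python values into 
the same flat key/value form that the Java Properties files hold.

```python
def flatten(prefix, value, out):
    """Flatten nested dicts/lists into dotted keys with string values."""
    if isinstance(value, dict):
        for key, child in value.items():
            flatten(f"{prefix}.{key}" if prefix else key, child, out)
    elif isinstance(value, bool):
        # Checked before other scalars so booleans use lowercase spelling.
        out[prefix] = "true" if value else "false"
    elif isinstance(value, (list, tuple)):
        out[prefix] = ",".join(str(item) for item in value)
    else:
        out[prefix] = str(value)


def samza_job(name, task_class, inputs, systems=None, **extra):
    """Hypothetical DSL entry point: returns a flat dict of job properties."""
    out = {}
    flatten("job", {"name": name}, out)
    flatten("task", {"class": task_class, "inputs": inputs}, out)
    flatten("systems", systems or {}, out)
    flatten("", extra, out)  # any other top-level sections, passed as dicts
    return out


def to_properties(config):
    """Render as key=value lines (no escaping of special characters here)."""
    return "\n".join(f"{key}={value}" for key, value in sorted(config.items()))


# Illustrative job definition; class and stream names are invented.
config = samza_job(
    name="page-view-counter",
    task_class="com.example.MyStreamTask",
    inputs=["kafka.page-views", "kafka.clicks"],
    systems={
        "kafka": {
            "samza.factory": "org.apache.samza.system.kafka.KafkaSystemFactory",
        },
    },
)
print(to_properties(config))
```

Because the job definition is ordinary Python, users would get loops, 
variables and helper functions for free, which is where most of the 
reduction in verbosity would come from.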

Another idea, by [~jonbringhurst] from SAMZA-348:

> Regarding the client commands to create and modify Samza jobs (such as 
> configure-job and run-job), it may be useful to review existing commands that 
> perform a similar role:
> My personal favorite is Slurm's set of client commands, of which sbatch is 
> probably the most relevant (http://www.schedmd.com/slurmdocs/sbatch.html).
> To go back a bit further in history, it might be a good idea to take a look at 
> the POSIX qsub style command from PBS/Torque 
> (http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qsub.htm). 
> Moab's msub also follows this design.
> Regarding a possible DSL for building configuration, it may be useful to look 
> at Slurm's Lua callback for job configuration (warning, GPLv2 code): 
> https://github.com/SchedMD/slurm/blob/master/contribs/lua/job_submit.lua

We could, of course, have both a command line tool and a DSL, and I am sure 
once Samza takes off, there will be people implementing DSLs for other language 
clients as well. The key would be to implement a standard interface for these 
different implementations to talk to.
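
As one possible reading of "standard interface", every front end could reduce 
its input to the same flat key/value pairs and hand them to a common 
serializer. The sketch below assumes exactly that; ConfigFrontend, 
CliFrontend, DictFrontend, to_messages, the --set flag and the JSON message 
shape are all hypothetical and not part of Samza.

```python
import argparse
import json
from abc import ABC, abstractmethod


class ConfigFrontend(ABC):
    """Hypothetical contract: a front end yields flat string key/value pairs."""

    @abstractmethod
    def build(self):
        """Return a dict mapping property keys to string values."""


class CliFrontend(ConfigFrontend):
    """Command-line front end using repeated --set KEY=VALUE flags (assumed)."""

    def __init__(self, argv):
        parser = argparse.ArgumentParser(prog="configure-job")
        parser.add_argument("--set", action="append", default=[],
                            metavar="KEY=VALUE")
        self._args = parser.parse_args(argv)

    def build(self):
        result = {}
        for item in self._args.set:
            key, _, value = item.partition("=")
            result[key] = value
        return result


class DictFrontend(ConfigFrontend):
    """Front end for a DSL that has already produced a Python mapping."""

    def __init__(self, settings):
        self._settings = dict(settings)

    def build(self):
        return {str(k): str(v) for k, v in self._settings.items()}


def to_messages(frontend):
    """Serialize each setting as one JSON message, a language-neutral form."""
    return [json.dumps({"key": key, "value": value})
            for key, value in sorted(frontend.build().items())]


cli = CliFrontend(["--set", "job.name=demo",
                   "--set", "task.class=com.example.MyStreamTask"])
for message in to_messages(cli):
    print(message)
```

With a contract like this, a DSL in another language would only need to emit 
the same messages, and nothing downstream would have to know which front end 
produced them.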



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
