[ 
https://issues.apache.org/jira/browse/SAMZA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138174#comment-14138174
 ] 

David Chen commented on SAMZA-416:
----------------------------------

bq. One thing to consider is how to make this language-agnostic. The 
ConfigStream API will be pluggable and could sit on top of multiple 
implementations (Kafka, HBase, etc). If we have a Python-based DSL, we'd need 
an easy way for Python to write to the ConfigStream.

Agreed. That would be the main challenge. I am still reading through SAMZA-348, 
but will the ConfigStream API live in the AM or on the client side? One 
possibility would be to expose a REST API, which would be more 
language-agnostic.
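
To make the REST idea concrete, here is a minimal client-side sketch in Python. The endpoint URL, the path layout, and the `build_config_request` helper are all assumptions for illustration; SAMZA-348 does not define a REST API.

```python
import json
import urllib.request

# Hypothetical endpoint layout; nothing like this is defined in SAMZA-348.
CONFIG_ENDPOINT = "http://localhost:8080/jobs/{job}/config"


def build_config_request(job_name, config):
    """Build (but do not send) a PUT request carrying the config as JSON."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        CONFIG_ENDPOINT.format(job=job_name),
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_config_request("my-job", {"task.class": "com.example.MyStreamTask"})
# urllib.request.urlopen(request) would send it to a running service.
```

Any language with an HTTP client could do the same, which is the appeal over requiring each client to write to the ConfigStream directly.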

bq. I'm a bit hesitant to use a build system as an example of how to do 
configuration, since I think the two use cases might be different.

FWIW, the same approach is discussed in [these slides on Creating DSLs in 
Python|http://www.slideshare.net/Siddhi/creating-domain-specific-languages-in-python],
in particular starting on slide 38, where the author discusses creating an 
internal DSL that uses the same approach to represent HTML.

Another approach purely for making the configuration more hierarchical would be 
to use JSON to represent the configuration.
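
For illustration, a nested JSON document can be flattened back into dot-separated keys in a few lines of Python. The `flatten` helper, the sample keys, and the class names here are assumptions, not an existing Samza format.

```python
import json

# Illustrative nested configuration; the keys and values are placeholders.
NESTED = """
{
  "job": {"name": "my-job"},
  "task": {"class": "com.example.MyStreamTask", "inputs": "kafka.page-views"},
  "systems": {"kafka": {"samza": {"factory": "com.example.KafkaSystemFactory"}}}
}
"""


def flatten(node, prefix=""):
    """Turn nested dicts into a flat dict with dot-separated keys."""
    flat = {}
    for key, value in node.items():
        path = prefix + "." + key if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = str(value)
    return flat


properties = flatten(json.loads(NESTED))
for key in sorted(properties):
    print("%s=%s" % (key, properties[key]))
```

This keeps the flat key=value form as the internal representation and only changes what the user writes.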

> Samza Configuration DSL
> -----------------------
>
>                 Key: SAMZA-416
>                 URL: https://issues.apache.org/jira/browse/SAMZA-416
>             Project: Samza
>          Issue Type: New Feature
>            Reporter: David Chen
>
> The user-facing language for Samza configurations is currently Java 
> Properties. While this works for the time being and is simple to implement, 
> it is verbose and cumbersome for the end user.
> Since SAMZA-40 has been opened to refactor the way Samza configuration is 
> wired and SAMZA-348 has been opened to allow Samza jobs to be configured 
> through a stream, it is natural to consider a scripting model for 
> configuring Samza jobs.
> One approach would be to implement a DSL in a scripting language like Python. 
> Implementing a Python DSL is actually very easy to do, and there are many 
> examples of Python DSLs in non-trivial use cases, most notably, [Google's 
> build system, 
> Blaze|http://google-engtools.blogspot.com/2011/08/build-in-cloud-how-build-system-works.html].
>  I found [a GitHub Gist containing a more detailed example of a Blaze BUILD 
> file|https://gist.github.com/wiseman/3834928]. Blaze also inspired a number 
> of open source clones, most notably [Twitter 
> Pants|http://pantsbuild.github.io/].
> Another idea, by [~jonbringhurst] from SAMZA-348:
> {quote}
> Regarding the client commands to create and modify Samza jobs (such as 
> configure-job and run-job), it may be useful to review existing commands that 
> perform a similar role:
> My personal favorite is Slurm's set of client commands, of which sbatch is 
> probably the most relevant (http://www.schedmd.com/slurmdocs/sbatch.html).
> To go back a bit further in history, it might be a good idea to take a look 
> at the POSIX qsub style command from PBS/Torque 
> (http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qsub.htm).
>  Moab's msub also follows this design.
> Regarding a possible DSL for building configuration, it may be useful to look 
> at Slurm's lua callback for job configuration (warning, GPLv2 code) 
> https://github.com/SchedMD/slurm/blob/master/contribs/lua/job_submit.lua
> {quote}
> We could, of course, have both a command line tool and a DSL, and I am sure 
> once Samza takes off, there will be people implementing DSLs for other 
> language clients as well. The key would be to implement a standard interface 
> for these different implementations to talk to.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
