[ https://issues.apache.org/jira/browse/SAMZA-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127739#comment-14127739 ]
Martin Kleppmann commented on SAMZA-40:
---------------------------------------
bq. What about a way to specify a DAG for the job?
This would effectively mean introducing a "topology" concept to Samza, which I
think would be a good thing. Often several jobs are quite tightly coupled —
e.g. one repartitions a stream in a particular way for another to consume. It
makes sense to deploy such DAGs of jobs together. (In YARN terms I think they
should still be separate jobs, so this would just be a shortcut for launching
several jobs in one go, perhaps with some wiring.)
However, this could also be a feature outside of core Samza, analogous to Oozie
or Azkaban for MapReduce. It could potentially grow into a web dashboard with
visualisation of the data flow in the cluster (cf. SAMZA-300).
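To make the "wiring" idea a bit more concrete, here is a rough sketch of how a group of jobs could be described by the streams they read and write, with the DAG edges derived from that. This is purely illustrative: the JobSpec class, the edges() helper and the stream names are hypothetical and not part of Samza.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TopologySketch {
    /** Hypothetical job description: a name plus the streams it reads and writes. */
    static final class JobSpec {
        final String name;
        final List<String> inputs;
        final List<String> outputs;

        JobSpec(String name, List<String> inputs, List<String> outputs) {
            this.name = name;
            this.inputs = inputs;
            this.outputs = outputs;
        }
    }

    /** Derives producer-to-consumer edges by matching output streams to input streams. */
    static List<String> edges(List<JobSpec> jobs) {
        List<String> result = new ArrayList<>();
        for (JobSpec producer : jobs) {
            for (JobSpec consumer : jobs) {
                for (String stream : producer.outputs) {
                    if (consumer.inputs.contains(stream)) {
                        result.add(producer.name + " -[" + stream + "]-> " + consumer.name);
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // One job repartitions a stream so that a second job can consume it.
        List<JobSpec> topology = Arrays.asList(
            new JobSpec("repartition",
                Arrays.asList("page-views"),
                Arrays.asList("page-views-by-user")),
            new JobSpec("count",
                Arrays.asList("page-views-by-user"),
                Arrays.asList("view-counts")));

        for (String edge : edges(topology)) {
            System.out.println(edge);
        }
    }
}
```

Each JobSpec would still be submitted as its own job; the derived edges are only there for launching the group together and for visualising the data flow.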
bq. Can sensible defaults make the config less verbose?
+1. I found the frequent occurrence of Java factory classnames in the config a
bit intimidating at first. As a suggestion, we could register a default set of
serdes, systems etc. under sensible names ("json", "kafka", etc). Anyone who
wants to write their own factories should still be able to plug them in, but
some simple defaults would make a simple job's config less scary-looking and
less error-prone.
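A minimal sketch of what I mean by registering defaults under short names, assuming a simple alias table that falls back to treating the value as a class name. The FactoryAliases class and the org.example class names are made up for illustration; they are not existing Samza code.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: maps short aliases such as "json" to factory class
 * names, while still accepting a fully qualified class name as-is.
 */
public class FactoryAliases {
    private final Map<String, String> aliases = new HashMap<>();

    /** Registers a short alias for a factory class name. */
    public FactoryAliases register(String alias, String className) {
        aliases.put(alias, className);
        return this;
    }

    /** Returns the class name registered for an alias, or the input unchanged. */
    public String resolve(String nameOrClass) {
        return aliases.getOrDefault(nameOrClass, nameOrClass);
    }

    public static void main(String[] args) {
        // The class names below are placeholders, not real factories.
        FactoryAliases serdes = new FactoryAliases()
            .register("json", "org.example.serde.JsonSerdeFactory")
            .register("string", "org.example.serde.StringSerdeFactory");

        // A simple job's config can say "json" ...
        System.out.println(serdes.resolve("json"));
        // ... while a custom factory is still referenced by its class name.
        System.out.println(serdes.resolve("com.mycompany.MySerdeFactory"));
    }
}
```

Because unknown names pass through unchanged, existing configs that spell out the factory class would keep working.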
> Refactor Samza configuration
> ----------------------------
>
> Key: SAMZA-40
> URL: https://issues.apache.org/jira/browse/SAMZA-40
> Project: Samza
> Issue Type: Bug
> Components: container
> Affects Versions: 0.6.0
> Reporter: Chris Riccomini
> Labels: project
>
> Samza's configuration system has several problems that we need to resolve.
> * Want to auto-generate documentation based off of configuration.
> * Should support global defaults for a config property. Right now, we do
> config.getFoo.getOrElse() everywhere.
> * Should validate config up front, rather than throwing runtime exceptions
> randomly throughout the code.
> * We are mixing wiring and configuration together. How do other systems
> handle this?
> * We have fragmented configuration (anybody can define configuration). How do
> other systems handle this?
> * How to handle undefined configuration? How to make this interoperable with
> both Java and Scala (i.e. should we support Option in Scala)?
> * Should remain immutable.
> * Should remove implicits. It's just confusing.
> * Do we want to support complex types (list, map) for values, not just String?
> We need a design proposal for this.