It might be worth taking a look at how Beam does test streams. The API is
more powerful than just passing in a queue, e.g.:
TestStream<String> source = TestStream.create(StringUtf8Coder.of())
.addElements(TimestampedValue.of("this", start))
.addElements(TimestampedValue.of("that", start))
.addElements(TimestampedValue.of("future",
start.plus(Duration.standardMinutes(1))))
.advanceProcessingTime(Duration.standardMinutes(3))
.advanceWatermarkTo(start.plus(Duration.standardSeconds(30)))
.advanceWatermarkTo(start.plus(Duration.standardMinutes(1)))
.advanceWatermarkToInfinity();
---
BTW, have we given up on the idea of a simpler input system, e.g. one that
assumes all input messages are keyed? It seems it would be possible to
support legacy "system streams" via an adapter that mapped K, V -> V' and
could open the possibility of inputs in whatever for users want, e.g.
(again from Beam):
final Create.Values<String> values = Create.of("test", "one", "two", "three");
final TextIO.Read.Bound from = TextIO.Read.from("src/main/resources/words.txt");
final KafkaIO.Read<Integer, Long> reader = KafkaIO.<Integer, Long>read()
.withBootstrapServers("myServer1:9092,myServer2:9092")
.withTopics(topics)
.withConsumerFactoryFn(new ConsumerFactoryFn(
topics, 10, numElements, OffsetResetStrategy.EARLIEST))
.withKeyCoder(BigEndianIntegerCoder.of())
.withValueCoder(BigEndianLongCoder.of())
.withMaxNumRecords(numElements);
Ideally, such a simple input system specification would be useable in
production as well as test. At that point I don't know if we need a
separate TestApplicationRunner except perhaps as a hint to what we've
been calling an Environment?
---
Aren't we supposed to be able to run applications without blocking (e.g.
for embedded cases)? The API suggests that run is going to be a blocking
call?
- Chris
On Thu, Apr 20, 2017 at 1:06 PM, Jacob Maes <[email protected]> wrote:
> Thanks for the SEP!
>
> +1 on introducing these new components
> -1 on the current definition of their roles (see Design feedback below)
>
> *Design*
>
> - If LocalJobRunner and RemoteJobRunner handle the different methods of
> launching a Job, what additional value do the different types of
> ApplicationRunner and RuntimeEnvironment provide? It seems like a red
> flag
> that all 3 would need to change from environment to environment. It
> indicates that they don't have proper modularity. The
> call-sequence-figures
> support this; LocalApplicationRunner and RemoteApplicationRunner make
> the
> same calls and the diagram only varies after jobRunner.start()
> - As far as I can tell, the only difference between Local and Remote
> ApplicationRunner is that one is blocking and the other is
> non-blocking. If
> that's all they're for then either the names should be changed to
> reflect
> this, or they should be combined into one ApplicationRunner and just
> expose
> separate methods for run() and runBlocking()
> - There isn't much detail on why the main() methods for Local/Remote
> have such different implementations, how they receive the Application
> (direct vs config), and concretely how the deployment scripts, if any,
> should interact with them.
>
>
> *Style*
>
> - nit: None of the 11 uses of the word "actual" in the doc are
> *actually*
> needed. :-)
> - nit: Colors of the runtime blocks in the diagrams are unconventional
> and a little distracting. Reminds me of nai won bao. Now I'm hungry. :-)
> - Prefer the name "ExecutionEnvironment" over "RuntimeEnvironment". The
> term "execution environment" is used
> - The code comparisons for the ApplicationRunners are not apples-apples.
> The local runner example is an application that USES the local runner.
> The
> remote runner example is the just the runner code itself. So, it's not
> readily apparent that we're comparing the main() methods and not the
> application itself.
>
>
> On Mon, Apr 17, 2017 at 5:02 PM, Yi Pan <[email protected]> wrote:
>
> > Made some updates to clarify the role and functions of RuntimeEnvironment
> > in SEP-2.
> >
> > On Fri, Apr 14, 2017 at 9:30 AM, Yi Pan <[email protected]> wrote:
> >
> > > Hi, everyone,
> > >
> > > In light of new features such as fluent API and standalone that
> introduce
> > > new deployment / application launch models in Samza, I created a new
> > SEP-2
> > > to address the new use cases. SEP-2 link: https://cwiki.apache.
> > > org/confluence/display/SAMZA/SEP-2%3A+ApplicationRunner+Design
> > >
> > > Please take a look and give feedbacks!
> > >
> > > Thanks!
> > >
> > > -Yi
> > >
> >
>