Re: End to end unit tests for stateful pipeline

2021-06-22 Thread Luke Cwik
I have also seen a DefaultValueFactory that reads another option naming the
class it should instantiate and invoke to create the object[1]. This way you
don't need to make the client serializable; you just need the factory that
creates the test instance to be available on the classpath.

1:
https://github.com/apache/beam/blob/cf8ffe660cfcb1f7d421171f406fa991b93e043b/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcsOptions.java#L164
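
A minimal sketch of that idea in plain Java, with no Beam types so it stands alone. Every name in it (`RedisClient`, `RedisClientFactory`, `InMemoryRedisClientFactory`, `ClientLoader`) is hypothetical. In a real pipeline the lookup in `ClientLoader.load` would presumably sit inside the `create(PipelineOptions)` method of a `DefaultValueFactory`, and the class name would come from a second, String-valued option:

```java
import java.util.HashMap;
import java.util.Map;

class ClassNameFactoryDemo {
  /** Hypothetical client interface used by the pipeline. */
  interface RedisClient {
    String get(String key);
  }

  /** Hypothetical factory contract; implementations need an accessible no-arg constructor. */
  interface RedisClientFactory {
    RedisClient create();
  }

  /** Test-only factory. It can live in the test source tree as long as it is on the classpath. */
  static class InMemoryRedisClientFactory implements RedisClientFactory {
    @Override
    public RedisClient create() {
      Map<String, String> data = new HashMap<>();
      data.put("user:1", "alice");
      return data::get;
    }
  }

  /** Resolves a factory from its class name, so only a String has to travel with the options. */
  static final class ClientLoader {
    private ClientLoader() {}

    static RedisClient load(String factoryClassName) {
      try {
        RedisClientFactory factory =
            Class.forName(factoryClassName)
                .asSubclass(RedisClientFactory.class)
                .getDeclaredConstructor()
                .newInstance();
        return factory.create();
      } catch (ReflectiveOperationException e) {
        throw new IllegalArgumentException("Cannot create factory " + factoryClassName, e);
      }
    }
  }

  public static void main(String[] args) {
    // In a test, this String is the only thing that would be set on the options.
    String factoryName = InMemoryRedisClientFactory.class.getName();
    RedisClient client = ClientLoader.load(factoryName);
    System.out.println(client.get("user:1"));
  }
}
```

Because the client is built where it is used, the fake itself never has to be serialized, and the production package does not need to know the test class at compile time.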

On Thu, Jun 17, 2021 at 1:06 PM gaurav mishra 
wrote:

> Thanks Luke. That kind of worked. But to make the serialization and
> deserialization work I had to put some test code into production code.
> ProxyInvocationHandler.ensureSerializable() tries to serialize and
> deserialize my `TestRedisClient`. But since the return type of
> getRedisClient() in  ProductionPipelineOptions is an
> interface `RedisClient`,  jackson cannot deserialize the given string to an
> instance of `TestRedisClient` . So to force Jackson to instantiate the
> correct instance of RedisClient I had to move `TestRedisClient` in the code
> package where interface `RedisClient` lives. And had to add a couple of
> annotations on interface like
> @JsonTypeInfo(...)
> @JsonSubTypes({@Type(value = TestRedisClient.class, name =
> "testRedisClient")})
>
> Maybe there is still a better way to do this without having to mix my test
> related classes in the real code.
>
> On Wed, Jun 16, 2021 at 2:12 PM Luke Cwik  wrote:
>
>> In your test I would have expected to have seen something like:
>> ```
>> // create an instance of TestRedisClient, which is serializable
>> RedisClient testClient = createTestRedisClient();
>> // ... set up any expected interactions or test data on testClient ...
>> ProductionPipelineOptions options =
>>     PipelineOptionsFactory.as(ProductionPipelineOptions.class);
>> options.setRedisClient(testClient);
>> pipeline.run(options);
>> ```
>>
>> Your DoFn as is looks fine.
>>
>> On Mon, Jun 14, 2021 at 10:27 PM gaurav mishra <
>> gauravmishra.it...@gmail.com> wrote:
>>
>>> Hi Luke,
>>> I tried going down the path which you suggested but hitting some
>>> roadblocks. Maybe I am doing something wrong. As you said I created a unit
>>> test specific class for PipelineOptions, created a TestRedisFactory which
>>> is setup to return a mock instance of RedisClient. In my test code I have
>>>  ```
>>> options = PipelineOptionsFactory.as(TestPipelineOptions.class);
>>> // get instance of TestRedisClient which is serializable
>>> RedisClient client = options.getRedisClient();
>>> // some code to setup mocked interactions
>>> pipeline.run(options);
>>> ```
>>>
>>> In my DoFn I have
>>> ```
>>> ProductionPipelineOptions pipelineOptions =
>>> context.getPipelineOptions().as(ProductionPipelineOptions.class);
>>>
>>> // get instance of RealRedisClient
>>> RedisClient redisClient = pipelineOptions.getRedisClient();
>>> redisClient.get(key)
>>> ```
>>> In unit test my options is getting serialized along with the
>>> TestRedisClient inside it. But when my DoFn is being called the framework
>>> tries to deserialize the string representation of `TestRedisClient` to
>>> `something that implements RedisClient` and this is where I am getting
>>> stuck. Not able to wrap my head around how to tell the framework to
>>> deserialize the string to TestRedisClient and return that in my DoFn.
>>>
>>> On Mon, Jun 14, 2021 at 2:07 PM Luke Cwik  wrote:
>>>
>>>> You can create a PipelineOption which represents your Redis client
>>>> object. For tests you would set the PipelineOption to a serializable
>>>> fake/mock that can replay the results you want. The default for the
>>>> PipelineOption object would instantiate your production client. You can see
>>>> an example usage of the DefaultValueFactory here[1].
>>>>
>>>> 1:
>>>> https://github.com/apache/beam/blob/5cebe0fd82ade3f957fe70e25aa3e399d2e91b32/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectOptions.java#L71
>>>>
>>>> On Mon, Jun 14, 2021 at 10:54 AM gaurav mishra <
>>>> gauravmishra.it...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> I have a streaming pipeline which reads from pubsub, enriches data
>>>>> using redis and finally writes to pubsub. The code has some stateful DoFns
>>>>> with timers. I wanted to write unit tests for the whole pipeline, that
>>>>> reads from TestStream<> , enriches data using a mocked redis client, and
>>>>> writes data to a PCollection on which I can do PAsserts. The trouble I am
>>>>> having here is how to set up the mocked redis client. Are there any
>>>>> examples that I can take a look at? I am using java with junit4 as a
>>>>> testing framework.
>>>>> More details about my code are here -
>>>>> https://stackoverflow.com/questions/67963189/unit-tests-apache-beam-stateful-pipeline-with-external-dependencies
>>>>>



Re: End to end unit tests for stateful pipeline

2021-06-17 Thread gaurav mishra
Thanks Luke. That kind of worked. But to make the serialization and
deserialization work I had to put some test code into production code.
ProxyInvocationHandler.ensureSerializable() tries to serialize and
deserialize my `TestRedisClient`. But since the return type of
getRedisClient() in ProductionPipelineOptions is the interface
`RedisClient`, Jackson cannot deserialize the given string to an instance
of `TestRedisClient`. So to force Jackson to instantiate the correct
implementation of RedisClient I had to move `TestRedisClient` into the
package where the interface `RedisClient` lives, and add a couple of
annotations to the interface, like
@JsonTypeInfo(...)
@JsonSubTypes({@Type(value = TestRedisClient.class, name =
"testRedisClient")})

Maybe there is still a better way to do this without having to mix my
test-related classes into the production code.
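
A Jackson-free sketch of the idea behind those two annotations: a type tag is written next to the payload, and a name-to-implementation registry is consulted when reading it back. This is only a model of the mechanism, not Jackson's API. The `name:payload` text format, the `SUBTYPES` map and the shape of `TestRedisClient` are all assumptions made for illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class TypeTagDemo {
  /** Hypothetical client interface; the declared option type in the thread. */
  interface RedisClient {
    String get(String key);
  }

  /** Hypothetical fake that answers every lookup with one fixed value. */
  static final class TestRedisClient implements RedisClient {
    final String fixedValue;

    TestRedisClient(String fixedValue) {
      this.fixedValue = fixedValue;
    }

    @Override
    public String get(String key) {
      return fixedValue;
    }
  }

  /** Plays the role of the subtype list: logical type name -> how to rebuild that type. */
  static final Map<String, Function<String, RedisClient>> SUBTYPES = new HashMap<>();

  static {
    SUBTYPES.put("testRedisClient", TestRedisClient::new);
  }

  /** Writes "typeName:payload"; the tag is what lets the reader pick a concrete class. */
  static String serialize(TestRedisClient client) {
    return "testRedisClient:" + client.fixedValue;
  }

  /** Rebuilds a client from tagged text, failing when no known tag is present. */
  static RedisClient deserialize(String text) {
    int sep = text.indexOf(':');
    if (sep < 0) {
      throw new IllegalArgumentException(
          "No type tag, cannot choose an implementation of RedisClient: " + text);
    }
    Function<String, RedisClient> constructor = SUBTYPES.get(text.substring(0, sep));
    if (constructor == null) {
      throw new IllegalArgumentException("Unknown type tag in: " + text);
    }
    return constructor.apply(text.substring(sep + 1));
  }

  public static void main(String[] args) {
    RedisClient copy = deserialize(serialize(new TestRedisClient("cached")));
    System.out.println(copy.getClass().getSimpleName() + " -> " + copy.get("any-key"));
  }
}
```

The registry is the part that has to name the test class, which is why the annotations ended up pulling `TestRedisClient` next to the production interface.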

On Wed, Jun 16, 2021 at 2:12 PM Luke Cwik  wrote:

> In your test I would have expected to have seen something like:
> ```
> // create an instance of TestRedisClient, which is serializable
> RedisClient testClient = createTestRedisClient();
> // ... set up any expected interactions or test data on testClient ...
> ProductionPipelineOptions options =
>     PipelineOptionsFactory.as(ProductionPipelineOptions.class);
> options.setRedisClient(testClient);
> pipeline.run(options);
> ```
>
> Your DoFn as is looks fine.
>
> On Mon, Jun 14, 2021 at 10:27 PM gaurav mishra <
> gauravmishra.it...@gmail.com> wrote:
>
>> Hi Luke,
>> I tried going down the path which you suggested but hitting some
>> roadblocks. Maybe I am doing something wrong. As you said I created a unit
>> test specific class for PipelineOptions, created a TestRedisFactory which
>> is setup to return a mock instance of RedisClient. In my test code I have
>>  ```
>> options = PipelineOptionsFactory.as(TestPipelineOptions.class);
>> // get instance of TestRedisClient which is serializable
>> RedisClient client = options.getRedisClient();
>> // some code to setup mocked interactions
>> pipeline.run(options);
>> ```
>>
>> In my DoFn I have
>> ```
>> ProductionPipelineOptions pipelineOptions =
>> context.getPipelineOptions().as(ProductionPipelineOptions.class);
>>
>> // get instance of RealRedisClient
>> RedisClient redisClient = pipelineOptions.getRedisClient();
>> redisClient.get(key)
>> ```
>> In unit test my options is getting serialized along with the
>> TestRedisClient inside it. But when my DoFn is being called the framework
>> tries to deserialize the string representation of `TestRedisClient` to
>> `something that implements RedisClient` and this is where I am getting
>> stuck. Not able to wrap my head around how to tell the framework to
>> deserialize the string to TestRedisClient and return that in my DoFn.
>>
>> On Mon, Jun 14, 2021 at 2:07 PM Luke Cwik  wrote:
>>
>>> You can create a PipelineOption which represents your Redis client
>>> object. For tests you would set the PipelineOption to a serializable
>>> fake/mock that can replay the results you want. The default for the
>>> PipelineOption object would instantiate your production client. You can see
>>> an example usage of the DefaultValueFactory here[1].
>>>
>>> 1:
>>> https://github.com/apache/beam/blob/5cebe0fd82ade3f957fe70e25aa3e399d2e91b32/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectOptions.java#L71
>>>
>>> On Mon, Jun 14, 2021 at 10:54 AM gaurav mishra <
>>> gauravmishra.it...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> I have a streaming pipeline which reads from pubsub, enriches data
>>>> using redis and finally writes to pubsub. The code has some stateful DoFns
>>>> with timers. I wanted to write unit tests for the whole pipeline, that
>>>> reads from TestStream<> , enriches data using a mocked redis client, and
>>>> writes data to a PCollection on which I can do PAsserts. The trouble I am
>>>> having here is how to set up the mocked redis client. Are there any
>>>> examples that I can take a look at? I am using java with junit4 as a
>>>> testing framework.
>>>> More details about my code are here -
>>>> https://stackoverflow.com/questions/67963189/unit-tests-apache-beam-stateful-pipeline-with-external-dependencies
>>>>
>>>


Re: End to end unit tests for stateful pipeline

2021-06-14 Thread gaurav mishra
Hi Luke,
I tried going down the path you suggested but I am hitting some
roadblocks. Maybe I am doing something wrong. As you said, I created a
unit-test-specific PipelineOptions class and a TestRedisFactory which is
set up to return a mock instance of RedisClient. In my test code I have
```
TestPipelineOptions options = PipelineOptionsFactory.as(TestPipelineOptions.class);
// get instance of TestRedisClient which is serializable
RedisClient client = options.getRedisClient();
// some code to set up mocked interactions
pipeline.run(options);
```

In my DoFn I have
```
ProductionPipelineOptions pipelineOptions =
context.getPipelineOptions().as(ProductionPipelineOptions.class);

// get instance of RealRedisClient
RedisClient redisClient = pipelineOptions.getRedisClient();
redisClient.get(key);
```
In the unit test my options object is serialized along with the
TestRedisClient inside it. But when my DoFn is called, the framework
tries to deserialize the string representation of `TestRedisClient` to
`something that implements RedisClient`, and this is where I am getting
stuck. I cannot work out how to tell the framework to deserialize the
string to TestRedisClient and return that in my DoFn.

On Mon, Jun 14, 2021 at 2:07 PM Luke Cwik  wrote:

> You can create a PipelineOption which represents your Redis client object.
> For tests you would set the PipelineOption to a serializable fake/mock that
> can replay the results you want. The default for the PipelineOption object
> would instantiate your production client. You can see an example usage of
> the DefaultValueFactory here[1].
>
> 1:
> https://github.com/apache/beam/blob/5cebe0fd82ade3f957fe70e25aa3e399d2e91b32/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectOptions.java#L71
>
> On Mon, Jun 14, 2021 at 10:54 AM gaurav mishra <
> gauravmishra.it...@gmail.com> wrote:
>
>> Hi,
>> I have a streaming pipeline which reads from pubsub, enriches data using
>> redis and finally writes to pubsub. The code has some stateful DoFns with
>> timers. I wanted to write unit tests for the whole pipeline, that reads
>> from TestStream<> , enriches data using a mocked redis client, and writes
>> data to a PCollection on which I can do PAsserts. The trouble I am having
>> here is how to set up the mocked redis client. Are there any examples that
>> I can take a look at? I am using java with junit4 as a testing framework.
>> More details about my code are here -
>> https://stackoverflow.com/questions/67963189/unit-tests-apache-beam-stateful-pipeline-with-external-dependencies
>>
>


Re: End to end unit tests for stateful pipeline

2021-06-14 Thread Luke Cwik
You can create a PipelineOption which represents your Redis client object.
For tests you would set the PipelineOption to a serializable fake/mock that
can replay the results you want. The default for the PipelineOption object
would instantiate your production client. You can see an example usage of
the DefaultValueFactory here[1].

1:
https://github.com/apache/beam/blob/5cebe0fd82ade3f957fe70e25aa3e399d2e91b32/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectOptions.java#L71
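
The default-versus-override behaviour described above can be sketched in plain Java. This is a model only, not Beam's API: the `Options` class, its `Supplier`-based default and the `RealRedisClient` stand-in are hypothetical. In Beam the same roles would be played by a `PipelineOptions` getter/setter pair with a default factory attached, as in the linked file:

```java
import java.util.function.Supplier;

class DefaultedOptionDemo {
  /** Hypothetical client interface. */
  interface RedisClient {
    String get(String key);
  }

  /** Stand-in for the production client; a real one would open a connection. */
  static final class RealRedisClient implements RedisClient {
    @Override
    public String get(String key) {
      throw new UnsupportedOperationException("no Redis server in this sketch");
    }
  }

  /** One option whose value is either set explicitly or built lazily by a default factory. */
  static final class Options {
    private final Supplier<RedisClient> defaultFactory;
    private RedisClient redisClient;

    Options(Supplier<RedisClient> defaultFactory) {
      this.defaultFactory = defaultFactory;
    }

    RedisClient getRedisClient() {
      if (redisClient == null) {
        // Nothing was set, so fall back to the default factory once and cache the result.
        redisClient = defaultFactory.get();
      }
      return redisClient;
    }

    void setRedisClient(RedisClient client) {
      this.redisClient = client;
    }
  }

  public static void main(String[] args) {
    // Production path: nobody calls the setter, so the default factory runs.
    Options production = new Options(RealRedisClient::new);
    System.out.println(production.getRedisClient().getClass().getSimpleName());

    // Test path: the fake is set before the pipeline code asks for the client.
    Options test = new Options(RealRedisClient::new);
    test.setRedisClient(key -> "fake-" + key);
    System.out.println(test.getRedisClient().get("k1"));
  }
}
```

The code that consumes the option only ever calls the getter, so it does not change between production and test runs.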

On Mon, Jun 14, 2021 at 10:54 AM gaurav mishra 
wrote:

> Hi,
> I have a streaming pipeline which reads from pubsub, enriches data using
> redis and finally writes to pubsub. The code has some stateful DoFns with
> timers. I wanted to write unit tests for the whole pipeline, that reads
> from TestStream<> , enriches data using a mocked redis client, and writes
> data to a PCollection on which I can do PAsserts. The trouble I am having
> here is how to set up the mocked redis client. Are there any examples that
> I can take a look at? I am using java with junit4 as a testing framework.
> More details about my code are here -
> https://stackoverflow.com/questions/67963189/unit-tests-apache-beam-stateful-pipeline-with-external-dependencies
>


End to end unit tests for stateful pipeline

2021-06-14 Thread gaurav mishra
Hi,
I have a streaming pipeline which reads from Pub/Sub, enriches data using
Redis and finally writes to Pub/Sub. The code has some stateful DoFns with
timers. I want to write unit tests for the whole pipeline that read from a
TestStream<>, enrich data using a mocked Redis client, and write data to a
PCollection on which I can run PAsserts. The trouble I am having is how to
set up the mocked Redis client. Are there any examples that I can take a
look at? I am using Java with JUnit 4 as the testing framework.
More details about my code are here -
https://stackoverflow.com/questions/67963189/unit-tests-apache-beam-stateful-pipeline-with-external-dependencies
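
The replies in this thread call for a serializable fake. A minimal sketch of one, assuming a hypothetical `RedisClient` interface with a single `get` method (the real interface is not shown in the thread). The round trip through Java serialization is there only to check that the fake survives being copied:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

class SerializableFakeDemo {
  /** Hypothetical client interface, declared Serializable so fakes can be shipped around. */
  interface RedisClient extends Serializable {
    String get(String key);
  }

  /** In-memory fake backed by a HashMap, which is itself Serializable. */
  static final class InMemoryRedisClient implements RedisClient {
    private static final long serialVersionUID = 1L;
    private final HashMap<String, String> data = new HashMap<>();

    InMemoryRedisClient put(String key, String value) {
      data.put(key, value);
      return this;
    }

    @Override
    public String get(String key) {
      return data.get(key);
    }
  }

  /** Serializes and deserializes a value, returning the copy. */
  static <T extends Serializable> T roundTrip(T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        @SuppressWarnings("unchecked")
        T copy = (T) in.readObject();
        return copy;
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Value is not serializable", e);
    }
  }

  public static void main(String[] args) {
    RedisClient fake = new InMemoryRedisClient().put("user:1", "alice");
    RedisClient copy = roundTrip(fake);
    System.out.println(copy.get("user:1"));
  }
}
```

Java serialization records the concrete class, so the copy comes back as an `InMemoryRedisClient` even though the variable is typed as the interface.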