Hi Zike Yang,

thanks for providing the PR.
I really like the approach with the PulsarContainer service.

Maybe we should discuss at which level we want to test the adapters.
In general, I guess we can use either an end-to-end approach using the REST API 
and the message broker with the resulting events or use some more lower-level 
APIs to start the adapter and check if the events are processed correctly.
I'm a little more in favor of taking the low-level API approach as it gives us 
fewer dependencies and hopefully makes the tests easier to write. For the 
complete integration we will have the e2e test later.
What do you think?

Here is some general background information on how StreamPipes Connect works.
During deployment we have the connect master service running within the backend.
This REST API [1] is responsible for managing all the adapters. The running 
adapter instances are executed within worker containers.
They also have a REST API [2] to start and stop adapter instances.

I think there are several ways to validate that the adapter processed the 
events correctly.
For the PR you used the method ‘getNElements’, which makes complete sense 
because it is a synchronous method call. However, this method is only available 
for generic adapters.
It is used to get some sample data when guessing the event schema. Once the 
adapter is started, this method is no longer used; instead, adapters are 
started via the IAdapter.startAdapter method.
In general, we distinguish between generic and specific adapters. The main 
difference between them is that generic adapters support multiple formats.
E.g. events on a broker topic could be json or xml.
I see two options to get the resulting events from the adapter:

  *   Option 1: We start a service for the internally used message broker and 
validate that the events on the result topic are correct.
  *   Option 2: All adapters have an internally used preprocessing pipeline 
[3]. The last element of this pipeline is responsible for writing the resulting 
event onto a broker. We could add a testing sink that validates those events. 
So we do not need a broker service.
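
To make Option 2 a bit more concrete, here is a minimal sketch of what such a testing sink could look like. Note that PipelineStep, CollectingSink and SketchPipeline are hypothetical names invented for this illustration; they are not the actual classes of the preprocessing pipeline in [3], which would need to be adapted accordingly.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one step of an adapter's preprocessing pipeline.
// This is NOT the actual StreamPipes interface.
interface PipelineStep {
    // Returns the (possibly transformed) event, or null to drop it.
    Map<String, Object> process(Map<String, Object> event);
}

// Testing sink: takes the place of the last step, which would normally
// write the event onto a broker, and records the events instead.
final class CollectingSink implements PipelineStep {
    private final List<Map<String, Object>> received = new ArrayList<>();

    @Override
    public Map<String, Object> process(Map<String, Object> event) {
        received.add(event);
        return event;
    }

    // Immutable snapshot of everything that reached the sink so far.
    List<Map<String, Object>> received() {
        return List.copyOf(received);
    }
}

// Minimal pipeline that runs the steps in order (assumption for this sketch).
final class SketchPipeline {
    private final List<PipelineStep> steps;

    SketchPipeline(List<PipelineStep> steps) {
        this.steps = List.copyOf(steps);
    }

    void process(Map<String, Object> event) {
        Map<String, Object> current = event;
        for (PipelineStep step : steps) {
            current = step.process(current);
            if (current == null) {
                return; // event was filtered out by a step
            }
        }
    }
}

final class TestingSinkSketch {
    public static void main(String[] args) {
        CollectingSink sink = new CollectingSink();
        // Example transformation step: add a fixed property to each event.
        PipelineStep addSource = event -> {
            Map<String, Object> copy = new HashMap<>(event);
            copy.put("source", "test");
            return copy;
        };
        SketchPipeline pipeline = new SketchPipeline(List.of(addSource, sink));
        pipeline.process(Map.of("temperature", 21.5));
        System.out.println(sink.received());
    }
}
```

With something along these lines, a test could assert directly on the collected events and would not need a broker service at all.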

In general, we can also think about refactoring the connect API. This API is 
rather old and could use some restructuring. This would also improve the 
testability of the code.

One minor comment: the PR validation failed due to a dependency convergence error.

I hope those comments are helpful.

Cheers,
Philipp

[1] 
https://github.com/apache/incubator-streampipes/blob/dev/streampipes-rest/src/main/java/org/apache/streampipes/rest/impl/connect/AdapterResource.java#L51
[2] 
https://github.com/apache/incubator-streampipes/blob/dev/streampipes-connect-container-worker/src/main/java/org/apache/streampipes/connect/container/worker/rest/AdapterWorkerResource.java
[3] 
https://github.com/apache/incubator-streampipes/tree/dev/streampipes-connect/src/main/java/org/apache/streampipes/connect/adapter/preprocessing



From: Zike Yang <[email protected]>
Date: Wednesday, October 12, 2022 at 18:18
To: [email protected] <[email protected]>
Subject: Re: AW: [DISCUSS] Integration test for adapters and sinks
Hi all,

Thanks for your replies. I am very glad that all of you are involved
in this discussion.

I will try to answer your replies here:

> Do you have an idea how we can realize all the user input in the integration 
> tests?
Regarding the input data for tests, I don't have a very clear idea yet. This
is an interesting point to investigate.

The integration test will interact directly with the
streampipes-backend. Although we can’t test these components at the UI
level in the integration test, interacting via the backend using REST
protocol should be able to simulate all user input.

> Maybe it is even possible to re-use those or similar configuration files
Agree. This will probably be a point of improvement.

> How can we deal with third party dependencies?
We could implement a TestContainer class for each third-party
component. And we manage all dependencies for that component in that
class. We are using the same approach for testing the pulsar sinks and
pulsar sources component in the Apache Pulsar project.[0]

> Maybe we can start with a rather simple example and based on that
> together find a way to implement integration testing for StreamPipes?
Great!

> what do you think would be a good first step?
I have created a PR[1] to demonstrate the integration test framework.
And I have added an example for testing the Pulsar adapter which I am
most familiar with.

The PulsarContainer class manages the pulsar service. And we would
implement the Tester for each component. PulsarAdapterTester is the
tester for the pulsar adapter. We define some testing steps in the
tester. And we execute the actual test in the AdaptersTest class by
calling the third-party testers.
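
The structure described above could be sketched roughly as follows. This is only an illustration under stated assumptions: ManagedService, InMemoryQueueService, ComponentTester and QueueAdapterTester are hypothetical names and are not the classes in the PR. The in-memory service stands in for a real container so that the sketch runs without Docker.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical lifecycle wrapper for a third-party service (a role similar
// to PulsarContainer). A real implementation would start a container.
interface ManagedService extends AutoCloseable {
    void start();

    @Override
    void close(); // no checked exception, keeps try-with-resources simple
}

// In-memory stand-in for the third-party service.
final class InMemoryQueueService implements ManagedService {
    final List<String> messages = new ArrayList<>();
    boolean running;

    @Override
    public void start() { running = true; }

    @Override
    public void close() { running = false; }

    void publish(String message) {
        if (!running) {
            throw new IllegalStateException("service not started");
        }
        messages.add(message);
    }
}

// Hypothetical tester base class: one subclass per component defines the steps.
abstract class ComponentTester<S extends ManagedService> {
    protected final S service;

    ComponentTester(S service) { this.service = service; }

    abstract void produceInput();           // feed data into the third-party service
    abstract List<String> collectOutput();  // read what the component emitted
    abstract List<String> expectedOutput(); // what the test expects to see

    // Runs all steps and always shuts the service down afterwards.
    final boolean run() {
        try (ManagedService s = service) {
            s.start();
            produceInput();
            return expectedOutput().equals(collectOutput());
        }
    }
}

final class QueueAdapterTester extends ComponentTester<InMemoryQueueService> {
    QueueAdapterTester(InMemoryQueueService service) { super(service); }

    @Override
    void produceInput() {
        service.publish("a");
        service.publish("b");
    }

    // A real tester would start the adapter here and gather its events;
    // this sketch simply reads the messages back from the fake service.
    @Override
    List<String> collectOutput() { return List.copyOf(service.messages); }

    @Override
    List<String> expectedOutput() { return List.of("a", "b"); }
}

final class AdaptersTestSketch {
    public static void main(String[] args) {
        InMemoryQueueService service = new InMemoryQueueService();
        boolean passed = new QueueAdapterTester(service).run();
        System.out.println("passed=" + passed + ", running=" + service.running);
    }
}
```

Keeping the service lifecycle in the base class means every tester releases its resources even when a step throws.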

This is a very simple and basic PR. I haven’t dealt with the part of
interaction with the streampipes-backend. Actually, we need to create,
interact, and delete the adapter through the backend instead of using
the specific adapter class. The test in this PR needs to be optimized
in the future.

Regarding talking with the backend, I want to find the easiest way to
create the adapter through the REST protocol. I tried to do that using
this API[2]. But I found that the AdapterDescription is too complex
for me. It would be appreciated if someone could help with that
or guide me.

I would also like to discuss a simpler way to take the loaded data
from the adapter for testing through the REST. I haven't found a
suitable approach or interface.

And thank you all again. Please feel free to share your thoughts.

[0] 
https://github.com/apache/pulsar/tree/master/tests/integration/src/test/java/org/apache/pulsar/tests/integration/io
[1] https://github.com/apache/incubator-streampipes/pull/117
[2] 
https://github.com/apache/incubator-streampipes/blob/dev/streampipes-rest/src/main/java/org/apache/streampipes/rest/impl/connect/AdapterResource.java#L51

Best,
Zike Yang

On Wed, Oct 12, 2022 at 2:47 PM Dominik Riemer
<[email protected]> wrote:
>
> Hi,
> sounds great!
> That's a very important issue and a module for integration tests makes 
> total sense - would be cool to work towards a first example and then extend 
> this to all adapters, processors and sinks.
>
> @Zike Yang what do you think would be a good first step?
>
> Cheers
> Dominik
>
> -----Original Message-----
> From: Tim <[email protected]>
> Sent: Tuesday, October 11, 2022 9:58 PM
> To: [email protected]
> Subject: Re: AW: [DISCUSS] Integration test for adapters and sinks
>
> Hi,
>
> Thank you for initiating this discussion and already providing a solid 
> foundation for integration testing for StreamPipes.
> I completely agree with you and Philipp that testing (both in terms of unit 
> tests and integration tests) is a weak point in our codebase and has been 
> neglected so far.
> It would be great to have integration tests for all our adapters and sinks.
> Maybe we can start with a rather simple example and based on that together 
> find a way to implement integration testing for StreamPipes?
>
> I would be happy to contribute to this effort to some extent.
>
> Best
> Tim
>
> On 11.10.2022 19:15, Philipp Zehnder wrote:
> > Hi Zike Yang,
> >
> > thanks a lot for opening the discussion. I really like this idea!
> > We started with the e2e tests to have some basic tests, but we
> > definitely need way more test coverage and any help is highly
> > appreciated.
> > To have some kind of integration or unit tests for adapters and
> > processors would be awesome, especially because users can provide a
> > wide range of input configurations.
> >
> > For the e2e tests we tried to provide as much as possible via
> > configuration files see [1]. They describe the configuration for the
> > processors, the input data and the expected output data. This is
> > required to also test more complex patterns in the event stream. Do
> > you have an idea how we can realize all the user input in the
> > integration tests?
> > Maybe it is even possible to re-use those or similar configuration
> > files for both the integration and the e2e tests of processors. I
> > expect the integration tests to be much faster than the e2e tests, but
> > for the e2e tests we can ensure that the user can use the GUI for all
> > the input.
> >
> > How can we deal with third party dependencies? Especially for adapters
> > and data sinks we require other services that must be configured
> > accordingly. Do you have any experience with how best to do that?
> >
> > I am really looking forward to discussing this further with you.
> >
> > Cheers,
> > Philipp
> >
> > [1]
> > https://github.com/apache/incubator-streampipes/tree/dev/ui/cypress/fixtures/pipelineElement
> >
> > From: Zike Yang <[email protected]>
> > Date: Tuesday, October 11, 2022 at 17:31
> > To: [email protected] <[email protected]>
> > Subject: [DISCUSS] Integration test for adapters and sinks
> >
> > Hi, all
> >
> > I want to use this thread to discuss the integration tests for
> > adapters and sinks.
> >
> > Currently, there seems to be very little testing for adapters and
> > sinks. It’s not convenient to develop and fix them. We already have
> > the e2e tests for these third-party components[0]. But this requires
> > starting all third-party docker containers before running e2e tests.
> > Since we already have many adapter and sink components, this makes
> > testing inconvenient. It also seems
> > inconvenient to develop and debug them. I wonder if there is a better
> > way to optimize this testing approach for third-party components.
> >
> > I would like to propose adding the integration test for adapters and
> > sinks. And we could still use the e2e tests to do some small smoke
> > tests.
> >
> > We could add a new module called streampipes-integration-tests. In
> > this module, we do all integration tests for all adapters and sinks
> > and perhaps some other components. For each test, we could start the
> > docker container of the third-party component using Testcontainers[1].
> > And we use the streampipes-backend to create, interact, and delete the
> > component through the REST protocol. We could abstract some common
> > tests and utilities for all adapters and sinks.
> >
> > Using the streampipes-integration-tests, we can test a large number of
> > adapters and sinks and be able to clean up these docker container
> > resources in a timely manner. It also facilitates the development and
> > debugging. It is still possible to test these components at a higher
> > level (at the backend level).
> >
> > In addition, regarding the unit test, we could also add some unit
> > tests for each adapter and sink at their corresponding modules if
> > necessary.
> >
> > Please feel free to share your thoughts. And I'm interested in making
> > it happen.
> >
> > [0]
> > https://github.com/apache/incubator-streampipes/tree/dev/ui/cypress/tests/thirdparty
> > [1] https://www.testcontainers.org/
> >
> > Thanks,
> > Zike Yang
