Hi Dominik,

Thanks for your reply.

> 2. The schema guess step is currently mandatory (in most cases where we
> connect to machines, data is already available), but it would be nice to
> support both - e.g., users could upload an example or just manually define
> the schema. We can work on a concept for this if you want!

Got it. That's a good idea. I will investigate it and start a discussion once I have an initial concept.

> I forgot to mention that we already have some e2e tests for third-party
> components under [1]. These tests create a data sink and an adapter and check
> that data sent by the sink is received by the adapter.

That's great. I can start by adding an integration test for the Pulsar component.

I have created a PR to refactor the Pulsar sink component: [0]. Please help review it and feel free to comment on it when you have time. Thanks.

In addition, I found another point we can improve while writing this PR. In the current implementation, the Pulsar producer publishes messages synchronously, which makes it impossible to take advantage of the producer's batch sending. I think we can add an option that lets the user choose whether to send synchronously or asynchronously.

[0] https://github.com/apache/incubator-streampipes/pull/107

Thanks,
Zike Yang

On Thu, Aug 25, 2022 at 10:41 PM Dominik Riemer <[email protected]> wrote:
>
> Hi Zike,
>
> just one addition concerning tests:
> I forgot to mention that we already have some e2e tests for third-party
> components under [1]. These tests create a data sink and an adapter and check
> that data sent by the sink is received by the adapter.
> We would just need to add Pulsar to the validation docker compose file in the
> project root which is used for running the e2e tests.
>
> Cheers
> Dominik
>
> [1]
> https://github.com/apache/incubator-streampipes/tree/dev/ui/cypress/tests/thirdparty
>
>
> -----Original Message-----
> From: Zike Yang <[email protected]>
> Sent: Wednesday, August 24, 2022 4:53 PM
> To: [email protected]
> Subject: Re: New comer to the Apache Streampipes
>
> Hi, Dominik
>
> Thanks for your feedback and your helpful information. Today I set up my data
> sinks development environment, and it worked fine. Next, I will start working
> on refactoring the old three-class implementation.
>
> However, I have some questions:
> 1. I see that other data sink modules also use the old three-class
> implementation like the kafka data sink or rabbitmq data sink. We will
> refactor them too, right?
> 2. When I tried to create my adapter, I found that it must guess the schema
> before creating the adapter. It's mandatory, and it would only work when
> there were some data in the data source. Why do we need to guess the schema
> when creating the adapter? Can we make it optional?
> Because I think that in many cases the data source does not have pre-stored
> data when creating the adapter.
> 3. How could I write tests to verify my code changes? For my current task, I
> think I could write some unit tests by mocking the pulsar client. What's your
> thought?
>
> Thanks for your kind guidance. I am excited to contribute to this project.
>
> Thanks,
> Zike Yang
>
> On Wed, Aug 24, 2022 at 4:20 AM Dominik Riemer <[email protected]>
> wrote:
> >
> > Hi Zike,
> >
> > great to hear that you want to contribute!
> >
> > A good start can be to setup your development environment and to improve
> > some connectors or sinks. Improving the pulsar components would be great as
> > these are some rather basic implementations we did at ApacheCon.
> >
> > For development, the best setup is to use the CLI tool [1] which can be
> > configured to for different development targets.
> > E.g., "dev" mode can be
> > used to develop core services, UI and extensions so that only other
> > mandatory services are started in Docker, while "pipeline-element" mode
> > starts also the core and UI in Docker which is useful in case you are only
> > developing pipeline elements. There are a few environment variables which
> > might be needed to set and I'm happy to help with that.
> >
> > If you are interested in improving the Pulsar components, a good start
> > could be to take the Pulsar sink [2] and refactor the old three-class
> > implementation used there to the one-class-implementation described in the
> > documentation [3]. Other cool things would be to upgrade the Pulsar version
> > to the latest version and it would be also great to support more advanced
> > options, e.g., authentication or topic discovery so that users see a list
> > of available topics, or whatever you think would be a good configuration
> > option for Pulsar users. Similar things would also be great for the
> > connector.
> >
> > I'm happy to guide you through the first steps, feel free to ask here or
> > join the ASF StreamPipes Slack channel for quick questions!
> >
> > Cheers
> > Dominik
> >
> > [1]
> > https://github.com/apache/incubator-streampipes/tree/dev/installer/cli
> > [2]
> > https://github.com/apache/incubator-streampipes/tree/dev/streampipes-extensions/streampipes-sinks-brokers-jvm/src/main/java/org/apache/streampipes/sinks/brokers/jvm/pulsar
> > [3]
> > https://streampipes.apache.org/docs/docs/extend-tutorial-data-sinks.html
> >
> > -----Original Message-----
> > From: Zike Yang <[email protected]>
> > Sent: Tuesday, August 23, 2022 5:50 PM
> > To: [email protected]
> > Subject: New comer to the Apache Streampipes
> >
> > Hi, Apache Streampipes community
> >
> > I am a software engineer and a committer from the Apache Pulsar community.
> >
> > I recently learned about this project.
> > I read some documentation and tried
> > to use it myself and found that it's an excellent project. I'm interested
> > in it. I wish I could contribute to this project.
> >
> > Currently, I'm familiar with the Apache Pulsar. I see that this project
> > also uses Apache Pulsar as a data source connector and sink. I think I can
> > get started by contributing to those modules.
> >
> > Are there other learning resources that can help me deepen my understanding
> > of this project?
> > Is there anything else I can start working on?
> > Very appreciate it if you could provide me with more information.
> >
> > Thanks,
> > Zike Yang
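
Illustration for the synchronous/asynchronous option proposed in the latest message of this thread: the following is a minimal sketch of how a sink could switch between the two publishing styles based on a user setting. It is not the code from the PR. `MessageSender`, `SinkPublisher` and `RecordingSender` are hypothetical names invented for this example; `MessageSender` only mirrors the shape of the `send`/`sendAsync` pair offered by the Pulsar Java client's `Producer`, so that the sketch compiles without the Pulsar dependency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for a broker client with a blocking and a
// non-blocking publish call (assumption: not the real Pulsar interface).
interface MessageSender {
    void send(byte[] payload);                          // returns after the message is handed over
    CompletableFuture<Void> sendAsync(byte[] payload);  // returns immediately with a future
}

// Hypothetical sink-side helper that picks the publishing style from a user option.
final class SinkPublisher {
    private final MessageSender sender;
    private final boolean async;
    // Futures of asynchronous sends that have not been awaited yet.
    // A real sink would have to bound this list or wait periodically.
    private final List<CompletableFuture<Void>> pending = new ArrayList<>();

    SinkPublisher(MessageSender sender, boolean async) {
        this.sender = sender;
        this.async = async;
    }

    void publish(byte[] payload) {
        if (async) {
            // Do not wait for the result, so the client is free to batch messages.
            pending.add(sender.sendAsync(payload));
        } else {
            // Wait for each message before accepting the next one.
            sender.send(payload);
        }
    }

    int pendingCount() {
        return pending.size();
    }

    // Wait for all outstanding asynchronous sends, e.g. when the sink is stopped.
    void flush() {
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        pending.clear();
    }
}

// In-memory fake used only to exercise the sketch; it counts calls per style.
final class RecordingSender implements MessageSender {
    int syncCalls = 0;
    int asyncCalls = 0;

    @Override
    public void send(byte[] payload) {
        syncCalls++;
    }

    @Override
    public CompletableFuture<Void> sendAsync(byte[] payload) {
        asyncCalls++;
        return CompletableFuture.completedFuture(null);
    }
}
```

With `async` set to `false` every event goes through the blocking call, which matches the behaviour described for the current implementation. With `async` set to `true` the sink only collects futures and waits for them in `flush()`, which is the point where a sink would typically be stopped.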
