Hi all!

I've recently joined the Beam community and I wanted to take this opportunity to introduce myself. I previously worked on the Dataflow team and I'm transitioning to working on Beam. I'm looking forward to getting started.

The Sources and Runners discussion seemed to be headed in two directions, so I thought I would split part of that conversation into a new thread to address Aljoscha's question: "Should we maybe add integration tests that verify that all runners can correctly read from and write to an external system in a complete Pipeline?" [1]

Having embedded data services run alongside the runner seems like an expedient way to get some test coverage of the runners' interactions with sources & sinks, and it'd be an important part of the pre-commits or post-commits for the runners.

I'm also interested in a problem related to the one Aljoscha raised: I'm starting to investigate having a cluster of machines available to run other data services (HDFS, MongoDB, Redis, ActiveMQ, etc.) so we can get good integration test coverage on the connectors themselves. I'm excited to hear about the work JB has done in this area [2], and I'd be building on and learning from it. My goal is to have automated integration tests running against real instances of the data services.

JB has been working with Mesos + Marathon. In addition to those, I'm taking a quick look at Kubernetes and Docker Swarm to see what would be easiest to maintain, what the tradeoffs are, etc. If folks have experience they'd like to share with those tools, or know of other tools worth looking into, I'd be excited to hear about it.

As Dan previously mentioned [3], we would need a cluster available for performance testing anyway, so this just adds more data services to run on a cluster we likely already need.

[1] - https://lists.apache.org/thread.html/0b15378d4b85e55e1e76c3c7ae8933a4de02171442c63f3cec7d1c8b@%3Cdev.beam.apache.org%3E
[2] - https://lists.apache.org/thread.html/7b5e9c4e21f5d3a0698db2edf1422681afd64b40e7eb1e588c49e59d@%3Cdev.beam.apache.org%3E
[3] - https://lists.apache.org/thread.html/ff083837b839cb3b90944d0db998d36aeee76008cec8e42c98490174@%3Cdev.beam.apache.org%3E

Thanks!
Stephen