Hello,

Sorry if this is a beginner question, but I couldn't find any documentation on this. I'm using PySpark 2.4.3 with Bahir built from git master, and everything seems to work great, thank you for that.

I haven't done any real scaling tests yet, but I was wondering how the flow of data works with Bahir. I have a single DStream created by MQTTUtils.createStream(), and according to my mosquitto logs this creates a single MQTT listener. So, my question is: is that correct, or did I do something wrong? My original plan was to use some DNS trickery to scale beyond what a single machine can deliver over the network. Is that still possible? Basically, I wanted one MQTT subscriber per Spark worker, if that is supported.

Any pointers to documentation or an example would be greatly appreciated.
