Hi all, I am currently getting acquainted with Apache Beam as a possible replacement for my current workflow, and was wondering whether Beam can handle it. Currently, my workflow is based entirely on Python asyncio plus some group-by operations, and it consists of the following:
- I have a list of remote directories, and I need to download one file from each of them (the file has the same name in every directory).
- For each of those files, I need to scan the content, which is itself a list of remote file paths.
- For each of those file paths, I need to extract the content into a list of strings.
- I need to do a reduceByKey operation over all the lists extracted above.

To me, it seems suitable. The only thing that concerns me is that I would probably have to drop asyncio. Could anyone advise?

Kind regards,
Marco
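
P.S. In case it helps, here is a minimal sketch of the shape of the workflow described above, using only the standard library. It is a simplification under assumptions, not the real code: `FAKE_REMOTE`, `fetch`, `index.txt`, and the `key,value` line format are all hypothetical stand-ins, and the reduce function (a sum per key) is just an example.

```python
import asyncio
from collections import defaultdict
from functools import reduce

# Hypothetical in-memory stand-in for the remote storage (assumption):
# a real version would do network I/O instead of a dict lookup.
FAKE_REMOTE = {
    "dir_a/index.txt": "dir_a/part1.txt\ndir_a/part2.txt",
    "dir_b/index.txt": "dir_b/part1.txt",
    "dir_a/part1.txt": "apple,1\nbanana,2",
    "dir_a/part2.txt": "apple,3",
    "dir_b/part1.txt": "banana,4\ncherry,5",
}


async def fetch(path):
    """Return the text of a 'remote' file (hypothetical helper)."""
    await asyncio.sleep(0)  # yield to the event loop, simulating I/O
    return FAKE_REMOTE[path]


async def read_lines(path):
    """Download one file and split its content into a list of strings."""
    return (await fetch(path)).splitlines()


async def process_directory(directory, index_name="index.txt"):
    """Read the index file of a directory, then every file it lists."""
    paths = await read_lines(f"{directory}/{index_name}")
    per_file = await asyncio.gather(*(read_lines(p) for p in paths))
    return [line for lines in per_file for line in lines]


def reduce_by_key(pairs, fn):
    """Group (key, value) pairs by key and fold each group with fn."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce(fn, values) for key, values in grouped.items()}


async def run(directories):
    per_dir = await asyncio.gather(*(process_directory(d) for d in directories))
    lines = [line for chunk in per_dir for line in chunk]
    # Assumed line format "key,value"; the real keying logic may differ.
    pairs = []
    for line in lines:
        key, value = line.split(",")
        pairs.append((key, int(value)))
    return reduce_by_key(pairs, lambda a, b: a + b)


result = asyncio.run(run(["dir_a", "dir_b"]))
print(result)
```

The two `asyncio.gather` fan-outs (one per directory, one per listed file) are the part I am unsure how to express in Beam; the final reduce-by-key step looks like the more natural fit.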