Hello! I am developing a Spark program that runs in both batch and streaming modes (separately). The two programs are nearly identical, except that their inputs come from different sources. Unfortunately, RDDs and DStreams each define their transformations separately, in their own files, so I end up with two files containing nearly the same code. If I change a transformation in one program, I have to make the exact same change in the other. It would be nice to have a third file that holds all of my transformations, which the batch program and the streaming program could both reference to know what transformations to perform on the data.
Does anyone know a good way of doing this? I want to keep the exact same syntax (......rdd.filter({i:Int=>i*2}.map(.......).....) in this third file. That way, any change I make to the transformations would apply to both the batch AND the streaming processes. I tried a couple of ideas, to no avail. Thanks in advance, Sidd
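One possible way to get this, sketched under some assumptions: write each transformation once as a plain function from RDD to RDD, call it directly in the batch job, and apply it to every micro-batch in the streaming job through `DStream.transform`. The object names `SharedTransformations`, `BatchJob`, and `StreamingJob`, the `pipeline` function, and the filter/map bodies are all hypothetical placeholders, not the actual code from the two programs.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

// Hypothetical "third file": transformations are written once, against RDDs.
object SharedTransformations {
  // Placeholder pipeline; the filter and map bodies stand in for the real logic.
  def pipeline(rdd: RDD[Int]): RDD[Int] =
    rdd
      .filter((i: Int) => i % 2 == 0)
      .map((i: Int) => i * 2)
}

// Batch program: call the shared pipeline directly on the input RDD.
object BatchJob {
  def run(input: RDD[Int]): RDD[Int] =
    SharedTransformations.pipeline(input)
}

// Streaming program: DStream.transform applies an RDD => RDD function
// to the RDD behind each micro-batch, so the same pipeline is reused.
object StreamingJob {
  def run(input: DStream[Int]): DStream[Int] =
    input.transform((rdd: RDD[Int]) => SharedTransformations.pipeline(rdd))
}
```

With this layout, the usual `rdd.filter(...).map(...)` syntax lives in one place and a change there reaches both jobs. One caveat: this only covers transformations that work on one micro-batch at a time. Operations that exist only on DStreams, such as windowing or `updateStateByKey`, would still have to stay in the streaming program.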