Re: Possible issue with bounded Read translation using SDF

2020-12-18 Thread Steve Niemitz
I think this is actually the same problem I reported with PubsubIO [1], but in the bounded case. The BoundedSourceAsSDFWrapper closes (and then re-creates) the underlying source each time it checkpoints, and the default behavior is to checkpoint very frequently. [1] https://lists.apache.org/thr
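
To illustrate why frequent checkpointing could produce so many source instances, here is a toy model in plain Java. It is not Beam code: `CheckpointSketch`, `ToyReader`, and `readersCreated` are hypothetical names, and the checkpoint interval is an assumed parameter rather than Beam's actual default. It only shows that if a reader is closed and re-created at every checkpoint, the number of readers grows as total records divided by records per checkpoint.

```java
public class CheckpointSketch {

    // Hypothetical stand-in for a reader over a bounded range of records.
    public static final class ToyReader implements AutoCloseable {
        private long position;
        private final long end;

        public ToyReader(long start, long end) {
            this.position = start;
            this.end = end;
        }

        // Moves to the next record; returns false once the range is exhausted.
        public boolean advance() {
            if (position >= end) {
                return false;
            }
            position++;
            return true;
        }

        public long position() {
            return position;
        }

        @Override
        public void close() {
            // A real reader would release file handles or buffers here.
        }
    }

    // Counts how many readers are created when the reader is closed and
    // re-created after every `recordsPerCheckpoint` records (assumed interval).
    public static int readersCreated(long totalRecords, long recordsPerCheckpoint) {
        if (recordsPerCheckpoint <= 0) {
            throw new IllegalArgumentException("recordsPerCheckpoint must be positive");
        }
        int created = 0;
        long resumeFrom = 0;
        while (resumeFrom < totalRecords) {
            // Each checkpoint resumes with a brand-new reader at the saved position.
            try (ToyReader reader = new ToyReader(resumeFrom, totalRecords)) {
                created++;
                long read = 0;
                while (read < recordsPerCheckpoint && reader.advance()) {
                    read++;
                }
                resumeFrom = reader.position();
            }
        }
        return created;
    }

    public static void main(String[] args) {
        // Frequent checkpoints: many short-lived readers.
        System.out.println(readersCreated(1_000_000, 100));
        // A single pass with no intermediate checkpoint: one reader.
        System.out.println(readersCreated(1_000_000, 1_000_000));
    }
}
```

In this model the cost of each re-creation is ignored; in a real source, re-opening a file and seeking to the saved position on every checkpoint would add overhead on top of the instance count.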

Possible issue with bounded Read translation using SDF

2020-12-18 Thread Ismaël Mejía
Hello, I was trying to profile a pipeline using Java's direct runner. It reads ~30 60MB text files (CSV). When I started the profiler, it reported more than 40K instances of TextSource being built, which really surprised me given the small size of the data being processed. I wonder if I found may