Hi guys,

We are trying to run our pipeline using the direct runner, and the input dataset is a large set of HDFS files (a few hundred GB of data).
The pipeline crashed with an out-of-memory (OOM) error. I then read in the direct runner documentation that the direct runner loads the whole dataset into memory. Is there any way we can avoid this OOM issue?

Regards
-------------------------------------------------------------
Wilson (Xiaoshuang) Wang
Sr. Software Engineer