> > I don't quite understand why exposing it indirectly through a typed
> > interface should be delayed before finalizing the API.
Spark has a long history <https://spark-project.atlassian.net/browse/SPARK-1094> of maintaining binary compatibility in its public APIs. I strongly believe this is one of the things that has made the project successful. Exposing internals that we know are going to change in the primary user-facing API for creating Streaming DataFrames seems directly counter to this goal. I think the argument that "you can do it anyway" fails to capture the expectations of users who probably aren't closely following this discussion. If advanced users want to dig through the code and experiment, great. I hope they report back on what's good and what can be improved. However, if you add the function suggested in the PR to DataStreamReader, you are giving them a bad experience by leaking internals that don't even show up in the published documentation.