I created https://issues.apache.org/jira/browse/SPARK-21123. PR is welcome.
On Thu, Jun 15, 2017 at 10:55 AM, Shixiong(Ryan) Zhu <[email protected]> wrote:

> Good catch. These are file source options. Could you submit a PR to fix
> the doc? Thanks!
>
> On Thu, Jun 15, 2017 at 10:46 AM, Mendelson, Assaf <[email protected]> wrote:
>
>> Hi,
>>
>> I have started to play around with structured streaming and it seems the
>> documentation (structured streaming programming guide) does not match the
>> actual behavior I am seeing.
>>
>> It says in the documentation that maxFilesPerTrigger (as well as
>> latestFirst) are options for the File sink. However, in fact, at least
>> maxFilesPerTrigger does not seem to have any real effect. On the other
>> hand, the streaming source (readStream) which has no documentation for this
>> option, does limit the number of files.
>>
>> This behavior actually makes more sense than the documentation as I
>> expect the file reader to define how to read files rather than the sink
>> (e.g. if I would use a kafka sink or foreach sink, they should still get
>> the same behavior from the reading).
>>
>> Thanks,
>>
>> Assaf.
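For anyone picking up the doc fix, here is a minimal PySpark sketch of the behavior discussed in this thread: maxFilesPerTrigger and latestFirst are set on the file source (readStream), and the sink is configured separately without them. The helper names (limited_file_stream, start_console_query), the JSON format, and the console sink are assumptions chosen for illustration, not anything taken from the programming guide or from SPARK-21123.

```python
def limited_file_stream(spark, input_dir, schema, max_files_per_trigger=1):
    """Build a streaming DataFrame that picks up a capped number of new files per trigger.

    Both options are applied to the file *source* (DataStreamReader).
    """
    return (
        spark.readStream
        .schema(schema)  # file sources expect an explicit schema by default
        .option("maxFilesPerTrigger", max_files_per_trigger)  # source option
        .option("latestFirst", "false")  # source option: process oldest files first
        .json(input_dir)
    )


def start_console_query(stream_df):
    """Start a query on the console sink; no rate-limiting options are set on the sink."""
    return (
        stream_df.writeStream
        .format("console")
        .outputMode("append")
        .start()
    )
```

With an existing SparkSession named spark and a suitable schema, something like start_console_query(limited_file_stream(spark, "/tmp/in", schema, 2)) should show each micro-batch drawing from at most two new files (the path is a placeholder). Because the limit lives on the reader, swapping the console sink for a Kafka or foreach sink would not change how files are picked up, which matches the reasoning in the original report.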
