I would like to know how we should handle the two Kinesis-related modules in
Spark 4.0. Their code is updated very infrequently, and because the
corresponding tests are not continuously executed in any GitHub Actions
pipeline, I think they significantly lack quality assurance. On top
Hello Wenchen,
On Wed, Aug 16, 2023 at 23:33 Wenchen Fan wrote:
> > is there a way to hint to the downstream users on the number of rows
> > expected to write?
>
> It will be very hard to do. Spark pipelines the execution (within shuffle
> boundaries) and we can't predict the number of final
> is there a way to hint to the downstream users on the number of rows
> expected to write?
It will be very hard to do. Spark pipelines the execution (within shuffle
boundaries) and we can't predict the number of final output rows.
On Mon, Aug 7, 2023 at 8:27 PM Steve Loughran wrote:
>
>
> On
赵军