Hi Ryan,
Thanks for the pointers
On Thu, Nov 7, 2019 at 8:13 AM Ryan Blue wrote:
> Hi Andrew,
>
> This is expected behavior for DSv2 in 2.4. A separate reader is configured
> for each operation because the configuration will change. A count, for
> example, doesn't need to project any columns,
Hi Andrew,
This is expected behavior for DSv2 in 2.4. A separate reader is configured
for each operation because the configuration will change. A count, for
example, doesn't need to project any columns, but a count distinct will.
Similarly, if your read has different filters we need to apply
Hello,
During testing of our DSv2 implementation (on 2.4.3 FWIW), it appears that
our DataSourceReader is being instantiated multiple times for the same
dataframe. For example, the following snippet
Dataset df = spark
.read()