Hi All,
Just wanted to check in to see if anyone has any insight about this
behavior. Any pointers would help.
Thanks,
Rishi
On Fri, Jun 14, 2019 at 7:05 AM Rishi Shah wrote:
> Hi All,
>
> Recently we noticed that countDistinct on a larger dataframe doesn't
> always return the same value. Any
Hi Gabor,
Sure, but DSv2 seems to be undergoing backward-incompatible changes from
Spark 2 -> 3, right? That, combined with the fact that the API is still
pretty new, doesn't instill confidence in its stability (API-wise, I
mean).
Cheers,
Lars
On Fri, Jun 28, 2019 at 4:10 PM Gabor Somogyi
Hi Lars,
DSv2 is already used in production.
As for documentation: since Spark is evolving fast, I would take a look at
how the built-in connectors are implemented.
BR,
G
On Fri, Jun 28, 2019 at 3:52 PM Lars Francke wrote:
> Gabor,
>
> thank you. That is immensely helpful. DataSource v1 it is then. Does
Gabor,
thank you. That is immensely helpful. DataSource v1 it is then. Does that
mean DSv2 is not really for production use yet?
Any idea what the best documentation would be? I'd probably start by
looking at existing code.
Cheers,
Lars
On Fri, Jun 28, 2019 at 1:06 PM Gabor Somogyi
wrote:
>
Hi Lars,
Structured Streaming doesn't support receivers at all, so that
source/sink can't be used.
Data source v2 is under development and is therefore a moving target, so
I suggest implementing it with v1 (unless special features are required
from v2).
Additionally since I've just