Contribution help needed for sub-tasks of an umbrella JIRA - port *.sql tests to improve coverage of Python, Pandas, Scala UDF cases

2019-07-08 Thread Hyukjin Kwon
Hi all,

I am currently working to improve the Python UDF, Pandas UDF and Scala UDF
test cases by integrating our existing *.sql files; see
https://issues.apache.org/jira/browse/SPARK-27921

I would appreciate it if anyone interested in contributing to Spark could
take some sub-tasks. There are too many for me to do alone :-). I am doing
them one by one for now.

I wrote a guide specifically for this umbrella JIRA, so if you follow it
closely, step by step, I think the process itself isn't that difficult.

The most important guideline, which should be followed carefully, is:
> 7. If there are diff, analyze it, file or find the JIRA, skip the tests
> with comments.

Thanks!


disable checkpointing in structured streaming

2019-07-08 Thread Charles vinodh
Hi,

Is it possible to disable checkpointing in Structured Streaming and replace
it with our own checkpointing implementation, where the offsets are saved in
an external database? I looked at the docs, and it seems this is supported in
Spark Streaming (DStreams) but not in Structured Streaming. Is that the case,
and if so, why is this not possible with Structured Streaming? Could this
feature be added in the future?

Thanks,
Charles


Re: Opinions wanted: how much to match PostgreSQL semantics?

2019-07-08 Thread Marco Gaido
Hi Sean,

Thanks for bringing this up. Honestly, my opinion is that Spark should be
fully ANSI SQL compliant. Where ANSI SQL compliance is not an issue, I am
fine with following any other DB. IMHO, we won't reach 100% compliance with
any DB anyway, Postgres in this case (e.g. for decimal operations we
follow SQL Server, and Postgres behaviour would be very hard to match),
so I think it is fine for PMC members to decide for each feature whether it
is worth supporting.

Thanks,
Marco

On Mon, 8 Jul 2019, 20:09 Sean Owen wrote:

> See the particular issue / question at
> https://github.com/apache/spark/pull/24872#issuecomment-509108532 and
> the larger umbrella at
> https://issues.apache.org/jira/browse/SPARK-27764 -- Dongjoon rightly
> suggests this is a broader question.
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Opinions wanted: how much to match PostgreSQL semantics?

2019-07-08 Thread Sean Owen
See the particular issue / question at
https://github.com/apache/spark/pull/24872#issuecomment-509108532 and
the larger umbrella at
https://issues.apache.org/jira/browse/SPARK-27764 -- Dongjoon rightly
suggests this is a broader question.
