Re: DataSourceV2 : Transactional Write support

2019-08-02 Thread Jungtaek Lim
I asked a similar question about end-to-end exactly-once with Kafka, and you're correct: distributed transactions are not supported. Introducing distributed transactions such as "two-phase commit" would require huge changes to the Spark codebase, and the feedback was not positive. What you could try instead is

Re: DataSourceV2 : Transactional Write support

2019-08-02 Thread Matt Cheah
Can we check that the latest staging APIs work for the JDBC use case in a single transactional write? See https://github.com/apache/spark/pull/24798/files#diff-c9d2f9c9d20452939b7c28ebdae0503dR53 But also acknowledge that transactions from a more traditional RDBMS sense tend to have pretty

Re: [Discuss] Follow ANSI SQL on table insertion

2019-08-02 Thread Matt Cheah
I agree that having both modes and letting the user choose the one he/she wants is the best option (I don't see big arguments on this honestly). Once we have this, I don't see big differences on what is the default. What - I think - we still have to work on, is to go ahead with the "strict mode"

Python API for mapGroupsWithState

2019-08-02 Thread Nicholas Chammas
Can someone succinctly describe the challenge in adding the `mapGroupsWithState()` API to PySpark? I was hoping for some suboptimal but nonetheless working solution to be available in Python, as there is with Python UDFs for example, but that doesn't seem to be the case. The JIRA ticket for

DataSourceV2 : Transactional Write support

2019-08-02 Thread Shiv Prashant Sood
All, I understood that DataSourceV2 supports transactional writes and wanted to implement that in the JDBC DataSource V2 connector (PR#25211). I don't see how this is feasible for a JDBC-based connector. The FW suggests that the EXECUTOR send a commit message

Re: Ask for ARM CI for spark

2019-08-02 Thread shane knapp
i'm out of town, but will answer some of your questions next week.
On Fri, Aug 2, 2019 at 2:39 AM bo zhaobo wrote:
> Hi Team,
> Any updates about the CI details? ;-)
> Also, I will also need your kind help about Spark QA test, could any one
> can tell us how to trigger that tests? When?

Re: Recognizing non-code contributions

2019-08-02 Thread Sean Owen
Yes, there's an interesting idea that came up on members@: should there be a status in Spark that doesn't include the commit bit or additional 'rights', but is formally recognized by the PMC? An MVP, VIP, Knight of the Apache Foo project. I don't think any other project does this, but don't think

Re: Ask for ARM CI for spark

2019-08-02 Thread bo zhaobo
Hi Team, any updates about the CI details? ;-) Also, I need your kind help with the Spark QA tests: could anyone tell us how to trigger those tests? When? How? So far, I haven't noticed how it works. Thanks. Best Regards, ZhaoBo