Re: [ANNOUNCE] Apache Spark 3.0.3 released

2021-06-25 Thread L. C. Hsieh
Thanks Yi for the work! On 2021/06/25 05:51:38, Yi Wu wrote: > We are happy to announce the availability of Spark 3.0.3! > > Spark 3.0.3 is a maintenance release containing stability fixes. This > release is based on the branch-3.0 maintenance branch of Spark. We strongly > recommend all 3.0

Re: [DISCUSS] Rename hadoop-3.2/hadoop-2.7 profile to hadoop-3/hadoop-2?

2021-06-25 Thread Chao Sun
Thanks all for the feedback! Yes, I agree that we should target this for the Apache Spark 3.3 release. I'll put this aside for now and pick it up again after the 3.2 release is finished. > And maybe the current naming leaves the possibility for a "hadoop-3.5" or something if that needed to be

Re: [DISCUSS] SPIP: Row-level operations in Data Source V2

2021-06-25 Thread huaxin gao
I took a quick look at the PR and it looks like a great feature to have. It provides unified APIs for data sources to perform the commonly used operations easily and efficiently, so users don't have to implement custom extensions on their own. Thanks Anton for the work! On Thu, Jun 24, 2021 at

Re: [ANNOUNCE] Apache Spark 3.0.3 released

2021-06-25 Thread Dongjoon Hyun
Thank you, Yi! On Thu, Jun 24, 2021 at 10:52 PM Yi Wu wrote: > We are happy to announce the availability of Spark 3.0.3! > > Spark 3.0.3 is a maintenance release containing stability fixes. This > release is based on the branch-3.0 maintenance branch of Spark. We strongly > recommend all 3.0

Fail to run benchmark in Github Action

2021-06-25 Thread Kevin Su
Hi all, I tried to run a benchmark test in a GitHub Action in my fork, and I hit the error below. https://github.com/pingsutw/spark/runs/2867617238?check_suite_focus=true java.lang.AssertionError: assertion failed: spark.test.home is not set! 23799

Re: Spark on Kubernetes scheduler variety

2021-06-25 Thread Yikun Jiang
Oops, sorry for the wrong link, it should be: We will also prepare to propose an initial design and POC[3] on a shared branch (based on spark master branch) where we can collaborate on it, so I created the spark-volcano[1] org on GitHub to make it happen. [3]

Lift the limitation of Spark JDBC handling of individual rows with DML

2021-06-25 Thread Mich Talebzadeh
*Challenge* Insert data from a Spark dataframe when one or more columns in the Oracle table rely on derived columns that depend on data in one or more dataframe columns. Standard JDBC from Spark to Oracle does a batch insert of the dataframe into Oracle, *so it cannot handle these derived columns*.