Re: External Spark shuffle service for k8s

2024-04-07 Thread Cheng Pan
-samples/emr-remote-shuffle-service [4] https://github.com/apache/celeborn/issues/2140 Thanks, Cheng Pan > On Apr 6, 2024, at 21:41, Mich Talebzadeh wrote: > > I have seen some older references for shuffle service for k8s, > although it is not clear they are talking about a generic shuff

[DISCUSS] MySQL version support policy

2024-03-24 Thread Cheng Pan
-innovation-and-long-term-support-lts-versions/ [3] https://github.com/apache/spark/pull/45581 [4] https://aws.amazon.com/rds/mysql/ [5] https://learn.microsoft.com/en-us/azure/mysql/concepts-version-policy Thanks, Cheng Pan

[ANNOUNCE] Apache Kyuubi 1.8.1 is available

2024-02-20 Thread Cheng Pan
ank all contributors of the Kyuubi community who made this release possible! Thanks, Cheng Pan, on behalf of Apache Kyuubi community

Re: [Spark on Kubernetes]: Seeking Guidance on Handling Persistent Executor Failures

2024-02-19 Thread Cheng Pan
Spark has long supported a window-based executor failure-tracking mechanism on YARN; SPARK-41210 [1][2] (included in 3.5.0) extended this feature to K8s. [1] https://issues.apache.org/jira/browse/SPARK-41210 [2] https://github.com/apache/spark/pull/38732 Thanks, Cheng Pan
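
As a rough illustration of what "window-based" failure tracking means, here is a minimal Python sketch. It is not Spark's implementation: the class name, method names, and the "more than max_failures" threshold are all assumptions made for this example. The idea is only that failures older than a validity interval stop counting toward the limit. In Spark 3.5.0+ the related settings are, as far as I know, spark.executor.maxNumFailures and spark.executor.failuresValidityInterval; please check the configuration docs for your version.

```python
from collections import deque


class ExecutorFailureWindow:
    """Hypothetical sliding-window failure counter (not Spark's actual class)."""

    def __init__(self, max_failures: int, validity_interval_s: float):
        self.max_failures = max_failures
        self.validity_interval_s = validity_interval_s
        self._failure_times = deque()  # failure timestamps, oldest first

    def record_failure(self, now_s: float) -> None:
        # Remember when an executor failed.
        self._failure_times.append(now_s)

    def recent_failures(self, now_s: float) -> int:
        # Forget failures that are older than the validity interval.
        while (self._failure_times
               and now_s - self._failure_times[0] > self.validity_interval_s):
            self._failure_times.popleft()
        return len(self._failure_times)

    def should_abort(self, now_s: float) -> bool:
        # Assumed rule for this sketch: abort once the number of recent
        # failures exceeds the limit. Spark's exact comparison may differ.
        return self.recent_failures(now_s) > self.max_failures


window = ExecutorFailureWindow(max_failures=2, validity_interval_s=60.0)
for t in (0.0, 10.0, 20.0):
    window.record_failure(t)

abort_early = window.should_abort(30.0)  # three failures inside the window
abort_later = window.should_abort(75.0)  # the two oldest failures have expired
```

With a window, a long-running application is not killed by a handful of failures spread over days, while a burst of failures in a short period still stops it.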

[ANNOUNCE] Apache Kyuubi released 1.8.0

2023-11-06 Thread Cheng Pan
Hi all, The Apache Kyuubi community is pleased to announce that Apache Kyuubi 1.8.0 has been released! Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses. Kyuubi provides a pure SQL gateway through Thrift JDBC/ODBC interface for

[ANNOUNCE] Apache Celeborn(incubating) 0.3.1 available

2023-10-13 Thread Cheng Pan
: https://celeborn.apache.org/ Celeborn Resources: - Issue Management: https://issues.apache.org/jira/projects/CELEBORN - Mailing List: d...@celeborn.apache.org Thanks, Cheng Pan On behalf of the Apache Celeborn(incubating) community

Re: Spark Vulnerabilities

2023-08-14 Thread Cheng Pan
For the Guava case, you may be interested in https://github.com/apache/spark/pull/42493 Thanks, Cheng Pan > On Aug 14, 2023, at 16:50, Sankavi Nagalingam > wrote: > > Hi Team, > We could see there are many dependent vulnerabilities present in the latest > spark-core:3.4.

Re: Spark Multiple Hive Metastore Catalog Support

2023-04-17 Thread Cheng Pan
] https://github.com/apache/kyuubi/tree/master/extensions/spark/kyuubi-spark-connector-hive Thanks, Cheng Pan On Apr 18, 2023 at 00:38:23, Elliot West wrote: > Hi Ankit, > > While not a part of Spark, there is a project called 'WaggleDance' that > can federate multiple Hive m

Re: spark on k8s daemonset collect log

2023-03-14 Thread Cheng Pan
://github.com/apache/spark/pull/38357 Thanks, Cheng Pan On Mar 14, 2023 at 16:36:45, 404 wrote: > hi, all > > Spark runs on k8s, uses daemonset filebeat to collect logs, and writes > them to elasticsearch. The docker logs are in json format, and each line is > a json string. How to m

[ANNOUNCE] Apache Kyuubi released 1.7.0

2023-03-07 Thread Cheng Pan
yuubi.apache.org We would like to thank all contributors of the Kyuubi community who made this release possible! Thanks, Cheng Pan, on behalf of Apache Kyuubi community

Re: The Dataset unit test is much slower than the RDD unit test (in Scala)

2022-11-01 Thread Cheng Pan
://issues.apache.org/jira/browse/SPARK-38138 Thanks, Cheng Pan On Nov 2, 2022 at 00:14:34, Enrico Minack wrote: > Hi Tanin, > > running your test with option "spark.sql.planChangeLog.level" set to > "info" or "warn" (depending on your Spark log level) will sh

Re: Writing Custom Spark Readers and Writers

2022-04-06 Thread Cheng Pan
There are some projects based on Spark DataSource V2 that I hope will help you. https://github.com/datastax/spark-cassandra-connector https://github.com/housepower/spark-clickhouse-connector https://github.com/oracle/spark-oracle https://github.com/pingcap/tispark Thanks, Cheng Pan On Wed, Apr

Re: spark as data warehouse?

2022-03-26 Thread Cheng Pan
test/deployment/engine_share_level.html [2] https://github.com/apache/incubator-kyuubi/discussions/925 Thanks, Cheng Pan --- Thanks, I'll check it out. I have a use case where we want to use dbt as data middling tool. Will it take dbt queries and create the resulting model? I see it supports T

[ANNOUNCE] Release Apache Kyuubi(Incubating) 1.3.0-incubating

2021-09-26 Thread Cheng Pan
Hello Spark Community, The Apache Kyuubi(Incubating) community is pleased to announce that Apache Kyuubi(Incubating) 1.3.0-incubating has been released! Apache Kyuubi(Incubating) is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark