Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread John Zhuge
> eleases/spark-release-3-5-1.html
>
> We would like to acknowledge all community members for contributing to this release. This release would not have been possible without you.
>
> Jungtaek Lim
>
> ps. Yikun is helping us through releasing the official docker image for Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.
--
John Zhuge

Re: Introducing Comet, a plugin to accelerate Spark execution via DataFusion and Arrow

2024-02-13 Thread John Zhuge
> https://github.com/apache/arrow-datafusion-comet for more details if you are interested. We'd love to collaborate with people from the open source community who share similar goals.
>
> Thanks,
> Chao
>
> -----
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
--
John Zhuge

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread John Zhuge
n Karau wrote:
> Hi Folks,
>
> I'm continuing my adventures to make Spark on containers party and I was wondering if folks have experience with the different batch scheduler options that they prefer? I was thinking so that we can better support dynamic allocation it might make sense for us to support using different schedulers and I wanted to see if there are any that the community is more interested in?
>
> I know that one of the Spark on Kube operators supports volcano/kube-batch so I was thinking that might be a place I start exploring but also want to be open to other schedulers that folks might be interested in.
>
> Cheers,
>
> Holden :)
>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
--
John Zhuge

Re: Timestamp Difference/operations

2018-10-12 Thread John Zhuge
Yeah, the "-" operator does not seem to be supported; however, you can use the "datediff" function:
In [9]: select datediff(CAST('2000-02-01 12:34:34' AS TIMESTAMP), CAST('2000-01-01 00:00:00' AS TIMESTAMP))
Out[9]:
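
To make the semantics concrete, here is a plain-Python sketch of what a day-level difference like `datediff` computes for the two timestamps in the query above. It assumes `datediff(end, start)` counts whole calendar days and ignores the time of day; the helper names `datediff_days` and `diff_seconds` are hypothetical and not part of Spark. The second helper shows a seconds-level difference, which in Spark SQL is commonly obtained by subtracting `unix_timestamp` values rather than with `datediff`.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # timestamp layout used in the query above


def datediff_days(end: str, start: str) -> int:
    """Whole calendar days from start to end, ignoring time of day.

    Hypothetical helper mirroring the assumed datediff(end, start) semantics.
    """
    end_date = datetime.strptime(end, FMT).date()
    start_date = datetime.strptime(start, FMT).date()
    return (end_date - start_date).days


def diff_seconds(end: str, start: str) -> int:
    """Elapsed seconds between two timestamps (hypothetical helper)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


print(datediff_days("2000-02-01 12:34:34", "2000-01-01 00:00:00"))  # 31
print(diff_seconds("2000-02-01 12:34:34", "2000-01-01 00:00:00"))   # 2723674
```

The day count drops the 12:34:34 time component, which is why a seconds-level calculation is needed when sub-day precision matters.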

Re: Handle BlockMissingException in pyspark

2018-08-06 Thread John Zhuge
BlockMissingException typically indicates that the HDFS file is corrupted. This might be an HDFS issue, so the Hadoop mailing list is a better bet: u...@hadoop.apache.org. Capture the full stack trace in the executor log. If the file still exists, run `hdfs fsck -blockId blk_1233169822_159765693` to determine
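
For anyone scripting this check, a minimal sketch follows. It only wraps the `hdfs fsck -blockId` command quoted above; the function names `fsck_block_command` and `run_fsck` are hypothetical, and the sketch assumes the `hdfs` CLI is on the `PATH` of the machine where it runs. It does not interpret the fsck report.

```python
import shutil
import subprocess
from typing import List, Optional


def fsck_block_command(block_id: str) -> List[str]:
    """Build the `hdfs fsck -blockId <id>` command line (hypothetical helper)."""
    if not block_id.startswith("blk_"):
        raise ValueError("expected a block id such as blk_1233169822_159765693")
    return ["hdfs", "fsck", "-blockId", block_id]


def run_fsck(block_id: str) -> Optional[str]:
    """Run fsck for one block and return its raw output.

    Returns None when the hdfs CLI is not installed on this machine.
    """
    if shutil.which("hdfs") is None:
        return None
    result = subprocess.run(
        fsck_block_command(block_id),
        capture_output=True,
        text=True,
        check=False,  # fsck may exit non-zero; keep the output for inspection
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    # Block id taken from the message above.
    print(" ".join(fsck_block_command("blk_1233169822_159765693")))
```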

Re: Is spark-env.sh sourced by Application Master and Executor for Spark on YARN?

2018-01-03 Thread John Zhuge
> ine (where the app is being submitted from).
>
> On Wed, Jan 3, 2018 at 6:46 PM, John Zhuge <john.zh...@gmail.com> wrote:
> > Thanks Jacek and Marcelo!
> >
> > Any reason it is not sourced? Any security consideration?
> >
> > On Wed, Jan 3, 2018 at 9:59 A

Re: Is spark-env.sh sourced by Application Master and Executor for Spark on YARN?

2018-01-03 Thread John Zhuge
Thanks Jacek and Marcelo! Any reason it is not sourced? Any security consideration?
On Wed, Jan 3, 2018 at 9:59 AM, Marcelo Vanzin <van...@cloudera.com> wrote:
> On Tue, Jan 2, 2018 at 10:57 PM, John Zhuge <jzh...@apache.org> wrote:
> > I am running Spark 2.0.0 and 2.1.

Is spark-env.sh sourced by Application Master and Executor for Spark on YARN?

2018-01-02 Thread John Zhuge
> lustermode. See the YARN-related Spark Properties <https://github.com/apache/spark/blob/master/docs/running-on-yarn.html#spark-properties> for more information.
Does it mean spark-env.sh will not be sourced when starting the AM in cluster mode? Does this paragraph apply to executors as well?
Thanks,
--
John Zhuge
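
As an illustration of the YARN-related properties the quoted documentation points to, the sketch below builds `spark-submit` `--conf` arguments that set environment variables explicitly instead of relying on spark-env.sh. It assumes the documented `spark.yarn.appMasterEnv.[EnvironmentVariableName]` and `spark.executorEnv.[EnvironmentVariableName]` properties; the helper name `yarn_env_conf_args` and the variable `MY_SETTING` are hypothetical.

```python
from typing import Dict, List


def yarn_env_conf_args(env: Dict[str, str]) -> List[str]:
    """Build --conf arguments that pass env vars to the AM and the executors.

    Hypothetical helper; each variable is set once for the YARN Application
    Master and once for the executors.
    """
    args: List[str] = []
    for name, value in sorted(env.items()):
        args += ["--conf", f"spark.yarn.appMasterEnv.{name}={value}"]
        args += ["--conf", f"spark.executorEnv.{name}={value}"]
    return args


# MY_SETTING is a made-up variable name used only for illustration.
print(yarn_env_conf_args({"MY_SETTING": "1"}))
```

The resulting list can be spliced into a `spark-submit --master yarn --deploy-mode cluster ...` invocation.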