from:"Ray"

Re: Introducing English SDK for Apache Spark - Seeking Your Feedback and Contributions

2023-07-03 Thread Gavin Ray

Wow, really neat -- thanks for sharing! On Mon, Jul 3, 2023 at 8:12 PM Gengliang Wang wrote: > Dear Apache Spark community, > > We are delighted to announce the launch of a groundbreaking tool that aims > to make Apache Spark more user-friendly and accessible - the English SDK >

Re: Complexity with the data

2022-05-25 Thread Gavin Ray

Forgot to reply-all last message, whoops. Not very good at email. You need to normalize the CSV with a parser that can escape commas inside of strings. Not sure if Spark has an option for this? On Wed, May 25, 2022 at 4:37 PM Sid wrote: > Thank you so much for your time. > > I have data like

Re: [Spark SQL]: Configuring/Using Spark + Catalyst optimally for read-heavy transactional workloads in JDBC sources?

2022-05-18 Thread Gavin Ray

", "true") > .config("spark.sql.cbo.joinReorder.enabled", "true") > .config("spark.sql.cbo.planStats.enabled", "true") > .config("spark.sql.cbo.starSchemaDetection", "true") If you're running on more recent JDKs, you'l

[SQL] Why does a small two-source JDBC query take ~150-200ms with all optimizations (AQE, CBO, pushdown, Kryo, unsafe) enabled? (v3.4.0-SNAPSHOT)

2022-05-18 Thread Gavin Ray

I did some basic testing of multi-source queries with the most recent Spark: https://github.com/GavinRay97/spark-playground/blob/44a756acaee676a9b0c128466e4ab231a7df8d46/src/main/scala/Application.scala#L46-L115 The output of "spark.time()" surprised me: SELECT p.id, p.name, t.id, t.title FROM

[Spark SQL]: Configuring/Using Spark + Catalyst optimally for read-heavy transactional workloads in JDBC sources?

2022-05-16 Thread Gavin Ray

Hi all, I've not got much experience with Spark, but have been reading the Catalyst and Datasources V2 code/tests to try to get a basic understanding. I'm interested in trying Catalyst's query planner + optimizer for queries spanning one-or-more JDBC sources. Somewhat unusually, I'd like to do

unsubscribe

2022-05-02 Thread Ray Qiu

Re: No SparkR on Mesos?

2016-09-08 Thread ray

Hi, Rodrick, Interesting. SparkR is expected not to work with Mesos due to lack of support for Mesos in some places, and it has not been tested yet. Have you modified the Spark source code yourself? Have you deployed the Spark binary distribution on all slave nodes, and set

How to build Spark with my own version of Hadoop?

2015-07-21 Thread Dogtail Ray

Hi, I have modified some Hadoop code, and want to build Spark with the modified version of Hadoop. Do I need to change the compilation dependency files? If so, how? Many thanks!

Re: init / shutdown for complex map job?

2014-12-28 Thread Ray Melton

A follow-up to the blog cited below was hinted at, per But Wait, There's More ... To keep this post brief, the remainder will be left to a follow-up post. Is this follow-up pending? Is it sort of pending? Did the follow-up happen, but I just couldn't find it on the web? Regards, Ray. On Sun

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-15 Thread Ray

there for almost 1 hour. I guess I can only go with random initialization in KMeans. Thanks again for your help. Ray -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-KMeans-hangs-at-reduceByKey-collectAsMap-tp16413p16530.html Sent from the Apache Spark User List

Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Ray

=1 spark.storage.blockManagerHeartBeatMs=3 --driver-memory 2g --executor-memory 2g --num-executors 100 I am running spark-submit on YARN. The Spark version is 1.1.0, and Hadoop is 2.4.1. Could you please share some comments/insights? Thanks a lot. Ray -- View this message in context

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Ray

observable hanging. Hopefully this provides more information. Thanks. Ray -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-KMeans-hangs-at-reduceByKey-collectAsMap-tp16413p16417.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Ray

see it got 201 executors (as shown below). http://apache-spark-user-list.1001560.n3.nabble.com/file/n16428/spark_core.png http://apache-spark-user-list.1001560.n3.nabble.com/file/n16428/spark_executor.png Thanks. Ray -- View this message in context: http://apache-spark-user-list

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Ray

be an active stage with an incomplete progress bar in the UI. Am I wrong? Thanks, Burak! Ray -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-KMeans-hangs-at-reduceByKey-collectAsMap-tp16413p16438.html Sent from the Apache Spark User List mailing

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Ray

, it just finished quickly~~ In your test on mnist8m, did you use KMeans++ as the initialization mode? How long did it take? Thanks again for your help. Ray -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-KMeans-hangs-at-reduceByKey-collectAsMap-tp16413p16450

Spark cluster spanning multiple data centers

2014-07-23 Thread Ray Qiu

anyone tried this? Thanks, Ray

16 matches