Re: Write Spark Connection client application in Go

2023-09-14 Thread bo yang
at’s so cool! Great work y’all :) >> >> On Tue, Sep 12, 2023 at 8:14 PM bo yang wrote: >> >>> Hi Spark Friends, >>> >>> Anyone interested in using Golang to write Spark application? We created >>> a Spark Connect Go Client library >>>

Write Spark Connection client application in Go

2023-09-12 Thread bo yang
Hi Spark Friends, Anyone interested in using Golang to write Spark applications? We created a Spark Connect Go Client library. Would love to hear feedback/thoughts from the community. Please see the quick start guide

Re: [EXTERNAL] Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread bo yang
Interesting discussion here. It looks like Spark does not support configuring a different number of executors in different stages. Would love to see the community come out with such a feature. On Thu, Nov 3, 2022 at 9:10 AM Shay Elbaz wrote: > Thanks again Artemis, I really appreciate it. I have watched

Re: Reverse proxy for Spark UI on Kubernetes

2022-05-17 Thread bo yang
Yes, it should be possible, any interest to work on this together? Need more hands to add more features here :) On Tue, May 17, 2022 at 2:06 PM Holden Karau wrote: > Could we make it do the same sort of history server fallback approach? > > On Tue, May 17, 2022 at 10:41 PM bo ya

Re: Reverse proxy for Spark UI on Kubernetes

2022-05-17 Thread bo yang
is to behave like that Web Application Proxy. It will simplify settings to access Spark UI on Kubernetes. On Mon, May 16, 2022 at 11:46 PM wilson wrote: > what's the advantage of using reverse proxy for spark UI? > > Thanks > > On Tue, May 17, 2022 at 1:47 PM bo yang wrote: >

Re: Reverse proxy for Spark UI on Kubernetes

2022-05-17 Thread bo yang
Thanks Holden :) On Mon, May 16, 2022 at 11:12 PM Holden Karau wrote: > Oh that’s rad > > On Tue, May 17, 2022 at 7:47 AM bo yang wrote: > >> Hi Spark Folks, >> >> I built a web reverse proxy to access Spark UI on Kubernetes (working >> together with >>

Reverse proxy for Spark UI on Kubernetes

2022-05-16 Thread bo yang
Hi Spark Folks, I built a web reverse proxy to access Spark UI on Kubernetes (working together with https://github.com/GoogleCloudPlatform/spark-on-k8s-operator). I want to share it here in case other people have a similar need. The reverse proxy code is here:

Re: One click to run Spark on Kubernetes

2022-02-23 Thread bo yang
> chart to > deploy Spark and some other stuff on K8S? > > On Wed, Feb 23, 2022 at 17:49, bo yang wrote: > >> Hi Sarath, let's follow up offline on this. >> >> On Wed, Feb 23, 2022 at 8:32 AM Sarath Annareddy < >> sarath.annare...@gmail.com> wrote: >>

Re: One click to run Spark on Kubernetes

2022-02-23 Thread bo yang
Hi Sarath, let's follow up offline on this. On Wed, Feb 23, 2022 at 8:32 AM Sarath Annareddy wrote: > Hi bo > > How do we start? > > Is there a plan? Onboarding, Arch/design diagram, tasks lined up etc > > > Thanks > Sarath > > > Sent from my iPhone >

Re: One click to run Spark on Kubernetes

2022-02-23 Thread bo yang
Guidance is appreciated. > > Sarath > > Sent from my iPhone > > On Feb 23, 2022, at 2:01 AM, bo yang wrote: > > Right, normally people start with a simple script, then add more stuff, like > permission and more components. After some time, people want to run th

Re: One click to run Spark on Kubernetes

2022-02-23 Thread bo yang
liable for any monetary damages arising from > such loss, damage or destruction. > > > > > On Wed, 23 Feb 2022 at 04:06, bo yang wrote: > >> Hi Spark Community, >> >> We built an open source tool to deploy and run Spark on Kubernetes with a >> one click

Re: One click to run Spark on Kubernetes

2022-02-22 Thread bo yang
you share link to the source? > > On Wed, Feb 23, 2022 at 6:52, bo yang wrote: > >> We do not have SaaS yet. Now it is an open source project we build in our >> part time, and we welcome more people working together on that. >> >> You could specify cluste

Re: One click to run Spark on Kubernetes

2022-02-22 Thread bo yang
r > about 1 hour. Do you have the SaaS solution for this? I can pay as I did. > > Thanks > > On Wed, Feb 23, 2022 at 12:21 PM bo yang wrote: > >> It is not a standalone spark cluster. In some details, it deploys a Spark >> Operator (https://github.com/GoogleCloudPlatfo

Re: One click to run Spark on Kubernetes

2022-02-22 Thread bo yang
ion of spark? or just the standalone node? > > Thanks > > On Wed, Feb 23, 2022 at 12:06 PM bo yang wrote: > >> Hi Spark Community, >> >> We built an open source tool to deploy and run Spark on Kubernetes with a >> one click command. For example, on AWS, it co

One click to run Spark on Kubernetes

2022-02-22 Thread bo yang
Hi Spark Community, We built an open source tool to deploy and run Spark on Kubernetes with a one click command. For example, on AWS, it could automatically create an EKS cluster, node group, NGINX ingress, and Spark Operator. Then you will be able to use curl or a CLI tool to submit Spark

Anyone interested in Remote Shuffle Service

2020-10-21 Thread bo yang
Hi Spark Users, Uber recently open sourced Remote Shuffle Service ( https://github.com/uber/RemoteShuffleService ). It works with the open source Spark version with no code change needed, and can store shuffle data on machines separate from the Spark executors. Anyone interested to try? Also we

Re: Spark Profiler

2019-03-28 Thread bo yang
Yeah, these options are very valuable. Just to add another option :) We built a JVM profiler (https://github.com/uber-common/jvm-profiler) to monitor and profile Spark applications at large scale (e.g. sending metrics to Kafka / Hive for batch analysis). People could try it as well. On Wed, Mar 27,

Re: Use SQL Script to Write Spark SQL Jobs

2017-06-14 Thread bo yang
> Regards, > > On Wed, 14 Jun 2017 at 04:32, bo yang <bobyan...@gmail.com> wrote: > >> Thanks Benjamin and Ayan for the feedback! You kind of represent two >> group of people who need such script tool or not. Personally I find the >> script is very useful for m

Re: Use SQL Script to Write Spark SQL Jobs

2017-06-13 Thread bo yang
>> interface, such as Talend, SSIS, etc. There is a small amount of scripting >> involved but not too much. I looked at what you are trying to do, and I >> welcome it. This could open up Spark to the masses and shorten development >> times. >> >> Cheers, >> Ben >>

Re: Use SQL Script to Write Spark SQL Jobs

2017-06-12 Thread bo yang
> > On 12-Jun-2017 11:00 AM, "bo yang" <bobyan...@gmail.com> wrote: > >> Hi Guys, >> >> I am writing a small open source project >> <https://github.com/uber/uberscriptquery> to use SQL Script to write >> Spark Jobs. Want to see if ther

Use SQL Script to Write Spark SQL Jobs

2017-06-11 Thread bo yang
Hi Guys, I am writing a small open source project to use SQL Script to write Spark Jobs. Want to see if there are other people interested to use or contribute to this project. The project is called UberScriptQuery (

Re: Kafka segmentation

2016-11-16 Thread bo yang
and whether maxRatePerPartition is set. > > I expect that there is something wrong with backpressure, see e.g. > https://issues.apache.org/jira/browse/SPARK-18371 > > On Wed, Nov 16, 2016 at 5:05 PM, bo yang <bobyan...@gmail.com> wrote: > > I hit similar issue with Spark St
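
A rough way to sanity-check the batch sizes discussed in this thread, assuming the direct (receiver-less) Kafka stream: when spark.streaming.kafka.maxRatePerPartition is set, it caps each partition at that many records per second, so the upper bound for one batch is rate x partitions x batch interval. With spark.streaming.backpressure.enabled the actual size can fall anywhere below that cap. The function name and the sample numbers below are made up for illustration and are not from the thread.

```python
def max_records_per_batch(max_rate_per_partition, num_partitions, batch_interval_seconds):
    """Upper bound on records in one micro-batch.

    Assumes the direct Kafka stream, where maxRatePerPartition is a
    per-partition limit in records per second. Backpressure can only
    lower the effective rate below this cap, never raise it.
    """
    return int(max_rate_per_partition * num_partitions * batch_interval_seconds)


# Hypothetical settings: 1000 records/sec per partition, 8 partitions, 5 s batches.
cap = max_records_per_batch(1000, 8, 5)
print(cap)
```

If observed batches are larger than this bound, the limit is probably not being applied at all, which is a different problem from backpressure choosing a low rate.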

Re: Kafka segmentation

2016-11-16 Thread bo yang
I hit a similar issue with Spark Streaming. The batch size seemed a little random. Sometimes it was large, with many Kafka messages inside the same batch; sometimes it was very small, with just a few messages. Is it possible that was caused by the backpressure implementation in Spark Streaming? On Wed,

Re: Spark SQL . How to enlarge output rows ?

2016-01-27 Thread bo yang
Hi Eli, are you using Python? I see there is a method show(numRows) in Java, but not sure about Python. On Wed, Jan 27, 2016 at 2:39 AM, Akhil Das wrote: > Why would you want to print all rows? You can try the following: > > sqlContext.sql("select day_time from

Re: How to create DataFrame from a binary file?

2015-08-09 Thread bo yang
through Spark SQL: https://www.linkedin.com/pulse/light-weight-self-service-data-query-through-spark-sql-bo-yang Take a look and feel free to let me know if you have any questions. Best, Bo On Sat, Aug 8, 2015 at 1:42 PM, unk1102 umesh.ka...@gmail.com wrote: Hi how do we create DataFrame from a binary

Re: How to create DataFrame from a binary file?

2015-08-09 Thread bo yang
-Weight Self-Service Data Query through Spark SQL: https://www.linkedin.com/pulse/light-weight-self-service-data-query-through-spark-sql-bo-yang Take a look and feel free to let me know if you have any questions. Best, Bo On Sat, Aug 8, 2015 at 1:42 PM, unk1102 umesh.ka...@gmail.com wrote: Hi

Re: Accessing S3 files with s3n://

2015-08-09 Thread bo yang
Hi Akshat, I found an open source library that implements an S3 InputFormat for Hadoop. I then use Spark newAPIHadoopRDD to load data via that S3 InputFormat. The open source library is https://github.com/ATLANTBH/emr-s3-io. It is a little old. I looked inside it and made some changes. Then it

Re: “mapreduce.job.user.classpath.first” for Spark

2015-02-04 Thread bo yang
suggestion is to build Spark by yourself. Anyway, would like to see your update once you figure out the solution. Best wishes! Bo On Wed, Feb 4, 2015 at 4:47 AM, Corey Nolet cjno...@gmail.com wrote: Bo yang- I am using Spark 1.2.0 and undoubtedly there are older Guava classes which are being

Re: “mapreduce.job.user.classpath.first” for Spark

2015-02-03 Thread bo yang
Corey, Which version of Spark do you use? I am using Spark 1.2.0, and guava 15.0. It seems fine. Best, Bo On Tue, Feb 3, 2015 at 8:56 PM, M. Dale medal...@yahoo.com.invalid wrote: Try spark.yarn.user.classpath.first (see https://issues.apache.org/jira/browse/SPARK-2996 - only works for