date:20190328

Re: How to extract data in parallel from RDBMS tables

2019-03-28 Thread Surendra , Manchikanti

Hi Jason, Thanks for your reply, but I am looking for a way to extract all the tables in a database in parallel. On Thu, Mar 28, 2019 at 2:50 PM Jason Nerothin wrote: > Yes. > > If you use the numPartitions option, your max parallelism will be that > number. See also: partitionColumn,
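
One possible way to approach the "all tables at once" requirement, offered only as a sketch: Spark accepts jobs submitted concurrently from separate threads of one application, so the driver can fan out one extraction per table on a thread pool. The helper name `extract_all` and the `extract_table` callable are hypothetical; in a real job that callable might wrap a JDBC read and a write for a single table.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_all(tables, extract_table, max_workers=4):
    """Call extract_table(name) for every table on a thread pool.

    Returns a dict mapping each table name to the callable's result.
    An exception raised by any extraction is re-raised here.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so the extractions overlap in time.
        futures = {name: pool.submit(extract_table, name) for name in tables}
        return {name: future.result() for name, future in futures.items()}


# Stand-in extraction so the sketch runs without a database or Spark.
# With PySpark, extract_table could instead do something like
# spark.read.jdbc(url, name, properties=props).write.parquet(path)  (assumed names).
results = extract_all(["orders", "customers"], str.upper, max_workers=2)
```

The pool size caps how many tables are read at the same time; it is independent of numPartitions, which governs parallelism within a single table read.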

unsubscribe

2019-03-28 Thread Byron Lee

unsubscribe

Re: spark.submit.deployMode: cluster

2019-03-28 Thread Jason Nerothin

Meant this one: https://docs.databricks.com/api/latest/jobs.html On Thu, Mar 28, 2019 at 5:06 PM Pat Ferrel wrote: > Thanks, are you referring to > https://github.com/spark-jobserver/spark-jobserver or the undocumented > REST job server included in Spark? > > > From: Jason Nerothin > Reply:

BLAS library class def not found error

2019-03-28 Thread Serena S Yuan

Hi, I was using the Apache Spark machine learning library in Java (I posted this issue at https://stackoverflow.com/questions/55367722/apache-spark-in-java-machine-learning-com-github-fommil-netlib-f2jblas-dscalf?noredirect=1#comment97464462_55367722), and I had an error while trying to train

Re: spark.submit.deployMode: cluster

2019-03-28 Thread Pat Ferrel

Thanks, are you referring to https://github.com/spark-jobserver/spark-jobserver or the undocumented REST job server included in Spark? From: Jason Nerothin Reply: Jason Nerothin Date: March 28, 2019 at 2:53:05 PM To: Pat Ferrel Cc: Felix Cheung , Marcelo Vanzin , user Subject: Re:

Re: spark.submit.deployMode: cluster

2019-03-28 Thread Jason Nerothin

Check out the Spark Jobs API... it sits behind a REST service... On Thu, Mar 28, 2019 at 12:29 Pat Ferrel wrote: > ;-) > > Great idea. Can you suggest a project? > > Apache PredictionIO uses spark-submit (very ugly) and Apache Mahout only > launches trivially in test apps since most uses are

Re: How to extract data in parallel from RDBMS tables

2019-03-28 Thread Jason Nerothin

Yes. If you use the numPartitions option, your max parallelism will be that number. See also: partitionColumn, lowerBound, and upperBound https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html On Wed, Mar 27, 2019 at 23:06 Surendra , Manchikanti < surendra.manchika...@gmail.com> wrote:
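
To make the options named above concrete: the JDBC source divides the range between lowerBound and upperBound on partitionColumn into numPartitions strides and issues one query per stride. The sketch below approximates that splitting in plain Python. The function name `partition_predicates`, the column `id`, and the connection values are hypothetical, and the exact predicates Spark generates may differ between versions.

```python
def partition_predicates(column, lower_bound, upper_bound, num_partitions):
    """Approximate the per-partition WHERE clauses of a partitioned JDBC read.

    Illustration only, not Spark's own code. Assumes non-negative integer
    bounds and upper_bound - lower_bound >= num_partitions.
    """
    if num_partitions <= 1 or lower_bound >= upper_bound:
        return [None]  # one unfiltered query
    stride = upper_bound // num_partitions - lower_bound // num_partitions
    predicates = []
    current = lower_bound
    for i in range(num_partitions):
        lower = f"{column} >= {current}" if i > 0 else None
        current += stride
        upper = f"{column} < {current}" if i < num_partitions - 1 else None
        if lower is None:
            # First partition is open below and also picks up NULL keys.
            predicates.append(f"{upper} OR {column} IS NULL")
        elif upper is None:
            # Last partition is open above.
            predicates.append(lower)
        else:
            predicates.append(f"{lower} AND {upper}")
    return predicates


# Option names as listed in the Spark JDBC data source documentation;
# the URL and table are made up for this example.
jdbc_options = {
    "url": "jdbc:postgresql://dbhost:5432/sales",
    "dbtable": "orders",
    "partitionColumn": "id",
    "lowerBound": "0",
    "upperBound": "100",
    "numPartitions": "4",
}
# With PySpark available:
#   spark.read.format("jdbc").options(**jdbc_options).load()

predicates = partition_predicates("id", 0, 100, 4)
```

Because the first and last ranges are open-ended, the bounds only set the stride; they do not filter rows out of the table.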

Re: Spark Profiler

2019-03-28 Thread bo yang

Yeah, these options are very valuable. Just adding another option :) We built a JVM profiler (https://github.com/uber-common/jvm-profiler) to monitor and profile Spark applications at large scale (e.g. sending metrics to Kafka / Hive for batch analysis). People could try it as well. On Wed, Mar 27,

Re: spark.submit.deployMode: cluster

2019-03-28 Thread Pat Ferrel

;-) Great idea. Can you suggest a project? Apache PredictionIO uses spark-submit (very ugly) and Apache Mahout only launches trivially in test apps since most uses are as a lib. From: Felix Cheung Reply: Felix Cheung Date: March 28, 2019 at 9:42:31 AM To: Pat Ferrel , Marcelo Vanzin Cc:

Re: Where does the Driver run?

2019-03-28 Thread Pat Ferrel

Thanks for the pointers. We’ll investigate. We have been told that the “Driver” is run in the launching JVM because deployMode = cluster is ignored if spark-submit is not used to launch. You are saying that there is a loophole and if you use one of these client classes there is a way to run part
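
For contrast with an in-process launch: the message above reports being told that cluster deploy mode takes effect only when spark-submit does the launching. A minimal sketch of that launch path, assuming spark-submit is on the PATH; the helper name `build_spark_submit`, the jar path, and the class name are hypothetical.

```python
def build_spark_submit(app_jar, main_class, master="yarn",
                       deploy_mode="cluster", app_args=()):
    """Build the argv list for a spark-submit invocation."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError(f"unknown deploy mode: {deploy_mode}")
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", deploy_mode,
        "--class", main_class,
        app_jar,
        *app_args,
    ]


cmd = build_spark_submit("/opt/jobs/etl.jar", "com.example.EtlJob",
                         app_args=["2019-03-28"])
# To actually launch (requires a Spark installation):
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the argument list separately from running it keeps the command easy to log and inspect before anything is launched.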

Re: spark.submit.deployMode: cluster

2019-03-28 Thread Felix Cheung

If anyone wants to improve docs please create a PR. lol But seriously you might want to explore other projects that manage job submission on top of Spark instead of rolling your own with spark-submit. From: Pat Ferrel Sent: Tuesday, March 26, 2019 2:38 PM

Adaptive query execution and CBO

2019-03-28 Thread Tomasz Krol

I asked this question a while ago on StackOverflow but got no response, so I'm trying here :) What's your experience with using adaptive query execution and CBO? Do you use them enabled together, or separately? Do you experience any issues using them? For example, I've seen that bucketing doesn't work

Re: Where does the Driver run?

2019-03-28 Thread Mich Talebzadeh

Hi, I have explained this in my following LinkedIn article "The Operational Advantages of Spark as a Distributed Processing Framework". An extract: *2) YARN Deployment Modes* The term Deployment mode of

Re: Streaming data out of spark to a Kafka topic

2019-03-28 Thread Mich Talebzadeh

Hi Gabor, I will look at the link and see what it provides. Thanks, Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw *

Re: Where does the Driver run?

2019-03-28 Thread Jianneng Li

Hi Pat, The driver runs in the same JVM as SparkContext. You didn't go into detail about how you "launch" the job (i.e. how the SparkContext is created), so it's hard for me to guess where the driver is. For reference, we've had success launching Spark programmatically to YARN in cluster mode


15 matches
