Re: Saving a pyspark.ml.feature.PCA model

2016-07-20 Thread Ajinkya Kale
PM Ajinkya Kale <kaleajin...@gmail.com> wrote: > I am using google cloud dataproc which comes with spark 1.6.1. So upgrade > is not really an option. > No way / hack to save the models in spark 1.6.1 ? > > On Tue, Jul 19, 2016 at 8:13 PM Shuai Lin <linshuai2...@gmail.com

Re: Saving a pyspark.ml.feature.PCA model

2016-07-19 Thread Ajinkya Kale
> > https://issues.apache.org/jira/browse/SPARK-13036 > https://github.com/apache/spark/commit/83302c3b > > so i guess you need to wait for 2.0 release (or use the current rc4). > > On Wed, Jul 20, 2016 at 6:54 AM, Ajinkya Kale <kaleajin...@gmail.com> > wrote: > >> Is there

Saving a pyspark.ml.feature.PCA model

2016-07-19 Thread Ajinkya Kale
Is there a way to save a pyspark.ml.feature.PCA model? I know mllib has that, but mllib does not have PCA afaik. How do people do model persistence for inference using the pyspark ml models? Did not find any documentation on model persistence for ml. --ajinkya

Re: installing packages with pyspark

2016-03-19 Thread Ajinkya Kale
mitting-applications.html > > _ > From: Jakob Odersky <ja...@odersky.com> > Sent: Thursday, March 17, 2016 6:40 PM > Subject: Re: installing packages with pyspark > To: Ajinkya Kale <kaleajin...@gmail.com> > Cc: <user@spark.apache.org> > > > Hi, > re

installing packages with pyspark

2016-03-19 Thread Ajinkya Kale
Hi all, I had a couple of questions. 1. Is there documentation on how to add graphframes, or any other package for that matter, on the Google Dataproc managed Spark clusters? 2. Is there a way to add a package to an existing pyspark context through a Jupyter notebook? --aj
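For the second question, one commonly used approach (not confirmed anywhere in this thread) is to set the PYSPARK_SUBMIT_ARGS environment variable before the SparkContext is created. It cannot change a context that is already running, so the notebook kernel would need a restart first. The sketch below only builds and sets that variable. The helper name set_pyspark_packages is hypothetical, and the graphframes coordinate is an assumption that has to match the Spark version on the cluster.

```python
import os


def set_pyspark_packages(packages, environ=os.environ):
    """Set PYSPARK_SUBMIT_ARGS so the next SparkContext launch resolves `packages`.

    `packages` is a list of Maven coordinates (group:artifact:version).
    The trailing "pyspark-shell" token is required by PySpark's launcher.
    """
    submit_args = "--packages {} pyspark-shell".format(",".join(packages))
    environ["PYSPARK_SUBMIT_ARGS"] = submit_args
    return submit_args


# Assumed coordinate, for illustration only; pick the build matching your Spark.
args = set_pyspark_packages(["graphframes:graphframes:0.1.0-spark1.6"])
print(args)
```

This has to run in the notebook before the first SparkContext is constructed. On a managed cluster the same --packages value can usually be passed through the cluster's own Spark properties instead.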

Re: Logistic Regression using ML Pipeline

2016-02-19 Thread Ajinkya Kale
Please take a look at the example here http://spark.apache.org/docs/latest/ml-guide.html#example-pipeline On Thu, Feb 18, 2016 at 9:27 PM Arunkumar Pillai wrote: > Hi > > I'm trying to build logistic regression using ML Pipeline > > val lr = new LogisticRegression() >

Reading multiple avro files from a dir - Spark 1.5.1

2016-01-29 Thread Ajinkya Kale
Trying to load avro from hdfs. I have around 1000 part avro files in a dir. I am using this to read them - val df = sqlContext.read.format("com.databricks.spark.avro").load("path/to/avro/dir") df.select("QUERY").take(50).foreach(println) It works if I pass only 1 or 2 avro files in the

Re: HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-22 Thread Ajinkya Kale
I tried --jars which supposedly does that but that did not work. On Fri, Jan 22, 2016 at 4:33 PM Ajinkya Kale <kaleajin...@gmail.com> wrote: > Hi Ted, > Is there a way for the executors to have the hbase-protocol jar on their > classpath ? > > On Fri, Jan 22, 2016 at 4

Re: HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-22 Thread Ajinkya Kale
Is this issue only when the computations are in distributed mode? If I do (pseudo code): rdd.collect.call_to_hbase I don't get this error, but if I do: rdd.call_to_hbase.collect it throws this error. On Wed, Jan 20, 2016 at 6:50 PM Ajinkya Kale <kaleajin...@gmail.com> wrote: > Unfo

Re: HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-22 Thread Ajinkya Kale
Hi Ted, Is there a way for the executors to have the hbase-protocol jar on their classpath ? On Fri, Jan 22, 2016 at 4:00 PM Ted Yu <yuzhih...@gmail.com> wrote: > The class path formations on driver and executors are different. > > Cheers > > On Fri, Jan 22, 2016 at
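Spark does provide spark.executor.extraClassPath and spark.driver.extraClassPath settings that prepend entries to each JVM's classpath, and these are a common way to put a jar such as hbase-protocol in front of the executors. Whether this resolves the issue in this thread is not stated here. The sketch below only assembles a spark-submit argument list. The helper name, the jar path, the class name, and the application jar are all hypothetical, and the jar is assumed to exist at the same path on every node.

```python
def spark_submit_command(app_jar, main_class, hbase_protocol_jar):
    """Build a spark-submit argument list for yarn-cluster mode.

    `hbase_protocol_jar` is a local path that must already exist on every
    node, because extraClassPath entries are not shipped by Spark.
    """
    return [
        "spark-submit",
        "--master", "yarn-cluster",
        "--class", main_class,
        # Prepend the jar to the driver and executor JVM classpaths.
        "--conf", "spark.driver.extraClassPath=" + hbase_protocol_jar,
        "--conf", "spark.executor.extraClassPath=" + hbase_protocol_jar,
        app_jar,
    ]


# Hypothetical values, for illustration only.
cmd = spark_submit_command(
    app_jar="my-app.jar",
    main_class="com.example.HBaseRead",
    hbase_protocol_jar="/opt/hbase/lib/hbase-protocol-0.98.0-hadoop2.jar",
)
print(" ".join(cmd))
```

The list form can be handed to subprocess.call directly, which avoids shell quoting problems with the --conf values.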

Re: HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-20 Thread Ajinkya Kale
Unfortunately I cannot at this moment (not a decision I can make) :( On Wed, Jan 20, 2016 at 6:46 PM Ted Yu <yuzhih...@gmail.com> wrote: > I am not aware of a workaround. > > Can you upgrade to 0.98.4+ release ? > > Cheers > > On Wed, Jan 20, 2016 at 6:26 PM, Ajinkya

HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-20 Thread Ajinkya Kale
I have posted this on the hbase user list but I thought it makes more sense on the spark user list. I am able to read the table in yarn-client mode from spark-shell but I have exhausted all online forums for options to get it working in the yarn-cluster mode through spark-submit. I am using this

Re: HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-20 Thread Ajinkya Kale
version and try again. > > If still there is problem, please pastebin the stack trace. > > Thanks > > On Wed, Jan 20, 2016 at 5:41 PM, Ajinkya Kale <kaleajin...@gmail.com> > wrote: > >> >> I have posted this on hbase user list but i thought makes more sense on