t; no target variable.
>
> PCA will not 'improve' clustering per se but can make it faster.
> You may want to specify what you are actually trying to optimize.
>
>
> On Tue, Aug 9, 2016, 03:23 Rohit Chaddha <rohitchaddha1...@gmail.com>
> wrote:
>
>> I would rather have less feat
you classification, you can then run your
> model again with the smaller set of features.
> The two approaches are quite different: what I'm suggesting involves
> training (supervised learning) in the context of a target function, whereas
> with SVD you are doing unsupervised learning.
>
> O
> >> I know we can reduce dimensions by using PCA, but I think that does not
> >> allow us to understand which of the original factors we are using in the
> >> end.
> >>
> >> - Tony L.
> >>
> >> On Mon, Aug 8, 2016 at 5:12 P
I have a data-set where each data-point has 112 factors.
I want to remove the factors that are not relevant, reducing from these 112 to,
say, 20 factors, and then cluster the data-points using those 20 factors.
How do I do this, and how do I figure out which 20 factors are
useful?
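Since the thread notes there is no target variable, one simple unsupervised filter is to rank features by variance and keep the top k (20 in the question). This is only a sketch of that idea in plain Java, not a Spark API; the toy data and the choice of variance as the relevance score are assumptions:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.IntStream;

public class VarianceSelector {

    // Return the indices of the k features with the highest variance.
    static int[] topKByVariance(double[][] data, int k) {
        int nFeatures = data[0].length;
        double[] variance = new double[nFeatures];
        for (int j = 0; j < nFeatures; j++) {
            double mean = 0;
            for (double[] row : data) mean += row[j];
            mean /= data.length;
            double var = 0;
            for (double[] row : data) var += (row[j] - mean) * (row[j] - mean);
            variance[j] = var / data.length;
        }
        return IntStream.range(0, nFeatures)
                .boxed()
                .sorted(Comparator.comparingDouble((Integer j) -> variance[j]).reversed())
                .limit(k)
                .mapToInt(Integer::intValue)
                .toArray();
    }

    public static void main(String[] args) {
        double[][] data = {
            {1.0, 5.0, 0.1},
            {2.0, 5.0, 0.1},
            {3.0, 5.0, 0.2},
        };
        // Feature 0 varies most, feature 1 is constant.
        System.out.println(Arrays.toString(topKByVariance(data, 2)));
    }
}
```

The returned indices identify which of the original factors were kept, which is the interpretability PCA loses.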
The predict method takes a Vector object.
I am unable to figure out how to build this Spark vector object for getting
predictions from my model.
Does anyone have some code in Java for this?
Thanks
Rohit
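For the Vector question: in Spark, a dense vector can be built with `Vectors.dense`. A minimal sketch, with placeholder feature values; note that the `org.apache.spark.mllib.linalg` and `org.apache.spark.ml.linalg` classes are distinct, and which one `predict` expects depends on the model's package:

```java
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

// Build a dense vector from the feature values (placeholders here);
// the order must match the order the model was trained with.
double[] values = new double[]{0.5, 1.2, 3.4};
Vector features = Vectors.dense(values);

// Then, for an mllib model such as KMeansModel:
// int cluster = model.predict(features);
```

For `ml`-package models there is usually no `predict(Vector)` at all; instead you put the vector in a `Dataset` column named `features` and call `model.transform(dataset)`.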
---
T E S T S
---
Running org.apache.spark.api.java.OptionalSuite
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.052 sec -
in org.apache.spark.api.java.OptionalSuite
Running
I have a custom object called A and a corresponding Dataset.
When I call the datasetA.show() method I get the following:
+---+---+----+------+---+
|id |da |like|values|uid|
+---+---+----+------+---+
|A.toString()...|
Jul 28, 2016 at 11:52 AM, Rohit Chaddha
> <rohitchaddha1...@gmail.com> wrote:
> > Sean,
> >
> > I saw some JIRA tickets and looks like this is still an open bug (rather
> > than an improvement as marked in JIRA).
> >
> > https://issues.apache.org/jira/bro
On Fri, Jul 29, 2016 at 12:06 AM, Rohit Chaddha <rohitchaddha1...@gmail.com>
wrote:
> I am simply trying to do
> session.read().json("file:///C:/data/a.json");
>
> in 2.0.0-preview it was working fine with
> sqlContext.read().json("C:/data/a.json");
>
>
t work? that should certainly be an absolute
> URI with an absolute path. What exactly is your input value for this
> property?
>
> On Thu, Jul 28, 2016 at 11:28 AM, Rohit Chaddha
> <rohitchaddha1...@gmail.com> wrote:
> > Hello Sean,
> >
> > I have tried both f
Thu, Jul 28, 2016 at 10:47 AM, Rohit Chaddha
> <rohitchaddha1...@gmail.com> wrote:
> > I upgraded from 2.0.0-preview to 2.0.0
> > and I started getting the following error
> >
> > Caused by: java.net.URISyntaxException: Relative path in absolute URI:
> >
My bad. Please ignore this question.
I accidentally reverted to sparkContext, which caused the issue.
On Thu, Jul 28, 2016 at 11:36 PM, Rohit Chaddha <rohitchaddha1...@gmail.com>
wrote:
> In Spark 2.0 there is an additional parameter of type ClassTag in the
> broadcast method of the
In Spark 2.0 there is an additional parameter of type ClassTag in the
broadcast method of the sparkContext.
What is this parameter, and how do I broadcast now?
Here is my existing code with 2.0.0-preview:
Broadcast
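On the ClassTag question: the Scala `SparkContext.broadcast` signature exposes a ClassTag when called from Java. Two common ways around it are to go through `JavaSparkContext`, whose `broadcast` needs no tag, or to supply one explicitly. A sketch, assuming the broadcast value is a `HashMap`:

```java
import java.util.HashMap;

import org.apache.spark.SparkContext;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;
import org.apache.spark.sql.SparkSession;

SparkSession session = SparkSession.builder().appName("example").getOrCreate();
HashMap<String, Integer> lookup = new HashMap<>();

// Option 1: JavaSparkContext supplies the ClassTag for you.
JavaSparkContext jsc = new JavaSparkContext(session.sparkContext());
Broadcast<HashMap<String, Integer>> b1 = jsc.broadcast(lookup);

// Option 2: pass a ClassTag explicitly to the Scala SparkContext.
SparkContext sc = session.sparkContext();
Broadcast<HashMap<String, Integer>> b2 =
    sc.broadcast(lookup, scala.reflect.ClassTag$.MODULE$.apply(HashMap.class));
```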
I upgraded from 2.0.0-preview to 2.0.0
and started getting the following error:
Caused by: java.net.URISyntaxException: Relative path in absolute URI:
file:C:/ibm/spark-warehouse
Any ideas how to fix this?
-Rohit
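This error on Windows in 2.0.0 typically comes from the default value of `spark.sql.warehouse.dir`; a commonly reported workaround is to set it to an explicit `file:` URI when building the session. A sketch, with an illustrative path:

```java
import org.apache.spark.sql.SparkSession;

SparkSession session = SparkSession.builder()
    .appName("example")
    .master("local[*]")
    // An absolute file: URI so the Windows drive-letter path parses cleanly.
    .config("spark.sql.warehouse.dir", "file:///C:/tmp/spark-warehouse")
    .getOrCreate();
```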
It is present in mllib but I don't seem to find it in the ml package.
Any suggestions please?
-Rohit
Hi Krishna,
Great, I had no idea about this. I tried your suggestion, using
na.drop(), and got an RMSE of 1.5794048211812495.
Any suggestions on how this can be reduced and the model improved?
Regards,
Rohit
On Mon, Jul 25, 2016 at 4:12 AM, Krishna Sankar wrote:
> Thanks
Great, thanks both of you. I was struggling with this issue as well.
-Rohit
On Mon, Jul 25, 2016 at 4:12 AM, Krishna Sankar wrote:
> Thanks Nick. I also ran into this issue.
> VG, One workaround is to drop the NaN from predictions (df.na.drop()) and
> then use the dataset
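The workaround above, dropping rows with NaN predictions before evaluation, looks roughly like this in Java; the column names and metric are assumptions based on a typical ALS-style regression setup:

```java
import org.apache.spark.ml.evaluation.RegressionEvaluator;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// predictions is the Dataset<Row> returned by model.transform(testData).
// Drop rows containing NaN (e.g. users/items unseen at training time).
Dataset<Row> cleaned = predictions.na().drop();

RegressionEvaluator evaluator = new RegressionEvaluator()
    .setLabelCol("rating")
    .setPredictionCol("prediction")
    .setMetricName("rmse");
double rmse = evaluator.evaluate(cleaned);
```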