On Thu, Jan 24, 2013 at 9:24 AM, Gael Varoquaux
wrote:
> Yes, there is a massive difference in amount of work and performance when
> you try to replace the Euclidean distance. Amongst other problems, the
> mean is no longer the sum divided by the number of points, but the
> Frechet mean, which re
Hi Ariel,
What I would do, if the data are not too big, is reimplement my own k-means in
about 10 lines and, after you update the centers, normalize them to put them
back on the sphere. I don't think you can say much about convergence, but
it might work in practice.
HTH
Alex
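
The suggestion above (update the centers, then renormalize them onto the sphere) could look roughly like the sketch below. It is only an illustration in plain NumPy, not the sklearn implementation; the function name `spherical_kmeans` and its parameters are made up for this example, and, as noted above, nothing is claimed about convergence.

```python
import numpy as np


def spherical_kmeans(X, n_clusters, n_iter=50, seed=0):
    """Cluster unit vectors by angle (hypothetical helper, not part of sklearn)."""
    rng = np.random.default_rng(seed)
    # Project the input onto the unit sphere in case it is not normalized yet.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Initialize the centers from randomly chosen data points.
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # For unit vectors, the largest dot product means the smallest angle.
        labels = np.argmax(X @ centers.T, axis=1)
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members) == 0:
                continue  # keep the old center for an empty cluster
            mean = members.mean(axis=0)
            norm = np.linalg.norm(mean)
            if norm > 0:
                # Normalize the mean to put the center back on the sphere.
                centers[k] = mean / norm
    # Final assignment against the last set of centers.
    labels = np.argmax(X @ centers.T, axis=1)
    return centers, labels
```

Calling `centers, labels = spherical_kmeans(X, n_clusters=3)` on an `(n_samples, 3)` array returns unit-norm centers and one label per row.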
On Thu, Jan 24, 2013 at 1:24 AM,
Hi Peter,
Thanks for sharing the experience and code. I will try the same.
@Jaques: Thanks for the link. My plan is to use sklearn only. If I have
to use Mahout, the entire project has to be converted to Java. I would
like to accomplish it in Python only!
Best regards
jaganadh
On Wed,
On Thu, Jan 24, 2013 at 12:34:31AM +0100, Andreas Mueller wrote:
> Sorry, custom metrics for K means are not possible at the moment.
Yes, there is a massive difference in amount of work and performance when
you try to replace the Euclidean distance. Amongst other problems, the
mean is no longer th
Hi Ariel.
Sorry, custom metrics for K means are not possible at the moment.
If you wanted to tweak the sklearn implementation, you would have to
look into this file:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cluster/k_means_.py#L413
In particular the function _labels_inert
Hi everyone,
I am interested in using the sklearn implementation of k means to estimate
clusters of unit vectors on the surface of a sphere.
This requires that the distance metric be changed from the current
Euclidean distance metric to angles.
Is there any easy way to achieve that with the curr
On 23.01.2013 20:32, Ronnie Ghose wrote:
> How can _best_score in GridSearchCV be negative? R^2 can only be from
> 0 to -1 ...?
R^2 can also be negative afaik. It is somewhat unstable for small sample
sizes.
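
A quick way to check that R^2 can be negative is to score predictions that are worse than always predicting the mean. The numbers below are made up for illustration; `r2_score` is the metric function from `sklearn.metrics`.

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1.0, 2.0, 3.0])
# Predictions that are worse than always predicting the mean of y_true (2.0).
y_pred = np.array([3.0, 2.0, 1.0])

# R^2 = 1 - SS_res / SS_tot = 1 - 8 / 2 = -3
score = r2_score(y_true, y_pred)
print(score)
```

So a negative best score from a grid search over a regressor simply means the best model still does worse than a constant prediction on the held-out folds.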
How can _best_score in GridSearchCV be negative? R^2 can only be from 0 to
-1 ...?
On 23.01.2013 18:39, Lars Buitinck wrote:
>
> if you want more predictions or something...
> More in detail: OneVsRestClassifier exports an object called
> label_binarizer_, which is used to transform decision function values
> D back to class labels. By default, it picks all the classes for whic
2013/1/23 Andreas Mueller :
> On 23.01.2013 16:47, Philipp Singer wrote:
>> That's what I originally thought, but then I tried it with just using
>> LinearSVC and it magically worked for my sample dataset, really
>> interesting. I think it is working now properly.
> I'm pretty sure it shouldn't.
On 23.01.2013 16:47, Philipp Singer wrote:
> Hey,
>
> That's what I originally thought, but then I tried it with just using
> LinearSVC and it magically worked for my sample dataset, really
> interesting. I think it is working now properly.
I'm pretty sure it shouldn't.
> What I am asking myself
* bug for
On Jan 23, 2013 10:48 AM, "Ronnie Ghose" wrote:
> File a bugbor inadequate validation also?
> On Jan 23, 2013 10:34 AM, "Andreas Mueller"
> wrote:
>
>> Hi Philipp.
>> LinearSVC can not cope with multilabel problems.
>> It seems it is not doing enough input validation.
>> You have to us
File a bugbor inadequate validation also?
On Jan 23, 2013 10:34 AM, "Andreas Mueller"
wrote:
> Hi Philipp.
> LinearSVC can not cope with multilabel problems.
> It seems it is not doing enough input validation.
> You have to use OneVsRestClassifier together with LinearSVC
> to do that afaik.
> Che
Hey,
That's what I originally thought, but then I tried it with just using
LinearSVC and it magically worked for my sample dataset, really
interesting. I think it is working now properly.
What I am asking myself is how exactly the decision is made for the
multilabel prediction. Is there some w
Hi Philipp.
LinearSVC can not cope with multilabel problems.
It seems it is not doing enough input validation.
You have to use OneVsRestClassifier together with LinearSVC
to do that afaik.
Cheers,
Andy
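
For reference, wrapping LinearSVC in OneVsRestClassifier for a multilabel problem could look like the sketch below. It assumes a current scikit-learn release, where a list of label tuples is first turned into an indicator matrix with `MultiLabelBinarizer`; that step is an assumption about the modern API, not something stated in this thread. The documents and labels are toy data invented for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Toy corpus; each sample carries one or more labels.
docs = [
    "python numpy arrays",
    "python pandas dataframes",
    "java hadoop cluster",
    "java mahout hadoop",
    "python hadoop streaming",
]
labels = [(0,), (0,), (1,), (1,), (0, 1)]

# Text features (tf-idf) and a binary indicator matrix for the labels.
X = TfidfVectorizer().fit_transform(docs)
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# One binary LinearSVC is trained per label.
clf = OneVsRestClassifier(LinearSVC()).fit(X, Y)
pred = clf.predict(X)

# Map the indicator rows back to tuples of labels.
predicted_labels = mlb.inverse_transform(pred)
```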
On 23.01.2013 16:27, Philipp Singer wrote:
> Hey guys!
>
> I am currently trying to do multila
Hey guys!
I am currently trying to do multilabel prediction using textual features
(e.g., tfidf).
The number of labels varies per sample: one sample can have just one
label and another can have 10 labels.
I now simply built a list of tuples for my y vector.
So for example:
(19, 8,
Hi Jaganadh,
I once used hadoop to implement grid search / multi-task learning with
hadoop streaming. The setup was fairly simple: I put the serialized
dataset (joblib dump) on HDFS and created an input file - one line for
each parameter setting for grid search. The map script deserialized
the dat
2013/1/23 JAGANADH G
> Hadoop/Dumbo or hadoop
This thread may be of some interest :
http://news.ycombinator.com/item?id=4968609
Regards
J
Hi All,
Has anybody tried using sklearn with Hadoop/Dumbo or Hadoop streaming?
Please share your thoughts and experience.
Best regards
--
JAGANADH G
http://jaganadhg.in
ILUGCBE
http://ilugcbe.org.in
Actually it looks like John opened a pull request for the feature today:
https://github.com/scikit-learn/scikit-learn/pull/1611
I am confused now, too.
Fernando, you want 2 dimensional targets, right? So y is (n_samples, 2)?
This is not possible with the current code afaik.
It should be possible to extend the code but that hasn't been done yet.
hth,
Andy