Hi there,
I'm currently wondering whether we should do a little cleanup in the
non-distributed recommenders package and throw out recommenders that
have not been used or asked about on the mailing list, or that have
been replaced by a superior implementation.
If anyone reads this and sees a
The tree-based ones are very old and not fast, and were more of an
experiment. I recall a few questions about them but it seemed like
people were really just trying to do clustering, and this is a bad way
to do clustering.
knn is old too, and in a sense spiritually quite similar to ALS. I don't
As a n00b, I am still revolving in the kNN space.
Could you please point me to some details on ALS?
Thanks!
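For anyone in the same spot, the core idea behind ALS can be sketched in a few lines. This is a toy rank-1 version with made-up data, not Mahout's ALSWRFactorizer: the rating matrix is approximated as an outer product u * v, and fixing one factor makes the other a closed-form regularized least-squares solve.

```python
# Hedged sketch of alternating least squares (rank-1, toy data);
# illustrative only, not Mahout's actual implementation.

def als_rank1(ratings, n_users, n_items, lam=0.1, iters=20):
    """ratings: dict mapping (user, item) -> rating, observed cells only."""
    u = [1.0] * n_users
    v = [1.0] * n_items
    for _ in range(iters):
        for i in range(n_users):  # fix v, solve each u[i] in closed form
            obs = [(j, r) for (ui, j), r in ratings.items() if ui == i]
            if obs:
                u[i] = (sum(r * v[j] for j, r in obs)
                        / (lam + sum(v[j] ** 2 for j, _ in obs)))
        for j in range(n_items):  # fix u, solve each v[j] the same way
            obs = [(i, r) for (i, ij), r in ratings.items() if ij == j]
            if obs:
                v[j] = (sum(r * u[i] for i, r in obs)
                        / (lam + sum(u[i] ** 2 for i, _ in obs)))
    return u, v

# made-up 3x3 ratings with some cells unobserved
ratings = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0,
           (1, 2): 1.0, (2, 1): 2.0, (2, 2): 4.0}
u, v = als_rank1(ratings, 3, 3)
prediction = u[0] * v[2]  # score for an unobserved (user, item) cell
```

Real implementations use rank k with a k-by-k solve per row, and the weighted-regularization variant (ALSWR) scales lambda by each row's observation count.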
On Thu, Dec 6, 2012 at 10:14 AM, Sean Owen sro...@gmail.com wrote:
Are you speaking specifically about the implementation in the .knn
package, which is a fairly particular thing, or just a k nearest
neighbor approaches in general? The latter aren't going away.
On Thu, Dec 6, 2012 at 3:18 PM, Koobas koo...@gmail.com wrote:
FunkSVD is a suboptimal duplicate of RatingSGDFactorizer, and
ImplicitLinearRegressionFactorizer is a duplicate of ALSWR, so I think
we should only keep one of each.
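For background, the Funk-style SGD training that this family of factorizers implements can be sketched as follows. The hyperparameters, initialization trick, and toy data here are illustrative, not Mahout's actual classes:

```python
import random

# Hedged sketch of Funk-style stochastic gradient descent matrix
# factorization; parameter values and data are made up for illustration.

def sgd_factorize(ratings, n_users, n_items, k=2, lr=0.02, lam=0.05,
                  epochs=300, seed=7):
    rnd = random.Random(seed)
    mean = sum(ratings.values()) / len(ratings)
    init = (mean / k) ** 0.5  # start predictions near the global mean
    p = [[init + rnd.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    q = [[init + rnd.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    data = list(ratings.items())
    for _ in range(epochs):
        rnd.shuffle(data)
        for (i, j), r in data:
            err = r - sum(p[i][f] * q[j][f] for f in range(k))
            for f in range(k):  # gradient step on both factor rows
                pi, qj = p[i][f], q[j][f]
                p[i][f] += lr * (err * qj - lam * pi)
                q[j][f] += lr * (err * pi - lam * qj)
    return p, q

ratings_sgd = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0,
               (1, 2): 1.0, (2, 1): 2.0, (2, 2): 4.0}
p, q = sgd_factorize(ratings_sgd, 3, 3)
```

The update rule is the same objective ALS minimizes, just optimized one observed rating at a time instead of one factor matrix at a time, which is why the two produce largely interchangeable models.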
The other three recommenders seem to be used almost never, so I'd like
to remove them; however, I wouldn't have a problem with keeping
On Thu, Dec 6, 2012 at 10:20 AM, Sean Owen sro...@gmail.com wrote:
kNN in general.
Glad to hear it
I took the plunge and rendered a few plots in R showing how the
parameters of streaming-k-means evolve.
Here's the link [1].
[1] https://github.com/dfilimon/knn/wiki/skm-visualization
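For readers following along, the streaming step being visualized can be sketched roughly like this. It is a hedged sketch of the general streaming k-means idea (each point either joins its nearest centroid or, with probability proportional to its distance, spawns a new one; when too many centroids accumulate, the distance cutoff grows and the centroids are collapsed among themselves). Names and parameters are illustrative, not Mahout's StreamingKMeans API:

```python
import math
import random

def _collapse(centroids, cutoff, rnd):
    # re-cluster the centroids themselves as weighted points
    merged = []
    for c, w in centroids:
        if merged:
            dists = [math.dist(c, m) for m, _ in merged]
            j = min(range(len(merged)), key=lambda t: dists[t])
            if rnd.random() >= min(dists[j] / cutoff, 1.0):
                m, mw = merged[j]  # weighted-average merge
                merged[j] = ([(mi * mw + ci * w) / (mw + w)
                              for mi, ci in zip(m, c)], mw + w)
                continue
        merged.append((list(c), w))
    return merged

def streaming_kmeans(points, max_clusters=10, cutoff=1.0, beta=1.3, seed=0):
    rnd = random.Random(seed)
    centroids = []  # list of (coords, weight)
    for pt in points:
        if centroids:
            dists = [math.dist(pt, c) for c, _ in centroids]
            j = min(range(len(centroids)), key=lambda t: dists[t])
            if rnd.random() >= min(dists[j] / cutoff, 1.0):
                c, w = centroids[j]  # absorb into nearest centroid
                centroids[j] = ([(ci * w + pi) / (w + 1)
                                 for ci, pi in zip(c, pt)], w + 1)
            else:
                centroids.append((list(pt), 1.0))  # spawn a new centroid
        else:
            centroids.append((list(pt), 1.0))
        while len(centroids) > max_clusters:
            cutoff *= beta  # relax the cutoff and merge surrogates
            centroids = _collapse(centroids, cutoff, rnd)
    return centroids

rnd = random.Random(1)
points = ([(rnd.gauss(0, 0.2), rnd.gauss(0, 0.2)) for _ in range(50)]
          + [(rnd.gauss(5, 0.2), rnd.gauss(5, 0.2)) for _ in range(50)])
sketch = streaming_kmeans(points, max_clusters=10)
```

The `max_clusters` cap here plays the role of the adaptive limit on surrogate points discussed below: if it is too tight, the sketch collapses structure that a final batch k-means pass can no longer recover.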
On Thu, Dec 6, 2012 at 2:01 AM, Ted Dunning ted.dunn...@gmail.com wrote:
Still not that odd if several clusters
Yeah... very useful. Clearly the adaptive limit on the number of surrogate
points is much too restrictive.
On Fri, Dec 7, 2012 at 1:21 AM, Dan Filimon dangeorge.fili...@gmail.com wrote:
Deprecating is a nice first step to let people know where things are headed.
On Thu, Dec 6, 2012 at 4:21 PM, Sebastian Schelter s...@apache.org wrote:
Yes, I'm on a project in which we classify a large data set. We do use
MapReduce to do the classification, as the data set is much larger than
the working memory. We have a non-Mahout implementation...
So we put the decision forest in memory via the distributed cache and
partition the data set.
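The pattern described here (small model in memory on every worker, large data split into partitions) can be sketched as follows. The three "stump" trees and the data are made-up stand-ins, not Mahout's DecisionForest API:

```python
# Hedged sketch: a model small enough to hold in memory (as the
# distributed cache makes possible on each mapper), applied to a data
# set processed one partition at a time. Everything here is illustrative.

def load_forest():
    # in a real job this would deserialize the forest from the cache
    return [
        lambda x: 1 if x[0] > 0.5 else 0,
        lambda x: 1 if x[1] > 0.5 else 0,
        lambda x: 1 if x[0] + x[1] > 1.0 else 0,
    ]

def classify(forest, x):
    votes = [tree(x) for tree in forest]
    return 1 if sum(votes) * 2 > len(votes) else 0  # majority vote

def map_partition(forest, partition):
    # one "mapper": model already resident, stream over its data slice
    return [classify(forest, x) for x in partition]

data = [(0.9, 0.8), (0.1, 0.2), (0.9, 0.1), (0.2, 0.9)]
partitions = [data[:2], data[2:]]  # stand-in for input splits
forest = load_forest()
labels = [label for part in partitions
          for label in map_partition(forest, part)]
```

Because the model is read-only and replicated, the mappers need no communication at all, which is what makes this classification step embarrassingly parallel.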
One nice way to do this is to mark the classes in question deprecated
for a few releases, and then remove them on an announced schedule. That
lets any end users know what is coming and gives them time to respond.
On 12/06/2012 10:21 AM, Sebastian Schelter wrote:
Maybe I don't understand your question.
We apply this formula
http://search.apache.org/~doronc/api/org/apache/lucene/search/DefaultSimilarity.html#idf%28int,%20int%29
to frequency.file-0 (the seq2sparse output with term DocFreq).
Just checked the TF vectors and TFIDF vectors; this formula gives me
12 matches
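For reference, the idf formula behind that javadoc link, written out and applied by hand. The tf handling below is a simplification for illustration, not Mahout's exact TFIDF weighting:

```python
import math

# The idf from the linked Lucene DefaultSimilarity javadoc:
#   idf(docFreq, numDocs) = 1 + ln(numDocs / (docFreq + 1))
# applied to the document frequencies that seq2sparse writes out.

def idf(doc_freq, num_docs):
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def tfidf_weight(term_freq, doc_freq, num_docs):
    return term_freq * idf(doc_freq, num_docs)  # simplified tf * idf

# rare terms get larger weights than common ones
rare, common = idf(1, 1000), idf(500, 1000)
```

Note the +1 in the denominator: a term appearing in every document still gets a small positive idf rather than zero, which is one place hand-computed weights often diverge from the library's output.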