> […] for comparing rankings. There are
> ranking metrics like mean average precision that would be appropriate
> instead.
>
> On Wed, Sep 14, 2016 at 9:11 PM, Pasquinell Urbani <
> pasquinell.urb...@exalitica.com> wrote:
>
>> It was a typo; both are RMSE.
>
> If they're on the scale of 1-5, that's extremely poor.
>
> What's RMS vs RMSE?
>
> On Wed, Sep 14, 2016 at 8:33 PM, Pasquinell Urbani
> wrote:
> > Hi Community
> >
> > I'm performing an ALS for retail product recommendation. Right now I'm
> > reaching rms_test = 2.3 and rmse_test = 32.5. […]
Hi Community,
I'm performing ALS for retail product recommendation. Right now I'm
reaching rms_test = 2.3 and rmse_test = 32.5. Is this too high in your
experience? Is the transformation of the rating values important for
getting good errors?
Thank you all.
Pasquinell Urbani
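For context, RMSE is easy to sanity-check outside Spark. A minimal plain-Python sketch (the ratings below are invented): on a 1-5 scale, an RMSE around 2.3 means predictions miss by roughly half the rating range on average.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating lists."""
    assert len(predicted) == len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Hypothetical ratings on a 1-5 scale: predictions that miss each true
# rating by 2.0-2.5 stars yield an RMSE in the same range.
actual = [5.0, 1.0, 3.0, 4.0]
predicted = [2.5, 3.5, 5.0, 2.0]
print(round(rmse(predicted, actual), 3))  # → 2.264
```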
Hi there,
I am building a product recommendation system for retail. I have been
able to compute the TF-IDF of a user-item data frame in Spark 2.0.
Now I need to transform the TF-IDF output into a data frame with columns
(user_id, item_id, TF_IDF_ratings) in order to run ALS, but I have
no clue how to do it. […]
Is there another way?
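The (user_id, item_id, TF_IDF_ratings) triples described here are just term frequency times inverse document frequency per (user, item) pair. A plain-Python sketch of that computation, with invented baskets and the standard tf·idf formula assumed to be what is wanted:

```python
import math
from collections import Counter

# Hypothetical purchase histories: user -> list of items bought.
baskets = {
    "u1": ["milk", "bread", "milk"],
    "u2": ["bread", "beer"],
    "u3": ["milk", "beer", "beer"],
}

def tfidf_triples(baskets):
    """Return (user, item, tf_idf) triples usable as implicit ALS ratings."""
    n_users = len(baskets)
    # Document frequency: in how many users' baskets each item appears.
    df = Counter(item for items in baskets.values() for item in set(items))
    triples = []
    for user, items in baskets.items():
        tf = Counter(items)
        total = len(items)
        for item, count in tf.items():
            idf = math.log(n_users / df[item])
            triples.append((user, item, (count / total) * idf))
    return triples

for t in sorted(tfidf_triples(baskets)):
    print(t)
```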
2016-07-11 18:28 GMT-04:00 Pasquinell Urbani <
pasquinell.urb...@exalitica.com>:
> Hi all,
>
> We have a dataframe with 2.5 million records and 13 features. We want
> to perform a logistic regression with this data, but first we need to divide
> each column into discrete values using QuantileDiscretizer. […]
Hi all,
We have a dataframe with 2.5 million records and 13 features. We want
to perform a logistic regression with this data, but first we need to divide
each column into discrete values using QuantileDiscretizer. This should
improve the model's performance by limiting the influence of outliers.
For small d[…]
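The bucketing itself is easy to illustrate outside Spark. A plain-Python sketch of evenly spaced quantile edges (the numbers are invented, and Spark's QuantileDiscretizer estimates the edges approximately rather than by exact sorting); note how the outlier 100 simply lands in the top bucket instead of stretching the scale:

```python
def quantile_edges(values, num_buckets):
    """Bucket edges at evenly spaced quantiles of the sorted data."""
    s = sorted(values)
    return [s[int(len(s) * k / num_buckets)] for k in range(1, num_buckets)]

def bucketize(value, edges):
    """Index of the first edge strictly greater than value."""
    for i, e in enumerate(edges):
        if value < e:
            return i
    return len(edges)

values = [1, 2, 2, 3, 5, 8, 13, 21, 100]   # 100 is an outlier
edges = quantile_edges(values, 3)
print(edges)                                # → [3, 13]
print([bucketize(v, edges) for v in values])  # → [0, 0, 0, 1, 1, 1, 2, 2, 2]
```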
Hi all,
I need to apply QuantileDiscretizer() over 16 columns of a sql.DataFrame.
What is the most efficient way to apply a function to each column? Do I
need to iterate over the columns? What is the best way to do this?
Thank you all.
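One common answer is to build one transformation per column in a loop. A plain-Python sketch of the per-column pattern (the table, column names, and the min_max_scale stand-in are all invented; in Spark the analogous approach is to build one stage per column and collect the stages into a single Pipeline, rather than transforming the DataFrame 16 separate times):

```python
# A tiny column-oriented table: column name -> list of values.
table = {
    "a": [1.0, 5.0, 9.0],
    "b": [10.0, 20.0, 30.0],
}

def min_max_scale(col):
    """Stand-in per-column transformation (placeholder for a discretizer)."""
    lo, hi = min(col), max(col)
    return [(v - lo) / (hi - lo) for v in col]

# One comprehension applies the same function to every column.
scaled = {name: min_max_scale(col) for name, col in table.items()}
print(scaled)  # → {'a': [0.0, 0.5, 1.0], 'b': [0.0, 0.5, 1.0]}
```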
Hello all,
I have to build an item-based recommendation system. First I obtained the
similarity matrix with Twitter's DIMSUM all-pairs cosine-similarity solution (
https://blog.twitter.com/2014/all-pairs-similarity-via-dimsum). The
similarity matrix is in the following format:
org.apache.spark.rdd.RDD[org.ap[…]
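The entries DIMSUM emits are cosine similarities of column pairs of the user-item matrix. A plain-Python sketch with an invented item-vector map, producing the same sparse upper-triangular (i, j, similarity) shape that columnSimilarities() yields:

```python
import math

# Hypothetical item -> rating-vector map (columns of the user-item matrix).
items = {
    "apples":  [1.0, 0.0, 2.0],
    "oranges": [2.0, 0.0, 4.0],
    "soap":    [0.0, 3.0, 0.0],
}

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Upper-triangular (item_i, item_j, similarity) entries.
names = sorted(items)
entries = [(a, b, cosine(items[a], items[b]))
           for i, a in enumerate(names) for b in names[i + 1:]]
for e in entries:
    print(e)
```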
Hi all,
I'm following a TF-IDF example but I'm having some issues that I'm not
sure how to fix.
The input is the following:
val test = sc.textFile("s3n://.../test_tfidf_products.txt")
test.collect.mkString("\n")
which prints
test: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[370] at tex[…]
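Since the snippet cuts off before the interesting part, here is a plain-Python sketch of the hashing-TF step such TF-IDF examples typically apply after splitting each input line into tokens (the lines and the tiny 16-bucket hash space are invented; Spark's default feature space is far larger):

```python
# Invented file contents: one "document" of product tokens per line.
lines = ["soap soap bread", "bread milk"]

NUM_FEATURES = 16  # small hash space for illustration only

def hashing_tf(tokens, num_features=NUM_FEATURES):
    """Term frequencies folded into a fixed-size vector by hashing."""
    vec = [0] * num_features
    for tok in tokens:
        vec[hash(tok) % num_features] += 1
    return vec

tfs = [hashing_tf(line.split()) for line in lines]
print([sum(v) for v in tfs])  # token counts survive hashing: [3, 2]
```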