Re: Documentation
On Fri, Feb 13, 2015 at 9:37 AM, Eugenio Tacchini eugenio.tacch...@gmail.com wrote: If I need to use a classical user-based technique, however, the only alternative is the Taste-oriented code, am I right?
Right.
Still, I can't see how to perform a prediction for a user/item couple, is there a class for that?
Not directly, but I think that you can cobble something simple together.
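For reference, the "cobble something simple together" route could look roughly like the sketch below: a similarity-weighted average of the ratings that a user's neighbours gave to one item. This is plain Java, not Mahout code; the class name `PairPredictor`, the nested-map layout and the choice of a weighted average are all assumptions made for illustration. Separately, Taste's `Recommender` interface declares `estimatePreference(long userID, long itemID)`, which may be worth checking before writing anything by hand.

```java
import java.util.Map;

/** Hypothetical helper, not part of Mahout: predicts one rating from the neighbours' ratings. */
final class PairPredictor {

    /**
     * Similarity-weighted average of the ratings that the neighbours gave to itemId.
     *
     * @param neighbourSims neighbour ID -> similarity to the target user
     * @param ratings       user ID -> (item ID -> rating)
     * @return the predicted rating, or NaN when no neighbour has rated the item
     */
    static double predict(Map<Long, Double> neighbourSims,
                          Map<Long, Map<Long, Double>> ratings,
                          long itemId) {
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (Map.Entry<Long, Double> e : neighbourSims.entrySet()) {
            Map<Long, Double> neighbourRatings = ratings.get(e.getKey());
            if (neighbourRatings == null) {
                continue; // this neighbour has no ratings at all
            }
            Double rating = neighbourRatings.get(itemId);
            if (rating == null) {
                continue; // this neighbour has not rated the item
            }
            weightedSum += e.getValue() * rating;
            weightTotal += Math.abs(e.getValue());
        }
        return weightTotal == 0.0 ? Double.NaN : weightedSum / weightTotal;
    }

    public static void main(String[] args) {
        Map<Long, Double> sims = Map.of(2L, 0.5, 3L, 1.0);
        Map<Long, Map<Long, Double>> ratings = Map.of(
                2L, Map.of(10L, 4.0),
                3L, Map.of(10L, 1.0));
        System.out.println(predict(sims, ratings, 10L)); // prints 2.0
    }
}
```

With the sample data, the prediction is (0.5 * 4 + 1.0 * 1) / 1.5 = 2.0. Comparing such predictions against held-out ratings is what an RMSE evaluation would need.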
Re: How can I manually specify user similarities in the user-based algorithm?
Ok, thanks for your support. Eugenio
2015-02-11 11:54 GMT+01:00 Juanjo Ramos jjar...@gmail.com: Yes. Your approach sounds about right. As far as I know, you cannot just pass Mahout a file of user similarities and have it create a UserSimilarity object, as it can do with the DataModel. When I have done something like that in the past, I had to write my own code to parse the file and load it into memory.
On Wed, Feb 11, 2015 at 10:42 AM, Eugenio Tacchini eugenio.tacch...@gmail.com wrote: Yes, I know I can implement a custom user similarity, but what I want to do is pass to Mahout fixed, pre-computed user similarities that I have already stored in a text file, in the easiest way possible, since I am not a Java programmer. If there is no way to do it, I will implement CustomUserSimilarity just by reading the text file, storing its contents in memory and returning the corresponding similarity. I should make sure the text file is read just once, though. Eugenio
2015-02-11 11:28 GMT+01:00 Juanjo Ramos jjar...@gmail.com: You can create your custom class with your similarity implementation. All you need is for that class to implement the UserSimilarity interface, and then use it in place of PearsonCorrelationSimilarity. That is, instead of
    UserSimilarity similarity = new PearsonCorrelationSimilarity(dm);
write
    UserSimilarity similarity = new CustomUserSimilarity(dm); // CustomUserSimilarity implements UserSimilarity
If the implementation of that CustomUserSimilarity is in C, you may want to look into JNI (Java Native Interface) to call C code from Java. Best, Juanjo.
On Wed, Feb 11, 2015 at 9:48 AM, Eugenio Tacchini eugenio.tacch...@gmail.com wrote: Hello Pat and thanks for your reply, I know that when there are more users than items, item-based normally works better, and I don't assume my similarity metric works better, but for research purposes I have to compare:
- the RMSE produced by a Pearson correlation user-based algorithm
VS
- the RMSE produced by a user-based algorithm where similarities are computed in a completely different and non-standard way (algorithm implemented in C)
so I am looking for a way to assign the user similarities manually; the test will be performed on just a couple of datasets, so it's fine if I have to hard-code the assignment. Eugenio
2015-02-10 23:58 GMT+01:00 Pat Ferrel p...@occamsmachete.com: There are many algorithms in Mahout but not all are equal. Some combinations never perform well even though they are described in Mahout in Action. The combination below is probably not the best. You seem to assume your user similarity metric is better than Mahout’s? Do you have more users or items? If I were you I'd try user or item based recs in Mahout using LLR similarity. It’s always performed best when I’ve compared. I say this partly because I know of no way to do what you ask without writing some code and partly because I bet it will outperform. Also be aware that the only good way to compare completely different recommenders is A/B user testing.
On Feb 10, 2015, at 3:39 AM, Eugenio Tacchini eugenio.tacch...@gmail.com wrote: Hi all, I am new to Mahout but I work with recommender systems. I have just tried to implement a simple user-based recommender:
    DataModel dm = new FileDataModel(new File("data/ratings.dat"));
    UserSimilarity similarity = new PearsonCorrelationSimilarity(dm);
    UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.1, similarity, dm);
    UserBasedRecommender r = new GenericUserBasedRecommender(dm, neighborhood, similarity);
I would like to compare the results of this recommender with another one I implemented using another technology. The only difference between the two algorithms is the way I choose neighbors; since I am not very fluent in Java, instead of implementing the second algorithm in Mahout, I would like to manually specify the neighbors for each user. Is this possible? What is the easiest way to provide an alternative user-user similarity matrix (computed using my algorithm)?
Just to recap: I want to use GenericUserBasedRecommender but provide an alternative user similarity matrix, without reimplementing my similarity algorithm in Java. Basically, if I could import the similarities from a text file it would be great, but other methods are fine as well. Thanks a lot in advance. Eugenio Tacchini
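The approach agreed on in this thread (read the similarity file once, keep it in memory, return the stored value on each lookup) might be sketched as follows. This is plain Java with no Mahout dependency. The class name `FixedSimilarityTable` and the `userA,userB,similarity` line format are assumptions, since the thread does not say how the file is laid out. Returning NaN for a missing pair is also a choice made here, on the understanding that Taste's UserSimilarity contract uses NaN for "unknown".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder for precomputed user-user similarities; not a Mahout class. */
final class FixedSimilarityTable {

    private final Map<Long, Map<Long, Double>> sims = new HashMap<>();

    /** Parses lines of the assumed form "userA,userB,similarity"; blank lines are skipped. */
    static FixedSimilarityTable fromLines(List<String> lines) {
        FixedSimilarityTable table = new FixedSimilarityTable();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split(",");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected userA,userB,similarity but got: " + line);
            }
            long a = Long.parseLong(parts[0].trim());
            long b = Long.parseLong(parts[1].trim());
            double s = Double.parseDouble(parts[2].trim());
            table.put(a, b, s);
            table.put(b, a, s); // store both directions so lookups are symmetric
        }
        return table;
    }

    /** Reads the whole file once; keep the returned object and reuse it for every lookup. */
    static FixedSimilarityTable fromFile(Path path) throws IOException {
        return fromLines(Files.readAllLines(path));
    }

    private void put(long from, long to, double s) {
        sims.computeIfAbsent(from, k -> new HashMap<>()).put(to, s);
    }

    /** Returns the stored similarity, or NaN when the pair is not in the table. */
    double similarity(long user1, long user2) {
        Map<Long, Double> row = sims.get(user1);
        Double s = (row == null) ? null : row.get(user2);
        return (s == null) ? Double.NaN : s;
    }
}
```

A CustomUserSimilarity along the lines Juanjo describes would then presumably build one such table in its constructor and delegate its `userSimilarity(long, long)` method to `similarity(...)`, so the file is never re-read per call.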
Re: How can I manually specify user similarities in the user-based algorithm?
I am trying to add the fixed user similarities in the easiest possible way. This is my starting code (a normal user-based algorithm based on Pearson correlation):
    UserSimilarity similarity = new PearsonCorrelationSimilarity(dm);
    UserNeighborhood neighborhood = new NearestNUserNeighborhood(15, 0.1, similarity, dm);
    UserBasedRecommender r = new GenericUserBasedRecommender(dm, neighborhood, similarity);
I would say my (pseudo) code will be:
    // UserSimilarity similarity = new PearsonCorrelationSimilarity(dm); // I don't need this anymore
    // UserNeighborhood neighborhood = new NearestNUserNeighborhood(15, 0.1, similarity, dm); // I don't need this anymore
    1) Read the similarities from a file ...
    2) Build the neighborhood and similarity objects according to my matrix.
    3) UserBasedRecommender r = new GenericUserBasedRecommender(dm, neighborhood, similarity);
Part 2) is the most difficult. I thought the neighborhood object held, for each user, the list of his neighbors, but from my inspection in Eclipse I see it contains a lot of other information, and the neighbors do not seem to be listed there but rather retrieved using getUserNeighborhood + userSimilarity? I am getting lost here, also because I am almost new to Java. Is there anyone who can give me some hints about this task? Thanks a lot in advance. Eugenio
2015-02-13 18:29 GMT+01:00 Eugenio Tacchini eugenio.tacch...@gmail.com: Ok, thanks for your support. Eugenio
2015-02-11 11:54 GMT+01:00 Juanjo Ramos jjar...@gmail.com: Yes. Your approach sounds about right. As far as I know, you cannot just pass Mahout a file of user similarities and have it create a UserSimilarity object, as it can do with the DataModel. When I have done something like that in the past, I had to write my own code to parse the file and load it into memory.
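On step 2) of the pseudocode above: if the neighborhood is indeed derived on demand from the similarity, as the Eclipse inspection suggests, then the fixed matrix only has to supply the similarities and the neighbors follow from it. The plain-Java sketch below shows that derivation for one user, mirroring the "15 nearest, minimum similarity 0.1" settings of the starting code. The class name `FixedNeighbours` and the map layout are hypothetical, and this is not how Mahout implements NearestNUserNeighborhood internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not a Mahout class: picks neighbours from one row of a fixed similarity matrix. */
final class FixedNeighbours {

    /**
     * Returns up to n neighbour IDs, most similar first,
     * keeping only those whose similarity is at least minSimilarity.
     *
     * @param row other user ID -> similarity to the target user
     */
    static List<Long> nearest(Map<Long, Double> row, int n, double minSimilarity) {
        List<Map.Entry<Long, Double>> candidates = new ArrayList<>();
        for (Map.Entry<Long, Double> e : row.entrySet()) {
            if (!e.getValue().isNaN() && e.getValue() >= minSimilarity) {
                candidates.add(e);
            }
        }
        candidates.sort((x, y) -> {
            int bySim = Double.compare(y.getValue(), x.getValue()); // descending similarity
            return bySim != 0 ? bySim : Long.compare(x.getKey(), y.getKey()); // ties: lower ID first
        });
        List<Long> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, candidates.size()); i++) {
            result.add(candidates.get(i).getKey());
        }
        return result;
    }
}
```

For a row such as {2: 0.9, 3: 0.05, 4: 0.4, 5: 0.9}, `nearest(row, 2, 0.1)` yields users 2 and 5. In practice it may be simpler to keep NearestNUserNeighborhood as it is and only swap in the custom similarity object.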
Re: How can I manually specify user similarities in the user-based algorithm?
On Fri, Feb 13, 2015 at 11:11 AM, Eugenio Tacchini eugenio.tacch...@gmail.com wrote: Is there anyone who can give me some hints about this task? Another way to look at this is to try to wedge this into the item similarity code. There are hooks available in the map-reduce version of item similarity to put an arbitrary user distance in. This only works well if there are sparsity constraints that limit the number of distances that need to be computed, but if it works, it can be really excellent. This would allow you to put your distances in and still use an indicator-based recommender.
Re: Documentation
spark-rowsimilarity will give you a list of similar users (rows in the interaction matrix) using LLR with several downsampling options. This works with rows for input but you can input elements with a little custom code to get exactly the same result.
Let me understand the second part of your question. The recs query is (user id, item id)? So you want both to contribute to the recommendations? This is different than a typical “other people who like this also liked these” type rec set, which is non-personal: the same for every user. If you are asking for something like recs on a product page using the item being viewed as context and the user’s preference history too, the multimodal recommender can do that. But please explain before I go into a long reply.
On Feb 13, 2015, at 9:53 AM, Ted Dunning ted.dunn...@gmail.com wrote: On Fri, Feb 13, 2015 at 9:37 AM, Eugenio Tacchini eugenio.tacch...@gmail.com wrote: If I need to use a classical user-based technique, however, the only alternative is the Taste-oriented code, am I right?
Right.
Still, I can't see how to perform a prediction for a user/item couple, is there a class for that?
Not directly, but I think that you can cobble something simple together.
Re: How can I manually specify user similarities in the user-based algorithm?
If the user - similar users relationship is really fixed for some test this isn’t even a Mahout problem… All you need to do is create a linear combination of all the similar users' preferences and rank accordingly. This produces ranked recs for some “current user”. If you have a record of user preferences and similar users it’s not even a Mahout thing. A DB will do this just fine for a test.
The current code in spark-rowsimilarity will give similar users based on interaction input data using LLR. Adding a custom distance metric to SimilarityAnalysis.rowSimilarity should be pretty easy. So you have several ways to go using new code or old Taste code. To make it work generally you’ll have to write some code since your metric is really new.
On Feb 13, 2015, at 11:14 AM, Ted Dunning ted.dunn...@gmail.com wrote: On Fri, Feb 13, 2015 at 11:11 AM, Eugenio Tacchini eugenio.tacch...@gmail.com wrote: Is there anyone who can give me some hints about this task?
Another way to look at this is to try to wedge this into the item similarity code. There are hooks available in the map-reduce version of item similarity to put an arbitrary user distance in. This only works well if there are sparsity constraints that limit the number of distances that need to be computed, but if it works, it can be really excellent. This would allow you to put your distances in and still use an indicator-based recommender.
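The "linear combination of all the similar users' preferences" idea can be sketched without Mahout at all. In the plain-Java example below, each candidate item's score is the sum of similarity times preference over the similar users, and items are ranked by that score. The class name `LinearCombinationRanker`, the map layout, the unnormalised sum and the exclusion of items the current user already has are all assumptions made for illustration, not something stated in the thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper, not a Mahout class: ranks items by a similarity-weighted sum of preferences. */
final class LinearCombinationRanker {

    /**
     * @param similarUsers similar user ID -> similarity to the current user
     * @param prefs        user ID -> (item ID -> preference value)
     * @param alreadySeen  items the current user already has, excluded from the result
     * @param howMany      maximum number of item IDs to return
     * @return item IDs ordered by descending score (ties broken by lower item ID)
     */
    static List<Long> rank(Map<Long, Double> similarUsers,
                           Map<Long, Map<Long, Double>> prefs,
                           Set<Long> alreadySeen,
                           int howMany) {
        Map<Long, Double> scores = new HashMap<>();
        for (Map.Entry<Long, Double> user : similarUsers.entrySet()) {
            Map<Long, Double> userPrefs = prefs.get(user.getKey());
            if (userPrefs == null) {
                continue; // no preferences recorded for this similar user
            }
            for (Map.Entry<Long, Double> pref : userPrefs.entrySet()) {
                if (alreadySeen.contains(pref.getKey())) {
                    continue;
                }
                // accumulate similarity * preference for this item
                scores.merge(pref.getKey(), user.getValue() * pref.getValue(), Double::sum);
            }
        }
        List<Long> items = new ArrayList<>(scores.keySet());
        items.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a)); // descending score
            return byScore != 0 ? byScore : Long.compare(a, b);
        });
        return new ArrayList<>(items.subList(0, Math.min(howMany, items.size())));
    }
}
```

The same aggregation is a join plus a GROUP BY in a database, which is presumably what "a DB will do this just fine" refers to.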
SLF4J: Class path contains multiple SLF4J bindings. error when MAHOUT_LOCAL is TRUE
Hi
It looks like this is a common problem, but I haven't figured out how to resolve it in my case. I have not done anything explicitly regarding SLF4J. Both the Hadoop and Mahout environments were set up by simply downloading the jar files, not built locally. Both Hadoop and Mahout have been working fine in pseudo-distributed mode for quite a while...
I am also not sure what information is required, but the environment settings that might relate to this are as follows.
MAHOUT_HOME=/usr/local/mahout-distribution-0.7
MAHOUT_LOCAL=TRUE
CLASS_PATH=/usr/local/hadoop:/usr/local/hadoop/conf:/usr/local/mahout-distribution-0.7/conf
HADOOP_CONF_DIR=/usr/local/hadoop/conf
HADOOP_HOME=/usr/local/hadoop
JAVA_HOME=/usr/java/latest
The only thing I have done to my existing healthy Hadoop/Mahout environment was setting MAHOUT_LOCAL to TRUE.
...
MAHOUT_LOCAL is set, so we don't add HADOOP_CONF_DIR to classpath.
MAHOUT_LOCAL is set, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/mahout-distribution-0.7/mahout-examples-0.7-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/mahout-distribution-0.7/lib/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/mahout-distribution-0.7/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/util/ProgramDriver
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:96)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.ProgramDriver
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 1 more
Regards,
Y.Mandai
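One reading of the log above is that the SLF4J lines are only a warning (SLF4J picks one binding and carries on), while the actual failure is the NoClassDefFoundError for org.apache.hadoop.util.ProgramDriver, meaning the Hadoop classes are not on the classpath in local mode. A small diagnostic along the following lines could help confirm that. It is a sketch using only the JDK; the class name `ClasspathCheck` is made up here, and it has to be run with the same classpath as the failing command to tell you anything.

```java
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

/** Hypothetical diagnostic: reports whether a class is loadable and where a resource is found. */
final class ClasspathCheck {

    /** True when the named class can be found by this class's loader (without initialising it). */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Lists every classpath location that provides the given resource, e.g. a .class file. */
    static List<URL> locations(String resourceName) throws IOException {
        List<URL> found = new ArrayList<>();
        Enumeration<URL> urls = ClassLoader.getSystemResources(resourceName);
        while (urls.hasMoreElements()) {
            found.add(urls.nextElement());
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // The class named in the stack trace above.
        System.out.println("ProgramDriver loadable: "
                + isLoadable("org.apache.hadoop.util.ProgramDriver"));
        // One line per SLF4J binding found; more than one matches the warning in the log.
        for (URL u : locations("org/slf4j/impl/StaticLoggerBinder.class")) {
            System.out.println("SLF4J binding: " + u);
        }
    }
}
```

If the first line prints false, the Hadoop jars are missing from the classpath, which would account for the exception independently of the SLF4J warning.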