OK, here you go: I have created a simple class with a main method (no server or other surrounding code):

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.GenericItemSimilarity;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.impl.similarity.precompute.FileSimilarItemsWriter;
import org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.ItemBasedRecommender;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;
import org.apache.mahout.cf.taste.similarity.precompute.BatchItemSimilarities;

public class RecommenderTest {

    public static void main(String[] args) throws IOException, TasteException {
        DataModel dataModel = new FileDataModel(new File("/Users/najum/Documents/recommender-console/src/main/webapp/resources/preference_csv/1mil.csv"));

        // Recommender that computes similarities on the fly
        ItemSimilarity similarity = new LogLikelihoodSimilarity(dataModel);
        ItemBasedRecommender recommender = new GenericItemBasedRecommender(dataModel, similarity);

        // Precompute the similarities to a file, then load them back
        String pathToPreComputedFile = preComputeSimilarities(recommender, dataModel.getNumItems());
        InputStream inputStream = new FileInputStream(new File(pathToPreComputedFile));
        BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(inputStream));
        Collection<GenericItemSimilarity.ItemItemSimilarity> correlations =
                bufferedReader.lines().map(mapToItemItemSimilarity).collect(Collectors.toList());

        // Recommender that uses the precomputed similarities
        ItemSimilarity precomputedSimilarity = new GenericItemSimilarity(correlations);
        ItemBasedRecommender recommenderWithPrecomputation = new GenericItemBasedRecommender(dataModel, precomputedSimilarity);

        recommend(recommender);
        recommend(recommenderWithPrecomputation);
    }

    private static String preComputeSimilarities(ItemBasedRecommender recommender, int simItemsPerItem) throws TasteException {
        String pathToAbsolutePath = "";
        try {
            File resultFile = new File(System.getProperty("java.io.tmpdir"), "similarities.csv");
            if (resultFile.exists()) {
                resultFile.delete();
            }
            BatchItemSimilarities batchJob = new MultithreadedBatchItemSimilarities(recommender, simItemsPerItem);
            int numSimilarities = batchJob.computeItemSimilarities(
                    Runtime.getRuntime().availableProcessors(), 1, new FileSimilarItemsWriter(resultFile));
            pathToAbsolutePath = resultFile.getAbsolutePath();
            System.out.println("Computed " + numSimilarities + " similarities and saved them to " + pathToAbsolutePath);
        } catch (IOException e) {
            System.out.println("Error while writing pre computed similarities to file");
        }
        return pathToAbsolutePath;
    }

    // Times a single recommend() call for user 1 (top 10 items)
    private static void recommend(ItemBasedRecommender recommender) throws TasteException {
        long start = System.nanoTime();
        List<RecommendedItem> recommendations = recommender.recommend(1, 10);
        long end = System.nanoTime();
        System.out.println("Created recommendations in " + getCalculationTimeInMilliseconds(start, end)
                + " ms. Recommendations:" + recommendations);
    }

    private static double getCalculationTimeInMilliseconds(long start, long end) {
        double calculationTime = (end - start);
        return (calculationTime / 1_000_000);
    }

    // Parses one line "itemID1,itemID2,similarity" of the precomputed file
    private static Function<String, GenericItemSimilarity.ItemItemSimilarity> mapToItemItemSimilarity = (line) -> {
        String[] row = line.split(",");
        return new GenericItemSimilarity.ItemItemSimilarity(
                Long.parseLong(row[0]), Long.parseLong(row[1]), Double.parseDouble(row[2]));
    };
}

And that's the output log:

3 [main] INFO org.apache.mahout.cf.taste.impl.model.file.FileDataModel - Creating FileDataModel for file /Users/najum/Documents/recommender-console/src/main/webapp/resources/preference_csv/1mil.csv
63 [main] INFO org.apache.mahout.cf.taste.impl.model.file.FileDataModel - Reading file info...
1207 [main] INFO org.apache.mahout.cf.taste.impl.model.file.FileDataModel - Processed 1000000 lines
1208 [main] INFO org.apache.mahout.cf.taste.impl.model.file.FileDataModel - Read lines: 1000209
1475 [main] INFO org.apache.mahout.cf.taste.impl.model.GenericDataModel - Processed 6040 users
1599 [main] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - Queued 3706 items in 38 batches
10928 [pool-1-thread-8] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 7 processed 5 batches
10928 [pool-1-thread-8] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 7 processed 5 batches. done.
10978 [pool-1-thread-5] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 4 processed 4 batches. done.
11589 [pool-1-thread-4] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 3 processed 5 batches
11589 [pool-1-thread-4] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 3 processed 5 batches. done.
11592 [pool-1-thread-6] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 5 processed 5 batches
11592 [pool-1-thread-6] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 5 processed 5 batches. done.
11707 [pool-1-thread-7] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 6 processed 5 batches
11707 [pool-1-thread-7] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 6 processed 5 batches. done.
11730 [pool-1-thread-3] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 2 processed 4 batches. done.
11849 [pool-1-thread-1] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 0 processed 5 batches
11849 [pool-1-thread-1] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 0 processed 5 batches. done.
11854 [pool-1-thread-2] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 1 processed 5 batches
11854 [pool-1-thread-2] INFO org.apache.mahout.cf.taste.impl.similarity.precompute.MultithreadedBatchItemSimilarities - worker 1 processed 5 batches. done.
Computed 9174333 similarities and saved them to /var/folders/9g/4h38v1tj3ps9j21skc72b56r0000gn/T/similarities.csv
Created recommendations in 1683.613 ms. Recommendations:[RecommendedItem[item:3890, value:4.6771617], RecommendedItem[item:3530, value:4.662509], RecommendedItem[item:127, value:4.660716], RecommendedItem[item:3323, value:4.660716], RecommendedItem[item:3382, value:4.660716], RecommendedItem[item:3123, value:4.603366], RecommendedItem[item:3233, value:4.5707765], RecommendedItem[item:1434, value:4.553473], RecommendedItem[item:989, value:4.5263577], RecommendedItem[item:2343, value:4.524066]]
Created recommendations in 985.679 ms. Recommendations:[RecommendedItem[item:3530, value:5.0], RecommendedItem[item:3382, value:5.0], RecommendedItem[item:3890, value:4.6771617], RecommendedItem[item:127, value:4.660716], RecommendedItem[item:3323, value:4.660716], RecommendedItem[item:3123, value:4.603366], RecommendedItem[item:3233, value:4.5707765], RecommendedItem[item:1434, value:4.553473], RecommendedItem[item:989, value:4.5263577], RecommendedItem[item:2343, value:4.524066]]

Again, almost the same results. What I also don't understand is why I am getting different recommended items. That really frustrates me… You can find the Java file in the attachment.
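One way to narrow down the difference between the two recommendation lists in the log above is to compare them item by item. The following is only a minimal sketch with no Mahout dependency: the class `RecommendationDiff` and its methods `toMap`, `diff`, `onTheFly` and `precomputed` are hypothetical names introduced for illustration, and the item IDs and values are copied by hand from the two "Recommendations:" lines in the log.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Mahout: compares two recommendation lists.
public class RecommendationDiff {

    // Builds an itemId -> estimated value map, keeping the ranking order.
    public static Map<Long, Double> toMap(long[] ids, double[] values) {
        Map<Long, Double> map = new LinkedHashMap<>();
        for (int i = 0; i < ids.length; i++) {
            map.put(ids[i], values[i]);
        }
        return map;
    }

    // Returns itemId -> {valueInFirst, valueInSecond} for every item whose value
    // differs or that is missing from one list (the missing side is NaN).
    public static Map<Long, double[]> diff(Map<Long, Double> first, Map<Long, Double> second) {
        Map<Long, double[]> result = new TreeMap<>();
        for (Map.Entry<Long, Double> e : first.entrySet()) {
            Double other = second.get(e.getKey());
            if (other == null) {
                result.put(e.getKey(), new double[] {e.getValue(), Double.NaN});
            } else if (Double.compare(e.getValue(), other) != 0) {
                result.put(e.getKey(), new double[] {e.getValue(), other});
            }
        }
        for (Map.Entry<Long, Double> e : second.entrySet()) {
            if (!first.containsKey(e.getKey())) {
                result.put(e.getKey(), new double[] {Double.NaN, e.getValue()});
            }
        }
        return result;
    }

    // First list from the log (similarities computed on the fly).
    public static Map<Long, Double> onTheFly() {
        return toMap(
            new long[] {3890, 3530, 127, 3323, 3382, 3123, 3233, 1434, 989, 2343},
            new double[] {4.6771617, 4.662509, 4.660716, 4.660716, 4.660716,
                          4.603366, 4.5707765, 4.553473, 4.5263577, 4.524066});
    }

    // Second list from the log (precomputed similarities).
    public static Map<Long, Double> precomputed() {
        return toMap(
            new long[] {3530, 3382, 3890, 127, 3323, 3123, 3233, 1434, 989, 2343},
            new double[] {5.0, 5.0, 4.6771617, 4.660716, 4.660716,
                          4.603366, 4.5707765, 4.553473, 4.5263577, 4.524066});
    }

    public static void main(String[] args) {
        diff(onTheFly(), precomputed()).forEach((item, v) ->
            System.out.println("item " + item + ": " + v[0] + " vs " + v[1]));
    }
}
```

For the values in the log, both lists contain the same ten items; only the estimated values of items 3382 and 3530 differ (5.0 with the precomputed similarities), which is what changes the ranking. The sketch does not explain why those two estimates differ; it only shows where to look.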
Attachment: RecommenderTest.java
Greetings from Germany,
Najum

On 17.04.2014 at 11:44, Sebastian Schelter <s...@apache.org> wrote:

Yes, just to make sure the problem is in the Mahout code and not in the surrounding environment.