Re: Apache Mahout 0.6 Released

2012-02-10 Thread Ahmed Abdeen Hamed
Thank you for the wonderful work! Does the new release support built-in MapReduce for the User-based Recommenders? Thanks very much! -Ahmed On Mon, Feb 6, 2012 at 4:19 PM, Shannon Quinn wrote: > Apache Mahout has reached version 0.6. All developers are encouraged to > begin using version 0.6,

GenericBooleanPrefDataModel MapReduce

2012-02-13 Thread Ahmed Abdeen Hamed
Hello, I am trying to write a MapReduce job for a boolean preference model. The book example in listing 3.6 of MiA only supports FileDataModel. That won't work for MapReduce, since the files will be sent to the job itself and key/value pairs will replace this data model. I played some with the API

Support of HBase

2012-02-15 Thread Ahmed Abdeen Hamed
Hello, I learned from the MiA book that Mahout supports recommendations from a database using JDBCDataModel as well as in-memory recommendation. Does Mahout also support HBase natively, the way it does with JDBC? I would appreciate any help, including examples. Thanks very much, -Ahmed

Re: Support of HBase

2012-02-16 Thread Ahmed Abdeen Hamed
Thank you everyone for making this such a live conversation. If you have any examples of how you used HBase + Mahout together, and you would like to share, that would be most appreciated! Thanks again, -Ahmed On Thu, Feb 16, 2012 at 3:48 AM, Stuart Smith wrote: > Hey, > Just to throw my 2 cen

Injecting content into item-item CF

2012-03-06 Thread Ahmed Abdeen Hamed
Hello friends, Is there an example of how you can inject item attributes into an item-item similarity algorithm? Thanks very much, -Ahmed

Cluster-based recommenders

2012-03-12 Thread Ahmed Abdeen Hamed
Hello friends, I am considering the use of a cluster-based recommender for a problem that I am solving. However, what I am trying to do is the mirror image of what the cluster-based recommender does. Specifically, the cluster-based recommender clusters users into groups, then recommends items to each cluster. Would it be possible for me to

Re: Cluster-based recommenders

2012-03-12 Thread Ahmed Abdeen Hamed
This is really great. Thanks so much! -Ahmed On Mon, Mar 12, 2012 at 12:13 PM, Sean Owen wrote: > Sure -- to do this, you simply flip your items and users. Feed item > IDs as user IDs and vice versa. Then you have a system that recommends > users to items, really. And you can use clustering if y
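
A minimal sketch of the "flip your items and users" suggestion above, in plain Java with no Mahout dependency. It assumes the preference data is in the comma-separated `userID,itemID[,preference]` layout that FileDataModel reads; the class and method names (`FlipUsersAndItems`, `flipLine`, `flipAll`) are hypothetical and only for illustration. The flipped lines could then be written to a file and loaded as usual, so that "users" are really items and vice versa.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: swaps the user and item columns of preference lines.
class FlipUsersAndItems {

    // Swap the first two comma-separated fields; any remaining fields
    // (preference value, timestamp) are kept unchanged.
    static String flipLine(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length < 2) {
            throw new IllegalArgumentException("Expected at least userID,itemID: " + line);
        }
        String tmp = fields[0];
        fields[0] = fields[1];
        fields[1] = tmp;
        return String.join(",", fields);
    }

    // Flip every non-blank line of an in-memory data set.
    static List<String> flipAll(List<String> lines) {
        List<String> flipped = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                flipped.add(flipLine(line.trim()));
            }
        }
        return flipped;
    }

    public static void main(String[] args) {
        List<String> prefs = List.of("1,101,5.0", "1,102,3.0", "2,101");
        for (String line : flipAll(prefs)) {
            System.out.println(line);
        }
    }
}
```

Boolean preference data (lines with only two fields) goes through the same code path, since only the first two columns are touched.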

Re: Cluster-based recommenders

2012-03-12 Thread Ahmed Abdeen Hamed
Thanks for the heads up. I will keep that in mind! -Ahmed On Mon, Mar 12, 2012 at 1:49 PM, Ted Dunning wrote: > Be aware that cluster based recommenders almost never perform as well as > user/item based recommenders. > >

Re: Cluster-based recommenders

2012-03-12 Thread Ahmed Abdeen Hamed
I have a question about the TreeClusteringRecommender: Is there a way that you can estimate the number of clusters rather than specifying how many you want? I thought of using the K-means clustering algorithm to do that for me, but it sounds a bit redundant. Is there a more elegant way? Thanks,

Re: Cluster-based recommenders

2012-03-12 Thread Ahmed Abdeen Hamed
This makes perfect sense! Thanks again, -Ahmed On Mon, Mar 12, 2012 at 6:20 PM, Sean Owen wrote: > You can set a threshold rather than a count -- that's about as much as > that bit of code does in this regard. > >

Re: Injecting content into item-item CF

2012-03-13 Thread Ahmed Abdeen Hamed
Hi Sean, I did some reading before writing so I can ask more specific questions. The MiA book has a couple of sections that cover content-based. The move attributes examples make sense. However, it appears to me that the similarity cannot be computed offline. This is because the similarity is dep

Re: Injecting content into item-item CF

2012-03-13 Thread Ahmed Abdeen Hamed
attributes on items to refine > the item-item similarities and uses content attributes on users to help > access those similarities. Often one uses a search engine such as solr to > augment the real-time side of the implementation. > > > On Tue, Mar 13, 2012 at 9:28 AM, Ahmed

Re: Injecting content into item-item CF

2012-03-13 Thread Ahmed Abdeen Hamed
uess at how you are > also adding in something recommender-related accurate? > > Otherwise we may be talking past each other again. > > On Tue, Mar 13, 2012 at 5:35 PM, Ahmed Abdeen Hamed > wrote: > > Thanks Sean and Ted! > > > > Let me explain how I got here

Re: Injecting content into item-item CF

2012-03-13 Thread Ahmed Abdeen Hamed
Sorry if my questions are hard to understand. Let's start all over... Do we have an example that explains the following paragraph in the MiA book? "Or recall that item-based recommenders require some notion of similarity between two given items. This similarity is encapsulated by an ItemSimilar

Edit Distance

2012-03-19 Thread Ahmed Abdeen Hamed
Hello, Does Mahout have support for Edit Distance between two Strings? I looked on the web but can't find anything. Please let me know if it does. Thanks very much, -Ahmed

Re: Edit Distance

2012-03-19 Thread Ahmed Abdeen Hamed
Thanks very much! -Ahmed On Mon, Mar 19, 2012 at 11:46 AM, Sean Owen wrote: > No I don't think that really comes into play in any of the ML algorithms > here. At least I do not recall seeing it. > >
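
Since the reply above suggests Mahout does not ship this, here is a minimal sketch of the classic Levenshtein edit distance in plain Java. It is a standard dynamic-programming formulation with unit costs for insert, delete, and substitute, not anything taken from Mahout; the class name `EditDistance` is hypothetical.

```java
// Hypothetical stand-alone helper; not part of Mahout.
class EditDistance {

    // Levenshtein distance: minimum number of single-character insertions,
    // deletions, and substitutions needed to turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting i characters from a
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            // Reuse the two row buffers instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));
    }
}
```

Keeping only two rows holds memory at O(length of b), which matters if many long strings are compared pairwise.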

Re: Edit Distance

2012-03-19 Thread Ahmed Abdeen Hamed
relevance to you perhaps, but my friend once > did a comparison of string edit distance metrics for name matching > correction. > > > http://www.mendeley.com/research/comparison-string-distance-metrics-namematching-tasks-3/ > > Dawid > > On Mon, Mar 19, 2012 at 4:44 PM

Re: is hadoop necessary for clustering in mahout?

2012-03-22 Thread Ahmed Abdeen Hamed
Hi, I think I can answer this question... Yes, you can run a clustering algorithm on your local machine without using Hadoop. Just include the Mahout JAR files in your classpath and start using it as just another Java library. I am currently experimenting with TreeClusteringRecommender but you ca

Merging similarities from two different approaches

2012-03-22 Thread Ahmed Abdeen Hamed
Hello, I developed a recommender that computes the distance between two items based on content. However, I also need to include the user-item association. But when I do that, I end up having a similarity score from the item-item content-based approach and also another similarity score based o

Re: Merging similarities from two different approaches

2012-03-22 Thread Ahmed Abdeen Hamed
u need any of this at > all, since I'm not sure what the user-item value is to begin with. > That's your output, not an input. > > On Thu, Mar 22, 2012 at 9:18 PM, Ahmed Abdeen Hamed > wrote: > > Hello, > > > > I developed a recommender that computes th

Re: Merging similarities from two different approaches

2012-03-22 Thread Ahmed Abdeen Hamed
You are correct. In a previous post, I inquired about the use of TreeClusteringRecommender, which is based upon a UserSimilarity metric. My question was whether I can use it for ItemSimilarity, and your answer was yes, just feed the itemID as a userID and vice versa, and that's what I am doing in it

Re: Merging similarities from two different approaches

2012-03-23 Thread Ahmed Abdeen Hamed
s in the > ballpark on theoretically sound is to take their product. > > > On Thu, Mar 22, 2012 at 9:48 PM, Ahmed Abdeen Hamed > wrote: > > You are correct. In a previous post, I inquired about the use of > > TreeClusteringRecommender which is based upon a UserSimilarit

Re: Merging similarities from two different approaches

2012-03-26 Thread Ahmed Abdeen Hamed
n't care what it > means. > > > On Fri, Mar 23, 2012 at 1:52 PM, Sean Owen wrote: > >> On Fri, Mar 23, 2012 at 8:33 PM, Ahmed Abdeen Hamed >> wrote: >> > As for merging the scores, I need an OR rule, which translates to the >> > addition. If I used A
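
A small sketch of the two merging rules discussed in this thread, assuming both similarity scores have already been scaled into [0, 1]. The product is the rule suggested in the reply above. For the "OR" behaviour, the sketch swaps in the probabilistic OR (a + b - a*b) rather than plain addition, because a plain sum can exceed 1; that substitution is an assumption of this example, not something stated in the thread. The class and method names are hypothetical.

```java
// Hypothetical helper for combining two similarity scores in [0, 1].
class CombineSimilarities {

    private static void checkRange(double s) {
        if (s < 0.0 || s > 1.0) {
            throw new IllegalArgumentException("Score must be in [0, 1]: " + s);
        }
    }

    // AND-like rule: high only when both scores are high.
    static double product(double a, double b) {
        checkRange(a);
        checkRange(b);
        return a * b;
    }

    // OR-like rule (assumption: probabilistic OR instead of a raw sum),
    // which stays inside [0, 1] and is high when either score is high.
    static double probabilisticOr(double a, double b) {
        checkRange(a);
        checkRange(b);
        return a + b - a * b;
    }

    public static void main(String[] args) {
        double content = 0.5;       // e.g. content-based item-item score
        double collaborative = 0.5; // e.g. score from user-item associations
        System.out.println(product(content, collaborative));
        System.out.println(probabilisticOr(content, collaborative));
    }
}
```

Either rule can be wrapped in a custom similarity implementation; which one is appropriate depends on whether an item pair should need support from both sources or from just one.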

cluster-based recommendation algorithm

2012-03-26 Thread Ahmed Abdeen Hamed
Hello, This might sound trivial, but I have to ask before I spend time on it. Mahout provides a cluster-based recommender called TreeClusteringRecommender. Can other recommenders be implemented with different clustering algorithms (e.g., K-Means) as opposed to TreeClustering? Please let

TrainNewsGroups source code

2012-04-04 Thread Ahmed Abdeen Hamed
Hello, The source code for the TrainNewsGroups classification example has some issues. There are uncommented regions that are still covered in the MiA book, but those regions don't compile. For instance, on line 95, onColon.split(line) uses the onColon object before it is declared. Is t

Re: Combining CF and Content-based recommendations

2012-04-05 Thread Ahmed Abdeen Hamed
Hi Anita, I had a similar question to the list not too long ago. I got very good answers from both Sean and Ted. Please check the archives and if you still have questions feel free to email me. I am sure I will learn something new. Good luck! -Ahmed On Thu, Apr 5, 2012 at 4:20 PM, ameh wrote:

Re: TrainNewsGroups source code

2012-04-06 Thread Ahmed Abdeen Hamed
I didn't send my response to the list, so I am sending it now. Please let me know if you have a fix for me. Thanks very much, -Ahmed On Wed, Apr 4, 2012 at 8:44 PM, Ahmed Abdeen Hamed wrote: > Sorry my message wasn't specific enough. In ch14 of the MiA source code. > there is a Trai

citing mahout

2012-04-08 Thread Ahmed Abdeen Hamed
Hello, Is there a specific format the Mahout developers would like for citing Mahout? Thanks very much, -Ahmed

Re: citing mahout

2012-04-08 Thread Ahmed Abdeen Hamed
machine-learning and data-mining > library}, >Url = {http://mahout.apache.org}, >Bdsk-Url-1 = {http://mahout.apache.org}} > > /Manuel > > On 08.04.2012, at 21:24, Ahmed Abdeen Hamed wrote: > > > Hello, > > > > Is there a specific format the Mahout developers would like for citing > > Mahout? > > > > Thanks very much, > > > > -Ahmed > > -- > Manuel Blechschmidt > Dortustr. 57 > 14467 Potsdam > Mobil: 0173/6322621 > Twitter: http://twitter.com/Manuel_B > >

Re: Naive Bayes implementation using Mahout

2012-04-09 Thread Ahmed Abdeen Hamed
Hello Vikas, I am actually having the same problem with the 20-newsgroups example. I will reduce the directory structure to see if that works. Let me know if you come up with a better solution. Thanks, -Ahmed On Mon, Apr 9, 2012 at 11:13 AM, Vikas wrote: > Hi All, > > I am new to Mahout hence t

Re: Naive Bayes implementation using Mahout

2012-04-09 Thread Ahmed Abdeen Hamed
I actually tried with a much simpler file structure and am still getting FileNotFoundException. Of course the files exist and have the right permissions. Something is strange with this example. -Ahmed On Mon, Apr 9, 2012 at 12:50 PM, Ahmed Abdeen Hamed wrote: > Hello Vikas, > I am actually

Re: Genetic Algorithm

2012-04-23 Thread Ahmed Abdeen Hamed
Hello Paritosh, When I downloaded Mahout's source code I found out that it actually has support for GA. However, I haven't really had a chance to play with it yet. I don't believe you would see any reference to it in the MiA book though. There is also a module on Evolutionary Processes (EP), whic