If I remember correctly, you have 12M users and 18M interactions. If I interpret the plots correctly, there is one single item that accounts for 8.5M interactions (nearly half of the overall interactions), and more than two thirds of the users like it?
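A quick way to sanity-check those proportions is sketched below. It uses only the rounded figures quoted above and assumes each user contributes at most one interaction to the dominant item. The constant and function names are made up for illustration.

```python
# Rounded figures quoted in the thread; treat them as approximations.
TOTAL_USERS = 12_000_000
TOTAL_INTERACTIONS = 18_000_000
TOP_ITEM_INTERACTIONS = 8_500_000  # the single dominant item read off the plots


def share(part: int, whole: int) -> float:
    """Return part/whole as a fraction."""
    return part / whole


# Fraction of all interactions that belong to the dominant item.
interaction_share = share(TOP_ITEM_INTERACTIONS, TOTAL_INTERACTIONS)

# Fraction of users who interacted with it
# (assumption: one interaction per user for this item).
user_share = share(TOP_ITEM_INTERACTIONS, TOTAL_USERS)

print(f"interactions: {interaction_share:.1%}, users: {user_share:.1%}")
# prints "interactions: 47.2%, users: 70.8%"
```

If the figures are right, almost every user shares this one item with millions of others, so any step that expands "users who also like this item" would touch most of the data set.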
--sebastian

On 01.12.2011 16:12, Sean Owen wrote:
> You can 'tickle' the cache asynchronously if you like.
>
> I am still not clear on why you are doing so many item-item similarity
> calculations. Your change ought to let you do 1, or 10, or 100 per
> calculation if you like. That, we know, is fast. And a few hundred
> similarities should start to give reasonable recommendations.
>
> What is preventing you from making this tradeoff (with your change)?
> Yes, it is essential for reasonable performance.
>
> On Thu, Dec 1, 2011 at 3:06 PM, Daniel Zohar <[email protected]> wrote:
>
>> Hi Manuel,
>> I haven't got to the point where CacheItemSimilarity kicks in. That is, I
>> will have to run a lot of recommendations in order to get a real benefit
>> from it. I would first like to optimize the 'cold start' so it at least
>> serves in reasonable time. Usually a cache is used to prevent repeated
>> calculations, but personally I don't think it's a replacement for optimized
>> performance. Don't you agree?
>>
>> Also, I will try to profile the app now as you suggest and send the results
>> asap.
>>
>> Thanks!
>>
>> On Thu, Dec 1, 2011 at 4:56 PM, Manuel Blechschmidt <
>> [email protected]> wrote:
>>
>>> Hi Daniel,
>>> actually you are running the profile inside Tomcat. You should take a
>>> snapshot and then drill down to the functions where the actual
>>> recommendation takes place. The current screenshots also contain some
>>> profiles from Tomcat threads which are sleeping a lot and therefore taking
>>> a lot of time.
>>>
>>> Further, the screenshots do not show how often the different functions
>>> are called.
>>>
>>> You have to profile multiple requests alone. The CacheItemSimilarity gets
>>> filled, therefore it should go faster and faster.
>>>
>>> On 01.12.2011, at 15:11, Daniel Zohar wrote:
>>>
>>>> @Manuel thanks for the tips. I have installed VisualVM and the following
>>>> are the results.
>>>> I did two sampling runs:
>>>> - With the optimized SamplingCandidateItemsStrategy
>>>>   (http://pastebin.com/6n9C8Pw1): http://static.inky.ws/image/934/image.jpg
>>>> - Without the optimized SamplingCandidateItemsStrategy:
>>>>   http://static.inky.ws/image/935/image.jpg
>>>>
>>>
>>> The big hot spot is the function FastIDSet.find():
>>>
>>> Optimized: 13,759 s
>>> Unoptimized: 246,487 s
>>>
>>> So you see that your optimization already got you a performance boost of
>>> 2000%.
>>>
>>> Did you play around with the CacheItemSimilarity cache sizes?
>>>
>>> /Manuel
>>>
>>> --
>>> Manuel Blechschmidt
>>> Dortustr. 57
>>> 14467 Potsdam
>>> Mobil: 0173/6322621
>>> Twitter: http://twitter.com/Manuel_B
>>>
>>
>
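To illustrate the tradeoff Sean describes (capping how much work is done per recommendation), here is a minimal, library-neutral sketch of sampled candidate selection. It is not Mahout's `SamplingCandidateItemsStrategy` and not the patch from the pastebin link. The function names, the `max_users_per_item` parameter, and the toy data are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_indexes(interactions):
    """Build user->items and item->users maps from (user, item) pairs."""
    items_of = defaultdict(set)
    users_of = defaultdict(set)
    for user, item in interactions:
        items_of[user].add(item)
        users_of[item].add(user)
    return items_of, users_of


def sampled_candidates(user, items_of, users_of, max_users_per_item=100, rng=None):
    """Collect candidate items for `user`, visiting at most
    `max_users_per_item` co-raters for each item the user already has."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    seen = items_of[user]
    candidates = set()
    for item in seen:
        # Sorted only for reproducibility; a real implementation would
        # avoid materializing the full list for a very popular item.
        co_raters = sorted(users_of[item] - {user})
        if len(co_raters) > max_users_per_item:
            co_raters = rng.sample(co_raters, max_users_per_item)
        for other in co_raters:
            candidates |= items_of[other]
    return candidates - seen


# Toy data: one "hit" item liked by everyone, plus one private item per user.
interactions = [(u, "hit") for u in range(1000)]
interactions += [(u, f"item{u}") for u in range(1000)]
items_of, users_of = build_indexes(interactions)

full = sampled_candidates(0, items_of, users_of, max_users_per_item=10_000)
capped = sampled_candidates(0, items_of, users_of, max_users_per_item=10)
print(len(full), len(capped))  # prints "999 10"
```

In the toy data the uncapped run pulls in a candidate from every other user through the one popular item, while the capped run looks at only ten co-raters. The number of item-item similarities to compute shrinks accordingly, at the cost of seeing only a sample of the possible candidates.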
