I have some custom Clojure code that maps strings to longs for my particular data set, stores the values in a set and writes them to a file. Will try and post some code in the next couple of weeks.
Daniel

On Mar 8, 2012 9:21 AM, "Claudia Grieco" <[email protected]> wrote:
> Thanks guys for the help
> Claudia
>
> -----Original Message-----
> From: Manuel Blechschmidt [mailto:[email protected]]
> Sent: Thursday, 8 March 2012 16:15
> To: [email protected]
> Subject: Re: R: R: Using recommenders with String identifiers
>
> Hi Claudia,
> sort of. With the IDMigrator it depends on how you store the mappings. You
> can store them in memory, in a database or in a file.
>
> Also, if you used strings, they would get copied multiple times and would
> therefore use several times their own size in memory.
>
> So you could supply a recommender implementation that does the
> String-to-Long mapping transparently for the user and put it on GitHub.
> Currently there is a lack of easy-to-understand examples. I tried to help
> a little bit with my facebook-recommender-demo.
>
> /Manuel
>
> On 08.03.2012, at 15:52, Claudia Grieco wrote:
>
> > I understand, but with IDMigrator I still need the memory to store the
> > long-string mappings, don't I?
> >
> > -----Original Message-----
> > From: Sebastian Schelter [mailto:[email protected]]
> > Sent: Thursday, 8 March 2012 15:27
> > To: [email protected]
> > Subject: Re: R: Using recommenders with String identifiers
> >
> > Here are some details on the memory usage of Strings in Java:
> >
> > http://www.javamex.com/tutorials/memory/string_memory_usage.shtml
> >
> > On 08.03.2012 14:53, Manuel Blechschmidt wrote:
> >> Hello Claudia,
> >> the reason longs are used is pure efficiency. When you have a lot of
> >> items and a lot of users and you use Strings as identifiers, you will
> >> need a lot of memory just to store them. Also, operations like equals
> >> or hash code computation will take longer.
> >>
> >> A long has 8 bytes (64 bits). A UUID string (e.g.
> >> 936DA01F-9ABD-4D9D-80C7-02AF85C822A8) has 36 characters, so encoded as
> >> UTF-16 it needs 72 bytes for the characters alone. That means a UUID
> >> would consume at least 9x the memory a long takes, before counting
> >> object overhead.
> >>
> >> /Manuel
> >>
> >> On 08.03.2012, at 14:27, Claudia Grieco wrote:
> >>
> >>> Do you think it's worth the work to change the internal code of
> >>> Mahout in order to use string identifiers?
> >>> Thanks
> >>> Claudia
> >>>
> >>> -----Original Message-----
> >>> From: Manuel Blechschmidt [mailto:[email protected]]
> >>> Sent: Monday, 5 March 2012 11:28
> >>> To: [email protected]
> >>> Subject: Re: Using recommenders with String identifiers
> >>>
> >>> Hi Claudia,
> >>> you have to use an IDMigrator.
> >>>
> >>> The following project shows you an example:
> >>> https://github.com/ManuelB/facebook-recommender-demo
> >>>
> >>> https://github.com/ManuelB/facebook-recommender-demo/blob/master/src/main/java/de/apaxo/bedcon/FacebookRecommender.java
> >>>
> >>> Good luck
> >>> Manuel
> >>>
> >>> On 05.03.2012, at 09:53, Claudia Grieco wrote:
> >>>
> >>>> Hi guys,
> >>>>
> >>>> I'd like to use Mahout to implement a recommender but I'm
> >>>> encountering a problem:
> >>>>
> >>>> IDs of items and users are represented in Mahout as long integers,
> >>>> while my data comes from an external database that uses strings to
> >>>> identify items and users.
> >>>>
> >>>> Any suggestion as to how I can fix this problem?
> >>>>
> >>>> Thanks a lot
> >>>>
> >>>> Claudia
> >>>>
> >>>
> >>> --
> >>> Manuel Blechschmidt
> >>> Dortustr. 57
> >>> 14467 Potsdam
> >>> Mobil: 0173/6322621
> >>> Twitter: http://twitter.com/Manuel_B
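
For anyone reading this thread later, here is a minimal sketch of the string-to-long mapping idea discussed above, kept entirely in memory. It is not Mahout's IDMigrator and not the Clojure code mentioned at the top: the class name StringIdMapper and its methods are hypothetical, and handing out sequential longs is just one possible assignment scheme, chosen here because it cannot collide. As Claudia points out in the thread, the mappings themselves still have to live somewhere; this version keeps each string once instead of carrying it through every data structure.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Mahout: assigns each distinct string ID
// the next free long and remembers the mapping in both directions.
class StringIdMapper {
    // string ID -> assigned long ID
    private final Map<String, Long> byString = new HashMap<>();
    // assigned long ID (used as list index) -> string ID
    private final List<String> byLong = new ArrayList<>();

    // Returns the long for this string, assigning a new one on first sight.
    public long toLongId(String stringId) {
        Long existing = byString.get(stringId);
        if (existing != null) {
            return existing;
        }
        long id = byLong.size(); // IDs are 0, 1, 2, ... in order of arrival
        byString.put(stringId, id);
        byLong.add(stringId);
        return id;
    }

    // Returns the original string, or null if the long was never assigned.
    public String toStringId(long longId) {
        if (longId < 0 || longId >= byLong.size()) {
            return null;
        }
        return byLong.get((int) longId);
    }

    public static void main(String[] args) {
        StringIdMapper mapper = new StringIdMapper();
        long itemId = mapper.toLongId("936DA01F-9ABD-4D9D-80C7-02AF85C822A8");
        System.out.println(itemId + " <-> " + mapper.toStringId(itemId));
    }
}
```

The long IDs would be what you feed to the recommender, and toStringId translates the recommended IDs back before they go to your database. Because the list is indexed by int, this sketch tops out at roughly two billion distinct strings, and the mapping is lost on restart unless you write it to a file or database.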
