@Rohit. Fibonacci heap will not give optimal search performance for user id and corresponding frequency. I assume that heap will have user id as the key. If I want to search for some specific user id's and its frequency, worst case complexity is going to be O(n). Surely Insertion in O(1) will be a plus.
Is there something else in your mind? On Dec 9, 4:31 pm, Rohit Saraf <rohit.kumar.sa...@gmail.com> wrote: > @mayur: surely it would not be just for finding frequency. It's a server > which may have to do insert/delete operation. > > A web server will have lot's of things to do and this is just a tiny part. > I guess WORM is not apt for this... > In general, a web server will have to maintain a queue or something similar. > > Rohit Saraf > Sophomore > Computer Science and Engineering > IIT Bombay > > On Wed, Dec 9, 2009 at 4:49 PM, Mayur <mayurhem...@gmail.com> wrote: > > We have a server hit by millions of users. Sever log files contains > > the user ids of all of them. How do we find the frequency of login of > > each user. What will the most efficient way to store the users, and > > access them to find their frequency(The log files are very huge!!) > > > I thought of using B+ tree indexing with user ids as the key. Leaf > > nodes will have the pointers to bucket of user ids. One item of bucket > > will contain user id and frequency of this user. > > For insertion, search complexity will be ~O(logn) > > > Any potential problem with approach? Are there any better approach to > > tackle this problem? > > > Not problems really, but opportunities. Here're a few things: > > 1. They're append-only (no inserts, no deletes). Why then have a B+ tree on > > the log file itself, which is a structure optimized for both retrieval and > > inserts? > > 2. If all that you want to query is the frequency of login of each user, > > one might ask the following questions: > > i) How often do you want this frequency data? Is the data required to > > be updated in real-time? > > ii) Is it okay to query this data each time your log's rotated? > > If you require the data as and when it's created (meaning as soon as it is > > logged), it may be a good idea to change the application receiving the call > > to explicitly update the frequency, in say, a database. > > > On the other hand, if you want it periodically, you can simply run a > > script/program that takes the "old" size of the log file as input, and > > begins scanning the log from that offset, thus saving you time. > > The frequency data itself can be stored in a hash file / b+ tree or > > whatever else you'd like (so that it can be updated). > > > If you still wish to index your log files, search for Write-Once Read-Many > > (WORM) data structures on the web. > > > Apologies for the non-algorithmic nature of my response, if it isn't apt > > for the group. > > > On Wed, Dec 9, 2009 at 3:54 PM, Rohit Saraf > > <rohit.kumar.sa...@gmail.com>wrote: > > >> if you don't know abt fibonacci heaps then check out > >>http://en.wikipedia.org/wiki/Fibonacci_heap > > >> Rohit Saraf > >> Sophomore > >> Computer Science and Engineering > >> IIT Bombay > > >> -- > >> You received this message because you are subscribed to the Google Groups > >> "Algorithm Geeks" group. > >> To post to this group, send email to algoge...@googlegroups.com. > >> To unsubscribe from this group, send email to > >> algogeeks+unsubscr...@googlegroups.com<algogeeks%2bunsubscr...@googlegroups.com> > >> . > >> For more options, visit this group at > >>http://groups.google.com/group/algogeeks?hl=en. > > > -- > > You received this message because you are subscribed to the Google Groups > > "Algorithm Geeks" group. > > To post to this group, send email to algoge...@googlegroups.com. > > To unsubscribe from this group, send email to > > algogeeks+unsubscr...@googlegroups.com<algogeeks%2bunsubscr...@googlegroups.com> > > . > > For more options, visit this group at > >http://groups.google.com/group/algogeeks?hl=en. -- You received this message because you are subscribed to the Google Groups "Algorithm Geeks" group. To post to this group, send email to algoge...@googlegroups.com. To unsubscribe from this group, send email to algogeeks+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/algogeeks?hl=en.