Kevin wrote:
> Thanks.
>
> That's what I was thinking: a friend told me that sometimes a linear
> lookup can be faster than a hash table, especially when the size is
> "small". But I am not sure how small that is (in my case, I estimate
> the average number of children per node will be 100-500), so I am
> trying to explore all the possible methods. (In the program, the tree
> update will be done millions of times, and I found it takes the
> majority of the program's running time.)
Hash tables have Theta(1) expected time per lookup. That means no matter how many children a node has, lookup performance stays roughly the same. So what you are after is a "constant factor" speedup, and the only way to get one is to run experiments and see what works. In the big picture, chasing constant-factor speedups is frequently a dead end for lots of reasons, but I'll talk about it anyway.

Unordered lists of children have Theta(n) time per lookup, which means that as your trees get wider, your program will slow down proportionally. If your IDs are integers or short strings, you'll probably find that lists match hash tables at some value of n between 5 and 20. If your program will be used for a while, I'd never use them: even if today's data produce a speedup, someone tomorrow will try to run something much bigger, and the program will die. The one way this approach might win is by using less memory.

If you can keep the child IDs in _sorted_ lists, then you can look them up with binary search, which is Theta(log n) per lookup. This means performance will degrade only very slowly with subtree width (e.g. if the width increases by a factor of 32, lookups will slow down by a factor of 5). When data structures get smaller, cache and/or paging behavior can improve, and that can have a big impact. The only way you'll find out is to try it.

Finally, you can embed your tree in a binary search tree. In this case you would define each key to be a level number concatenated with a child ID.

From: "Gene" <[EMAIL PROTECTED]>
To: "Algorithm Geeks" <algogeeks@googlegroups.com>
Subject: Re: Best way for a tree data structure for fast updating?
Date: Sat, 20 May 2006 07:36:17 -0700
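The three lookup strategies discussed above (unordered list scan, sorted list with binary search, hash table) can be sketched as follows. This is a minimal Python sketch; the function names, the `(id, node)` pair representation, and the 300-child demo size (the middle of Kevin's 100-500 estimate) are my own illustration, not from the post:

```python
import bisect

def find_linear(children, cid):
    """Theta(n): scan an unordered list of (id, node) pairs."""
    for k, v in children:
        if k == cid:
            return v
    return None

def find_sorted(keys, values, cid):
    """Theta(log n): binary search over child IDs kept in sorted order."""
    i = bisect.bisect_left(keys, cid)
    if i < len(keys) and keys[i] == cid:
        return values[i]
    return None

def find_hashed(children, cid):
    """Theta(1) expected: dict keyed by child ID."""
    return children.get(cid)

# Demo: the same 300 children represented three ways.
ids = list(range(0, 600, 2))                      # sorted even IDs
linear = [(k, f"node{k}") for k in ids]           # unordered-list form
skeys, svals = ids, [f"node{k}" for k in ids]     # parallel sorted arrays
hashed = {k: f"node{k}" for k in ids}             # hash-table form

# All three strategies must agree on hits and misses.
assert find_linear(linear, 424) == find_sorted(skeys, svals, 424) == find_hashed(hashed, 424)
assert find_linear(linear, 425) is find_sorted(skeys, svals, 425) is find_hashed(hashed, 425) is None
```

To find the actual crossover point Gene describes (somewhere around n of 5 to 20 for cheap keys), you would time each function with `timeit` on your own ID type and hardware; the constant factors are exactly what Theta notation hides.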
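Gene's last suggestion, embedding the whole tree in one binary search tree keyed by a level number plus child ID, might look like the sketch below. The `TreeIndex` class and the tuple-key encoding are assumptions for illustration (the post only says the key is "a level number concatenated with a child Id"), and the tree is deliberately left unbalanced to keep the sketch short:

```python
class BSTNode:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None

class TreeIndex:
    """Unbalanced BST mapping (level, child_id) -> payload.

    Python tuples compare lexicographically, so ordering first by level
    and then by child ID stands in for "level concatenated with ID".
    """
    def __init__(self):
        self.root = None

    def insert(self, level, child_id, value):
        key = (level, child_id)
        if self.root is None:
            self.root = BSTNode(key, value)
            return
        cur = self.root
        while True:
            if key < cur.key:
                if cur.left is None:
                    cur.left = BSTNode(key, value)
                    return
                cur = cur.left
            elif key > cur.key:
                if cur.right is None:
                    cur.right = BSTNode(key, value)
                    return
                cur = cur.right
            else:
                cur.value = value   # key already present: update in place
                return

    def find(self, level, child_id):
        key = (level, child_id)
        cur = self.root
        while cur is not None:
            if key < cur.key:
                cur = cur.left
            elif key > cur.key:
                cur = cur.right
            else:
                return cur.value
        return None

t = TreeIndex()
t.insert(0, 0, "root")
t.insert(1, 42, "child 42 at level 1")
t.insert(2, 7, "grandchild 7 at level 2")
assert t.find(1, 42) == "child 42 at level 1"
assert t.find(1, 7) is None   # child 7 exists at level 2, not level 1
```

A plain BST like this degrades to Theta(n) lookups if keys arrive in sorted order, so a production version would use a self-balancing variant (red-black or AVL, as behind C++ `std::map` or Java `TreeMap`) to guarantee Theta(log n).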