[email protected] wrote:
- split the data models per authority/repository. This will improve
performance of the refilter/search functions and of switching between
categories, and it also simplifies the filter functions a little. The
problem starts when users add the blastwave, pending, dev, sunfreeware and
contrib repos: our data model then contains all packages, and GUI search
performance is bad.
I agree that the complete-as-you-type search can't scale as currently
implemented. I don't agree that this solution is anything more than a patch
that will tide us over for a little while, and I think making a significant
shift in how you organize the data for a short-term fix is a questionable
decision. Release currently has 20k packages, dev has 25k, and pending has
11k. For those repos, the problem has been reduced by roughly a factor of
three (really more like factors of 2, 2, and 5). Anyway, I'm not sure that
feature is worth reworking the organizational structure of your code.
Those repos will grow, so unless I've missed something, we've just pushed
the day of reckoning off for a few months (or years, depending on their
growth rates). It's quite possible that using the filters from
gtk.ListStore isn't going to scale going forward. I'm not familiar with the
code backing that function, nor am I certain that that's where the
slowdown is, but I think I remember it being implemented.
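
The factors quoted above for release, dev, and pending can be checked with
a few lines of Python. This is only a back-of-the-envelope sketch: it
assumes the rounded counts from this mail and that these three repos are
the only ones loaded into the combined model.

```python
# Package counts quoted above, in thousands (rounded figures from the mail).
# Assumption: only these three repos are loaded into the combined model.
counts = {"release": 20, "dev": 25, "pending": 11}
total = sum(counts.values())

# Factor by which a per-repo model shrinks the set that has to be filtered.
factors = {repo: total / n for repo, n in counts.items()}

for repo, factor in factors.items():
    print(f"{repo}: {total}k -> {counts[repo]}k ({factor:.1f}x smaller)")
```

Any further growth in a single repo eats directly into these factors, which
is the "day of reckoning" concern.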
The problem is that the size of the dataset is unbounded, to paraphrase
Brock's response. The complete-as-you-type search doesn't really make
sense in this kind of a situation. If you need a complete-as-you-type
feature because life won't be complete without one, consider caching
previous search results and autocompleting from historical
searches. This would be similar to how your web browser tries to
complete a website address as you type in the URL.
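
A minimal sketch of the history-based autocomplete, to make the idea
concrete. The class and method names (SearchHistory, record, complete) are
made up for illustration and are not part of the existing GUI code. It
keeps past search terms in a sorted list so that all terms sharing a prefix
sit next to each other.

```python
import bisect


class SearchHistory:
    """Remembers past search terms and completes a prefix from them."""

    def __init__(self):
        self._terms = []  # kept sorted so prefix matches are contiguous

    def record(self, term):
        """Store a completed search term, ignoring duplicates."""
        i = bisect.bisect_left(self._terms, term)
        if i == len(self._terms) or self._terms[i] != term:
            self._terms.insert(i, term)

    def complete(self, prefix, limit=5):
        """Return up to `limit` past searches that start with `prefix`."""
        start = bisect.bisect_left(self._terms, prefix)
        matches = []
        for term in self._terms[start:start + limit]:
            if not term.startswith(prefix):
                break  # past the contiguous run of prefix matches
            matches.append(term)
        return matches


# Hypothetical usage: the search terms are examples only.
history = SearchHistory()
for term in ["firefox", "findutils", "gcc", "firefox"]:
    history.record(term)

suggestions = history.complete("fi")
```

The cost here depends on how many searches the user has run, not on how
many packages the repos contain, which is why it sidesteps the unbounded
dataset problem.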
I really like that idea, j, but I do think the complete-as-you-type might
be tractable with the right data structure. Since the search is
currently only happening over package names, imagine a tree for package
names where each layer is a letter in the name. So, if a user types in
abc, I take the a branch, then the b, then the c branch. Once I get
there, I determine the number of answers that can be shown on the screen
at one time, and traverse the tree alphabetically to find the first N
answers under abc. Other than holding that tree in memory, I think that
design scales. And my guess (and it's only a guess) is that by using
lists cleverly, we can probably store a fairly large number of packages
efficiently in that structure w/out a huge memory hit.
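
Roughly, the tree described above could look like the sketch below. It is
a plain dict-based trie rather than the list-based layout hinted at, so
memory use would be higher than a tuned version. PackageTrie, add, and
first_n are hypothetical names, and the package names are placeholders.

```python
import itertools


class TrieNode:
    __slots__ = ("children", "is_name")

    def __init__(self):
        self.children = {}    # letter -> TrieNode
        self.is_name = False  # True if a package name ends at this node


class PackageTrie:
    """Tree of package names with one letter per layer."""

    def __init__(self, names=()):
        self.root = TrieNode()
        for name in names:
            self.add(name)

    def add(self, name):
        node = self.root
        for ch in name:
            child = node.children.get(ch)
            if child is None:
                child = node.children[ch] = TrieNode()
            node = child
        node.is_name = True

    def _walk(self, node, prefix):
        """Yield every name under `node` in alphabetical order."""
        if node.is_name:
            yield prefix
        for ch in sorted(node.children):
            yield from self._walk(node.children[ch], prefix + ch)

    def first_n(self, prefix, n):
        """Follow the branch for each typed letter, then take n names."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # nothing starts with this prefix
        return list(itertools.islice(self._walk(node, prefix), n))


# Hypothetical usage with placeholder names.
trie = PackageTrie(["abc", "abcd", "abce", "abd", "ab", "xyz"])
visible = trie.first_n("abc", 2)
```

Because _walk is a generator and first_n stops after n names, the work per
keystroke is bounded by the prefix length plus what fits on the screen,
not by the total number of packages.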
When the user hits return, we basically produce a generator which will
spit out the next package if/as the user scrolls. Each result is put
on the screen and stored in a list (or we auto-populate the list as
fast as we can from the generator while we make a spinner spin or
bump a UI counter).
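
As a sketch of that generator-plus-list arrangement, with no GUI code
involved: search and LazyResults are hypothetical names, the substring
match is only a stand-in for whatever the real search does, and the
package names are made up.

```python
import itertools


def search(names, text):
    """Yield package names containing `text`, one at a time."""
    for name in names:
        if text in name:
            yield name


class LazyResults:
    """Pulls matches from a generator on demand and keeps those seen."""

    def __init__(self, matches):
        self._source = iter(matches)
        self.seen = []          # everything fetched so far, in order
        self.exhausted = False  # True once the generator has run dry

    def fetch(self, count):
        """Pull up to `count` more results and return the new ones."""
        fresh = list(itertools.islice(self._source, count))
        if len(fresh) < count:
            self.exhausted = True
        self.seen.extend(fresh)
        return fresh


# Hypothetical usage: fetch one screenful, then more as the user scrolls.
names = ["SUNWgcc", "gcc-dev", "firefox", "libgcc", "gccfss"]
results = LazyResults(search(names, "gcc"))
first_screen = results.fetch(2)
after_scroll = results.fetch(10)
```

The auto-populate variant would just call fetch in an idle loop until
exhausted is set, updating the spinner or counter between calls.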
At least that was the idea I had kicking around in my head when I
mentioned it in the first place.
Brock
-j
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss