[email protected] wrote:
 - split the data models per authority/repository. This will improve
the performance of the refilter/search functions and of switching between categories, and it also simplifies the filter functions a little. The problem starts when users add the blastwave, pending, dev, sunfreeware and contrib repos: our data model then contains all packages, and the GUI search performs badly.
I agree that the complete-as-you-type search can't scale as currently implemented. I don't agree that this solution is anything more than a patch that will tide us over for a little while, and I think making a significant shift in how you organize the data for a short-term fix is a questionable decision. Release currently has 20k packages, dev has 25k, and pending has 11k. For those repos, the problem has been reduced by roughly a factor of three (really more like factors of 2, 2, and 5). Anyway, I'm not sure that feature is worth reworking the organizational structure of your code. Those repos will grow, so unless I've missed something, we've just pushed the day of reckoning off for a few months (or years, depending on their growth rates).

It's quite possible that using the filters from gtk.ListStore isn't going to scale going forward. I'm not familiar with the code backing that function, nor am I certain that that's where the slowdown is, but I think I remember it being implemented.

The problem is that the size of the dataset is unbounded, to paraphrase
Brock's response.  The complete-as-you-type search doesn't really make
sense in this kind of a situation.  If you need a complete-as-you-type
feature because life won't be complete without one, consider caching
previous search results and performing an autocomplete for historical
searches.  This idea would be similar to how your web browser tries to
complete websites as you type in the URL.
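
A minimal sketch of that history-based completion, assuming past searches are kept in a sorted in-memory list. The class name SearchHistory and its methods are hypothetical and not part of the existing code:

```python
import bisect


class SearchHistory:
    """Hypothetical helper: remembers past searches and completes from them."""

    def __init__(self):
        self._queries = []  # kept sorted so prefix lookups can use bisect

    def record(self, query):
        # Insert the query at its sorted position unless it is already stored.
        i = bisect.bisect_left(self._queries, query)
        if i == len(self._queries) or self._queries[i] != query:
            self._queries.insert(i, query)

    def complete(self, prefix, limit=10):
        # All stored queries starting with `prefix` form one contiguous run
        # that begins at the first entry >= prefix.
        start = bisect.bisect_left(self._queries, prefix)
        matches = []
        for query in self._queries[start:start + limit]:
            if not query.startswith(prefix):
                break
            matches.append(query)
        return matches
```

The cost of a lookup here depends on the size of the search history rather than on the number of packages in the repositories, which is the point of the suggestion.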

I really like that idea, j, but I do think the complete-as-you-type might be tractable with the right data structure. Since the search is currently only happening over package names, imagine a tree for package names where each layer is a letter in the name. So, if a user types in abc, I take the a branch, then the b, then the c branch. Once I get there, I determine the number of answers that can be shown on the screen at one time, and traverse the tree alphabetically to find the first N answers under abc. Other than holding that tree in memory, I think that design scales. And my guess (and it's only a guess) is that by using lists cleverly, we can probably store a fairly large number of packages efficiently in that structure without a huge memory hit.
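
Roughly, the tree described above could look like the sketch below. It uses plain dicts per node rather than the list-based layout guessed at above, and the names PackageTrie, add and first_matches are hypothetical:

```python
import itertools


class PackageTrie:
    """Hypothetical prefix tree over package names, one letter per level."""

    def __init__(self):
        self.children = {}    # letter -> PackageTrie
        self.is_name = False  # True if a package name ends at this node

    def add(self, name):
        node = self
        for letter in name:
            node = node.children.setdefault(letter, PackageTrie())
        node.is_name = True

    def _walk(self, prefix):
        # Lazily yield the names at or below this node in alphabetical order.
        if self.is_name:
            yield prefix
        for letter in sorted(self.children):
            yield from self.children[letter]._walk(prefix + letter)

    def first_matches(self, prefix, limit):
        # Follow one branch per typed letter, then take the first `limit` names.
        node = self
        for letter in prefix:
            node = node.children.get(letter)
            if node is None:
                return []
        return list(itertools.islice(node._walk(prefix), limit))
```

Because the walk is lazy and stops after `limit` names, the work per keystroke is bounded by the prefix length plus the nodes visited to produce one screenful of names, not by the total number of packages.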

When the user hits return, we basically produce a generator which will spit out the next package if/as the user scrolls. Each result is popped onto the screen and stored in a list (or we auto-populate the list as fast as we can from the generator while we make a spinner spin or bump a UI counter).
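
As a sketch of that generator idea, assuming results are pulled one page at a time as the user scrolls: matching_packages and ResultPager are hypothetical names, and the linear scan is only a stand-in for whatever lookup structure actually backs the search.

```python
import itertools


def matching_packages(sorted_names, prefix):
    """Lazily yield names that start with `prefix` (stand-in lookup)."""
    for name in sorted_names:
        if name.startswith(prefix):
            yield name


class ResultPager:
    """Pulls results from a generator on demand and remembers them."""

    def __init__(self, results):
        self._results = results  # generator, consumed at most once
        self.shown = []          # everything handed to the UI so far

    def next_page(self, size):
        # Pull up to `size` more results, e.g. when the user scrolls.
        page = list(itertools.islice(self._results, size))
        self.shown.extend(page)
        return page
```

A scroll handler would call next_page() with the number of visible rows; an empty page means the generator is exhausted.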

At least that was the idea I had kicking around in my head when I mentioned it in the first place.

Brock

-j

_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
