Moving this back to pkg-discuss.

Michal Pryc wrote:
Brock,
I completely agree with you that it is currently quite hard to find a package across repositories, but you probably remember that people were saying that showing all packages from all repositories will not scale; that is why we took another approach, which I believe you will find reasonable. This involves a few more changes, which I didn't want to introduce all at once. Maybe I should have been more precise about the planned work.
Right, I agree that showing all packages at once (by default) isn't the right approach. I don't think I agree that splitting by repository is the right answer, though. Especially with the pending change to the category system and package namespace, it seems to me that mirroring that structure would be the right interface: essentially replacing the current category drop-down, category pane, and list view with a tree that lets a user drill down through several levels of categories quickly. I'm fine with having a repository drop-down to select repos, but the default for that box should be "all". If I'm looking for a package, I'd expect the choice of repository to be the lowest level in the hierarchy, not the top level.
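
To be concrete about what I'm picturing (just a rough pygtk sketch, with made-up category and package names, not a proposal for the actual code):

import gtk

# Rough sketch: categories form the tree, and the repository only shows
# up at the bottom, attached to the individual package entries.  The
# names here are invented for illustration.
store = gtk.TreeStore(str)

dev = store.append(None, ["Development"])
langs = store.append(dev, ["Languages"])
python_cat = store.append(langs, ["Python"])
store.append(python_cat, ["SUNWPython (release)"])
store.append(python_cat, ["SUNWPython (dev)"])

view = gtk.TreeView(store)
view.append_column(
    gtk.TreeViewColumn("Packages", gtk.CellRendererText(), text=0))

The point is just that drilling down happens along category lines, with the repository as a leaf-level detail (or a drop-down that defaults to "all").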

So currently we have decided to:

 - split the data models per authority/repository. This will improve the
performance of the refilter/search functions and of switching between categories, and it also simplifies the filter functions a bit. The problem starts when users add the blastwave, pending, dev, sunfreeware and contrib repos: then our data model contains all of those packages and the GUI search becomes very slow.
I agree that the complete-as-you-type search can't scale as currently implemented. I don't agree that this solution is anything more than a patch to tide us over for a little while, and I think making a significant shift in how you organize the data for a short-term fix is a questionable decision. Release currently has 20k packages, dev has 25k, and pending has 11k, so splitting per repository reduces the problem by roughly a factor of three (really more like factors of 2, 2, and 5). I'm not sure that feature is worth reworking the organizational structure of your code for. Those repos will grow, so unless I've missed something, we've just pushed the day of reckoning off by a few months (or years, depending on their growth rates). It's quite possible that using the filters from gtk.ListStore isn't going to scale going forward. I'm not familiar with the code backing that function, nor am I certain that's where the slowdown is, but I seem to remember that being how it was implemented.
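
To spell out the guess I'm making: if the search-as-you-type is a gtk.ListStore with a TreeModelFilter visible-func on top (and that is an assumption on my part), then every keystroke re-walks every row, roughly like this:

import gtk

# Assumed shape of the current filtering, not the actual code: a flat
# ListStore with a visible-func filter.  refilter() re-evaluates the
# function for every row, so each keystroke costs O(number of packages).
store = gtk.ListStore(str)
for name in ("SUNWbash", "SUNWgcc", "SUNWPython"):
    store.append([name])

current_query = [""]          # mutable holder for the text in the search bar

def visible(model, it, data=None):
    return current_query[0].lower() in model.get_value(it, 0).lower()

filtered = store.filter_new()
filtered.set_visible_func(visible)

def on_search_changed(text):
    current_query[0] = text
    filtered.refilter()       # walks all rows again, split per repo or not

If that's right, splitting by repo just divides the constant; it doesn't change the shape of the curve.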

If you really want search-as-you-type, backing the filter/search with a tree-based, rather than linear, search (just my guess as to how it's implemented, given the scaling concerns described) might solve the problem in a way that would scale going forward. Each node could even know how many children live beneath it and decide whether or not attempting to populate all of the results at once makes sense. Fun games could also be played by combining a generator-based search with a list storing the results already seen, so that results stream into the view rather than waiting for all of them to come in before displaying anything.
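
For example (a pure sketch, prefix matching only, and nothing to do with the existing code), something like this would give both the per-node counts and the streaming behaviour I mean:

class TrieNode(object):
    def __init__(self):
        self.children = {}   # next letter -> child node
        self.count = 0       # number of packages at or below this node
        self.names = []      # package names ending exactly here

def insert(root, name):
    node = root
    node.count += 1
    for ch in name.lower():
        node = node.children.setdefault(ch, TrieNode())
        node.count += 1
    node.names.append(name)

def search(root, prefix):
    """Generator: yield matches one at a time so they can stream in."""
    node = root
    for ch in prefix.lower():
        node = node.children.get(ch)
        if node is None:
            return
    # node.count says up front how many results are coming, so the GUI
    # can decide whether populating them all at once makes sense.
    stack = [node]
    while stack:
        n = stack.pop()
        for name in n.names:
            yield name
        stack.extend(n.children.values())

# Usage: append results to the view as they arrive instead of waiting.
root = TrieNode()
for pkg in ("SUNWbash", "SUNWbzip", "SUNWPython"):
    insert(root, pkg)
seen = []
for match in search(root, "sunwb"):
    seen.append(match)       # e.g. add a row to the list here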


- add cPickle caching to the data models. This will simply dump the data structures backing the gtk.ListStore and read them back on startup, after making sure the cache is in sync with the current catalog and package states; if not, the list will be gathered as it is now. The cache will be per repository, so switching between repositories will read the corresponding caches. I already have a working version of this part, where the GUI starts in ~3 seconds for the dev repository, compared to ~24 seconds currently.
This makes perfect sense to me, except I'd pickle the info for all the repos, or pickle by category instead of by repo.
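
For what it's worth, the shape I'd expect that cache to take, whichever way you slice it (file name, timestamp field, and helper names all invented here):

import cPickle
import os

CACHE_FILE = os.path.expanduser("~/.packagemanager/pkglist.cache")  # invented

def save_cache(rows, catalog_timestamp):
    """Dump the rows backing the gtk.ListStore together with the
    catalog timestamp they were built from."""
    f = open(CACHE_FILE, "wb")
    try:
        cPickle.dump((catalog_timestamp, rows), f, cPickle.HIGHEST_PROTOCOL)
    finally:
        f.close()

def load_cache(catalog_timestamp):
    """Return the cached rows if they match the current catalog,
    otherwise None so the caller falls back to building the list."""
    if not os.path.exists(CACHE_FILE):
        return None
    f = open(CACHE_FILE, "rb")
    try:
        cached_ts, rows = cPickle.load(f)
    finally:
        f.close()
    if cached_ts != catalog_timestamp:
        return None          # stale: gather the list as it is done now
    return rows

Whether the key is the repo, the category, or both is then just a question of what goes into the file name.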

- in the all-repositories view, rather than showing all packages, we will perform a remote search. This will allow users to find the desired packages and will be scalable at the same time (please scroll down to see *all repositories*): http://xdesign.sfbay.sun.com/projects/solaris/subprojects/package_mngt/UI_specs/ui_spec_phase3/html-mockup/12_search_parameters_r4.htm

What if I just want to browse packages from all repos? Do I search for * or '' and then look at the search results window? I agree that you should be able to do a remote search over all packages; I just don't think it's in any way intuitive that doing so means an entirely different search algorithm is now in use. For example, suppose I search for 'foo' across all repos and find that it maps to "SUNWbar". I would then expect that if I searched the repos individually, at least one repo would return the same result for the same query, but unless I'm confused about how your search is planned to work, that won't happen, because searching a single repo employs an entirely separate algorithm.
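
Naively, what I'd expect is that the repository selection only narrows where one and the same search runs, along these (entirely hypothetical) lines:

# Entirely hypothetical interface.  The point is only that "all" and a
# single repository go through the same code path, so 'foo' -> SUNWbar
# in the all-repositories view implies 'foo' -> SUNWbar in at least one
# individual repository.
def search_one_repo(query, package_names):
    q = query.lower()
    return [name for name in package_names if q in name.lower()]

def search_packages(query, catalogs):
    """catalogs: dict mapping repo name -> list of package names."""
    results = {}
    for repo, package_names in catalogs.items():
        matches = search_one_repo(query, package_names)
        if matches:
            results[repo] = matches
    return results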

While we're talking about search, I'll mention that you might want to start thinking about what an advanced search tool might look like in the GUI. If you want to make simplifying assumptions about what the user is looking for when they use the built-in search bar, that's fine, but a more robust tool that lets users execute the same queries they could from the command line would probably be a nice feature.
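
As a strawman, the advanced-search pane could just build the same invocation a user would type at the command line and run it. I'm assuming here that 'pkg search -r <token>' is (still) the remote-search form, so treat the flag as a placeholder:

import subprocess

def run_advanced_search(token, remote=True):
    """Strawman only: hand the assembled query to the pkg CLI so the
    GUI and the command line always agree on results."""
    cmd = ["pkg", "search"]
    if remote:
        cmd.append("-r")      # assumed remote-search flag; see above
    cmd.append(token)
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return out.splitlines()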

Brock
best
Michal

[snip]

