Greetings,

I've been plugging away on search v1 for a while now. As part of that change, the syntax for search is going to be expanded a fair bit. I thought I'd outline here what my plans are for the new search command, and get feedback before I turn to the implementation.

Please remember that this is my attempt to outline everything I might possibly want to put into search. It's possible that some of these features may disappear if the implementation is too difficult, or if I think there isn't much demand for some non-trivial feature. If there are features you'd like to see, from an API or user perspective, whether in this list or not, please let me know and I'll do my best to accommodate everyone's desires.
From a high level, the syntax will remain roughly the same:

    pkg search [options] [tokens/query]

API:

Search will move into the client API. It will provide an option to choose whether only packages should be returned, or whether entire actions should be sent back.

I think there may also be an option for a client to grab a list of possible search tokens from the server. This will allow the client to suggest expansions on its own, instead of querying the server as the user types. Considering the number of search tokens, we may want to pare this down, or determine some way to update it, rather than simply dumping the entire batch each time, since a simple list of tokens is 47M uncompressed for the server. Compression drops that to 15M, but that's still a lot of information to send repeatedly across the wire. Perhaps this feature isn't needed, but I suspect we can be clever about things and find a way to send only the new and deleted tokens after an initial transfer.

Output:

By default, search will display a list of packages instead of the 4-tuple it currently returns. There will be an option to switch the output back to the old format. In addition, displaying the entire text of matching actions will be possible. By default, search will be limited to returning the first N results. N is up for debate, but I suggest 100.

Options:

The simplest change is that remote search will become the default instead of local search. This seems to fit the majority of use cases for most users (they're searching for things to install, not trying to find what's already on their system), and it matches what other packaging systems' search commands seem to do.

New options -p, -a, a third (-4?), and -o will be added. -p (the default behavior) will tell the search command to display only the package names, and not the detailed action info. -a will tell search to display the entire action as text to the user.
-4 (or whatever the chosen name is) will cause the previous 4-tuple output to be displayed. -o can be repeated, and works much like the -o option of pkg contents. It gives the user total control over what is shown in each result. Note that unlike the current behavior of contents, all the information about an action will be listed on a single line.

New options -z(?) and --start-point will also be added. -z will allow the user to set the number of results returned. --start-point will allow the user to start the listing at some result after the first one. This will allow users (especially things like the BUI) to easily paginate the results, and will prevent the server from getting blocked on one query for an excessive amount of time.

Tokens/query:

Here is where the majority of the visible change will take place. The query syntax is being vastly expanded. The simplest change is that multiple tokens will be allowed, and implicitly anded together. In addition, structured queries are being supported; more detail is provided below. Boolean queries will be added as well. I also plan/hope to be able to search against an incorporation, though that may have to wait till v2.

Because search will now display results at both a package and an action level, I believe it makes sense to distinguish between those uses in the query syntax. Thus, I'm suggesting that instead of 'and' and 'or' there will be 'pand' and 'por', and 'aand' and 'aor' (for package-op and action-op). The package ops will come with a restriction: they cannot be used unless the user is asking for a package list instead of an action list.

The idea of pand (for example) is that it returns the list of packages containing actions that satisfy each of its components. The domains of these operations would be both lists of packages and lists of actions. When provided two sets of packages, pand simply takes the intersection of the sets (or the union, in the case of 'por').
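As a rough illustration of the package-level operators described above, here is a minimal Python sketch. It is not the planned implementation: the names Action, to_packages, pand, and por, and the idea of representing an action as a (package, text) pair, are all assumptions made for the example.

```python
from typing import Iterable, NamedTuple, Set, Union


class Action(NamedTuple):
    """Hypothetical stand-in for a matching action and the package that owns it."""
    package: str
    text: str


# A component of a query is either a set of package names or a set of actions.
Component = Iterable[Union[str, Action]]


def to_packages(component: Component) -> Set[str]:
    """Reduce a component to package names; package names pass through unchanged."""
    return {c.package if isinstance(c, Action) else c for c in component}


def pand(*components: Component) -> Set[str]:
    """Packages that appear in (or own an action in) every component."""
    sets = [to_packages(c) for c in components]
    return set.intersection(*sets) if sets else set()


def por(*components: Component) -> Set[str]:
    """Packages that appear in (or own an action in) at least one component."""
    sets = [to_packages(c) for c in components]
    return set.union(*sets) if sets else set()


# Made-up data: one component is a set of actions, the other a set of packages.
fun_actions = {
    Action("games/chess", "file path=usr/bin/chess"),
    Action("games/tetris", "file path=usr/bin/tetris"),
}
game_packages = {"games/chess", "editor/vim"}

print(sorted(pand(fun_actions, game_packages)))  # ['games/chess']
print(sorted(por(fun_actions, game_packages)))   # ['editor/vim', 'games/chess', 'games/tetris']
```

The sketch treats both kinds of input uniformly by mapping everything to package names first, which is one simple way to let the package ops accept either domain.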
When actions are provided, the package operator first converts each set of actions into a set of packages, each of which contains at least one of the actions in the original set.

'aand' and 'aor' (which might become just 'and' and 'or') would have a domain of actions. They allow the user to specify an action which has two features (for example, an action with both "fun" and "games" in it).

When the and operation happens implicitly (for example, in pkg search foo bar baz), I suggest that whether a pand or an aand is used depends on whether the user has requested only package information, or some form of action-level information.

'pnot' and 'anot' may or may not be added, depending both on demand and on the difficulty of implementing them efficiently.

Another new feature is that instead of being limited to searching the text of all actions, users will be able to specify the action type, the subtype (which I'll explain in a moment, because that may be a poor name), or both, over which they want to search. If a user wanted to list all files that are delivered into /usr/bin, they could use 'file.path:/usr/bin/*'. If a user wanted a list of all files, they could use "file:*". Or, if a user wanted to find all actions (directory, file, or link) which deliver into /usr/bin, they could search on "*.path:/usr/bin/*". As a final example, suppose a user wanted to search the descriptions of packages for "game"; they would use "set.description:game". Some details of the syntax need to be worked out. It's quite possible that "." as a separator won't work well there. This will also include things like searching for a particular version string.

I would also like to add searching against an incorporation. Especially for publishing, it appears that having this as part of the API would be useful. It may also allow us to better control the noise that's currently spit out to a user's screen, by returning by default only the version(s) of actions/packages which satisfy the incorporations currently installed on their machine.
This would only be used for remote search, and might well end up as an option rather than as part of the query space.

That's roughly what I have in mind to do. Depending on how difficult the various features turn out to be, and what time pressure appears, some may be dropped, or others added based on feedback here.

Thanks for the help,
Brock

_______________________________________________
pkg-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
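To make the structured query examples above (such as 'file.path:/usr/bin/*') a little more concrete, here is a minimal Python sketch of how such a term might be split and matched. It is only an illustration under stated assumptions: the Query tuple, parse_term, and matches are hypothetical names, actions are modeled as plain dicts with a "type" key, the "." and ":" separators are taken from the examples and may well change, and matching a pattern against individual words of a value is a guess at how description searches could behave.

```python
from fnmatch import fnmatchcase
from typing import Dict, NamedTuple


class Query(NamedTuple):
    """Hypothetical parsed form of a term like 'file.path:/usr/bin/*'."""
    action_type: str
    attribute: str
    pattern: str


def parse_term(term: str) -> Query:
    """Split 'type.attribute:pattern'; missing parts default to the wildcard '*'."""
    field, sep, pattern = term.partition(":")
    if not sep:
        # A bare token searches every action type and every attribute.
        return Query("*", "*", term)
    action_type, _, attribute = field.partition(".")
    return Query(action_type or "*", attribute or "*", pattern)


def matches(action: Dict[str, str], query: Query) -> bool:
    """True if the action's type and at least one attribute value fit the query."""
    if not fnmatchcase(action["type"], query.action_type):
        return False
    for name, value in action.items():
        if name == "type" or not fnmatchcase(name, query.attribute):
            continue
        # Assumption: try the whole value and each whitespace-separated word.
        if any(fnmatchcase(c, query.pattern) for c in [value, *value.split()]):
            return True
    return False


# Made-up actions, modeled as dicts purely for this example.
ls_file = {"type": "file", "path": "/usr/bin/ls"}
x11_dir = {"type": "dir", "path": "/usr/bin/X11"}
description = {"type": "set", "description": "a fun game"}

print(matches(ls_file, parse_term("file.path:/usr/bin/*")))      # True
print(matches(x11_dir, parse_term("file:*")))                    # False
print(matches(x11_dir, parse_term("*.path:/usr/bin/*")))         # True
print(matches(description, parse_term("set.description:game")))  # True
```

Shell-style wildcards are handled here with fnmatchcase from the standard library, whose "*" also matches "/" characters; a real implementation would presumably work against the search indexes rather than scanning actions one by one.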
