Re: Field value tags
On Sat, Feb 13, 2010 at 11:18 PM, Peter S pete...@hotmail.com wrote:

Hello Solr-dev,

I've now implemented a QParserPlugin/QParser for tagging functionality in my internal Solr environment, and this is working very nicely. The type of functionality offered by tagging isn't currently in Solr, so I was thinking this might be a good plugin to contribute to the project.

Before preparing the plugin for ASF-readiness, it would be great to get feedback, comments etc. on what the Solr dev experts think of including this sort of thing. If it's deemed useful for inclusion, I'll go ahead and create a JIRA issue and prepare the code for ASF.

Here is a quick precis of what tagging offers:

First off, for your typical user-based searching of 'shopping cart' or Google-type doc-scored searching, tagging is probably not what you want. Dismax provides a much better fit for this type of searching.

Tagging provides a means of entering a tag into a query, which, on the server (in the plugin), translates to some configured subquery that is actually executed by Solr. There are a number of cool use-cases for this - the 2 most salient of which are these:

1. To provide a known 'key' at query time, that translates into subqueries that the user couldn't/wouldn't/shouldn't know at query time. For example, I use this to supply a tag called 'admins', which, when entered into a query, will actually query for all documents that have some reference to all administrators/root users in the searched index(es). The [securely logged-in] person searching won't know who all the root users are (and the list will change over time), only that he/she wishes to find out information pertaining to their activity.

2. To provide subquery 'shortcuts' for often used, usually lengthy and/or complicated queries. For example, if every morning, as part of your job, you need to search for:

((this AND that) OR (theother AND NOT somethingelse)) AND timestamp:[then TO now] . . .

A tag can be made, say, 'mysearchtag', which equates to the above query. This tag can then be used as a query, and/or embedded in other queries. This is quite handy for automated searching and/or saved searches etc.

This allows server administrators to control the content that gets returned by these queries, thus reducing client-side maintenance. Additionally, for distributed searches, evaluated tags can, if desired, produce different queries for different shards (e.g. the list of root users is different on different machines).

Any comments, concerns, opinions etc. on a contribution of this type would be greatly appreciated.

Thanks Peter.

It definitely sounds useful for some use-cases. Can you open a Jira issue and give a patch?

--
Regards,
Shalin Shekhar Mangar.
RE: Field value tags
Hello Solr-dev,

I've now implemented a QParserPlugin/QParser for tagging functionality in my internal Solr environment, and this is working very nicely. The type of functionality offered by tagging isn't currently in Solr, so I was thinking this might be a good plugin to contribute to the project.

Before preparing the plugin for ASF-readiness, it would be great to get feedback, comments etc. on what the Solr dev experts think of including this sort of thing. If it's deemed useful for inclusion, I'll go ahead and create a JIRA issue and prepare the code for ASF.

Here is a quick precis of what tagging offers:

First off, for your typical user-based searching of 'shopping cart' or Google-type doc-scored searching, tagging is probably not what you want. Dismax provides a much better fit for this type of searching.

Tagging provides a means of entering a tag into a query, which, on the server (in the plugin), translates to some configured subquery that is actually executed by Solr. There are a number of cool use-cases for this - the 2 most salient of which are these:

1. To provide a known 'key' at query time, that translates into subqueries that the user couldn't/wouldn't/shouldn't know at query time. For example, I use this to supply a tag called 'admins', which, when entered into a query, will actually query for all documents that have some reference to all administrators/root users in the searched index(es). The [securely logged-in] person searching won't know who all the root users are (and the list will change over time), only that he/she wishes to find out information pertaining to their activity.

2. To provide subquery 'shortcuts' for often used, usually lengthy and/or complicated queries. For example, if every morning, as part of your job, you need to search for:

((this AND that) OR (theother AND NOT somethingelse)) AND timestamp:[then TO now] . . .

A tag can be made, say, 'mysearchtag', which equates to the above query. This tag can then be used as a query, and/or embedded in other queries. This is quite handy for automated searching and/or saved searches etc.

This allows server administrators to control the content that gets returned by these queries, thus reducing client-side maintenance. Additionally, for distributed searches, evaluated tags can, if desired, produce different queries for different shards (e.g. the list of root users is different on different machines).

Any comments, concerns, opinions etc. on a contribution of this type would be greatly appreciated.

Many thanks,
Peter
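As an illustration of the tag-to-subquery translation described in the message above, here is a minimal sketch in Python. It is not the plugin itself (which would be a Java QParserPlugin inside Solr), and the message does not say how tags are written in a query or where they are stored. The `tag:<name>` reference syntax, the `TAGS` table, its example values and the `expand_tags` function are all assumptions made for the example.

```python
import re

# Hypothetical tag table: tag name -> configured subquery.
# Every entry here is illustrative; a real deployment would load this
# from server-side configuration controlled by an administrator.
TAGS = {
    "admins": "user:(root OR alice OR bob)",
    "mysearchtag": "((this AND that) OR (theother AND NOT somethingelse))",
}

# Assumed syntax: a tag is referenced in a query as tag:<name>.
TAG_REF = re.compile(r"\btag:(\w+)")


def expand_tags(query, tags=TAGS):
    """Replace each tag:<name> reference with its parenthesised subquery."""

    def substitute(match):
        name = match.group(1)
        if name not in tags:
            raise KeyError(f"unknown tag: {name}")
        # Parenthesise so the subquery keeps its meaning when embedded
        # in a larger boolean expression.
        return "(" + tags[name] + ")"

    return TAG_REF.sub(substitute, query)
```

With this table, `expand_tags("tag:admins AND action:login")` yields `(user:(root OR alice OR bob)) AND action:login`. The per-shard behaviour mentioned above could be approximated by passing a different `tags` mapping for each shard.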
RE: Field value tags
Just to comment on my own post...

I don't believe index-time will be involved in this sort of functionality, as there's no way to know at index-time what 'tag groups' a given term will be in at any given time. This leads me to believe that a query request parser plugin approach might be best: something that can receive queries when the relevant keyword is present (e.g. tag=blah) and translate the parameters into their current tag member equivalents. Such a plugin would monitor a file (say, solr/conf/tags.conf) for changes, and reload its member table if/when the tag members change.

Peter

Hi Solr-dev,

I've had a good look 'round for this functionality/patches etc., and I couldn't find anything. Before delving deeper into the possibilities, I thought I'd put this RFE to the experts for comments/'been there, done that' etc.:

Requirement: To add the ability to tag indexed field values (as opposed to fields), regardless of the field they may show up in. Then have the ability to query on this tag, and all documents where the tag evaluates to a match are returned.

An example: Let's say things are configured to tag specific email addresses that end in @salesoffice.co.uk (e.g. match on *...@salesoffice.co.uk) with a tag called 'UKSalesEmail'. During indexing, whenever such a value is encountered, it is associated with this tag (probably added to a table with async persistence or similar). A user can then perform a search something like q:*:*tag=UKSalesEmail, which proceeds to return all documents that match *...@salesoffice.co.uk. This might be thought of as 'synonyms for values'.

The above example is pretty simple, but such a feature becomes useful with the ability to support one-to-many/many-to-many values - e.g. 'UKSalesEmail' might instead evaluate to a list of 50 sales employees' specific email addresses, rather than a simple wildcard (the list would likely change over time as employees are hired/sacked; the query user probably wouldn't know all the names, and certainly wouldn't want to type them all in). Similarly, 'f...@salesoffice.co.uk' might be associated with a number of different tags (e.g. 'UKSalesOffice', 'UKEmployee', 'BigCorporateBehemoth', 'HighRiskEmployee', 'DrinksOnFridayTeam' etc.).

Does Solr possess such functionality at the moment, or is there a [planned] patch for this sort of thing? If not, does the community have any thoughts on an enhancement along these lines?

Many thanks,
Peter
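The follow-up comment in this message suggests a plugin that watches a file such as solr/conf/tags.conf and reloads its member table when the file changes. A minimal Python sketch of that idea follows, assuming a hypothetical file format of one `tag = member1, member2` entry per line with `#` comments; the message does not define a format. The `TagTable` class, its `to_query` method and the explicit `field` argument are likewise assumptions for illustration, not part of Solr.

```python
import os


class TagTable:
    """Tag -> member list, reloaded whenever the backing file's mtime changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None   # mtime of the version currently loaded
        self._members = {}   # tag name -> list of member values

    def _reload_if_changed(self):
        mtime = os.stat(self.path).st_mtime_ns
        if mtime == self._mtime:
            return  # file unchanged since the last load
        members = {}
        with open(self.path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue  # skip blank lines and comments
                tag, _, values = line.partition("=")
                members[tag.strip()] = [
                    v.strip() for v in values.split(",") if v.strip()
                ]
        self._members = members
        self._mtime = mtime

    def to_query(self, tag, field):
        """Translate a tag into an OR query over its current members."""
        self._reload_if_changed()
        quoted = ['"%s"' % m for m in self._members[tag]]
        return field + ":(" + " OR ".join(quoted) + ")"
```

Checking the modification time on each lookup keeps the sketch simple; a long-running server might instead poll on a timer or use a file-watching facility so that queries never pay for the reload.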