[jira] Commented: (SOLR-1814) select count(distinct fieldname) in SOLR
[ https://issues.apache.org/jira/browse/SOLR-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844437#action_12844437 ]

Marcus Herou commented on SOLR-1814:

Instead of having the file attached...
http://svn.tailsweep.com/opensource/solr-contrib/trunk/src/main/java/org/apache/solr/handler/component/

Erik: The facet counts are something else; they group the counts based on the field supplied, do they not? Perhaps facet.query (like you pointed out) can be used, I overlooked that. I never got an answer on the mailing list, so I implemented it instead :)

What I have accomplished is this:
select count(distinct blog) from BlogEntries where ...somexpression...

One doc is in this case a BlogEntry, and each belongs to a Blog (many-to-one).
If this already can be accomplished in SOLR, my bad. Please tell me how.

Ted: Trove has two licenses, GPL and ASL. I can use the ASL version if it helps. I only use Trove due to its efficiency; plain hashmaps can of course be used if it is a showstopper.

select count(distinct fieldname) in SOLR

Key: SOLR-1814
URL: https://issues.apache.org/jira/browse/SOLR-1814
Project: Solr
Issue Type: New Feature
Components: SearchComponents - other
Affects Versions: 1.4, 1.5, 1.6, 2.0
Reporter: Marcus Herou
Fix For: 1.4, 1.5, 1.6, 2.0
Attachments: CountComponent.java

I have seen questions on the mailing list about having the functionality for counting distinct values of a field. We at Tailsweep want to do that as well, for example in our blog search.
Example:
You had 1345 hits on 244 blogs
The 244 part is not possible in SOLR today (correct me if I am wrong). So I've written a component which does this. Attaching it.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
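For readers following the thread, the count being asked for can be sketched in plain Java. This is only an illustration of the idea, not the attached CountComponent: the BlogEntry class and its fields are hypothetical stand-ins for matching documents, and a plain java.util.HashSet is assumed in place of the Trove collection mentioned in the comment.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Minimal sketch of "select count(distinct blog) from BlogEntries where ...".
// BlogEntry and its fields are hypothetical; this is not the CountComponent code.
class DistinctCountSketch {

    // Hypothetical stand-in for one matching document (a blog entry).
    static final class BlogEntry {
        final int id;
        final String blog; // many entries may point at the same blog (many-to-one)

        BlogEntry(int id, String blog) {
            this.id = id;
            this.blog = blog;
        }
    }

    // Collects the blog value of every hit into a set; the set size is the distinct count.
    static int countDistinctBlogs(List<BlogEntry> hits) {
        Set<String> seen = new HashSet<String>();
        for (BlogEntry hit : hits) {
            seen.add(hit.blog);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<BlogEntry> hits = Arrays.asList(
                new BlogEntry(1, "blog-a"),
                new BlogEntry(2, "blog-a"),
                new BlogEntry(3, "blog-b"),
                new BlogEntry(4, "blog-c"));
        // Prints: You had 4 hits on 3 blogs
        System.out.println("You had " + hits.size() + " hits on "
                + countDistinctBlogs(hits) + " blogs");
    }
}
```

The memory cost grows with the number of distinct values among the hits, which is presumably why a compact primitive collection was attractive for the real component.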
Re: Abstractify FacetComponent and SimpleFacets
Hi, thanks. I wondered if it already was incorporated or such.

Yes, it is a little related to StatsComponent (sum, avg etc.), but I think that this solves another problem (correct me if I'm wrong), since it transforms the resulting field in a function query instead of counting as per default (today). I think that the StatsComponent does something similar but operates on the resulting facet. I hook in earlier.

I used the StatsComponent as a template for another component which I call CountComponent (http://svn.tailsweep.com/opensource/solr-contrib/trunk/src/main/java/org/apache/solr/handler/component/CountComponent.java), which emulates the SQL equivalent: select count(distinct field). I added the patch to JIRA (https://issues.apache.org/jira/browse/SOLR-1814).

That one works with sharding as well. The problem is that one needs to send the damn entire unique hash set of field values across the shards... (can get big).

I see that Ted and Erik have commented now... Perhaps I have created something which already exists... damn.

Both these components probably need to be refined for a release/merge into Solr. How do I move onward with these?

On Fri, Mar 12, 2010 at 2:02 AM, Grant Ingersoll gsing...@apache.org wrote:

> On Mar 11, 2010, at 6:30 PM, Yonik Seeley wrote:
>
>> Interesting looking stuff Marcus!
>> Seems sort of related to stat.facet (calc stats on unique facet values)
>> http://wiki.apache.org/solr/StatsComponent
>
> And https://issues.apache.org/jira/browse/SOLR-1622
>
>> On Thu, Mar 11, 2010 at 5:49 PM, Marcus Herou marcus.he...@tailsweep.com wrote:
>>> I have now implemented Facet with FunctionQueries, it is really cool!
>>> Sorry, but even though the author of SimpleFacets (Yonik) says in the javadoc that one should subclass it to leverage more functionality, I did not really find that very true in this case.
>>
>> Hoss was actually the first author of SimpleFacets - SOLR-44
>> (Solr didn't even have built-in faceting when it came into the incubator!)
>>
>> -Yonik
>> http://www.lucidimagination.com

--
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.he...@tailsweep.com
http://www.tailsweep.com/
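The sharding remark in this message can be made concrete with a small sketch. It assumes, purely for illustration, that each shard reports its distinct field values as a Set of integers; the class and method names are hypothetical and are not taken from CountComponent. The point is that per-shard counts cannot simply be added, because the same value may occur on several shards, so the value sets themselves have to be merged.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of why a distributed count(distinct) ships whole value sets:
// shards may share values, so adding per-shard counts over-counts.
class ShardDistinctMergeSketch {

    // Unions the distinct values reported by each shard and returns the global count.
    static int mergeDistinct(List<Set<Integer>> perShardValues) {
        Set<Integer> union = new HashSet<Integer>();
        for (Set<Integer> shardValues : perShardValues) {
            union.addAll(shardValues);
        }
        return union.size();
    }

    // Naive alternative: add up the per-shard counts (counts shared values twice).
    static int sumOfShardCounts(List<Set<Integer>> perShardValues) {
        int total = 0;
        for (Set<Integer> shardValues : perShardValues) {
            total += shardValues.size();
        }
        return total;
    }

    public static void main(String[] args) {
        Set<Integer> shard1 = new HashSet<Integer>(Arrays.asList(10, 11, 12));
        Set<Integer> shard2 = new HashSet<Integer>(Arrays.asList(12, 13));
        List<Set<Integer>> shards = Arrays.asList(shard1, shard2);
        System.out.println("merged distinct = " + mergeDistinct(shards));    // 4
        System.out.println("sum of counts   = " + sumOfShardCounts(shards)); // 5
    }
}
```

The merged result is exact, but the data sent by each shard is proportional to its number of distinct values, which matches the "can get big" concern above.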
[jira] Issue Comment Edited: (SOLR-1814) select count(distinct fieldname) in SOLR
[ https://issues.apache.org/jira/browse/SOLR-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844437#action_12844437 ]

Marcus Herou edited comment on SOLR-1814 at 3/12/10 9:55 AM:

Instead of having the file attached...
http://svn.tailsweep.com/opensource/solr-contrib/trunk/src/main/java/org/apache/solr/handler/component/

Erik: The facet counts are something else; they group the counts based on the field supplied, do they not? Perhaps facet.query (like you pointed out) can be used, I overlooked that. I never got an answer on the mailing list, so I implemented it instead :)

What I have accomplished is this:
select count(distinct blog) from BlogEntries where ...somexpression...

One doc is in this case a BlogEntry, and each belongs to a Blog (many-to-one).
If this already can be accomplished in SOLR, my bad. Please tell me how.

Ted: Trove has two licenses, GPL and ASL. I can use the ASL version if it helps. I only use Trove due to its efficiency; plain hashmaps can of course be used if it is a showstopper.
[jira] Issue Comment Edited: (SOLR-1814) select count(distinct fieldname) in SOLR
[ https://issues.apache.org/jira/browse/SOLR-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844437#action_12844437 ]

Marcus Herou edited comment on SOLR-1814 at 3/12/10 10:03 AM:

Instead of having the file attached...
http://svn.tailsweep.com/opensource/solr-contrib/trunk/src/main/java/org/apache/solr/handler/component/

Erik: The facet counts are something else; they group the counts based on the field supplied, do they not? Perhaps facet.query (like you pointed out) can be used, I overlooked that. I never got an answer on the mailing list, so I implemented it instead :)

Well, the blog is not a value; it is a field of its own. We call it feedId, and it is a pointer to a row in the DB.
...
<field name="feedId" type="integer" indexed="true" stored="true" required="true" omitNorms="true" />
...

What I have accomplished is this:
select count(distinct feedId) from FeedItem where ...somexpression...

One doc is in this case a FeedItem, and each belongs to a Feed (many-to-one).
If this already can be accomplished in SOLR, my bad. Please tell me how.

Ted: Trove has two licenses, GPL and ASL. I can use the ASL version if it helps. I only use Trove due to its efficiency; plain hashmaps can of course be used if it is a showstopper.
[jira] Commented: (SOLR-1814) select count(distinct fieldname) in SOLR
[ https://issues.apache.org/jira/browse/SOLR-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1286#action_1286 ]

Marcus Herou commented on SOLR-1814:

Ted: I am an idiot about ASL. It is GNU Trove (I mixed it up with something else). I can add code which uses Trove if it is available on the classpath, or plain HashMaps if not. I think there are some good collection utils in commons. Will look it up. Trove however is super.
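One way to read "uses Trove if available on the classpath, or plain HashMaps if not" is a runtime check with a java.util fallback. The sketch below is an assumption about how that could look, not code from the patch: IntSet, HashIntSet and the reflectively loaded TroveIntSet adapter are hypothetical names, and gnu.trove.TIntHashSet is assumed to be the Trove 2.x class to probe for. The file compiles and runs without Trove present.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: prefer an optional library at runtime, fall back to java.util otherwise.
class OptionalTroveSketch {

    // Minimal int-set abstraction so callers never reference Trove types directly.
    interface IntSet {
        boolean add(int value);
        int size();
    }

    // Fallback backed by java.util.HashSet (boxes every int).
    static final class HashIntSet implements IntSet {
        private final Set<Integer> values = new HashSet<Integer>();

        public boolean add(int value) {
            return values.add(value);
        }

        public int size() {
            return values.size();
        }
    }

    // Returns true if the named class can be loaded from the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            return false;
        }
    }

    // ASSUMPTION: "gnu.trove.TIntHashSet" is the Trove 2.x class name, and
    // "TroveIntSet" is a hypothetical adapter (not shown) implementing IntSet.
    static IntSet newIntSet() {
        if (isOnClasspath("gnu.trove.TIntHashSet")) {
            try {
                return (IntSet) Class.forName("TroveIntSet")
                        .getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                // Adapter missing or not instantiable: use the fallback below.
            }
        }
        return new HashIntSet();
    }

    public static void main(String[] args) {
        IntSet feedIds = newIntSet();
        feedIds.add(1);
        feedIds.add(1);
        feedIds.add(2);
        System.out.println("distinct feedIds: " + feedIds.size());
    }
}
```

Loading the adapter by name keeps the hard dependency out of the compiled class, at the cost of a reflective call when the set is created.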
XMLWriter
Hello,

I don't want to roll up all the XMLWriter issues, but I stumbled upon this:
http://lucene.apache.org/solr/api/org/apache/solr/response/SolrQueryResponse.html#returnable_data
says that a Map containing any of the items in this list may be contained in a SolrQueryResponse and will be handled by QueryResponseWriters.

This is not true for (at least) keys in Maps. XMLWriter tries to cast any key to a String. (There is even a comment on this in the source!?)

Is there a reason not to use String.valueOf( entry.getKey() ) or such?

--
Kind regards,
Frank Wesemann
Fotofinder GmbH          VAT ID: DE812854514
Software Development     Web: http://www.fotofinder.com/
Potsdamer Str. 96        Tel: +49 30 25 79 28 90
10785 Berlin             Fax: +49 30 25 79 28 999

Registered office: Berlin
District Court (Amtsgericht) Berlin Charlottenburg (HRB 73099)
Managing Director: Ali Paczensky
[jira] Commented: (SOLR-1815) SolrJ doesn't preserve the order of facet queries returned from solr
[ https://issues.apache.org/jira/browse/SOLR-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844668#action_12844668 ]

Yonik Seeley commented on SOLR-1815:

I'll go ahead and make this change soon if there are no objections.
As it relates to SolrJ, HashMap vs LinkedHashMap for facet queries will be completely inconsequential.
The only potential burden here lies with the server side - is there some reason solr might not want to return them in order in the future? I really can't think of a realistic reason why not.

SolrJ doesn't preserve the order of facet queries returned from solr

Key: SOLR-1815
URL: https://issues.apache.org/jira/browse/SOLR-1815
Project: Solr
Issue Type: Bug
Components: clients - java
Affects Versions: 1.4
Reporter: Steve Radhouani
Original Estimate: 24h
Remaining Estimate: 24h

Using SolrJ, I wanted to sort the response of a range query based on some specific labels. For instance, using the query:
{noformat}
facet=true
facet.query={!key= Less than 100}[* TO 99]
facet.query={!key=100 - 200}[100 TO 200]
facet.query={!key=200 +}[201 TO *]
{noformat}
I wanted to display the response in the following order:
{noformat}
Less than 100 (x)
100 - 200 (y)
201 + (z)
{noformat}
independently of the values of x, y, z, which are the numbers of retrieved documents for each range.
While Solr itself correctly produces the desired order (as specified in my query), SolrJ doesn't preserve it.

RE: Yonik, a solution could be just to change
{code}
_facetQuery = new HashMap<String, Integer>();
...to...
_facetQuery = new LinkedHashMap<String, Integer>();
{code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
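The difference between the two map types proposed in this issue can be shown in isolation. The sketch below is not SolrJ code; the class name, the helper methods and the counts are made up for illustration. It only demonstrates the documented java.util behaviour: LinkedHashMap iterates in insertion order, while HashMap makes no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: iteration order of facet query labels in HashMap vs LinkedHashMap.
class FacetQueryOrderSketch {

    // Inserts the labels in the order the facet queries appear in the request.
    // The counts are invented placeholder values.
    static void fill(Map<String, Integer> facetQuery) {
        facetQuery.put("Less than 100", 7);
        facetQuery.put("100 - 200", 3);
        facetQuery.put("200 +", 5);
    }

    // Returns the labels in the order the map iterates over them.
    static List<String> labels(Map<String, Integer> facetQuery) {
        return new ArrayList<String>(facetQuery.keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> linked = new LinkedHashMap<String, Integer>();
        fill(linked);
        // Insertion order is guaranteed here.
        System.out.println("LinkedHashMap: " + labels(linked));

        Map<String, Integer> hashed = new HashMap<String, Integer>();
        fill(hashed);
        // Order depends on hashing and is not specified by the API.
        System.out.println("HashMap:       " + labels(hashed));
    }
}
```

Because both classes implement Map, swapping one for the other does not change any caller that only relies on the Map interface, which fits the remark that the change is inconsequential for SolrJ.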
Re: XMLWriter
On Fri, Mar 12, 2010 at 3:37 PM, Frank Wesemann f.wesem...@fotofinder.net wrote:
> Hello,
> I don't want to roll up all the XMLWriter issues, but I stumbled upon this:
> http://lucene.apache.org/solr/api/org/apache/solr/response/SolrQueryResponse.html#returnable_data
> says that a Map containing any of the items in this list may be contained in a SolrQueryResponse and will be handled by QueryResponseWriters.
> This is not true for (at least) keys in Maps. XMLWriter tries to cast any key to a String. (There is even a comment on this in the source!?)
> Is there a reason not to use String.valueOf( entry.getKey() ) or such?

Yeah, seems like the right approach. Any other places we missed this?

-Yonik
http://www.lucidimagination.com
Re: XMLWriter
> Yeah, seems like the right approach.

Good, I feared I had missed something obvious. :-)

> Any other places we missed this?

I'll have a look at it. I'll also open a JIRA issue and add patches etc., if you don't mind.

--
Kind regards,
Frank Wesemann
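The difference discussed in this thread, casting a map key versus converting it with String.valueOf, can be reproduced outside Solr. The sketch below is not the XMLWriter source; the class and method names are hypothetical and only the two conversion strategies are shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: casting a map key to String vs converting it with String.valueOf.
class MapKeySketch {

    // Throws ClassCastException when the key is not a String.
    static String keyByCast(Map.Entry<?, ?> entry) {
        return (String) entry.getKey();
    }

    // Works for any key type; a null key becomes the string "null".
    static String keyByValueOf(Map.Entry<?, ?> entry) {
        return String.valueOf(entry.getKey());
    }

    public static void main(String[] args) {
        Map<Object, Object> data = new LinkedHashMap<Object, Object>();
        data.put(Integer.valueOf(42), "integer key");
        data.put("name", "string key");

        for (Map.Entry<Object, Object> entry : data.entrySet()) {
            System.out.println("valueOf: " + keyByValueOf(entry));
            try {
                System.out.println("cast:    " + keyByCast(entry));
            } catch (ClassCastException e) {
                System.out.println("cast:    failed for key type "
                        + entry.getKey().getClass().getName());
            }
        }
    }
}
```

One behaviour to keep in mind with the String.valueOf approach is that a null key is written out as the literal text "null" instead of raising an error.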