[jira] [Commented] (LUCENE-2228) AES Encrypted Directory
[ https://issues.apache.org/jira/browse/LUCENE-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861923#comment-13861923 ]

Peter Karich commented on LUCENE-2228:

What is the state here?

AES Encrypted Directory
-----------------------
Key: LUCENE-2228
URL: https://issues.apache.org/jira/browse/LUCENE-2228
Project: Lucene - Core
Issue Type: New Feature
Components: modules/other
Affects Versions: 3.1
Reporter: Jay Mundrawala
Attachments: LUCENE-2228.patch, lucene-encryption.tar.gz

Provides an encryption solution for Lucene indexes, using the AES encryption algorithm. You must have the JCE Unlimited Strength Jurisdiction Policy Files 6 Release Candidate, which you can get from java.sun.com.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
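The policy-file requirement mentioned in the issue can be checked at runtime with the standard `javax.crypto` API; a minimal sketch (the class name is mine, and on modern JDKs the unlimited policy is the default, so the check passes without installing anything):

```java
import javax.crypto.Cipher;

public class JceCheck {
    public static void main(String[] args) throws Exception {
        // With the restricted default policy of that era this returned 128;
        // with the unlimited-strength policy files it is effectively unbounded.
        int max = Cipher.getMaxAllowedKeyLength("AES");
        System.out.println("Max AES key length: " + max);
        if (max < 256) {
            System.out.println("AES-256 unavailable: install the JCE unlimited-strength policy files.");
        }
    }
}
```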
Re: Let's drop Maven Artifacts !
Why not vote for or against 'maven artifacts'? http://www.doodle.com/2qp35b42vstivhvx

I'm using lucene+solr a lot via maven. Elasticsearch uses lucene via gradle, Solandra uses lucene via ivy, and so on ;) So maven artifacts are not only handy for maven folks. But I think no artifacts would be better than broken ones. Why not try to 'switch' to the ivy build system? It's still ant, but it handles dependencies better IMO.

Regards,
Peter.

On Tue, Jan 18, 2011 at 3:49 PM, Grant Ingersoll gsing...@apache.org wrote:

On Jan 18, 2011, at 12:13 PM, Robert Muir wrote:

I can't help but remind myself that this is the same argument Oracle offered up for the whole Hudson debacle (http://hudson-labs.org/content/whos-driving-thing). Declaring that I have a secret pocket of users that want XYZ isn't open source consensus.

You were very quick to cite your own secret pocket of users when you called those who support it the vocal minority. So, if you want to continue baiting the discussion we can, but as I see it, we have committers willing to support it, so what's the big deal?

I don't think they are that secret; you can look at the last maven discussion and see several other committers who spoke up against it. They are just sick of the discussion, I gather, and have given up fighting it. The problem, again, is the magical special artifacts. I don't see consensus here for maven... when you have it, get back to me.
Re: Report of the most searched terms
IMO there is no built-in method for this. So either use an app like piwik in your frontend, or grep the logs: http://karussell.wordpress.com/2010/10/27/feeding-solr-with-its-own-logs/

Regards,
Peter.

Hi,
I would like to know if there is a way to get data from SOLR and display it as a report. It would be a report of the most searched terms and should include total searches, total searches with results, and total searches without results. How can I accomplish this?
Thanks in advance,
Jonilson

--
http://jetwick.com open twitter search
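The log-grepping approach suggested above can be sketched in a few lines of Java. The log-line format below is invented for illustration only; real Solr request logs depend on the servlet container's logging configuration:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TopQueries {
    // Extracts the q= parameter from request log lines and counts each term.
    public static Map<String, Integer> count(List<String> logLines) {
        Pattern q = Pattern.compile("\\bq=([^&}\\s]+)");
        Map<String, Integer> counts = new HashMap<>();
        for (String line : logLines) {
            Matcher m = q.matcher(line);
            if (m.find()) counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical log lines, loosely modeled on Solr request logging.
        List<String> lines = Arrays.asList(
            "path=/select params={q=apache&rows=10} hits=12 status=0",
            "path=/select params={q=lucene&rows=10} hits=0 status=0",
            "path=/select params={q=apache&rows=10} hits=12 status=0");
        System.out.println("apache searched " + count(lines).get("apache") + " times"); // apache searched 2 times
    }
}
```

Sorting the resulting map by value then gives the "most searched terms" report; counting lines with `hits=0` covers the searches-without-results part.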
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970612#action_12970612 ]

Peter Karich commented on SOLR-1729:

Hi Yonik, sorry for another misposting: yes, you were right, it was the wrong solr version. It was too late yesterday :-/ All is fine now with this patch. But the org.apache.solr.request.SolrRequestInfo class is missing, or am I completely crazy now? (I checked out solr twice and applied the patch again, but it didn't compile.)

Regards,
Peter.

Date Facet now override time parameter
--------------------------------------
Key: SOLR-1729
URL: https://issues.apache.org/jira/browse/SOLR-1729
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 1.4
Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
Attachments: FacetParams.java, SimpleFacets.java, solr-1.4.0-solr-1729.patch, SOLR-1729.patch, SOLR-1729.patch, UnInvertedField.java

This PATCH introduces a new query parameter that tells a (typically, but not necessarily) remote server what time to use as 'NOW' when calculating date facets for a query (and, for the moment, date facets *only*) - overriding the default behaviour of using the local server's current time. This gets 'round a problem whereby an explicit time range is specified in a query (e.g. timestamp:[then0 TO then1]), and date facets are required for the given time range (in fact, any explicit time range). Because DateMathParser performs all its calculations from 'NOW', remote callers have to work out how long ago 'then0' and 'then1' are from 'now', and use the relative-to-now values in the facet.date.xxx parameters. If a remote server has a different opinion of NOW compared to the caller, the results will be skewed (e.g. they are in a different time-zone, not time-synced etc.).
This becomes particularly salient when performing distributed date faceting (see SOLR-1709), where multiple shards may all be running with different times, and the faceting needs to be aligned. The new parameter is called 'facet.date.now', and takes as a parameter a (stringified) long that is the number of milliseconds from the epoch (1 Jan 1970 00:00) - i.e. the returned value from a System.currentTimeMillis() call. This was chosen over a formatted date to delineate it from a 'searchable' time and to avoid superfluous date parsing. This makes the value generally a programmatically-set value, but as that is where the use-case is for this type of parameter, this should be ok.

NOTE: This parameter affects date facet timing only. If there are other areas of a query that rely on 'NOW', these will not interpret this value. This is a broader issue about setting a 'query-global' NOW that all parts of query analysis can share.

Source files affected:
FacetParams.java (holds the new constant FACET_DATE_NOW)
SimpleFacets.java (getFacetDateCounts() NOW parameter modified)

This PATCH is mildly related to SOLR-1709 (Distributed Date Faceting), but as it's a general change for date faceting, it was deemed deserving of its own patch. I will be updating SOLR-1709 in due course to include the use of this new parameter, after some rfc acceptance. A possible enhancement to this is to detect facet.date fields, look for and match these fields in queries (if they exist), and potentially determine automatically the required time skew, if any. There are a whole host of reasons why this could be problematic to implement, so an explicit facet.date.now parameter is the safest route.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
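As the description says, the value of facet.date.now is just the result of System.currentTimeMillis(). A hypothetical request could be assembled like this; the host, core, the 'timestamp' field, and the facet.date.start/end/gap values are all placeholders of mine, not from the patch:

```java
public class FacetDateNowExample {
    public static void main(String[] args) {
        // facet.date.now expects epoch milliseconds, i.e. System.currentTimeMillis().
        long now = System.currentTimeMillis();
        String url = "http://localhost:8983/solr/select?q=*:*"
                + "&facet=true&facet.date=timestamp"
                + "&facet.date.start=NOW/DAY-1DAY&facet.date.end=NOW/DAY&facet.date.gap=%2B1HOUR"
                + "&facet.date.now=" + now;
        System.out.println(url);
    }
}
```

A caller that sends the same `now` value to every shard pins all DateMathParser calculations to one instant, which is the point of the parameter.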
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970671#action_12970671 ]

Peter Karich commented on SOLR-1729:

Nice, now this 1729 patch applies, compiles, and passes the tests (I'm using rev 1044942 of trunk).

One further question: would facet queries (with dates) work in a distributed setup without the date patches? I'm after a quick(er) workaround, because I would need the patch for 1.4.1 (-solandra).
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970544#action_12970544 ]

Peter Karich commented on SOLR-1729:

Yonik, thanks for the update. I refreshed my sources (now trunk) to rev 1044745, but the patch does not apply cleanly* for SearchHandler. Am I doing something stupid here?

Regards,
Peter.

* pathxy/solr_branch_3x$ patch -p0 < SOLR-1729.patch
patching file solr/src/test/test-files/solr/conf/schema12.xml
patching file solr/src/test/org/apache/solr/search/function/TestFunctionQuery.java
Hunk #1 succeeded at 301 (offset -17 lines).
patching file solr/src/test/org/apache/solr/handler/component/SpellCheckComponentTest.java
patching file solr/src/test/org/apache/solr/handler/component/TermVectorComponentTest.java
patching file solr/src/java/org/apache/solr/core/QuerySenderListener.java
patching file solr/src/java/org/apache/solr/request/SimpleFacets.java
Hunk #1 succeeded at 64 (offset -9 lines).
Hunk #2 succeeded at 620 (offset -200 lines).
Hunk #3 succeeded at 630 (offset -200 lines).
Hunk #4 succeeded at 645 (offset -200 lines).
Hunk #5 succeeded at 803 (offset -200 lines).
patching file solr/src/java/org/apache/solr/handler/component/SearchHandler.java
Hunk #1 FAILED at 192.
Hunk #2 succeeded at 255 (offset -36 lines).
1 out of 2 hunks FAILED -- saving rejects to file solr/src/java/org/apache/solr/handler/component/SearchHandler.java.rej
patching file solr/src/java/org/apache/solr/handler/component/ResponseBuilder.java
Hunk #2 succeeded at 67 (offset -1 lines).
patching file solr/src/java/org/apache/solr/spelling/SpellCheckCollator.java
patching file solr/src/java/org/apache/solr/util/TestHarness.java
Hunk #2 succeeded at 320 (offset -9 lines).
Hunk #3 succeeded at 335 (offset -9 lines).
patching file solr/src/java/org/apache/solr/util/DateMathParser.java
patching file solr/src/webapp/src/org/apache/solr/servlet/SolrServlet.java
patching file solr/src/webapp/src/org/apache/solr/servlet/SolrDispatchFilter.java
Hunk #1 succeeded at 241 (offset 4 lines).
Hunk #2 succeeded at 255 (offset 4 lines).
Hunk #3 succeeded at 283 (offset 4 lines).
patching file solr/src/webapp/src/org/apache/solr/servlet/DirectSolrConnection.java
Hunk #2 succeeded at 170 (offset -16 lines).
Hunk #3 succeeded at 185 with fuzz 1 (offset -16 lines).
patching file solr/src/webapp/src/org/apache/solr/client/solrj/embedded/EmbeddedSolrServer.java
Hunk #1 succeeded at 32 with fuzz 1 (offset -9 lines).
Hunk #2 succeeded at 138 (offset -11 lines).
Hunk #3 succeeded at 156 (offset -77 lines).
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966488#action_12966488 ]

Peter Karich commented on SOLR-1729:

*regarding 1.4.1*
Hmmh, today download.carrot2.org is down and I had to delete contrib/clustering to do the build after the patch, which does not apply cleanly (strange that it applied yesterday):

solr1.4.1$ patch -p0 < solr-1.4.0-solr-1729.patch
patching file src/java/org/apache/solr/handler/component/FacetComponent.java
patching file src/java/org/apache/solr/handler/component/ResponseBuilder.java
solr1.4.1$ patch -p0 < solr-1.4.0-solr-1709.patch
patching file src/java/org/apache/solr/handler/component/FacetComponent.java
Reversed (or previously applied) patch detected! Assume -R? [n] y
patching file src/java/org/apache/solr/handler/component/ResponseBuilder.java
Reversed (or previously applied) patch detected! Assume -R? [n] y
Hunk #3 succeeded at 251 (offset -1 lines).

Or is this ok?? Because then, all tests would pass ...

*regarding branch3x*
Both patches do not apply cleanly. SOLR-1709 fails even without SOLR-1729:

solr_branch_3x/solr$ patch -p0 < solr-1.4.0-solr-1709.patch
patching file src/java/org/apache/solr/handler/component/FacetComponent.java
Hunk #1 succeeded at 240 (offset 2 lines).
Hunk #2 succeeded at 267 with fuzz 2 (offset 7 lines).
Hunk #3 FAILED at 436.
1 out of 3 hunks FAILED -- saving rejects to file src/java/org/apache/solr/handler/component/FacetComponent.java.rej
patching file src/java/org/apache/solr/handler/component/ResponseBuilder.java
Reversed (or previously applied) patch detected! Assume -R? [n] y
Hunk #2 FAILED at 61.
Hunk #3 FAILED at 252.
2 out of 3 hunks FAILED -- saving rejects to file src/java/org/apache/solr/handler/component/ResponseBuilder.java.rej
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966520#action_12966520 ]

Peter Karich commented on SOLR-1729:

Hi Peter, 1.4.1 would be fine (I asked Jake from solandra; before, I thought he used the trunk). Now, in my last comment I made a stupid mistake: the patches didn't cleanly apply for 1.4.1 because I accidentally overwrote solr-1729.patch with solr-1709 when copying from branch3x, and got two identical 1709 patches :-/

So: for 1.4.1 the patches apply cleanly. But the question remains why the following tests are failing:

Test org.apache.solr.TestTrie FAILED
Test org.apache.solr.request.SimpleFacetsTest FAILED
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966562#action_12966562 ]

Peter Karich commented on SOLR-1729:

Hi Peter, sorry for the confusion :-/ I was speaking of 1.4.1: the two patches apply; 2 tests fail.

Regards,
Peter.
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966181#action_12966181 ]

Peter Karich commented on SOLR-1729:

Peter Sturge, in SOLR-1709 you said that you are working with branch3x. I checked it out from here: https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x but this 1729 patch didn't apply cleanly*. When I tried the 1.4.1 release it is ok, but the tests fail due to**. What could be wrong?

Regards,
Peter.

* solr_branch_3x/solr$ patch -p0 < solr-1.4.0-solr-1729.patch
patching file src/java/org/apache/solr/request/SimpleFacets.java
Hunk #1 succeeded at 245 (offset 28 lines).
Hunk #2 succeeded at 280 (offset 28 lines).
Hunk #3 FAILED at 582.
Hunk #4 FAILED at 652.
2 out of 4 hunks FAILED -- saving rejects to file src/java/org/apache/solr/request/SimpleFacets.java.rej
patching file src/java/org/apache/solr/request/UnInvertedField.java
Hunk #2 succeeded at 40 with fuzz 1 (offset 1 line).
Hunk #3 succeeded at 440 (offset 5 lines).
Hunk #4 succeeded at 557 (offset 5 lines).
patching file src/common/org/apache/solr/common/params/FacetParams.java
Hunk #1 FAILED at 175.
1 out of 1 hunk FAILED -- saving rejects to file src/common/org/apache/solr/common/params/FacetParams.java.rej

** [junit] Running org.apache.solr.TestTrie
[junit] xml response was: (the response body is unreadable in the archive, which stripped the XML markup; it listed 15 result documents with tdate/tdouble/tfloat/tint/tlong fields and the corresponding facet counts)
[jira] Commented: (SOLR-1709) Distributed Date Faceting
[ https://issues.apache.org/jira/browse/SOLR-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965841#action_12965841 ] Peter Karich commented on SOLR-1709: Hi Peter, sorry for getting back to you so late. I'm relatively sure now that I'll need that patch (also Jake from Solandra was asking when this patch will be ready :-)) So, will I need to apply SOLR-1729 and then this patch to the 3x branch, or can I apply it even without SOLR-1729 (not necessary in my case)? Regards, Peter. Distributed Date Faceting - Key: SOLR-1709 URL: https://issues.apache.org/jira/browse/SOLR-1709 Project: Solr Issue Type: Improvement Components: SearchComponents - other Affects Versions: 1.4 Reporter: Peter Sturge Priority: Minor Attachments: FacetComponent.java, FacetComponent.java, ResponseBuilder.java, solr-1.4.0-solr-1709.patch This patch is for adding support for date facets when using distributed searches. Date faceting across multiple machines exposes some time-based issues that anyone interested in this behaviour should be aware of: Any time and/or time-zone differences are not accounted for in the patch (i.e. merged date facets are at a time-of-day, not necessarily at a universal 'instant-in-time', unless all shards are time-synced to the exact same time). The implementation uses the first encountered shard's facet_dates as the basis for subsequent shards' data to be merged in. This means that if subsequent shards' facet_dates are skewed in relation to the first by 1 'gap', these 'earlier' or 'later' facets will not be merged in. There are several reasons for this: * Performance: It's faster to check facet_date lists against a single map's data, rather than against each other, particularly if there are many shards * If 'earlier' and/or 'later' facet_dates are added in, this will make the time range larger than that which was requested (e.g.
a request for one hour's worth of facets could bring back 2, 3 or more hours of data) This could be dealt with if timezone and skew information was added, and the dates were normalized. One possibility for adding such support is to [optionally] add 'timezone' and 'now' parameters to the 'facet_dates' map. This would tell requesters what time and TZ the remote server thinks it is, and so multiple shards' time data can be normalized. The patch affects 2 files in the Solr core: org.apache.solr.handler.component.FacetComponent.java org.apache.solr.handler.component.ResponseBuilder.java The main changes are in FacetComponent - ResponseBuilder is just to hold the completed SimpleOrderedMap until the finishStage. One possible enhancement is to perhaps make this an optional parameter, but really, if facet.date parameters are specified, it is assumed they are desired. Comments suggestions welcome. As a favour to ask, if anyone could take my 2 source files and create a PATCH file from it, it would be greatly appreciated, as I'm having a bit of trouble with svn (don't shoot me, but my environment is a Redmond-based os company). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
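The merge strategy described above (the first shard's facet_dates are the basis, skewed 'earlier'/'later' buckets are dropped) can be sketched roughly as follows. This is an illustration of the described behaviour, not code from the patch; the class and method names are made up:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FacetDateMergeSketch {

    // Merge per-shard facet_date counts into the first shard's map.
    // Buckets a later shard reports but the first shard did not return
    // (the 'earlier'/'later' buckets of a skewed shard) are simply dropped.
    public static Map<String, Integer> merge(List<Map<String, Integer>> shards) {
        Map<String, Integer> merged = new LinkedHashMap<>(shards.get(0));
        for (Map<String, Integer> shard : shards.subList(1, shards.size())) {
            for (Map.Entry<String, Integer> e : shard.entrySet()) {
                // only add counts for buckets the first shard also returned
                merged.computeIfPresent(e.getKey(), (bucket, count) -> count + e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Integer> shard1 = Map.of("2010-12-02T00:00:00Z", 3, "2010-12-02T01:00:00Z", 2);
        Map<String, Integer> shard2 = Map.of("2010-12-02T00:00:00Z", 1, "2010-12-02T02:00:00Z", 5);
        // the bucket only shard2 knows about is not merged in
        System.out.println(merge(List.of(shard1, shard2)));
    }
}
```

This also makes the performance argument from the description concrete: every shard is checked against one map rather than against every other shard's list.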
Re: WordDelimiterFilter bug
Hi Robert, QueryGenerator^H^H^HParser Thanks for the hint. I should have done a debugQuery=on earlier ... sorry. But how can I get: str name=parsedquerytw:abc tw:a tw:bc instead of: parsedqueryMultiPhraseQuery(tw:(abc a) bc) for the query aBc ? Regards, Peter. On Thu, Nov 18, 2010 at 5:26 PM, Peter Karichpeat...@yahoo.de wrote: Hi, I asked this on the user list and I think I found a bug in ... (The strange thing is that the admin GUI will highlight it correctly) because the admin gui highlights it, there's no bug. the reason it doesnt match is because of QueryGenerator^H^H^HParser automatically generating phrasequeries. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: WordDelimiterFilter bug
Hi Robert, thanks a lot! I will try a newer solr version for other reasons, and then I will try your suggested option too! (I will repost your solution to the user mailing list if that is ok for you ...) Where can I find more info about phrasequeries? I only found*. I mean, how does MultiPhraseQuery select its documents for (tw:(abc a) bc)? Regards, Peter. * http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Optimizing-Findability-Lucene-and-Solr On Fri, Nov 19, 2010 at 5:12 AM, Peter Karich peat...@yahoo.de wrote: Hi Robert, QueryGenerator^H^H^HParser Thanks for the hint. I should have done a debugQuery=on earlier ... sorry. But how can I get: str name=parsedquery tw:abc tw:a tw:bc instead of: parsedquery MultiPhraseQuery(tw:(abc a) bc) for the query aBc? If you are using Solr branch_3x or trunk, you can turn this off by setting autoGeneratePhraseQueries to false in the fieldType. fieldType name=text class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false With this option set to false, phrase queries are only created by the queryparser when you enclose stuff in double quotes. If you are using an older version of solr such as 1.4.x, then you can only hack it by adding a PositionFilterFactory to the end of your query analyzer. The downside to that approach (unfortunately the only approach for older versions) is that it completely disables phrasequeries across the board for that field type. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
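The fieldType snippet quoted above, reconstructed with its markup (the analyzer chain shown here is a placeholder; keep whatever tokenizer and filters your field already uses):

```xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100"
           autoGeneratePhraseQueries="false">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```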
Re: WordDelimiterFilter bug
Thanks for the explanation! That makes sense :-) Regards, Peter. On Fri, Nov 19, 2010 at 6:18 AM, Peter Karichpeat...@yahoo.de wrote: Hi Robert, thanks a lot! I will try a newer solr version for other reasons then I will try your suggested option too! (I will repost your solution to the user mailing list if that is ok for you ...) yes, please do! Where can I find more info about phrasequeries? I only found* I mean, how does MultiPhraseQuery selects its documents for (tw:(abc a) bc) ? the multiphrasequery is just like a more general phrase query. a phrase query for abc bc looks for abc in the document, followed by bc a multiphrasequery for (abc a) bc looks for (abc OR a) in the document, followed by bc. this is also the same way synonyms work with phrase queries. imagine you have a synonyms file that looks like this: dog = dog, dogs food = food, chow then if a user types dog food, the resulting query is a multiphrasequery of (dog dogs) (food chow) this matches all 4 possibilities: dog food dogs food dog chow dogs chow for more information, you can see the code to this query here: http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/src/java/org/apache/lucene/search/MultiPhraseQuery.java - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
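The matching rule Robert describes can be illustrated with a tiny standalone sketch. This is a simplification of what Lucene's MultiPhraseQuery does over term positions in the index, not its actual implementation:

```java
import java.util.Arrays;
import java.util.List;

public class MultiPhraseSketch {

    // A multi-phrase is a list of term sets, one set per position.
    // The document matches if at some start position a term from set 0 occurs,
    // immediately followed by a term from set 1, and so on.
    public static boolean matches(List<String> docTokens, List<List<String>> phrase) {
        for (int start = 0; start + phrase.size() <= docTokens.size(); start++) {
            boolean ok = true;
            for (int i = 0; i < phrase.size(); i++) {
                if (!phrase.get(i).contains(docTokens.get(start + i))) {
                    ok = false;
                    break;
                }
            }
            if (ok) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // (dog OR dogs) followed by (food OR chow), as in the synonyms example
        List<List<String>> q = List.of(List.of("dog", "dogs"), List.of("food", "chow"));
        System.out.println(matches(Arrays.asList("my", "dogs", "chow"), q));  // true
        System.out.println(matches(Arrays.asList("dogs", "and", "chow"), q)); // false: not adjacent
    }
}
```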
Re: Lucene project announcement
A) make a great Java to C# porting tool db4o already has something like this, I think. Maybe the guys from db4o would give lucene committers a license? Or would it be bad to rely on non-free tools like this? Or maybe there is already an equivalent tool? Regards, Peter. I'm forced to politely disagree with some of these thoughts, let me explain why: In order for this technique to be successful it seems that there is as much work being poured into the porting technique as there is with the port itself. From my point of view it does seem like this is double the work for benefits that are perhaps not as good as they could be (I am not saying in any way that Lucene.NET today is not good, it is quite good and is the result of great efforts from a lot of very dedicated people), since it follows a java-style design which is great for the java world, but perhaps not always optimal for the C# world. The project should be doing one thing, either: A) make a great Java to C# porting tool B) make a great search engine in C# As an example, it would be a hair-pulling experience to take Lucene.NET as it is today and use it on Microsoft Azure, an environment that is specifically designed for .NET applications. As I said before, besides using Lucene.NET itself I haven't contributed much and only in discussions - I haven't committed any code. However I will say this: I personally don't know nor care about the Java language just as I'm sure many of you don't care about Prolog. In order to help out, I feel that I need to be able to read and understand the Lucene version in order to make the same stuff happen in the Lucene.NET version. This means I have to be both a Java and C# developer at the same time? Mathematicians have been using math to explain algorithms for years, it is a universal language that is (to different levels) understood by all.
How those functional algorithms are implemented in an imperative language makes no difference, so long as they are implemented and produce the intended result. I think that in the end, there should be at least 3 projects for Lucene:
1. The Lucene algorithms, in a platform-neutral language - let the search engine gurus implement how this should be done without having to worry about imperative programming and the hacks to get there - either a compiler or a manual model would be used to implement these algorithms
2. Lucene - Architecture of the project(s) - perhaps a lot of UML here in a format where it can be fed to quickly produce skeleton files
3.x. Lucene - language-specific versions
As Grant points out, it is up to the community to make a decision, so let's all get together and see if collectively a decision can be made. And for the record, I personally think that when an open source project has 3+ ports to the same language - there is a problem. What that problem is however, I won't venture to take any guesses. I make these comments for the good of the project(s) and it is in no way my intention to offend anyone and I salute all work and effort done thus far, we would not be here were it not for everyone involved. Karell Ste-Marie C.I.O. - BrainBank Inc -Original Message- From: Alex Thompson [mailto:pierogi...@hotmail.com] Sent: Thursday, November 18, 2010 3:58 AM To: lucene-net-...@lucene.apache.org Subject: RE: Lucene project announcement I don't think Lucene.Net staying a line-by-line port is craziness. We're not saying that Lucene.Net is the one true implementation and there can be no others. I see Lucene.Net as part of a spectrum of solutions. On one end of the spectrum is IKVM. If you want all the java lucene features immediately and the constraints of IKVM work for your scenario then great, off you go. Then there is Lucene.Net.
This is good if IKVM doesn't work for you, you want short lag time behind java lucene (yes this needs improvement but we're working on it), and the ability to read java lucene books/examples and apply that relatively seamlessly to your .NET code. Then on the other end of the spectrum are the forks (wrapper/extension/refactor etc.) that try to make things ideal for the .NET world. I think it's clear there is interest and support for both Lucene.Net and the forks. They should both exist and be complementary, not competitive. The forks provide greater flexibility and greater exposure so more users and contributors can get involved. Lucene.Net provides the benefits listed above and provides an avenue for features to trickle down from java lucene to the forks. So bottom line there is no one-size-fits-all implementation. Lucene.Net (as a line-by-line port) provides good value to a significant user base and (assuming we can optimize the porting) takes relatively little effort, so it is a useful part of the spectrum. Alex -Original Message- From: Andrew Busby [mailto:andrew.bu...@aimstrategic.com] Sent: Wednesday, November 17, 2010 5:06 PM To: lucene-net-...@lucene.apache.org Subject:
WordDelimiterFilter bug
Hi, I asked this on the user list and I think I found a bug in WordDelimiterFilterFactory for splitOnCaseChange=1 catenateAll=0 preserveOriginal=1 (+ lowercase filter). Add the following test* and append the definition to the schema.xml** and it won't pass. Should I open a JIRA issue for this, or isn't this a bug and I missed something? (The strange thing is that the admin GUI will highlight it correctly) Regards, Peter.

BTW: I just read the code of SpellCheckCollator because it didn't compile. It is:

  } catch (Exception e) {
    Log.warn("Exception trying to re-query to check if a spell check possibility would return any hits.", e);

It should NOT use the jetty Log (remove the jetty dep):

  } catch (Exception e) {
    LOG.warn("Exception trying to re-query to check if a spell check possibility would return any hits.", e);

*
  @Test
  public void testCaseChangeAndPreserve() {
    assertU(adoc("id", "1", "subword_cc", "abcd"));
    assertU(adoc("id", "2", "subword_cc", "abCd.com"));
    assertU(commit());
    assertQ("simple - case change and preserve",
        req("subword_cc:(abcd)"), "//result[@numFound=1]");
    // returns at the moment only doc 2
    // should also return doc1 because abCd should be preserved + lowercase filter (for the query)
    assertQ("camel case query - case change and preserve",
        req("subword_cc:(abCd)"), "//result[@numFound=2]");
    // returns at the moment 0 docs
    // should return doc2 because abCd.com should be preserved + lowercase filter (for the index)
    assertQ("camel case domain - case change and preserve",
        req("subword_cc:(abcd.com)"), "//result[@numFound=1]");
    clearIndex();
  }

**
  <fieldtype name="subword_cc" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1" catenateAll="0" preserveOriginal="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1" catenateAll="0" preserveOriginal="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldtype>
  <field name="subword_cc" type="subword_cc" indexed="true" stored="true"/>

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
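To see why the reporter expects a match, here is a rough model of what the analyzer chain above produces for a single token: splitOnCaseChange plus preserveOriginal, then lowercasing. The real WordDelimiterFilter also splits on non-alphanumerics such as the '.' in abCd.com, which this sketch ignores:

```java
import java.util.ArrayList;
import java.util.List;

public class CaseChangeSplitSketch {

    // Emits the original token (preserveOriginal=1) plus the parts produced by
    // splitting on lower->upper case changes (splitOnCaseChange=1), then
    // lowercases everything, mimicking the trailing LowerCaseFilterFactory.
    public static List<String> analyze(String token) {
        List<String> out = new ArrayList<>();
        out.add(token.toLowerCase());   // preserved original, lowercased
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 1; i < token.length(); i++) {
            if (Character.isLowerCase(token.charAt(i - 1)) && Character.isUpperCase(token.charAt(i))) {
                parts.add(token.substring(start, i));
                start = i;
            }
        }
        parts.add(token.substring(start));
        if (parts.size() > 1) {
            for (String p : parts) out.add(p.toLowerCase());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(analyze("abCd")); // [abcd, ab, cd]
        System.out.println(analyze("abcd")); // [abcd]
    }
}
```

In this model the query token abCd yields abcd among its terms, which is why the test expects it to match the document indexed as abcd.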
[jira] Commented: (SOLR-792) Pivot (ie: Decision Tree) Faceting Component
[ https://issues.apache.org/jira/browse/SOLR-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12929590#action_12929590 ] Peter Karich commented on SOLR-792: --- Hi Toke and all, maybe I am a bit evil or stupid, but could someone enlighten me why this patch is necessary? Why can't we use the existing mechanisms in Solr (facets!) and a bit of logic while indexing: http://markmail.org/message/2aza6nnsiw3l4bbb#query:+page:1+mid:3j3ttojacpjoyfg5+state:results This has no performance problems when using tons of categories. We are already using it with lots of categories. It works out of the box with nearly infinite depth (either you need a DB, then it's unlimited, or the URL length is the limit). The only drawback of this approach is that you won't be able to display two or more 'branches' at the same time. Only one current branch with the current possible categories is possible, which is no limitation in our case, because the UI would be unusable if too many items were visible at the same time. One could introduce a special update component for this feature which uses a category tree (in RAM) built from the JSON or XML definition. I could create such a component if someone is interested. Regards, Peter. Pivot (ie: Decision Tree) Faceting Component Key: SOLR-792 URL: https://issues.apache.org/jira/browse/SOLR-792 Project: Solr Issue Type: New Feature Reporter: Erik Hatcher Assignee: Yonik Seeley Priority: Minor Attachments: SOLR-792-as-helper-class.patch, SOLR-792-PivotFaceting.patch, SOLR-792-PivotFaceting.patch, SOLR-792-PivotFaceting.patch, SOLR-792-PivotFaceting.patch, SOLR-792-raw-type.patch, SOLR-792.patch, SOLR-792.patch, SOLR-792.patch, SOLR-792.patch, SOLR-792.patch, SOLR-792.patch, SOLR-792.patch A component to do multi-level faceting. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
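The indexing-time trick referred to in the comment is commonly implemented by emitting one token per hierarchy level, prefixed with its depth. A hypothetical sketch (the exact token format in the linked thread may differ):

```java
import java.util.ArrayList;
import java.util.List;

public class PathFacetTokensSketch {

    // Emit one token per hierarchy level, prefixed with its depth, so that
    // plain field faceting with a prefix restriction shows exactly the
    // children of the currently selected branch.
    public static List<String> tokens(String path) {
        String[] parts = path.split("/");
        List<String> out = new ArrayList<>();
        StringBuilder prefix = new StringBuilder();
        for (int depth = 0; depth < parts.length; depth++) {
            if (depth > 0) prefix.append('/');
            prefix.append(parts[depth]);
            out.add(depth + "/" + prefix);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("electronics/cameras/slr"));
        // [0/electronics, 1/electronics/cameras, 2/electronics/cameras/slr]
    }
}
```

Faceting on such a field with facet.prefix=1/electronics then yields only the second-level categories below electronics, which is the "only one current branch at a time" behaviour the comment describes.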
[jira] Commented: (SOLR-1709) Distributed Date Faceting
[ https://issues.apache.org/jira/browse/SOLR-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12929760#action_12929760 ] Peter Karich commented on SOLR-1709: Hi Peter Sturge, what are the limitations of this patch? Only that earlier + later isn't supported? What are the issues before committing this into trunk? Distributed Date Faceting - Key: SOLR-1709 URL: https://issues.apache.org/jira/browse/SOLR-1709 Project: Solr Issue Type: Improvement Components: SearchComponents - other Affects Versions: 1.4 Reporter: Peter Sturge Priority: Minor Attachments: FacetComponent.java, FacetComponent.java, ResponseBuilder.java, solr-1.4.0-solr-1709.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2218) Performance of start= and rows= parameters are exponentially slow with large data sets
[ https://issues.apache.org/jira/browse/SOLR-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12928536#action_12928536 ] Peter Karich commented on SOLR-2218: Lance, would you mind explaining this a bit in detail :-) ? The idea is to grab all (or a lot of) documents from solr even if the dataset is very large, if I haven't misunderstood what Bill was requesting. This is very useful IMHO. Performance of start= and rows= parameters are exponentially slow with large data sets -- Key: SOLR-2218 URL: https://issues.apache.org/jira/browse/SOLR-2218 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 1.4.1 Reporter: Bill Bell With large data sets, 10M rows. Setting start=large number and rows=large numbers is slow, and gets slower the farther you get from start=0 with a complex query. Random also makes this slower. Would like to somehow make this performance faster for looping through large data sets. It would be nice if we could pass a pointer to the result set to loop, or support very large rows=number. Something like: rows=1000 start=0 spointer=string_my_query_1 Then within an interval (like 5 mins) I can reference this loop: Something like: rows=1000 start=1000 spointer=string_my_query_1 What do you think? Since the data set is so large, the cache is not helping. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
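One standard way around the cost of a large start= offset, sketched here over a plain sorted list (an approach along these lines later landed in Solr as cursorMark; the names below are made up): sort on a unique key and, instead of an offset, filter on key greater than the last one seen.

```java
import java.util.List;
import java.util.stream.Collectors;

public class SearchAfterSketch {

    // start=N forces the engine to collect and discard N documents per request.
    // Sorting by a unique key and asking for "key > lastSeen" keeps every page
    // equally cheap. 'index' stands in for an id-sorted result stream.
    public static List<Integer> nextPage(List<Integer> index, int lastSeenId, int rows) {
        return index.stream()
                .filter(id -> id > lastSeenId)
                .limit(rows)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> index = List.of(1, 3, 5, 7, 9, 11);
        List<Integer> page1 = nextPage(index, Integer.MIN_VALUE, 2); // [1, 3]
        List<Integer> page2 = nextPage(index, page1.get(page1.size() - 1), 2); // [5, 7]
        System.out.println(page1 + " " + page2);
    }
}
```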
fast bitset
Hi, would this compressed and fast(?) bitset be interesting for solr/lucene or is openbitset already done this way? quoting from github: The goal of word-aligned compression is not to achieve the best compression, but rather to improve query processing time. License is GPL version 3 and ASL2.0. http://code.google.com/p/javaewah https://github.com/lemire/javaewah I just saw it on twitter ... Regards, Peter. -- http://jetwick.com twitter search prototype - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: fast bitset
And they're not random-access capable. Which means it isn't applicable? Important point about WAH and friends is their ability to be fast and/or/not/xor'ed without full decompression. And they're not random-access capable. On Fri, Nov 5, 2010 at 18:47, Uwe Schindler u...@thetaphi.de wrote: Looks interesting. I was only annoyed when I saw new Vector<Integer>(), which is synchronized, in the iterator code, which is the thing that is most important for DocIdSets. Looks like stone ages. Else I would simply give it a try by rewriting the class to also implement DocIdSet and return the optimized iterator (not the one in this class). You can then try to replace some OpenBitSets in any filters and perf test? Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Peter Karich [mailto:peat...@yahoo.de] Sent: Friday, November 05, 2010 3:38 PM To: dev@lucene.apache.org Subject: fast bitset Hi, would this compressed and fast(?) bitset be interesting for solr/lucene or is OpenBitSet already done this way? quoting from github: The goal of word-aligned compression is not to achieve the best compression, but rather to improve query processing time. License is GPL version 3 and ASL2.0. http://code.google.com/p/javaewah https://github.com/lemire/javaewah I just saw it on twitter ... Regards, Peter. -- http://jetwick.com twitter search prototype - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
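The point about and/or/not/xor without full decompression can be illustrated with a stripped-down run-length model. JavaEWAH's word-aligned format is more sophisticated than this; the sketch only shows the principle that two run streams are walked in step without ever expanding to individual bits. It assumes non-empty streams covering the same number of bits:

```java
import java.util.ArrayList;
import java.util.List;

public class RleAndSketch {

    // A run is 'len' consecutive bits of the same value.
    public record Run(boolean bit, int len) {}

    // AND two run-length encoded bitmaps without decompressing them.
    public static List<Run> and(List<Run> a, List<Run> b) {
        List<Run> out = new ArrayList<>();
        int i = 0, j = 0, ra = a.get(0).len(), rb = b.get(0).len();
        while (i < a.size() && j < b.size()) {
            int step = Math.min(ra, rb);            // advance by the shorter remaining run
            boolean bit = a.get(i).bit() && b.get(j).bit();
            if (!out.isEmpty() && out.get(out.size() - 1).bit() == bit) {
                // extend the previous run instead of emitting a new one
                out.set(out.size() - 1, new Run(bit, out.get(out.size() - 1).len() + step));
            } else {
                out.add(new Run(bit, step));
            }
            ra -= step;
            rb -= step;
            if (ra == 0 && ++i < a.size()) ra = a.get(i).len();
            if (rb == 0 && ++j < b.size()) rb = b.get(j).len();
        }
        return out;
    }

    public static void main(String[] args) {
        // 11100 AND 10000 -> 10000
        List<Run> a = List.of(new Run(true, 3), new Run(false, 2));
        List<Run> b = List.of(new Run(true, 1), new Run(false, 4));
        System.out.println(and(a, b));
    }
}
```

The random-access caveat from the thread also holds in this model: answering "is bit k set?" requires walking runs from the start, which is why such formats make poor general-purpose bitsets but good filter operands.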
Re: question about inline function in java
Hi, do not ever optimize prematurely in java (and other jit languages like javascript etc)! if you have a bottleneck - optimize that. and only that. but take care how you compare statements. be sure that you run some loops before. see the first comment of Aleksey Shipilev here: http://karussell.wordpress.com/2009/05/21/microbenchmarking-java-compare-algorithms/ jit is very clever in optimizing code: so write code as simple (and 'stupid') as possible, to be understandable by the jit ;-) I.e. concentrate your time and effort on algorithms, not on bytecode. Regards, Peter. hi all we found that function calls in java can cost much time. e.g. replacing Math.min with a<b?a:b will make it faster. Another example is lessThan in PriorityQueue when using a Collector to gather the top K documents. Yes, using functions and subclasses makes it easy to maintain and extend. in C/C++, we can use inline functions to optimize. What about java? I see that much code in lucene is also inlined manually, such as the hash map implemented in processDocument, or SegmentTermDocs.read // manually inlined call to next() for speed. Is there any compiler option for inlining in java? Or we may hardcode something for time consuming tasks - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
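A minimal illustration of the warm-up advice above. This is a naive harness, not a substitute for a proper benchmark tool, and the absolute numbers depend entirely on the JVM; the point is only that both variants are measured after the JIT has had a chance to compile (and typically inline) them:

```java
import java.util.function.IntBinaryOperator;

public class MinBenchSketch {

    public static int viaMath(int a, int b) { return Math.min(a, b); }
    public static int viaTernary(int a, int b) { return a < b ? a : b; }

    // Naive timing loop; 'sink' keeps the JIT from eliminating the work entirely.
    static long time(IntBinaryOperator op) {
        long sink = 0, start = System.nanoTime();
        for (int i = 0; i < 5_000_000; i++) sink += op.applyAsInt(i, i % 1000);
        if (sink == 42) System.out.println();   // never true here, just a data dependency
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // warm-up phase: let the JIT compile both variants before measuring
        for (int warmup = 0; warmup < 3; warmup++) {
            time(MinBenchSketch::viaMath);
            time(MinBenchSketch::viaTernary);
        }
        System.out.println("Math.min: " + time(MinBenchSketch::viaMath) + " ns");
        System.out.println("ternary : " + time(MinBenchSketch::viaTernary) + " ns");
    }
}
```

After warm-up the two usually perform the same, since the JIT inlines Math.min; measuring without warm-up is what produces the misleading "function calls are slow" result.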
[jira] Commented: (SOLR-1311) pseudo-field-collapsing
[ https://issues.apache.org/jira/browse/SOLR-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12921324#action_12921324 ] Peter Karich commented on SOLR-1311: Hi Marc, could this issue be closed because of the field collapsing which is now in trunk and more mature? Why can't it be integrated as a plugin? pseudo-field-collapsing --- Key: SOLR-1311 URL: https://issues.apache.org/jira/browse/SOLR-1311 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.4 Reporter: Marc Sturlese Fix For: Next Attachments: SOLR-1311-pseudo-field-collapsing.patch I am trying to develop a new way of doing field collapsing based on the adjacent field collapsing algorithm. I have started developing it because I am experiencing performance problems with the field collapsing patch with a big index (8G). The algorithm does adjacent-pseudo-field collapsing. It does collapsing on the first X documents. Instead of making the collapsed docs disappear, the algorithm will send them to a given position of the relevance results list. The reason I just do collapsing on the first X documents is that if I have for example 60 results and I am showing 10 results per page, I really don't need to do collapsing on page 3 or even page 3000. Doing this I am noticing dramatically better performance. The problem is I couldn't find a way to plug the algorithm in as a component and keep good performance. I had to hack a few classes in SolrIndexSearcher.java This patch is just experimental and for testing purposes. In case someone finds it interesting it would be good to find a way to integrate it in a better way than it is at the moment. Advice is more than welcome.
Functionality: In solrconfig.xml we specify the pseudo-collapsing parameters: str name=plus.considerMoreDocstrue/str str name=plus.considerHowMany3000/str str name=plus.considerFieldname/str (at the moment there's no threshold and other parameters that exist in the current collapse-field patch) plus.considerMoreDocs enables pseudo-collapsing, plus.considerHowMany sets the number of resultant documents in which we want to apply the algorithm, and plus.considerField is the field to do pseudo-collapsing on. If the number of results is lower than plus.considerHowMany the algorithm will be applied to all the results. Let's say there is a query with 60 results and we've set considerHowMany to 3000 (and we already have the docs sorted by relevance). What adjacent-pseudo-collapse does is, if the 2nd doc has to be collapsed it will be sent to pos 2999 of the relevance results array. If the 3rd has to be collapsed too it will go to position 2998, and so on. The algorithm is not applied when a sortspec is set or plus.considerMoreDocs is set to false. Neither is it applied when using MoreLikeThisRequestHandler.
Example with a query of 9 results: Results sorted by relevance without pseudo-collapse-algorithm: doc1 - collapse_field_value 3 doc2 - collapse_field_value 3 doc3 - collapse_field_value 4 doc4 - collapse_field_value 7 doc5 - collapse_field_value 6 doc6 - collapse_field_value 6 doc7 - collapse_field_value 5 doc8 - collapse_field_value 1 doc9 - collapse_field_value 2 Results pseudo-collapsed with plus.considerHowMany = 5 doc1 - collapse_field_value 3 doc3 - collapse_field_value 4 doc4 - collapse_field_value 7 doc5 - collapse_field_value 6 doc2 - collapse_field_value 3* doc6 - collapse_field_value 6 doc7 - collapse_field_value 5 doc8 - collapse_field_value 1 doc9 - collapse_field_value 2 Results pseudo-collapsed with plus.considerHowMany = 9 doc1 - collapse_field_value 3 doc3 - collapse_field_value 4 doc4 - collapse_field_value 7 doc5 - collapse_field_value 6 doc7 - collapse_field_value 5 doc8 - collapse_field_value 1 doc9 - collapse_field_value 2 doc6 - collapse_field_value 6* doc2 - collapse_field_value 3* *pseudo-collapsed documents -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
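The reordering in the example can be reproduced with a short sketch of the described algorithm (an illustration of the behaviour, not the patch code). Docs are identified by their 1-based relevance rank and 'values' holds the collapse field value per doc:

```java
import java.util.ArrayList;
import java.util.List;

public class PseudoCollapseSketch {

    // Adjacent pseudo-collapsing over the first 'considerHowMany' docs:
    // a doc whose collapse value equals the previous doc's value is not dropped
    // but moved toward the tail of the considered window (pos 2999, 2998, ...
    // in the issue's example). Docs outside the window are left untouched.
    public static List<Integer> collapse(List<Integer> values, int considerHowMany) {
        int n = Math.min(considerHowMany, values.size());
        List<Integer> kept = new ArrayList<>(), moved = new ArrayList<>();
        for (int doc = 0; doc < n; doc++) {
            if (doc > 0 && values.get(doc).equals(values.get(doc - 1))) {
                moved.add(doc + 1);     // collapsed, stored as 1-based doc id
            } else {
                kept.add(doc + 1);
            }
        }
        List<Integer> out = new ArrayList<>(kept);
        // collapsed docs fill the window's last positions in collapse order
        for (int k = moved.size() - 1; k >= 0; k--) out.add(moved.get(k));
        for (int doc = n; doc < values.size(); doc++) out.add(doc + 1);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(3, 3, 4, 7, 6, 6, 5, 1, 2); // doc1..doc9
        System.out.println(collapse(values, 5)); // [1, 3, 4, 5, 2, 6, 7, 8, 9]
        System.out.println(collapse(values, 9)); // [1, 3, 4, 5, 7, 8, 9, 6, 2]
    }
}
```

Both calls in main reproduce the two result lists given in the example (considerHowMany = 5 and 9).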
[jira] Commented: (SOLR-64) strict hierarchical facets
[ https://issues.apache.org/jira/browse/SOLR-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12921326#action_12921326 ] Peter Karich commented on SOLR-64: -- @SolrFan and @Mats: you could try an alternative solution: http://lucene.472066.n3.nabble.com/multi-level-faceting-tp1629650p1672083.html strict hierarchical facets -- Key: SOLR-64 URL: https://issues.apache.org/jira/browse/SOLR-64 Project: Solr Issue Type: New Feature Components: search Reporter: Yonik Seeley Fix For: Next Attachments: SOLR-64.patch, SOLR-64.patch, SOLR-64.patch, SOLR-64.patch Strict Facet Hierarchies... each tag has at most one parent (a tree). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-385) facet sorting with relevancy
[ https://issues.apache.org/jira/browse/SOLR-385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12921473#action_12921473 ] Peter Karich commented on SOLR-385: --- Thinking a bit more about this issue: for the 'ungeneralized version' (sorting against the maximum of the score, or any field?) we can use the group feature! http://wiki.apache.org/solr/FieldCollapsing The solution, I think, would be the following request: http://localhost:8983/solr/select/?q=hard&group=true&group.field=manu_exact&group.limit=1&debug=true&fl=*,score The collapse groups are ordered by the maxScore, I think + hope ;-) So it is the same as we want: http://localhost:8983/solr/select/?q=hard&facet=true&facet.field=manu_exact&debug=true&fl=*,score&facet.stats.sort=max(score) desc Now one remaining task could be to extend this feature with max, min and mean functions ... here is the 'group' result:
{code}
lst
  str name=groupValueMaxtor Corp./str
  result name=doclist numFound=1 start=0 maxScore=0.70904505
    doc
      float name=score0.70904505/float
      arr name=cat strelectronics/str strhard drive/str /arr
      arr name=features strSATA 3.0Gb/s, NCQ/str str8.5ms seek/str str16MB cache/str /arr
      str name=id6H500F0/str
      bool name=inStocktrue/bool
      str name=manuMaxtor Corp./str
      date name=manufacturedate_dt2006-02-13T15:26:37Z/date
      str name=name Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300 /str
      int name=popularity6/int
      float name=price350.0/float
      str name=store45.17614,-93.87341/str
    /doc
  /result
/lst
lst
  str name=groupValueSamsung Electronics Co. Ltd./str
  result name=doclist numFound=1 start=0 maxScore=0.5908709
    doc
      float name=score0.5908709/float
      arr name=cat strelectronics/str strhard drive/str /arr
      arr name=features str7200RPM, 8MB cache, IDE Ultra ATA-133/str str NoiseGuard, SilentSeek technology, Fluid Dynamic Bearing (FDB) motor /str /arr
      str name=idSP2514N/str
      bool name=inStocktrue/bool
      str name=manuSamsung Electronics Co. Ltd./str
      date name=manufacturedate_dt2006-02-13T15:26:37Z/date
      str name=name Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133 /str
      int name=popularity6/int
      float name=price92.0/float
      str name=store45.17614,-93.87341/str
    /doc
  /result
/lst
{code}
this would be the faceting result:
{code}
lst name=facet_fields
  lst name=manu_exact
    int name=Maxtor Corp. score=0.709045051/int
    int name=Samsung Electronics Co. Ltd. score=0.59087091/int
    ...
{code}
facet sorting with relevancy Key: SOLR-385 URL: https://issues.apache.org/jira/browse/SOLR-385 Project: Solr Issue Type: New Feature Components: search Reporter: Dmitry Degtyarev Priority: Minor Sometimes facet sort based only on the count of matches is not relevant, I need to sort not only by the count of matches, but also on the scores of matches. In the most simple way it must sort categories by the sum of item scores that match the query and the category. In the best way there should be some coefficient to multiply scores or some function. Is it possible to implement such a behavior for facet sort? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
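What the hoped-for ordering amounts to can be sketched independently of Solr (the class and method names are made up): collect the matching docs' scores per facet value and sort the values by their group maximum.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FacetMaxScoreSketch {

    // Order facet values by the maximum score seen among the matching docs of
    // each group - the ordering the grouping request above is hoped to produce.
    public static List<String> byMaxScore(Map<String, List<Double>> scoresPerValue) {
        return scoresPerValue.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, List<Double>> e) -> Collections.max(e.getValue())).reversed())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // scores from the group result quoted above
        Map<String, List<Double>> scores = Map.of(
                "Maxtor Corp.", List.of(0.70904505),
                "Samsung Electronics Co. Ltd.", List.of(0.5908709));
        System.out.println(byMaxScore(scores)); // Maxtor first: higher max score
    }
}
```

Extending the sketch to min or mean is a matter of swapping the aggregation function, which matches the "max, min and mean functions" extension suggested in the comment.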
Re: Lucene powers Twitter's search
Shai, thanks a lot for this information. This is nice and bad at the same time :-) Nice, because they will contribute their changes back to the community, and bad, because now the core search technology (Lucene) of Jetwick and Twitter is nearly the same, and this is a fight like David vs. Goliath ... But probably I or someone else will come up with a killer feature ;-) Regards, Peter.

Oops, forgot to paste the link from which the quote was taken: http://engineering.twitter.com/2010/10/twitters-new-search-architecture.html Shai

On Thu, Oct 7, 2010 at 8:46 AM, Shai Erera ser...@gmail.com wrote: I came across this post today: http://techcrunch.com/2010/10/06/new-twitter-search And continued reading these: http://techcrunch.com/2010/10/06/twitter-search-lives/ From the latter:

Modified Lucene: Lucene is great, but in its current form it has several shortcomings for real-time search. That's why we rewrote big parts of the core in-memory data structures, especially the posting lists, while still supporting Lucene's standard APIs. This allows us to use Lucene's search layer almost unmodified. Some of the highlights of our changes include:
- significantly improved garbage collection performance
- lock-free data structures and algorithms
- posting lists that are traversable in reverse order
- efficient early query termination

We believe that the architecture behind these changes involves several interesting topics that pertain to software engineering in general (not only search). We hope to continue to share more on these improvements. And, before you ask, we're planning on contributing all these changes back to Lucene; some of them have already made it into Lucene's trunk and its new realtime branch.

And as you can read at the bottom of the last post, Michael B. is behind all this :-). FYI Shai -- http://jetwick.com twitter search prototype
[jira] Commented: (SOLR-2059) Allow customizing how WordDelimiterFilter tokenizes text.
[ https://issues.apache.org/jira/browse/SOLR-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902588#action_12902588 ] Peter Karich commented on SOLR-2059:

Robert, thanks for this work! I have a different application for this patch: in a twitter search, # and @ shouldn't be removed. Instead I will handle them like ALPHA, I think. Would you mind updating the patch for the latest version of the trunk? I get a problem with WordDelimiterIterator at line 254 if I am using https://svn.apache.org/repos/asf/lucene/dev/trunk/solr and a missing-file problem (line 37) for http://svn.apache.org/repos/asf/solr

Allow customizing how WordDelimiterFilter tokenizes text. - Key: SOLR-2059 URL: https://issues.apache.org/jira/browse/SOLR-2059 Project: Solr Issue Type: New Feature Components: Schema and Analysis Reporter: Robert Muir Priority: Minor Fix For: 3.1, 4.0 Attachments: SOLR-2059.patch

By default, WordDelimiterFilter assigns 'types' to each character (computed from Unicode properties). Based on these types and the options provided, it splits and concatenates text. In some circumstances, you might need to tweak how this works. It seems the filter already had this in mind, since you can pass in a custom byte[] type table. But it's not exposed in the factory. I think you should be able to customize the defaults with a configuration file:

{noformat}
# A customized type mapping for WordDelimiterFilterFactory
# the allowable types are: LOWER, UPPER, ALPHA, DIGIT, ALPHANUM, SUBWORD_DELIM
#
# the default for any character without a mapping is always computed from
# Unicode character properties
# Map the $, %, '.', and ',' characters to DIGIT
# This might be useful for financial data.
$ = DIGIT
% = DIGIT
. = DIGIT
\u002C = DIGIT
{noformat}

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-2059) Allow customizing how WordDelimiterFilter tokenizes text.
[ https://issues.apache.org/jira/browse/SOLR-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902600#action_12902600 ] Peter Karich commented on SOLR-2059:

Oops, my mistake ... this helped! "What do you think of the file format, is it ok for describing these categories?" I think it is ok. I even had a simpler patch before stumbling over yours: handleAsChar=@# - which is now more powerful, IMHO:

@ = ALPHA
# = ALPHA

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
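For illustration, the `char = TYPE` mapping-file format under discussion is simple enough to parse in a few lines. The parser below is a hypothetical sketch (Solr's actual loader differs, e.g. in escape handling); it only exists to make the format concrete, including `\uXXXX` escapes and `#`-comment lines:

```java
import java.util.*;

public class TypeMapParser {
    // Parse lines like "@ = ALPHA" or "\u002C = DIGIT" into a char -> type map.
    // Lines starting with '#' are comments, except a "# = TYPE" mapping line.
    // Hypothetical sketch; not the real WordDelimiterFilterFactory loader.
    public static Map<Character, String> parse(List<String> lines) {
        Map<Character, String> map = new LinkedHashMap<>();
        for (String line : lines) {
            String t = line.trim();
            if (t.isEmpty()) continue;
            if (t.startsWith("#") && !t.matches("#\\s*=\\s*\\S+")) continue;
            String[] parts = t.split("\\s*=\\s*", 2);
            String lhs = parts[0];
            char c = lhs.startsWith("\\u")
                    ? (char) Integer.parseInt(lhs.substring(2), 16)  // \u002C -> ','
                    : lhs.charAt(0);
            map.put(c, parts[1]);
        }
        return map;
    }
}
```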
[jira] Issue Comment Edited: (SOLR-2059) Allow customizing how WordDelimiterFilter tokenizes text.
[ https://issues.apache.org/jira/browse/SOLR-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902600#action_12902600 ] Peter Karich edited comment on SOLR-2059 at 8/25/10 3:46 PM: ---

Oops, my mistake ... this helped! "What do you think of the file format, is it ok for describing these categories?" I think it is ok. I even had a simpler patch before stumbling over yours: handleAsChar=@# - which is now more powerful, IMHO:

{code}
@ = ALPHA
# = ALPHA
{code}

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-2005) NullPointerException for more like this request handler via SolrJ if the document does not exist
NullPointerException for more like this request handler via SolrJ if the document does not exist Key: SOLR-2005 URL: https://issues.apache.org/jira/browse/SOLR-2005 Project: Solr Issue Type: Bug Components: clients - java, MoreLikeThis Affects Versions: 1.4 Environment: jdk1.6 Reporter: Peter Karich

If I query Solr with the following (via SolrJ):

q=myUniqueKey%3AsomeValueWhichDoesNotExist&qt=%2Fmlt&mlt.fl=myMLTField&mlt.minwl=2&mlt.mindf=1&mlt.match.include=false&facet=true&facet.sort=count&facet.mincount=1&facet.limit=10&facet.field=differentFacetField&start=0&rows=10

I get:

org.apache.solr.client.solrj.SolrServerException: Error executing query
  at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
  at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
Caused by: java.lang.NullPointerException
  at org.apache.solr.client.solrj.response.QueryResponse.extractFacetInfo(QueryResponse.java:180)
  at org.apache.solr.client.solrj.response.QueryResponse.setResponse(QueryResponse.java:103)
  at org.apache.solr.client.solrj.response.QueryResponse.init(QueryResponse.java:80)
  at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)

The xml response of the url is empty, and so the info variable at the line

NamedList<Integer> fq = (NamedList<Integer>) info.get("facet_queries");

(QueryResponse) is null. Maybe all variables in QueryResponse.setResponse should be checked against null? Something like: val = res.getVal(i); if (val == null) continue; ?

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
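The suggested fix - skip response sections that are absent instead of dereferencing them - can be sketched as below. The names (`extractFacetQueries`, a plain `Map` standing in for SolrJ's `NamedList`) are simplifications for illustration, not the actual SolrJ code:

```java
import java.util.*;

public class FacetInfoSketch {
    // Defensive variant of the step that NPEs in QueryResponse.extractFacetInfo:
    // if the response carries no facet info (e.g. the MLT document did not
    // exist and the body is empty), return an empty result instead of throwing.
    public static Map<String, Integer> extractFacetQueries(Map<String, Object> info) {
        Map<String, Integer> result = new LinkedHashMap<>();
        if (info == null) return result;           // empty XML response body
        Object val = info.get("facet_queries");
        if (val == null) return result;            // the "if (val == null) continue;" idea
        @SuppressWarnings("unchecked")
        Map<String, Integer> fq = (Map<String, Integer>) val;
        result.putAll(fq);
        return result;
    }
}
```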
[jira] Updated: (SOLR-2005) NullPointerException for more like this request handler via SolrJ if the document does not exist
[ https://issues.apache.org/jira/browse/SOLR-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Karich updated SOLR-2005: --- Priority: Minor (was: Major) Original Estimate: 0.33h Remaining Estimate: 0.33h

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-787) SolrJ POM refers to stax parser
[ https://issues.apache.org/jira/browse/SOLR-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878385#action_12878385 ] Peter Karich commented on SOLR-787: ---

Is this really correctly fixed? Inspecting my deps with NetBeans' maven dep viewer, I don't understand why Solr uses woodstox while SolrJ uses a different artifact (but the same jar), org.codehaus.woodstox. And according to http://jarvana.com/jarvana/inspect-pom/org/apache/solr/solr-core/1.4.0/solr-core-1.4.0.pom and http://jarvana.com/jarvana/inspect-pom/org/apache/solr/solr-solrj/1.4.0/solr-solrj-1.4.0.pom NetBeans is correct. The problem with this is that you will have two identical jars in the classpath, and that the solrj dep forces you to still use stax-api.

SolrJ POM refers to stax parser --- Key: SOLR-787 URL: https://issues.apache.org/jira/browse/SOLR-787 Project: Solr Issue Type: Bug Affects Versions: 1.3 Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 1.4 Attachments: SOLR-787.patch

Solr core moved to using woodstox instead of stax, but the SolrJ POM still has a dependency on stax. We should replace the dependency on stax with the woodstox jar in SolrJ's POM. This is not a huge problem as we are not distributing stax anymore, but is needed for consistency.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1950) SolrJ POM still refers to stax parser
SolrJ POM still refers to stax parser - Key: SOLR-1950 URL: https://issues.apache.org/jira/browse/SOLR-1950 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.4 Reporter: Peter Karich Priority: Minor

See the issue at https://issues.apache.org/jira/browse/SOLR-787 which seems to be incorrectly fixed. (I cannot reopen that issue, so I create this one here.) Using the following deps:

<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>1.4.0</version>
</dependency>
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-core</artifactId>
  <version>1.4.0</version>
</dependency>

will lead to duplicate jars. (Solr uses woodstox and SolrJ uses the different artifact (but same jar) org.codehaus.woodstox.) But maybe the artifacts are only incorrectly deployed? Where can I find the original pom files?

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1864) Master/Slave replication causes tomcat to be unresponsive on slave till replication is being done.
[ https://issues.apache.org/jira/browse/SOLR-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12874055#action_12874055 ] Peter Karich commented on SOLR-1864: This might be a duplicate of: https://issues.apache.org/jira/browse/SOLR-1775 The reason might be (as Paul Noble noted) that the garbage collector is busy a lot because of autowarm up after index switch was done Master/Slave replication causes tomcat to be unresponsive on slave till replication is being done. -- Key: SOLR-1864 URL: https://issues.apache.org/jira/browse/SOLR-1864 Project: Solr Issue Type: Bug Components: replication (java) Affects Versions: 1.5 Environment: Centos 5.2, Tomcat5, java version 1.6.0 OpenJDK Runtime Environment (build 1.6.0-b09) OpenJDK 64-Bit Server VM (build 1.6.0-b09, mixed mode) Reporter: Marcin Hi guys, I have found a strange behaviour on tomcat5, centos 5.2. While replication is being done ( million rows) tomcat5 seems to be unresponsive till its finished. Please help cheers, /Marcin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841752#action_12841752 ] Peter Karich commented on SOLR-236: ---

"Shouldn't the float array in DocSetScoreCollector be changed to a Map?" Hmmh, maybe I expressed myself a bit weirdly: I already changed this all to a Map (a SortedMap) ... I started this change in DocSetScoreCollector and changed all the other occurrences of the float array (otherwise I would have had to copy the entire map). "I think the compare method should NOT be called if no docs are in the scores array ... ?" "I would expect that every docId has a score." Yes, me too. So I expect there is a bug somewhere. But as I said, this breaks only one test (collapse with faceting before). It could even be a bug in the test case, though.

Field collapsing Key: SOLR-236 URL: https://issues.apache.org/jira/browse/SOLR-236 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Reporter: Emmanuel Keller Assignee: Shalin Shekhar Mangar Fix For: 1.5 Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch

This patch includes a new feature called Field collapsing, used in order to collapse a group of results with a similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also Duplicate detection. http://www.fastsearch.com/glossary.aspx?m=48&amid=299 The implementation adds 3 new query parameters (SolrParams): collapse.field to choose the field used to group results; collapse.type, normal (default value) or adjacent; collapse.max to select how many continuous results are allowed before collapsing. TODO (in progress): - More documentation (on source code) - Test cases. Two patches: - field_collapsing.patch for the current development version - field_collapsing_1.1.0.patch for Solr-1.1.0. P.S.: Feedback and misspelling corrections are welcome ;-)

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Karich updated SOLR-236: --- Attachment: NonAdjacentDocumentCollapserTest.java, NonAdjacentDocumentCollapser.java, DocSetScoreCollector.java

It seems to me that the provided changes are necessary to make the OutOfMemory exception go away. Please apply the files with caution, because I made the changes from an old patch (from Nov 2009).

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841756#action_12841756 ] Peter Karich edited comment on SOLR-236 at 3/5/10 8:53 AM: ---

It seems to me that the provided changes are necessary to make the OutOfMemory exception go away (see the appended 3 files). Please apply the files with caution, because I made the changes from an old patch (from Nov 2009).

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841147#action_12841147 ] Peter Karich commented on SOLR-236: ---

Regarding the OutOfMemory problem: we are now testing the suggested change in production. I replaced the float array with a TreeMap<Integer, Float>. The change was nearly trivial. (I cannot provide a patch easily, because we are using an older patch, although I could post the 3 changed files.) The reason I used a TreeMap instead of a HashMap was that in the method advance in the class NonAdjacentDocumentCollapser.PredefinedScorer I needed the tailMap method:

{noformat}
public int advance(int target) throws IOException {
    // now we need a treemap method:
    iter = scores.tailMap(target).entrySet().iterator();
    if (iter.hasNext())
        return target;
    else
        return NO_MORE_DOCS;
}
{noformat}

Then - I think - I discovered a bug/inconsistent behaviour: if I run the test FieldCollapsingIntegrationTest.testNonAdjacentCollapse_withFacetingBefore, then the scores array will be created a la new float[maxDocs] in the old version. But the array will never be filled with any values, so Float value1 = values.get(doc1); will return null in the method NonAdjacentDocumentCollapser.FloatValueFieldComparator.compare (the size of the TreeMap is 0!). I work around this via

{noformat}
if (value1 == null) value1 = 0f;
if (value2 == null) value2 = 0f;
{noformat}

although the compare method should be called if no docs are in the scores array ... ?
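The TreeMap-vs-HashMap point above is exactly that `tailMap(target)` views all entries with doc id >= target in one call, which a HashMap cannot do. A self-contained sketch of the advance idea (class and constant names are illustrative; unlike the quoted snippet, this returns the matched doc id rather than the target itself, as a Lucene DocIdSetIterator would):

```java
import java.util.*;

public class SparseScores {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    private final TreeMap<Integer, Float> scores = new TreeMap<>();

    void put(int doc, float score) { scores.put(doc, score); }

    // tailMap(target) is inclusive: it views all entries with doc id >= target,
    // so the first entry of the iterator is the next scored doc at or after target.
    int advance(int target) {
        Iterator<Map.Entry<Integer, Float>> iter =
                scores.tailMap(target).entrySet().iterator();
        return iter.hasNext() ? iter.next().getKey() : NO_MORE_DOCS;
    }
}
```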
[jira] Issue Comment Edited: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841147#action_12841147 ] Peter Karich edited comment on SOLR-236 at 3/4/10 9:48 AM:
---
regarding the OutOfMemory problem: we are now testing the suggested change in production. I replaced the float array with a TreeMap<Integer, Float>. The change was nearly trivial. (I cannot provide a patch easily, because we are using an older patch, although I could post the 3 changed files.) The reason I used a TreeMap instead of a HashMap was that in the method advance in the class NonAdjacentDocumentCollapser.PredefinedScorer I needed the tailMap method:
{noformat}
public int advance(int target) throws IOException {
    // now we need a treemap method:
    iter = scores.tailMap(target).entrySet().iterator();
    if (iter.hasNext())
        return target;
    else
        return NO_MORE_DOCS;
}
{noformat}
Then - I think - I discovered a bug/inconsistent behaviour: if I run the test FieldCollapsingIntegrationTest.testNonAdjacentCollapse_withFacetingBefore, then the scores array is created via new float[maxDocs] in the old version. But the array is never filled with any values, so Float value1 = values.get(doc1); returns null in the method NonAdjacentDocumentCollapser.FloatValueFieldComparator.compare (the size of the TreeMap is 0!). I work around this via
{noformat}
if (value1 == null) value1 = 0f;
if (value2 == null) value2 = 0f;
{noformat}
I think the compare method should NOT be called if no docs are in the scores array ... ?

was (Author: peathal): [same comment; the earlier revision's closing sentence read "although the compare method should be called if no docs are in the scores array ... ?"]
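For readers who want to see the tailMap idea in isolation, here is a small self-contained sketch (class and member names are made up for illustration; this is not the patched Solr code). It keeps scores in a sparse TreeMap<Integer, Float> instead of a dense float[maxDocs] array, uses tailMap to skip to the first stored doc id >= target, and defaults missing scores to 0f as in the workaround above:

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

public class SparseScores {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // doc id -> score; sparse replacement for new float[maxDocs]
    private final TreeMap<Integer, Float> scores = new TreeMap<>();

    void put(int doc, float score) {
        scores.put(doc, score);
    }

    // Return the first stored doc id >= target, or NO_MORE_DOCS if none.
    // tailMap(target) is the sorted view a plain HashMap cannot provide.
    int advance(int target) {
        Iterator<Map.Entry<Integer, Float>> iter =
                scores.tailMap(target).entrySet().iterator();
        return iter.hasNext() ? iter.next().getKey() : NO_MORE_DOCS;
    }

    // Null-safe lookup, mirroring the "value == null -> 0f" workaround.
    float score(int doc) {
        Float value = scores.get(doc);
        return value == null ? 0f : value;
    }
}
```

Memory-wise this trades one boxed map entry per *scored* doc against four bytes per doc in the *whole* index, which is why it helps when only a small fraction of maxDocs carries a score.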
Field collapsing Key: SOLR-236 URL: https://issues.apache.org/jira/browse/SOLR-236 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Reporter: Emmanuel Keller Assignee: Shalin Shekhar Mangar Fix For: 1.5 Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch This patch include a new feature called Field collapsing. Used in order to collapse a group of results with similar value for a given field to a single entry
[jira] Commented: (SOLR-1167) Support module xml config files using XInclude
[ https://issues.apache.org/jira/browse/SOLR-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841178#action_12841178 ] Peter Karich commented on SOLR-1167: @Shalin Shekhar Mangar: how can I use the proposed attribute feature for a master+slave configuration? Do you have a code snippet? Support module xml config files using XInclude -- Key: SOLR-1167 URL: https://issues.apache.org/jira/browse/SOLR-1167 Project: Solr Issue Type: New Feature Affects Versions: 1.4 Reporter: Bryan Talbot Assignee: Grant Ingersoll Priority: Minor Fix For: 1.4 Attachments: SOLR-1167.patch, SOLR-1167.patch, SOLR-1167.patch, SOLR-1167.patch, SOLR-1167.patch Current configuration files (schema and solrconfig) are monolithic, which can make maintenance and reuse more difficult than it needs to be. The XML standards include a feature to include content from external files. This is described at http://www.w3.org/TR/xinclude/ This feature is to add support for XInclude features for XML configuration files. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
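One way the XInclude support described in SOLR-1167 could cover a master+slave setup is to keep the shared parts of solrconfig.xml in one place and pull the role-specific parts from a separate file per role. A sketch (file names and paths here are hypothetical, not from the patch):

```xml
<!-- solrconfig.xml (sketch): shared settings live here, role-specific
     request handlers live in replication-master.xml / replication-slave.xml.
     File names are hypothetical. -->
<config xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- ... shared caches, handlers, etc. ... -->
  <xi:include href="replication-master.xml">
    <xi:fallback>
      <!-- optional content used if the included file is missing -->
    </xi:fallback>
  </xi:include>
</config>
```

The same shared config can then be deployed to both boxes, with only the small included file differing between master and slave.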
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835230#action_12835230 ] Peter Karich commented on SOLR-236:
---
We are facing OutOfMemory problems too. We are using https://issues.apache.org/jira/secure/attachment/12425775/field-collapse-5.patch

> Are you using any other features besides plain collapsing? The field collapse cache gets large very quickly, I suggest you turn it off (if you are using it). Also you can try to make your filterCache smaller.

How can I turn off the collapse cache or make the filterCache smaller? Are there other workarounds, e.g. via a special version of the patch? I read that it could help to specify collapse.maxdocs, but this didn't help in our case ... could collapse.type=adjacent help here? (https://issues.apache.org/jira/browse/SOLR-236?focusedCommentId=12495376&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12495376) What do you think? BTW: We really like this patch and would like to use it !!
:-) Field collapsing Key: SOLR-236 URL: https://issues.apache.org/jira/browse/SOLR-236 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Reporter: Emmanuel Keller Assignee: Shalin Shekhar Mangar Fix For: 1.5 Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch This patch include a new feature called Field collapsing. Used in order to collapse a group of results with similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site is collapsed into one or two entries in the result set, typically with an associated more documents from this site link. See also Duplicate detection. 
http://www.fastsearch.com/glossary.aspx?m=48amid=299 The implementation add 3 new query parameters (SolrParams): collapse.field to choose the field used to group results collapse.type normal (default value) or adjacent collapse.max to select how many continuous results are allowed before collapsing TODO (in progress): - More documentation (on source code) - Test cases Two patches: - field_collapsing.patch for current development version - field_collapsing_1.1.0.patch for Solr-1.1.0 P.S.: Feedback and misspelling correction are welcome ;-) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
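On the cache questions above: shrinking the filterCache is a plain solrconfig.xml change, and - if I read field-collapse-5.patch right - the collapse cache has its own cache entry there, so leaving that entry out should disable it. A sketch (sizes are illustrative, not recommendations):

```xml
<!-- solrconfig.xml sketch: a deliberately small filterCache. -->
<filterCache class="solr.LRUCache"
             size="512"
             initialSize="128"
             autowarmCount="0"/>
<!-- Assumption: field-collapse-5.patch reads its collapse cache from a
     fieldCollapseCache entry in solrconfig.xml; omitting that entry
     should turn the collapse cache off. -->
```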
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835258#action_12835258 ] Peter Karich commented on SOLR-236:
---
Trying the latest patch from 1st Feb 2010: it compiles against solr-2010-02-13 from the nightly build but does not work. If I query http://searchdev05:15100/cs-bidcs/select?q=*:*&collapse.field=myfield it fails with:
{noformat}
HTTP Status 500 - null
java.lang.NullPointerException
    at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
    at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
    at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
    at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
    at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
    at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at ...
{noformat}
I only need the OutOfMemory problem solved ... :-(
[jira] Issue Comment Edited: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835258#action_12835258 ] Peter Karich edited comment on SOLR-236 at 2/18/10 4:06 PM:
Trying the latest patch from 1st Feb 2010: it compiles against solr-2010-02-13 from the nightly build but does not work. If I query http://server/cs-bidcs/select?q=*:*&collapse.field=myfield it fails with:
{noformat}
HTTP Status 500 - null
java.lang.NullPointerException
    at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
    at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
    at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
    at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
    at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
    at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at ...
{noformat}
I only need the OutOfMemory problem solved ... :-(

was (Author: peathal): [same report; the earlier revision gave the internal URL http://searchdev05:15100/cs-bidcs/select?q=*:*&collapse.field=myfield]