[jira] [Commented] (LUCENE-2228) AES Encrypted Directory
[ https://issues.apache.org/jira/browse/LUCENE-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861923#comment-13861923 ]

Peter Karich commented on LUCENE-2228:
--------------------------------------

What is the state here?

AES Encrypted Directory
-----------------------

Key: LUCENE-2228
URL: https://issues.apache.org/jira/browse/LUCENE-2228
Project: Lucene - Core
Issue Type: New Feature
Components: modules/other
Affects Versions: 3.1
Reporter: Jay Mundrawala
Attachments: LUCENE-2228.patch, lucene-encryption.tar.gz

Provides an encryption solution for Lucene indexes, using the AES encryption algorithm. You must have the JCE Unlimited Strength Jurisdiction Policy Files 6 Release Candidate which you can get from java.sun.com.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970612#action_12970612 ]

Peter Karich commented on SOLR-1729:
------------------------------------

Hi Yonik,

so, sorry for another misposting: yes, you were right. It was the wrong Solr version; it was too late yesterday :-/

All is fine now with this patch. But the org.apache.solr.request.SolrRequestInfo class is missing, or am I completely crazy now? (I checked out Solr twice and applied the patch again, but it didn't compile.)

Regards,
Peter.

Date Facet now override time parameter
--------------------------------------

Key: SOLR-1729
URL: https://issues.apache.org/jira/browse/SOLR-1729
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 1.4
Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
Attachments: FacetParams.java, SimpleFacets.java, solr-1.4.0-solr-1729.patch, SOLR-1729.patch, SOLR-1729.patch, UnInvertedField.java

This PATCH introduces a new query parameter that tells a (typically, but not necessarily) remote server what time to use as 'NOW' when calculating date facets for a query (and, for the moment, date facets *only*) - overriding the default behaviour of using the local server's current time.

This gets 'round a problem whereby an explicit time range is specified in a query (e.g. timestamp:[then0 TO then1]), and date facets are required for the given time range (in fact, any explicit time range). Because DateMathParser performs all its calculations from 'NOW', remote callers have to work out how long ago 'then0' and 'then1' are from 'now', and use the relative-to-now values in the facet.date.xxx parameters. If a remote server has a different opinion of NOW compared to the caller, the results will be skewed (e.g. they are in a different time-zone, not time-synced etc.). This becomes particularly salient when performing distributed date faceting (see SOLR-1709), where multiple shards may all be running with different times, and the faceting needs to be aligned.

The new parameter is called 'facet.date.now', and takes as a parameter a (stringified) long that is the number of milliseconds from the epoch (1 Jan 1970 00:00) - i.e. the returned value from a System.currentTimeMillis() call. This was chosen over a formatted date to delineate it from a 'searchable' time and to avoid superfluous date parsing. This makes the value generally a programmatically set value, but as that is where the use-case is for this type of parameter, this should be ok.

NOTE: This parameter affects date facet timing only. If there are other areas of a query that rely on 'NOW', these will not interpret this value. This is a broader issue about setting a 'query-global' NOW that all parts of query analysis can share.

Source files affected:
FacetParams.java (holds the new constant FACET_DATE_NOW)
SimpleFacets.java getFacetDateCounts() NOW parameter modified

This PATCH is mildly related to SOLR-1709 (Distributed Date Faceting), but as it's a general change for date faceting, it was deemed deserving of its own patch. I will be updating SOLR-1709 in due course to include the use of this new parameter, after some rfc acceptance.

A possible enhancement to this is to detect facet.date fields, look for and match these fields in queries (if they exist), and potentially determine automatically the required time skew, if any. There are a whole host of reasons why this could be problematic to implement, so an explicit facet.date.now parameter is the safest route.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
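To make the 'facet.date.now' description in the message above more concrete, here is a minimal client-side sketch in Java. It only builds a query string: the caller takes one System.currentTimeMillis() reading and sends it as facet.date.now, so that every server evaluates 'NOW' identically. The class and method names (FacetDateNowExample, buildQuery) and the field name "timestamp" are made up for illustration, the facet.date.start/end/gap values are arbitrary, and the parameter is only understood by a Solr build that has the SOLR-1729 patch applied.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr or of the SOLR-1729 patch.
final class FacetDateNowExample {

    /** Builds a URL-encoded query string that pins 'NOW' for date facets to nowMillis. */
    static String buildQuery(String dateField, long nowMillis) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");
        params.put("facet", "true");
        params.put("facet.date", dateField);
        // Date math is relative to NOW, which the server resolves from facet.date.now.
        params.put("facet.date.start", "NOW/DAY-7DAYS");
        params.put("facet.date.end", "NOW/DAY+1DAY");
        params.put("facet.date.gap", "+1DAY");
        // Stringified long: milliseconds since the epoch, as described in the issue.
        params.put("facet.date.now", Long.toString(nowMillis));

        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (query.length() > 0) {
                query.append('&');
            }
            query.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return query.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // One reading of the clock, reused for every server that is queried.
        long now = System.currentTimeMillis();
        System.out.println(buildQuery("timestamp", now));
    }
}
```

Taking the clock reading once on the caller and reusing it for all requests is the point of the sketch; reading the clock separately per request would reintroduce the skew the issue describes.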
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970671#action_12970671 ]

Peter Karich commented on SOLR-1729:
------------------------------------

Nice, now this patch 1729 applies, compiles, and runs the tests successfully (I'm using rev 1044942 of trunk).

One further question: would facet queries (with dates) work in the distributed setup without the date patches? I'm asking in order to get a quick(er) workaround, because I would need the patch for 1.4.1 (-solandra).

Date Facet now override time parameter
--------------------------------------

Key: SOLR-1729
URL: https://issues.apache.org/jira/browse/SOLR-1729
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 1.4
Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
Attachments: FacetParams.java, SimpleFacets.java, solr-1.4.0-solr-1729.patch, SOLR-1729.patch, SOLR-1729.patch, SOLR-1729.patch, UnInvertedField.java
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970544#action_12970544 ]

Peter Karich commented on SOLR-1729:
------------------------------------

Yonik, thanks for the update.

I refreshed my sources (now trunk) to rev 1044745, but the patch does not apply cleanly* for SearchHandler. Am I doing something stupid here?

Regards,
Peter.

* pathxy/solr_branch_3x$ patch -p0 < SOLR-1729.patch
patching file solr/src/test/test-files/solr/conf/schema12.xml
patching file solr/src/test/org/apache/solr/search/function/TestFunctionQuery.java
Hunk #1 succeeded at 301 (offset -17 lines).
patching file solr/src/test/org/apache/solr/handler/component/SpellCheckComponentTest.java
patching file solr/src/test/org/apache/solr/handler/component/TermVectorComponentTest.java
patching file solr/src/java/org/apache/solr/core/QuerySenderListener.java
patching file solr/src/java/org/apache/solr/request/SimpleFacets.java
Hunk #1 succeeded at 64 (offset -9 lines).
Hunk #2 succeeded at 620 (offset -200 lines).
Hunk #3 succeeded at 630 (offset -200 lines).
Hunk #4 succeeded at 645 (offset -200 lines).
Hunk #5 succeeded at 803 (offset -200 lines).
patching file solr/src/java/org/apache/solr/handler/component/SearchHandler.java
Hunk #1 FAILED at 192.
Hunk #2 succeeded at 255 (offset -36 lines).
1 out of 2 hunks FAILED -- saving rejects to file solr/src/java/org/apache/solr/handler/component/SearchHandler.java.rej
patching file solr/src/java/org/apache/solr/handler/component/ResponseBuilder.java
Hunk #2 succeeded at 67 (offset -1 lines).
patching file solr/src/java/org/apache/solr/spelling/SpellCheckCollator.java
patching file solr/src/java/org/apache/solr/util/TestHarness.java
Hunk #2 succeeded at 320 (offset -9 lines).
Hunk #3 succeeded at 335 (offset -9 lines).
patching file solr/src/java/org/apache/solr/util/DateMathParser.java
patching file solr/src/webapp/src/org/apache/solr/servlet/SolrServlet.java
patching file solr/src/webapp/src/org/apache/solr/servlet/SolrDispatchFilter.java
Hunk #1 succeeded at 241 (offset 4 lines).
Hunk #2 succeeded at 255 (offset 4 lines).
Hunk #3 succeeded at 283 (offset 4 lines).
patching file solr/src/webapp/src/org/apache/solr/servlet/DirectSolrConnection.java
Hunk #2 succeeded at 170 (offset -16 lines).
Hunk #3 succeeded at 185 with fuzz 1 (offset -16 lines).
patching file solr/src/webapp/src/org/apache/solr/client/solrj/embedded/EmbeddedSolrServer.java
Hunk #1 succeeded at 32 with fuzz 1 (offset -9 lines).
Hunk #2 succeeded at 138 (offset -11 lines).
Hunk #3 succeeded at 156 (offset -77 lines).

Date Facet now override time parameter
--------------------------------------

Key: SOLR-1729
URL: https://issues.apache.org/jira/browse/SOLR-1729
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 1.4
Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
Attachments: FacetParams.java, SimpleFacets.java, solr-1.4.0-solr-1729.patch, SOLR-1729.patch, SOLR-1729.patch, UnInvertedField.java
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966488#action_12966488 ]

Peter Karich commented on SOLR-1729:
------------------------------------

*regarding: 1.4.1*

Hmmh, today download.carrot2.org is down and I had to delete contrib/clustering to do the build after the patch, which does not apply cleanly (strange that it applied yesterday):

solr1.4.1$ patch -p0 < solr-1.4.0-solr-1729.patch
patching file src/java/org/apache/solr/handler/component/FacetComponent.java
patching file src/java/org/apache/solr/handler/component/ResponseBuilder.java
solr1.4.1$ patch -p0 < solr-1.4.0-solr-1709.patch
patching file src/java/org/apache/solr/handler/component/FacetComponent.java
Reversed (or previously applied) patch detected! Assume -R? [n] y
patching file src/java/org/apache/solr/handler/component/ResponseBuilder.java
Reversed (or previously applied) patch detected! Assume -R? [n] y
Hunk #3 succeeded at 251 (offset -1 lines).

Or is this ok?? Because then, all tests would pass ...

*regarding branch3x*

Neither patch applies cleanly. SOLR-1709 also fails without SOLR-1729:

solr_branch_3x/solr$ patch -p0 < solr-1.4.0-solr-1709.patch
patching file src/java/org/apache/solr/handler/component/FacetComponent.java
Hunk #1 succeeded at 240 (offset 2 lines).
Hunk #2 succeeded at 267 with fuzz 2 (offset 7 lines).
Hunk #3 FAILED at 436.
1 out of 3 hunks FAILED -- saving rejects to file src/java/org/apache/solr/handler/component/FacetComponent.java.rej
patching file src/java/org/apache/solr/handler/component/ResponseBuilder.java
Reversed (or previously applied) patch detected! Assume -R? [n] y
Hunk #2 FAILED at 61.
Hunk #3 FAILED at 252.
2 out of 3 hunks FAILED -- saving rejects to file src/java/org/apache/solr/handler/component/ResponseBuilder.java.rej

Date Facet now override time parameter
--------------------------------------

Key: SOLR-1729
URL: https://issues.apache.org/jira/browse/SOLR-1729
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 1.4
Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
Attachments: FacetParams.java, SimpleFacets.java, solr-1.4.0-solr-1729.patch, UnInvertedField.java
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966520#action_12966520 ]

Peter Karich commented on SOLR-1729:
------------------------------------

Hi Peter,

1.4.1 would be fine (I asked Jake from solandra; before, I thought he used the trunk).

Now, in my last comment I made a stupid mistake: the patches didn't apply cleanly for 1.4.1 because I accidentally overwrote solr-1729.patch with solr-1709 when copying from branch3x, and ended up with two identical 1709 patches :-/

So: for 1.4.1 the patches apply cleanly. But the question remains why the following tests are failing:

Test org.apache.solr.TestTrie FAILED
Test org.apache.solr.request.SimpleFacetsTest FAILED

Date Facet now override time parameter
--------------------------------------

Key: SOLR-1729
URL: https://issues.apache.org/jira/browse/SOLR-1729
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 1.4
Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
Attachments: FacetParams.java, SimpleFacets.java, solr-1.4.0-solr-1729.patch, UnInvertedField.java
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966562#action_12966562 ]

Peter Karich commented on SOLR-1729:
------------------------------------

Hi Peter,

sorry for the confusion :-/ I was speaking of 1.4.1: the two patches apply. 2 tests fail.

Regards,
Peter.

Date Facet now override time parameter
--------------------------------------

Key: SOLR-1729
URL: https://issues.apache.org/jira/browse/SOLR-1729
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 1.4
Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
Attachments: FacetParams.java, SimpleFacets.java, solr-1.4.0-solr-1729.patch, UnInvertedField.java
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966181#action_12966181 ]

Peter Karich commented on SOLR-1729:
------------------------------------

Peter Sturge,

in SOLR-1709 you said that you are working with branch3x. I checked it out from here:
https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x
but this 1729 patch didn't apply cleanly*. When I tried the 1.4.1 release it is ok, but the tests fail due to**

What could be wrong?

Regards,
Peter.

* solr_branch_3x/solr$ patch -p0 < solr-1.4.0-solr-1729.patch
patching file src/java/org/apache/solr/request/SimpleFacets.java
Hunk #1 succeeded at 245 (offset 28 lines).
Hunk #2 succeeded at 280 (offset 28 lines).
Hunk #3 FAILED at 582.
Hunk #4 FAILED at 652.
2 out of 4 hunks FAILED -- saving rejects to file src/java/org/apache/solr/request/SimpleFacets.java.rej
patching file src/java/org/apache/solr/request/UnInvertedField.java
Hunk #2 succeeded at 40 with fuzz 1 (offset 1 line).
Hunk #3 succeeded at 440 (offset 5 lines).
Hunk #4 succeeded at 557 (offset 5 lines).
patching file src/common/org/apache/solr/common/params/FacetParams.java
Hunk #1 FAILED at 175.
1 out of 1 hunk FAILED -- saving rejects to file src/common/org/apache/solr/common/params/FacetParams.java.rej

** [junit] Running org.apache.solr.TestTrie
[junit] xml response was: <?xml version="1.0" encoding="UTF-8"?>
[junit] <response>
[junit] <lst name="responseHeader"><int name="status">0</int><int name="QTime">157</int></lst>
<result name="response" numFound="15" start="0">
  <doc><float name="id">0.0</float><date name="tdate">2010-12-02T00:00:00Z</date><double name="tdouble">0.0</double><float name="tfloat">0.0</float><int name="tint">0</int><long name="tlong">2147483647</long></doc>
  <doc><float name="id">1.0</float><date name="tdate">2010-12-03T00:00:00Z</date><double name="tdouble">2.33</double><float name="tfloat">31.11</float><int name="tint">1</int><long name="tlong">2147483648</long></doc>
  <doc><float name="id">2.0</float><date name="tdate">2010-12-04T00:00:00Z</date><double name="tdouble">4.66</double><float name="tfloat">124.44</float><int name="tint">2</int><long name="tlong">2147483649</long></doc>
  <doc><float name="id">3.0</float><date name="tdate">2010-12-05T00:00:00Z</date><double name="tdouble">6.99</double><float name="tfloat">279.99</float><int name="tint">3</int><long name="tlong">2147483650</long></doc>
  <doc><float name="id">4.0</float><date name="tdate">2010-12-06T00:00:00Z</date><double name="tdouble">9.32</double><float name="tfloat">497.76</float><int name="tint">4</int><long name="tlong">2147483651</long></doc>
  <doc><float name="id">5.0</float><date name="tdate">2010-12-07T00:00:00Z</date><double name="tdouble">11.65</double><float name="tfloat">777.75</float><int name="tint">5</int><long name="tlong">2147483652</long></doc>
  <doc><float name="id">6.0</float><date name="tdate">2010-12-08T00:00:00Z</date><double name="tdouble">13.98</double><float name="tfloat">1119.96</float><int name="tint">6</int><long name="tlong">2147483653</long></doc>
  <doc><float name="id">7.0</float><date name="tdate">2010-12-09T00:00:00Z</date><double name="tdouble">16.312</double><float name="tfloat">1524.39</float><int name="tint">7</int><long name="tlong">2147483654</long></doc>
  <doc><float name="id">8.0</float><date name="tdate">2010-12-10T00:00:00Z</date><double name="tdouble">18.64</double><float name="tfloat">1991.04</float><int name="tint">8</int><long name="tlong">2147483655</long></doc>
  <doc><float name="id">9.0</float><date name="tdate">2010-12-11T00:00:00Z</date><double name="tdouble">20.97</double><float name="tfloat">2519.9102</float><int name="tint">9</int><long name="tlong">2147483656</long></doc>
  <doc><float name="id">10.0</float><date name="tdate">2010-12-02T00:00:00Z</date><double name="tdouble">0.0</double><float name="tfloat">0.0</float><int name="tint">0</int><long name="tlong">2147483647</long></doc>
  <doc><float name="id">20.0</float><date name="tdate">2010-12-03T00:00:00Z</date><double name="tdouble">2.33</double><float name="tfloat">31.11</float><int name="tint">1</int><long name="tlong">2147483648</long></doc>
  <doc><float name="id">30.0</float><date name="tdate">2010-12-04T00:00:00Z</date><double name="tdouble">4.66</double><float name="tfloat">124.44</float><int name="tint">2</int><long name="tlong">2147483649</long></doc>
  <doc><float name="id">40.0</float><date name="tdate">2010-12-05T00:00:00Z</date><double name="tdouble">6.99</double><float name="tfloat">279.99</float><int name="tint">3</int><long name="tlong">2147483650</long></doc>
  <doc><float name="id">50.0</float><date name="tdate">2010-12-06T00:00:00Z</date><double name="tdouble">9.32</double><float name="tfloat">497.76</float><int name="tint">4</int><long name="tlong">2147483651</long></doc>
</result>
<lst name="facet_counts"><lst name="facet_queries"/><lst name="facet_fields">
  <lst name="tint"><int name="0">2</int><int name="1">2</int><int name="2">2</int><int name="3">2</int><int name="4">2</int><int name="5">1</int><int name="6">1</int><int name="7">1</int><int name="8">1</int><int name="9">1</int></lst>
  <lst name="tlong"><int name="2147483647">2</int><int name="2147483648">2</int><int name="2147483649">2</int><int name="2147483650">2</int><int name="2147483651">2</int><int name="2147483652">1</int><int name="2147483653">1</int><int name="2147483654">1</int><int name="2147483655">1</int><int name="2147483656">1</int></lst>
  <lst name="tfloat"><int name="0.0">2</int><int name="31.11">2</int><int name="124.44">2</int><int name="279.99">2</int><int name="497.76">2</int><int name="777.75">1</int><int name="1119.96">1</int><int
[jira] Commented: (SOLR-1709) Distributed Date Faceting
[ https://issues.apache.org/jira/browse/SOLR-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965841#action_12965841 ] Peter Karich commented on SOLR-1709: Hi Peter, sorry for getting so late back. I'm relative sure now that I'll need that patch (also Jake from solandra was asking when this patch will be ready :-)) So, I will need to apply SOLR-1729 and then this patch to the 3x branch or even without SOLR-1729 (not necessary in my case)? Regards, Peter. Distributed Date Faceting - Key: SOLR-1709 URL: https://issues.apache.org/jira/browse/SOLR-1709 Project: Solr Issue Type: Improvement Components: SearchComponents - other Affects Versions: 1.4 Reporter: Peter Sturge Priority: Minor Attachments: FacetComponent.java, FacetComponent.java, ResponseBuilder.java, solr-1.4.0-solr-1709.patch This patch is for adding support for date facets when using distributed searches. Date faceting across multiple machines exposes some time-based issues that anyone interested in this behaviour should be aware of: Any time and/or time-zone differences are not accounted for in the patch (i.e. merged date facets are at a time-of-day, not necessarily at a universal 'instant-in-time', unless all shards are time-synced to the exact same time). The implementation uses the first encountered shard's facet_dates as the basis for subsequent shards' data to be merged in. This means that if subsequent shards' facet_dates are skewed in relation to the first by 1 'gap', these 'earlier' or 'later' facets will not be merged in. There are several reasons for this: * Performance: It's faster to check facet_date lists against a single map's data, rather than against each other, particularly if there are many shards * If 'earlier' and/or 'later' facet_dates are added in, this will make the time range larger than that which was requested (e.g. 
a request for one hour's worth of facets could bring back 2, 3 or more hours of data) This could be dealt with if timezone and skew information was added, and the dates were normalized. One possibility for adding such support is to [optionally] add 'timezone' and 'now' parameters to the 'facet_dates' map. This would tell requesters what time and TZ the remote server thinks it is, and so multiple shards' time data can be normalized. The patch affects 2 files in the Solr core: org.apache.solr.handler.component.FacetComponent.java org.apache.solr.handler.component.ResponseBuilder.java The main changes are in FacetComponent - ResponseBuilder is just to hold the completed SimpleOrderedMap until the finishStage. One possible enhancement is to perhaps make this an optional parameter, but really, if facet.date parameters are specified, it is assumed they are desired. Comments suggestions welcome. As a favour to ask, if anyone could take my 2 source files and create a PATCH file from it, it would be greatly appreciated, as I'm having a bit of trouble with svn (don't shoot me, but my environment is a Redmond-based os company). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
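The merge rule described above (the first shard's facet_dates are the basis, and later shards only contribute to buckets that already exist) can be sketched as follows. This is a minimal illustration and not the code from the attached FacetComponent.java: the class name, the method name, and the use of plain maps keyed by the bucket's date string are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names are illustrative and not taken from the patch.
class FacetDateMergeSketch {

    /**
     * Merges per-shard date facet counts. The first shard defines the set of
     * buckets; buckets that appear only in later shards ("earlier"/"later"
     * relative to the first shard) are dropped, as the description states.
     */
    static Map<String, Integer> merge(List<Map<String, Integer>> shardFacets) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        if (shardFacets.isEmpty()) {
            return merged;
        }
        // The first encountered shard is the basis for everything else.
        merged.putAll(shardFacets.get(0));
        for (Map<String, Integer> shard : shardFacets.subList(1, shardFacets.size())) {
            for (Map.Entry<String, Integer> e : shard.entrySet()) {
                // Only add to buckets the basis already has; skewed buckets are ignored.
                merged.computeIfPresent(e.getKey(), (k, v) -> v + e.getValue());
            }
        }
        return merged;
    }
}
```

Checking each shard against a single map keeps the cost proportional to the number of buckets per shard, which matches the performance argument given in the description.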
[jira] Commented: (SOLR-792) Pivot (ie: Decision Tree) Faceting Component
[ https://issues.apache.org/jira/browse/SOLR-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12929590#action_12929590 ] Peter Karich commented on SOLR-792: --- Hi Toke and all, maybe I am a bit evil or stupid, but could someone enlighten me as to why this patch is necessary? Why can't we use the existing mechanisms in Solr (facets!) and a bit of logic while indexing: http://markmail.org/message/2aza6nnsiw3l4bbb#query:+page:1+mid:3j3ttojacpjoyfg5+state:results This has no performance problems when using tons of categories. We are already using it with lots of categories. It works out of the box with nearly unlimited depth (either you need a DB, which is unlimited, or the URL length is the limit). The only drawback of this approach is that you won't be able to display two or more 'branches' at the same time. Only one current branch with its currently possible categories can be shown, which is no limitation in our case, because the UI would be unusable if too many items were visible at the same time. One could introduce a special update component for this feature which uses a category tree (in RAM) built from the JSON or XML definition. I could create such a component if someone is interested. Regards, Peter. Pivot (ie: Decision Tree) Faceting Component Key: SOLR-792 URL: https://issues.apache.org/jira/browse/SOLR-792 Project: Solr Issue Type: New Feature Reporter: Erik Hatcher Assignee: Yonik Seeley Priority: Minor Attachments: SOLR-792-as-helper-class.patch, SOLR-792-PivotFaceting.patch, SOLR-792-PivotFaceting.patch, SOLR-792-PivotFaceting.patch, SOLR-792-PivotFaceting.patch, SOLR-792-raw-type.patch, SOLR-792.patch, SOLR-792.patch, SOLR-792.patch, SOLR-792.patch, SOLR-792.patch, SOLR-792.patch, SOLR-792.patch A component to do multi-level faceting.
[jira] Commented: (SOLR-1709) Distributed Date Faceting
[ https://issues.apache.org/jira/browse/SOLR-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12929760#action_12929760 ] Peter Karich commented on SOLR-1709: Hi Peter Sturge, what are the limitations of this patch? Only that 'earlier' + 'later' isn't supported? What issues remain before committing this into trunk? Distributed Date Faceting - Key: SOLR-1709 URL: https://issues.apache.org/jira/browse/SOLR-1709 Project: Solr Issue Type: Improvement Components: SearchComponents - other Affects Versions: 1.4 Reporter: Peter Sturge Priority: Minor Attachments: FacetComponent.java, FacetComponent.java, ResponseBuilder.java, solr-1.4.0-solr-1709.patch
[jira] Commented: (SOLR-2218) Performance of start= and rows= parameters are exponentially slow with large data sets
[ https://issues.apache.org/jira/browse/SOLR-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12928536#action_12928536 ] Peter Karich commented on SOLR-2218: Lance, would you mind explaining this in a bit more detail :-) ? The idea is to grab all or a lot of the documents from Solr even if the data set is very large, if I haven't misunderstood what Bill was requesting. This is very useful IMHO. Performance of start= and rows= parameters are exponentially slow with large data sets -- Key: SOLR-2218 URL: https://issues.apache.org/jira/browse/SOLR-2218 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 1.4.1 Reporter: Bill Bell With large data sets (10M rows), setting start=large number and rows=large number is slow, and it gets slower the farther you get from start=0 with a complex query. Random also makes this slower. I would like to somehow make this faster for looping through large data sets. It would be nice if we could pass a pointer to the result set to loop over, or support a very large rows=number. Something like: rows=1000 start=0 spointer=string_my_query_1 Then, within an interval (like 5 mins), I can reference this loop. Something like: rows=1000 start=1000 spointer=string_my_query_1 What do you think? Since the data is too great, the cache is not helping.
[jira] Commented: (SOLR-1311) pseudo-field-collapsing
[ https://issues.apache.org/jira/browse/SOLR-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12921324#action_12921324 ] Peter Karich commented on SOLR-1311: Hi Marc, could this issue be closed now that field collapsing is in trunk and more mature? Why can't it be integrated as a plugin?

pseudo-field-collapsing --- Key: SOLR-1311 URL: https://issues.apache.org/jira/browse/SOLR-1311 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.4 Reporter: Marc Sturlese Fix For: Next Attachments: SOLR-1311-pseudo-field-collapsing.patch

I am trying to develop a new way of doing field collapsing based on the adjacent field collapsing algorithm. I started developing it because I am experiencing performance problems with the field collapsing patch on a big index (8G). The algorithm does adjacent-pseudo-field collapsing. It does collapsing on the first X documents. Instead of making the collapsed docs disappear, the algorithm sends them to a given position of the relevance results list. The reason I only do collapsing in the first X documents is that if I have for example 60 results and I am showing 10 results per page, I really don't need to do collapsing on page 3, let alone on page 3000. Doing this I am noticing dramatically better performance. The problem is I couldn't find a way to plug the algorithm in as a component and keep good performance. I had to hack a few classes in SolrIndexSearcher.java. This patch is just experimental and for testing purposes. In case someone finds it interesting, it would be good to find a way to integrate it in a better way than it is at the moment. Advice is more than welcome.

Functionality: In solrconfig.xml we specify the pseudo-collapsing parameters:

  <str name="plus.considerMoreDocs">true</str>
  <str name="plus.considerHowMany">3000</str>
  <str name="plus.considerField">name</str>

(at the moment there's no threshold and other parameters that exist in the current collapse-field patch)

plus.considerMoreDocs enables pseudo-collapsing
plus.considerHowMany sets the number of resultant documents to which we want to apply the algorithm
plus.considerField is the field to do pseudo-collapsing on

If the number of results is lower than plus.considerHowMany, the algorithm will be applied to all the results. Let's say there is a query with 60 results and we've set considerHowMany to 3000 (and we already have the docs sorted by relevance). What adjacent-pseudo-collapse does is: if the 2nd doc has to be collapsed, it will be sent to position 2999 of the relevance results array. If the 3rd has to be collapsed too, it will go to position 2998, and so on. The algorithm is not applied when a sortspec is set or plus.considerMoreDocs is set to false. Neither is it applied when using MoreLikeThisRequestHandler.

Example with a query of 9 results:

Results sorted by relevance without pseudo-collapse-algorithm:
  doc1 - collapse_field_value 3
  doc2 - collapse_field_value 3
  doc3 - collapse_field_value 4
  doc4 - collapse_field_value 7
  doc5 - collapse_field_value 6
  doc6 - collapse_field_value 6
  doc7 - collapse_field_value 5
  doc8 - collapse_field_value 1
  doc9 - collapse_field_value 2

Results pseudo-collapsed with plus.considerHowMany = 5:
  doc1 - collapse_field_value 3
  doc3 - collapse_field_value 4
  doc4 - collapse_field_value 7
  doc5 - collapse_field_value 6
  doc2 - collapse_field_value 3*
  doc6 - collapse_field_value 6
  doc7 - collapse_field_value 5
  doc8 - collapse_field_value 1
  doc9 - collapse_field_value 2

Results pseudo-collapsed with plus.considerHowMany = 9:
  doc1 - collapse_field_value 3
  doc3 - collapse_field_value 4
  doc4 - collapse_field_value 7
  doc5 - collapse_field_value 6
  doc7 - collapse_field_value 5
  doc8 - collapse_field_value 1
  doc9 - collapse_field_value 2
  doc6 - collapse_field_value 6*
  doc2 - collapse_field_value 3*

*pseudo-collapsed documents
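The reordering in the 9-result example can be sketched as below. This is a minimal illustration and not the code in the attached patch: the class and method names are made up, and it assumes that "adjacent" means a document is collapsed when its field value equals that of the document directly before it in the original relevance order. Collapsed documents fill the considered window from its end, so the first one collapsed ends up last.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; not taken from SOLR-1311-pseudo-field-collapsing.patch.
class PseudoCollapseSketch {

    /**
     * Reorders the first considerHowMany documents so that adjacent duplicates
     * (same collapse field value as the preceding document) are pushed to the
     * end of that window. Documents beyond the window keep their positions.
     */
    static List<String> pseudoCollapse(List<String> ids, List<Integer> values, int considerHowMany) {
        // If there are fewer results than considerHowMany, all results are considered.
        int window = Math.min(considerHowMany, ids.size());
        List<String> kept = new ArrayList<>();
        List<String> collapsed = new ArrayList<>();
        for (int i = 0; i < window; i++) {
            boolean duplicate = i > 0 && values.get(i).equals(values.get(i - 1));
            (duplicate ? collapsed : kept).add(ids.get(i));
        }
        // The first collapsed doc goes to the last window slot, the next to the
        // slot before it, and so on, hence the reversal.
        Collections.reverse(collapsed);
        List<String> result = new ArrayList<>(kept);
        result.addAll(collapsed);
        result.addAll(ids.subList(window, ids.size()));
        return result;
    }
}
```

Called with the nine documents and field values from the example, a window of 5 moves only doc2 (to position 5), and a window of 9 yields the order ending in doc6, doc2, as in the listings above.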
[jira] Commented: (SOLR-64) strict hierarchical facets
[ https://issues.apache.org/jira/browse/SOLR-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12921326#action_12921326 ] Peter Karich commented on SOLR-64: -- @SolrFan and @Mats: you could try an alternative solution: http://lucene.472066.n3.nabble.com/multi-level-faceting-tp1629650p1672083.html strict hierarchical facets -- Key: SOLR-64 URL: https://issues.apache.org/jira/browse/SOLR-64 Project: Solr Issue Type: New Feature Components: search Reporter: Yonik Seeley Fix For: Next Attachments: SOLR-64.patch, SOLR-64.patch, SOLR-64.patch, SOLR-64.patch Strict Facet Hierarchies... each tag has at most one parent (a tree). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-385) facet sorting with relevancy
[ https://issues.apache.org/jira/browse/SOLR-385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12921473#action_12921473 ] Peter Karich commented on SOLR-385: --- When I am thinking a bit more about this issue. For the 'ungeneralized version' - sorting against the maximum of the score (or any field?)- we can use the group-feature! http://wiki.apache.org/solr/FieldCollapsing The Solution - I think - would be the following request: http://localhost:8983/solr/select/?q=hardgroup=truegroup.field=manu_exactgroup.limit=1debug=truefl=*,score the collapse groups are ordered by the maxScore I think + hope ;-) So it is the same as we want: http://localhost:8983/solr/select/?q=hardfacet=truefacet.field=manu_exactdebug=truefl=*,scorefacet.stats.sort=max(score) desc Now one remaing task could be to extend this feature with max, min and mean functions ... here is the 'group' result: {code} lst str name=groupValueMaxtor Corp./str − result name=doclist numFound=1 start=0 maxScore=0.70904505 − doc float name=score0.70904505/float − arr name=cat strelectronics/str strhard drive/str /arr − arr name=features strSATA 3.0Gb/s, NCQ/str str8.5ms seek/str str16MB cache/str /arr str name=id6H500F0/str bool name=inStocktrue/bool str name=manuMaxtor Corp./str date name=manufacturedate_dt2006-02-13T15:26:37Z/date − str name=name Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300 /str int name=popularity6/int float name=price350.0/float str name=store45.17614,-93.87341/str /doc /result /lst − lst str name=groupValueSamsung Electronics Co. Ltd./str − result name=doclist numFound=1 start=0 maxScore=0.5908709 − doc float name=score0.5908709/float − arr name=cat strelectronics/str strhard drive/str /arr − arr name=features str7200RPM, 8MB cache, IDE Ultra ATA-133/str − str NoiseGuard, SilentSeek technology, Fluid Dynamic Bearing (FDB) motor /str /arr str name=idSP2514N/str bool name=inStocktrue/bool str name=manuSamsung Electronics Co. 
Ltd./str date name=manufacturedate_dt2006-02-13T15:26:37Z/date − str name=name Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133 /str int name=popularity6/int float name=price92.0/float str name=store45.17614,-93.87341/str /doc /result /lst {code} this would be the faceting result: {code} lst name=facet_fields lst name=manu_exact int name=Maxtor Corp. score=0.709045051/int int name=Samsung Electronics Co. Ltd. score=0.59087091/int ... {code} facet sorting with relevancy Key: SOLR-385 URL: https://issues.apache.org/jira/browse/SOLR-385 Project: Solr Issue Type: New Feature Components: search Reporter: Dmitry Degtyarev Priority: Minor Sometimes facet sort based only on the count of matches is not relevant, I need to sort not only by the count of matches, but also on the scores of matches. In the most simple way it must sort categories by the sum of item scores that matches query and the category. In the best way there should be some coefficient to multiply Scores or some function. Is it possible to implement such a behavior for facet sort? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
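To make the 'ungeneralized version' from the comment concrete (order facet values by the maximum score among their matching documents), here is a minimal client-side sketch. It does not use any Solr API; the Hit class and the method name are hypothetical, and the scores in the usage note are the two maxScore values quoted in the group result.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; plain Java, no Solr classes involved.
class FacetMaxScoreSketch {

    /** One search hit: the facet value it belongs to and its relevance score. */
    static final class Hit {
        final String facetValue;
        final float score;

        Hit(String facetValue, float score) {
            this.facetValue = facetValue;
            this.score = score;
        }
    }

    /** Returns facet values with their maximum score, highest score first. */
    static List<Map.Entry<String, Float>> sortFacetsByMaxScore(List<Hit> hits) {
        Map<String, Float> maxScore = new LinkedHashMap<>();
        for (Hit h : hits) {
            // Keep the larger of the stored score and this hit's score.
            maxScore.merge(h.facetValue, h.score, (a, b) -> Math.max(a, b));
        }
        List<Map.Entry<String, Float>> sorted = new ArrayList<>(maxScore.entrySet());
        sorted.sort((a, b) -> Float.compare(b.getValue(), a.getValue()));
        return sorted;
    }
}
```

With one hit for "Maxtor Corp." at 0.70904505 and one for "Samsung Electronics Co. Ltd." at 0.5908709, Maxtor sorts first, which is the same order the grouped response shows. Swapping Math.max for a sum or a mean would give the other aggregations mentioned as a remaining task.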
[jira] Commented: (SOLR-2059) Allow customizing how WordDelimiterFilter tokenizes text.
[ https://issues.apache.org/jira/browse/SOLR-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902588#action_12902588 ] Peter Karich commented on SOLR-2059: Robert, thanks for this work! I have a different application for this patch: in a Twitter search, # and @ shouldn't be removed. Instead I will handle them like ALPHA, I think. Would you mind updating the patch for the latest version of the trunk? I get a problem with WordDelimiterIterator at line 254 if I am using https://svn.apache.org/repos/asf/lucene/dev/trunk/solr and a 'file is missing' problem (line 37) for http://svn.apache.org/repos/asf/solr

Allow customizing how WordDelimiterFilter tokenizes text. - Key: SOLR-2059 URL: https://issues.apache.org/jira/browse/SOLR-2059 Project: Solr Issue Type: New Feature Components: Schema and Analysis Reporter: Robert Muir Priority: Minor Fix For: 3.1, 4.0 Attachments: SOLR-2059.patch

By default, WordDelimiterFilter assigns 'types' to each character (computed from Unicode Properties). Based on these types and the options provided, it splits and concatenates text. In some circumstances, you might need to tweak the behavior of how this works. It seems the filter already had this in mind, since you can pass in a custom byte[] type table. But it's not exposed in the factory. I think you should be able to customize the defaults with a configuration file:

{noformat}
# A customized type mapping for WordDelimiterFilterFactory
# the allowable types are: LOWER, UPPER, ALPHA, DIGIT, ALPHANUM, SUBWORD_DELIM
#
# the default for any character without a mapping is always computed from
# Unicode character properties

# Map the $, %, '.', and ',' characters to DIGIT
# This might be useful for financial data.
$ = DIGIT
% = DIGIT
. = DIGIT
\u002C = DIGIT
{noformat}
[jira] Commented: (SOLR-2059) Allow customizing how WordDelimiterFilter tokenizes text.
[ https://issues.apache.org/jira/browse/SOLR-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902600#action_12902600 ] Peter Karich commented on SOLR-2059: Oops, my mistake ... this helped! What do you think of the file format, is it ok for describing these categories? I think it is ok. I even had a simpler patch before stumbling over yours: handleAsChar=@# which is now more powerful IMHO: @ = ALPHA # = ALPHA Allow customizing how WordDelimiterFilter tokenizes text. - Key: SOLR-2059 URL: https://issues.apache.org/jira/browse/SOLR-2059 Project: Solr Issue Type: New Feature Components: Schema and Analysis Reporter: Robert Muir Priority: Minor Fix For: 3.1, 4.0 Attachments: SOLR-2059.patch
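A minimal sketch of how a mapping file in the proposed "character = TYPE" format could be read. This is not the parser from SOLR-2059.patch: the class, enum and method names are hypothetical, and the handling of a line such as "# = ALPHA" is an assumption, since the proposal uses '#' for comments as well. Here a line that has the mapping shape is treated as a mapping first, and only otherwise as a comment.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the parser used by WordDelimiterFilterFactory.
class TypeTableSketch {

    /** The allowable types listed in the proposed file format. */
    enum CharType { LOWER, UPPER, ALPHA, DIGIT, ALPHANUM, SUBWORD_DELIM }

    // One key token (a single character or a six-character unicode escape), '=', then a type name.
    private static final Pattern MAPPING = Pattern.compile("^(\\S+)\\s*=\\s*([A-Z_]+)$");

    static Map<Character, CharType> parse(List<String> lines) {
        Map<Character, CharType> table = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            Matcher m = MAPPING.matcher(line);
            if (!m.matches()) {
                if (line.startsWith("#")) {
                    continue; // ordinary comment line
                }
                throw new IllegalArgumentException("Bad mapping line: " + line);
            }
            // CharType.valueOf rejects unknown type names with an IllegalArgumentException.
            table.put(parseChar(m.group(1)), CharType.valueOf(m.group(2)));
        }
        return table;
    }

    /** Accepts a literal character or a backslash, 'u' and four hex digits. */
    static char parseChar(String s) {
        if (s.length() == 1) {
            return s.charAt(0);
        }
        if (s.length() == 6 && s.startsWith("\\u")) {
            return (char) Integer.parseInt(s.substring(2), 16);
        }
        throw new IllegalArgumentException("Expected a single character or a unicode escape: " + s);
    }
}
```

Characters without an entry in the returned table would fall back to the type computed from Unicode properties, as the issue description says.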
[jira] Issue Comment Edited: (SOLR-2059) Allow customizing how WordDelimiterFilter tokenizes text.
[ https://issues.apache.org/jira/browse/SOLR-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902600#action_12902600 ] Peter Karich edited comment on SOLR-2059 at 8/25/10 3:46 PM: - Ups, my mistake ... this helped! What do you think of the file format, is it ok for describing these categories? I think it is ok. I even had a more simpler patch before stumbling over yours: handleAsChar=@# which is now more powerful IMHO: {code} @ = ALPHA # = ALPHA {code} was (Author: peathal): Ups, my mistake ... this helped! What do you think of the file format, is it ok for describing these categories? I think it is ok. I even had a more simpler patch before stumbling over yours: handleAsChar=@# which is now more powerful IMHO: @ = ALPHA # = ALPHA Allow customizing how WordDelimiterFilter tokenizes text. - Key: SOLR-2059 URL: https://issues.apache.org/jira/browse/SOLR-2059 Project: Solr Issue Type: New Feature Components: Schema and Analysis Reporter: Robert Muir Priority: Minor Fix For: 3.1, 4.0 Attachments: SOLR-2059.patch
[jira] Created: (SOLR-2005) NullPointerException for more like this request handler via SolrJ if the document does not exist
NullPointerException for more like this request handler via SolrJ if the document does not exist Key: SOLR-2005 URL: https://issues.apache.org/jira/browse/SOLR-2005 Project: Solr Issue Type: Bug Components: clients - java, MoreLikeThis Affects Versions: 1.4 Environment: jdk1.6 Reporter: Peter Karich

If I query solr with the following (via SolrJ):

q=myUniqueKey%3AsomeValueWhichDoesNotExist&qt=%2Fmlt&mlt.fl=myMLTField&mlt.minwl=2&mlt.mindf=1&mlt.match.include=false&facet=true&facet.sort=count&facet.mincount=1&facet.limit=10&facet.field=differentFacetField&start=0&rows=10

I get:

org.apache.solr.client.solrj.SolrServerException: Error executing query
  at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
  at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
Caused by: java.lang.NullPointerException
  at org.apache.solr.client.solrj.response.QueryResponse.extractFacetInfo(QueryResponse.java:180)
  at org.apache.solr.client.solrj.response.QueryResponse.setResponse(QueryResponse.java:103)
  at org.apache.solr.client.solrj.response.QueryResponse.<init>(QueryResponse.java:80)
  at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)

The xml response of the url is empty and so the info variable at line NamedList<Integer> fq = (NamedList<Integer>) info.get( "facet_queries" ); (QueryResponse) is null. Maybe all variables at QueryResponse.setResponse should be checked against null? Sth. like val = res.getVal( i ); if(val == null) continue; ?
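The null check suggested in the report could look roughly like the sketch below. It is not the SolrJ code and not a proposed patch: the class and method are hypothetical, and a plain Map stands in for the facet info structure so that the example runs without Solr on the classpath.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch of the guard; java.util.Map stands in for the real response structure.
class FacetInfoGuardSketch {

    /**
     * Returns the facet query counts, or an empty map when the response has no
     * facet section at all (as with the empty response described above).
     */
    @SuppressWarnings("unchecked")
    static Map<String, Integer> extractFacetQueries(Map<String, Object> info) {
        if (info == null) {
            return Collections.emptyMap(); // no facet info in the response
        }
        Object facetQueries = info.get("facet_queries");
        if (facetQueries == null) {
            return Collections.emptyMap(); // facet info present, but no facet_queries entry
        }
        return (Map<String, Integer>) facetQueries;
    }
}
```

Returning an empty result instead of failing lets callers handle "document not found" the same way as "no facets requested".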
[jira] Updated: (SOLR-2005) NullPointerException for more like this request handler via SolrJ if the document does not exist
[ https://issues.apache.org/jira/browse/SOLR-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Karich updated SOLR-2005: --- Priority: Minor (was: Major) NullPointerException for more like this request handler via SolrJ if the document does not exist Key: SOLR-2005 URL: https://issues.apache.org/jira/browse/SOLR-2005 Project: Solr Issue Type: Bug Components: clients - java, MoreLikeThis Affects Versions: 1.4 Environment: jdk1.6 Reporter: Peter Karich Priority: Minor Original Estimate: 0.33h Remaining Estimate: 0.33h
[jira] Commented: (SOLR-787) SolrJ POM refers to stax parser
[ https://issues.apache.org/jira/browse/SOLR-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12878385#action_12878385 ] Peter Karich commented on SOLR-787: --- Is this really correctly fixed? Inspecting my deps with NetBeans' maven dep viewer I don't understand why Solr uses woodstox and SolrJ uses the different artifact (but same jar) org.codehaus.woodstox And according to http://jarvana.com/jarvana/inspect-pom/org/apache/solr/solr-core/1.4.0/solr-core-1.4.0.pom http://jarvana.com/jarvana/inspect-pom/org/apache/solr/solr-solrj/1.4.0/solr-solrj-1.4.0.pom NetBeans is correct. The problem with this is, that you will have two identical jars in the classpath and that the solrj dep forces you to still use stax-api SolrJ POM refers to stax parser --- Key: SOLR-787 URL: https://issues.apache.org/jira/browse/SOLR-787 Project: Solr Issue Type: Bug Affects Versions: 1.3 Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 1.4 Attachments: SOLR-787.patch Solr core moved to using woodstox instead of stax but SolrJ POM still has a dependency to stax. We should replace the dependency to stax with woodstox jar in SolrJ's POM. This is not a huge problem as we are not distributing stax anymore but is needed for consistency. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Created: (SOLR-1950) SolrJ POM still refers to stax parser
SolrJ POM still refers to stax parser - Key: SOLR-1950 URL: https://issues.apache.org/jira/browse/SOLR-1950 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.4 Reporter: Peter Karich Priority: Minor See the issue at https://issues.apache.org/jira/browse/SOLR-787 which seems to be incorrectly fixed. (I cannot reopen that issue, so I create this one here) Using the following deps: dependency groupIdorg.apache.solr/groupId artifactIdsolr-solrj/artifactId version1.4.0/version /dependency dependency artifactIdsolr-core/artifactId groupIdorg.apache.solr/groupId version1.4.0/version /dependency will lead to duplicate jars. (Solr uses woodstox and SolrJ uses the different artifact (but same jar) org.codehaus.woodstox ) But maybe the artifacts are only incorrectly deployed? Where can I find the original pom files? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-1864) Master/Slave replication causes tomcat to be unresponsive on slave till replication is being done.
[ https://issues.apache.org/jira/browse/SOLR-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12874055#action_12874055 ] Peter Karich commented on SOLR-1864: This might be a duplicate of: https://issues.apache.org/jira/browse/SOLR-1775 The reason might be (as Paul Noble noted) that the garbage collector is busy a lot because of autowarm up after index switch was done Master/Slave replication causes tomcat to be unresponsive on slave till replication is being done. -- Key: SOLR-1864 URL: https://issues.apache.org/jira/browse/SOLR-1864 Project: Solr Issue Type: Bug Components: replication (java) Affects Versions: 1.5 Environment: Centos 5.2, Tomcat5, java version 1.6.0 OpenJDK Runtime Environment (build 1.6.0-b09) OpenJDK 64-Bit Server VM (build 1.6.0-b09, mixed mode) Reporter: Marcin Hi guys, I have found a strange behaviour on tomcat5, centos 5.2. While replication is being done ( million rows) tomcat5 seems to be unresponsive till its finished. Please help cheers, /Marcin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841752#action_12841752 ]

Peter Karich commented on SOLR-236:
-----------------------------------

> Shouldn't the float array in DocSetScoreCollector be changed to a Map?

Hmm, maybe I expressed myself badly: I have already changed all of this to a Map (a SortedMap). I started the change in DocSetScoreCollector and then changed all the other occurrences of the float array (otherwise I would have had to copy the entire map).

> I think the compare method should NOT be called if no docs are in the scores array ... ?
> I would expect that every docId has a score.

Yes, me too. So I expect there is a bug somewhere. But as I said, this breaks only one test (collapse with faceting before). It could even be a bug in the test case, though.

Field collapsing
----------------
Key: SOLR-236
URL: https://issues.apache.org/jira/browse/SOLR-236
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
Fix For: 1.5
Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch

This patch includes a new feature called field collapsing. It is used to collapse a group of results with a similar value for a given field into a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also duplicate detection.
http://www.fastsearch.com/glossary.aspx?m=48&amid=299

The implementation adds 3 new query parameters (SolrParams):
- collapse.field to choose the field used to group results
- collapse.type normal (default value) or adjacent
- collapse.max to select how many continuous results are allowed before collapsing

TODO (in progress):
- More documentation (on source code)
- Test cases

Two patches:
- field_collapsing.patch for current development version
- field_collapsing_1.1.0.patch for Solr-1.1.0

P.S.: Feedback and misspelling corrections are welcome ;-)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
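As an illustration of the three collapse.* parameters listed in the issue description, here is a minimal sketch that assembles them into a URL query string using only the JDK. The class name CollapseQuerySketch and the method buildQuery are hypothetical and not part of any patch attached to this issue; the field name myfield is the placeholder used elsewhere in this thread. Whether a given patch version accepts every combination of these parameters is not verified here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only: builds the query-string part
// of a select request that uses the collapse.* parameters from the issue.
final class CollapseQuerySketch {

    static String buildQuery(String q, String field, String type, int max) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", q);
        params.put("collapse.field", field);              // field used to group results
        params.put("collapse.type", type);                // "normal" (default) or "adjacent"
        params.put("collapse.max", Integer.toString(max)); // results allowed before collapsing

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&'); // parameters must be separated by '&'
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling CollapseQuerySketch.buildQuery("*:*", "myfield", "adjacent", 1) yields a string that can be appended after "select?" on whatever host and core path the installation uses.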
[jira] Updated: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Karich updated SOLR-236:
------------------------------

Attachment: NonAdjacentDocumentCollapserTest.java
            NonAdjacentDocumentCollapser.java
            DocSetScoreCollector.java

It seems to me that the provided changes are necessary to make the OutOfMemory exception go away. Please apply the files with caution, because I made the changes from an old patch (from Nov 2009).
[jira] Issue Comment Edited: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841756#action_12841756 ]

Peter Karich edited comment on SOLR-236 at 3/5/10 8:53 AM:
-----------------------------------------------------------

It seems to me that the provided changes are necessary to make the OutOfMemory exception go away (see the 3 attached files). Please apply the files with caution, because I made the changes from an old patch (from Nov 2009).

was (Author: peathal):
It seems to me that the provides changes are necessary to make the OutOfMemory exception gone. Please apply the files with caution, because I made the changes from an old patch (from Nov 2009)
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841147#action_12841147 ]

Peter Karich commented on SOLR-236:
-----------------------------------

Regarding the OutOfMemory problem: we are now testing the suggested change in production. I replaced the float array with a TreeMap<Integer, Float>. The change was nearly trivial. (I cannot easily provide a patch because we are using an older patch, although I could post the 3 changed files.)

The reason I used a TreeMap instead of a HashMap is that the advance method in the class NonAdjacentDocumentCollapser.PredefinedScorer needs the tailMap method:

{noformat}
public int advance(int target) throws IOException {
    // now we need a treemap method:
    iter = scores.tailMap(target).entrySet().iterator();
    if (iter.hasNext())
        return target;
    else
        return NO_MORE_DOCS;
}
{noformat}

Then, I think, I discovered a bug or inconsistent behaviour: if I run the test FieldCollapsingIntegrationTest.testNonAdjacentCollapse_withFacetingBefore, the scores array is created via new float[maxDocs] in the old version. But the array is never filled with any values, so

Float value1 = values.get(doc1);

returns null in the method NonAdjacentDocumentCollapser.FloatValueFieldComparator.compare (the size of the TreeMap is 0!). I work around this via

{noformat}
if (value1 == null)
    value1 = 0f;
if (value2 == null)
    value2 = 0f;
{noformat}

although the compare method should be called if no docs are in the scores array ... ?
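The idea described in the comment above (a sparse sorted map in place of a float[maxDocs] array, plus a null-safe comparison) can be sketched in isolation as follows. This is a minimal stand-alone illustration, not the code from the attached files: the class SparseScores and its methods are hypothetical, NO_MORE_DOCS is defined locally as an assumption rather than taken from Lucene, and advance here returns the first stored doc id at or after the target (via ceilingKey), which differs from the quoted snippet that returns the target itself.

```java
import java.util.TreeMap;

// Hypothetical stand-alone sketch: scores are stored only for documents
// that actually have one, instead of allocating float[maxDocs].
final class SparseScores {

    // Local sentinel, assumed for this sketch only.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final TreeMap<Integer, Float> scores = new TreeMap<Integer, Float>();

    void put(int doc, float score) {
        scores.put(doc, score);
    }

    // First doc id >= target that has a score, or NO_MORE_DOCS if none.
    // A sorted map is needed for this lookup; a HashMap cannot answer it.
    int advance(int target) {
        Integer next = scores.ceilingKey(target);
        return next == null ? NO_MORE_DOCS : next.intValue();
    }

    // Missing entries are treated as 0f, mirroring the null workaround
    // described in the comment.
    float scoreOrZero(int doc) {
        Float value = scores.get(doc);
        return value == null ? 0f : value.floatValue();
    }

    int compare(int doc1, int doc2) {
        return Float.compare(scoreOrZero(doc1), scoreOrZero(doc2));
    }
}
```

The trade-off is the usual one: the map uses memory proportional to the number of scored documents rather than to maxDocs, at the cost of boxing and O(log n) lookups.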
[jira] Issue Comment Edited: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841147#action_12841147 ]

Peter Karich edited comment on SOLR-236 at 3/4/10 9:48 AM:
-----------------------------------------------------------

Regarding the OutOfMemory problem: we are now testing the suggested change in production. I replaced the float array with a TreeMap<Integer, Float>. The change was nearly trivial. (I cannot easily provide a patch because we are using an older patch, although I could post the 3 changed files.)

The reason I used a TreeMap instead of a HashMap is that the advance method in the class NonAdjacentDocumentCollapser.PredefinedScorer needs the tailMap method:

{noformat}
public int advance(int target) throws IOException {
    // now we need a treemap method:
    iter = scores.tailMap(target).entrySet().iterator();
    if (iter.hasNext())
        return target;
    else
        return NO_MORE_DOCS;
}
{noformat}

Then, I think, I discovered a bug or inconsistent behaviour: if I run the test FieldCollapsingIntegrationTest.testNonAdjacentCollapse_withFacetingBefore, the scores array is created via new float[maxDocs] in the old version. But the array is never filled with any values, so

Float value1 = values.get(doc1);

returns null in the method NonAdjacentDocumentCollapser.FloatValueFieldComparator.compare (the size of the TreeMap is 0!). I work around this via

{noformat}
if (value1 == null)
    value1 = 0f;
if (value2 == null)
    value2 = 0f;
{noformat}

I think the compare method should NOT be called if no docs are in the scores array ... ?
[jira] Issue Comment Edited: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841147#action_12841147 ]

Peter Karich edited comment on SOLR-236 at 3/4/10 9:46 AM:
-----------------------------------------------------------

Regarding the OutOfMemory problem: we are now testing the suggested change in production. I replaced the float array with a TreeMap<Integer, Float>. The change was nearly trivial. (I cannot easily provide a patch because we are using an older patch, although I could post the 3 changed files.)

The reason I used a TreeMap instead of a HashMap is that the advance method in the class NonAdjacentDocumentCollapser.PredefinedScorer needs the tailMap method:

{noformat}
public int advance(int target) throws IOException {
    // now we need a treemap method:
    iter = scores.tailMap(target).entrySet().iterator();
    if (iter.hasNext())
        return target;
    else
        return NO_MORE_DOCS;
}
{noformat}

Then, I think, I discovered a bug or inconsistent behaviour: if I run the test FieldCollapsingIntegrationTest.testNonAdjacentCollapse_withFacetingBefore, the scores array is created via new float[maxDocs] in the old version. But the array is never filled with any values, so

Float value1 = values.get(doc1);

returns null in the method NonAdjacentDocumentCollapser.FloatValueFieldComparator.compare (the size of the TreeMap is 0!). I work around this via

{noformat}
if (value1 == null)
    value1 = 0f;
if (value2 == null)
    value2 = 0f;
{noformat}

although the compare method should be called if no docs are in the scores array ... ?
[jira] Commented: (SOLR-1167) Support module xml config files using XInclude
[ https://issues.apache.org/jira/browse/SOLR-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841178#action_12841178 ]

Peter Karich commented on SOLR-1167:
------------------------------------

@Shalin Shekhar Mangar: how can I use the proposed attribute feature for a master+slave configuration? Do you have a code snippet?

Support module xml config files using XInclude
----------------------------------------------
Key: SOLR-1167
URL: https://issues.apache.org/jira/browse/SOLR-1167
Project: Solr
Issue Type: New Feature
Affects Versions: 1.4
Reporter: Bryan Talbot
Assignee: Grant Ingersoll
Priority: Minor
Fix For: 1.4
Attachments: SOLR-1167.patch, SOLR-1167.patch, SOLR-1167.patch, SOLR-1167.patch, SOLR-1167.patch

Current configuration files (schema and solrconfig) are monolithic, which can make maintenance and reuse more difficult than it needs to be. The XML standards include a feature to include content from external files. This is described at http://www.w3.org/TR/xinclude/

This feature is to add support for XInclude features for XML configuration files.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
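To make the XInclude mechanism from the issue description concrete, the following is a minimal sketch of XInclude-aware parsing with the JDK's own DOM parser. It is not taken from the attached patches and does not claim to show how Solr's config loader is wired up; the class XIncludeSketch, the file names main.xml and part.xml, and the element names are all made up for the illustration. It does not address the master+slave question asked in the comment.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo: one XML file pulls in another through xi:include.
final class XIncludeSketch {

    static Document parseWithXInclude(File file) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // XInclude processing requires namespace awareness.
        factory.setNamespaceAware(true);
        factory.setXIncludeAware(true);
        // Parsing from a File gives the parser a base URI, so the
        // relative href in xi:include resolves next to the main file.
        return factory.newDocumentBuilder().parse(file);
    }

    // Writes two small files to a temp directory, parses the main one and
    // returns how many <part> elements ended up in the merged document.
    static int demo() {
        try {
            Path dir = Files.createTempDirectory("xinclude-sketch");
            Files.write(dir.resolve("part.xml"),
                    "<part name=\"shared\"/>".getBytes(StandardCharsets.UTF_8));
            Path main = dir.resolve("main.xml");
            Files.write(main,
                    ("<config xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
                            + "<xi:include href=\"part.xml\"/>"
                            + "</config>").getBytes(StandardCharsets.UTF_8));
            Document doc = parseWithXInclude(main.toFile());
            return doc.getElementsByTagName("part").getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch the shared fragment lives in its own file and the main file only references it, which is the kind of reuse the issue description is asking for.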
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835230#action_12835230 ]

Peter Karich commented on SOLR-236:
-----------------------------------

We are facing OutOfMemory problems too. We are using https://issues.apache.org/jira/secure/attachment/12425775/field-collapse-5.patch

> Are you using any other features besides plain collapsing? The field collapse cache gets large very quickly, I suggest you turn it off (if you are using it). Also you can try to make your filterCache smaller.

How can I turn off the collapse cache or make the filterCache smaller? Are there other workarounds, e.g. using a special version of the patch?

I read that specifying collapse.maxdocs could help, but it didn't help in our case. Could collapse.type=adjacent help here? (https://issues.apache.org/jira/browse/SOLR-236?focusedCommentId=12495376&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12495376) What do you think?

BTW: We really like this patch and would like to use it!! :-)
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835258#action_12835258 ]

Peter Karich commented on SOLR-236:
-----------------------------------

The latest patch (from 1st Feb 2010) compiles against the solr-2010-02-13 nightly build but does not work. If I query http://searchdev05:15100/cs-bidcs/select?q=*:*&collapse.field=myfield it fails with:

{noformat}
HTTP Status 500 - null java.lang.NullPointerException
at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
at ...
{noformat}

I only need the OutOfMemory problem solved ... :-(
[jira] Issue Comment Edited: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835258#action_12835258 ] Peter Karich edited comment on SOLR-236 at 2/18/10 4:06 PM: Trying the latest patch from 1th Feb 2010 compiles against solr-2010-02-13 from nightly build but does not work. If I query http://server/cs-bidcs/select?q=*:*collapse.field=myfield it fails with: {noformat} HTTP Status 500 - null java.lang.NullPointerException at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329) at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348) at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58) at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.ja va:84) at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193) at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192) at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195) at ... {noformat} I only need the OutOfMemory problem solved ... :-( was (Author: peathal): Trying the latest patch from 1th Feb 2010 compiles against solr-2010-02-13 from nightly build but does not work. 
If I query http://searchdev05:15100/cs-bidcs/select?q=*:*collapse.field=myfield it fails with: {noformat} HTTP Status 500 - null java.lang.NullPointerException at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329) at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348) at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58) at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.ja va:84) at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193) at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192) at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195) at ... {noformat} I only need the OutOfMemory problem solved ... 
:-(

Field collapsing

Key: SOLR-236
URL: https://issues.apache.org/jira/browse/SOLR-236
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
Fix For: 1.5
Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch

This patch includes a new feature called "Field collapsing". It collapses a group of results with a similar value for a given field into a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also "Duplicate detection".
http://www.fastsearch.com/glossary.aspx?m=48&amid=299

The implementation adds 3 new query parameters (SolrParams):
collapse.field to choose the field used to group results
collapse.type normal (default value) or adjacent
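As a rough illustration of the two collapse.type values named in the description, here is a minimal Java sketch. It is not taken from any of the attached patches: the class name FieldCollapseSketch and both method names are hypothetical, and it assumes that "normal" keeps only the first (highest-ranked) result for each distinct field value, while "adjacent" drops a result only when it directly follows another result with the same value. The input is the list of collapse-field values in rank order; the output is the list of positions that survive.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch, not the SOLR-236 implementation.
public class FieldCollapseSketch {

    // Assumed "normal" behaviour: keep the first position seen for each distinct value.
    public static List<Integer> collapseNormal(List<String> fieldValues) {
        Set<String> seen = new HashSet<>();
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < fieldValues.size(); i++) {
            // Set.add returns false when the value was already present.
            if (seen.add(fieldValues.get(i))) {
                kept.add(i);
            }
        }
        return kept;
    }

    // Assumed "adjacent" behaviour: drop a position only if the previous one had the same value.
    public static List<Integer> collapseAdjacent(List<String> fieldValues) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < fieldValues.size(); i++) {
            boolean sameAsPrevious =
                i > 0 && Objects.equals(fieldValues.get(i), fieldValues.get(i - 1));
            if (!sameAsPrevious) {
                kept.add(i);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Site values of four ranked results (made-up data).
        List<String> sites = List.of("a.com", "a.com", "b.com", "a.com");
        System.out.println("normal:   " + collapseNormal(sites));
        System.out.println("adjacent: " + collapseAdjacent(sites));
    }
}
```

With the made-up input above, the last a.com result is dropped under the assumed "normal" behaviour but kept under "adjacent", because a b.com result sits between it and the earlier a.com results.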
[jira] Issue Comment Edited: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835258#action_12835258 ]

Peter Karich edited comment on SOLR-236 at 2/18/10 4:06 PM:

Trying the latest patch from 1st Feb 2010 compiles against solr-2010-02-13 from nightly build but does not work.
If I query http://server/solr-app/select?q=*:*&collapse.field=myfield it fails with:

{noformat}
HTTP Status 500 - null

java.lang.NullPointerException
    at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
    at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
    at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
    at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
    at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
    at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at ...
{noformat}

I only need the OutOfMemory problem solved ... :-(

was (Author: peathal):

Trying the latest patch from 1st Feb 2010 compiles against solr-2010-02-13 from nightly build but does not work.
If I query http://server/cs-bidcs/select?q=*:*&collapse.field=myfield it fails with:

{noformat}
HTTP Status 500 - null

java.lang.NullPointerException
    at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
    at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
    at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
    at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
    at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
    at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at ...
{noformat}

I only need the OutOfMemory problem solved ...
:-(
[jira] Issue Comment Edited: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835258#action_12835258 ]

Peter Karich edited comment on SOLR-236 at 2/18/10 4:07 PM:

Trying the latest patch from 1st Feb 2010. It compiles against solr-2010-02-13 from nightly build dir, but does not work.
If I query http://server/solr-app/select?q=*:*&collapse.field=myfield it fails with:

{noformat}
HTTP Status 500 - null

java.lang.NullPointerException
    at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
    at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
    at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
    at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
    at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
    at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at ...
{noformat}

I only need the OutOfMemory problem solved ... :-(

was (Author: peathal):

Trying the latest patch from 1st Feb 2010 compiles against solr-2010-02-13 from nightly build but does not work.
If I query http://server/solr-app/select?q=*:*&collapse.field=myfield it fails with:

{noformat}
HTTP Status 500 - null

java.lang.NullPointerException
    at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
    at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
    at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
    at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
    at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
    at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at ...
{noformat}

I only need the OutOfMemory problem solved ...
:-(