[jira] [Commented] (LUCENE-2228) AES Encrypted Directory

2014-01-03 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861923#comment-13861923
 ] 

Peter Karich commented on LUCENE-2228:
--

What is the state here?

 AES Encrypted Directory
 ---

 Key: LUCENE-2228
 URL: https://issues.apache.org/jira/browse/LUCENE-2228
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/other
Affects Versions: 3.1
Reporter: Jay Mundrawala
 Attachments: LUCENE-2228.patch, lucene-encryption.tar.gz


 Provides an encryption solution for Lucene indexes, using the AES encryption 
 algorithm.
 You must have the JCE Unlimited Strength Jurisdiction Policy Files 6 Release 
 Candidate, which you can get from java.sun.com.
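
A quick way to see whether the unlimited-strength policy files are active on a given JVM is to ask the JCE for the largest AES key it permits. This is only a sketch: the class name AesPolicyCheck is made up for illustration, and the assumption that the patch needs keys longer than 128 bits is a guess based on the policy-file requirement, not something the issue states.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

// Hypothetical helper, not part of the attached patch.
public class AesPolicyCheck {

    /** Largest AES key size, in bits, that the installed JCE policy allows. */
    static int maxAesKeyBits() throws NoSuchAlgorithmException {
        // Returns Integer.MAX_VALUE when no limit is imposed by the policy files.
        return Cipher.getMaxAllowedKeyLength("AES");
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        int bits = maxAesKeyBits();
        System.out.println("Max AES key length allowed: " + bits + " bits");
        if (bits < 256) {
            // Assumption: the encrypted directory may want 256-bit keys.
            System.out.println("AES-256 is not permitted by the current policy files.");
        } else {
            System.out.println("AES-256 is permitted.");
        }
    }
}
```

If the reported limit is 128, installing the policy files mentioned above should raise it.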



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1729) Date Facet now override time parameter

2010-12-12 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970612#action_12970612
 ] 

Peter Karich commented on SOLR-1729:


Hi Yonik,

Sorry for another misposting: yes, you were right, it was the wrong Solr 
version. It was too late yesterday :-/

All is fine now with this patch. But the 
org.apache.solr.request.SolrRequestInfo class is missing, or am I completely 
crazy now? (I checked out Solr twice and applied the patch again, but it didn't 
compile.)

Regards,
Peter.

 Date Facet now override time parameter
 --

 Key: SOLR-1729
 URL: https://issues.apache.org/jira/browse/SOLR-1729
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
 Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
 Attachments: FacetParams.java, SimpleFacets.java, 
 solr-1.4.0-solr-1729.patch, SOLR-1729.patch, SOLR-1729.patch, 
 UnInvertedField.java


 This PATCH introduces a new query parameter that tells a (typically, but not 
 necessarily) remote server what time to use as 'NOW' when calculating date 
 facets for a query (and, for the moment, date facets *only*) - overriding the 
 default behaviour of using the local server's current time.
 This gets 'round a problem whereby an explicit time range is specified in a 
 query (e.g. timestamp:[then0 TO then1]), and date facets are required for the 
 given time range (in fact, any explicit time range). 
 Because DateMathParser performs all its calculations from 'NOW', remote 
 callers have to work out how long ago 'then0' and 'then1' are from 'now', and 
 use the relative-to-now values in the facet.date.xxx parameters. If a remote 
 server has a different opinion of NOW compared to the caller, the results 
 will be skewed (e.g. they are in a different time-zone, not time-synced etc.).
 This becomes particularly salient when performing distributed date faceting 
 (see SOLR-1709), where multiple shards may all be running with different 
 times, and the faceting needs to be aligned.
 The new parameter is called 'facet.date.now', and takes as a parameter a 
 (stringified) long that is the number of milliseconds from the epoch (1 Jan 
 1970 00:00) - i.e. the returned value from a System.currentTimeMillis() call. 
 This was chosen over a formatted date to delineate it from a 'searchable' 
 time and to avoid superfluous date parsing. This makes the value generally a 
 programmatically set value, but as that is where the use-case is for this type 
 of parameter, this should be ok.
 NOTE: This parameter affects date facet timing only. If there are other areas 
 of a query that rely on 'NOW', these will not interpret this value. This is a 
 broader issue about setting a 'query-global' NOW that all parts of query 
 analysis can share.
 Source files affected:
 FacetParams.java   (holds the new constant FACET_DATE_NOW)
 SimpleFacets.java  getFacetDateCounts() NOW parameter modified
 This PATCH is mildly related to SOLR-1709 (Distributed Date Faceting), but as 
 it's a general change for date faceting, it was deemed deserving of its own 
 patch. I will be updating SOLR-1709 in due course to include the use of this 
 new parameter, after some rfc acceptance.
 A possible enhancement to this is to detect facet.date fields, look for and 
 match these fields in queries (if they exist), and potentially determine 
 automatically the required time skew, if any. There are a whole host of 
 reasons why this could be problematic to implement, so an explicit 
 facet.date.now parameter is the safest route.
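
To make the description above concrete, here is a rough sketch of how a caller might build a request that pins 'NOW' for date faceting. It uses only the JDK; the class and method names (FacetDateNowQuery, buildQuery) and the field name 'timestamp' are made up for illustration, and 'facet.date.now' only has an effect on a server that has this patch applied.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical client-side helper, not code from the patch.
public class FacetDateNowQuery {

    /** Builds a URL query string for date faceting relative to a caller-supplied NOW. */
    static String buildQuery(String dateField, long nowMillis) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");
        params.put("facet", "true");
        params.put("facet.date", dateField);
        // Date math is still written relative to NOW ...
        params.put("facet.date.start", "NOW-1HOUR");
        params.put("facet.date.end", "NOW");
        params.put("facet.date.gap", "+10MINUTES");
        // ... but NOW is the caller's clock, sent as stringified epoch millis.
        params.put("facet.date.now", Long.toString(nowMillis));

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 is always supported", ex);
        }
    }

    public static void main(String[] args) {
        // Every shard receiving this query string would facet from the same instant.
        System.out.println(buildQuery("timestamp", System.currentTimeMillis()));
    }
}
```

Sending the same query string to every shard is what would keep the facet buckets aligned, even if the shards' own clocks disagree.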

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1729) Date Facet now override time parameter

2010-12-12 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970671#action_12970671
 ] 

Peter Karich commented on SOLR-1729:


Nice, now this patch 1729 applies, compiles, and runs the tests successfully (I'm 
using rev 1044942 of trunk).

One further question: would facet queries (with dates) work in the distributed 
setup without the date patches? That would give me a quick(er) workaround, because 
otherwise I would need the patch for 1.4.1 (for Solandra).

 Date Facet now override time parameter
 --

 Key: SOLR-1729
 URL: https://issues.apache.org/jira/browse/SOLR-1729
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
 Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
 Attachments: FacetParams.java, SimpleFacets.java, 
 solr-1.4.0-solr-1729.patch, SOLR-1729.patch, SOLR-1729.patch, 
 SOLR-1729.patch, UnInvertedField.java





[jira] Commented: (SOLR-1729) Date Facet now override time parameter

2010-12-11 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970544#action_12970544
 ] 

Peter Karich commented on SOLR-1729:


Yonik,

thanks for the update. I refreshed my sources (now trunk) to rev 1044745, but 
the patch does not apply cleanly* to SearchHandler.
Am I doing something stupid here?

Regards,
Peter.

*
pathxy/solr_branch_3x$ patch -p0 < SOLR-1729.patch 
patching file solr/src/test/test-files/solr/conf/schema12.xml
patching file 
solr/src/test/org/apache/solr/search/function/TestFunctionQuery.java
Hunk #1 succeeded at 301 (offset -17 lines).
patching file 
solr/src/test/org/apache/solr/handler/component/SpellCheckComponentTest.java
patching file 
solr/src/test/org/apache/solr/handler/component/TermVectorComponentTest.java
patching file solr/src/java/org/apache/solr/core/QuerySenderListener.java
patching file solr/src/java/org/apache/solr/request/SimpleFacets.java
Hunk #1 succeeded at 64 (offset -9 lines).
Hunk #2 succeeded at 620 (offset -200 lines).
Hunk #3 succeeded at 630 (offset -200 lines).
Hunk #4 succeeded at 645 (offset -200 lines).
Hunk #5 succeeded at 803 (offset -200 lines).
patching file solr/src/java/org/apache/solr/handler/component/SearchHandler.java
Hunk #1 FAILED at 192.
Hunk #2 succeeded at 255 (offset -36 lines).
1 out of 2 hunks FAILED -- saving rejects to file 
solr/src/java/org/apache/solr/handler/component/SearchHandler.java.rej
patching file 
solr/src/java/org/apache/solr/handler/component/ResponseBuilder.java
Hunk #2 succeeded at 67 (offset -1 lines).
patching file solr/src/java/org/apache/solr/spelling/SpellCheckCollator.java
patching file solr/src/java/org/apache/solr/util/TestHarness.java
Hunk #2 succeeded at 320 (offset -9 lines).
Hunk #3 succeeded at 335 (offset -9 lines).
patching file solr/src/java/org/apache/solr/util/DateMathParser.java
patching file solr/src/webapp/src/org/apache/solr/servlet/SolrServlet.java
patching file 
solr/src/webapp/src/org/apache/solr/servlet/SolrDispatchFilter.java
Hunk #1 succeeded at 241 (offset 4 lines).
Hunk #2 succeeded at 255 (offset 4 lines).
Hunk #3 succeeded at 283 (offset 4 lines).
patching file 
solr/src/webapp/src/org/apache/solr/servlet/DirectSolrConnection.java
Hunk #2 succeeded at 170 (offset -16 lines).
Hunk #3 succeeded at 185 with fuzz 1 (offset -16 lines).
patching file 
solr/src/webapp/src/org/apache/solr/client/solrj/embedded/EmbeddedSolrServer.java
Hunk #1 succeeded at 32 with fuzz 1 (offset -9 lines).
Hunk #2 succeeded at 138 (offset -11 lines).
Hunk #3 succeeded at 156 (offset -77 lines).


 Date Facet now override time parameter
 --

 Key: SOLR-1729
 URL: https://issues.apache.org/jira/browse/SOLR-1729
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
 Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
 Attachments: FacetParams.java, SimpleFacets.java, 
 solr-1.4.0-solr-1729.patch, SOLR-1729.patch, SOLR-1729.patch, 
 UnInvertedField.java



[jira] Commented: (SOLR-1729) Date Facet now override time parameter

2010-12-03 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966488#action_12966488
 ] 

Peter Karich commented on SOLR-1729:


*regarding: 1.4.1*
Hmm, today download.carrot2.org is down, so I had to delete contrib/clustering 
to do the build after applying the patch, which does not apply cleanly (strange, 
since it applied yesterday):

solr1.4.1$ patch -p0 < solr-1.4.0-solr-1729.patch 
patching file src/java/org/apache/solr/handler/component/FacetComponent.java
patching file src/java/org/apache/solr/handler/component/ResponseBuilder.java

solr1.4.1$ patch -p0 < solr-1.4.0-solr-1709.patch 
patching file src/java/org/apache/solr/handler/component/FacetComponent.java
Reversed (or previously applied) patch detected!  Assume -R? [n] y
patching file src/java/org/apache/solr/handler/component/ResponseBuilder.java
Reversed (or previously applied) patch detected!  Assume -R? [n] y
Hunk #3 succeeded at 251 (offset -1 lines).

Or is this OK? Because then all tests would pass ...



*regarding branch3x*
Neither patch applies cleanly. SOLR-1709 also fails without SOLR-1729:

solr_branch_3x/solr$ patch -p0 < solr-1.4.0-solr-1709.patch 
patching file src/java/org/apache/solr/handler/component/FacetComponent.java
Hunk #1 succeeded at 240 (offset 2 lines).
Hunk #2 succeeded at 267 with fuzz 2 (offset 7 lines).
Hunk #3 FAILED at 436.
1 out of 3 hunks FAILED -- saving rejects to file 
src/java/org/apache/solr/handler/component/FacetComponent.java.rej
patching file src/java/org/apache/solr/handler/component/ResponseBuilder.java
Reversed (or previously applied) patch detected!  Assume -R? [n] y
Hunk #2 FAILED at 61.
Hunk #3 FAILED at 252.
2 out of 3 hunks FAILED -- saving rejects to file 
src/java/org/apache/solr/handler/component/ResponseBuilder.java.rej



 Date Facet now override time parameter
 --

 Key: SOLR-1729
 URL: https://issues.apache.org/jira/browse/SOLR-1729
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
 Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
 Attachments: FacetParams.java, SimpleFacets.java, 
 solr-1.4.0-solr-1729.patch, UnInvertedField.java



[jira] Commented: (SOLR-1729) Date Facet now override time parameter

2010-12-03 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966520#action_12966520
 ] 

Peter Karich commented on SOLR-1729:


Hi Peter,

1.4.1 would be fine (I asked Jake from Solandra; before that I thought he was 
using trunk).

In my last comment I made a stupid mistake: the patches didn't apply cleanly 
to 1.4.1 because I accidentally overwrote solr-1729.patch with 
solr-1709 when copying from branch3x and ended up with two identical 1709 
patches :-/

So: for 1.4.1 the patches apply cleanly. But the question remains why the 
following tests are failing:

Test org.apache.solr.TestTrie FAILED

Test org.apache.solr.request.SimpleFacetsTest FAILED


 Date Facet now override time parameter
 --

 Key: SOLR-1729
 URL: https://issues.apache.org/jira/browse/SOLR-1729
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
 Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
 Attachments: FacetParams.java, SimpleFacets.java, 
 solr-1.4.0-solr-1729.patch, UnInvertedField.java





[jira] Commented: (SOLR-1729) Date Facet now override time parameter

2010-12-03 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966562#action_12966562
 ] 

Peter Karich commented on SOLR-1729:


Hi Peter,

sorry for the confusion :-/

I was speaking of 1.4.1: the two patches apply. 2 tests fail.

Regards,
Peter.

 Date Facet now override time parameter
 --

 Key: SOLR-1729
 URL: https://issues.apache.org/jira/browse/SOLR-1729
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
 Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
 Attachments: FacetParams.java, SimpleFacets.java, 
 solr-1.4.0-solr-1729.patch, UnInvertedField.java





[jira] Commented: (SOLR-1729) Date Facet now override time parameter

2010-12-02 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966181#action_12966181
 ] 

Peter Karich commented on SOLR-1729:


Peter Sturge,

in SOLR-1709 you said that you are working with branch3x, so I checked it out 
from here:
https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x

but this 1729 patch didn't apply cleanly*. 

When I tried the 1.4.1 release the patch applied, but the tests fail; see **

What could be wrong?

Regards,
Peter.



*
solr_branch_3x/solr$ patch -p0 < solr-1.4.0-solr-1729.patch 
patching file src/java/org/apache/solr/request/SimpleFacets.java
Hunk #1 succeeded at 245 (offset 28 lines).
Hunk #2 succeeded at 280 (offset 28 lines).
Hunk #3 FAILED at 582.
Hunk #4 FAILED at 652.
2 out of 4 hunks FAILED -- saving rejects to file 
src/java/org/apache/solr/request/SimpleFacets.java.rej
patching file src/java/org/apache/solr/request/UnInvertedField.java
Hunk #2 succeeded at 40 with fuzz 1 (offset 1 line).
Hunk #3 succeeded at 440 (offset 5 lines).
Hunk #4 succeeded at 557 (offset 5 lines).
patching file src/common/org/apache/solr/common/params/FacetParams.java
Hunk #1 FAILED at 175.
1 out of 1 hunk FAILED -- saving rejects to file 
src/common/org/apache/solr/common/params/FacetParams.java.rej




**
[junit] Running org.apache.solr.TestTrie
[junit]  xml response was: <?xml version="1.0" encoding="UTF-8"?>
[junit] <response>
[junit] <lst name="responseHeader"><int name="status">0</int><int 
name="QTime">157</int></lst><result name="response" numFound="15" 
start="0"><doc><float name="id">0.0</float><date 
name="tdate">2010-12-02T00:00:00Z</date><double 
name="tdouble">0.0</double><float name="tfloat">0.0</float><int 
name="tint">0</int><long name="tlong">2147483647</long></doc><doc><float 
name="id">1.0</float><date name="tdate">2010-12-03T00:00:00Z</date><double 
name="tdouble">2.33</double><float name="tfloat">31.11</float><int 
name="tint">1</int><long name="tlong">2147483648</long></doc><doc><float 
name="id">2.0</float><date name="tdate">2010-12-04T00:00:00Z</date><double 
name="tdouble">4.66</double><float name="tfloat">124.44</float><int 
name="tint">2</int><long name="tlong">2147483649</long></doc><doc><float 
name="id">3.0</float><date name="tdate">2010-12-05T00:00:00Z</date><double 
name="tdouble">6.99</double><float name="tfloat">279.99</float><int 
name="tint">3</int><long name="tlong">2147483650</long></doc><doc><float 
name="id">4.0</float><date name="tdate">2010-12-06T00:00:00Z</date><double 
name="tdouble">9.32</double><float name="tfloat">497.76</float><int 
name="tint">4</int><long name="tlong">2147483651</long></doc><doc><float 
name="id">5.0</float><date name="tdate">2010-12-07T00:00:00Z</date><double 
name="tdouble">11.65</double><float name="tfloat">777.75</float><int 
name="tint">5</int><long name="tlong">2147483652</long></doc><doc><float 
name="id">6.0</float><date name="tdate">2010-12-08T00:00:00Z</date><double 
name="tdouble">13.98</double><float name="tfloat">1119.96</float><int 
name="tint">6</int><long name="tlong">2147483653</long></doc><doc><float 
name="id">7.0</float><date name="tdate">2010-12-09T00:00:00Z</date><double 
name="tdouble">16.312</double><float 
name="tfloat">1524.39</float><int name="tint">7</int><long 
name="tlong">2147483654</long></doc><doc><float name="id">8.0</float><date 
name="tdate">2010-12-10T00:00:00Z</date><double 
name="tdouble">18.64</double><float name="tfloat">1991.04</float><int 
name="tint">8</int><long name="tlong">2147483655</long></doc><doc><float 
name="id">9.0</float><date name="tdate">2010-12-11T00:00:00Z</date><double 
name="tdouble">20.97</double><float name="tfloat">2519.9102</float><int 
name="tint">9</int><long name="tlong">2147483656</long></doc><doc><float 
name="id">10.0</float><date name="tdate">2010-12-02T00:00:00Z</date><double 
name="tdouble">0.0</double><float name="tfloat">0.0</float><int 
name="tint">0</int><long name="tlong">2147483647</long></doc><doc><float 
name="id">20.0</float><date name="tdate">2010-12-03T00:00:00Z</date><double 
name="tdouble">2.33</double><float name="tfloat">31.11</float><int 
name="tint">1</int><long name="tlong">2147483648</long></doc><doc><float 
name="id">30.0</float><date name="tdate">2010-12-04T00:00:00Z</date><double 
name="tdouble">4.66</double><float name="tfloat">124.44</float><int 
name="tint">2</int><long name="tlong">2147483649</long></doc><doc><float 
name="id">40.0</float><date name="tdate">2010-12-05T00:00:00Z</date><double 
name="tdouble">6.99</double><float name="tfloat">279.99</float><int 
name="tint">3</int><long name="tlong">2147483650</long></doc><doc><float 
name="id">50.0</float><date name="tdate">2010-12-06T00:00:00Z</date><double 
name="tdouble">9.32</double><float name="tfloat">497.76</float><int 
name="tint">4</int><long name="tlong">2147483651</long></doc></result><lst 
name="facet_counts"><lst name="facet_queries"/><lst name="facet_fields"><lst 
name="tint"><int name="0">2</int><int name="1">2</int><int name="2">2</int><int 
name="3">2</int><int name="4">2</int><int name="5">1</int><int 
name="6">1</int><int name="7">1</int><int name="8">1</int><int 
name="9">1</int></lst><lst name="tlong"><int name="2147483647">2</int><int 
name="2147483648">2</int><int name="2147483649">2</int><int 
name="2147483650">2</int><int name="2147483651">2</int><int 
name="2147483652">1</int><int name="2147483653">1</int><int 
name="2147483654">1</int><int name="2147483655">1</int><int 
name="2147483656">1</int></lst><lst name="tfloat"><int name="0.0">2</int><int 
name="31.11">2</int><int name="124.44">2</int><int name="279.99">2</int><int 
name="497.76">2</int><int name="777.75">1</int><int name="1119.96">1</int><int 

[jira] Commented: (SOLR-1709) Distributed Date Faceting

2010-12-01 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965841#action_12965841
 ] 

Peter Karich commented on SOLR-1709:


Hi Peter,

sorry for getting back so late.

I'm relatively sure now that I'll need that patch (Jake from Solandra was also 
asking when this patch will be ready :-))

So, do I need to apply SOLR-1729 and then this patch to the 3x branch, or can I 
apply this patch without SOLR-1729 (which is not necessary in my case)?

Regards,
Peter.

 Distributed Date Faceting
 -

 Key: SOLR-1709
 URL: https://issues.apache.org/jira/browse/SOLR-1709
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 1.4
Reporter: Peter Sturge
Priority: Minor
 Attachments: FacetComponent.java, FacetComponent.java, 
 ResponseBuilder.java, solr-1.4.0-solr-1709.patch


 This patch is for adding support for date facets when using distributed 
 searches.
 Date faceting across multiple machines exposes some time-based issues that 
 anyone interested in this behaviour should be aware of:
 Any time and/or time-zone differences are not accounted for in the patch 
 (i.e. merged date facets are at a time-of-day, not necessarily at a universal 
 'instant-in-time', unless all shards are time-synced to the exact same time).
 The implementation uses the first encountered shard's facet_dates as the 
 basis for subsequent shards' data to be merged in.
 This means that if subsequent shards' facet_dates are skewed in relation to 
 the first by more than 1 'gap', these 'earlier' or 'later' facets will not be 
 merged in.
 There are several reasons for this:
   * Performance: It's faster to check facet_date lists against a single map's 
 data, rather than against each other, particularly if there are many shards
   * If 'earlier' and/or 'later' facet_dates are added in, this will make the 
 time range larger than that which was requested
 (e.g. a request for one hour's worth of facets could bring back 2, 3 
 or more hours of data)
 This could be dealt with if timezone and skew information was added, and 
 the dates were normalized.
 One possibility for adding such support is to [optionally] add 'timezone' and 
 'now' parameters to the 'facet_dates' map. This would tell requesters what 
 time and TZ the remote server thinks it is, and so multiple shards' time data 
 can be normalized.
 The patch affects 2 files in the Solr core:
   org.apache.solr.handler.component.FacetComponent.java
   org.apache.solr.handler.component.ResponseBuilder.java
 The main changes are in FacetComponent - ResponseBuilder is just to hold the 
 completed SimpleOrderedMap until the finishStage.
 One possible enhancement is to perhaps make this an optional parameter, but 
 really, if facet.date parameters are specified, it is assumed they are 
 desired.
 Comments & suggestions welcome.
 As a favour to ask, if anyone could take my 2 source files and create a PATCH 
 file from it, it would be greatly appreciated, as I'm having a bit of trouble 
 with svn (don't shoot me, but my environment is a Redmond-based os company).
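
The merge rule described above (the first shard's facet_dates is the basis, and later shards only add to buckets that are already present) might look roughly like the following sketch. It is not the code from the attached FacetComponent.java: the class and method names are made up for illustration, and each shard's facet_dates is assumed to be a plain map from bucket start date to count.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DateFacetMergeSketch {

    /**
     * Merges per-shard date facet counts (hypothetical helper).
     * The first shard defines the buckets; counts from later shards are
     * added only to buckets that already exist, so skewed 'earlier' or
     * 'later' buckets from other shards are silently dropped.
     */
    static Map<String, Integer> merge(List<Map<String, Integer>> shardFacetDates) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        boolean first = true;
        for (Map<String, Integer> shard : shardFacetDates) {
            if (first) {
                merged.putAll(shard);   // basis for all subsequent shards
                first = false;
                continue;
            }
            for (Map.Entry<String, Integer> e : shard.entrySet()) {
                // Sum only into known buckets; unknown buckets are ignored.
                merged.computeIfPresent(e.getKey(), (k, v) -> v + e.getValue());
            }
        }
        return merged;
    }
}
```

Checking each shard against a single map keeps the merge linear in the number of facet entries, which is the performance argument given above.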

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-792) Pivot (ie: Decision Tree) Faceting Component

2010-11-08 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12929590#action_12929590
 ] 

Peter Karich commented on SOLR-792:
---

Hi Toke and all,

maybe I am a bit evil or stupid, but could someone enlighten me as to why this 
patch is necessary?

Why can't we use the existing mechanisms in Solr (facets!) and a bit of logic 
while indexing:

http://markmail.org/message/2aza6nnsiw3l4bbb#query:+page:1+mid:3j3ttojacpjoyfg5+state:results

This has no performance problems when using tons of categories. We are already 
using it with lots of categories. It works out of the box with nearly 
unlimited depth (either you use a DB, where the depth is unlimited, or the URL 
length is the limit).

The only drawback of this approach is that you won't be able to display two or 
more 'branches' at the same time. Only one current branch with the currently 
possible categories can be shown, which is no limitation in our case because 
the UI would be unusable if too many items were visible at the same time.

One could introduce a special update component for this feature which uses a 
category tree (in RAM) built from the json or xml definition. I could create 
such a component if someone is interested.

Regards,
Peter.

 Pivot (ie: Decision Tree) Faceting Component
 

 Key: SOLR-792
 URL: https://issues.apache.org/jira/browse/SOLR-792
 Project: Solr
  Issue Type: New Feature
Reporter: Erik Hatcher
Assignee: Yonik Seeley
Priority: Minor
 Attachments: SOLR-792-as-helper-class.patch, 
 SOLR-792-PivotFaceting.patch, SOLR-792-PivotFaceting.patch, 
 SOLR-792-PivotFaceting.patch, SOLR-792-PivotFaceting.patch, 
 SOLR-792-raw-type.patch, SOLR-792.patch, SOLR-792.patch, SOLR-792.patch, 
 SOLR-792.patch, SOLR-792.patch, SOLR-792.patch, SOLR-792.patch


 A component to do multi-level faceting.




[jira] Commented: (SOLR-1709) Distributed Date Faceting

2010-11-08 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12929760#action_12929760
 ] 

Peter Karich commented on SOLR-1709:


Hi Peter Sturge,

what are the limitations of this patch? Only that earlier + later isn't 
supported?

What issues remain before committing this into trunk?

 Distributed Date Faceting
 -

 Key: SOLR-1709
 URL: https://issues.apache.org/jira/browse/SOLR-1709
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 1.4
Reporter: Peter Sturge
Priority: Minor
 Attachments: FacetComponent.java, FacetComponent.java, 
 ResponseBuilder.java, solr-1.4.0-solr-1709.patch





[jira] Commented: (SOLR-2218) Performance of start= and rows= parameters are exponentially slow with large data sets

2010-11-05 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12928536#action_12928536
 ] 

Peter Karich commented on SOLR-2218:


Lance, would you mind explaining this in a bit more detail :-) ?

The idea is to grab all (or a lot of) documents from solr even if the dataset 
is very large, if I haven't misunderstood what Bill was requesting. This is 
very useful IMHO.

 Performance of start= and rows= parameters are exponentially slow with large 
 data sets
 --

 Key: SOLR-2218
 URL: https://issues.apache.org/jira/browse/SOLR-2218
 Project: Solr
  Issue Type: Improvement
  Components: Build
Affects Versions: 1.4.1
Reporter: Bill Bell

 With large data sets, > 10M rows.
 Setting start=<large number> and rows=<large number> is slow, and gets 
 slower the farther you get from start=0 with a complex query. Random also 
 makes this slower.
 Would like to somehow make this performance faster for looping through large 
 data sets. It would be nice if we could pass a pointer to the result set to 
 loop, or support very large rows=number.
 Something like:
 rows=1000
 start=0
 spointer=string_my_query_1
 Then within interval (like 5 mins) I can reference this loop:
 Something like:
 rows=1000
 start=1000
 spointer=string_my_query_1
 What do you think? Since the data is too great the cache is not helping.
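
To make the proposal concrete, the request sequence could be generated as in the sketch below. This is only an illustration: spointer is the parameter proposed in this issue, not an existing Solr parameter, and the class and method names are hypothetical.

```java
public class PointerPagingSketch {

    /**
     * Builds the request parameters for one page of the proposed loop.
     * 'pointer' names the result set to resume (the proposed spointer),
     * 'page' is zero-based, 'rows' is the page size.
     */
    static String pageParams(String pointer, int page, int rows) {
        int start = page * rows;
        return "rows=" + rows + "&start=" + start + "&spointer=" + pointer;
    }

    public static void main(String[] args) {
        // First two pages of the example from the issue description.
        System.out.println(pageParams("string_my_query_1", 0, 1000));
        System.out.println(pageParams("string_my_query_1", 1, 1000));
    }
}
```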




[jira] Commented: (SOLR-1311) pseudo-field-collapsing

2010-10-15 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12921324#action_12921324
 ] 

Peter Karich commented on SOLR-1311:


Hi Marc,

could this issue be closed, since field collapsing is now in trunk and more 
mature?

Why can't it be integrated as a plugin?

 pseudo-field-collapsing
 ---

 Key: SOLR-1311
 URL: https://issues.apache.org/jira/browse/SOLR-1311
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.4
Reporter: Marc Sturlese
 Fix For: Next

 Attachments: SOLR-1311-pseudo-field-collapsing.patch


 I am trying to develop a new way of doing field collapsing based on the 
 adjacent field collapsing algorithm. I have started developing it because I 
 am experiencing performance problems with the field collapsing patch with a 
 big index (8G).
 The algorithm does adjacent-pseudo-field collapsing. It does collapsing on the 
 first X documents. Instead of making the collapsed docs disappear, the 
 algorithm will send them to a given position of the relevance results list.
 The reason I just do collapsing in the first X documents is that if I have 
 for example 60 results and I am showing 10 results per page, I really 
 don't need to do collapsing in the page 3 or even not in the 3000. Doing 
 this I am noticing dramatically better performance. The problem is I couldn't 
 find a way to plug the algorithm in as a component and keep good performance. 
 I had to hack a few classes in SolrIndexSearcher.java
 This patch is just experimental and for testing purposes. In case someone 
 finds it interesting, it would be good to find a way to integrate it in a 
 better way than it is at the moment.
 Advice is more than welcome.
   
 Functionality:
 In solrconfig.xml we specify the pseudo-collapsing parameters:
  <str name="plus.considerMoreDocs">true</str>
  <str name="plus.considerHowMany">3000</str>
  <str name="plus.considerField">name</str>
 (at the moment there's no threshold and other parameters that exist in the 
 current collapse-field patch)
 plus.considerMoreDocs enables pseudo-collapsing
 plus.considerHowMany sets the number of resulting documents to which we want 
 to apply the algorithm
 plus.considerField is the field to do pseudo-collapsing on
 If the number of results is lower than plus.considerHowMany the algorithm 
 will be applied to all the results.
 Let's say there is a query with 60 results and we've set considerHowMany 
 to 3000 (and we already have the docs sorted by relevance). 
 What adjacent-pseudo-collapse does is, if the 2nd doc has to be collapsed it 
 will be sent to the pos 2999 of the relevance results array. If the 3rd has 
 to be collapsed too, it will go to the position 2998, and so on.
 The algorithm is not applied when a sortspec is set or plus.considerMoreDocs 
 is set to false. Neither is it applied when using MoreLikeThisRequestHandler.
 Example with a query of 9 results:
 Results sorted by relevance without pseudo-collapse-algorithm:
 doc1 - collapse_field_value 3
 doc2 - collapse_field_value 3
 doc3 - collapse_field_value 4
 doc4 - collapse_field_value 7
 doc5 - collapse_field_value 6
 doc6 - collapse_field_value 6
 doc7 - collapse_field_value 5
 doc8 - collapse_field_value 1
 doc9 - collapse_field_value 2
 Results pseudo-collapsed with plus.considerHowMany = 5
 doc1 - collapse_field_value 3
 doc3 - collapse_field_value 4
 doc4 - collapse_field_value 7
 doc5 - collapse_field_value 6
 doc2 - collapse_field_value 3*
 doc6 - collapse_field_value 6
 doc7 - collapse_field_value 5
 doc8 - collapse_field_value 1
 doc9 - collapse_field_value 2
 Results pseudo-collapsed with plus.considerHowMany = 9
 doc1 - collapse_field_value 3
 doc3 - collapse_field_value 4
 doc4 - collapse_field_value 7
 doc5 - collapse_field_value 6
 doc7 - collapse_field_value 5
 doc8 - collapse_field_value 1
 doc9 - collapse_field_value 2
 doc6 - collapse_field_value 6*
 doc2 - collapse_field_value 3*
 *pseudo-collapsed documents
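
The reordering in the example above can be reproduced with a small standalone sketch. This is not the attached patch: the class and method names are hypothetical, the collapse field is assumed to be an int, and a document is assumed to be collapsed when it has the same field value as the document directly before it in the relevance order.

```java
import java.util.ArrayList;
import java.util.List;

public class PseudoCollapseSketch {

    /**
     * Returns the zero-based document indices in their new order.
     * Only the first 'considerHowMany' documents form the window; collapsed
     * documents are moved to the end of that window, the first collapsed
     * one taking the last slot, the second the slot before it, and so on.
     */
    static List<Integer> pseudoCollapse(int[] fieldValues, int considerHowMany) {
        int window = Math.min(considerHowMany, fieldValues.length);
        List<Integer> kept = new ArrayList<>();
        List<Integer> collapsed = new ArrayList<>();
        for (int i = 0; i < window; i++) {
            // Adjacent collapsing: same value as the previous document.
            if (i > 0 && fieldValues[i] == fieldValues[i - 1]) {
                collapsed.add(i);
            } else {
                kept.add(i);
            }
        }
        List<Integer> order = new ArrayList<>(kept);
        // Append collapsed docs in reverse so the first one ends up last.
        for (int k = collapsed.size() - 1; k >= 0; k--) {
            order.add(collapsed.get(k));
        }
        // Documents outside the window keep their original positions.
        for (int i = window; i < fieldValues.length; i++) {
            order.add(i);
        }
        return order;
    }
}
```

With the nine field values from the example (3, 3, 4, 7, 6, 6, 5, 1, 2), index 0 is doc1, and a window of 5 or 9 gives the two orderings listed above.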




[jira] Commented: (SOLR-64) strict hierarchical facets

2010-10-15 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12921326#action_12921326
 ] 

Peter Karich commented on SOLR-64:
--

@SolrFan and @Mats:

you could try an alternative solution:
http://lucene.472066.n3.nabble.com/multi-level-faceting-tp1629650p1672083.html

 strict hierarchical facets
 --

 Key: SOLR-64
 URL: https://issues.apache.org/jira/browse/SOLR-64
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Yonik Seeley
 Fix For: Next

 Attachments: SOLR-64.patch, SOLR-64.patch, SOLR-64.patch, 
 SOLR-64.patch


 Strict Facet Hierarchies... each tag has at most one parent (a tree).




[jira] Commented: (SOLR-385) facet sorting with relevancy

2010-10-15 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12921473#action_12921473
 ] 

Peter Karich commented on SOLR-385:
---

Thinking a bit more about this issue: for the 'ungeneralized version' (sorting 
by the maximum of the score, or of any field?) we can use the group feature!

http://wiki.apache.org/solr/FieldCollapsing

The Solution - I think - would be the following request:

http://localhost:8983/solr/select/?q=hard&group=true&group.field=manu_exact&group.limit=1&debug=true&fl=*,score

the collapse groups are ordered by the maxScore I think + hope ;-) 

So it is the same as we want:

http://localhost:8983/solr/select/?q=hard&facet=true&facet.field=manu_exact&debug=true&fl=*,score&facet.stats.sort=max(score) desc

Now one remaining task could be to extend this feature with max, min and mean 
functions ...


here is the 'group' result:

{code}
<lst>
<str name="groupValue">Maxtor Corp.</str>
<result name="doclist" numFound="1" start="0" maxScore="0.70904505">
<doc>
<float name="score">0.70904505</float>
<arr name="cat">
<str>electronics</str>
<str>hard drive</str>
</arr>
<arr name="features">
<str>SATA 3.0Gb/s, NCQ</str>
<str>8.5ms seek</str>
<str>16MB cache</str>
</arr>
<str name="id">6H500F0</str>
<bool name="inStock">true</bool>
<str name="manu">Maxtor Corp.</str>
<date name="manufacturedate_dt">2006-02-13T15:26:37Z</date>
<str name="name">
Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300
</str>
<int name="popularity">6</int>
<float name="price">350.0</float>
<str name="store">45.17614,-93.87341</str>
</doc>
</result>
</lst>
<lst>
<str name="groupValue">Samsung Electronics Co. Ltd.</str>
<result name="doclist" numFound="1" start="0" maxScore="0.5908709">
<doc>
<float name="score">0.5908709</float>
<arr name="cat">
<str>electronics</str>
<str>hard drive</str>
</arr>
<arr name="features">
<str>7200RPM, 8MB cache, IDE Ultra ATA-133</str>
<str>
NoiseGuard, SilentSeek technology, Fluid Dynamic Bearing (FDB) motor
</str>
</arr>
<str name="id">SP2514N</str>
<bool name="inStock">true</bool>
<str name="manu">Samsung Electronics Co. Ltd.</str>
<date name="manufacturedate_dt">2006-02-13T15:26:37Z</date>
<str name="name">
Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133
</str>
<int name="popularity">6</int>
<float name="price">92.0</float>
<str name="store">45.17614,-93.87341</str>
</doc>
</result>
</lst>
{code}

this would be the faceting result:

{code}
<lst name="facet_fields">
<lst name="manu_exact">
<int name="Maxtor Corp." score="0.70904505">1</int>
<int name="Samsung Electronics Co. Ltd." score="0.5908709">1</int>
...
{code}

 facet sorting with relevancy
 

 Key: SOLR-385
 URL: https://issues.apache.org/jira/browse/SOLR-385
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Dmitry Degtyarev
Priority: Minor

 Sometimes facet sort based only on the count of matches is not relevant, I 
 need to sort not only by the count of matches, but also on the scores of 
 matches.
 In the most simple way it must sort categories by the sum of item scores that 
 matches query and the category. In the best way there should be some 
 coefficient to multiply Scores or some function.
 Is it possible to implement such a behavior for facet sort?




[jira] Commented: (SOLR-2059) Allow customizing how WordDelimiterFilter tokenizes text.

2010-08-25 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902588#action_12902588
 ] 

Peter Karich commented on SOLR-2059:


Robert,

thanks for this work! I have a different application for this patch: in a 
twitter search # and @ shouldn't be removed. Instead I will handle them like 
ALPHA, I think.

Would you mind updating the patch for the latest version of the trunk? I get a 
problem with WordDelimiterIterator at line 254 when using 
https://svn.apache.org/repos/asf/lucene/dev/trunk/solr and a 'file is missing' 
problem (line 37) for http://svn.apache.org/repos/asf/solr

 Allow customizing how WordDelimiterFilter tokenizes text.
 -

 Key: SOLR-2059
 URL: https://issues.apache.org/jira/browse/SOLR-2059
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Reporter: Robert Muir
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: SOLR-2059.patch


 By default, WordDelimiterFilter assigns 'types' to each character (computed 
 from Unicode Properties).
 Based on these types and the options provided, it splits and concatenates 
 text.
 In some circumstances, you might need to tweak the behavior of how this works.
 It seems the filter already had this in mind, since you can pass in a custom 
 byte[] type table.
 But its not exposed in the factory.
 I think you should be able to customize the defaults with a configuration 
 file:
 {noformat}
 # A customized type mapping for WordDelimiterFilterFactory
 # the allowable types are: LOWER, UPPER, ALPHA, DIGIT, ALPHANUM, SUBWORD_DELIM
 # 
 # the default for any character without a mapping is always computed from 
 # Unicode character properties
 # Map the $, %, '.', and ',' characters to DIGIT 
 # This might be useful for financial data.
 $ = DIGIT
 % = DIGIT
 . = DIGIT
 \u002C = DIGIT
 {noformat}
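
A minimal sketch of how such a file could be read, under assumptions: the class and method names are hypothetical, this is not the parser from SOLR-2059.patch, and only single characters or \uXXXX escapes are accepted as keys. Lines starting with '#' are treated as comments, so mapping '#' itself would need some escape convention that is left out here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TypeMappingSketch {

    // Type names taken from the comment header of the example file.
    private static final List<String> ALLOWED = Arrays.asList(
            "LOWER", "UPPER", "ALPHA", "DIGIT", "ALPHANUM", "SUBWORD_DELIM");

    /** Parses lines of the form "char = TYPE" into a character-to-type map. */
    static Map<Character, String> parse(List<String> lines) {
        Map<Character, String> mapping = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank line or comment
            }
            int eq = line.lastIndexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected 'char = TYPE': " + raw);
            }
            String key = line.substring(0, eq).trim();
            String type = line.substring(eq + 1).trim();
            if (!ALLOWED.contains(type)) {
                throw new IllegalArgumentException("Unknown type: " + type);
            }
            mapping.put(parseChar(key), type);
        }
        return mapping;
    }

    /** Accepts a single character or a six-character hex escape (backslash, u, four hex digits). */
    static char parseChar(String key) {
        if (key.length() == 1) {
            return key.charAt(0);
        }
        if (key.length() == 6 && key.startsWith("\\u")) {
            return (char) Integer.parseInt(key.substring(2), 16);
        }
        throw new IllegalArgumentException("Unsupported key: " + key);
    }
}
```

Characters without an entry would simply not appear in the returned map, which matches the stated rule that their type is computed from Unicode properties.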




[jira] Commented: (SOLR-2059) Allow customizing how WordDelimiterFilter tokenizes text.

2010-08-25 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902600#action_12902600
 ] 

Peter Karich commented on SOLR-2059:


Ups, my mistake ... this helped!

> What do you think of the file format, is it ok for describing these 
> categories? 

I think it is ok. I even had a simpler patch before stumbling over yours: 
handleAsChar=@#. Yours is more powerful IMHO:

@ = ALPHA
# = ALPHA



 Allow customizing how WordDelimiterFilter tokenizes text.
 -

 Key: SOLR-2059
 URL: https://issues.apache.org/jira/browse/SOLR-2059
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Reporter: Robert Muir
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: SOLR-2059.patch





[jira] Issue Comment Edited: (SOLR-2059) Allow customizing how WordDelimiterFilter tokenizes text.

2010-08-25 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902600#action_12902600
 ] 

Peter Karich edited comment on SOLR-2059 at 8/25/10 3:46 PM:
-

Ups, my mistake ... this helped!

> What do you think of the file format, is it ok for describing these 
> categories? 

I think it is ok. I even had a simpler patch before stumbling over yours: 
handleAsChar=@#. Yours is more powerful IMHO:
{code} 
@ = ALPHA
# = ALPHA
{code} 


  was (Author: peathal):
Ups, my mistake ... this helped!

 What do you think of the file format, is it ok for describing these 
 categories? 

I think it is ok. I even had a more simpler patch before stumbling over yours: 
handleAsChar=@# which is now more powerful IMHO:

@ = ALPHA
# = ALPHA


  
 Allow customizing how WordDelimiterFilter tokenizes text.
 -

 Key: SOLR-2059
 URL: https://issues.apache.org/jira/browse/SOLR-2059
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Reporter: Robert Muir
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: SOLR-2059.patch





[jira] Created: (SOLR-2005) NullPointerException for more like this request handler via SolrJ if the document does not exist

2010-07-19 Thread Peter Karich (JIRA)
NullPointerException for more like this request handler via SolrJ if the 
document does not exist


 Key: SOLR-2005
 URL: https://issues.apache.org/jira/browse/SOLR-2005
 Project: Solr
  Issue Type: Bug
  Components: clients - java, MoreLikeThis
Affects Versions: 1.4
 Environment: jdk1.6
Reporter: Peter Karich


If I query solr with the following (via SolrJ):

q=myUniqueKey%3AsomeValueWhichDoesNotExist&qt=%2Fmlt&mlt.fl=myMLTField&mlt.minwl=2&mlt.mindf=1&mlt.match.include=false&facet=true&facet.sort=count&facet.mincount=1&facet.limit=10&facet.field=differentFacetField&start=0&rows=10

I get:

org.apache.solr.client.solrj.SolrServerException: Error executing query
at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
Caused by: java.lang.NullPointerException
at org.apache.solr.client.solrj.response.QueryResponse.extractFacetInfo(QueryResponse.java:180)
at org.apache.solr.client.solrj.response.QueryResponse.setResponse(QueryResponse.java:103)
at org.apache.solr.client.solrj.response.QueryResponse.<init>(QueryResponse.java:80)
at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)

The xml response of the url is empty, and so the info variable at the line

NamedList<Integer> fq = (NamedList<Integer>) info.get( "facet_queries" );

(in QueryResponse) is null. Maybe all variables in QueryResponse.setResponse 
should be checked against null? Something like

val = res.getVal( i );
if(val == null) continue; 

?
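
The kind of guard meant here could be sketched as follows. This is not the SolrJ code: a plain Map stands in for NamedList so the example is self-contained, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NullSafeFacetSketch {

    /**
     * Returns the facet_queries counts from a parsed response, or an empty
     * map when the facet section is missing (as with an empty MLT response).
     */
    @SuppressWarnings("unchecked")
    static Map<String, Integer> facetQueries(Map<String, Object> response) {
        Object info = response.get("facet_counts");
        if (!(info instanceof Map)) {
            return new LinkedHashMap<>(); // no facet info at all: nothing to extract
        }
        Object fq = ((Map<String, Object>) info).get("facet_queries");
        if (!(fq instanceof Map)) {
            return new LinkedHashMap<>(); // section present but no facet_queries
        }
        return (Map<String, Integer>) fq;
    }
}
```

Returning an empty map instead of null keeps callers from needing their own null checks.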




[jira] Updated: (SOLR-2005) NullPointerException for more like this request handler via SolrJ if the document does not exist

2010-07-19 Thread Peter Karich (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Karich updated SOLR-2005:
---

Priority: Minor  (was: Major)

 NullPointerException for more like this request handler via SolrJ if the 
 document does not exist
 

 Key: SOLR-2005
 URL: https://issues.apache.org/jira/browse/SOLR-2005
 Project: Solr
  Issue Type: Bug
  Components: clients - java, MoreLikeThis
Affects Versions: 1.4
 Environment: jdk1.6
Reporter: Peter Karich
Priority: Minor
   Original Estimate: 0.33h
  Remaining Estimate: 0.33h




[jira] Commented: (SOLR-787) SolrJ POM refers to stax parser

2010-06-13 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12878385#action_12878385
 ] 

Peter Karich commented on SOLR-787:
---

Is this really correctly fixed? Inspecting my deps with NetBeans' maven dep 
viewer I don't understand why Solr uses woodstox and SolrJ uses the different 
artifact (but same jar) org.codehaus.woodstox

And according to 

http://jarvana.com/jarvana/inspect-pom/org/apache/solr/solr-core/1.4.0/solr-core-1.4.0.pom

http://jarvana.com/jarvana/inspect-pom/org/apache/solr/solr-solrj/1.4.0/solr-solrj-1.4.0.pom

NetBeans is correct.

The problem with this is that you will have two identical jars in the 
classpath, and that the solrj dep still forces you to use stax-api

 SolrJ POM refers to stax parser
 ---

 Key: SOLR-787
 URL: https://issues.apache.org/jira/browse/SOLR-787
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-787.patch


 Solr core moved to using woodstox instead of stax but SolrJ POM still has a 
 dependency to stax. We should replace the dependency to stax with woodstox 
 jar in SolrJ's POM.
 This is not a huge problem as we are not distributing stax anymore but is 
 needed for consistency.




[jira] Created: (SOLR-1950) SolrJ POM still refers to stax parser

2010-06-13 Thread Peter Karich (JIRA)
SolrJ POM still refers to stax parser
-

 Key: SOLR-1950
 URL: https://issues.apache.org/jira/browse/SOLR-1950
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 1.4
Reporter: Peter Karich
Priority: Minor


See the issue at https://issues.apache.org/jira/browse/SOLR-787 which seems to 
be incorrectly fixed. (I cannot reopen that issue, so I am creating this one)

Using the following deps:
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>1.4.0</version>
</dependency>
<dependency>
  <artifactId>solr-core</artifactId>
  <groupId>org.apache.solr</groupId>
  <version>1.4.0</version>
</dependency>

will lead to duplicate jars. (Solr uses woodstox and SolrJ uses a different 
artifact, but the same jar: org.codehaus.woodstox.)
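
A quick way to confirm the duplicate-jar situation at runtime is to ask the
class loader how many classpath locations provide the same class. The following
is a minimal sketch, not part of Solr or SolrJ: the class name
DuplicateClassCheck and its helpers are made up for illustration, and
com.ctc.wstx.stax.WstxInputFactory is only used as an example of a class that
ships in the Woodstox jar.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Solr: lists where a class is found on the classpath.
class DuplicateClassCheck {

    // Converts a class name such as "a.b.C" to the resource path "a/b/C.class".
    static String toResourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns every classpath location that provides the given class.
    static List<URL> locations(String className) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(cl.getResources(toResourcePath(className)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Example class name only; pass another one as the first argument.
        String name = args.length > 0 ? args[0] : "com.ctc.wstx.stax.WstxInputFactory";
        List<URL> found = locations(name);
        System.out.println(name + " found in " + found.size() + " location(s)");
        for (URL url : found) {
            System.out.println("  " + url);
        }
    }
}
```

If this reports more than one location when run with the application's
classpath, the same class is being provided by more than one jar.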

But maybe the artifacts are only incorrectly deployed? Where can I find the 
original pom files?




[jira] Commented: (SOLR-1864) Master/Slave replication causes tomcat to be unresponsive on slave till replication is being done.

2010-06-01 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874055#action_12874055
 ] 

Peter Karich commented on SOLR-1864:


This might be a duplicate of: https://issues.apache.org/jira/browse/SOLR-1775

The reason might be (as Paul Noble noted) that the garbage collector is very 
busy because of autowarming after the index switch.
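
To check whether garbage collection really dominates during the warm-up, the
JVM's accumulated GC time can be sampled before and after an observation
window. This is a minimal sketch using the standard java.lang.management API;
the class name GcTimeProbe and the one-second window are assumptions, and in
practice the probe would have to run inside the Tomcat JVM (or read the same
MXBeans over JMX) to say anything about Solr.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical probe, not part of Solr: reports GC time spent in this JVM.
class GcTimeProbe {

    // Sums the accumulated collection time (in ms) over all collectors.
    // Collectors that report -1 (undefined) are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long before = totalGcMillis();
        Thread.sleep(1000); // observation window (assumed length)
        long after = totalGcMillis();
        System.out.println("GC time in window: " + (after - before) + " ms");
    }
}
```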

 Master/Slave replication causes tomcat to be unresponsive on slave till 
 replication is being done.
 --

 Key: SOLR-1864
 URL: https://issues.apache.org/jira/browse/SOLR-1864
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 1.5
 Environment: Centos 5.2, Tomcat5, java version 1.6.0
 OpenJDK  Runtime Environment (build 1.6.0-b09)
 OpenJDK 64-Bit Server VM (build 1.6.0-b09, mixed mode)
Reporter: Marcin

 Hi guys,
 I have found a strange behaviour on tomcat5, centos 5.2.
 While replication is being done ( million rows) tomcat5 seems to be 
 unresponsive till its finished.
 Please help
 cheers,
 /Marcin




[jira] Commented: (SOLR-236) Field collapsing

2010-03-05 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841752#action_12841752
 ] 

Peter Karich commented on SOLR-236:
---

 Shouldn't the float array in DocSetScoreCollector be changed to a Map?

Hmm, maybe I expressed myself unclearly: I have already changed all of this to 
a Map (a SortedMap) ... 
I started this change in DocSetScoreCollector and changed all the other 
occurrences of the float array (otherwise I would have had to copy the entire 
map).

  I think the compare method should NOT be called if no docs are in the 
  scores array ... ?

 I would expect that every docId has a score.

Yes, me too. So I expect there is a bug somewhere. But as I said, this breaks 
only one test (collapse with faceting before). It could even be a bug in the 
test case, though.

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, 
 SOLR-236_collapsing.patch, SOLR-236_collapsing.patch


 This patch includes a new feature called field collapsing.
 It is used to collapse a group of results with similar values for a given 
 field to a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also Duplicate detection.
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)
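
As an illustration of how the three parameters listed in the description could
be combined into one request, here is a sketch that builds a query string. The
class CollapseQuery and the field name "site" are hypothetical; only the
parameter names collapse.field, collapse.type and collapse.max come from the
description.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the patch: assembles a collapse query string.
class CollapseQuery {

    // Builds "q=...&collapse.field=...&collapse.type=...&collapse.max=...".
    static String build(String q, String field, String type, int max) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);
        params.put("collapse.field", field);
        params.put("collapse.type", type);   // "normal" or "adjacent"
        params.put("collapse.max", String.valueOf(max));

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    // URL-encodes a single key or value as UTF-8.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(build("*:*", "site", "normal", 1));
    }
}
```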




[jira] Updated: (SOLR-236) Field collapsing

2010-03-05 Thread Peter Karich (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Karich updated SOLR-236:
--

Attachment: NonAdjacentDocumentCollapserTest.java
NonAdjacentDocumentCollapser.java
DocSetScoreCollector.java

It seems to me that the provided changes are necessary to make the OutOfMemory 
exception go away. Please apply the files with caution, because I made the 
changes against an old patch (from Nov 2009).




[jira] Issue Comment Edited: (SOLR-236) Field collapsing

2010-03-05 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841756#action_12841756
 ] 

Peter Karich edited comment on SOLR-236 at 3/5/10 8:53 AM:
---

It seems to me that the provided changes are necessary to make the OutOfMemory 
exception go away (see the 3 attached files). Please apply the files with 
caution, because I made the changes against an old patch (from Nov 2009).

  was (Author: peathal):
It seems to me that the provides changes are necessary to make the 
OutOfMemory exception gone. Please apply the files with caution, because I made 
the changes from an old patch (from Nov 2009)
  



[jira] Commented: (SOLR-236) Field collapsing

2010-03-04 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841147#action_12841147
 ] 

Peter Karich commented on SOLR-236:
---

Regarding the OutOfMemory problem: we are now testing the suggested change in 
production.

I replaced the float array with a TreeMap<Integer, Float>. The change was 
nearly trivial. (I cannot easily provide a patch because we are using an older 
patch, although I could post the 3 changed files.)

The reason I used a TreeMap instead of a HashMap is that the advance method 
in the class NonAdjacentDocumentCollapser.PredefinedScorer needs the tailMap 
method:

{noformat} 
public int advance(int target) throws IOException {
    // now we need a treemap method:
    iter = scores.tailMap(target).entrySet().iterator();
    if (iter.hasNext())
        return target;
    else
        return NO_MORE_DOCS;
}
{noformat} 
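
For comparison, here is a small self-contained sketch of the same idea using
TreeMap.ceilingKey, which returns the smallest key greater than or equal to the
target. Unlike the snippet above, it returns that key rather than target
itself, which matches the documented contract of Lucene's
DocIdSetIterator.advance (first document at or after target); whether the patch
needs this behaviour is an assumption. The class ScoreMapAdvance and its local
NO_MORE_DOCS constant are stand-ins, not code from the patch.

```java
import java.util.TreeMap;

// Hypothetical stand-in for the scorer logic, not code from the patch.
class ScoreMapAdvance {

    // Local stand-in for the "no more documents" sentinel.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Returns the smallest doc id >= target that has a score, or NO_MORE_DOCS.
    static int advance(TreeMap<Integer, Float> scores, int target) {
        Integer doc = scores.ceilingKey(target);
        return doc == null ? NO_MORE_DOCS : doc;
    }

    public static void main(String[] args) {
        TreeMap<Integer, Float> scores = new TreeMap<>();
        scores.put(3, 1.0f);
        scores.put(7, 2.0f);
        System.out.println(advance(scores, 4));
    }
}
```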

Then, I think, I discovered a bug or inconsistent behaviour: if I run the test 
FieldCollapsingIntegrationTest.testNonAdjacentCollapse_withFacetingBefore, 
the scores array is created as new float[maxDocs] in the old version. 
But the array is never filled with any values, so with the map Float value1 = 
values.get(doc1); returns null in the method 
NonAdjacentDocumentCollapser.FloatValueFieldComparator.compare (the size of 
the TreeMap is 0!). I work around this via 

{noformat} 

if (value1 == null)
    value1 = 0f;
if (value2 == null)
    value2 = 0f;

{noformat} 
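
The same workaround can be written more compactly on Java 8 or later with
Map.getOrDefault. This is a sketch under the assumptions that a missing score
should count as 0f, that the map never stores null values, and that scores are
compared in ascending order; MissingScoreCompare is a hypothetical name, not a
class from the patch.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical comparator helper, not code from the patch.
class MissingScoreCompare {

    // Compares two doc ids by score; a doc without an entry counts as 0f.
    static int compare(Map<Integer, Float> scores, int doc1, int doc2) {
        float value1 = scores.getOrDefault(doc1, 0f);
        float value2 = scores.getOrDefault(doc2, 0f);
        return Float.compare(value1, value2);
    }

    public static void main(String[] args) {
        Map<Integer, Float> scores = new TreeMap<>();
        scores.put(1, 2.0f);
        // Doc 2 has no score, so it is treated as 0f and sorts below doc 1.
        System.out.println(compare(scores, 1, 2) > 0);
    }
}
```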

I think the compare method should NOT be called if no docs are in the scores 
array ... ?




[jira] Issue Comment Edited: (SOLR-236) Field collapsing

2010-03-04 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841147#action_12841147
 ] 

Peter Karich edited comment on SOLR-236 at 3/4/10 9:48 AM:
---

regarding the OutOfMemory problem: we are now testing the suggested change in 
production.

I replaced the float array with a TreeMap<Integer, Float>. The change was 
nearly trivial. (I cannot easily provide a patch because we are using an older 
patch, although I could post the 3 changed files.)

The reason I used a TreeMap instead of a HashMap is that the advance method 
in the class NonAdjacentDocumentCollapser.PredefinedScorer needs the tailMap 
method:

{noformat}public int advance(int target) throws IOException {
    // now we need a treemap method:
    iter = scores.tailMap(target).entrySet().iterator();
    if (iter.hasNext())
        return target;
    else
        return NO_MORE_DOCS;
}
{noformat} 

Then, I think, I discovered a bug or inconsistent behaviour: if I run the test 
FieldCollapsingIntegrationTest.testNonAdjacentCollapse_withFacetingBefore, 
the scores array is created as new float[maxDocs] in the old version. 
But the array is never filled with any values, so with the map Float value1 = 
values.get(doc1); returns null in the method 
NonAdjacentDocumentCollapser.FloatValueFieldComparator.compare (the size of 
the TreeMap is 0!). I work around this via 

{noformat} 
if (value1 == null)
    value1 = 0f;
if (value2 == null)
    value2 = 0f;
{noformat} 

I think the compare method should NOT be called if no docs are in the scores 
array ... ?


[jira] Issue Comment Edited: (SOLR-236) Field collapsing

2010-03-04 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841147#action_12841147
 ] 

Peter Karich edited comment on SOLR-236 at 3/4/10 9:46 AM:
---

regarding the OutOfMemory problem: we are now testing the suggested change in 
production.

I replaced the float array with a TreeMap<Integer, Float>. The change was 
nearly trivial. (I cannot easily provide a patch because we are using an older 
patch, although I could post the 3 changed files.)

The reason I used a TreeMap instead of a HashMap is that the advance method 
in the class NonAdjacentDocumentCollapser.PredefinedScorer needs the tailMap 
method:

{noformat}public int advance(int target) throws IOException {
    // now we need a treemap method:
    iter = scores.tailMap(target).entrySet().iterator();
    if (iter.hasNext())
        return target;
    else
        return NO_MORE_DOCS;
}
{noformat} 

Then, I think, I discovered a bug or inconsistent behaviour: if I run the test 
FieldCollapsingIntegrationTest.testNonAdjacentCollapse_withFacetingBefore, 
the scores array is created as new float[maxDocs] in the old version. 
But the array is never filled with any values, so with the map Float value1 = 
values.get(doc1); returns null in the method 
NonAdjacentDocumentCollapser.FloatValueFieldComparator.compare (the size of 
the TreeMap is 0!). I work around this via 

{noformat} 
if (value1 == null)
    value1 = 0f;
if (value2 == null)
    value2 = 0f;
{noformat} 

I think the compare method should NOT be called if no docs are in the scores 
array ... ?


[jira] Commented: (SOLR-1167) Support module xml config files using XInclude

2010-03-04 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841178#action_12841178
 ] 

Peter Karich commented on SOLR-1167:


@Shalin Shekhar Mangar: how can I use the proposed attribute feature for a 
master+slave configuration? Do you have a code snippet?

 Support module xml config files using XInclude
 --

 Key: SOLR-1167
 URL: https://issues.apache.org/jira/browse/SOLR-1167
 Project: Solr
  Issue Type: New Feature
Affects Versions: 1.4
Reporter: Bryan Talbot
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-1167.patch, SOLR-1167.patch, SOLR-1167.patch, 
 SOLR-1167.patch, SOLR-1167.patch


 Current configuration files (schema and solrconfig) are monolithic, which can 
 make maintenance and reuse more difficult than it needs to be.  The XML 
 standards include a feature to include content from external files.  This is 
 described at http://www.w3.org/TR/xinclude/
 This feature adds support for XInclude in XML configuration 
 files.
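
For readers unfamiliar with XInclude, the sketch below shows how a stock JAXP
parser resolves an xi:include element when XInclude awareness is switched on.
It only illustrates the mechanism: the class XIncludeDemo, the file names and
the config root element are invented for the example, and it says nothing about
how Solr itself loads its configuration.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class, not part of Solr.
class XIncludeDemo {

    // Parses an XML file with XInclude processing switched on.
    static Document parse(File file) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // XInclude needs namespace awareness
            factory.setXIncludeAware(true);
            return factory.newDocumentBuilder().parse(file);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Writes a parent file and an included fragment to a temp directory,
    // parses the parent and counts the <requestHandler> elements found.
    static int demo() {
        try {
            File dir = Files.createTempDirectory("xinclude").toFile();
            Files.write(new File(dir, "handlers.xml").toPath(),
                    "<requestHandler name=\"/select\"/>".getBytes(StandardCharsets.UTF_8));
            File main = new File(dir, "main.xml");
            Files.write(main.toPath(),
                    ("<config xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
                            + "<xi:include href=\"handlers.xml\"/>"
                            + "</config>").getBytes(StandardCharsets.UTF_8));
            return parse(main).getElementsByTagName("requestHandler").getLength();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("requestHandler elements after include: " + demo());
    }
}
```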




[jira] Commented: (SOLR-236) Field collapsing

2010-02-18 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835230#action_12835230
 ] 

Peter Karich commented on SOLR-236:
---

We are facing OutOfMemory problems too. We are using 
https://issues.apache.org/jira/secure/attachment/12425775/field-collapse-5.patch

 Are you using any other features besides plain collapsing? The field collapse 
 cache gets large very quickly,
 I suggest you turn it off (if you are using it). Also you can try to make 
 your filterCache smaller.

How can I turn off the collapse cache or make the filterCache smaller?
Are there other workarounds, e.g. using a special version of the patch?

I read that specifying collapse.maxdocs could help, but it didn't help in 
our case ... could collapse.type=adjacent help here?  
(https://issues.apache.org/jira/browse/SOLR-236?focusedCommentId=12495376&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12495376)

What do you think?

BTW: We really like this patch and would like to use it !! :-)




[jira] Commented: (SOLR-236) Field collapsing

2010-02-18 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835258#action_12835258
 ] 

Peter Karich commented on SOLR-236:
---

The latest patch (from 1 Feb 2010) compiles against the solr-2010-02-13 
nightly build but does not work. If I query 

http://searchdev05:15100/cs-bidcs/select?q=*:*&collapse.field=myfield

it fails with: 

{noformat} 

HTTP Status 500 - null java.lang.NullPointerException
 at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
 at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
 at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
 at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
 at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
 at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
 at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
 at ...
 {noformat} 


I only need the OutOfMemory problem solved ... :-(

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, 
 SOLR-236_collapsing.patch, SOLR-236_collapsing.patch


 This patch includes a new feature called Field collapsing.
 It is used to collapse a group of results with similar values for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also Duplicate detection.
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
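
 As a rough illustration of how the three parameters above could be combined
 into one request, here is a minimal Java sketch. Only the parameter names
 (collapse.field, collapse.type, collapse.max) come from the description; the
 class name, the helper methods, the base URL and the example values are
 hypothetical, and whether a given patch version accepts this exact
 combination is not confirmed here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds a select URL carrying the collapse parameters.
public class CollapseQueryExample {

    // URL-encode one query-string component as UTF-8.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // baseUrl is an assumed Solr select endpoint; it is not taken from the issue.
    static String buildQuery(String baseUrl, String q, String field, String type, int max) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("q", q);
        params.put("collapse.field", field);               // field used to group results
        params.put("collapse.type", type);                 // "normal" or "adjacent"
        params.put("collapse.max", Integer.toString(max)); // results allowed before collapsing

        StringBuilder url = new StringBuilder(baseUrl);
        char sep = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            url.append(sep).append(enc(e.getKey())).append('=').append(enc(e.getValue()));
            sep = '&'; // every parameter after the first is joined with '&'
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(
            buildQuery("http://localhost:8983/solr/select", "*:*", "myfield", "normal", 1));
    }
}
```

 Building the query string from a map keeps the '&' separators and the
 encoding of values such as *:* in one place, so a separator cannot be
 dropped by accident when the URL is written out by hand.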
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and spelling corrections are welcome ;-)




[jira] Issue Comment Edited: (SOLR-236) Field collapsing

2010-02-18 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835258#action_12835258
 ] 

Peter Karich edited comment on SOLR-236 at 2/18/10 4:06 PM:


The latest patch from 1st Feb 2010 compiles against solr-2010-02-13 from the 
nightly build but does not work. If I query 

http://server/cs-bidcs/select?q=*:*&collapse.field=myfield

it fails with: 

{noformat} 
HTTP Status 500 - null

java.lang.NullPointerException
    at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
    at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
    at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
    at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
    at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
    at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at ...
{noformat} 


I only need the OutOfMemory problem solved ... :-(

  was (Author: peathal):
The latest patch from 1st Feb 2010 compiles against solr-2010-02-13 from the 
nightly build but does not work. If I query 

http://searchdev05:15100/cs-bidcs/select?q=*:*&collapse.field=myfield

it fails with: 

{noformat} 
HTTP Status 500 - null

java.lang.NullPointerException
    at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
    at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
    at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
    at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
    at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
    at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at ...
{noformat} 


I only need the OutOfMemory problem solved ... :-(
  

[jira] Issue Comment Edited: (SOLR-236) Field collapsing

2010-02-18 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835258#action_12835258
 ] 

Peter Karich edited comment on SOLR-236 at 2/18/10 4:06 PM:


The latest patch from 1st Feb 2010 compiles against solr-2010-02-13 from the 
nightly build but does not work. If I query 

http://server/solr-app/select?q=*:*&collapse.field=myfield

it fails with: 

{noformat} 
HTTP Status 500 - null

java.lang.NullPointerException
    at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
    at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
    at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
    at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
    at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
    at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at ...
{noformat} 


I only need the OutOfMemory problem solved ... :-(

  was (Author: peathal):
The latest patch from 1st Feb 2010 compiles against solr-2010-02-13 from the 
nightly build but does not work. If I query 

http://server/cs-bidcs/select?q=*:*&collapse.field=myfield

it fails with: 

{noformat} 
HTTP Status 500 - null

java.lang.NullPointerException
    at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
    at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
    at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
    at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
    at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
    at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at ...
{noformat} 


I only need the OutOfMemory problem solved ... :-(
  

[jira] Issue Comment Edited: (SOLR-236) Field collapsing

2010-02-18 Thread Peter Karich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835258#action_12835258
 ] 

Peter Karich edited comment on SOLR-236 at 2/18/10 4:07 PM:


Trying the latest patch from 1st Feb 2010. It compiles against solr-2010-02-13 
from the nightly build dir, but does not work. If I query 

http://server/solr-app/select?q=*:*&collapse.field=myfield

it fails with: 

{noformat} 
HTTP Status 500 - null

java.lang.NullPointerException
    at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
    at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
    at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
    at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
    at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
    at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at ...
{noformat} 


I only need the OutOfMemory problem solved ... :-(

  was (Author: peathal):
The latest patch from 1st Feb 2010 compiles against solr-2010-02-13 from the 
nightly build but does not work. If I query 

http://server/solr-app/select?q=*:*&collapse.field=myfield

it fails with: 

{noformat} 
HTTP Status 500 - null

java.lang.NullPointerException
    at org.apache.solr.schema.FieldType.toExternal(FieldType.java:329)
    at org.apache.solr.schema.FieldType.storedToReadable(FieldType.java:348)
    at org.apache.solr.search.fieldcollapse.collector.AbstractCollapseCollector.getCollapseGroupResult(AbstractCollapseCollector.java:58)
    at org.apache.solr.search.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory$DocumentCountCollapseCollector.getResult(DocumentGroupCountCollapseCollectorFactory.java:84)
    at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:193)
    at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:192)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at ...
{noformat} 


I only need the OutOfMemory problem solved ... :-(
  
 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, 
 SOLR-236_collapsing.patch, SOLR-236_collapsing.patch


 This patch includes a new feature called Field collapsing.
 It is used to collapse a group of results with similar values for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also Duplicate detection.
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max