[jira] Resolved: (SOLR-1146) DIH ConcurrentModificationException in getStatus

2009-05-05 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1146.
-

Resolution: Fixed
  Assignee: Shalin Shekhar Mangar

Committed revision 771580.

Thanks Walter and Noble!

 DIH ConcurrentModificationException in getStatus
 

 Key: SOLR-1146
 URL: https://issues.apache.org/jira/browse/SOLR-1146
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Reporter: Noble Paul
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-1146.patch


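 The status messages map is not synchronized, so reading it in getStatus while
 an import is updating it can throw a ConcurrentModificationException.
 See mail thread: http://markmail.org/thread/2m5akintzvxc2utf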
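
 A minimal sketch of the failure mode and one common remedy, assuming the map is
 written by the import thread while a status request iterates over it. The class
 and method names are hypothetical and are not taken from SOLR-1146.patch, which
 may well fix the problem differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical status holder; not the actual DataImportHandler code. */
public class StatusMessages {
    private final Map<String, Object> messages = new LinkedHashMap<>();

    /** Writer side: called by the thread that runs the import. */
    public synchronized void put(String key, Object value) {
        messages.put(key, value);
    }

    /**
     * Reader side: copies the map under the same lock, so the caller
     * iterates over a private snapshot that no writer can modify.
     */
    public synchronized Map<String, Object> snapshot() {
        return new LinkedHashMap<>(messages);
    }
}
```

 Iterating over the snapshot stays safe even if put() is called in the middle of
 the loop, at the cost of one copy per status request.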

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-236) Field collapsing

2009-05-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12705959#action_12705959
 ] 

Domingo Gómez García commented on SOLR-236:
---

The results of collapse_counts are not what I expected. It loses many 
categories, only showing . I tried incrementing the collapse.max parameter:

max=1 results 

lst name=doc
int name=2008/LICOBLE-00023109/int
int name=2008/LICOBLE-35/int
int name=2009/LICOBLE-000364/int
int name=2009/LICOBLE-000951/int
/lst
lst name=count
int name=12740109/int
int name=127415/int
int name=132824/int
int1/int
/lst


max=2 results

lst name=doc
int name=2009/LICOBLE-8108/int
int name=2007/LICOBLE-14/int
/lst
lst name=count
int name=12740108/int
int name=127414/int
/lst


max=3 results

lst name=doc
int name=2008/LICOBLE-00020107/int
int name=2008/LICOBLE-000213/int
/lst
lst name=count
int name=12740107/int
int name=127413/int
/lst


max=4

lst name=doc
int name=2009/LICOBLE-00060106/int
/lst
lst name=count
int name=12740106/int
/lst

How is it possible to get fewer results each time? There are about 70 
categories; is there any way to obtain all of those counts? Am I missing some 
collapsing concept?
Thanks.

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, solr-236.patch, SOLR-236_collapsing.patch


 This patch includes a new feature called Field collapsing.
 It is used to collapse a group of results with a similar value for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also Duplicate detection.
 http://www.fastsearch.com/glossary.aspx?m=48amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)
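
 To make the collapse.max description concrete, here is a rough sketch of the
 "adjacent" idea: walk the results in order and keep at most max consecutive
 entries that share the same field value. This is only one reading of the
 description above; the class and method names are hypothetical and the code is
 not taken from any of the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

public class AdjacentCollapse {
    /**
     * Keeps at most `max` consecutive entries with the same (non-null) value;
     * further repeats inside the same run are dropped.
     */
    public static List<String> collapse(List<String> fieldValues, int max) {
        List<String> kept = new ArrayList<>();
        String previous = null;
        int run = 0; // length of the current run of equal values
        for (String v : fieldValues) {
            run = v.equals(previous) ? run + 1 : 1;
            if (run <= max) {
                kept.add(v);
            }
            previous = v;
        }
        return kept;
    }
}
```

 For the input a, a, a, b, a this keeps a, b, a with max=1 and a, a, b, a with
 max=2. A "normal" collapse would presumably group equal values regardless of
 position, which this sketch does not attempt.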




[jira] Issue Comment Edited: (SOLR-236) Field collapsing

2009-05-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12705959#action_12705959
 ] 

Domingo Gómez García edited comment on SOLR-236 at 5/5/09 1:53 AM:
---

The results of collapse_counts are not what I expected. It loses many 
categories, only showing a few. I tried incrementing the collapse.max 
parameter:

max=1 results 

lst name=doc
int name=2008/LICOBLE-00023109/int
int name=2008/LICOBLE-35/int
int name=2009/LICOBLE-000364/int
int name=2009/LICOBLE-000951/int
/lst
lst name=count
int name=12740109/int
int name=127415/int
int name=132824/int
int1/int
/lst


max=2 results

lst name=doc
int name=2009/LICOBLE-8108/int
int name=2007/LICOBLE-14/int
/lst
lst name=count
int name=12740108/int
int name=127414/int
/lst


max=3 results

lst name=doc
int name=2008/LICOBLE-00020107/int
int name=2008/LICOBLE-000213/int
/lst
lst name=count
int name=12740107/int
int name=127413/int
/lst


max=4

lst name=doc
int name=2009/LICOBLE-00060106/int
/lst
lst name=count
int name=12740106/int
/lst

How is it possible to get fewer results each time? There are about 70 
categories; is there any way to obtain all of those counts? Am I missing some 
collapsing concept?
Thanks.




[jira] Updated: (SOLR-236) Field collapsing

2009-05-05 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Domingo Gómez García updated SOLR-236:
--

Comment: was deleted





[jira] Reopened: (SOLR-1076) JdbcDataSource should resolve variables in jdbc url, username and password

2009-05-05 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reopened SOLR-1076:
-


JNDI name is not resolved

 JdbcDataSource should resolve variables in jdbc url, username and password
 --

 Key: SOLR-1076
 URL: https://issues.apache.org/jira/browse/SOLR-1076
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 1.4

 Attachments: SOLR-1076.patch, SOLR-1076.patch, SOLR-1076.patch


 It is not possible to use request parameters as variables in JdbcDataSource.
 Related discussion on solr-user at
 http://www.lucidimagination.com/search/document/835dd5d14518c260/dih_read_datasource_param_values_from_property_file_or_configure_jndi_datasource




[jira] Updated: (SOLR-1076) JdbcDataSource should resolve variables in jdbc url, username and password

2009-05-05 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1076:


Attachment: SOLR-1076.patch

Resolve all variables before extracting jndi name




[jira] Resolved: (SOLR-1076) JdbcDataSource should resolve variables in jdbc url, username and password

2009-05-05 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1076.
-

Resolution: Fixed

Committed revision 771656.




TermsComponent sort by frequency

2009-05-05 Thread Matt Weber

Hello,

I am interested in adding sorting support to TermsComponent.  I would  
most likely use the same algorithm as facet sorting, using CountPairs  
in a BoundedTreeSet.  Does anyone have a problem with this?  Is there  
a different algorithm I should be using?
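
For illustration, the bounded-set idea could look roughly like the sketch
below. It keeps at most `limit` entries in a sorted set and evicts the lowest
count whenever the set grows past the limit. TermCount and topTerms are
hypothetical names; the sketch is self-contained and does not use Solr's actual
CountPair or BoundedTreeSet classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class TopTermsByCount {
    /** A term and its frequency; orders by count descending, then by term. */
    public static final class TermCount implements Comparable<TermCount> {
        final String term;
        final int count;

        TermCount(String term, int count) {
            this.term = term;
            this.count = count;
        }

        @Override
        public int compareTo(TermCount other) {
            if (count != other.count) {
                return Integer.compare(other.count, count); // higher counts first
            }
            return term.compareTo(other.term); // tie-break keeps distinct terms distinct
        }
    }

    /** Returns the `limit` most frequent terms, never holding more than limit + 1 entries. */
    public static List<String> topTerms(Map<String, Integer> freqs, int limit) {
        TreeSet<TermCount> best = new TreeSet<>();
        for (Map.Entry<String, Integer> e : freqs.entrySet()) {
            best.add(new TermCount(e.getKey(), e.getValue()));
            if (best.size() > limit) {
                best.pollLast(); // drop the entry with the lowest count
            }
        }
        List<String> out = new ArrayList<>();
        for (TermCount tc : best) {
            out.add(tc.term);
        }
        return out;
    }
}
```

The memory used is proportional to the limit rather than to the number of
terms, which is the main attraction of this approach for large term
dictionaries.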


Thanks,

Matt Weber


[jira] Updated: (SOLR-1144) replication hang

2009-05-05 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-1144:
---

Fix Version/s: 1.4

 replication hang
 

 Key: SOLR-1144
 URL: https://issues.apache.org/jira/browse/SOLR-1144
 Project: Solr
  Issue Type: Bug
Reporter: Yonik Seeley
 Fix For: 1.4


 It seems that replication can sometimes hang.
 http://www.lucidimagination.com/search/document/403305a3fda18599




[jira] Updated: (SOLR-1145) Patch to set IndexWriter.defaultInfoStream from solr.xml

2009-05-05 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-1145:
---

Fix Version/s: 1.4

This would be very useful to get in for 1.4 to help with indexing issues people 
may hit.

 Patch to set IndexWriter.defaultInfoStream from solr.xml
 

 Key: SOLR-1145
 URL: https://issues.apache.org/jira/browse/SOLR-1145
 Project: Solr
  Issue Type: Improvement
Reporter: Chris Harris
 Fix For: 1.4

 Attachments: SOLR-1145.patch


 Lucene IndexWriters use an infoStream to log detailed info about indexing 
 operations for debugging purposes. This patch is an extremely simple way to 
 allow logging this info to a file from within Solr: After applying the patch, 
 set the new defaultInfoStreamFilePath attribute of the solr element in 
 solr.xml to the path of the file where you'd like to save the logging 
 information.
 Note that, in a multi-core setup, all cores will end up logging to the same 
 infoStream log file. This may not be desired. (But it does justify putting 
 the setting in solr.xml rather than solrconfig.xml.)




[jira] Commented: (SOLR-1144) replication hang

2009-05-05 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12706199#action_12706199
 ] 

Yonik Seeley commented on SOLR-1144:


Hmmm, I had trouble finding SOLR-1096 before.
But it looks like it was used mainly for adding a timeout.  There's still an 
underlying bug somewhere, right?




[jira] Updated: (SOLR-1091) phps (serialized PHP) writer produces invalid output

2009-05-05 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-1091:
---

Fix Version/s: 1.4

 phps (serialized PHP) writer produces invalid output
 --

 Key: SOLR-1091
 URL: https://issues.apache.org/jira/browse/SOLR-1091
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.3
 Environment: Sun JRE 1.6.0 on Centos 5
Reporter: frank farmer
Priority: Minor
 Fix For: 1.4


 The serialized PHP output writer can output invalid string lengths for 
 certain (unusual) input values.  Specifically, I had a document containing 
 the following 6-byte character sequence: \xED\xAF\x80\xED\xB1\xB8
 I was able to create a document in the index containing this value without 
 issue; however, when fetching the document back out using the serialized PHP 
 writer, it returns a string like the following:
 s:4:"􀁸";
 Note that the string length specified is 4, while the string is actually 6 
 bytes long.
 When using PHP's native serialize() function, it correctly sets the length to 
 6:
 # php -r 'var_dump(serialize("\xED\xAF\x80\xED\xB1\xB8"));'
 string(13) "s:6:"􀁸";"
 The wt=php writer, which produces output to be parsed with eval(), doesn't 
 have any trouble with this string.




[jira] Updated: (SOLR-1143) Return partial results when a connection to a shard is refused

2009-05-05 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-1143:
---

Fix Version/s: 1.4

Seems like this is something we should consider for 1.4

 Return partial results when a connection to a shard is refused
 --

 Key: SOLR-1143
 URL: https://issues.apache.org/jira/browse/SOLR-1143
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Nicolas Dessaigne
 Fix For: 1.4

 Attachments: SOLR-1143.patch


 If any shard is down in a distributed search, a ConnectException is thrown.
 Here's a little patch that changes this behaviour: if we can't connect to a 
 shard (ConnectException), we get partial results from the active shards. As 
 with the TimeOut parameter (https://issues.apache.org/jira/browse/SOLR-502), we 
 set the partialResults parameter to true.
 This patch also addresses a problem raised on the mailing list about a year 
 ago 
 (http://www.nabble.com/partialResults,-distributed-search---SOLR-502-td19002610.html)
 We have a use case that needs this behaviour and we would like to know your 
 thoughts about it. Should it be the default behaviour for 
 distributed search?
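
 The behaviour described above can be sketched as follows. This is a simplified
 illustration only: Shard, Response and search are hypothetical names, the
 shards are queried sequentially, and none of this is taken from
 SOLR-1143.patch.

```java
import java.net.ConnectException;
import java.util.ArrayList;
import java.util.List;

public class PartialResultsSketch {
    /** Hypothetical stand-in for a request to one shard. */
    public interface Shard {
        List<String> query(String q) throws ConnectException;
    }

    /** Hypothetical merged response with a partialResults flag. */
    public static final class Response {
        public final List<String> docs = new ArrayList<>();
        public boolean partialResults = false;
    }

    /** Merges results from every reachable shard and flags the response if any was refused. */
    public static Response search(List<Shard> shards, String q) {
        Response rsp = new Response();
        for (Shard shard : shards) {
            try {
                rsp.docs.addAll(shard.query(q));
            } catch (ConnectException e) {
                // The shard is down: skip it and tell the client the results are incomplete.
                rsp.partialResults = true;
            }
        }
        return rsp;
    }
}
```

 Only ConnectException is swallowed here; any other failure would still
 propagate, which matches the narrow scope described in the issue.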




[jira] Commented: (SOLR-1078) WordDelimiterFilter do wrong word breaking for Thai vowel

2009-05-05 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12706211#action_12706211
 ] 

Yonik Seeley commented on SOLR-1078:


Are these characters all in the basic multilingual plane?

Here is the relevant code showing how WordDelimiterFilter classifies chars:

{code}
  [...]
} else if (Character.isLowerCase(ch)) {
  return LOWER;
} else if (Character.isLetter(ch)) {
  return UPPER;
} else {
  return SUBWORD_DELIM;
}
{code}
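
As a quick check, the sketch below mirrors only the three branches shown above
(the elided [...] branches are not reproduced, so this is an approximation of
the real method) and applies them to the first word of the reported example,
U+0E1C U+0E39 U+0E49. The class and method names are made up for the
illustration.

```java
public class ThaiCharTypes {
    /** Mirrors only the three visible branches of the snippet above. */
    public static String classify(char ch) {
        if (Character.isLowerCase(ch)) {
            return "LOWER";
        } else if (Character.isLetter(ch)) {
            return "UPPER";
        } else {
            return "SUBWORD_DELIM";
        }
    }

    public static void main(String[] args) {
        // Consonant PHO PHUNG, vowel sign SARA UU, tone mark MAI THO.
        String word = "\u0E1C\u0E39\u0E49";
        for (char ch : word.toCharArray()) {
            boolean mark = Character.getType(ch) == Character.NON_SPACING_MARK;
            System.out.printf("U+%04X nonSpacingMark=%b %s%n", (int) ch, mark, classify(ch));
        }
    }
}
```

All three characters are single chars in the basic multilingual plane. The
vowel sign and the tone mark are non-spacing marks rather than letters, so
Character.isLetter returns false for them and they fall through to the
SUBWORD_DELIM branch, which would be consistent with the reported output.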



 WordDelimiterFilter do wrong word breaking for Thai vowel
 -

 Key: SOLR-1078
 URL: https://issues.apache.org/jira/browse/SOLR-1078
 Project: Solr
  Issue Type: Bug
  Components: Analysis
Affects Versions: 1.4
 Environment: Ubuntu 8.10 64bit
 Java 1.6.0_10
Reporter: SIriwat Aumngamsup

 With any configuration in schema.xml,
 {code:xml}<filter class="solr.WordDelimiterFilterFactory" />{code}
 breaks words incorrectly when the text contains Thai characters.
 
 Example 1: ผู้ ใหญ่ บ้าน
 Wrong result: 0 = ผ, 1 = ใหญ, 2 = บ, 3 = าน
 Expected result: 0 = ผู้, 1 = ใหญ่, 2 = บ้าน
 
 Example 2: ผู้ใหญ่บ้าน (no space)
 Wrong result: 0 = ผ, 1 = ใหญ, 2 = บ, 3 = าน (same result)
 Expected result: 0 = ผู้ใหญ่บ้าน
 
 There's a similar problem with Drupal (http://drupal.org/node/335928)




[jira] Resolved: (SOLR-1030) Facet counts are not correct (or total document count is not correct as they do not match) on some searches

2009-05-05 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-1030.


Resolution: Cannot Reproduce

Closing, on the assumption that this is a duplicate-docs-on-shards issue, since 
we haven't seen it elsewhere.

 Facet counts are not correct (or total document count is not correct as they 
 do not match) on some searches
 ---

 Key: SOLR-1030
 URL: https://issues.apache.org/jira/browse/SOLR-1030
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.4
Reporter: Jayson Minard

 -There isn't much detailed evidence for this one yet, but hopefully it rings 
 a bell with someone who made changes in this area recently...-
 -Since updating to the tip from our previous use of the tip from around Jan 
 9, 2009- (seems to be previous to r733656 as well) we are now seeing facet 
 counts no longer match total document count.  This is through distributed 
 search and I have not verified that it only happens on distributed vs. single 
 shard search so it could be on both.
 For example, on a single valued field with one facet value set as a fq 
 filter, combined with a text search on a simple term science, the following 
 is the facet count:
 8,294,284
 And the total document count for the same results is:
 8,294,274
 some debug info (not sure why the filter query is replicated more than once, 
 but that shouldn't be harmful):
 {code}
 querystring(science)
 QParser   OldLuceneQParser
 filter_queries[sys_content_type:(Journal Article), 
 sys_content_type:(Journal Article), 
 sys_content_type:(Journal Article), sys_content_type:(Journal Article), 
 sys_content_type:(Journal Article), sys_content_type:(Journal Article), 
 sys_content_type:(Journal Article), sys_content_type:(Journal Article), 
 sys_content_type:(Journal Article), sys_content_type:(Journal Article)]
 rawquerystring(science)
 {code}




[jira] Updated: (SOLR-1025) QParsers ignore configured defaultType's

2009-05-05 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-1025:
---

Fix Version/s: 1.4

Can you give a concrete example (actual query strings) of what isn't working?

 QParsers ignore configured defaultType's
 

 Key: SOLR-1025
 URL: https://issues.apache.org/jira/browse/SOLR-1025
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.3
 Environment: All platforms
Reporter: Rick Moynihan
 Fix For: 1.4

   Original Estimate: 1h
  Remaining Estimate: 1h

 Whilst trying to implement my own QParser that would work with any XML 
 configured default query type, I noticed that the BoostQParserPlugin had a 
 hard coded assumption that means it ignores the defaultType specified in the 
 XML.
 The problem appears to be the following line:
 baseParser = subQuery(localParams.get(QueryParsing.V), null);
 Passing null into subQuery appears to cause it to get an 
 OldLuceneQParserPlugin if there is no defaultType specified as a localParam, 
 i.e. it doesn't appear to look further down the chain to check whether a 
 defaultType has been set in solrconfig.xml.
 Other QParsers appear to make similar assumptions (though I haven't tested 
 them).  Changing the above code to the following should resolve the issue.  
 I'd suggest that this functionality should also be made available inside the 
 QParser base class, so all QParsers can correctly resolve the defaultType.




SolrDispatchFilter config parameter solrconfig-filename

2009-05-05 Thread Jian Han Guo
Hi,

I wanted to use this parameter to specify different Solr configuration files
for master and slave, to simplify the deployment procedure. Unfortunately, I
can't dynamically replace the value of this parameter. Basically, what I
want is

  <filter>
    <filter-name>SolrRequestFilter</filter-name>
    <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
    <init-param>
      <param-name>solrconfig-filename</param-name>
      <param-value>solrconfig-master.xml</param-value>
    </init-param>
  </filter>

for master instance, and

  <filter>
    <filter-name>SolrRequestFilter</filter-name>
    <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
    <init-param>
      <param-name>solrconfig-filename</param-name>
      <param-value>solrconfig-slave.xml</param-value>
    </init-param>
  </filter>

for slave instance.

Ideally, I could use a system property for its value, as in solrconfig.xml.
For example,


  <filter>
    <filter-name>SolrRequestFilter</filter-name>
    <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
    <init-param>
      <param-name>solrconfig-filename</param-name>
      <param-value>${solr.config.filename: solrconfig.xml}</param-value>
    </init-param>
  </filter>

but I learned that, in general, system properties can't be used in web.xml.

I realize that I could replicate the config file to achieve this, but I
think that creates an unnecessary dependency of the slaves on the master
instance.

So here is my proposal:

Make SolrDispatchFilter look up another init parameter, say
'solrconfig-filename-property', whose value is the name of a system property.
If that property is set, its value is used as the config file name; otherwise
nothing happens. (Of course, if both parameters exist, 'solrconfig-filename'
takes precedence.) This will give us maximum flexibility in specifying
configuration files for different instances.
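
A rough sketch of the lookup order I have in mind. To keep it self-contained
it takes the init parameters as a plain Map instead of a servlet FilterConfig,
and the class and method names are made up; this is not existing
SolrDispatchFilter code.

```java
import java.util.Map;

public class ConfigNameResolver {
    /**
     * Returns the config file name, or null if neither init parameter yields one.
     * 'solrconfig-filename' wins; otherwise the system property named by
     * 'solrconfig-filename-property' is consulted.
     */
    public static String resolve(Map<String, String> initParams) {
        String explicit = initParams.get("solrconfig-filename");
        if (explicit != null) {
            return explicit;
        }
        String propertyName = initParams.get("solrconfig-filename-property");
        if (propertyName != null) {
            return System.getProperty(propertyName); // null when the property is unset
        }
        return null;
    }
}
```

With 'solrconfig-filename-property' set to solr.config.filename in web.xml,
each instance could then be started with
-Dsolr.config.filename=solrconfig-master.xml or
-Dsolr.config.filename=solrconfig-slave.xml, and a null result would leave the
current default untouched.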

Your thoughts?

Thanks,

Jianhan


[jira] Commented: (SOLR-1091) phps (serialized PHP) writer produces invalid output

2009-05-05 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12706260#action_12706260
 ] 

Yonik Seeley commented on SOLR-1091:


Is this valid unicode?  The serialized PHP writer already calculates the size 
of the UTF-8 encoded string, so it's difficult to see what's going on.
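
One way to look at the question is to decode the reported bytes. The sketch
below treats each 3-byte group as the bit pattern 1110xxxx 10xxxxxx 10xxxxxx,
checks whether the two resulting 16-bit values form a surrogate pair, and asks
Java how many bytes that pair occupies in UTF-8. Whether this is what actually
happens inside the phps writer is an assumption and is not confirmed in this
thread; the class and method names are illustrative.

```java
import java.nio.charset.StandardCharsets;

public class SurrogateLengths {
    /** Decodes one 3-byte sequence of the form 1110xxxx 10xxxxxx 10xxxxxx into a 16-bit value. */
    public static int decode3(int b0, int b1, int b2) {
        return ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F);
    }

    /** Number of bytes Java's UTF-8 encoder produces for the given pair of chars. */
    public static int utf8Length(char hi, char lo) {
        return new String(new char[] {hi, lo}).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        char hi = (char) decode3(0xED, 0xAF, 0x80);
        char lo = (char) decode3(0xED, 0xB1, 0xB8);
        System.out.printf("hi=U+%04X lo=U+%04X pair=%b%n",
                (int) hi, (int) lo, Character.isSurrogatePair(hi, lo));
        System.out.printf("code point=U+%X utf8 bytes=%d%n",
                Character.toCodePoint(hi, lo), utf8Length(hi, lo));
    }
}
```

The two groups decode to a high and a low surrogate, so the 6 input bytes are
two separately encoded surrogates rather than one well-formed UTF-8 sequence.
Java encodes the pair as a single supplementary character in 4 bytes, which
would account for a length of 4 against PHP's 6 if the writer's length is
computed from the Java string.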





[jira] Commented: (SOLR-1144) replication hang

2009-05-05 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12706302#action_12706302
 ] 

Noble Paul commented on SOLR-1144:
--

Stack trace: http://markmail.org/message/ecr6m4rf4iy2d652

I suspect the following two threads are blocked

{code}
'NioBlockingSelector.BlockPoller-2' Id=10, RUNNABLE on lock=, total cpu time=5580.ms user time=2120.ms
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:305)
'NioBlockingSelector.BlockPoller-1' Id=9, RUNNABLE on lock=, total cpu time=333280.ms user time=107520.ms
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:305)
{code}


