[jira] Created: (SOLR-1772) UpdateProcessor to prune "empty" values

2010-02-12 Thread Hoss Man (JIRA)
UpdateProcessor to prune "empty" values
---

 Key: SOLR-1772
 URL: https://issues.apache.org/jira/browse/SOLR-1772
 Project: Solr
  Issue Type: Wish
Reporter: Hoss Man


Users seem to frequently get confused when some FieldTypes (typically the 
numeric ones) complain about invalid field values when they inadvertently 
index an empty string.

It would be cool to provide an UpdateProcessor that makes it easy to strip out 
any fields being added with empty values ... it could be configured using field 
(and/or field type) names or globs to select/ignore certain fields -- I haven't 
thought it through all that hard
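As a sketch of the idea (this is not Solr code; in a real processor this logic would live in an UpdateRequestProcessor's processAdd method, with the proposed field/type globs deciding which fields to touch), the core pruning step just drops empty or whitespace-only values, and drops a field entirely once it has no values left:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class EmptyValuePruner {
    /**
     * Removes empty (or whitespace-only) string values from a document,
     * and drops any field left with no values. Models the core of the
     * proposed UpdateProcessor; real code would hook into the update
     * chain and honor field/field-type selection rules.
     */
    static Map<String, List<String>> prune(Map<String, List<String>> doc) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : doc.entrySet()) {
            List<String> kept = new ArrayList<>();
            for (String v : e.getValue()) {
                if (v != null && !v.trim().isEmpty()) kept.add(v);
            }
            if (!kept.isEmpty()) out.put(e.getKey(), kept);
        }
        return out;
    }
}
```

A numeric field submitted as an empty string would simply vanish from the document before the FieldType ever sees it, instead of triggering a parse error.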

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1771) StringIndexDocValues should provide a better error message when getStringIndex fails

2010-02-12 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1771.


Resolution: Fixed
  Assignee: Hoss Man

I'm not convinced that the wording of the new error message is all that great, 
but it's vastly better than the previous behavior...

Committed revision 909746.


Note that this affected numerous different classes: OrdFieldSource, all the 
"Sortable*Field" classes, DateField, and StrField (anyone instantiating an 
instance of StringIndexDocValues).

> StringIndexDocValues should provide a better error message when 
> getStringIndex fails
> 
>
> Key: SOLR-1771
> URL: https://issues.apache.org/jira/browse/SOLR-1771
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 1.5
>
>
> If someone attempts to use an OrdFieldSource on a field that is tokenized, 
> FieldCache.getStringIndex throws a confusing RuntimeException that 
> StringIndexDocValues propagates.  We should wrap that exception in something 
> more helpful...
> http://old.nabble.com/sorting-td27544348.html




[jira] Created: (SOLR-1771) StringIndexDocValues should provide a better error message when getStringIndex fails

2010-02-12 Thread Hoss Man (JIRA)
StringIndexDocValues should provide a better error message when getStringIndex 
fails


 Key: SOLR-1771
 URL: https://issues.apache.org/jira/browse/SOLR-1771
 Project: Solr
  Issue Type: Bug
  Components: search
Reporter: Hoss Man
 Fix For: 1.5


If someone attempts to use an OrdFieldSource on a field that is tokenized, 
FieldCache.getStringIndex throws a confusing RuntimeException that 
StringIndexDocValues propagates.  We should wrap that exception in something 
more helpful...

http://old.nabble.com/sorting-td27544348.html
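As a sketch of the general shape of such a fix (class and method names here are illustrative, not the actual Solr code): catch the low-level RuntimeException and rethrow one whose message names the field and suggests the likely cause, keeping the original exception as the cause.

```java
class FieldCacheAccess {
    /** Illustrative stand-in for the low-level lookup that may throw. */
    interface StringIndexLoader {
        int[] load(String field);
    }

    /**
     * Wraps any RuntimeException from the loader in an exception whose
     * message names the field and hints at the likely cause: sorting and
     * ord-based functions require an untokenized, single-valued field.
     */
    static int[] getStringIndex(StringIndexLoader loader, String field) {
        try {
            return loader.load(field);
        } catch (RuntimeException e) {
            throw new IllegalStateException(
                "Can't build a StringIndex for field '" + field
                + "' -- it is likely tokenized or multivalued", e);
        }
    }
}
```

The user then sees the offending field name instead of an opaque FieldCache internals message.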




[jira] Resolved: (SOLR-1579) CLONE -stats.jsp XML escaping

2010-02-12 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1579.


Resolution: Fixed
  Assignee: Hoss Man  (was: Erik Hatcher)


I fully expect stats.jsp will be deprecated in the next release of Solr in 
favor of the handler in SOLR-1750 -- BUT -- I still can't believe such an 
annoying and yet trivial-to-fix bug was around for so long ... especially since 
the incorrect fix for the XML attribute escaping is only half the problem: 
escapeCharData was still needed for the XML element content escaping.

David: thanks for your prodding on this ... I committed your patch plus some 
additional fixes (r909705)
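The distinction at the heart of the bug can be sketched with two minimal escapers (simplified stand-ins for Solr's XML utility methods, not the actual implementations): element text content needs & and < escaped, while a double-quoted attribute value must additionally escape the quote character itself, or a query containing a double quote produces invalid XML.

```java
class XmlEscape {
    /** Minimal escaping for element text content: & and < suffice. */
    static String escapeCharData(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;");
    }

    /** Double-quoted attribute values must also escape the quote itself. */
    static String escapeAttributeValue(String s) {
        return escapeCharData(s).replace("\"", "&quot;");
    }
}
```

A phrase query like field:"a & b" used as a filter-cache key survives char-data escaping but, placed in an attribute, still terminates the attribute early unless the quotes become &quot;.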


> CLONE -stats.jsp XML escaping
> -
>
> Key: SOLR-1579
> URL: https://issues.apache.org/jira/browse/SOLR-1579
> Project: Solr
>  Issue Type: Bug
>  Components: web gui
>Affects Versions: 1.4
>Reporter: David Bowen
>Assignee: Hoss Man
> Fix For: 1.5
>
> Attachments: SOLR-1579.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The fix to SOLR-1008 was wrong.  It used chardata escaping for a value that 
> is an attribute value.
> I.e. instead of XML.escapeCharData it should call XML.escapeAttributeValue.
> Otherwise, any query used as a key in the filter cache whose printed 
> representation contains a double-quote character causes invalid XML to be 
> generated.




[jira] Updated: (SOLR-1008) stats.jsp XML escaping

2010-02-12 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1008:
---

Fix Version/s: 1.5  (was: 1.4)

Note: the fix included in Solr 1.4 was not actually correct; revising version 
info accordingly.  See SOLR-1579 for details.

> stats.jsp XML escaping
> --
>
> Key: SOLR-1008
> URL: https://issues.apache.org/jira/browse/SOLR-1008
> Project: Solr
>  Issue Type: Bug
>  Components: web gui
>Reporter: Erik Hatcher
>Assignee: Erik Hatcher
> Fix For: 1.5
>
> Attachments: SOLR-1008.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> stats.jsp gave this error:
> Line Number 1327, Column 48: stat names are not XML escaped.




[jira] Commented: (SOLR-1301) Solr + Hadoop

2010-02-12 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833108#action_12833108
 ] 

Jason Rutherglen commented on SOLR-1301:


There still seems to be a bug where the temporary index directory isn't deleted 
on job completion.

> Solr + Hadoop
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.4
>Reporter: Andrzej Bialecki 
> Fix For: 1.5
>
> Attachments: commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, hadoop.patch, 
> log4j-1.2.15.jar, README.txt, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SolrRecordWriter.java
>
>
> This patch contains a contrib module that provides distributed indexing 
> (using Hadoop) to Solr via EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When the reduce 
> task completes and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.
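The batching flow described above can be sketched with a self-contained stand-in, where a Consumer plays the role of EmbeddedSolrServer.add(batch) and a Runnable the commit()/optimize() call made on close; the class name and batch size are illustrative, not taken from the patch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Sketch of SolrRecordWriter's batching: buffer docs, submit in batches. */
class BatchingWriter<D> {
    private final int batchSize;
    private final Consumer<List<D>> server;   // stands in for EmbeddedSolrServer.add(batch)
    private final Runnable commit;            // stands in for commit()/optimize() on close
    private final List<D> batch = new ArrayList<>();

    BatchingWriter(int batchSize, Consumer<List<D>> server, Runnable commit) {
        this.batchSize = batchSize;
        this.server = server;
        this.commit = commit;
    }

    /** Called once per converted (key, value) pair from the reducer. */
    void write(D doc) {
        batch.add(doc);
        if (batch.size() >= batchSize) flush();
    }

    private void flush() {
        if (!batch.isEmpty()) {
            server.accept(new ArrayList<>(batch));
            batch.clear();
        }
    }

    /** Called when the reduce task's OutputFormat is closed. */
    void close() {
        flush();
        commit.run();
    }
}
```

Because each reducer owns its own writer (and its own embedded server), documents never cross the network: each reducer produces one self-contained shard.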




[jira] Commented: (SOLR-1365) Add configurable Sweetspot Similarity factory

2010-02-12 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832973#action_12832973
 ] 

Erik Hatcher commented on SOLR-1365:


I'm not really sure why we have that constraint in SolrResourceLoader, and why 
any class we load can't simply implement SolrCoreAware.  But at the very least, 
we can update this to support a SimilarityFactory for the sake of this issue.  
+1

> Add configurable Sweetspot Similarity factory
> -
>
> Key: SOLR-1365
> URL: https://issues.apache.org/jira/browse/SOLR-1365
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 1.3
>Reporter: Kevin Osborn
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1365.patch
>
>
> This is some code that I wrote a while back.
> Normally, if you use SweetSpotSimilarity, you are going to make it do 
> something useful by extending SweetSpotSimilarity. So, instead, I made a 
> factory class and a configurable SweetSpotSimilarity. There are two classes. 
> SweetSpotSimilarityFactory reads the parameters from schema.xml. It then 
> creates an instance of VariableSweetSpotSimilarity, which is my custom 
> SweetSpotSimilarity class. In addition to the standard functions, it also 
> handles dynamic fields.
> So, in schema.xml, you could have something like this:
> 
> true
>   1.0
>   1.5
>   1.3
>   2.0
>   1
>   1
>   0.5
>   2
>   9
>   0.2
>   2
>   7
> <float name="lengthNormFactorsSteepness_supplierDescription_*">0.4</float>
>  
> So, now everything is in a config file instead of having to create your own 
> subclass.
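For context on what those length-norm parameters control: SweetSpotSimilarity's length norm (per the Lucene javadoc formula) is flat at 1.0 for field lengths inside the configured [min, max] "sweet spot" and decays outside it at a rate set by steepness. A standalone sketch of that formula, not the patch's code:

```java
class SweetSpotNorm {
    /**
     * Lucene SweetSpotSimilarity length norm:
     *   1 / sqrt(steepness * (|len - min| + |len - max| - (max - min)) + 1)
     * The parenthesized term is 0 for any length inside [min, max],
     * so the norm there is exactly 1.0.
     */
    static double lengthNorm(int numTerms, int min, int max, double steepness) {
        double excess = Math.abs(numTerms - min) + Math.abs(numTerms - max) - (max - min);
        return 1.0 / Math.sqrt(steepness * excess + 1.0);
    }
}
```

With min=2, max=9, steepness=0.5 (values resembling those in the example config), a 5-term field scores a norm of 1.0 while a 1-term field is penalized.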




[jira] Commented: (SOLR-1769) Solr 1.4 Replication - Repeater throwing NullPointerException

2010-02-12 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832964#action_12832964
 ] 

Noble Paul commented on SOLR-1769:
--


Let me assume that you are using the 1.4 release and not a nightly build.

I don't see anything obviously wrong. Meanwhile, why don't you just pick up the 
solrconfig.xml from 1.4, make the necessary changes, and test it out?
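For comparison, this is roughly what a repeater section looks like following the Solr 1.4 example solrconfig.xml: a single ReplicationHandler declared with both a master and a slave list. The URL and poll interval below are placeholders, not taken from the attached config.

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml</str>
  </lst>
  <lst name="slave">
    <str name="masterUrl">http://IP:PORT/SolrSmartPriceSS/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```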


> Solr 1.4 Replication - Repeater throwing NullPointerException
> -
>
> Key: SOLR-1769
> URL: https://issues.apache.org/jira/browse/SOLR-1769
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Deepak
>Assignee: Noble Paul
>Priority: Critical
> Attachments: solrconfig.xml
>
>
> Hi
> I am trying to test Solr 1.4 Java replication. It works fine with this 
> configuration on the slave, and data is synced from the master without any issue:
> 
> 
> <str name="masterUrl">http://IP:PORT/SolrSmartPriceSS/replication</str>
> <str name="compression">internal</str>
> <str name="httpConnTimeout">5000</str>
> 1
>  
> 
> We need to set up a repeater on this slave. We have this configuration on the 
> slave. With this configuration, it's throwing a NullPointerException. Please 
> see the error log:
> 
>  
> <str name="replicateAfter">commit</str>
> <str name="confFiles">schema.xml</str>
> 
> 
> <str name="masterUrl">http://IP:PORT/SolrSmartPriceSS/replication</str>
> <str name="compression">internal</str>
> <str name="httpConnTimeout">5000</str>
> 1
>  
> 
> Error log
> INFO: start 
> commit(optimize=false,waitFlush=true,waitSearcher=true,expungeDeletes=false)
> Feb 9, 2010 10:27:55 PM org.apache.solr.handler.ReplicationHandler doFetch
> SEVERE: SnapPull failed
> org.apache.solr.common.SolrException: Index fetch failed :
> at 
> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
> at 
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
> at 
> org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:146)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.solr.handler.ReplicationHandler$4.postCommit(ReplicationHandler.java:922)
> at 
> org.apache.solr.update.UpdateHandler.callPostCommitCallbacks(UpdateHandler.java:78)
> at 
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:411)
> at org.apache.solr.handler.SnapPuller.doCommit(SnapPuller.java:467)
> at 
> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:319)
> Please let us know how we can resolve this issue.
> Regards
> Deepak
>




Hudson build is back to normal : Solr-trunk #1057

2010-02-12 Thread Apache Hudson Server
See