date:20070702

[jira] Assigned: (SOLR-139) Support updateable/modifiable documents

2007-07-02 Thread Ryan McKinley (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley reassigned SOLR-139:
--

Assignee: Ryan McKinley

 Support updateable/modifiable documents
 ---

 Key: SOLR-139
 URL: https://issues.apache.org/jira/browse/SOLR-139
 Project: Solr
  Issue Type: Improvement
  Components: update
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Attachments: SOLR-139-IndexDocumentCommand.patch, 
 SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, 
 SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, 
 SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, 
 SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, 
 SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, 
 SOLR-139-XmlUpdater.patch


 It would be nice to be able to update some fields on a document without 
 having to insert the entire document.
 Given the way lucene is structured, (for now) one can only modify stored 
 fields.
 While we are at it, we can support incrementing an existing value - I think 
 this only makes sense for numbers.
 for background, see:
 http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Solr nightly build failure

2007-07-02 Thread solr-dev


init-forrest-entities:
[mkdir] Created dir: /tmp/apache-solr-nightly/build

checkJunitPresence:

compile-common:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/common
[javac] Compiling 25 source files to /tmp/apache-solr-nightly/build/common
[javac] Note: 
/tmp/apache-solr-nightly/src/java/org/apache/solr/common/params/DisMaxParams.java
 uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

compile:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/core
[javac] Compiling 194 source files to /tmp/apache-solr-nightly/build/core
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

compile-solrj-core:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/client/solrj
[javac] Compiling 21 source files to 
/tmp/apache-solr-nightly/build/client/solrj
[javac] Note: 
/tmp/apache-solr-nightly/client/java/solrj/src/org/apache/solr/client/solrj/impl/CommonsHttpSolrServer.java
 uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.

compile-solrj:
[javac] Compiling 2 source files to 
/tmp/apache-solr-nightly/build/client/solrj
[javac] Note: 
/tmp/apache-solr-nightly/client/java/solrj/src/org/apache/solr/client/solrj/embedded/JettySolrRunner.java
 uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.

compileTests:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/tests
[javac] Compiling 55 source files to /tmp/apache-solr-nightly/build/tests
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

junit:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/test-results
[junit] Running org.apache.solr.BasicFunctionalityTest
[junit] Tests run: 24, Failures: 0, Errors: 0, Time elapsed: 21.204 sec
[junit] Running org.apache.solr.ConvertedLegacyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 7.779 sec
[junit] Running org.apache.solr.DisMaxRequestHandlerTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 6.733 sec
[junit] Running org.apache.solr.EchoParamsTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 2.036 sec
[junit] Running org.apache.solr.OutputWriterTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1.864 sec
[junit] Running org.apache.solr.SampleTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.056 sec
[junit] Running org.apache.solr.analysis.TestBufferedTokenStream
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.056 sec
[junit] Running org.apache.solr.analysis.TestHyphenatedWordsFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.069 sec
[junit] Running org.apache.solr.analysis.TestKeepWordFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.067 sec
[junit] Running org.apache.solr.analysis.TestPatternReplaceFilter
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.088 sec
[junit] Running org.apache.solr.analysis.TestPatternTokenizerFactory
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.083 sec
[junit] Running org.apache.solr.analysis.TestPhoneticFilter
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.101 sec
[junit] Running org.apache.solr.analysis.TestRemoveDuplicatesTokenFilter
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.073 sec
[junit] Running org.apache.solr.analysis.TestSynonymFilter
[junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 0.126 sec
[junit] Running org.apache.solr.analysis.TestTrimFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.061 sec
[junit] Running org.apache.solr.analysis.TestWordDelimiterFilter
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 10.284 sec
[junit] Running org.apache.solr.common.SolrDocumentTest
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.078 sec
[junit] Running org.apache.solr.common.params.SolrParamTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.078 sec
[junit] Running org.apache.solr.common.util.ContentStreamTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.171 sec
[junit] Running org.apache.solr.common.util.IteratorChainTest
[junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 0.064 sec

Build failed in Hudson: Solr-Nightly #130

2007-07-02 Thread hudson

See http://lucene.zones.apache.org:8080/hudson/job/Solr-Nightly/130/changes

Changes:

[ryan] SOLR-133 -- found a bug in the delete XML parsing.  for id's and queries 
with , it did not behave correctly.  Adds a fix and test.  Another side effect 
that should be noted is that this parser now accepts multiple delete commands:

<delete>
 <id>1</id>
 <id>3</id>
 <id>4</id>
</delete>

[ryan] changes note for multiple deletes with delete command

[ryan] SOLR-269 -- missing changes for something committed a while ago

[ryan] SOLR-280 -- changing the SolrDocument/SolrInputDocument implementation 
so each one is as efficient as it can be.  The API changes mostly affect solrj 
users.

--
[...truncated 869 lines...]
A client/ruby/solr-ruby/test/unit/document_test.rb
A client/ruby/solr-ruby/test/unit/standard_response_test.rb
AUclient/ruby/solr-ruby/test/unit/delimited_file_source_test.rb
A client/ruby/solr-ruby/test/unit/xpath_test_file.xml
AUclient/ruby/solr-ruby/test/unit/array_mapper_test.rb
A client/ruby/solr-ruby/test/unit/field_test.rb
AUclient/ruby/solr-ruby/test/unit/solr_mock_base.rb
A client/ruby/solr-ruby/test/unit/add_document_test.rb
AUclient/ruby/solr-ruby/test/unit/request_test.rb
A client/ruby/solr-ruby/test/unit/commit_test.rb
AUclient/ruby/solr-ruby/test/unit/xpath_mapper_test.rb
AUclient/ruby/solr-ruby/test/unit/suite.rb
A client/ruby/solr-ruby/test/unit/ping_test.rb
A client/ruby/solr-ruby/test/unit/dismax_request_test.rb
A client/ruby/solr-ruby/test/unit/response_test.rb
AUclient/ruby/solr-ruby/test/unit/indexer_test.rb
AUclient/ruby/solr-ruby/test/unit/connection_test.rb
A client/ruby/solr-ruby/test/unit/delete_test.rb
AUclient/ruby/solr-ruby/test/unit/tab_delimited.txt
A client/ruby/solr-ruby/test/unit/hpricot_test_file.xml
AUclient/ruby/solr-ruby/test/unit/standard_request_test.rb
A client/ruby/solr-ruby/test/unit/hpricot_mapper_test.rb
AUclient/ruby/solr-ruby/test/unit/data_mapper_test.rb
AUclient/ruby/solr-ruby/test/unit/util_test.rb
A client/ruby/solr-ruby/test/functional
A client/ruby/solr-ruby/test/functional/test_solr_server.rb
A client/ruby/solr-ruby/test/functional/server_test.rb
A client/ruby/solr-ruby/test/conf
AUclient/ruby/solr-ruby/test/conf/schema.xml
A client/ruby/solr-ruby/test/conf/protwords.txt
A client/ruby/solr-ruby/test/conf/stopwords.txt
AUclient/ruby/solr-ruby/test/conf/solrconfig.xml
A client/ruby/solr-ruby/test/conf/scripts.conf
A client/ruby/solr-ruby/test/conf/admin-extra.html
A client/ruby/solr-ruby/test/conf/synonyms.txt
A client/ruby/solr-ruby/LICENSE.txt
A client/ruby/solr-ruby/Rakefile
A client/ruby/solr-ruby/script
AUclient/ruby/solr-ruby/script/setup.rb
AUclient/ruby/solr-ruby/script/solrshell
A client/ruby/solr-ruby/lib
A client/ruby/solr-ruby/lib/solr
AUclient/ruby/solr-ruby/lib/solr/util.rb
A client/ruby/solr-ruby/lib/solr/document.rb
A client/ruby/solr-ruby/lib/solr/exception.rb
AUclient/ruby/solr-ruby/lib/solr/indexer.rb
AUclient/ruby/solr-ruby/lib/solr/response.rb
AUclient/ruby/solr-ruby/lib/solr/connection.rb
A client/ruby/solr-ruby/lib/solr/importer
AUclient/ruby/solr-ruby/lib/solr/importer/delimited_file_source.rb
AUclient/ruby/solr-ruby/lib/solr/importer/solr_source.rb
AUclient/ruby/solr-ruby/lib/solr/importer/array_mapper.rb
AUclient/ruby/solr-ruby/lib/solr/importer/mapper.rb
AUclient/ruby/solr-ruby/lib/solr/importer/xpath_mapper.rb
A client/ruby/solr-ruby/lib/solr/importer/hpricot_mapper.rb
A client/ruby/solr-ruby/lib/solr/xml.rb
AUclient/ruby/solr-ruby/lib/solr/importer.rb
A client/ruby/solr-ruby/lib/solr/field.rb
AUclient/ruby/solr-ruby/lib/solr/solrtasks.rb
A client/ruby/solr-ruby/lib/solr/request
A client/ruby/solr-ruby/lib/solr/request/ping.rb
A client/ruby/solr-ruby/lib/solr/request/select.rb
AUclient/ruby/solr-ruby/lib/solr/request/optimize.rb
AUclient/ruby/solr-ruby/lib/solr/request/standard.rb
A client/ruby/solr-ruby/lib/solr/request/delete.rb
AUclient/ruby/solr-ruby/lib/solr/request/index_info.rb
A client/ruby/solr-ruby/lib/solr/request/update.rb
A client/ruby/solr-ruby/lib/solr/request/dismax.rb
A client/ruby/solr-ruby/lib/solr/request/add_document.rb
A client/ruby/solr-ruby/lib/solr/request/commit.rb
A client/ruby/solr-ruby/lib/solr/request/base.rb
AUclient/ruby/solr-ruby/lib/solr/request.rb
A client/ruby/solr-ruby/lib/solr/response
A client/ruby/solr-ruby/lib/solr/response/ping.rb
AU

[jira] Closed: (SOLR-163) libxml/rexml-related test case failure

2007-07-02 Thread Erik Hatcher (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher closed SOLR-163.
-

Resolution: Fixed

 libxml/rexml-related test case failure
 --

 Key: SOLR-163
 URL: https://issues.apache.org/jira/browse/SOLR-163
 Project: Solr
  Issue Type: Bug
  Components: clients - ruby - flare
Reporter: Erik Hatcher

 http://www.nabble.com/solrb-testing--tf3213880.html#a8949745
1) Failure: 
 test_delete_by_i18n_query_request(DeleteTest) [./test/unit/delete_test.rb:53]: 
 <delete>\n  <query>ëäïöü</query>\n</delete> expected to be =~ 
 /<delete>[\s]*<query>\303\253\303\244\303\257\303\266\303\274<\/query>[\s]*<\/delete>/m. 
2) Failure: 
 test_i18n_xml(FieldTest) [./test/unit/field_test.rb:39]: 
 <field name=\"i18nstring\">Äêâîôû Öëäïöü</field> expected to be =~ 
 /<field name=["']i18nstring["']>\303\204\303\252\303\242\303\256\303\264\303\273 \303\226\303\253\303\244\303\257\303\266\303\274<\/field>/m. 


[jira] Updated: (SOLR-280) slightly more efficient SolrDocument implementation

2007-07-02 Thread Will Johnson (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Will Johnson updated SOLR-280:
--

Attachment: SOLR-280-SolrDocument2-API-Compatibility.patch

The API changes mostly affect solrj users.

being one of those heavily affected users i created the attached patch to make
us unaffected. (or at least i went from a few hundred compile errors to 0)

the following methods were added back and are mostly 1-5 line wrappers to the
existing methods or underlying data structures.

setField(String, Object)
getFieldValue(String)
getFieldValues(String)
addField(String, Object)
getFieldNames()

- will

slightly more efficient SolrDocument implementation
---

Key: SOLR-280
URL: https://issues.apache.org/jira/browse/SOLR-280
Project: Solr
Issue Type: Improvement
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Priority: Minor
Attachments: SOLR-280-SolrDocument2-API-Compatibility.patch,
SOLR-280-SolrDocument2.patch, SOLR-280-SolrDocument2.patch

Following discussion in SOLR-272
This implementation stores fields as a Map<String,Object> rather than a
Map<String,Collection<Object>>. The API changes slightly in that:
getFieldValue( name ) returns a Collection if the field has more than one value
and an Object if there is only one.
getFirstValue( name ) returns a single value for the field. This is intended
to make things easier for client applications.
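
The storage scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SimpleDocument` is a hypothetical stand-in, not the SolrDocument class from the attached patches, and the promotion of a single value to a list on the second add is one possible way to get the described behavior.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for illustration; not the class from the SOLR-280 patch. */
public class SimpleDocument {
    // One entry per field: a bare Object for a single value,
    // a Collection once a second value has been added.
    private final Map<String, Object> fields = new LinkedHashMap<>();

    @SuppressWarnings("unchecked")
    public void addField(String name, Object value) {
        Object existing = fields.get(name);
        if (existing == null) {
            fields.put(name, value);
        } else if (existing instanceof Collection) {
            ((Collection<Object>) existing).add(value);
        } else {
            // Second value for this field: promote the single value to a list.
            Collection<Object> values = new ArrayList<>();
            values.add(existing);
            values.add(value);
            fields.put(name, values);
        }
    }

    /** Returns a Collection for multi-valued fields, otherwise the single value. */
    public Object getFieldValue(String name) {
        return fields.get(name);
    }

    /** Always returns a single value (the first one), or null if the field is absent. */
    public Object getFirstValue(String name) {
        Object v = fields.get(name);
        if (v instanceof Collection) {
            Collection<?> c = (Collection<?>) v;
            return c.isEmpty() ? null : c.iterator().next();
        }
        return v;
    }
}
```

One consequence of this layout is that callers of getFieldValue have to check the runtime type of the result, which is the case getFirstValue is meant to simplify. The sketch also cannot tell a single value that is itself a Collection apart from a multi-valued field.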


[jira] Commented: (SOLR-280) slightly more efficient SolrDocument implementation

2007-07-02 Thread Ryan McKinley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509602
 ] 

Ryan McKinley commented on SOLR-280:


in rev552521, I changed the Float variables to float and default everything to 
1.0.  if we have wrapper functions, this seems better than autoboxing/checking 
null values.

 slightly more efficient SolrDocument implementation
 ---

 Key: SOLR-280
 URL: https://issues.apache.org/jira/browse/SOLR-280
 Project: Solr
  Issue Type: Improvement
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Priority: Minor
 Attachments: SOLR-280-SolrDocument2-API-Compatibility.patch, 
 SOLR-280-SolrDocument2.patch, SOLR-280-SolrDocument2.patch


 Following discussion in SOLR-272
 This implementation stores fields as a Map<String,Object> rather than a 
 Map<String,Collection<Object>>.  The API changes slightly in that:
  getFieldValue( name ) returns a Collection if the field has more than one value 
 and an Object if there is only one.
 getFirstValue( name ) returns a single value for the field.  This is intended 
 to make things easier for client applications.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

2007-07-02 Thread Ryan McKinley (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509615
]

Ryan McKinley commented on SOLR-139:

So you are suggesting pulling this out of the UpdateHandler and managing the
document merging in the UpdateRequestProcessor? (this might make sense - It
was not an option when the patch started in feb)

How can the UpdateHandler get access to pending documents? should it just use
req.getSearcher()?

example1: a userTag field that represents tags on objects of the form
user#tagstring.
If user==member, then add tagstring to the indexed-only ownerTags field, else
add the tagstring to the socialTags field.

example2: an UpdateRequestProcessor is used to encode the value of a field
with rot13... this should obviously only be done for new field values, and
not values that are just being re-stored, so the UpdateRequestProcessor needs
to be able to distinguish between the two.

1 & 2 seem pretty straightforward
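
As a rough illustration of example1, the routing of user#tagstring values could look something like the sketch below. The names `TagRouter` and `route`, and the idea of passing the owning member as a plain string, are assumptions made for this sketch and do not come from any patch on this issue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only. */
public class TagRouter {
    /**
     * Splits each "user#tagstring" value and files the tag string under
     * "ownerTags" when user equals the given member, else under "socialTags".
     */
    public static Map<String, List<String>> route(String member, List<String> userTags) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        out.put("ownerTags", new ArrayList<>());
        out.put("socialTags", new ArrayList<>());
        for (String userTag : userTags) {
            int hash = userTag.indexOf('#');
            if (hash < 0) {
                continue; // assumption: values without '#' are skipped
            }
            String user = userTag.substring(0, hash);
            String tag = userTag.substring(hash + 1);
            out.get(user.equals(member) ? "ownerTags" : "socialTags").add(tag);
        }
        return out;
    }
}
```

The point of the example in the thread is that both derived fields depend on the full set of userTag values, so a partial update of userTag would need to recompute them.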

example3: some field values are pulled from a database when missing rather
than being stored values.

Do you mean as input or output? The UpdateRequestProcessor could not affect
whether a field is stored or not; it could augment a document with more fields
*before* it is indexed. To add fields from a database rather than storing them,
we would need a hook at the end.

Support updateable/modifiable documents
---

Key: SOLR-139
URL: https://issues.apache.org/jira/browse/SOLR-139
Project: Solr
Issue Type: Improvement
Components: update
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Attachments: SOLR-139-IndexDocumentCommand.patch,
SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch,
SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch,
SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch,
SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch,
SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch,
SOLR-139-XmlUpdater.patch

It would be nice to be able to update some fields on a document without
having to insert the entire document.
Given the way lucene is structured, (for now) one can only modify stored
fields.
While we are at it, we can support incrementing an existing value - I think
this only makes sense for numbers.
for background, see:
http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

2007-07-02 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509622
 ] 

Yonik Seeley commented on SOLR-139:
---

 So you are suggesting [...]
 
I don't have a concrete implementation idea, I'm just going over all the things 
I know people will want to do (and many of these I have an immediate use for).

 Do you mean as input or output?

Input, for index-only fields.  Normally field values need to be stored for an 
update to work, but we could also allow the user to get these field values 
from an external source.

 we would need a hook at the end.

Yes, it might make sense to have more than one callback method per 
UpdateRequestProcessor

Of course now that I finally look at the code, UpdateRequestProcessor isn't 
quite what I expected.
I was originally thinking more along the lines of DocumentMutator(s) that 
manipulate a document, not that actually initiate the add/delete/update calls.  
But there is a certain greater power to what you are exposing/allowing too (as 
long as you don't need multiple of them).
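
For comparison, the DocumentMutator idea mentioned above might be sketched as a chain of small document-rewriting steps. Everything here is hypothetical: no DocumentMutator interface exists in the patch, and a plain Map stands in for the document type.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; these names are not from the Solr code base. */
public class MutatorChain {
    /** A step that only rewrites a document and never issues add/delete calls itself. */
    public interface DocumentMutator {
        void mutate(Map<String, Object> doc);
    }

    /** Applies each mutator in order to a copy of the input and returns the copy. */
    public static Map<String, Object> apply(Map<String, Object> doc,
                                            List<DocumentMutator> mutators) {
        Map<String, Object> copy = new LinkedHashMap<>(doc);
        for (DocumentMutator m : mutators) {
            m.mutate(copy);
        }
        return copy;
    }
}
```

Because each step only touches the document, several of them compose naturally, which is the limitation the comment raises about having a single processor that owns the update calls.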

In UpdateRequestProcessor, instead of 
  protected final NamedList<Object> response;
Why not just expose SolrQueryRequest, SolrQueryResponse?




 Support updateable/modifiable documents
 ---

 Key: SOLR-139
 URL: https://issues.apache.org/jira/browse/SOLR-139
 Project: Solr
  Issue Type: Improvement
  Components: update
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Attachments: SOLR-139-IndexDocumentCommand.patch, 
 SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, 
 SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, 
 SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, 
 SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, 
 SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, 
 SOLR-139-XmlUpdater.patch


 It would be nice to be able to update some fields on a document without 
 having to insert the entire document.
 Given the way lucene is structured, (for now) one can only modify stored 
 fields.
 While we are at it, we can support incrementing an existing value - I think 
 this only makes sense for numbers.
 for background, see:
 http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293


Re: removing most @author tags

2007-07-02 Thread Chris Hostetter


: In the spirit of shared ownership, what do people think of getting rid
: of @author tags (for committers or other dev people that consent?).
: Other apache projects have done so, for a host of reasons.

+1

: $ find . -name \*.java | xargs grep '@author'| grep -i hoss | wc
:   2   8 152

wow ... that's 2 more than i expected to see.



-Hoss

[jira] Commented: (SOLR-277) Character Entity of XHTML is not supported with XmlUpdateRequestHandler .

2007-07-02 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509638
 ] 

Hoss Man commented on SOLR-277:
---

(FWIW: the old XmlUpdateRequestHandler has been renamed XppUpdateRequestHandler 
and deprecated, the current XmlUpdateRequestHandler uses stax.)

This type of functionality seems like it might be a handy option for people to 
have if they know they have non-standard entities in their input -- but i think 
by default we want to be strict about our XML parsing.  perhaps an optional 
init param could be added to the XmlUpdateRequestHandler where a filename 
containing mappings like this could be specified?

(this is assuming stax has something akin to the 
parser.defineEntityReplacementText method used in the patch)

 Character Entity of XHTML is not supported with XmlUpdateRequestHandler .
 -

 Key: SOLR-277
 URL: https://issues.apache.org/jira/browse/SOLR-277
 Project: Solr
  Issue Type: Improvement
  Components: update
Affects Versions: 1.3
Reporter: Toru Matsuzawa
 Attachments: XmlUpdateRequestHandler.patch


 Character Entity of XHTML is not supported with XmlUpdateRequestHandler .
 http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent
 http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
 http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent
 XmlUpdateRequestHandler needs to handle these entities itself because xpp3 
 cannot use <!DOCTYPE>.
 I think this is necessary until StaxUpdateRequestHandler becomes /update.


Re: removing most @author tags

2007-07-02 Thread Ryan McKinley





Thoughts?



+1

It does feel a bit awkward.

[jira] Created: (SOLR-284) Parsing Rich Document Types

2007-07-02 Thread Eric Pugh (JIRA)

Parsing Rich Document Types
---

 Key: SOLR-284
 URL: https://issues.apache.org/jira/browse/SOLR-284
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.3
Reporter: Eric Pugh
 Fix For: 1.3


I have developed a RichDocumentRequestHandler based on the CSVRequestHandler 
that supports streaming a PDF, Word, PowerPoint, or Excel document into 
Solr.

I am attaching a patch file with the code changes, and if this looks good, will 
add a page similar to http://wiki.apache.org/solr/UpdateCSV.

 


[jira] Updated: (SOLR-284) Parsing Rich Document Types

2007-07-02 Thread Eric Pugh (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Pugh updated SOLR-284:
---

Attachment: test-files.zip

test files to go in test/test-files for unit testing.

 Parsing Rich Document Types
 ---

 Key: SOLR-284
 URL: https://issues.apache.org/jira/browse/SOLR-284
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.3
Reporter: Eric Pugh
 Fix For: 1.3

 Attachments: rich.patch, test-files.zip


 I have developed a RichDocumentRequestHandler based on the CSVRequestHandler 
 that supports streaming a PDF, Word, PowerPoint, or Excel document into 
 Solr.
 I am attaching a patch file with the code changes, and if this looks good, 
 will add a page similar to http://wiki.apache.org/solr/UpdateCSV.
  


[jira] Commented: (SOLR-284) Parsing Rich Document Types

2007-07-02 Thread Ryan McKinley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509676
 ] 

Ryan McKinley commented on SOLR-284:


I haven't run this patch, but have a few questions...

What is the *general* approach to extract a lucene document (list of fields) 
from a PDF? Word? Powerpoint?

Is this just access to a few common fields like author, keywords, text, etc?  
Is this something that realistically would need to be custom for each case?  

Perhaps it makes sense to add a contrib section for this sort of stuff.  It 
seems weird to add 10 library dependencies to the core distribution.  How does 
nutch handle this?
 


 Parsing Rich Document Types
 ---

 Key: SOLR-284
 URL: https://issues.apache.org/jira/browse/SOLR-284
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.3
Reporter: Eric Pugh
 Fix For: 1.3

 Attachments: rich.patch, test-files.zip


 I have developed a RichDocumentRequestHandler based on the CSVRequestHandler 
 that supports streaming a PDF, Word, PowerPoint, or Excel document into 
 Solr.
 I am attaching a patch file with the code changes, and if this looks good, 
 will add a page similar to http://wiki.apache.org/solr/UpdateCSV.
  


[jira] Updated: (SOLR-284) Parsing Rich Document Types

2007-07-02 Thread Eric Pugh (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Pugh updated SOLR-284:
---

Attachment: libs.zip

new jars to go in trunk/lib for pdf and office parsing...

 Parsing Rich Document Types
 ---

 Key: SOLR-284
 URL: https://issues.apache.org/jira/browse/SOLR-284
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.3
Reporter: Eric Pugh
 Fix For: 1.3

 Attachments: libs.zip, rich.patch, test-files.zip


 I have developed a RichDocumentRequestHandler based on the CSVRequestHandler 
 that supports streaming a PDF, Word, PowerPoint, or Excel document into 
 Solr.
 I am attaching a patch file with the code changes, and if this looks good, 
 will add a page similar to http://wiki.apache.org/solr/UpdateCSV.
  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Re: removing most @author tags

2007-07-02 Thread Mike Klaas


On 2-Jul-07, at 11:32 AM, Yonik Seeley wrote:


In the spirit of shared ownership, what do people think of getting rid
of @author tags (for committers or other dev people that consent?).
Other apache projects have done so, for a host of reasons.

- some people don't use author tags, hence credit is uneven
- author tags tend to only credit the original author, and not
everyone that works on the code after (or does code reviews, lends
ideas, etc, etc)
- we have CHANGES.txt to generally credit people (and it prob does  
a better job)


I've seen a better list of reasons elsewhere, but my main motivation
was that it didn't feel right having my name splashed all over code
that many other people are contributing to now.

Thoughts?


+0, though I think it is mostly a decision for those who have already  
tons of @author tags in the repo.


FWIW, our internal repository was in a similar situation: I was  
__author__ of 90% of the files, though certainly not the sole  
contributor to all of those files.  I decided to strip this  
attribution for precisely the reasons you enumerated.


-Mike

[jira] Commented: (SOLR-225) Allow pluggable Highlighting classes -- Formatters and Fragmenters

2007-07-02 Thread Mike Klaas (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509709
 ] 

Mike Klaas commented on SOLR-225:
-

Looking great Ryan (again, only commenting on the Highlighting configurability 
parts)

should:

  protected boolean emptyArray(String[] arr) {
    return (arr == null || arr.length == 0 || arr[0] == null || 
        arr[0].trim().length() == 0);
  }

perhaps be defined as

  protected boolean emptyArray(String[] arr) {
    return (arr == null || arr.length == 0 || arr.length == 1 && arr[0] == null || 
        arr[0].trim().length() == 0);
  }

 Allow pluggable Highlighting classes -- Formatters and Fragmenters
 --

 Key: SOLR-225
 URL: https://issues.apache.org/jira/browse/SOLR-225
 Project: Solr
  Issue Type: Improvement
Reporter: Brian Whitman
Assignee: Ryan McKinley
 Attachments: SOLR-225+260-HighlightPlugins.patch, 
 SOLR-225+260-HighlightPlugins.patch, SOLR-225-HighlightingConfig.patch, 
 SOLR-225-HighlightingConfig.patch, SOLR-225-HighlightingConfig.patch, 
 SOLR-225-HighlightingConfig.patch, SOLR-225-HighlightingConfig.patch


 Highlighting should support a pluggable architecture similar to what is seen 
 with RequestHandlers, Fields, FieldTypes, etc
 '
 For more background:
 http://www.nabble.com/Custom-fragmenter-tf3681588.html#a10289335


[jira] Issue Comment Edited: (SOLR-225) Allow pluggable Highlighting classes -- Formatters and Fragmenters

2007-07-02 Thread Mike Klaas (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509709
 ] 

Mike Klaas edited comment on SOLR-225 at 7/2/07 4:14 PM:
-

Looking great Ryan (again, only commenting on the Highlighting configurability 
parts)

should:

  protected boolean emptyArray(String[] arr) {
    return (arr == null || arr.length == 0 || arr[0] == null || 
        arr[0].trim().length() == 0);
  }

perhaps be defined as

  protected boolean emptyArray(String[] arr) {
    return (arr == null || arr.length == 0 || 
        (arr.length == 1 && (arr[0] == null || arr[0].trim().length() == 0)));
  }
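
To make the difference between the two definitions concrete, here is a small self-contained sketch. The class name `EmptyArrayCheck` and the two method names are made up for the comparison; the method bodies follow the two variants quoted above.

```java
public class EmptyArrayCheck {
    /** Current form: only the first element is inspected, whatever the array length. */
    static boolean emptyArrayOriginal(String[] arr) {
        return arr == null || arr.length == 0 || arr[0] == null
                || arr[0].trim().length() == 0;
    }

    /** Suggested form: a blank or null first element only counts for one-element arrays. */
    static boolean emptyArrayProposed(String[] arr) {
        return arr == null || arr.length == 0
                || (arr.length == 1 && (arr[0] == null || arr[0].trim().length() == 0));
    }

    public static void main(String[] args) {
        String[] mixed = {"  ", "title"};
        System.out.println(emptyArrayOriginal(mixed)); // prints "true"
        System.out.println(emptyArrayProposed(mixed)); // prints "false"
    }
}
```

The two agree on null, zero-length, and one-element arrays; they differ only when a blank or null first entry is followed by further values.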

Params:

+  public static final String HIGHLIGHT   = "hl";
+  public static final String PREFIX      = "hl.";
+  public static final String FIELDS      = PREFIX+"fl";
+  public static final String SNIPPETS    = PREFIX+"snippets";
+  public static final String FRAGSIZE    = PREFIX+"fragsize";
+  public static final String INCREMENT   = PREFIX+"increment";
+  public static final String SLOP        = PREFIX+"slop";

perhaps this should be PREFIX + 'regex.slop'?

+  public static final String MAX_CHARS   = PREFIX+"maxAnalyzedChars";

similarly.  Seems somewhat inelegant to define/hardcode the plugin-specific 
parameters here, though it is nice to have them all in one place...




 was:
Looking great Ryan (again, only commenting on the Highlighting configurability 
parts)

should:

  protected boolean emptyArray(String[] arr) {
    return (arr == null || arr.length == 0 || arr[0] == null || 
        arr[0].trim().length() == 0);
  }

perhaps be defined as

  protected boolean emptyArray(String[] arr) {
    return (arr == null || arr.length == 0 || arr.length == 1 && arr[0] == null || 
        arr[0].trim().length() == 0);
  }

 Allow pluggable Highlighting classes -- Formatters and Fragmenters
 --

 Key: SOLR-225
 URL: https://issues.apache.org/jira/browse/SOLR-225
 Project: Solr
  Issue Type: Improvement
Reporter: Brian Whitman
Assignee: Ryan McKinley
 Attachments: SOLR-225+260-HighlightPlugins.patch, 
 SOLR-225+260-HighlightPlugins.patch, SOLR-225-HighlightingConfig.patch, 
 SOLR-225-HighlightingConfig.patch, SOLR-225-HighlightingConfig.patch, 
 SOLR-225-HighlightingConfig.patch, SOLR-225-HighlightingConfig.patch


 Highlighting should support a pluggable architecture similar to what is seen 
 with RequestHandlers, Fields, FieldTypes, etc
 '
 For more background:
 http://www.nabble.com/Custom-fragmenter-tf3681588.html#a10289335


[jira] Updated: (SOLR-285) Server Side XSLT for update processing

2007-07-02 Thread Hoss Man (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Hoss Man updated SOLR-285:
--

Attachment: xslt_updater.diff

this is mainly just a proof of concept ... there is a lot of room for
improvement here .. this reuses the same TransformerProvider as the
XSLTResponseWriter but doesn't even try to use the cache (even if it did, using
it in conjunction with XSLTResponseWriter would constantly invalidate the cache)

the biggest improvement would be to find some way to pipeline the XSLT
transformation into the StAX parsing ... i tried to at least use a DOMResult
for the transformer and a DOMSource for the XMLStreamReader but i got this
exception...

SEVERE: java.lang.UnsupportedOperationException:
XMLInputFactory.createXMLStreamReader(javax.xml.transform.dom.DOMSource) not
yet implemented
at
com.bea.xml.stream.MXParserFactory.createXMLStreamReader(MXParserFactory.java:70)

...oh well.

patch also includes a simple rss2solr.xml stylesheet that does some very
simplistic/silly transformations to match the example schema.xml

comments from people who understand javax.xml.* better than i do would be
greatly appreciated.
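
For reference, a minimal sketch of the unpipelined fallback, using only standard javax.xml calls: the transformer writes into an in-memory buffer, and the StAX reader is then created from a Reader over that buffer instead of from a DOMSource. The stylesheet, the element names and the class name XsltThenStaxSketch are invented for this example; this is not the rss2solr.xml from the patch, and it does not solve the pipelining problem, since the whole transformed document is held in memory.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltThenStaxSketch {
    // Toy stylesheet (an assumption for this example): one <doc> per RSS <item>.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/'>"
      + "<add><xsl:for-each select='rss/channel/item'>"
      + "<doc><field name='name'><xsl:value-of select='title'/></field></doc>"
      + "</xsl:for-each></add>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Transforms the input, then reads the result with StAX and returns the field values.
    static List<String> fieldValues(String inputXml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));

        // Step 1: run the XSLT into a text buffer (no DOMResult involved).
        StringWriter buffer = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(inputXml)),
                              new StreamResult(buffer));

        // Step 2: hand the buffered text to StAX through a plain Reader.
        XMLStreamReader reader = XMLInputFactory.newInstance()
            .createXMLStreamReader(new StringReader(buffer.toString()));
        List<String> values = new ArrayList<>();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "field".equals(reader.getLocalName())) {
                values.add(reader.getElementText());
            }
        }
        reader.close();
        return values;
    }

    public static void main(String[] args) throws Exception {
        String rss = "<rss><channel>"
                   + "<item><title>first</title></item>"
                   + "<item><title>second</title></item>"
                   + "</channel></rss>";
        System.out.println(fieldValues(rss));
    }
}
```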

Server Side XSLT for update processing
--

Key: SOLR-285
URL: https://issues.apache.org/jira/browse/SOLR-285
Project: Solr
Issue Type: New Feature
Reporter: Hoss Man
Attachments: xslt_updater.diff

Ideally, we should support a way for people to send XML
ContentStreams to Solr and do server side XSLT processing to convert it (much
like the XSLTResponseWriter supports server side XSLT processing of responses).


[jira] Commented: (SOLR-225) Allow pluggable Highlighting classes -- Formatters and Fragmenters

2007-07-02 Thread Ryan McKinley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509721
 ] 

Ryan McKinley commented on SOLR-225:



 perhaps be defined as
 
 protected boolean emptyArray(String[] arr) {
   return (arr == null || arr.length == 0 || 
           (arr.length == 1 && (arr[0] == null || 
            arr[0].trim().length() == 0)));
 }

seems good.  This patch tried not to change any highlighting logic; it just 
moved it from the existing HighlightingUtils.java

I will add this change.


 
 +  public static final String MAX_CHARS   = PREFIX+"maxAnalyzedChars";
 
 similarly.  Seems somewhat inelegant to define/hardcode the plugin-specific 
 parameters here, though it is nice to have them all in one place...
 

I'm torn on what is more/less elegant.  

Should we have a new class in o.a.s.common.params for each plugin?  

Since the number of 'core' plugins will be relatively small, having a single 
HighlightParams class with sections for the core plugin options seems ok.  But 
I can easily be talked out of this...
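
A rough sketch of what "a single class with sections" could look like, folding in the 'regex.slop' naming suggested earlier in the thread. The class name HighlightParamsSketch and the "hl.regex." sub-prefix are assumptions for illustration only, not the final parameter names:

```java
final class HighlightParamsSketch {
    private HighlightParamsSketch() {}

    // --- general highlighting options ---
    static final String HIGHLIGHT = "hl";
    static final String PREFIX    = HIGHLIGHT + ".";
    static final String FIELDS    = PREFIX + "fl";
    static final String SNIPPETS  = PREFIX + "snippets";
    static final String FRAGSIZE  = PREFIX + "fragsize";

    // --- regex fragmenter options (the sub-prefix is an assumption) ---
    static final String REGEX     = PREFIX + "regex.";
    static final String SLOP      = REGEX + "slop";
    static final String MAX_CHARS = REGEX + "maxAnalyzedChars";

    public static void main(String[] args) {
        System.out.println(SLOP);       // hl.regex.slop
        System.out.println(MAX_CHARS);  // hl.regex.maxAnalyzedChars
    }
}
```

Everything stays in one place, and the sub-prefix makes it visible from the parameter name alone which plugin an option belongs to.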



Re: removing most @author tags

2007-07-02 Thread Ian Holsman


Yonik Seeley wrote:

In the spirit of shared ownership, what do people think of getting rid
of @author tags (for committers or other dev people that consent?).
Other apache projects have done so, for a host of reasons.

- some people don't use author tags, hence credit is uneven
- author tags tend to only credit the original author, and not
everyone that works on the code after (or does code reviews, lends
ideas, etc, etc)
- we have CHANGES.txt to generally credit people (and it prob does a 
better job)

you forgot another big reason
people tend to email people in the @author tags directly, instead of 
using the lists.

[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-02 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509725
 ] 

Yonik Seeley commented on SOLR-269:
---

Looking at UpdateRequestProcessor further, it seems like these should be 
singletons (instance per entry in solrconfig, no factory needed), and any extra 
state that is needed should be added to classes we already have (like 
AddCommand, etc), no?


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X update document Y?)
  * allow update, but not delete (by query?)
 <requestHandler name="/update" class="solr.StaxUpdateRequestHandler">
   <str name="update.processor.class">org.apache.solr.handler.UpdateRequestProcessor</str>
   <lst name="update.processor.args">
    ... (optionally pass in arguments to the factory init method) ...
   </lst>
 </requestHandler>
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583


[jira] Commented: (SOLR-225) Allow pluggable Highlighting classes -- Formatters and Fragmenters

2007-07-02 Thread Mike Klaas (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509728
 ] 

Mike Klaas commented on SOLR-225:
-

 I'm torn on what is more/less elegant. 

 Should we have a new class in o.a.s.common.params for each plugin? 

 Since the number of 'core' plugins will be relatively small, having a single 
 HighlightParams class with sections  for the core plugin options seems ok. 
 But I can easily be talked out of this... 

Seems ok to me too.  Spreading everything into a jumble of classes won't 
exactly help coherence.




[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-02 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509732
 ] 

Yonik Seeley commented on SOLR-269:
---

I think the newly added incremental time should not be on by default, as well 
as logging per id for deletes and adds.
Mike added the id aggregation code specifically because logging each add was 
taking so much time.
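
As an illustration of the general idea only (this is not the aggregation code in Solr; the class name, the cap and the message format are all made up), ids can be collected during the request and written out as one capped summary instead of one log call per add:

```java
import java.util.ArrayList;
import java.util.List;

final class AggregatedAddLogSketch {
    private final int maxIds;                       // how many ids to spell out
    private final List<String> ids = new ArrayList<>();
    private int total = 0;

    AggregatedAddLogSketch(int maxIds) { this.maxIds = maxIds; }

    // Called once per added document; cheap, no logging here.
    void recordAdd(String id) {
        if (ids.size() < maxIds) ids.add(id);
        total++;
    }

    // Called once at the end of the request; builds the single log message.
    String summary() {
        StringBuilder sb = new StringBuilder("added=").append(ids);
        if (total > ids.size()) {
            sb.append(" ... (").append(total - ids.size()).append(" more)");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        AggregatedAddLogSketch log = new AggregatedAddLogSketch(2);
        log.recordAdd("a");
        log.recordAdd("b");
        log.recordAdd("c");
        System.out.println(log.summary()); // added=[a, b] ... (1 more)
    }
}
```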


[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-02 Thread Ryan McKinley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509733
 ] 

Ryan McKinley commented on SOLR-269:


maybe.  I'm not sure I totally understand your suggestion though.  

I need something that is easily subclassed and can cleanly hold state across 
an entire request cycle.  The alternative is to pass the 
SolrQueryRequest/Response into each action and maybe pull out the 
schema/updateHandler/logged in user/etc for each command (each document in the 
list of 100)

Is the factory a performance concern?  (to my tastes) it seems nicer to work 
with:
 
 processDelete( DeleteUpdateCommand cmd )
 {
   if( user.isAdmin() ) {
     updateHandler.delete( cmd );
   }
   else {
     ...
   }
 }
 
 than:
 
 processDelete( DeleteUpdateCommand cmd, SolrQueryRequest req, 
                SolrQueryResponse rsp )
 {
   User user = req.getContext().get( "user" );
   if( user.isAdmin() ) {
     SolrCore core = req.getCore();
     SolrSchema schema = core.getSchema();
     UpdateHandler updateHandler = core.getUpdateHandler();
     updateHandler.delete( cmd );
   }
   else {
     ...
   }
 }
 
I'm fine either way, but I like the easy one-per-request interface.
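
A self-contained sketch of the factory-per-request shape, for comparison. All types here (Request, Processor, ProcessorFactory) are stand-ins invented for this example and are not the classes in the patch; the point is only that request-wide state is resolved once in the constructor and the per-command methods stay short:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PerRequestProcessorSketch {
    // Stand-in for the request; it only carries what the example needs.
    static final class Request {
        final boolean admin;
        Request(boolean admin) { this.admin = admin; }
    }

    // One instance per request: state is looked up once, then reused per command.
    static class Processor {
        protected final boolean admin;
        protected final List<String> deleted = new ArrayList<>();

        Processor(Request req) { this.admin = req.admin; }

        void processDelete(String id) {
            if (admin) {
                deleted.add(id); // stands in for updateHandler.delete( cmd )
            }
        }
    }

    // The factory is the long-lived, configured object; it hands out processors.
    interface ProcessorFactory {
        Processor getInstance(Request req);
    }

    // Handles one request: one processor, many commands.
    static List<String> run(ProcessorFactory factory, Request req, List<String> ids) {
        Processor processor = factory.getInstance(req);
        for (String id : ids) {
            processor.processDelete(id);
        }
        return processor.deleted;
    }

    public static void main(String[] args) {
        ProcessorFactory factory = Processor::new;
        System.out.println(run(factory, new Request(true), Arrays.asList("a", "b")));  // [a, b]
        System.out.println(run(factory, new Request(false), Arrays.asList("a", "b"))); // []
    }
}
```

The cost is one extra object per request, which matters most when requests carry a single document.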


[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-02 Thread Ryan McKinley (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509734
]

Ryan McKinley commented on SOLR-269:

I think the newly added incremental time should not be on by default, as well
as logging per id for deletes and adds.
Mike added the id aggregation code specifically because logging each add was
taking so much time.

sounds good. the testing I did showed that lots of time is spent in the
logging phase.

I will remove it from the default implementation.


[jira] Resolved: (SOLR-160) [Patch] Get Test Solr Server working in Windows environment

2007-07-02 Thread Mel Riffe (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mel Riffe resolved SOLR-160.


Resolution: Fixed

let me know if anyone else has problems with this patch; i'm moving the 
majority (99%) of my Rails development to the Mac but will support as needed.

 [Patch] Get Test Solr Server working in Windows environment
 ---

 Key: SOLR-160
 URL: https://issues.apache.org/jira/browse/SOLR-160
 Project: Solr
  Issue Type: Improvement
  Components: clients - ruby - flare
 Environment: Windows XP Home
Reporter: Mel Riffe
 Attachments: win32_functional_tests.patch, 
 win32_functional_tests.results


 Because Windows does not support forking processes I created a patch that 
 uses the Win32 api to create and destroy a process to control the test solr 
 server.
 I have attached two files: 1) the patch and 2) the results from running 'rake 
 test'.  In my environment I have two failures.
 My approach was to still support the including/requiring of the single file 
 test/functional/test_solr_server.rb and have it further require the platform 
 dependent start/stop api.


[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-02 Thread Yonik Seeley (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509737
]

Yonik Seeley commented on SOLR-269:
---

I need something that is easily subclassed and can cleanly hold state across
an entire request cycle.

Having a factory and separate object so that one can use core instead of
req.getCore(), etc, seems like overkill for the normal case though since
getCore(), getSchema(), getUpdateHandler() all just return instance variables.
I was thinking any state like that could be on the UpdateCommand.

I'd like to have potentially several request processors, but if people start
doing single doc add requests, instantiating and initializing all those request
processors will get expensive.

I do see your use case though, in the case of multiple docs per add where you
have some expensive state you only want to calculate once.
If it's a relatively rare case, one could put it in the request context.
The tradeoff would be an extra hash lookup per-document of a multi-document add
vs an extra object creation for single-doc adds.
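
A small sketch of the request-context variant, to make the tradeoff visible. The names here (RequestContextCacheSketch, the "sketch.state" key, expensiveState) are hypothetical, and a plain Map stands in for the request context. The processor itself is stateless, so it can be a singleton; the expensive value is computed on the first document and found by a hash lookup on every later one:

```java
import java.util.HashMap;
import java.util.Map;

final class RequestContextCacheSketch {
    // Counts how often the expensive computation actually runs.
    static int computations = 0;

    // Stand-in for expensive per-request state (e.g. resolving the user).
    static String expensiveState(String user) {
        computations++;
        return "state-for-" + user;
    }

    // Singleton-style method: nothing is stored on the processor, only in the context.
    static String processAdd(Map<Object, Object> context, String user, String docId) {
        Object state = context.computeIfAbsent("sketch.state", key -> expensiveState(user));
        return state + ":" + docId;
    }

    public static void main(String[] args) {
        Map<Object, Object> context = new HashMap<>(); // one map per request
        for (String id : new String[] { "1", "2", "3" }) {
            System.out.println(processAdd(context, "ryan", id));
        }
        System.out.println("computations=" + computations); // computations=1
    }
}
```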

Different Q on usage: is this where my document mutator stuff should go??? If
I want a transformation done on a field, regardless of where the data is coming
from (XML update handler, CSV update handler, future REST update handler, etc),
how should that be done? Is there a single place I can register a plugin to do
this, and is UpdateRequestProcessor where you see it happening?

