[jira] Updated: (SOLR-319) changes SynonymFilterFactory for N-gram tokenizer
[ https://issues.apache.org/jira/browse/SOLR-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi updated SOLR-319:
    Attachment: SOLR-319-UTF-8.patch

The patch includes TestSynonymMap. To test SynonymMap, I removed the "private" declaration from the parseRules() method. This patch includes CJKTokenizerFactory, too.

> changes SynonymFilterFactory for N-gram tokenizer
> -------------------------------------------------
>
> Key: SOLR-319
> URL: https://issues.apache.org/jira/browse/SOLR-319
> Project: Solr
> Issue Type: Improvement
> Reporter: Koji Sekiguchi
> Priority: Minor
> Attachments: SOLR-319-UTF-8.patch
>
> WHAT:
> Currently, SynonymFilterFactory works very well with an N-gram tokenizer
> (CJKTokenizer, for example).
> But we have to take care with the rules written in synonyms.txt.
> For example, if I use CJKTokenizer (which works as a bi-gram tokenizer for
> CJK chars) and want C1C2C3 to map to C4C5C6,
> I have to write the rule as follows:
> C1C2 C2C3 => C4C5 C5C6
> But I want to write it as "C1C2C3=>C4C5C6". This patch allows that. It is
> also helpful for sharing synonyms.txt.
>
> HOW:
> A tokenFactory attribute is added to <filter class="solr.SynonymFilterFactory"/>.
> If the attribute is specified, SynonymFilterFactory uses the TokenizerFactory
> to create a Tokenizer.
> Then SynonymFilterFactory uses the Tokenizer to get tokens from the rules in
> the synonyms.txt file.
>
> sample-1: CJKTokenizer
> positionIncrementGap="100">
> synonyms="ngram_synonym_test_ja.txt"
> ignoreCase="true" expand="true"
> tokenFactory="solr.CJKTokenizerFactory"/>
>
> sample-2: NGramTokenizer
> positionIncrementGap="100">
> maxGramSize="2"/>
> maxGramSize="2"/>
> synonyms="ngram_synonym_test_ngram.txt"
> ignoreCase="true" expand="true"
> tokenFactory="solr.NGramTokenizerFactory"
> minGramSize="2" maxGramSize="2"/>
>
> backward compatibility:
> Yes. If you omit the tokenFactory attribute from the
> <filter class="solr.SynonymFilterFactory"/> tag, it works as usual.

-- This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
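The rule rewriting described under WHAT and HOW can be illustrated with a small standalone sketch. It does not use Lucene or the patch's code: `BigramSynonymRule`, `bigrams`, and `expand` are hypothetical names, and plain character bi-grams stand in for CJKTokenizer (which treats non-CJK text differently). It only shows how running both sides of a rule through a bi-gram tokenizer turns "C1C2C3=>C4C5C6" into "C1C2 C2C3 => C4C5 C5C6", where each Ci stands for a single character.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the SOLR-319 patch and not a Lucene API.
class BigramSynonymRule {

    // Split a term into overlapping two-character tokens.
    // Terms shorter than two characters are returned unchanged.
    static List<String> bigrams(String term) {
        List<String> tokens = new ArrayList<>();
        if (term.length() < 2) {
            tokens.add(term);
            return tokens;
        }
        for (int i = 0; i + 2 <= term.length(); i++) {
            tokens.add(term.substring(i, i + 2));
        }
        return tokens;
    }

    // Tokenize both sides of a "left=>right" rule, the way a synonym parser
    // could if it were handed the same tokenizer the analyzer uses.
    static String expand(String rule) {
        String[] sides = rule.split("=>", 2);
        if (sides.length != 2) {
            throw new IllegalArgumentException("expected a rule of the form left=>right");
        }
        return String.join(" ", bigrams(sides[0].trim()))
                + " => "
                + String.join(" ", bigrams(sides[1].trim()));
    }
}
```

For instance, `BigramSynonymRule.expand("abc=>xyz")` yields `ab bc => xy yz`, which is the hand-written form the description wants to avoid.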
[jira] Created: (SOLR-319) changes SynonymFilterFactory for N-gram tokenizer
changes SynonymFilterFactory for N-gram tokenizer
-------------------------------------------------

Key: SOLR-319
URL: https://issues.apache.org/jira/browse/SOLR-319
Project: Solr
Issue Type: Improvement
Reporter: Koji Sekiguchi
Priority: Minor

WHAT:
Currently, SynonymFilterFactory works very well with an N-gram tokenizer (CJKTokenizer, for example).
But we have to take care with the rules written in synonyms.txt.
For example, if I use CJKTokenizer (which works as a bi-gram tokenizer for CJK chars) and want C1C2C3 to map to C4C5C6,
I have to write the rule as follows:
C1C2 C2C3 => C4C5 C5C6
But I want to write it as "C1C2C3=>C4C5C6". This patch allows that. It is also helpful for sharing synonyms.txt.

HOW:
A tokenFactory attribute is added to <filter class="solr.SynonymFilterFactory"/>.
If the attribute is specified, SynonymFilterFactory uses the TokenizerFactory to create a Tokenizer.
Then SynonymFilterFactory uses the Tokenizer to get tokens from the rules in the synonyms.txt file.

sample-1: CJKTokenizer

sample-2: NGramTokenizer

backward compatibility:
Yes. If you omit the tokenFactory attribute from the <filter class="solr.SynonymFilterFactory"/> tag, it works as usual.
[jira] Commented: (SOLR-139) Support updateable/modifiable documents
[ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515424 ]

Mike Klaas commented on SOLR-139:
---------------------------------

Darn, you're right: writer.addDocument() is outside of the synchronized block. We could do as you suggested, downgrading to a read lock from commit. It should only reduce concurrency when the document is in pending state.

> Support updateable/modifiable documents
> ---------------------------------------
>
> Key: SOLR-139
> URL: https://issues.apache.org/jira/browse/SOLR-139
> Project: Solr
> Issue Type: Improvement
> Components: update
> Reporter: Ryan McKinley
> Assignee: Ryan McKinley
> Attachments: getStoredFields.patch, getStoredFields.patch,
> getStoredFields.patch, getStoredFields.patch,
> SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch,
> SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch,
> SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch,
> SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch,
> SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch,
> SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch,
> SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch,
> SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
> It would be nice to be able to update some fields on a document without
> having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored
> fields.
> While we are at it, we can support incrementing an existing value - I think
> this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293
[jira] Commented: (SOLR-139) Support updateable/modifiable documents
[ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515406 ]

Yonik Seeley commented on SOLR-139:
-----------------------------------

The locking logic for getStoredFields() is indeed flawed. Closing the writer inside the sync block of getStoredFields() doesn't protect callers of addDoc() from concurrently using that writer. Acquiring the commit lock will be needed after all... no getting around it, I think.
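The locking problem in this thread (addDoc() callers using a writer while getStoredFields() closes it, and the suggestion to downgrade from the commit lock to a read lock) can be sketched with java.util.concurrent's ReentrantReadWriteLock. This is not Solr's update handler code: `PendingDocStore` and its fields are hypothetical, two in-memory lists stand in for the open writer and the searchable index, and the mapping of "commit lock" to the write lock is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for an update handler; not Solr code.
class PendingDocStore {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> pending = new ArrayList<>();   // stands in for the open writer
    private final List<String> committed = new ArrayList<>(); // stands in for the searchable index

    // Adders share the read lock, so they do not block each other,
    // but none of them can run while the write lock is held.
    void addDoc(String doc) {
        lock.readLock().lock();
        try {
            synchronized (pending) {
                pending.add(doc);
            }
        } finally {
            lock.readLock().unlock();
        }
    }

    // Readers of stored fields need the pending docs flushed first.
    // The flush ("closing the writer") runs under the exclusive write lock,
    // then the lock is downgraded so the lookup itself no longer blocks adders.
    String getStoredFields(String doc) {
        lock.writeLock().lock();
        try {
            committed.addAll(pending);
            pending.clear();
            // Downgrade: take the read lock before giving up the write lock.
            lock.readLock().lock();
        } finally {
            lock.writeLock().unlock();
        }
        try {
            return committed.contains(doc) ? doc : null;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

In this sketch adders are only held up for the duration of the flush, which matches the remark that concurrency should only suffer while a document is in the pending state.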
[jira] Commented: (SOLR-139) Support updateable/modifiable documents
[ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515366 ]

Yonik Seeley commented on SOLR-139:
-----------------------------------

I disabled logging on all of "org.apache.solr" via a filter, and voila, OOM problems are gone. Perhaps the logger could not keep up with the number of records and they piled up over time (does any component of the logging framework use another thread that might be getting starved?). Anyway, it doesn't look like Solr has a memory leak. On to the next issue.
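The comment does not show the filter that was used. As a rough sketch, assuming java.util.logging is the framework in play, a filter that drops every record from a logger under a given prefix could look like the following. `LoggerPrefixFilter`, `dropPrefix`, and `installOnRootHandlers` are hypothetical names; only the java.util.logging types are real.

```java
import java.util.logging.Filter;
import java.util.logging.Handler;
import java.util.logging.Logger;

// Hypothetical helper; a sketch of one way to silence a logger subtree.
class LoggerPrefixFilter {

    // Build a filter that rejects records whose logger is the given name
    // or any child of it (e.g. "org.apache.solr" and "org.apache.solr.core").
    static Filter dropPrefix(final String prefix) {
        return record -> {
            String name = record.getLoggerName();
            boolean matches = name != null
                    && (name.equals(prefix) || name.startsWith(prefix + "."));
            return !matches; // false means the record is discarded
        };
    }

    // Attach the filter to every handler on the root logger.
    static void installOnRootHandlers(String prefix) {
        Filter filter = dropPrefix(prefix);
        for (Handler handler : Logger.getLogger("").getHandlers()) {
            handler.setFilter(filter);
        }
    }
}
```

A handler filter only discards records after they have been created; if the aim is to avoid creating them at all, setting the logger's level to `Level.OFF` is the cheaper option.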
[jira] Commented: (SOLR-139) Support updateable/modifiable documents
[ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515306 ]

Yonik Seeley commented on SOLR-139:
-----------------------------------

The OOM still happens from the command line, even after updating Lucene to 2.2. Looks like it's time for old-school instrumentation (printfs, etc).
[jira] Updated: (SOLR-317) A XSLT stylesheet that pretty-prints out the response from the LukeRequestHandler
[ https://issues.apache.org/jira/browse/SOLR-317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Peuss updated SOLR-317:
    Attachment: prettyluke.xsl

SVG graphics tuning.

> A XSLT stylesheet that pretty-prints out the response from the
> LukeRequestHandler
> ---------------------------------------------------------------
>
> Key: SOLR-317
> URL: https://issues.apache.org/jira/browse/SOLR-317
> Project: Solr
> Issue Type: Improvement
> Components: web gui
> Reporter: Thomas Peuss
> Priority: Minor
> Attachments: prettyluke.xsl, prettyluke.xsl, prettyluke.xsl
>
> A first version of a XSLT stylesheet for pretty printing the response from
> the LukeRequestHandler. It uses inline SVG graphics for the histograms if you
> are on Firefox, Safari or Opera. On IE you get a list for the histograms.
> When you put it in /admin you can try it with
> http://localhost:8080/apache-solr-1.3-dev/admin/luke?stylesheet=../apache-solr-1.3-dev/admin/prettyluke
Build failed in Hudson: Solr-Nightly #153
See http://lucene.zones.apache.org:8080/hudson/job/Solr-Nightly/153/changes

--
[...truncated 900 lines...]
A    client/ruby/solr-ruby/test/unit/field_test.rb
AU   client/ruby/solr-ruby/test/unit/solr_mock_base.rb
A    client/ruby/solr-ruby/test/unit/add_document_test.rb
AU   client/ruby/solr-ruby/test/unit/request_test.rb
A    client/ruby/solr-ruby/test/unit/commit_test.rb
AU   client/ruby/solr-ruby/test/unit/xpath_mapper_test.rb
AU   client/ruby/solr-ruby/test/unit/suite.rb
A    client/ruby/solr-ruby/test/unit/ping_test.rb
A    client/ruby/solr-ruby/test/unit/dismax_request_test.rb
A    client/ruby/solr-ruby/test/unit/response_test.rb
AU   client/ruby/solr-ruby/test/unit/indexer_test.rb
AU   client/ruby/solr-ruby/test/unit/connection_test.rb
A    client/ruby/solr-ruby/test/unit/delete_test.rb
AU   client/ruby/solr-ruby/test/unit/tab_delimited.txt
A    client/ruby/solr-ruby/test/unit/hpricot_test_file.xml
AU   client/ruby/solr-ruby/test/unit/standard_request_test.rb
A    client/ruby/solr-ruby/test/unit/hpricot_mapper_test.rb
AU   client/ruby/solr-ruby/test/unit/data_mapper_test.rb
AU   client/ruby/solr-ruby/test/unit/util_test.rb
A    client/ruby/solr-ruby/test/functional
A    client/ruby/solr-ruby/test/functional/test_solr_server.rb
A    client/ruby/solr-ruby/test/functional/server_test.rb
A    client/ruby/solr-ruby/test/conf
AU   client/ruby/solr-ruby/test/conf/schema.xml
A    client/ruby/solr-ruby/test/conf/protwords.txt
A    client/ruby/solr-ruby/test/conf/stopwords.txt
AU   client/ruby/solr-ruby/test/conf/solrconfig.xml
A    client/ruby/solr-ruby/test/conf/scripts.conf
A    client/ruby/solr-ruby/test/conf/admin-extra.html
A    client/ruby/solr-ruby/test/conf/synonyms.txt
A    client/ruby/solr-ruby/LICENSE.txt
A    client/ruby/solr-ruby/Rakefile
A    client/ruby/solr-ruby/script
AU   client/ruby/solr-ruby/script/setup.rb
AU   client/ruby/solr-ruby/script/solrshell
A    client/ruby/solr-ruby/lib
A    client/ruby/solr-ruby/lib/solr
AU   client/ruby/solr-ruby/lib/solr/util.rb
A    client/ruby/solr-ruby/lib/solr/document.rb
A    client/ruby/solr-ruby/lib/solr/exception.rb
AU   client/ruby/solr-ruby/lib/solr/indexer.rb
AU   client/ruby/solr-ruby/lib/solr/response.rb
AU   client/ruby/solr-ruby/lib/solr/connection.rb
A    client/ruby/solr-ruby/lib/solr/importer
AU   client/ruby/solr-ruby/lib/solr/importer/delimited_file_source.rb
AU   client/ruby/solr-ruby/lib/solr/importer/solr_source.rb
AU   client/ruby/solr-ruby/lib/solr/importer/array_mapper.rb
AU   client/ruby/solr-ruby/lib/solr/importer/mapper.rb
AU   client/ruby/solr-ruby/lib/solr/importer/xpath_mapper.rb
A    client/ruby/solr-ruby/lib/solr/importer/hpricot_mapper.rb
A    client/ruby/solr-ruby/lib/solr/xml.rb
AU   client/ruby/solr-ruby/lib/solr/importer.rb
A    client/ruby/solr-ruby/lib/solr/field.rb
AU   client/ruby/solr-ruby/lib/solr/solrtasks.rb
A    client/ruby/solr-ruby/lib/solr/request
A    client/ruby/solr-ruby/lib/solr/request/ping.rb
A    client/ruby/solr-ruby/lib/solr/request/select.rb
AU   client/ruby/solr-ruby/lib/solr/request/optimize.rb
AU   client/ruby/solr-ruby/lib/solr/request/standard.rb
A    client/ruby/solr-ruby/lib/solr/request/delete.rb
AU   client/ruby/solr-ruby/lib/solr/request/index_info.rb
A    client/ruby/solr-ruby/lib/solr/request/update.rb
A    client/ruby/solr-ruby/lib/solr/request/dismax.rb
A    client/ruby/solr-ruby/lib/solr/request/add_document.rb
A    client/ruby/solr-ruby/lib/solr/request/commit.rb
A    client/ruby/solr-ruby/lib/solr/request/base.rb
AU   client/ruby/solr-ruby/lib/solr/request.rb
A    client/ruby/solr-ruby/lib/solr/response
A    client/ruby/solr-ruby/lib/solr/response/ping.rb
AU   client/ruby/solr-ruby/lib/solr/response/optimize.rb
A    client/ruby/solr-ruby/lib/solr/response/standard.rb
A    client/ruby/solr-ruby/lib/solr/response/xml.rb
A    client/ruby/solr-ruby/lib/solr/response/ruby.rb
A    client/ruby/solr-ruby/lib/solr/response/delete.rb
AU   client/ruby/solr-ruby/lib/solr/response/index_info.rb
A    client/ruby/solr-ruby/lib/solr/response/dismax.rb
A    client/ruby/solr-ruby/lib/solr/response/add_document.rb
A    client/ruby/solr-ruby/lib/solr/response/commit.rb
A    client/ruby/solr-ruby/lib/solr/response/base.rb
AU   client/ruby/solr-ruby/lib/solr.rb
A    client/ruby/solr-ruby/CHANGES.yml
A    client/ruby/solr-ruby/README
A    client/ruby/solr-ruby/examples
A    client/ruby/solr-ruby/examples/marc
AU   client/ruby/solr-ruby/examples/marc/marc_importer.rb
A    client/ruby/solr-ruby/examples/delicious_library
A    client/ruby/s
PHP Response Writer for Solr
I've been using the proposed PHP response writer code from SOLR-196 (eval-able PHP code) and SOLR-275 (serialized PHP data) for some time now and would like to work towards getting these included in the main Solr distribution.

http://www.nabble.com/Created%3A-%28SOLR-196%29-A-PHP-response-writer-for-Solr-tf3458434.html
http://www.nabble.com/-jira--Created%3A-%28SOLR-275%29-PHP-Serialized-Response-Writer-tf3980951.html

There is quite a bit of code duplication in SOLR-196 which I'd like to eliminate if possible. Also, because of the way PHP serializes data (e.g. storing the number of elements in an array), the JSONWriter may have to be refactored, specifically where arrays are written directly using writer.write('{') and writer.write('}') rather than the writeArray() method.

In order to differentiate between the two, I propose we rename the serialized writer to PHPSerializedResponseWriter to avoid any conflicts with the original "eval" PHPResponseWriter and configure them as such:

I'd just like to get some opinions and feedback on the above, and also to figure out the best approach to achieving this goal before I start making any changes.

thanks,
Pieter
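To make the point about element counts concrete: PHP's serialized format writes an array as `a:<count>:{...}`, so the writer must know how many elements follow before it emits the first one, whereas JSON-style output can open a bracket and stream. The sketch below is not the SOLR-196 or SOLR-275 code; `PhpSerializeSketch` and its methods are hypothetical, and it handles only lists of strings whose lengths are counted as UTF-8 bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical illustration of PHP's serialize() layout for a string list.
class PhpSerializeSketch {

    // Strings are written as s:<byte length>:"<value>";
    static String serializeString(String value) {
        int byteLength = value.getBytes(StandardCharsets.UTF_8).length;
        return "s:" + byteLength + ":\"" + value + "\";";
    }

    // Arrays are written as a:<count>:{<key><value>...}; the count comes
    // first, which is why the elements cannot simply be streamed between
    // an opening and a closing delimiter.
    static String serializeList(List<String> items) {
        StringBuilder out = new StringBuilder();
        out.append("a:").append(items.size()).append(":{");
        for (int i = 0; i < items.size(); i++) {
            out.append("i:").append(i).append(';'); // integer key
            out.append(serializeString(items.get(i)));
        }
        return out.append('}').toString();
    }
}
```

Serializing the two-element list ("a", "b") this way gives `a:2:{i:0;s:1:"a";i:1;s:1:"b";}`, which is what PHP's serialize() produces for the equivalent array.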