RE: problem adding new fields in DIH
Thanks for the explanation and bug report Robert!

-Original Message-
From: Robert Muir [mailto:rcm...@gmail.com]
Sent: Monday, July 09, 2012 3:18 PM
To: solr-user@lucene.apache.org
Subject: Re: problem adding new fields in DIH

Thanks again for reporting this Brent. I opened a JIRA issue: https://issues.apache.org/jira/browse/SOLR-3610

On Mon, Jul 9, 2012 at 3:36 PM, Brent Mills bmi...@uship.com wrote:

We're having an issue when we add or change a field in the db-data-config.xml and schema.xml files in solr. Basically whenever I add something new to index I add it to the database, then the data config, then add the field to the schema to index, reload the core, and do a full import. This has worked fine until we upgraded to an iteration of 4.0 (we are currently on 4.0 alpha). Now sometimes when we go through this process solr throws errors about the field not being found. The only way to fix this is to restart tomcat and everything immediately starts working fine again.

The interesting thing is that this is only a problem if the database is returning a value for that field and only in the documents that have a value. The field shows up in the schema browser in solr, it just has no data in it. If I completely remove it from the database but leave it in the schema and dataconfig files there is no issue. Also of note, this is happening on 2 different machines.

Here's the trace

SEVERE: Exception while solr commit.
java.lang.IllegalArgumentException: no such field test
    at org.apache.solr.core.DefaultCodecFactory$1.getPostingsFormatForField(DefaultCodecFactory.java:49)
    at org.apache.lucene.codecs.lucene40.Lucene40Codec$1.getPostingsFormatForField(Lucene40Codec.java:52)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.addField(PerFieldPostingsFormat.java:94)
    at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:335)
    at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
    at org.apache.lucene.index.TermsHash.flush(TermsHash.java:117)
    at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
    at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:82)
    at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:480)
    at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:422)
    at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:554)
    at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2547)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2683)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2663)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:414)
    at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:82)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:919)
    at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:154)
    at org.apache.solr.handler.dataimport.SolrWriter.commit(SolrWriter.java:107)
    at org.apache.solr.handler.dataimport.DocBuilder.finish(DocBuilder.java:304)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:256)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:333)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:399)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:380)

--
lucidimagination.com
Re: problem adding new fields in DIH
Hi Brent,

Ordinarily when you make a change to schema.xml, that should be accompanied by a core wipe and reindex. I think you may have been lucking out thus far.

Michael Della Bitta
Appinions, Inc. -- Where Influence Isn’t a Game.
http://www.appinions.com
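For reference, the reload-then-reindex sequence discussed here can be driven over HTTP. The sketch below only builds the two request URLs and sends nothing. The base URL, the core name `core0`, and the `/dataimport` handler path are assumptions for illustration (the handler path is whatever is registered in solrconfig.xml). `action=RELOAD` is the CoreAdmin reload command, and `clean=true` asks DIH to clear the index before a full import.

```java
import java.net.URI;

/** Builds the URLs for a core reload followed by a clean full import. */
public class ReindexUrls {

    /** CoreAdmin RELOAD request for the given core. */
    static URI reloadCore(String baseUrl, String core) {
        return URI.create(baseUrl + "/admin/cores?action=RELOAD&core=" + core);
    }

    /**
     * DIH full-import request. Assumes the handler is registered at
     * "/dataimport" for this core; clean=true wipes the index first.
     */
    static URI fullImport(String baseUrl, String core, boolean clean) {
        return URI.create(baseUrl + "/" + core
                + "/dataimport?command=full-import&clean=" + clean);
    }

    public static void main(String[] args) {
        // Hypothetical host, port, and core name.
        String base = "http://localhost:8080/solr";
        System.out.println(reloadCore(base, "core0"));
        System.out.println(fullImport(base, "core0", true));
    }
}
```

Issuing the first URL and then the second corresponds to the "reload the core, and do a full import" steps from the original report, with the index cleared as part of the import.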
Re: problem adding new fields in DIH
Hello,

This is because Solr's Codec implementation defers to the schema to determine how the field should be indexed. When a core is reloaded, the IndexWriter is not closed; the existing writer is kept around, so you are basically trying to index against the old version of the schema from before the reload.

I feel like we should fix this, but I only have two ideas:

1. Turn off per-field codec support by default, so that if you want to e.g. set a field to use MemoryPostingsFormat or Pulsing, you must explicitly enable a per-field codec configuration in solrconfig.xml. This would parallel how Similarity works, and is probably ok since this is pretty expert stuff. Then you would have no issues, but if someone wanted per-field codec support they would have to make the tradeoff that reloading a core still leaves them indexing with the old configuration.

2. Close and reopen the IndexWriter on core reloads.

--
lucidimagination.com
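To make the explanation concrete, here is a toy model of the stale-writer behaviour. It is only a sketch: `Schema`, `Writer`, and `Core` are hypothetical stand-ins written for this illustration, not Solr or Lucene classes, and the real lookup happens inside the codec rather than in the writer itself. The point it shows is that a writer which captured the schema at creation time keeps rejecting a field added by a later reload, unless the writer is reopened (idea 2).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Toy model only; none of these types are Solr or Lucene classes. */
public class StaleSchemaDemo {

    /** Hypothetical stand-in for a loaded schema.xml: a set of field names. */
    static final class Schema {
        private final Set<String> fields;
        Schema(String... names) { this.fields = new HashSet<>(Arrays.asList(names)); }
        boolean hasField(String name) { return fields.contains(name); }
    }

    /** Hypothetical writer that looks fields up in the schema it was created with. */
    static final class Writer {
        private final Schema schema; // captured once, never refreshed
        Writer(Schema schema) { this.schema = schema; }

        /** The lookup only happens when a document actually carries the field. */
        void flush(String fieldName) {
            if (!schema.hasField(fieldName)) {
                throw new IllegalArgumentException("no such field " + fieldName);
            }
        }
    }

    /** Hypothetical core holding the current schema and a long-lived writer. */
    static final class Core {
        Schema schema;
        Writer writer;
        Core(Schema schema) { this.schema = schema; this.writer = new Writer(schema); }

        /** Reload with a new schema, optionally reopening the writer. */
        void reload(Schema newSchema, boolean reopenWriter) {
            this.schema = newSchema;
            if (reopenWriter) {
                this.writer = new Writer(newSchema);
            }
        }
    }

    /** Returns "ok" if the new field can be flushed after a reload, else the error text. */
    static String indexNewFieldAfterReload(boolean reopenWriter) {
        Core core = new Core(new Schema("id"));
        core.reload(new Schema("id", "test"), reopenWriter);
        try {
            core.writer.flush("test");
            return "ok";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(indexNewFieldAfterReload(false)); // prints "no such field test"
        System.out.println(indexNewFieldAfterReload(true));  // prints "ok"
    }
}
```

In this model the error appears only when a document actually contains the new field, and building a fresh writer clears it. That matches the reported symptoms, where only documents with a value for the field failed and a Tomcat restart made the problem go away.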
Re: problem adding new fields in DIH
Thanks again for reporting this Brent. I opened a JIRA issue: https://issues.apache.org/jira/browse/SOLR-3610

--
lucidimagination.com