RE: problem adding new fields in DIH

2012-07-11 Thread Brent Mills
Thanks for the explanation and bug report, Robert!

-----Original Message-----
From: Robert Muir [mailto:rcm...@gmail.com] 
Sent: Monday, July 09, 2012 3:18 PM
To: solr-user@lucene.apache.org
Subject: Re: problem adding new fields in DIH

Thanks again for reporting this, Brent. I opened a JIRA issue:
https://issues.apache.org/jira/browse/SOLR-3610

On Mon, Jul 9, 2012 at 3:36 PM, Brent Mills bmi...@uship.com wrote:
 We're having an issue when we add or change a field in the db-data-config.xml 
 and schema.xml files in Solr.  Basically, whenever I add something new to 
 index, I add it to the database, then to the data config, then add the field 
 to the schema, reload the core, and do a full import.  This worked fine until 
 we upgraded to an iteration of 4.0 (we are currently on 4.0 alpha).  Now, 
 sometimes when we go through this process, Solr throws errors about the field 
 not being found.  The only fix is to restart Tomcat, after which everything 
 immediately works again.

 The interesting thing is that this is only a problem if the database returns 
 a value for that field, and only for the documents that have a value.  The 
 field shows up in the schema browser in Solr; it just has no data in it.  If 
 I completely remove it from the database but leave it in the schema and data 
 config files, there is no issue.  Also of note, this is happening on 2 
 different machines.

 Here's the trace

 SEVERE: Exception while solr commit.
 java.lang.IllegalArgumentException: no such field test
 at org.apache.solr.core.DefaultCodecFactory$1.getPostingsFormatForField(DefaultCodecFactory.java:49)
 at org.apache.lucene.codecs.lucene40.Lucene40Codec$1.getPostingsFormatForField(Lucene40Codec.java:52)
 at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.addField(PerFieldPostingsFormat.java:94)
 at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:335)
 at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
 at org.apache.lucene.index.TermsHash.flush(TermsHash.java:117)
 at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
 at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:82)
 at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:480)
 at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:422)
 at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:554)
 at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2547)
 at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2683)
 at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2663)
 at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:414)
 at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:82)
 at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
 at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:919)
 at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:154)
 at org.apache.solr.handler.dataimport.SolrWriter.commit(SolrWriter.java:107)
 at org.apache.solr.handler.dataimport.DocBuilder.finish(DocBuilder.java:304)
 at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:256)
 at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:333)
 at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:399)
 at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:380)




-- 
lucidimagination.com


Re: problem adding new fields in DIH

2012-07-09 Thread Michael Della Bitta
Hi Brent,

Ordinarily when you make a change to schema.xml, that should be
accompanied by a core wipe and reindex. I think you may have been
lucking out thus far.

Michael Della Bitta


Appinions, Inc. -- Where Influence Isn’t a Game.
http://www.appinions.com



Re: problem adding new fields in DIH

2012-07-09 Thread Robert Muir
Hello,

This happens because Solr's Codec implementation defers to the schema to
determine how each field should be indexed. When a core is reloaded,
the IndexWriter is not closed; the existing writer is kept around,
so you are effectively indexing against the old version of the schema
from before the reload.

I feel like we should fix this, but I only have two ideas:
1. Turn off per-field codec support by default, so that if you want to,
e.g., set a field to use MemoryPostingsFormat or Pulsing, you must
explicitly enable a per-field codec configuration in solrconfig.xml.
This would parallel how Similarity works, and is probably OK since
this is pretty expert stuff. Then you would have no issues, but anyone
who wanted per-field codec support would have to accept the tradeoff
that reloading a core still leaves them indexing with the old
configuration.
2. Close and reopen the IndexWriter on core reloads.
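
To illustrate the mechanism described above, here is a minimal Java sketch.
It is not Solr or Lucene code: the classes Schema, Writer, and
StaleSchemaSketch are hypothetical stand-ins, and the format name "Lucene40"
is used only as a placeholder value. It assumes only what is stated above,
namely that the writer keeps consulting the schema it was opened with.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleSchemaSketch {

    /** Hypothetical stand-in for a schema: maps field name to a postings format name. */
    static final class Schema {
        private final Map<String, String> fields = new HashMap<>();

        Schema(String... fieldNames) {
            for (String name : fieldNames) {
                fields.put(name, "Lucene40"); // placeholder format name
            }
        }

        String postingsFormatFor(String field) {
            String format = fields.get(field);
            if (format == null) {
                throw new IllegalArgumentException("no such field " + field);
            }
            return format;
        }
    }

    /** Hypothetical stand-in for a writer that captured a schema when it was opened. */
    static final class Writer {
        private final Schema schemaAtOpen;

        Writer(Schema schema) {
            this.schemaAtOpen = schema;
        }

        /** Looks up the field in the schema captured at open time, as a flush would. */
        String flush(String field) {
            return schemaAtOpen.postingsFormatFor(field);
        }
    }

    /** Runs a flush and reports the outcome as a string instead of throwing. */
    static String tryFlush(Writer writer, String field) {
        try {
            return "ok: " + writer.flush(field);
        } catch (IllegalArgumentException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        Schema beforeReload = new Schema("id", "title");
        Writer keptWriter = new Writer(beforeReload);

        // "Reload": a new schema containing the field "test" is loaded,
        // but the writer opened against the old schema is kept around.
        Schema afterReload = new Schema("id", "title", "test");

        // The kept writer still consults the old schema and fails.
        System.out.println(tryFlush(keptWriter, "test"));

        // Reopening the writer against the new schema (idea 2) succeeds.
        Writer reopenedWriter = new Writer(afterReload);
        System.out.println(tryFlush(reopenedWriter, "test"));
    }
}
```

In this toy model the first call reports "no such field test" while the
second succeeds, which mirrors why restarting the container makes the
problem go away: a fresh writer is opened against the current schema.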



-- 
lucidimagination.com


Re: problem adding new fields in DIH

2012-07-09 Thread Robert Muir
Thanks again for reporting this, Brent. I opened a JIRA issue:
https://issues.apache.org/jira/browse/SOLR-3610



-- 
lucidimagination.com