Any thoughts on this issue? Solr on HDFS generates an empty tlog when documents are added without a commit.

Thanks,
Tom

On Fri, Jul 18, 2014 at 12:21 PM, Tom Chen <tomchen1...@gmail.com> wrote:
> Hi,
>
> This seems to be a bug in Solr running on HDFS.
>
> Steps to reproduce:
>
> 1) Set up Solr to run on HDFS like this:
>
>    java -Dsolr.directoryFactory=HdfsDirectoryFactory \
>         -Dsolr.lock.type=hdfs \
>         -Dsolr.hdfs.home=hdfs://host:port/path
>
>    For the purpose of this test, turn off the default auto commit in
>    solrconfig.xml, i.e. comment out autoCommit like this:
>
>    <!--
>    <autoCommit>
>      <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>      <openSearcher>false</openSearcher>
>    </autoCommit>
>    -->
>
> 2) Add a document without a commit:
>
>    curl "http://localhost:8983/solr/collection1/update?commit=false" \
>         -H "Content-type:text/xml; charset=utf-8" --data-binary "@solr.xml"
>
> 3) Solr generates empty tlog files (file size 0; the last one ends with 6):
>
>    [hadoop@hdtest042 exampledocs]$ hadoop fs -ls /path/collection1/core_node1/data/tlog
>    Found 5 items
>    -rw-r--r-- 1 hadoop hadoop 667 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.0000000000000000001
>    -rw-r--r-- 1 hadoop hadoop  67 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.0000000000000000003
>    -rw-r--r-- 1 hadoop hadoop 667 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.0000000000000000004
>    -rw-r--r-- 1 hadoop hadoop   0 2014-07-18 09:02 /path/collection1/core_node1/data/tlog/tlog.0000000000000000005
>    -rw-r--r-- 1 hadoop hadoop   0 2014-07-18 09:02 /path/collection1/core_node1/data/tlog/tlog.0000000000000000006
>
> 4) Simulate a Solr crash by killing the process with the -9 option.
>
> 5) Restart the Solr process. The uncommitted documents are not replayed,
>    and the files in the tlog directory are cleaned up. Hence the
>    uncommitted document(s) are lost.
>
> Am I missing anything, or is this a bug?
>
> BTW, additional observations:
>
> a) If in step 4) Solr is stopped gracefully (i.e. without the -9 option),
>    a non-empty tlog file is generated, and after restarting Solr the
>    uncommitted document is replayed as expected.
>
> b) If Solr doesn't run on HDFS (i.e. it runs on the local file system),
>    this issue is not observed either.
>
> Thanks,
> Tom
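
For anyone trying to confirm the condition in step 3) before killing the process, here is a minimal sketch in Python that scans the text output of `hadoop fs -ls` for zero-length tlog files. The function name `find_empty_tlogs` and the `SAMPLE_LISTING` constant are hypothetical names used only for illustration; the sketch assumes the eight-column listing format shown in the quoted message and paths without spaces. It only inspects listing text and does not talk to HDFS or Solr.

```python
# Sample listing copied from the report above (step 3).
SAMPLE_LISTING = """\
Found 5 items
-rw-r--r-- 1 hadoop hadoop 667 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.0000000000000000001
-rw-r--r-- 1 hadoop hadoop 67 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.0000000000000000003
-rw-r--r-- 1 hadoop hadoop 667 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.0000000000000000004
-rw-r--r-- 1 hadoop hadoop 0 2014-07-18 09:02 /path/collection1/core_node1/data/tlog/tlog.0000000000000000005
-rw-r--r-- 1 hadoop hadoop 0 2014-07-18 09:02 /path/collection1/core_node1/data/tlog/tlog.0000000000000000006
"""


def find_empty_tlogs(listing):
    """Return the paths of zero-length tlog files found in `hadoop fs -ls` output.

    Assumed column layout (as in the report):
    permissions, replication, owner, group, size, date, time, path.
    """
    empty = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip the "Found N items" header and anything that is not a file entry.
        if len(fields) < 8 or not fields[4].isdigit():
            continue
        size = int(fields[4])
        path = fields[7]
        file_name = path.rsplit("/", 1)[-1]
        if size == 0 and file_name.startswith("tlog."):
            empty.append(path)
    return empty


for path in find_empty_tlogs(SAMPLE_LISTING):
    print("empty tlog:", path)
```

To use it against a live cluster, the output of the `hadoop fs -ls` command from step 3) could be captured and passed to the function in place of `SAMPLE_LISTING`.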