Thanks! So Solr 4.7 does not seem to respect luceneMatchVersion at the binary (index) level. Or perhaps I misunderstand the meaning of luceneMatchVersion.
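For reference, here is a minimal sketch of how the luceneMatchVersion value could be pulled out of a solrconfig.xml and normalized to a constant name such as LUCENE_43. The class and method names (MatchVersionReader, readMatchVersion, toVersionConstantName) are hypothetical and not part of Solr or the map-reduce contrib. It only illustrates reading the setting and says nothing about which codec the segments are then written with.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Solr: reads <luceneMatchVersion> from a config string.
public class MatchVersionReader {

    /** Returns the text of the first <luceneMatchVersion> element, or null if there is none. */
    public static String readMatchVersion(String solrconfigXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            solrconfigXml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("luceneMatchVersion");
            return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse solrconfig.xml content", e);
        }
    }

    /** Normalizes "4.3" to "LUCENE_43"; a value such as "LUCENE_43" is only upper-cased. */
    public static String toVersionConstantName(String matchVersion) {
        String upper = matchVersion.toUpperCase(Locale.ROOT);
        return upper.replaceFirst("^(\\d)\\.(\\d)$", "LUCENE_$1$2");
    }

    public static void main(String[] args) {
        // Assumed minimal config fragment, for illustration only.
        String xml = "<config><luceneMatchVersion>4.3</luceneMatchVersion></config>";
        String name = toVersionConstantName(readMatchVersion(xml));
        System.out.println(name);
        // With Lucene 4.x on the classpath, this name could be resolved to a Version
        // constant and passed to IndexWriterConfig instead of a hard-coded value.
    }
}
```

Both spellings ("4.3" and "LUCENE_43") are handled because either may appear in a config file. Whether the resulting version affects the on-disk format is a separate question.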
This is what I see when loading index from hdfs via luke and launching the Index Checker tool:

[clip]
Segments file=segments_2 numSegments=1 version=4.7 format= userData={commitTimeMSec=1397157712399}
  1 of 1: name=_0 docCount=82
    codec=Lucene46
    compound=false
    numFiles=10
    size (MB)=0.027
    diagnostics = {timestamp=1397157712512, os=Linux, os.version=3.2.0-61-generic, source=flush, lucene.version=4.7.0 1570806 - simon - 2014-02-22 08:25:23, os.arch=amd64, java.version=1.7.0_51, java.vendor=Oracle Corporation}
    no deletions
    test: open reader.........OK
    test: fields..............OK [11 fields]
    test: field norms.........OK [0 fields]
    test: terms, freq, prox...OK [1161 terms; 2949 terms/docs pairs; 2768 tokens]
    test: stored fields.......OK [902 total field count; avg 11 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [1 docvalues fields; 0 BINARY; 0 NUMERIC; 1 SORTED; 0 SORTED_SET]

No problems were detected with this index.
[/clip]

I wonder whether there is any possibility of defining the version of the codec in solr config/schema.

Dmitry

On Thu, Apr 10, 2014 at 11:58 PM, Wolfgang Hoschek <whosc...@cloudera.com> wrote:

> There's no such other location in there. BTW, you can disable the mtree
> merge via --reducers=-2 (or --reducers=0 in old versions).
>
> Wolfgang.
>
> On Apr 10, 2014, at 3:44 PM, Dmitry Kan <solrexp...@gmail.com> wrote:
>
> > a correction: actually when I tested the above change I had so little data,
> > that it didn't trigger sub-shard slicing and thus merging of the slices.
> > Still, looks as if somewhere in the map-reduce contrib code there is a
> > "link" to what lucene version to use.
> >
> > Wolfgang, do you happen to know where that other Version.* is specified?
> >
> >
> > On Thu, Apr 10, 2014 at 12:59 PM, Dmitry Kan <solrexp...@gmail.com> wrote:
> >
> >> Thanks for responding, Wolfgang.
> >>
> >> Changing to LUCENE_43:
> >>
> >> IndexWriterConfig writerConfig = new IndexWriterConfig(Version.LUCENE_43, null);
> >>
> >> didn't affect the index format version, because, I believe, if the
> >> format of the index to merge is of a higher version (4.1 in this case),
> >> it will merge to the same and not a lower version (4.0). But the format
> >> version certainly could be read from the solrconfig, you are right.
> >>
> >> Dmitry
> >>
> >>
> >> On Wed, Apr 9, 2014 at 11:51 PM, Wolfgang Hoschek <whosc...@cloudera.com> wrote:
> >>
> >>> There is a current limitation in that the code doesn't actually look into
> >>> solrconfig.xml for the version. We should fix this, indeed. See
> >>>
> >>> https://github.com/apache/lucene-solr/blob/trunk/solr/contrib/map-reduce/src/java/org/apache/solr/hadoop/TreeMergeOutputFormat.java#L100-101
> >>>
> >>> Wolfgang.
> >>>
> >>> On Apr 8, 2014, at 11:49 AM, Dmitry Kan <solrexp...@gmail.com> wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> When we instantiate the MapReduceIndexerTool with the collections' conf
> >>>> directory, we expect that the Lucene version is respected and the index
> >>>> gets generated in a format compatible with the defined version.
> >>>>
> >>>> This does not seem to happen, however.
> >>>>
> >>>> Checking with luke:
> >>>>
> >>>> the expected Lucene index format: Lucene 4.0
> >>>> the output Lucene index format: Lucene 4.1
> >>>>
> >>>> Can anybody shed some light on the semantics behind specifying the Lucene
> >>>> version in this context? Does this have something to do with what version
> >>>> of solr core is used by the morphline library?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Dmitry
> >>>>
> >>>> ---------- Forwarded message ----------
> >>>>
> >>>> Dear list,
> >>>>
> >>>> We have been generating solr indices with the solr-hadoop contrib module
> >>>> (SOLR-1301). Our current solr in use is of 4.3.1 version.
> >>>> Is there any tool
> >>>> that could do the backward conversion, i.e. 4.7->4.3.1? Or is the upgrade
> >>>> the only way to go?
> >>>>
> >>>> --
> >>>> Dmitry
> >>>> Blog: http://dmitrykan.blogspot.com
> >>>> Twitter: http://twitter.com/dmitrykan

--
Dmitry
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan