[ 
https://issues.apache.org/jira/browse/LUCENE-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076921#comment-14076921
 ] 

Uwe Schindler commented on LUCENE-5850:
---------------------------------------

Hi,

thanks for all the work done already. I am on vacation at the moment, so sorry 
for not responding in time :-) I was talking to Robert and Simon today via 
Google Hangouts (my internet was slow during the day). This issue contains 
several issues, which should be handled separately.

In fact we have actually 2 problems:
- The documentation of the LUCENE_MAIN_VERSION constant is inconsistent with 
its use. Originally this was meant to be the "dot version only", because we 
decided to never change the index format in bugfix versions. Because of this 
the suggested format was "x.y" (and the special case for alpha/beta introduced 
by Robert). Unfortunately in some releases, the RM just search/replaced all 
version numbers and we suddenly got minor versions there that were written into 
the index. This is not really bad at all, although its stupid for file format 
versions (because it did not change). I listened to Simon and Robert - both 
said that we should write the real version into the index file! We do this 
already, but only in the commit metadata, not per segment. We write the full 
version string there, including the RM's name and so on. So we record this 
metadata in the index. But we cannot get the minor version for each segment. So 
we have to decide if we write the bugfix version number to index file 
(x.y.z.a.b.c... as many as you like). If we do this, we need the validation of 
LUCENE_MAIN_VERSION (smoker and -> see below).
- Because of the documentation and missing knowledge, Shai committed a change 
to DocValues that uses Version.parseLeniently() to check the index version. 
Shai already added some catch blocks to work around issues (he mentioned them 
in the comment....!!!), instead of using the officially written version 
comparator in StringUtils. This was a major bug, because in Version 4.9.1, the 
shit would be smoking :-) I am glad that Robert fixed it! Many thanks, you are 
the best!!!

The documentation of the Version class is perfectly fine, maybe we should add a 
not, that this class and parser is only for parsing versions that are 
user-input from config files. Also the Version class is only for passing a hint 
to the API, which behaviour you want to have. The Version class is not for file 
formats and should never serialized or used to switch inside codecs depending 
on file versions. We can fix two things:
- Add StringUtils to Javadocs ({{@see}} with hint) of Version class
- Maybe add a "hack to Version.parseLeniently to strip everything after the 
second dot (it's just a change of regex). By this users can enter 4.9.1, but it 
will return LUCENE_4_9.

I am happy that Mike added a Smoke tester check, because the test in our code 
base was not good enough. But we should fix the version test correctly. Now I 
refer to Robert's complaint about the system property: The system property in 
the test runner was added to prevent the bug we have. This one passes the 
{{-Dversion=...}} constant down to tests, so we can validate 
LUCENE_MAIN_VERSION. Unfortunately the "version" sysprop also contains bullshit 
after the main version, so we just do a "startsWith" check in the test. And 
this is the bug. If LUCENE_MAIN_VERSION=="4.8" and we pass 
{{-Dversion=4.8.1-foobar}} the test passes, because the sysprop starts with the 
MAIN_VERSION.

I think the perfect fix can be done since I fixed common-build to have the main 
version and the appendix separately. "version" is constrcuted in common-build 
from a prefix and suffix:

{code:xml}
  <property name="dev.version.base" value="5.0"/>
  <property name="dev.version.suffix" value="SNAPSHOT"/>
  <property name="dev.version" 
value="${dev.version.base}-${dev.version.suffix}"/>
  <property name="version" value="${dev.version}"/>
{code}

In fact we should change this and pass "dev.version.base" to the test runner 
and do an "equals" check in the testcase. By that we guarantee that 
common-build's version is identical to the LUCENE_MAIN_VERSION.

> Constants#LUCENE_MAIN_VERSION can have broken values 
> -----------------------------------------------------
>
>                 Key: LUCENE-5850
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5850
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: general/build
>    Affects Versions: 4.3.1, 4.5.1
>            Reporter: Simon Willnauer
>             Fix For: 5.0, 4.10
>
>         Attachments: LUCENE-5850.patch, LUCENE-5850_bomb.patch, 
> LUCENE-5850_smoketester.patch
>
>
> Constants#LUCENE_MAIN_VERSION is set to the Lucene Main version and should 
> not contain minor versions. Well this is at least what I thought and to my 
> knowledge what the comments say too. Yet in for instance 4.3.1 and 4.5.1 we 
> broke this such that the version from SegmentsInfo can not be parsed with 
> Version#parseLeniently. IMO we should really add an assertion that this 
> constant doesn't throw an error and / or make the smoketester catch this. to 
> me this is actually a index BWC break. Note that 4.8.1 doesn't have this 
> problem...



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to