msokolov commented on a change in pull request #2395:
URL: https://github.com/apache/lucene-solr/pull/2395#discussion_r578690516



##########
File path: lucene/backward-codecs/README.md
##########
@@ -0,0 +1,54 @@
+# Index backwards compatibility
+
+This README describes the approach to maintaining compatibility with indices
+from previous versions and gives guidelines for making format changes.
+
+## Compatibility strategy
+
+Codecs and file formats are versioned according to the minor version in which
+they were created. For example Lucene87Codec represents the codec used for
+creating Lucene 8.7 indices, and potentially later index versions too. Each
+segment records the codec version that was used to write it.
+
+Lucene supports the ability to read segments created in older versions by
+maintaining old codec classes. These older codecs live in the backwards-codecs
+package along with their file formats. When making a change to a file format,
+we create a fresh copies of the codec and format, and move the existing ones
+into backwards-codecs.
+
+Older codecs are tested in two ways:
+* Through unit tests like TestLucene80NormsFormat, which checks we can write
+then read data using each old format
+* Through TestBackwardsCompatibility, which loads indices created in previous
+versions and checks that we can search them
+
+## Making index format changes
+
+As an example, let's say we're making a change to the norms file format, and
+the current class in core is Lucene80NormsFormat. We'd perform the following
+steps:
+
+1. Create a new format with the target version for the changes, for example
+Lucene90NormsFormat. This includes creating copies of its writer and reader
+classes, as well as any helper classes. Make sure to copy unit tests too, like
+TestLucene80NormsFormat.
+2. Move the old Lucene80NormsFormat, along with its writer, reader, tests, and
+helper classes to the backwards-codecs package. If the format will only be
+used for reading, then delete the write-side logic and move it to a test-only
+class like Lucene80RWNormsFormat to support unit tests. Note that most formats
+only need read logic, but a small set including DocValuesFormat and
+FieldInfosFormat will need to retain write logic since can be used to update

Review comment:
       "since they can be used"

##########
File path: lucene/backward-codecs/README.md
##########
@@ -0,0 +1,54 @@
+# Index backwards compatibility
+
+This README describes the approach to maintaining compatibility with indices
+from previous versions and gives guidelines for making format changes.
+
+## Compatibility strategy
+
+Codecs and file formats are versioned according to the minor version in which
+they were created. For example Lucene87Codec represents the codec used for
+creating Lucene 8.7 indices, and potentially later index versions too. Each
+segment records the codec version that was used to write it.
+
+Lucene supports the ability to read segments created in older versions by
+maintaining old codec classes. These older codecs live in the backwards-codecs
+package along with their file formats. When making a change to a file format,
+we create a fresh copies of the codec and format, and move the existing ones

Review comment:
       "we create fresh copies" (strike "a")




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to