Copilot commented on code in PR #4054:
URL: https://github.com/apache/solr/pull/4054#discussion_r3252846433


##########
dev-tools/scripts/smokeTestRelease.py:
##########
@@ -390,6 +394,61 @@ def testChangelogMd(dir, version):
   if 'v%s' % version not in content and version not in content:
     raise RuntimeError('Version %s not found in CHANGELOG.md' % version)
 
+
+def testChangelogFolder(dir, version):
+  "Checks changelog folder structure."
+  changelog_folder = os.path.join(dir, 'changelog')
+
+  if not os.path.exists(changelog_folder):
+    raise RuntimeError('changelog folder not found at %s' % changelog_folder)
+
+  print('  check changelog folder structure...')
+
+  # Check that 'unreleased' folder exists and is empty
+  unreleased_folder = os.path.join(changelog_folder, 'unreleased')
+  if not os.path.exists(unreleased_folder):
+    raise RuntimeError('changelog/unreleased folder not found')
+
+  unreleased_contents = os.listdir(unreleased_folder)
+  # Filter out hidden files like .gitkeep
+  unreleased_contents = [f for f in unreleased_contents if not 
f.startswith('.')]
+  if len(unreleased_contents) > 0:
+    raise RuntimeError('changelog/unreleased folder is not empty, contains: 
%s' % unreleased_contents)
+
+  # Pattern to match version folders (e.g., v9.5.0, v10.0.0 or v10.1.0-beta1)
+  version_pattern = re.compile(r'^v\d+\.\d+\.\d+(-\w+)?$')

Review Comment:
   The version-folder regex `^v\d+\.\d+\.\d+(-\w+)?$` only allows a single 
`\w+` suffix segment (e.g. `v10.1.0-beta1`). Common release qualifiers used 
elsewhere in this script include forms like `-alpha`, `-beta`, and `RC\d+` (see 
`reVersion` on line 536), and pre-release tags such as `v10.1.0-beta-1` (with 
an internal dash) would not match. If such a folder ever appears, the regex 
would silently treat it as a "non-version folder" and skip the 
`release-date.txt` validation entirely instead of erroring. Consider aligning 
this pattern with the version parsing used elsewhere or making it more 
permissive while still requiring `release-date.txt` for all `v*` folders other 
than the one being released.



##########
dev-tools/scripts/smokeTestRelease.py:
##########
@@ -1112,6 +1208,9 @@ def parse_config():
 def main():
   c = parse_config()
 
+  global SKIP_CHANGELOG
+  SKIP_CHANGELOG = c.skip_changelog

Review Comment:
   Module-level state is mutated via `global SKIP_CHANGELOG` from `main()` to 
communicate the CLI flag to `checkChangesContent`/`testChangelogFolder`. This 
makes the functions harder to test in isolation and couples them to CLI 
parsing. Threading the flag through function parameters (e.g. adding a 
`skip_changelog` argument to `checkChangesContent`, `testChanges`, 
`verifyUnpacked`, and `testChangelogMd`) would be cleaner and consistent with 
how other configuration values flow through this script via `c`/`testArgs`.



##########
dev-tools/scripts/smokeTestRelease.py:
##########
@@ -410,7 +501,7 @@ def checkChangesContent(s, version, name, isHTML):
   if m is not None:
     raise RuntimeError('incorrect issue (_ instead of -) in %s: %s' % (name, 
m.group(1)))
 
-  if s.lower().find('not yet released') != -1:
+  if not SKIP_CHANGELOG and s.lower().find('not yet released') != -1:
     raise RuntimeError('saw "not yet released" in %s' % name)

Review Comment:
   The `--skip-changelog` flag now also suppresses pre-existing checks that 
previously always ran: the `'Release %s'` substring check (line 460) and the 
`'not yet released'` substring check (line 504). These checks are unrelated to 
the new logchange-generated h2-header validation and are useful safety nets 
even when the new logchange tooling has not been run. Consider gating only the 
newly added h2-header validation behind `SKIP_CHANGELOG`, and leaving the 
existing assertions intact so passing `--skip-changelog` does not 
unintentionally weaken existing release verification.



##########
dev-tools/scripts/smokeTestRelease.py:
##########
@@ -390,6 +394,61 @@ def testChangelogMd(dir, version):
   if 'v%s' % version not in content and version not in content:
     raise RuntimeError('Version %s not found in CHANGELOG.md' % version)
 
+
+def testChangelogFolder(dir, version):
+  "Checks changelog folder structure."
+  changelog_folder = os.path.join(dir, 'changelog')
+
+  if not os.path.exists(changelog_folder):
+    raise RuntimeError('changelog folder not found at %s' % changelog_folder)
+
+  print('  check changelog folder structure...')
+
+  # Check that 'unreleased' folder exists and is empty
+  unreleased_folder = os.path.join(changelog_folder, 'unreleased')
+  if not os.path.exists(unreleased_folder):
+    raise RuntimeError('changelog/unreleased folder not found')
+
+  unreleased_contents = os.listdir(unreleased_folder)
+  # Filter out hidden files like .gitkeep
+  unreleased_contents = [f for f in unreleased_contents if not 
f.startswith('.')]
+  if len(unreleased_contents) > 0:
+    raise RuntimeError('changelog/unreleased folder is not empty, contains: 
%s' % unreleased_contents)
+
+  # Pattern to match version folders (e.g., v9.5.0, v10.0.0 or v10.1.0-beta1)
+  version_pattern = re.compile(r'^v\d+\.\d+\.\d+(-\w+)?$')
+
+  # Check all subdirectories in changelog
+  for entry in os.listdir(changelog_folder):
+    entry_path = os.path.join(changelog_folder, entry)
+
+    # Skip if not a directory
+    if not os.path.isdir(entry_path):
+      continue
+
+    # Skip the unreleased folder (already checked)
+    if entry == 'unreleased':
+      continue
+
+    release_date_file = os.path.join(entry_path, 'release-date.txt')
+
+    # Check if this is a version folder (vX.Y.Z format)
+    if version_pattern.match(entry):
+      # Version folders for the current release should NOT have 
release-date.txt
+      # Only check if this is the version being released
+      if entry == 'v%s' % version:
+        if os.path.exists(release_date_file):
+          raise RuntimeError('changelog/%s folder should not contain 
release-date.txt (version not yet released)' % entry)
+      else:
+        # Other version folders (past releases) should have release-date.txt
+        if not os.path.exists(release_date_file):
+          raise RuntimeError('changelog/%s folder is missing release-date.txt' 
% entry)
+    else:
+      # Non-version folders (e.g., documentation or support directories) are 
allowed
+      # and are not required to contain a release-date.txt file.
+      # They are intentionally ignored by this check.
+      pass

Review Comment:
   The PR description states that the smoke tester should verify "folder 
`changelog/v{releaseVer}` exists and does not contain a `release-date.txt`". 
However, this implementation only checks the "does not contain 
release-date.txt" part — it iterates over existing subdirectories in 
`changelog/` and only triggers the check if a folder named `v{version}` happens 
to be present. If the `changelog/v{version}` folder is missing entirely (which 
would mean `logchange release` was never run for this release), the check 
silently passes. Consider explicitly verifying that `changelog/v{version}` 
exists before walking subdirectories.



##########
dev-tools/scripts/smokeTestRelease.py:
##########
@@ -398,9 +457,41 @@ def testChangelogMd(dir, version):
 def checkChangesContent(s, version, name, isHTML):
   currentVersionTuple = versionToTuple(version, name)
 
-  if isHTML and s.find('Release %s' % version) == -1:
+  if isHTML and not SKIP_CHANGELOG and s.find('Release %s' % version) == -1:
     raise RuntimeError('did not see "Release %s" in %s' % (version, name))
 
+  # Validate h2 header structure for Changes.html (requires logchange generate 
to have been run)
+  if isHTML and not SKIP_CHANGELOG:
+    print('    validate h2 header structure...')
+    h2_pattern = 
re.compile(r'<h2[^>]*>\s*<a[^>]*>(Release\s+[^<]+)</a>\s*</h2>', re.IGNORECASE 
| re.DOTALL)
+    headers = h2_pattern.findall(s)
+
+    if len(headers) < 2:
+      raise RuntimeError('Expected at least 2 release h2 headers in %s, found 
%d' % (name, len(headers)))
+
+    # First header should be "Release {version}" without a date
+    first_header = headers[0].strip()
+    expected_first = 'Release %s' % version
+
+    # Check if first header matches version and does NOT have a date pattern 
[yyyy-mm-dd]
+    if not first_header.startswith(expected_first):
+      raise RuntimeError('First h2 header should start with "%s", but got "%s" 
in %s' % (expected_first, first_header, name))
+
+    date_pattern = re.compile(r'\[\d{4}-\d{2}-\d{2}\]')
+    if date_pattern.search(first_header):
+      raise RuntimeError('First h2 header for "%s" should NOT contain a 
release date, but got "%s" in %s' % (version, first_header, name))
+
+    # Second header should be "Release {version} [yyyy-mm-dd]" with a date
+    second_header = headers[1].strip()
+    if not second_header.startswith('Release '):
+      raise RuntimeError('Second h2 header should start with "Release", but 
got "%s" in %s' % (second_header, name))
+
+    if not date_pattern.search(second_header):
+      raise RuntimeError('Second h2 header should contain a release date 
[yyyy-mm-dd], but got "%s" in %s' % (second_header, name))
+
+    print('      - First h2: "%s" (no date)' % first_header)
+    print('      - Second h2: "%s" (with date)' % second_header)

Review Comment:
   The PR description states one of the goals is to "validate that generated 
`Changes.txt` does not list a release-date for current release". However, the 
new h2-header structure validation in `checkChangesContent` is gated on 
`isHTML`, so it only runs for `Changes.html` and never for `Changes.txt`. 
Either extend the check to cover the plain-text `Changes.txt` format as well, 
or update the PR description to reflect that only `Changes.html` is validated.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to