Hi Nazir/Andres,

On Tue, Jun 2, 2026 at 12:13 PM Jakub Wartak <[email protected]> wrote:
>
> Hi Andres/Nazir,
> [..]
> Continuing on previous story...:
> Windows was still @ 31mins, and whatever I've tried was not helping it
> (but I cannot measure inside GHA Runner what was happening, so those were
> blind shots with fstweaks, etc). One important thing, although I failed altering
> CacheIsPowerProtected (avoid flushing the write cache) as it seems impossible
> for me to do so on D:\ (as paging file is there and altering it also
> requires reboot), at least we know stuff is way slower than it could be on
> those runners:
>
> "Get-PhysicalDisk | Get-StorageAdvancedProperty" reported:
>
> FriendlyName      SerialNumber IsPowerProtected IsDeviceCacheEnabled
> ------------      ------------ ---------------- --------------------
> Msft Virtual Disk              False            False
> Msft Virtual Disk              False            False
>
> Perhaps there's a way to use some custom image/templ with different settings,
> especially for D:\, after all it's just volatile stuff. Thoughts? (not that I
> care that much for Win, but waiting half hour for it to finish every time is
> not going to be nice...)
> [..]

OK, so to close the loop: can no-write-flushing (and ReFS) help us here?
I've made it work, but the resulting configuration is just slower (looking
only at the "Test run" step) by +2 mins (26 vs 28 mins) :(

Longer:
* This is Windows Server 2022, so ReFS (MS next-gen fs) is available.
  Technically robocopy should do CoW (for our initdb clones out there).
* D:\ cannot be reformatted from NTFS to ReFS, mainly due to the active
  pagefile, and the GitHub agent places files there too.
* But (!) one can make a loop-image on D:\ with ReFS (sic!)
* And disable write-cache flushing with some hacks (usually used with
  RAID cards with BBU)

And I've bumped TEST_JOBS 4->8 (even with 4 VCPUs), because my local runs
showed in taskmgr that after quite some time we ended up using just ~40%
CPU (also with 4 VCPUs) while not doing I/O (this is somewhat contrary to
what Andres was stating earlier). I cannot find a way to add observability
of CPU usage on the GHA runner, so I'm just going to leave it at that (but
before anybody wishes to add more CPU, it would actually help to know
whether such a workload on GHA is really bound on CPU or on I/O).

So it appears that without going into the dragon's den (I mean deeply
analyzing our tests, especially subscription and recovery), we won't gain
much in such a setup. Patch attached if anybody wants to experiment more.

-J.

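P.S. On the missing CPU/I/O observability mentioned above: one untested idea
is to start a detached typeperf (it ships with Windows) early in the Windows
job and pick up its CSV afterwards. This is only a sketch under assumptions:
the step name and the D:\a\perf.csv path are made up for illustration, and
whether a process started this way keeps running across later steps on GHA
runners is not verified.

```yaml
      # Hypothetical extra step, placed before "Test world" in the Windows job.
      - name: Start perf counter sampling
        shell: powershell
        run: |
          # typeperf options: -si = sample interval in seconds, -f = output
          # format, -o = output file, -y = answer yes to overwrite prompts.
          # Counter paths contain spaces, so each one carries its own quotes.
          Start-Process -FilePath typeperf.exe -WindowStyle Hidden -ArgumentList @(
            '"\Processor(_Total)\% Processor Time"',
            '"\PhysicalDisk(_Total)\% Idle Time"',
            '"\PhysicalDisk(_Total)\Avg. Disk Queue Length"',
            '-si', '5', '-f', 'CSV', '-y', '-o', 'D:\a\perf.csv'
          )
```

If that works, D:/a/perf.csv would still have to be added to the log upload
paths, and the CSV should then show whether the test phase sits at low CPU
with an idle disk, as my local taskmgr observation suggested.
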
From b12be7baf025287752b365cb59861f8d54fe2c0a Mon Sep 17 00:00:00 2001
From: Jakub Wartak <[email protected]>
Date: Tue, 2 Jun 2026 11:43:56 +0200
Subject: [PATCH v1] Try ReFS

ci-os-only: windows
---
 .github/workflows/postgresql-ci.yml | 52 ++++++++++++++++++++++++-----
 1 file changed, 44 insertions(+), 8 deletions(-)

diff --git a/.github/workflows/postgresql-ci.yml b/.github/workflows/postgresql-ci.yml
index e2795ca0ffb..971fb9a705b 100644
--- a/.github/workflows/postgresql-ci.yml
+++ b/.github/workflows/postgresql-ci.yml
@@ -28,7 +28,7 @@ env:
 
   # It's possible that some jobs benefit from an increased test concurrency,
   # but a default of 4 is a safe bet. Individual jobs can override.
-  TEST_JOBS: 4
+  TEST_JOBS: 8
 
   CCACHE_MAXSIZE: "250M"
   CCACHE_DIR: ${{ github.workspace }}/ccache_dir
@@ -45,6 +45,7 @@ env:
 
   # Can be set to a non-empty value to run a limited set of tests
   # (e.g. --suite regress to only run the main regression tests).
+# MTEST_TARGET: --suite regress --suite postgresql:recovery
   MTEST_TARGET:
 
   PGCTLTIMEOUT: 120 # avoids spurious failures during parallel tests
@@ -134,6 +135,9 @@ jobs:
       - &nix_sysinfo_step
         name: sysinfo
         run: |
+          mount
+          lsblk -O
+          ps auxww
           id
           uname -a
           ulimit -a -H && ulimit -a -S
@@ -307,10 +311,11 @@ jobs:
         with:
           name: logs-${{ github.job }}-${{ github.run_id }}-${{ github.run_attempt }}
           path: |
-            **/*.log
-            **/*.diffs
-            **/regress_log_*
-            **/crashlog-*.txt
+            # avoids R:/System Volume Information (EINVAL)
+            R:/build/**/*.log
+            R:/build/*/*.diffs
+            R:/build/**/regress_log_*
+            R:/build/**/crashlog-*.txt
           if-no-files-found: ignore
 
 
@@ -683,8 +688,35 @@ jobs:
         name: Disable Windows Defender
         shell: powershell
         run: |
+          $diskpartScript = @"
+          create vdisk file="D:\a\refs.vhd" maximum=16000 type=expandable
+          attach vdisk
+          create partition primary
+          format fs=refs quick
+          assign letter=R
+          "@
+          $diskpartScript | diskpart
+          Get-Volume -DriveLetter R
+
+          $DiskNumber = (Get-Partition -DriveLetter R).DiskNumber
+          $PnpId = (Get-CimInstance Win32_DiskDrive | Where-Object { $_.DeviceID -match "PhysicalDrive$DiskNumber" }).PNPDeviceID
+          $RegPath = "HKLM:\SYSTEM\CurrentControlSet\Enum\$PnpId\Device Parameters\Disk"
+          if (-not (Test-Path $RegPath)) {
+            New-Item -Path $RegPath -Force | Out-Null
+          }
+
+          # 4. Turn off write-cache buffer flushing (CacheAttributes = 1 tells Windows to ignore OS flush requests)
+          Set-ItemProperty -Path $RegPath -Name "CacheAttributes" -Value 1 -Type DWord
+          Set-ItemProperty -Path $RegPath -Name "WriteCacheSetting" -Value 1 -Type DWord
+
+          Set-Disk -Number $DiskNumber -IsOffline $true
+          Set-Disk -Number $DiskNumber -IsOffline $false
+          Write-Host "Success: Force-flushing disabled for Drive R: (Disk $DiskNumber)" -ForegroundColor Green
+          Get-PhysicalDisk | Where-Object { $_.DeviceID -eq $DiskNumber } | Get-StorageAdvancedProperty
+
           Set-MpPreference -DisableRealtimeMonitoring $true -SubmitSamplesConsent NeverSend -MAPSReporting Disable
           # Verify Defender status
+          Get-PhysicalDisk | Get-StorageAdvancedProperty
           $status = Get-MpComputerStatus -ErrorAction SilentlyContinue
           if ($status) {
             Write-Host "RealTimeProtectionEnabled: $($status.RealTimeProtectionEnabled)"
@@ -719,6 +751,9 @@ jobs:
         run: |
           icacls "${{ github.workspace }}" /grant "${env:USERNAME}:(OI)(CI)F" /Q | Out-Null
           Write-Host "Granted Full Control to $env:USERNAME on ${{ github.workspace }}"
+          mkdir R:\build
+          icacls "R:\build" /grant "${env:USERNAME}:(OI)(CI)F" /Q | Out-Null
+          Write-Host "Granted Full Control to $env:USERNAME on R:\build"
 
           # postgres' plpython3u loads python3.dll (the stable-ABI forwarder)
           # which in turn loads whichever python3NN.dll the Windows loader finds
@@ -792,18 +827,19 @@ jobs:
             -Db_pch=true ^
             -Dextra_lib_dirs=d:\openssl\1.1\lib -Dextra_include_dirs=d:\openssl\1.1\include ^
             -DTAR=${{env.TAR}} ^
-            build
+            R:/build
 
       - name: Build
         run: |
           call "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" x64
-          ninja -C build ${{env.MBUILD_TARGET}}
-          ninja -C build -t missingdeps
+          ninja -C R:/build ${{env.MBUILD_TARGET}}
+          ninja -C R:/build -t missingdeps
 
       - name: Test world
         env:
           ADDITIONAL_SETUP: |
             call "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" x64
+            R:
         run: *meson_test_world_cmd
 
         # FIX: We need to collect crashlogs but they are not collected. cdb.exe
-- 
2.43.0
