On Tue, Jun 15, 2021 at 04:24:08PM +0200, Julien Pivotto wrote:
> Hello,
> 
> I am a Prometheus maintainer and we have received a bug regarding
> Prometheus - prometheus would no longer work on OpenBSD since we
> introduced MMAP:
> 
> https://github.com/prometheus/prometheus/issues/8877
> https://github.com/prometheus/prometheus/issues/8799
> 
> I would like to know if the facts here are accurate and, on the
> opposite, if there are happy openbsd users of Prometheus 2.19+.
> 
> I see that Prometheus 2.24 is packaged upstream, so I guess there are
> users. Can you please interact with us so we can better understand the
> situation at play.
> 

Unlike other operating systems, OpenBSD does not automatically keep the
mmap-ed memory of a file in sync with write() calls to the same file
(OpenBSD has no unified buffer cache). It requires the use of msync(2) to
make sure that mappings are properly updated.
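
To make the pattern concrete, here is a minimal sketch in Go of writing
through a file descriptor while the same file is mapped, then calling
msync before reading the mapping. It is an illustration only, not code
from Prometheus: the names msync and writeThenRead are made up, the raw
syscall is used because the standard syscall package has no Msync
wrapper, and MS_ASYNC is chosen only because it is the flag used in the
patch further down.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
	"unsafe"
)

// msync is a hypothetical helper wrapping the msync(2) system call for a
// mapped byte slice. golang.org/x/sys/unix offers unix.Msync for the same job.
func msync(b []byte, flags int) error {
	if len(b) == 0 {
		return nil
	}
	_, _, errno := syscall.Syscall(syscall.SYS_MSYNC,
		uintptr(unsafe.Pointer(&b[0])), uintptr(len(b)), uintptr(flags))
	if errno != 0 {
		return errno
	}
	return nil
}

// writeThenRead maps a temporary file, overwrites its contents with
// WriteAt on the file descriptor, calls msync, and returns what the
// mapping shows afterwards.
func writeThenRead() (string, error) {
	f, err := os.CreateTemp("", "msync-demo")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if _, err := f.Write([]byte("old data")); err != nil {
		return "", err
	}

	// Shared read-only mapping of the 8 bytes just written.
	b, err := syscall.Mmap(int(f.Fd()), 0, 8, syscall.PROT_READ, syscall.MAP_SHARED)
	if err != nil {
		return "", err
	}
	defer syscall.Munmap(b)

	// Write through the file handle while the mapping exists.
	if _, err := f.WriteAt([]byte("new data"), 0); err != nil {
		return "", err
	}

	// Without this call, a system without a unified buffer cache may
	// still show the old bytes through the mapping.
	if err := msync(b, syscall.MS_ASYNC); err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	got, err := writeThenRead()
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println(got)
}
```

On systems with a unified buffer cache the msync call makes no visible
difference here, which is probably why the missing calls go unnoticed
elsewhere.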

While prometheus works, it also does not. I looked into the code of TSDB
and came to the conclusion that many operations (especially compaction)
fail because TSDB writes to file handles but uses mmaps of the same files
at the same time.

I fixed one case (the one mentioned in the issues, in index/index.go),
but then more errors show up when running the tsdb go test, including a
SEGV in db_test.go.

I played a bit more with this. With the bad test in db_test.go skipped,
the run mostly passes but errors out at the end:

level=error msg="WAL corruption detected; truncating" err="unexpected
CRC32 checksum 7c1a52ff, want 1020304"
file=/tmp/test_corrupted095078964/000001 pos=44
PASS
goleak: Errors on successful test run: found unexpected goroutines:
[Goroutine 17761 in state chan send, with
github.com/prometheus/prometheus/tsdb.(*SegmentWAL).cut.func1 on top of
the stack:
goroutine 17761 [chan send]:
github.com/prometheus/prometheus/tsdb.(*SegmentWAL).cut.func1(0xc001262fd0,
0xc00000eff0)
        /usr/ports/pobj/prometheus-2.27.1/go/src/all/tsdb/wal.go:571 +0x72
created by github.com/prometheus/prometheus/tsdb.(*SegmentWAL).cut
        /usr/ports/pobj/prometheus-2.27.1/go/src/all/tsdb/wal.go:570 +0x7a

 Goroutine 18135 in state chan send, with
github.com/prometheus/prometheus/tsdb.(*SegmentWAL).cut.func1 on top of
the stack:
goroutine 18135 [chan send]:
github.com/prometheus/prometheus/tsdb.(*SegmentWAL).cut.func1(0xc000099290,
0xc000be24b0)
        /usr/ports/pobj/prometheus-2.27.1/go/src/all/tsdb/wal.go:571 +0x72
created by github.com/prometheus/prometheus/tsdb.(*SegmentWAL).cut
        /usr/ports/pobj/prometheus-2.27.1/go/src/all/tsdb/wal.go:570 +0x7a
]
exit status 1
FAIL    github.com/prometheus/prometheus/tsdb   83.561s

The TSDB code is very hard to follow and debug. There are mmaps all over
the place and it is unclear which files are written to and which are not.
Also the MmapFile structs are not stored in other structs, so it is not
that simple to call msync.
-- 
:wq Claudio

$OpenBSD$

Add msync to sync mmap buffers

diff --git tsdb/fileutil/mmap.go tsdb/fileutil/mmap.go
index 4dbca4f97..516991c60 100644
--- tsdb/fileutil/mmap.go
+++ tsdb/fileutil/mmap.go
@@ -71,3 +71,7 @@ func (f *MmapFile) File() *os.File {
 func (f *MmapFile) Bytes() []byte {
        return f.b
 }
+
+func (f *MmapFile) Sync() error {
+       return sync(f.b)
+}
diff --git tsdb/fileutil/mmap_unix.go tsdb/fileutil/mmap_unix.go
index 043f4d408..c21829989 100644
--- tsdb/fileutil/mmap_unix.go
+++ tsdb/fileutil/mmap_unix.go
@@ -28,3 +28,7 @@ func mmap(f *os.File, length int) ([]byte, error) {
 func munmap(b []byte) (err error) {
        return unix.Munmap(b)
 }
+
+func sync(b []byte) error {
+       return unix.Msync(b, unix.MS_ASYNC)
+}
diff --git tsdb/fileutil/mmap_windows.go tsdb/fileutil/mmap_windows.go
index b94226412..c54b6b125 100644
--- tsdb/fileutil/mmap_windows.go
+++ tsdb/fileutil/mmap_windows.go
@@ -44,3 +44,7 @@ func munmap(b []byte) error {
        }
        return nil
 }
+
+func sync(b []byte) error {
+       return nil
+}
diff --git tsdb/index/index.go tsdb/index/index.go
index a6ade9455..723f2bc73 100644
--- tsdb/index/index.go
+++ tsdb/index/index.go
@@ -552,6 +552,7 @@ func (w *Writer) finishSymbols() error {
        if err := w.writeAt(w.buf1.Get(), hashPos); err != nil {
                return err
        }
+       w.symbolFile.Sync()
 
        // Load in the symbol table efficiently for the rest of the index writing.
        w.symbols, err = NewSymbols(realByteSlice(w.symbolFile.Bytes()), FormatV2, int(w.toc.Symbols))
