hanahmily commented on code in PR #1184:
URL:
https://github.com/apache/skywalking-banyandb/pull/1184#discussion_r3456885907
##########
banyand/stream/block_writer.go:
##########
@@ -173,6 +174,7 @@ type blockWriter struct {
func (bw *blockWriter) reset() {
bw.writers.reset()
+ bw.tagType.reset()
Review Comment:
`reset()` calls `bw.tagType.reset()` (i.e. `clear()`) directly, without
guarding against a nil map. This is safe today only because
`generateBlockWriter` always initializes `tagType`, but `copyFrom` would panic
if a `blockWriter` ever reached use with a nil map. The merged trace
implementation guards this — could we match it here for consistency and
robustness?
```go
if bw.tagType == nil {
bw.tagType = make(tagType)
} else {
bw.tagType.reset()
}
```
##########
banyand/stream/part.go:
##########
@@ -232,6 +239,7 @@ func (mp *memPart) mustFlush(fileSystem fs.FileSystem, path
string) {
}
mp.partMetadata.mustWriteMetadata(fileSystem, path)
+ mp.tagType.mustWriteTagType(fileSystem, path)
Review Comment:
`tag.type` is written *after* `metadata.json` here, which raises two points:
- The comment just below ("…covering the dirent changes for all data files
written above") is now slightly stale, since `metadata.json` is no longer the
final write.
- `metadata.json` is the part's completion marker, so a crash between the
two writes yields a "complete" part with no `tag.type`. Harmless now (read path
is out of scope), but the future query-path reader will need to tolerate a
missing `tag.type`.
Note also that `FinishSync` in `write_data.go` writes them in the opposite
order (`tag.type` before `metadata.json`), so the ordering is inconsistent
between the flush/merge paths and the sync path. Worth aligning, or at least
documenting the intended invariant.
##########
banyand/stream/part_metadata.go:
##########
@@ -107,6 +112,96 @@ func (pm *partMetadata) mustWriteMetadata(fileSystem
fs.FileSystem, partPath str
}
}
+type tagType map[string]map[string]pbv1.ValueType
+
+func (tt tagType) reset() {
+ clear(tt)
+}
+
+func (tt tagType) copyFrom(source tagType) {
+ for familyName, sourceTags := range source {
+ tags := tt[familyName]
+ if tags == nil {
+ tags = make(map[string]pbv1.ValueType, len(sourceTags))
+ tt[familyName] = tags
+ }
+ for tagName, valueType := range sourceTags {
+ tags[tagName] = valueType
+ }
+ }
+}
+
+func (tt tagType) marshal() []byte {
+ var dst []byte
+ dst = encoding.VarUint64ToBytes(dst, uint64(len(tt)))
+ familyNames := make([]string, 0, len(tt))
+ for familyName := range tt {
+ familyNames = append(familyNames, familyName)
+ }
+ sort.Strings(familyNames)
+ for _, familyName := range familyNames {
+ dst = encoding.EncodeBytes(dst,
convert.StringToBytes(familyName))
+ tags := tt[familyName]
+ dst = encoding.VarUint64ToBytes(dst, uint64(len(tags)))
+ tagNames := make([]string, 0, len(tags))
+ for tagName := range tags {
+ tagNames = append(tagNames, tagName)
+ }
+ sort.Strings(tagNames)
+ for _, tagName := range tagNames {
+ dst = encoding.EncodeBytes(dst,
convert.StringToBytes(tagName))
+ dst = append(dst, byte(tags[tagName]))
+ }
+ }
+ return dst
+}
+
+func (tt tagType) unmarshal(src []byte) error {
Review Comment:
The PR description states this "returns errors for truncated names or
missing type bytes," but the decode is only partially defensive:
`encoding.BytesToVarUint64` and `DecodeBytes` return `0`/empty *without* an
error on several truncation shapes. A corrupt `familyCount`/`tagCount` would
therefore spin the loop (inserting empty entries) or panic in `make(...,
tagCount)` rather than returning a clean error. Since `unmarshal` is test-only
today this is latent, but before the read path lands I'd suggest
bounds-checking the decoded counts against `len(remaining)` so corruption fails
fast.
(This is the same function Copilot flagged, but a distinct concern — its
compile claim is incorrect, as `uint64` is a valid `make` size hint.)
##########
banyand/stream/part.go:
##########
@@ -422,6 +430,18 @@ func CreatePartFileReaderFromPath(partPath string, lfs
fs.FileSystem) ([]queue.F
if e.IsDir() {
continue
}
+ if e.Name() == tagTypeFilename {
Review Comment:
Minor: this `if e.Name() == tagTypeFilename` branch doesn't `continue`
afterward, so control falls through to the `filepath.Ext` checks below. It's
harmless (`.type` matches none of those extensions), but adding `continue`
would match the structure of the sibling branches and avoid the redundant
comparisons.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]