Thanks for the review, replies are inline below.
On 2019.07.11 15:35, Jakub Narebski wrote:
> Josh Steadmon <[email protected]> writes:
>
> > trace_schema_validator can be used to verify that trace2 event output
> > conforms to the expectations set by the API documentation and codified
> > in event_schema.json (or strict_schema.json). This allows us to build a
> > regression test to verify that trace2 output does not change
> > unexpectedly.
> >
> > Signed-off-by: Josh Steadmon <[email protected]>
>
> Very nitpicky comments below.
>
> > ---
> > t/trace_schema_validator/.gitignore | 1 +
> > t/trace_schema_validator/Makefile | 10 +++
> > .../trace_schema_validator.go | 78 +++++++++++++++++++
> > 3 files changed, 89 insertions(+)
> > create mode 100644 t/trace_schema_validator/.gitignore
> > create mode 100644 t/trace_schema_validator/Makefile
> > create mode 100644 t/trace_schema_validator/trace_schema_validator.go
> >
> > diff --git a/t/trace_schema_validator/.gitignore
> > b/t/trace_schema_validator/.gitignore
> > new file mode 100644
> > index 0000000000..c3f1e04e9e
> > --- /dev/null
> > +++ b/t/trace_schema_validator/.gitignore
> > @@ -0,0 +1 @@
> > +trace_schema_validator
> > diff --git a/t/trace_schema_validator/Makefile
> > b/t/trace_schema_validator/Makefile
> > new file mode 100644
> > index 0000000000..ed22675e5d
> > --- /dev/null
> > +++ b/t/trace_schema_validator/Makefile
> > @@ -0,0 +1,10 @@
> > +.PHONY: fetch_deps clean
> > +
> > +trace_schema_validator: fetch_deps trace_schema_validator.go
> > + go build
>
> I don't know the Go build process, but shouldn't the name of target and
> the name of actual source file passed to the command?
>
> Though I don't think we would _need_ for example being able to configure
> Go build process via Makefile variables, like e.g. $(GOBUILD) in
> https://sohlich.github.io/post/go_makefile/
Yeah it seems optional for Go, but fixed to make things more consistent
with the other makefiles.
> > +
> > +fetch_deps:
> > + go get github.com/xeipuuv/gojsonschema
> > +
> > +clean:
> > + rm -f trace_schema_validator
>
> In git Makefile we use
>
> clean:
> $(RM) $(PROGRAMS)
>
> I'm not sure if it is needed for operating system independence, but
> using $(RM) is a standard way to create 'clean' targets...
Fixed in V3.
> > diff --git a/t/trace_schema_validator/trace_schema_validator.go
> > b/t/trace_schema_validator/trace_schema_validator.go
> > new file mode 100644
> > index 0000000000..f779ac5ff5
> > --- /dev/null
> > +++ b/t/trace_schema_validator/trace_schema_validator.go
> > @@ -0,0 +1,78 @@
> > +// trace_schema_validator validates individual lines of an input file
> > against a
> > +// provided JSON-Schema for git trace2 event output.
> > +//
> > +// Traces can be collected by setting the GIT_TRACE2_EVENT environment
> > variable
> > +// to an absolute path and running any Git command; traces will be
> > appended to
> > +// the file.
> > +//
> > +// Traces can then be verified like so:
> > +// trace_schema_validator \
> > +// --trace2_event_file /path/to/trace/output \
> > +// --schema_file /path/to/schema
> > +package main
> > +
> > +import (
> > + "bufio"
> > + "flag"
> > + "log"
> > + "os"
> > + "path/filepath"
> > +
> > + "github.com/xeipuuv/gojsonschema"
> > +)
> > +
> > +// Required flags
> > +var schemaFile = flag.String("schema_file", "", "JSON-Schema filename")
> > +var trace2EventFile = flag.String("trace2_event_file", "", "trace2 event
> > filename")
>
> The standard for long options is to use "kebab case", not "snake case"
> for them, i.e. --schema-file not current --schema_file, etc.
Fixed in V3.
> > +
> > +func main() {
> > + flag.Parse()
> > + if *schemaFile == "" || *trace2EventFile == "" {
> > + log.Fatal("Both --schema_file and --trace2_event_file are
> > required.")
> > + }
>
> I guess that you prefer required options with explicit arguments instead
> of positional arguments (that is requiring the command to be called with
> two arguments, first being schema file, second being event file to
> validate).
Yeah, that's the style I'm used to at work, but I can change this if
positional args are more acceptable for Git.
> > + schemaURI, err := filepath.Abs(*schemaFile)
> > + if err != nil {
> > + log.Fatal("Can't get absolute path for schema file: ", err)
> > + }
> > + schemaURI = "file://" + schemaURI
> > +
> > + schemaLoader := gojsonschema.NewReferenceLoader(schemaURI)
> > + schema, err := gojsonschema.NewSchema(schemaLoader)
> > + if err != nil {
> > + log.Fatal("Problem loading schema: ", err)
> > + }
> > +
> > + tracesFile, err := os.Open(*trace2EventFile)
> > + if err != nil {
> > + log.Fatal("Problem opening trace file: ", err)
> > + }
> > + defer tracesFile.Close()
> > +
> > + scanner := bufio.NewScanner(tracesFile)
> > +
> > + count := 0
> > + for ; scanner.Scan(); count++ {
>
> I see that you assume JSON-Lines format, i.e. one JSON object per line
> in the file.
Yeah, I'll add a comment noting this. This won't work with the provided
list schemas unless we reformat everything to be on a single line.
>
> > + if count%10000 == 0 {
> > + // Travis-CI expects regular output or it will time out.
> > + log.Print("Validated items: ", count)
>
> I wonder if it wouldn't be better to provide --progress flag, which
> Travis-CI job would turn on.
Done, thanks for the suggestion.
> > + }
> > + event := gojsonschema.NewStringLoader(scanner.Text())
> > + result, err := schema.Validate(event)
> > + if err != nil {
> > + log.Fatal(err)
> > + }
> > + if !result.Valid() {
> > + log.Print("Trace event is invalid: ", scanner.Text())
>
> It might be good idea to print the line (i.e. the value of `count`
> variable).
>
> I guess that conforming to the <filename>:<line>: prefix (like e.g. gcc
> uses for errors) would be unnecessary.
Done in V3.
> > + for _, desc := range result.Errors() {
> > + log.Print("- ", desc)
> > + }
> > + os.Exit(1)
> > + }
> > + }
> > +
> > + if err := scanner.Err(); err != nil {
> > + log.Fatal("Scanning error: ", err)
> > + }
> > +
> > + log.Print("Validated events: ", count)
> > +}