Thanks Gabriel, that's very useful! For the various parsers, it might make sense to simply fork (or copy/paste) bits of opam parsers from various versions of opam and try to fix these parsers to:
- keep comments - support some kind of alignments? Thomas > On 22 Nov 2015, at 20:55, Gabriel Scherer <[email protected]> wrote: > >> - Any chance you could use cmdliner instead of Arg? Fairly minor, but >> all the other plugins use it and it's nice to have the same behaviours >> for OPAM plugins where possible. > > I'm fond of Daniel's design work, so I would gladly move to > Cmdliner -- Arg was just what I could easily use "in anger" for > a first try. > >> - How does this behave on pre-1.2 files? I think it's about time that >> we deprecate pre-1.2 so that we can have clean metadata standards >> about the new fields such as dev-repo. > > opam-fmt updates older opam files to its own support version -- and > refuses to work on newer files. Two things: > > I think it would be a better design to have a family of scripts > opam-fmt-1.0, opam-fmt-1.1, opam-fmt-1.2 etc. and a "mother script" > opam-fmt that calls the right reformatter according to the file > version¹. However, that requires changes in the packaging of opam-lib, > so that the package for distinct versions can be installed > simultaneously (they would be separate packages > opam-lib-VER, and ocamlfind packages as well). > > ¹: another option would be for opam-lib to also support pairs of parsing > and printing functions for older format version specifically, but that > is an invasive choice to make in a codebase. Right now there is a tiny > bit of logic to know which fields are 1.0 or 1.1-specific, but this > would be much more ambitious. > > There are various warnings implemented in opam-lib that could be emitted > during the processing of files by opam-fmt -- they may be already > available depending on the OPAMDEBUG variable or something, but an > explicit support in the interface could be nice. When I tested > reformatting opam-repository, I observed that a large part of its opam > files raise such warnings (so indeed there seems to be a metadata > problem in the repository today). > >> - Having a bot regularly go through and reformat the entire repository >> also seems quite legit. The alternative would be to reformat at the >> merge point, but this would require a staging branch. > > I'm not sure what you call "merge point"; my idea was to put the burden > of reformatting onto users submitting PRs against the > repository. (Regular reformatting are a sensible idea, but they run in > the problem of loss of information, whether distributed manual > reformatting keeps humans closer in the loop) > >> - I would really like opam-fmt to be lossless, so preserving comments >> seems like an extension that we should implement. Ideally everyone >> can just call it on their packages without thinking about it. > > I have mixed feelings about trying to be lossless. At the very least, > one should recognize that setting this as a design goal would impose > a significant burden on the developers of the parsing/printing functions > in opam-lib. > > Some human choices (alignment of string fields for example) are rather > difficult and fragile to recognize -- and they could complexify the > codebase. Even for comments, right now you cannot tell to which > configuration item an element is attached. There are several ways around > this, which are interesting to consider but also involve a fair amount > of work: > > - You could use ocamldoc-like placement rules: "always after the > relevant field" (a first comment would be a file-global comment), or > "either before or after, but an empty line between the comment and > a non-relevant field"; this seems painful and not-that-easy to > implement. > > - You could move to a docstring-like (or attribute-like) syntax where > comments are explicitly attached to an AST node; from a language > design point of view this would be my preference, but it may require > a change in concrete syntax. > > - Finally, the choices you can make in this design space depend a lot on > whether reformatting will be performed by humans or by bots. If your > comments-attachment rules are obscure, humans have the opportunity to > reformat, see that they got them wrong, and reiterate. Bots will just > put stuff at the wrong place. > > I think that the people that maintain this corner of opam today are > those that will pay the greater cost if "lossless" becomes a design > goal, so it should be their choice to make. > > > In the meantime, it would be interesting to have a look at how opam > files in the repository actually use comments. With > > find packages -name 'opam' | xargs grep --color=always " #" > > I see 75 occurences of comments, 38 of which are just "TODO fixme". The > 37 others seem rather interesting, below are a few representative examples: > > packages/arakoon/arakoon.1.8.6/opam: >> "lwt" { = "2.4.8" } # 2.4.9 had an incompatible API change > > Having textual exaplanations for choice of bound is a reasonable > use-case for attributes. > > packages/camlp4/camlp4.4.01/opam: >> build: [] # dummy package > > This could be replaced by a dedicated note/comment field. > > packages/frama-c/frama-c.20150201/opam: >> "lablgtk" { >= "2.18.2" } #for ocaml >= 4.02.1 > > I don't understand the semantics of this one. > > packages/git/git.1.6.0/opam: >> depopts: [ >> # --enable-mirage >> "mirage-types-lwt" >> "mirage-http" >> "mirage-flow" >> "channel" >> # --enable-unix >> "cohttp" >> "conduit" >> "base-unix" >> ] > > This usage is very interesting, it seems to call for a hierarchy inside > the "depopts" list (and "dependencies" as well, I suppose), with > annotations on sub-groups of dependencies. > > packages/gsl/gsl.1.18.2/opam: >> depends: [ >> "base-bigarray" >> "camlp4" >> "ocamlfind" {>= "1.3.1"} >> # Included from _opam file >> "conf-gsl" >> ] > > I don't know what this comment means. > > packages/lz4/lz4.1.0.0/opam: >> depexts: [ >> [["debian"] ["liblz4-dev"]] >> # [["ubuntu"] ["liblz4-dev"]] reenable when CI updates its Ubuntu >> [["source"] ["https://.../install.sh"]] >> ] > > Again, this would require annotations. > > packages/ppx_deriving/ppx_deriving.0.3/opam: >> build: [ >> # If there is no native dynlink, we can't use native builds >> "ocaml" "pkg/build.ml" "native=true" >> "native-dynlink=true" >> ] > > packages/frama-c-e-acsl/frama-c-e-acsl.0.5/opam: >> build: [ >> ["ocaml" "run_autoconf_if_needed.ml"] #when used in pinned mode the >> configure *can* not yet be generated >> ["./configure" "--prefix" prefix] >> [make] >> ] > > packages/clangml/clangml.0.5.2/opam: >> depexts: [ >> [["debian"] ["libboost-dev" "llvm-3.4-dev" "clang-3.4" "libclang-3.4-dev" >> "binutils-dev"]] >> [["ubuntu"] ["libboost-dev" "llvm-3.4-dev" "clang-3.4" "libclang-3.4-dev" >> "binutils-dev"]] >> [["gentoo"] ["dev-libs/boost" "sys-devel/llvm-3.4.1-r2" >> "sys-devel/clang-3.4.0-r100" "sys-devel/binutils"]] >> # archlinux has no package providing llvm and clang 3.4.1 >> [["archlinux"] ["boost" "binutils"]] >> ] > > packages/mtime/mtime.0.8.1/opam: >> depends: [ "ocamlfind" >> "js_of_ocaml" # FIXME should become a deptopt >> ] > > On Sun, Nov 22, 2015 at 8:09 PM, Anil Madhavapeddy <[email protected]> wrote: >> Thanks for this Gabriel! Quick notes: >> >> - I would really like opam-fmt to be lossless, so preserving comments >> seems like an extension that we should implement. Ideally everyone >> can just call it on their packages without thinking about it. >> >> - Having a bot regularly go through and reformat the entire repository >> also seems quite legit. The alternative would be to reformat at the >> merge point, but this would require a staging branch. >> >> - Any chance you could use cmdliner instead of Arg? Fairly minor, but >> all the other plugins use it and it's nice to have the same behaviours >> for OPAM plugins where possible. >> >> - How does this behave on pre-1.2 files? I think it's about time that >> we deprecate pre-1.2 so that we can have clean metadata standards >> about the new fields such as dev-repo. >> >> regards, >> Anil >> >>> On 21 Nov 2015, at 21:53, Gabriel Scherer <[email protected]> wrote: >>> >>> Hi opam-devel, >>> >>> As part of the discussion in >>> >>> bulk addition of 'ocamlbuild {build}' dependencies >>> https://github.com/ocaml/opam-repository/pull/5140 >>> >>> it became apparent that performing bulk updates on opam-repository is >>> made harder by the fact that the parse-print roundtrip does not >>> preserve human-formatted opam files. For my proposed patch I tried to >>> separate the reformatting of opam file (to follow the opam-lib printer >>> convention) from the actual changes in two separate commits, but that >>> means more work for script authors, and also creates patches that are >>> harder to review. (If (re)formatting was controlled by the maintainer >>> of the OPAM packages instead of authors of bulk updates, we would be >>> more confident in its correctness.) >>> >>> In order to move that discussion forward (how to maintain opam >>> metadata in a way that is easy for both human and scripts to work >>> with?), I propose the opam-fmt script that can be found at >>> https://github.com/gasche/opam-fmt/ >>> >>> I wrote it in the last few days and there are probably some rough >>> edges. Once I feel that it should work, I may try to package it on the >>> opam-repo (in the meantime, clone then pin). >>> >>> This suggests one possible way forward: publicize opam-fmt, encourage >>> users to reformat their opam files using it, adapt the opam-repository >>> QA to call `opam fmt --check` on opam files proposed in PR to enforce >>> it, and after some time (once we trust it works as expected thanks to >>> the human guinea pigs) reformat all older opam files to get a perfect >>> round-trip on all files of the repository. >>> It is not clear to me that this is the best way forward. (For example >>> it means that, in the current state of the opam file parsing/printing >>> code, comments in opam files would always be erased by reformatting. >>> Should we remove comments from the valid syntax of opam files, or >>> attach them to configuration lines to re-print them correctly later, >>> or maybe refuse to work on files with comments?) Opam developers and >>> repository maintainers may decide that the cost of caring about >>> reformatting outweigh the benefits in terms of scriptability. >>> >>> What do you think? >>> _______________________________________________ >>> opam-devel mailing list >>> [email protected] >>> http://lists.ocaml.org/listinfo/opam-devel >>> >> > _______________________________________________ > opam-devel mailing list > [email protected] > http://lists.ocaml.org/listinfo/opam-devel _______________________________________________ opam-devel mailing list [email protected] http://lists.ocaml.org/listinfo/opam-devel
