I will give this a try.
On Wed, Dec 8, 2021 at 10:39 AM Steve Lawrence wrote:
>
> That's fair, I agree there definitely is some redundancy. In general I'm
> not a huge fan of mixing sources and resources, but maybe it's not too
> big of a deal in this case, since sources for UDF/Layers will be
> rare, and when they do exist there's probably only a very small number
> of them.
>
> I haven't tested this much, but based on some examples and playing
> around a bit, I think this gets you what you're after:
>
>   organization := "org.example"
>
>   name := "dfdl-fmt"
>
>   version := "0.1.0-SNAPSHOT"
>
>   lazy val root = (project in file("."))
>     .settings(
>       Project.inConfig(Compile)(flattenSettings("src")),
>       Project.inConfig(Test)(flattenSettings("test")),
>     )
>
>   def flattenSettings(name: String) = Seq(
>     unmanagedSourceDirectories := Seq(baseDirectory.value / name),
>     unmanagedResourceDirectories := unmanagedSourceDirectories.value,
>     unmanagedSources / includeFilter := "*.java" | "*.scala",
>     unmanagedResources / excludeFilter := (unmanagedSources / includeFilter).value,
>   )
>
> (note that we probably also want many of the existing settings in our
> current build.sbt files)
>
> All the non-test stuff goes in a "src" directory. Sources are anything
> that ends with .java or .scala. Resources are anything that isn't a source.
>
> And the "test" directory has the exact same layout, but for tests.
>
> The .class files that end up in the jar are namespaced by the package line.
>
> The resources that end up in the jar are namespaced by the directory
> structure and/or file naming convention as they are in the src/ or test/
> directory. So schema authors can namespace schemas however they want,
> whether it be directories or file names, or not at all.
>
>
> On 12/8/21 9:56 AM, Mike Beckerle wrote:
> > I guess my concern is that all the depth associated with the sbt-based
> > standard layout feels completely redundant to me.
> >
> > I am suggesting that, of src/main/scala, we need only main/. Of
> > src/main/resources/kind we need only main/.
> >
> > E.g., why are all the typed subdirs needed (xsd/, dfdl/, etc.) when
> > file extensions can be used to distinguish resource types and
> > programming language compilers to be used?
> >
> > To me the only "real" distinction in the standard project layout is
> > main vs. test which is needed to exclude test stuff when packaging.
> >
> > The rest is
> > (a) using directories as "package names" - which can be done with
> > well-chosen longer file names
> > (b) using directories as redundant file typing - which can be done
> > with file name extensions.
> >
> > To me a UDF is a META-INF/services file and some scala/java code in
> > the "main" area.
> > Ditto for a layer definition.
> >
> > I guess concretely I am wondering if there is a way to override basic
> > sbt settings like this:
> >
> > * Instead of src/main/scala, just look for main/*.scala
> > * Instead of src/main/java, just look for main/*.java
> > * Instead of src/main/resources/* just look for main/* where the file
> >   name does not end in ".scala" nor ".java"
> >
> > And similarly for test things, where src/test/whatever just becomes
> > test/whatever and distinctions are made using file name extensions.
> >
> > On Wed, Dec 8, 2021 at 9:21 AM Steve Lawrence wrote:
> >>
> >> What about the scala/java/resources directories? Do those still exist or
> >> are they simplified somehow?
> >>
> >> We currently have an xsd/ directory to allow schematron, xslt, etc. to be
> >> included in the same repo. Do we still have that directory?
> >>
> >> How do pluggable UDFs and Layers fit into this? Do we suggest those are
> >> in separate repos, or can they fit into this?
> >>
> >> Note that I believe sbt supports organizations in a single directory
> >> name, e.g.
> >>
> >>   src/
> >>   └── main/
> >>       └── resources/
> >>           └── org.foo.myschema/
> >>               └── xsd/
> >>                   └── common.xsd
> >>
> >> So that could be one approach to reduce the deep directory structures.
> >>
> >> Generally, I'm definitely in favor of simplifying the layout, but this
> >> to me feels like it might just add more confusion since it's sort of
> >> close to the existing layout, but not quite the same.
> >>
> >> If we are potentially going to go against the standards, and potentially
> >> make IDE support more difficult, I almost wonder if we should be more
> >> ambitious and come up with something that is completely different? I'm
> >> not sure what that would be, but it could be more flat. For example, maybe
> >> something like this:
> >>
> >>   dfdl-fmt/
> >>   ├── build.sbt
> >>   ├── dfdl/
> >>   │   ├── format.dfdl.xsd
> >>   │   └── main.dfdl.xsd
> >>   ├── layer/
> >>   │   └── MyLayer.scala
> >>   ├── sch/
> >>   ├── tdml/
> >>   │   └── main.tdml
> >>   ├── udf/
> >>   │   └── MyUDF.scala
> >>   └── xslt/
> >>
> >> A plugin could implicitly add organization structure so things are
> >> namespaced when building a jar. Or maybe we even do something like NiFi
> >> has with .nar files and have a custom package format, e.g. .dar
> >>
> >> It's probably a lot more work, with things to work out (e.g. how do
> >> dependencies work for udf and layers), and almost certainly needs a
> >> plugin to work instead of just tweaking sbt properties, but something
> >> like that feels more ideal to me.
> >>
> >> Note that maybe we don't even use sbt for this. Maybe there's a better
> >> tool for something like this.
> >>
> >> Another thing to consider that is related: with NiFi we found it
> >> difficult to add jars to the NiFi classpath for a specific processor,
> >> which means loading schemas from a jar on the classpath couldn't be
> >> done. Having a custom package format could make this easier, since all
> >> the .dar processing/lookup would be done by Daffodil rather than
> >> standard classpath lookups.
> >>
> >>
> >> On 12/3/21 5:25 PM, Mike Beckerle wrote:
> >>> Experience in giving DFDL training via daffodil is that our standard
> >>> schema project layout <https://daffodil.apache.org/dfdl-layout/> is
> >>> much too deep (directory wise) for many users to conveniently navigate
> >>> and use. It gets in the way of learning.
> >>>
> >>> Our layout was designed to follow sbt conventions that enable automated
> >>> dependency management, packaging, etc. It is easy to use if you are
> >>> accustomed to using an IDE like Eclipse or IntelliJ. It is also
> >>> extraordinarily valuable (and underappreciated) that 'sbt test' does a
> >>> built-in-self-test on a schema, and that 'sbt publishLocal' creates a Jar
> >>> of a DFDL schema for managed dependencies use between schemas.
> >>>
> >>> But new users are mostly coming to DFDL/Daffodil from a command-line
> >>> prompt and a text editor (e.g., VIM).
> >>>
> >>> I am wondering if we can have our cake and eat it too, without too much
> >>> added sbt complexity, and without losing 'sbt test' and 'sbt publishLocal'
> >>> working their magic for us.
> >>>
> >>> E.g., what if a simplified layout was:
> >>>
> >>> mySchema/schema - takes the place of src/main/*. Also no package-style
> >>> directory folder structure.
> >>> mySchema/test - takes the place of src/test/*. No package-style directory
> >>> folder structure.
> >>>
> >>> It would be optional if users want to use mySchema/test/data and
> >>> mySchema/test/infosets to separate infosets and data, or just put all
> >>> those files in the same place and use file extensions (.dat vs. .dat.xml
> >>> vs. .tdml, etc.) to distinguish the kinds of content.
> >>>
> >>> Such a flattened tree structure requires that the schema file names are
> >>> well chosen to be unlikely to conflict with other users' chosen names, so a
> >>> name like common.dfdl.xsd or main.dfdl.xsd would be no good as there is no
> >>> package directory structure to make them unique.
> >>>
> >>> But names like common-mySchema.dfdl.xsd and main-mySchema.dfdl.xsd would
> >>> still be quite convenient to use, particularly if the mySchema name is
> >>> well chosen. (Note how I've put the unique part of the name first, so that
> >>> name-completion will work most easily on command line.)
> >>>
> >>> I think this would still work with sbt if we simply override the default
> >>> paths (and perhaps file patterns) used for specifying source and
> >>> resources.
> >>>
> >>> Thoughts?
> >>>
> >>
>
