I guess my concern is that all the depth associated with the sbt-based
standard layout feels completely redundant to me.

I am suggesting that, of src/main/scala, we need only main/. Likewise,
of src/main/resources/kind, we need only main/.

E.g., why are all the typed subdirs needed (xsd/, dfdl/, etc.) when
file extensions can be used to distinguish resource types and to
select the programming language compiler to be used?

To me the only "real" distinction in the standard project layout is
main vs. test which is needed to exclude test stuff when packaging.

The rest is
(a) using directories as "package names" - which can be done with
well-chosen longer file names
(b) using directories as redundant file typing - which can be done
with file name extensions.

To me a UDF is a META-INF/services file and some scala/java code in
the "main" area.
Ditto for a layer definition.

I guess concretely I am wondering if there is a way to override basic
sbt settings like this:

* Instead of src/main/scala, just look for main/*.scala
* Instead of src/main/java, just look for main/*.java
* Instead of src/main/resources/* just look for main/* where the file
name does not end in ".scala" nor ".java"

And similarly for test things, where src/test/whatever just becomes
test/whatever and distinctions are made using file name extensions.
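
A rough build.sbt sketch of the overrides described above. It assumes the
flattened directories are named main/ and test/ and sit directly under the
project root, and that the sbt version supports the slash syntax (sbt 1.1 or
later). The setting keys are standard sbt keys, but this combination is
untested here, and IDE import behavior is not addressed:

```scala
// build.sbt (sketch only; assumes main/ and test/ under the project root)

// Look for *.scala and *.java directly in main/ instead of
// src/main/scala and src/main/java.
Compile / scalaSource := baseDirectory.value / "main"
Compile / javaSource := baseDirectory.value / "main"

// Treat everything else in main/ as a resource: same directory,
// but exclude hidden files and anything ending in .scala or .java.
Compile / resourceDirectory := baseDirectory.value / "main"
Compile / unmanagedResources / excludeFilter :=
  HiddenFileFilter || "*.scala" || "*.java"

// Same idea for tests: test/ replaces src/test/scala, src/test/java,
// and src/test/resources.
Test / scalaSource := baseDirectory.value / "test"
Test / javaSource := baseDirectory.value / "test"
Test / resourceDirectory := baseDirectory.value / "test"
Test / unmanagedResources / excludeFilter :=
  HiddenFileFilter || "*.scala" || "*.java"
```

With settings like these, a UDF's META-INF/services file would presumably live
at main/META-INF/services/..., since sbt keeps resource paths relative to the
resource directory when packaging.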

On Wed, Dec 8, 2021 at 9:21 AM Steve Lawrence <slawre...@apache.org> wrote:
>
> What about the scala/java/resources directories? Do those still exist or
> are they simplified somehow?
>
> We currently have an xsd/ directory to allow schematron, xslt, etc to be
> included in the same repo. Do we still have that directory?
>
> How do pluggable UDFs and Layers fit into this? Do we suggest those are
> in separate repos, or can they fit into this?
>
> Note that I believe sbt supports organizations in a single directory
> name, e.g.
>
>    src/
>    └── main/
>        └── resources/
>            └── org.foo.myschema/
>                └── xsd/
>                    └── common.xsd
>
> So that could be one approach to reduce the deep directory structures.
>
> Generally, I'm definitely in favor of simplifying the layout, but this
> to me feels like it might just add more confusion since it's sort of
> close to the existing layout, but not quite the same.
>
> If we are potentially going to go against the standards, and potentially
> make IDE support more difficult, I almost wonder if we should be more
> ambitious and come up with something that is completely different? I'm
> not sure what that would be, but could be more flat. For example, maybe
> something like this:
>
>    dfdl-fmt/
>    ├── build.sbt
>    ├── dfdl/
>    │   ├── format.dfdl.xsd
>    │   └── main.dfdl.xsd
>    ├── layer/
>    │   └── MyLayer.scala
>    ├── sch/
>    ├── tdml/
>    │   └── main.tdml
>    ├── udf/
>    │   └── MyUDF.scala
>    └── xslt/
>
> A plugin could implicitly add organization structure so things are
> namespaced when building a jar. Or maybe we even do something like NiFi
> has with .nar files and have a custom package format, e.g. .dar
>
> It's probably a lot more work, with things to work out (e.g. how do
> dependencies work for udf and layers), and almost certainly needs a
> plugin to work instead of just tweaking sbt properties, but something
> like that feels more ideal to me.
>
> Note that maybe we don't even use sbt for this. Maybe there's a better
> tool for something like this.
>
> Another thing to consider that is related, with NiFi we found it
> difficult to add jars to the NiFi classpath for a specific processor,
> which means loading schemas from a jar on the classpath couldn't be
> done. Having a custom package format could make this easier, since all
> the .dar processing/lookup would be done by Daffodil rather than
> standard classpath lookups.
>
>
> On 12/3/21 5:25 PM, Mike Beckerle wrote:
> > Experience in giving DFDL training via daffodil is that our standard schema
> > project layout <https://daffodil.apache.org/dfdl-layout/> is much too deep
> > (directory wise) for many users to conveniently navigate and use. It gets
> > in the way of learning.
> >
> > Our layout was designed to follow sbt conventions that enable automated
> > dependency management, packaging, etc. It is easy to use if you are
> > accustomed to using an IDE like Eclipse or IntelliJ.  It is also
> > extraordinarily valuable (and underappreciated) that 'sbt test' does a
> > built-in-self-test on a schema, and that 'sbt publishLocal' creates a Jar
> > of a DFDL schema for managed-dependency use between schemas.
> >
> > But new users are mostly coming to DFDL/Daffodil from a command-line prompt
> > and a text editor (e.g., VIM).
> >
> > I am wondering if we can have our cake and eat it too, without too much
> > added sbt complexity, and without losing 'sbt test' and 'sbt publishLocal'
> > working their magic for us.
> >
> > E.g., what if a simplified layout was:
> >
> > mySchema/schema - takes the place of src/main/*. Also no package-style
> > directory folder structure.
> > mySchema/test - takes the place of src/test/*. No package-style directory
> > folder structure.
> >
> > It would be optional if users want to use mySchema/test/data and
> > mySchema/test/infosets to separate infosets and data, or just put all those
> > files in the same place and use file extensions (.dat vs. .dat.xml vs.
> > .tdml, etc.) to distinguish the kinds of content.
> >
> > Such a flattened tree structure requires that the schema file names are
> > well chosen to be unlikely to conflict with other users' chosen names, so a
> > name like common.dfdl.xsd or main.dfdl.xsd would be no good as there is no
> > package directory structure to make them unique.
> >
> > But names like common-mySchema.dfdl.xsd and main-mySchema.dfdl.xsd would
> > still be quite convenient to use, particularly if the mySchema name is well
> > chosen. (Note how I've put the unique part of the name first, so that
> > name-completion will work most easily on command line.)
> >
> > I think this would still work with sbt if we simply override the default
> > paths (and perhaps file patterns) used for specifying source and resources.
> >
> > Thoughts?
> >
>
