Re: Rust vs. C backend

2024-01-16 Thread John Wass
Hi Mike-

Take a look at

https://github.com/jni-rs/jni-rs

There is a good example in the repo.

I'd compare that against jnr-ffi, which is nice in general for JNI work.

https://github.com/jnr/jnr-ffi

-john
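To make the comparison concrete, here is a minimal sketch of what the Rust side of a JNI bridge can look like. Everything here is hypothetical (the class name `com.example.Native`, the method, the mixing logic); real code passing strings, arrays, or objects would go through jni-rs's `JNIEnv` helpers rather than raw pointers, as the repo example shows.

```rust
use std::ffi::c_void;

// Pure-Rust core: keep the hot path free of JNI types so it can be
// unit-tested and reused outside the JVM bridge.
fn mix(a: u64, b: u64) -> u64 {
    a.wrapping_mul(31).wrapping_add(b)
}

// Thin JNI shim. The exported symbol name encodes a hypothetical Java
// declaration: package com.example; class Native { native long mix(long a, long b); }
// For primitive arguments no helper crate is strictly needed; jni-rs
// becomes valuable once Strings, arrays, or objects cross the boundary.
#[no_mangle]
pub extern "system" fn Java_com_example_Native_mix(
    _env: *mut c_void,   // JNIEnv* (unused here: primitives only)
    _class: *mut c_void, // jclass
    a: i64,
    b: i64,
) -> i64 {
    mix(a as u64, b as u64) as i64
}
```

On the Scala side this would be loaded with `System.loadLibrary` and declared as a `@native` method; each crossing has a fixed cost, which is why batching work per call matters for the bridge-overhead question below.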


On Tue, Jan 16, 2024, 08:42 Mike Beckerle  wrote:

> John, what do you (or anyone?) know about interfacing Rust and JVM
> languages like Java/Scala? A quick search pulled back nothing other than
> Java JNI and/or sending GPBs back and forth to micro-services.
>
> I'm wondering if incrementally rewriting parts of the existing, slow, Scala
> backend in Rust is feasible, or if this is not a smooth enough pathway.
>
> For example, I suspect that Daffodil's lexical analyzer could use a
> rewrite. It is one of the oldest pieces of code in Daffodil, and was
> created back when I and others were still Scala newbies, and we
> were just learning the implications of the lexical analyzer design, i.e.,
> discovering them the hard way, realizing the design was not sufficiently
> general, adding more tweaks (which could be suboptimal), etc.  We had not
> yet developed the philosophy that the backend code should be "Java-like"
> and not use Scala idioms that are less efficient.
>
> But if we're rewriting parts of the back-end and the rewrites are intended
> to be high performance, then it would seem the right language to use is
> something like Rust.
>
> However, this requires that we're able to call it efficiently and that the
> bridge from scala to that rust code is not so expensive as to eliminate the
> performance gain.
>
>
>
> On Fri, Jan 12, 2024, 5:52 PM Interrante, John A (GE Aerospace, US) <
> john.interra...@ge.com> wrote:
>
> > We developed a fork of the C backend that generated VHDL, but I can't get
> > into the implementation details here.  I think the line between Rust and
> > C
> > is two-fold: how many machine architectures Rust has been ported to, and
> > how often you need to call C functions from Rust, which means having to
> > use
> > Rust's Foreign Function Interface.  Otherwise, Rust is better than C for
> > new code and Daffodil should have a Rust backend.
> >
> > -Original Message-
> > From: Mike Beckerle 
> > Sent: Friday, January 12, 2024 5:50 AM
> > To: dev@daffodil.apache.org
> > Subject: EXT: Re: Rust vs. C backend
> >
> > what I meant by "What did you mean by the phrase “basis for generating
> > VHDL or System Verilog?”
> >
> > I suppose I was thinking you had a fork of the C backend that created
> > Verilog/VHDL, but perhaps the pathway is via the C code output, which is
> > then translated into Verilog/VHDL?
> >
> > In any case we're hearing lots more complaints about performance of
> > Daffodil's Scala back-end (and missing optimizations in the middle
> > phases).
> >
> > Generating C code won't work for Cyberian software-based applications as
> > a
> > memory-safe language is required in these solutions. I will have to learn
> > more about Rust. We need a memory-safe language that lets you control the
> > data representations far better than Java/Scala and JVM languages. I am
> > curious where the line is between Rust and C, i.e., what kinds of things
> > are possible in C that aren't possible in Rust.
> >
> > On Thu, Jan 11, 2024 at 5:01 PM Interrante, John A (GE Aerospace, US) <
> > john.interra...@ge.com> wrote:
> >
> > > Hi Mike,
> > >
> > > My view is that when the goal is to generate parsers and unparsers
> > > from fixed format binary data DFDL schemas, compile them to native
> > > machine code, and execute the machine code on CPUs, Daffodil should
> > > generate Rust.  We would have preferred Rust when we started the C
> > > code generator work.  Rust is memory safe, type safe, etc. – but it
> > > was not available for our phase 1 target CPU.
> > >
> > > Creating a Rust backend makes sense, although we don’t think there is
> > > a Rust to hardware path – at least none that we are aware of.  What
> > > did you mean by the phrase “basis for generating VHDL or System
> > > Verilog?”
> > >
> > > John
> > >
> > > From: Mike Beckerle 
> > > Sent: Thursday, January 11, 2024 5:13 AM
> > > To: John Interrante 
> > > Cc: dev@daffodil.apache.org
> > > Subject: EXT: Rust vs. C backend
> > >
> > > John,
> > >
> > > What's your view of generating Rust vs. generating C from DFDL?
> > >
> > > Those of us working in Cyberia, well, the edict has been issued that
> > > only memory-safe languages/runtimes are allowed, to reduce the risk of
> > > cyber-attacks via things like libc flaws.
> > >
> > > Seems to me that Rust is the lowest level language that would be
> > > acceptable.
> > >
> > > I believe ultimately, the goal is to generate a useful software
> > > implementation that does not compromise on performance, and to be a
> > > basis for generating VHDL or System Verilog.
> > >
> > > I imagine you've given this some thought that you can share.
> > > Mike Beckerle
> > > Apache Daffodil PMC | daffodil.apache.org
> > > OGF DFDL Workgroup Co-Chair |
> > > www.ogf.o
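On the point raised above about how often Rust code needs to call C functions: the Foreign Function Interface itself is small. A hedged sketch, binding libc's `strlen` behind a safe wrapper, which is the usual shape of Rust FFI use:

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Declaration of a C function from libc; the linker resolves `strlen`
// against the platform C library, which Rust links by default.
extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

// Safe wrapper: CString guarantees a NUL-terminated, valid pointer,
// which is the contract strlen requires, so the `unsafe` block is sound.
fn c_strlen(s: &str) -> usize {
    let c = CString::new(s).expect("string contained an interior NUL byte");
    unsafe { strlen(c.as_ptr()) }
}
```

The memory-safety edict discussed in this thread is the reason the `unsafe` block is confined to one audited wrapper rather than scattered through the codebase.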

Re: [DISCUSS] Creating daffodil-vscode 1.0.0 release

2022-02-22 Thread John Wass
All issues are resolved.  If we are good to proceed I would nominate Shane
Dell to be the release manager.

On Thu, Feb 17, 2022, 10:38 John Wass  wrote:

> > #81 yarn disallows SVG's in the README
> > #82 yarn build fails with missing icon
> > #83 Rename vsix convenience binary?
>
> Agree, added those to the 1.0.0 milestone.
>
>
>
> On Thu, Feb 17, 2022 at 10:27 AM Steve Lawrence 
> wrote:
>
>> I believe these two issues prevent building the .vsix file and should
>> probably be fixed--being able to build a .vsix file seems critical for a
>> release:
>>
>> #81 yarn disallows SVG's in the README
>>
>>https://github.com/apache/daffodil-vscode/issues/81
>>
>> #82 yarn build fails with missing icon
>>
>>https://github.com/apache/daffodil-vscode/issues/82
>>
>> I'd also suggest we might want to resolve #83. It's not as critical to me as
>> the other two, but it seems like a good idea to get a consensus on the
>> best name for the convenience binary so that future releases are
>> consistent.
>>
>> #83 Rename vsix convenience binary?
>>
>>https://github.com/apache/daffodil-vscode/issues/83
>>
>>
>>
>> On 2/17/22 10:06 AM, John Wass wrote:
>> > I'd like to discuss starting the first official release of the Daffodil
>> > VS
>> > Code extension.
>> >
>> > All items in the 1.0.0 GitHub milestone have been closed (aside from one
>> > about performing the release).  Are there any issues remaining on the
>> > tracker that should be considered required, or any other blockers to
>> > getting a 1.0.0 RC out there?
>> >
>> > https://github.com/apache/daffodil-vscode/issues
>> > https://github.com/apache/daffodil-vscode/milestone/1
>> >
>>
>>


Re: [DISCUSS] Creating daffodil-vscode 1.0.0 release

2022-02-17 Thread John Wass
> #81 yarn disallows SVG's in the README
> #82 yarn build fails with missing icon
> #83 Rename vsix convenience binary?

Agree, added those to the 1.0.0 milestone.



On Thu, Feb 17, 2022 at 10:27 AM Steve Lawrence 
wrote:

> I believe these two issues prevent building the .vsix file and should
> probably be fixed--being able to build a .vsix file seems critical for a
> release:
>
> #81 yarn disallows SVG's in the README
>
>https://github.com/apache/daffodil-vscode/issues/81
>
> #82 yarn build fails with missing icon
>
>https://github.com/apache/daffodil-vscode/issues/82
>
> I'd also suggest we might want to resolve #83. It's not as critical to me as
> the other two, but it seems like a good idea to get a consensus on the
> best name for the convenience binary so that future releases are
> consistent.
>
> #83 Rename vsix convenience binary?
>
>    https://github.com/apache/daffodil-vscode/issues/83
>
>
>
> On 2/17/22 10:06 AM, John Wass wrote:
> > I'd like to discuss starting the first official release of the Daffodil
> > VS
> > Code extension.
> >
> > All items in the 1.0.0 GitHub milestone have been closed (aside from one
> > about performing the release).  Are there any issues remaining on the
> > tracker that should be considered required, or any other blockers to
> > getting a 1.0.0 RC out there?
> >
> > https://github.com/apache/daffodil-vscode/issues
> > https://github.com/apache/daffodil-vscode/milestone/1
> >
>
>


[DISCUSS] Creating daffodil-vscode 1.0.0 release

2022-02-17 Thread John Wass
I'd like to discuss starting the first official release of the Daffodil VS
Code extension.

All items in the 1.0.0 GitHub milestone have been closed (aside from one
about performing the release).  Are there any issues remaining on the
tracker that should be considered required, or any other blockers to
getting a 1.0.0 RC out there?

https://github.com/apache/daffodil-vscode/issues
https://github.com/apache/daffodil-vscode/milestone/1


Re: Hex editor operations

2021-11-11 Thread John Wass
Thinking about copying/pasting of data... if we can use the system
clipboard it could greatly improve interoperability and usability.

An example scenario would be embedding the contents of a file (A) in
another (B) by copying A from the host with ctrl+c, and then pasting in the
editor, where B is open, using ctrl+v or right click.

This isn't a must-have initially, but when clipboard functionality is
implemented, integrating with the system clipboard should be considered.


On Fri, Nov 5, 2021 at 10:10 AM Mike Beckerle  wrote:

> +1 for a properties panel where you can click on a byte, and it shows the
> bits off to the side/corner along with other potentially useful stuff: the
> position of the byte in bytes and in bits, its value as decimal, its value
> as an ASCII char, etc. This would take very little screen space.
>
> On Wed, Nov 3, 2021 at 8:01 AM John Wass  wrote:
>
> > Yep, I agree with all the bits about bits.
> >
> > The display of non-byte delimited data is covered by the concept of a
> > composable UI that allows for rendering ranges within the file
> > differently
> > while laying out these ranges in the proper order and in a seamless
> > presentation.  The editing of such data would also be straightforward.
> > The
> > bit view has a bit editor associated with it, allowing inline
> > modification
> > of bits, just like you would expect in the hex editor.
> >
> > Where it gets fuzzy is editing in a representation different from the
> > display, e.g., editing bits from a byte view.
> >
> > > a way to expand a single byte into a small presentation
> > > of 8 bits, allowing editing of just those 8 bits individually
> >
> > I can visualize this a few ways, not sure of what is the best for
> > usability.  A properties pane type component when a byte is selected
> > might
> > be the right one to start with.
> >
> > > Perhaps that is what you meant by Mask/Set operations?
> >
> > I was only considering a byte view; "Mask" was referring to bit
> > manipulation of each byte in a range, "Set" as a verb, simply to update
> > the
> > value of an existing byte.
> >
> > Your thoughts about the bit based views fill in the gaps around bits and
> > we
> > can consider similar context operations in that view as those I mentioned
> > for the byte view.
> >
> > > I would add to the "parse until this byte", or "start from this byte"
> > > to
> > > enable narrowing at both the start and end - parse just these bytes
> >
> > Yeah I like this.  The interface between hex editor and debugger commands
> > is going to be an important extension point.
> >
> >
> >
> >
> > On Mon, Nov 1, 2021 at 10:31 PM Mike Beckerle 
> > wrote:
> >
> > > This is a good list.
> > >
> > > One thing I think is important is that often one is dealing with hex
> > > data,
> > > but one needs to consider bit fields that do not respect byte
> > > boundaries.
> > >
> > > I suggest that a way to expand a single byte into a small presentation
> > > of 8 bits, allowing editing of just those 8 bits individually, is of
> > > value as part of a hex-editor. Perhaps that is what you meant by Mask/Set
> > > operations? I think switching to a full-fledged all 1's and 0's display
> > > mode is only for entirely non-byte-oriented data. Anything
> > > byte-oriented, users will want to use hex, and occasionally if they need to
> > > flip
> > > bits, a way for them to expand a byte or small run of bytes to 1's and
> > > 0's,
> > > but then collapse back to hex is likely very helpful.
> > >
> > > But I also suggest creating a minimal hex editor version first, then we
> > > get
> > > experience with it.
> > >
> > > Any little paper scribble exercises we find ourselves having to do on
> > > the
> > > side of using the editor, those are good candidates for things the UI
> > > should directly support.
> > >
> > > E.g. I have written things down on paper or in a text editor like:
> > >
> > > BF.32.A5.AC.(01|10 1101.101|1 0001)
> > >
> > > In a real UI these distinctions could be done quite differently.
> > > But the distinctions I'm making are dots separating bytes, pipes
> > > separating
> > > bit fields of length 2, 9, and 5 respectively, and parentheses
> > > indicating
> > > bytes expanded out to a bits region for display, e.g., so that 01 for 2
> > > bits isn't confused with hex 01 (8 bits)

Re: Hex editor operations

2021-11-11 Thread John Wass
> I’m not sure if this fits Daffodil’s needs but it could be interesting.

BinEd is Java; we need something in JavaScript.

> Here's an older paper about an innovative hex editor:

There are interesting concepts in there.  Some of them, the lenses for
example, are similar to things we have been discussing here.



On Thu, Nov 4, 2021 at 3:58 PM Dave Fisher  wrote:

> There’s a guy with a hex editor trying to see if he can make it into a
> community.
>
> https://lists.apache.org/thread/5cbcmfw08002p5ttgyd43kt4vq4c17o8
>
> https://bined.exbin.org/
>
> I’m not sure if this fits Daffodil’s needs but it could be interesting.
>
> Regards,
> Dave
>
> > On Nov 4, 2021, at 12:51 PM, Larry Barber 
> wrote:
> >
> > Here's an older paper about an innovative hex editor:
> >
> > Abstract
> > The analysis of binary data remains a challenge, especially for large or
> potentially inconsistent files. Traditionally, hex editors only make
> limited use of semantic information available to the user. We present an
> editor that supports user-supplied semantic data definitions. This semantic
> information is used throughout the program to realize semantic data
> visualization and data exploration capabilities not present in similar
> systems. Visualization and human-computer interaction techniques are
> applied. We show that this makes recognizing the structure of unknown or
> inconsistent data much more effective. Our approach demonstrates concepts
> that can be applied to the visual analysis of raw data in general.
> >
> >
> https://www.researchgate.net/publication/220836091_Vide_An_editor_for_the_visual_exploration_of_raw_data
> >
> > -Original Message-
> > From: Mike Beckerle 
> > Sent: Monday, November 1, 2021 10:31 PM
> > To: dev@daffodil.apache.org
> > Subject: Re: Hex editor operations
> >
> > This is a good list.
> >
> > One thing I think is important is that often one is dealing with hex
> data, but one needs to consider bit fields that do not respect byte
> boundaries.
> >
> > I suggest that a way to expand a single byte into a small
> presentation of 8 bits, allowing editing of just those 8 bits individually,
> is of value as part of a hex-editor. Perhaps that is what you meant by
> Mask/Set operations? I think switching to a full-fledged all 1's and 0's
> display mode is only for entirely non-byte-oriented data. Anything
> byte-oriented, users will want to use hex, and occasionally if they
> need to flip bits, a way for them to expand a byte or small run of bytes to
> 1's and 0's, but then collapse back to hex is likely very helpful.
> >
> > But I also suggest creating a minimal hex editor version first, then we
> get experience with it.
> >
> > Any little paper scribble exercises we find ourselves having to do on
> the side of using the editor, those are good candidates for things the UI
> should directly support.
> >
> > E.g. I have written things down on paper or in a text editor like:
> >
> > BF.32.A5.AC.(01|10 1101.101|1 0001)
> >
> > In a real UI these distinctions could be done quite differently.
> > But the distinctions I'm making are dots separating bytes, pipes
> separating bit fields of length 2, 9, and 5 respectively, and parentheses
> indicating bytes expanded out to a bits region for display, e.g., so that
> 01 for 2 bits isn't confused with hex 01 (8 bits), and here I have each
> nibble of 4 bits space-separated.
> >
> > Ultimately, we need a graphical means providing:
> > (a) a way of escaping from hex to bits for just a small region of the
> data where you care about the partial byte fields.
> > (b) a way of setting off bit fields from each other, that doesn't
> entirely lose the separation of hex digits.
> >
> Then if, say, that right-most bit field value is incorrect, you should be
> able to set editing focus on a bit and flip it.
> >
> > I would add to the "parse until this byte", or "start from this byte" to
> enable narrowing at both the start and end - parse just these bytes (in an
> identified region by start and end).  The use case I have in mind for this
> is sort of like unit testing. You narrow the data to just one part, then
> you specify to parse it not with the root element of the DFDL schema, but
> with a sub-element that you want to test.
> >
> > On Mon, Nov 1, 2021 at 6:19 AM John Wass  wrote:
> >
> >> Some thoughts on ways the hex editor would interact with an input
> >> stream and the debugger.  I visualize this using a context sensitive
>> menu in the hex editor as an entrypoint

Re: Hex editor operations

2021-11-03 Thread John Wass
Yep, I agree with all the bits about bits.

The display of non-byte delimited data is covered by the concept of a
composable UI that allows for rendering ranges within the file differently
while laying out these ranges in the proper order and in a seamless
presentation.  The editing of such data would also be straightforward.  The
bit view has a bit editor associated with it, allowing inline modification
of bits, just like you would expect in the hex editor.

Where it gets fuzzy is editing in a representation different from the
display, e.g., editing bits from a byte view.

> a way to expand a single byte into a small presentation
> of 8 bits, allowing editing of just those 8 bits individually

I can visualize this a few ways, not sure of what is the best for
usability.  A properties pane type component when a byte is selected might
be the right one to start with.

> Perhaps that is what you meant by Mask/Set operations?

I was only considering a byte view; "Mask" was referring to bit
manipulation of each byte in a range, "Set" as a verb, simply to update the
value of an existing byte.

Your thoughts about the bit based views fill in the gaps around bits and we
can consider similar context operations in that view as those I mentioned
for the byte view.

> I would add to the "parse until this byte", or "start from this byte" to
> enable narrowing at both the start and end - parse just these bytes

Yeah I like this.  The interface between hex editor and debugger commands
is going to be an important extension point.
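For reference, the expand-a-byte-into-bits operation discussed above is mechanically simple. A sketch of what an expand-and-edit component would compute underneath, assuming MSB-first display order (an assumption, since bit order is exactly the thing that can vary in DFDL data):

```rust
// Expand a byte into its 8 bits, most-significant bit first, the way a
// byte would be shown when "expanded" in the editor.
fn to_bits(byte: u8) -> [u8; 8] {
    let mut bits = [0u8; 8];
    for (i, bit) in bits.iter_mut().enumerate() {
        *bit = (byte >> (7 - i)) & 1;
    }
    bits
}

// Flip a single bit, with the index counted MSB-first so it matches the
// expanded display; collapsing back to hex is just the resulting byte.
fn flip_bit(byte: u8, index_msb_first: usize) -> u8 {
    assert!(index_msb_first < 8);
    byte ^ (1u8 << (7 - index_msb_first))
}
```

For example, 0xAC expands to 1010 1100, and flipping its first (most significant) bit collapses back to 0x2C.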




On Mon, Nov 1, 2021 at 10:31 PM Mike Beckerle  wrote:

> This is a good list.
>
> One thing I think is important is that often one is dealing with hex data,
> but one needs to consider bit fields that do not respect byte boundaries.
>
> I suggest that a way to expand a single byte into a small presentation
> of 8 bits, allowing editing of just those 8 bits individually, is of value
> as part of a hex-editor. Perhaps that is what you meant by Mask/Set
> operations? I think switching to a full-fledged all 1's and 0's display
> mode is only for entirely non-byte-oriented data. Anything byte-oriented,
> users will want to use hex, and occasionally if they need to flip
> bits, a way for them to expand a byte or small run of bytes to 1's and 0's,
> but then collapse back to hex is likely very helpful.
>
> But I also suggest creating a minimal hex editor version first, then we get
> experience with it.
>
> Any little paper scribble exercises we find ourselves having to do on the
> side of using the editor, those are good candidates for things the UI
> should directly support.
>
> E.g. I have written things down on paper or in a text editor like:
>
> BF.32.A5.AC.(01|10 1101.101|1 0001)
>
> In a real UI these distinctions could be done quite differently.
> But the distinctions I'm making are dots separating bytes, pipes separating
> bit fields of length 2, 9, and 5 respectively, and parentheses indicating
> bytes expanded out to a bits region for display, e.g., so that 01 for 2
> bits isn't confused with hex 01 (8 bits), and here I have each nibble of 4
> bits space-separated.
>
> Ultimately, we need a graphical means providing:
> (a) a way of escaping from hex to bits for just a small region of the data
> where you care about the partial byte fields.
> (b) a way of setting off bit fields from each other, that doesn't entirely
> lose the separation of hex digits.
>
> Then if, say, that right-most bit field value is incorrect, you should be
> able to set editing focus on a bit and flip it.
>
> I would add to the "parse until this byte", or "start from this byte" to
> enable narrowing at both the start and end - parse just these bytes (in an
> identified region by start and end).  The use case I have in mind for this
> is sort of like unit testing. You narrow the data to just one part, then
> you specify to parse it not with the root element of the DFDL schema, but
> with a sub-element that you want to test.
>
> On Mon, Nov 1, 2021 at 6:19 AM John Wass  wrote:
>
> > Some thoughts on ways the hex editor would interact with an input stream
> > and the debugger.  I visualize this using a context sensitive menu in the
> > hex editor as an entrypoint, where these are some of the operations that
> > could be offered.
> >
> > 1. Add/Delete/Mask/Set
> >   * Individual bytes
> >   * Blocks of bytes
> > 2. Copy/Paste bytes
> >   * Use system clipboard for interoperability
> > 3. Find/Replace bytes
> >   * In a selection
> >   * Across entire file
> > 4. Set breakpoints on bytes
> >   * Stops execution at related point in schema
> >   * Would require custom 

Re: [DISCUSS] Are we ready to release 3.2.0 Daffodil ?

2021-11-02 Thread John Wass
Nothing here for 3.2.0.

On Mon, Nov 1, 2021 at 3:42 PM Interrante, John A (GE Research, US) <
john.interra...@ge.com> wrote:

> I checked the Daffodil JIRA to see how many blocker, critical, and major
> issues we have so far:
>
> Blocker issues: 0
> Critical issues: 1
> https://issues.apache.org/jira/browse/DAFFODIL-2400 - New SAX API
> causes performance degradations
> Major issues: 132
> https://issues.apache.org/jira/browse/DAFFODIL-2574 - Cast error
> when multiplying two unsignedBytes
> and so on - too many to list individually
>
> The critical issue isn't a blocker and in fact hasn't been updated since
> October 2020 (1 year ago).  I see Daffodil's release plan in
> https://cwiki.apache.org/confluence/display/DAFFODIL/Roadmap+for+Upcoming+Releases
> already suggests reducing the number of JIRA issues between 3.2.0 and 3.3.0
> which sounds like a good plan.  Everything planned for 3.2.0 seems to be
> already checked off.
>
> I looked over the instructions in
> https://cwiki.apache.org/confluence/display/DAFFODIL/Release+Workflow
> carefully and preparing a release candidate looks doable but still a
> time-consuming bit of work.  Since we already have a volunteer, I'd rather
> wait for another release (thanks Mike).
>
> The first step in the Release workflow says to create a [DISCUSS] thread
> and allow time for discussion before creating the release candidate.  I've
> renamed this email's subject to start that formal thread and I'm fine with
> waiting less than 72 hours if we get responses from everyone we know who
> might be working on anything that they want to release in 3.2.0.  I plan to
> make some more changes and improvements to the Daffodil C code generator
> myself but the changes can go into 3.3.0 since I already build Daffodil
> from source anyway.  Is Steve Lawrence or John Wass working on something
> that should make it into 3.2.0?  Anyone else?
>
> John
>
> -Original Message-
> From: Mike Beckerle 
> Sent: Monday, November 1, 2021 1:44 PM
> To: dev@daffodil.apache.org
> Subject: EXT: Are we ready to release 3.2.0 Daffodil ?
>
> I believe 3.2.0 is functionally complete.
>
> Should we prepare a release candidate that we can vote on?
>
> We need someone to volunteer to be release manager.
>
> I am willing to do so, as I've not done this for several releases.
> However, if someone else who hasn't done one would like a chance, I will
> defer to them.
>


Hex editor operations

2021-11-01 Thread John Wass
Some thoughts on ways the hex editor would interact with an input stream
and the debugger.  I visualize this using a context sensitive menu in the
hex editor as an entrypoint, where these are some of the operations that
could be offered.

1. Add/Delete/Mask/Set
  * Individual bytes
  * Blocks of bytes
2. Copy/Paste bytes
  * Use system clipboard for interoperability
3. Find/Replace bytes
  * In a selection
  * Across entire file
4. Set breakpoints on bytes
  * Stops execution at related point in schema
  * Would require custom additions to DAP backend
5. Set run options on bytes
  * Debug-to this byte (and break)
  * Start from this byte (skipping previous)

The idea with breakpoints and run operations is that they would behave as a
normal breakpoint in code would.  I don't know if this is functionality for
the first pass of the editor, but it is definitely something to keep in
mind while designing that first pass.

What other operations could be supported?
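To pin down item 1, here is a sketch of one plausible semantics for Mask and Set over a range of bytes. These semantics are an assumption, not settled design: Mask as a bitwise AND over each byte in a range, Set as overwriting a run of existing bytes.

```rust
// "Mask": apply a bitwise AND to every byte in [start, end).
fn mask_range(data: &mut [u8], start: usize, end: usize, mask: u8) {
    for b in &mut data[start..end] {
        *b &= mask;
    }
}

// "Set": overwrite a run of existing bytes starting at `offset`.
fn set_range(data: &mut [u8], offset: usize, values: &[u8]) {
    data[offset..offset + values.len()].copy_from_slice(values);
}
```

Add and Delete would differ in kind: they change the file's length, so they need an insert/remove-capable buffer (e.g. a rope or piece table) rather than in-place slice edits like these.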


Re: for daffodil vscode debugger - misc pile of binary data display/edit ideas

2021-10-22 Thread John Wass
Another concept considered is the ability to display different regions of a
file with different formatting.  For example, in the case
where endianness varies across the file, if we could apply different
renderings to these different regions, it could assist schema developers in
reasoning about the data.

> File size limits for edit are acceptable.

We have found that to provide an interactive hex editor via VS Code we
would use an HTML view.  Wrapping the bit/byte representation in HTML tags
significantly increases the size for non-editing functions as well, so if
we need to limit size, the limit might be general rather than isolated to editing.

The scalable approach here will likely be to provide viewport-type
functionality that limits the amount of data loaded into VS
Code to what is needed for display.  Viewports combined with editing
present further challenges.  We are researching some ways to approach that
as well.
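The viewport mechanics reduce to seek-and-read of just the window the display needs. A sketch under that assumption (in Rust for brevity; the extension itself would do the equivalent in JavaScript against its data source):

```rust
use std::fs::File;
use std::io::{Read, Seek, SeekFrom};

// Load only the `len` bytes starting at `offset` -- one viewport --
// instead of the whole file, so memory use is bounded by the window size.
fn read_viewport(path: &str, offset: u64, len: usize) -> std::io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    file.seek(SeekFrom::Start(offset))?;
    let mut buf = vec![0u8; len];
    let mut filled = 0;
    // Loop because read() may return short counts even mid-file.
    while filled < len {
        let n = file.read(&mut buf[filled..])?;
        if n == 0 {
            break; // end of file: viewport is shorter than requested
        }
        filled += n;
    }
    buf.truncate(filled);
    Ok(buf)
}
```

Editing on top of this would keep a separate overlay of modified ranges rather than rewriting the file per keystroke, which is part of the challenge mentioned above.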




On Fri, Oct 15, 2021 at 3:21 PM Mike Beckerle  wrote:

> Some requirements that came up in a non-email discussion:
> * ability to edit and save data - editing at hex and bits level. File size
> limits for edit are acceptable.
> * display of bits
> * Right-to-left byte order
> * word-size-sensitive display (e.g., user can set to use 70 bits wide)
> * support for parse & unparse cycle (comparison of output data from
> unparse, to input data in hex/bits display)
>
> The rest of this email is a bunch of random pointers/ideas about binary
> data display/edit. Hopefully useful, not TL;DR.
>
> This is what some least-significant-bit-first data lines look like: They
> are 70 bits wide, because that's the word size of the format. The byte
> order is Right to Left.
>
> 00 1100 0011 1000     0100  0101   1000
> 
> 1110 1000 1100
> 00     0001 0101 1001  1110 1010 1000 0101 1011
> 0011 1001 1010 0010
> 11   1000   1101 0110      
>    0101
> 00    0001 1000   0111  1000   
>    1101
>
> I have highlighted the first fields of the first word. Just to show how
> non-byte-aligned this sort of data is.
>
> This same kind of data is sometimes padded with extra bits which would show
> up on the left. 2 bits more is pretty common, as then each "word" is an
> even 9 bytes, which would make using a hex representation potentially
> useful.  But I've also seen 5 bits of padding, and 75 bit words are no
> help.  So a user needs to be able to say how wide they want the
> presentation, in bits.
>
> The above 4 lines of bits,... that data format is often preceded by a
> 32-byte long big-endian mostSignificantBitFirst header all of which is
> byte-oriented byte-aligned data, and is most easily understood by looking
> at an ordinary L-to-R hex dump.
>
> Hence, users need to be able to examine a file of this sort of data, and
> break the data at byte 32, so that from byte 33 (base 1 numbering) onwards
> for the next 35 bytes (70 bits x 4 = 280 bits = 35 bytes) use the
> bit-oriented 70-bit-wide display. A typical data file will have many such
> header+message, suggesting one must switch back and forth between
> presentations of the data.
>
> You should also look at this bit-order tutorial:
> https://daffodil.apache.org/tutorials/bitorder.tutorial.tdml.xml which
> discusses R-to-L byte display also.
> This tutorial should convince you there is no need to be reordering the
> bits, only the bytes. I.e., in the above 70-bit words, the first byte is
> "1000 1100", regardless of whether the presentation is L-to-R or R-to-L.
>
> The Daffodil CLI debugger has a "data dump" utility that creates R-to-L
> dump display things like this:
>
> fedcba9876543210   ffee ddcc bbaa 9988 7766 5544 3322 1100  87654321
> cø€␀␀␀wü␚’gU€␀gä  63f8 8000  77fc 1a92 6755 8000 67e4 :
>␀␀␁›¶þ␐HD   00 0001 9bb6 fe10 4844 :0010
>
> That example is in the TestDump.scala file in the
> daffodil-io/src/test/scala/org/apache/daffodil/io/TestDump.scala file.
> The chars on the left are iso-8859-1 code points, except for the
> control-pictures characters used to represent those code points.
>
> (Email isn't lining up these characters correctly due to the
> control-pictures characters (like for NUL, DLE, SUB, etc.) not being fixed
> width in this font. I don't think there is a fixed-width font in the world
> with every unicode code point in it.)
>
> There's also examples there of L-to-R dumps for utf-8, utf-16, and utf-32
> data. E.g., this is utf-8 with some 3-byte Kanji chars:
>
> 87654321  0011 2233 4455 6677 8899 aabb ccdd eeff
> 0~1~2~3~4~5~6~7~8~9~a~b~c~d~e~f~
> : 4461 7465 20e5 b9b4 e69c 88e6 97a5 3d32 D~a~t~e~␣~年月日
> =~2~
> 0010: 3030 33e5 b9b4 3038 e69c 8832 37e6 97a5 0~0~3~年0~8~月2~7~日
> 
>
> Character sets are in general quite problematic, as there are some that
> include shift-chars which chang

Re: which tickets are critical for v1.0.0 of debugger?

2021-10-21 Thread John Wass
I moved #39 into the milestone based on discussion from GitHub.

https://github.com/apache/daffodil-vscode/milestone/1

On Mon, Oct 18, 2021 at 9:55 AM Mike Beckerle  wrote:

> I just wanted to try to get the team to mark the tickets that are critical
> to fix for v1.0.0 release with the 1.0.0 milestone.
>
> I marked 2 tickets. The build instructions ticket, which blocks my
> evaluation/usage of the debugger, is the most critical in my mind.
>
> Eliminating the wiki references to the pre-Apache site are important, but
> would not block creating a release-candidate as they are not part of the
> software itself.
>
> Several are clearly not priority for 1.0.0, e.g., TDML support.
>


Re: vscode debugger - next steps

2021-10-14 Thread John Wass
Milestone to track 1.0.0 -
https://github.com/apache/daffodil-vscode/milestone/1



On Thu, Oct 14, 2021 at 7:48 AM John Wass  wrote:

> > You mention a VSIX + ZIP? What is the zip you reference? Is that the
> source, or are there two convenience binaries that make up an extension?
>
> The backend is packaged as a zip, separate from the extension, to provide
> support for debugging against multiple Daffodil versions.
>
>
> > If we want to upload things to GitHub releases once things pass, that's
> probably okay, but anything else probably doesn't follow ASF guidelines.
>
> That's what was agreed upon previously.  I don't have a strong opinion on
> automated publishing, though it does sound like it would complicate the ASF process.
>
> Other Apache projects publish to GitHub, perhaps we can learn from them.
>
>
> > Seems odd the build.sbt and create_vsix.sh files are in here. Is that a
> bug?
>
> The ignores need to be updated.
>
> https://github.com/apache/daffodil-vscode/issues/30
>
>
>
>
> On Wed, Oct 13, 2021 at 12:50 PM Steve Lawrence 
> wrote:
>
>> I ran the ./create_vsix.sh script, and it does create a .vsix file, but
>> when I unzip that file, this is the contents:
>>
>> .
>> |-- [Content_Types].xml
>> |-- extension
>> |   |-- LICENSE.txt
>> |   |-- NOTICE
>> |   |-- README.md
>> |   |-- build.sbt
>> |   |-- create_vsix.sh
>> |   |-- dist
>> |   |   `-- ext
>> |   |   `-- extension.js
>> |   |-- images
>> |   |   |-- arrow.svg
>> |   |   `-- daffodil.jpg
>> |   |-- package.json
>> |   `-- snippets
>> |   |-- dfdl.json
>> |   `-- json-license.txt
>> `-- extension.vsixmanifest
>>
>> Seems odd the build.sbt and create_vsix.sh files are in here. Is that a
>> bug?
>>
>> It looks like all the ts files are combined and minimized into the
>> single extension.js file. I'm not sure where the dependencies are
>> though. Maybe they are downloaded dynamically? Or maybe just the parts
>> that are used are "statically compiled" into this extension.js? So
>> we can't easily know which dependencies actually end up in this .vsix
>> file?
>>
>> Also, the debugger .jar and its dependencies aren't in this vsix file?
>> Are those distributed/downloaded separately? Seems like they would
>> want to be distributed in the .vsix file so you just need to
>> distribute/install a single file? Is that possible?
>>
>> It's important to understand this so we can figure out what
>> LICENSE/NOTICE information is needed in this .vsix convenience binary.
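Since a .vsix is just a zip archive (as the unzip above demonstrates), its packaged contents can also be enumerated programmatically, which helps when auditing exactly which bundled files need LICENSE/NOTICE coverage. A minimal sketch; the entry names are a made-up stand-in, not the real extension's contents:

```python
import zipfile

def vsix_entries(path: str) -> list[str]:
    """List every file packaged inside a .vsix (an ordinary zip archive)."""
    with zipfile.ZipFile(path) as z:
        return sorted(z.namelist())

# Build a tiny stand-in archive to demonstrate; a real .vsix would come
# from create_vsix.sh (or `vsce package`).
with zipfile.ZipFile("demo.vsix", "w") as z:
    z.writestr("extension.vsixmanifest", "<PackageManifest/>")
    z.writestr("extension/package.json", "{}")
    z.writestr("extension/dist/ext/extension.js", "// bundled sources")

print(vsix_entries("demo.vsix"))
```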
>>
>>
>> On 10/13/21 12:22 PM, Adam Rosien wrote:
>> > My understanding is the Typescript code gets "compiled" into Javascript
>> > when built and packaged.
>> >
>> > On Wed, Oct 13, 2021 at 9:07 AM Mike Beckerle 
>> wrote:
>> >
>> >> Does the typescript code get compiled to a binary form (e.g.,
>> analogous to
>> >> a jar) or is it distributed as source (e.g., more like javascript)?
>> >>
>> >> On Wed, Oct 13, 2021 at 11:12 AM John Wass  wrote:
>> >>
>> >>> Status on Mike's original list
>> >>> 1-4 complete
>> >>>
>> >>> We have some tweaks that could be added for a 1.0.0, but perhaps we
>> get
>> >> a
>> >>> 1.0.0-RC1 out ASAP, and then can improve that with further RCs?
>> >>>
>> >>> Blockers for an initial RC right now might be
>> >>> 1. Do we know the ASF process for releasing the VSIX and zip to
>> GitHub?
>> >>> 2. Add the actions CI back in for automated release from a tag.
>> >>> 3. Items 5-7 of Mike's original list ?
>> >>>
>> >>> Sound right?
>> >>>
>> >>>
>> >>>
>> >>> On Thu, Oct 7, 2021 at 10:23 AM Mike Beckerle 
>> >>> wrote:
>> >>>
>> >>>> +1 on summary + linking to older issue
>> >>>>
>> >>>> On Thu, Oct 7, 2021 at 10:15 AM Steve Lawrence > >
>> >>>> wrote:
>> >>>>
>> >>>>> +1 sounds good to me
>> >>>>>
>> >>>>> On 10/7/21 10:14 AM, John Wass wrote:
>> >>>>>> Sounds good.  I will do that and start to move issues over.
>> >>>>>>
>> >>>>>> Some of these issues have multiple posts by multiple authors and it
>>

Re: vscode debugger - next steps

2021-10-14 Thread John Wass
> You mention a VSIX + ZIP? What is the zip you reference? Is that the
source, or are there two convenience binaries that make up an extension?

The backend is packaged as a zip, separate from the extension, to provide
support for debugging against multiple Daffodil versions.


> If we want to upload things to GitHub releases once things pass, that's
probably okay, but anything else probably doesn't follow ASF guidelines.

That's what was agreed upon previously.  I don't have a strong opinion on
automated publishing, though it does sound like it would complicate the ASF process.

Other Apache projects publish to GitHub, perhaps we can learn from them.


> Seems odd the build.sbt and create_vsix.sh files are in here. Is that a
bug?

The ignores need to be updated.

https://github.com/apache/daffodil-vscode/issues/30




On Wed, Oct 13, 2021 at 12:50 PM Steve Lawrence 
wrote:

> I ran the ./create_vsix.sh script, and it does create a .vsix file, but
> when I unzip that file, this is the contents:
>
> .
> |-- [Content_Types].xml
> |-- extension
> |   |-- LICENSE.txt
> |   |-- NOTICE
> |   |-- README.md
> |   |-- build.sbt
> |   |-- create_vsix.sh
> |   |-- dist
> |   |   `-- ext
> |   |   `-- extension.js
> |   |-- images
> |   |   |-- arrow.svg
> |   |   `-- daffodil.jpg
> |   |-- package.json
> |   `-- snippets
> |   |-- dfdl.json
> |   `-- json-license.txt
> `-- extension.vsixmanifest
>
> Seems odd the build.sbt and create_vsix.sh files are in here. Is that a
> bug?
>
> It looks like all the ts files are combined and minimized into the
> single extension.js file. I'm not sure where the dependencies are
> though. Maybe they are downloaded dynamically? Or maybe just the parts
> that are used are "statically compiled" into this extension.js? So
> we can't easily know which dependencies actually end up in this .vsix file?
>
> Also, the debugger .jar and its dependencies aren't in this vsix file?
> Are those distributed/downloaded separately? Seems like they would
> want to be distributed in the .vsix file so you just need to
> distribute/install a single file? Is that possible?
>
> It's important to understand this so we can figure out what
> LICENSE/NOTICE information is needed in this .vsix convenience binary.
>
>
> On 10/13/21 12:22 PM, Adam Rosien wrote:
> > My understanding is the Typescript code gets "compiled" into Javascript
> > when built and packaged.
> >
> > On Wed, Oct 13, 2021 at 9:07 AM Mike Beckerle 
> wrote:
> >
> >> Does the typescript code get compiled to a binary form (e.g., analogous
> to
> >> a jar) or is it distributed as source (e.g., more like javascript)?
> >>
> >> On Wed, Oct 13, 2021 at 11:12 AM John Wass  wrote:
> >>
> >>> Status on Mike's original list
> >>> 1-4 complete
> >>>
> >>> We have some tweaks that could be added for a 1.0.0, but perhaps we get
> >> a
> >>> 1.0.0-RC1 out ASAP, and then can improve that with further RCs?
> >>>
> >>> Blockers for an initial RC right now might be
> >>> 1. Do we know the ASF process for releasing the VSIX and zip to GitHub?
> >>> 2. Add the actions CI back in for automated release from a tag.
> >>> 3. Items 5-7 of Mike's original list ?
> >>>
> >>> Sound right?
> >>>
> >>>
> >>>
> >>> On Thu, Oct 7, 2021 at 10:23 AM Mike Beckerle 
> >>> wrote:
> >>>
> >>>> +1 on summary + linking to older issue
> >>>>
> >>>> On Thu, Oct 7, 2021 at 10:15 AM Steve Lawrence 
> >>>> wrote:
> >>>>
> >>>>> +1 sounds good to me
> >>>>>
> >>>>> On 10/7/21 10:14 AM, John Wass wrote:
> >>>>>> Sounds good.  I will do that and start to move issues over.
> >>>>>>
> >>>>>> Some of these issues have multiple posts by multiple authors and it
> >>>> will
> >>>>> be
> >>>>>> hard to capture all that context in the new repo.  I'm thinking of
> >>>>>> generating a summary and then adding a link to the archived issue.
> >>> Any
> >>>>>> objections there?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Oct 6, 2021 at 5:01 PM Mike Beckerle  >>>
> >>>>> wrote:
> >>>>>>
> >>>>>>> No problem. Go ahead. That's a sensi

Re: vscode debugger - next steps

2021-10-13 Thread John Wass
Status on Mike's original list
1-4 complete

We have some tweaks that could be added for a 1.0.0, but perhaps we get a
1.0.0-RC1 out ASAP, and then can improve that with further RCs?

Blockers for an initial RC right now might be
1. Do we know the ASF process for releasing the VSIX and zip to GitHub?
2. Add the actions CI back in for automated release from a tag.
3. Items 5-7 of Mike's original list ?

Sound right?



On Thu, Oct 7, 2021 at 10:23 AM Mike Beckerle  wrote:

> +1 on summary + linking to older issue
>
> On Thu, Oct 7, 2021 at 10:15 AM Steve Lawrence 
> wrote:
>
> > +1 sounds good to me
> >
> > On 10/7/21 10:14 AM, John Wass wrote:
> > > Sounds good.  I will do that and start to move issues over.
> > >
> > > Some of these issues have multiple posts by multiple authors and it
> will
> > be
> > > hard to capture all that context in the new repo.  I'm thinking of
> > > generating a summary and then adding a link to the archived issue.  Any
> > > objections there?
> > >
> > >
> > >
> > > On Wed, Oct 6, 2021 at 5:01 PM Mike Beckerle 
> > wrote:
> > >
> > >> No problem. Go ahead. That's a sensible last commit on that repo.
> > >>
> > >> On Wed, Oct 6, 2021 at 5:00 PM John Wass  wrote:
> > >>
> > >>> Will there be any problem with updating the readme in the old repo to
> > >> note
> > >>> that it is relocated?
> > >>>
> > >>> On Wed, Oct 6, 2021 at 4:20 PM John Wass  wrote:
> > >>>
> > >>>> Steps 1-7 sound good to me.  Some thoughts
> > >>>>
> > >>>>> 1) push to https://github.com/apache/daffodil-vscode repository.
> > >>>>
> > >>>> Who is going to push the code?
> > >>>>
> > >>>>> 2) move over github issues to the new repo issues
> > >>>>
> > >>>> It doesn't look like the "transfer issue" function works across
> orgs.
> > >> So
> > >>>> a manual move it shall be.
> > >>>>
> > >>>>> 3) move wiki pages/doc to the github wiki associated with the new
> > >>>> repository
> > >>>>
> > >>>> Same thing, manual copy.  Not as significant as issue moving.
> > >>>>
> > >>>>> 4) archive the old original github repo (for posterity).
> > >>>>
> > >>>> Concur. I'd say this happens first to ensure nothing drifts while we
> > are
> > >>>> moving things around.
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Wed, Oct 6, 2021 at 10:41 AM Mike Beckerle  >
> > >>>> wrote:
> > >>>>
> > >>>>> With the IP-clearance now complete, next steps (I think) are:
> > >>>>>
> > >>>>> 1) push to https://github.com/apache/daffodil-vscode repository.
> > >>>>> I believe the existing repo main branch should be pushed here as
> is,
> > >>>>> i.e.,
> > >>>>> no need to squash anything.
> > >>>>> Note the main branch is named "main", not master.
> > >>>>> Tag it at the current point on the main branch. (suggest tag name
> > >>>>> apache-ip-clearance ? or happy-apache-birthday ?)
> > >>>>> 2) move over github issues to the new repo issues
> > >>>>> 3) move wiki pages/doc to the github wiki associated with the new
> > >>>>> repository
> > >>>>> 4) archive the old original github repo (for posterity).
> > >>>>> 5) update main daffodil-site pages to mention/highlight the new
> > vscode
> > >>>>> debugger and link to its issues and wiki.
> > >>>>> 6) whatever else I forgot
> > >>>>>
> > >>>>> and
> > >>>>>
> > >>>>> 7) start planning for release 1.0.0.
> > >>>>>
> > >>>>> I am not sure what additional things are needed in order to meet
> > Apache
> > >>>>> criteria for release, given the vscode marketplace as a means of
> > >>>>> distribution. Perhaps we don't need to solve that yet?
> > >>>>>
> > >>>>> I think we covered almost everything else during the IP-clearance
> > >>>>> process.
> > >>>>>
> > >>>>> If there are things, let's discuss them here on the dev list.
> > >>>>>
> > >>>>
> > >>
> > >
> >
> >
>


Re: vscode debugger - next steps

2021-10-07 Thread John Wass
Sounds good.  I will do that and start to move issues over.

Some of these issues have multiple posts by multiple authors and it will be
hard to capture all that context in the new repo.  I'm thinking of
generating a summary and then adding a link to the archived issue.  Any
objections there?



On Wed, Oct 6, 2021 at 5:01 PM Mike Beckerle  wrote:

> No problem. Go ahead. That's a sensible last commit on that repo.
>
> On Wed, Oct 6, 2021 at 5:00 PM John Wass  wrote:
>
> > Will there be any problem with updating the readme in the old repo to
> note
> > that it is relocated?
> >
> > On Wed, Oct 6, 2021 at 4:20 PM John Wass  wrote:
> >
> >> Steps 1-7 sound good to me.  Some thoughts
> >>
> >> > 1) push to https://github.com/apache/daffodil-vscode repository.
> >>
> >> Who is going to push the code?
> >>
> >> > 2) move over github issues to the new repo issues
> >>
> >> It doesn't look like the "transfer issue" function works across orgs.
> So
> >> a manual move it shall be.
> >>
> >> > 3) move wiki pages/doc to the github wiki associated with the new
> >> repository
> >>
> >> Same thing, manual copy.  Not as significant as issue moving.
> >>
> >> > 4) archive the old original github repo (for posterity).
> >>
> >> Concur. I'd say this happens first to ensure nothing drifts while we are
> >> moving things around.
> >>
> >>
> >>
> >>
> >>
> >> On Wed, Oct 6, 2021 at 10:41 AM Mike Beckerle 
> >> wrote:
> >>
> >>> With the IP-clearance now complete, next steps (I think) are:
> >>>
> >>> 1) push to https://github.com/apache/daffodil-vscode repository.
> >>> I believe the existing repo main branch should be pushed here as is,
> >>> i.e.,
> >>> no need to squash anything.
> >>> Note the main branch is named "main", not master.
> >>> Tag it at the current point on the main branch. (suggest tag name
> >>> apache-ip-clearance ? or happy-apache-birthday ?)
> >>> 2) move over github issues to the new repo issues
> >>> 3) move wiki pages/doc to the github wiki associated with the new
> >>> repository
> >>> 4) archive the old original github repo (for posterity).
> >>> 5) update main daffodil-site pages to mention/highlight the new vscode
> >>> debugger and link to its issues and wiki.
> >>> 6) whatever else I forgot
> >>>
> >>> and
> >>>
> >>> 7) start planning for release 1.0.0.
> >>>
> >>> I am not sure what additional things are needed in order to meet Apache
> >>> criteria for release, given the vscode marketplace as a means of
> >>> distribution. Perhaps we don't need to solve that yet?
> >>>
> >>> I think we covered almost everything else during the IP-clearance
> >>> process.
> >>>
> >>> If there are things, let's discuss them here on the dev list.
> >>>
> >>
>


Re: vscode debugger - next steps

2021-10-06 Thread John Wass
Will there be any problem with updating the readme in the old repo to note
that it is relocated?

On Wed, Oct 6, 2021 at 4:20 PM John Wass  wrote:

> Steps 1-7 sound good to me.  Some thoughts
>
> > 1) push to https://github.com/apache/daffodil-vscode repository.
>
> Who is going to push the code?
>
> > 2) move over github issues to the new repo issues
>
> It doesn't look like the "transfer issue" function works across orgs.  So
> a manual move it shall be.
>
> > 3) move wiki pages/doc to the github wiki associated with the new
> repository
>
> Same thing, manual copy.  Not as significant as issue moving.
>
> > 4) archive the old original github repo (for posterity).
>
> Concur. I'd say this happens first to ensure nothing drifts while we are
> moving things around.
>
>
>
>
>
> On Wed, Oct 6, 2021 at 10:41 AM Mike Beckerle 
> wrote:
>
>> With the IP-clearance now complete, next steps (I think) are:
>>
>> 1) push to https://github.com/apache/daffodil-vscode repository.
>> I believe the existing repo main branch should be pushed here as is, i.e.,
>> no need to squash anything.
>> Note the main branch is named "main", not master.
>> Tag it at the current point on the main branch. (suggest tag name
>> apache-ip-clearance ? or happy-apache-birthday ?)
>> 2) move over github issues to the new repo issues
>> 3) move wiki pages/doc to the github wiki associated with the new
>> repository
>> 4) archive the old original github repo (for posterity).
>> 5) update main daffodil-site pages to mention/highlight the new vscode
>> debugger and link to its issues and wiki.
>> 6) whatever else I forgot
>>
>> and
>>
>> 7) start planning for release 1.0.0.
>>
>> I am not sure what additional things are needed in order to meet Apache
>> criteria for release, given the vscode marketplace as a means of
>> distribution. Perhaps we don't need to solve that yet?
>>
>> I think we covered almost everything else during the IP-clearance process.
>>
>> If there are things, let's discuss them here on the dev list.
>>
>


Re: vscode debugger - next steps

2021-10-06 Thread John Wass
Steps 1-7 sound good to me.  Some thoughts

> 1) push to https://github.com/apache/daffodil-vscode repository.

Who is going to push the code?

> 2) move over github issues to the new repo issues

It doesn't look like the "transfer issue" function works across orgs.  So a
manual move it shall be.

> 3) move wiki pages/doc to the github wiki associated with the new
repository

Same thing, manual copy.  Not as significant as issue moving.

> 4) archive the old original github repo (for posterity).

Concur. I'd say this happens first to ensure nothing drifts while we are
moving things around.





On Wed, Oct 6, 2021 at 10:41 AM Mike Beckerle  wrote:

> With the IP-clearance now complete, next steps (I think) are:
>
> 1) push to https://github.com/apache/daffodil-vscode repository.
> I believe the existing repo main branch should be pushed here as is, i.e.,
> no need to squash anything.
> Note the main branch is named "main", not master.
> Tag it at the current point on the main branch. (suggest tag name
> apache-ip-clearance ? or happy-apache-birthday ?)
> 2) move over github issues to the new repo issues
> 3) move wiki pages/doc to the github wiki associated with the new
> repository
> 4) archive the old original github repo (for posterity).
> 5) update main daffodil-site pages to mention/highlight the new vscode
> debugger and link to its issues and wiki.
> 6) whatever else I forgot
>
> and
>
> 7) start planning for release 1.0.0.
>
> I am not sure what additional things are needed in order to meet Apache
> criteria for release, given the vscode marketplace as a means of
> distribution. Perhaps we don't need to solve that yet?
>
> I think we covered almost everything else during the IP-clearance process.
>
> If there are things, let's discuss them here on the dev list.
>


Re: vscode debugger - next steps

2021-10-06 Thread John Wass
> Are there even convenience binaries? What is actually published to the
market place?

We will send a VS Code extension package (.vsix) to the marketplace.

We will also need to make the daffodil debugger zip file from the build
available from the GitHub release.

> Perhaps we don't need to solve that yet?

If we publish the vsix alongside the zip like we do in the example repo, it
will get us up and running for manual installations.


On Wed, Oct 6, 2021 at 11:14 AM Steve Lawrence  wrote:

> Just want to clarify that you are suggesting we force push and overwrite
> the current main branch with the IP clearance main branch? This seems
> fine to me, but we do need to make sure to add back the .asf.yml file. This
> can be done as the first PR.
>
> I'm also not sure we need to tag it. I usually prefer tags for just
> releases. If there are ever any concerns about the IP clearance, we can
> always refer to the IP clearance documentation, which lists the git
> commit hash. Since we are force pushing, the main branch will have the
> same commit hash.
>
> We'll also need to update the LICENSE/NOTICE files prior to release.
> There are a lot of npm dependencies that we have verified are compatible
> with ALv2, but we still need to document them in the license file of any
> convenience binaries.
>
> Are there even convenience binaries? What is actually published to the
> market place?
>
> As for the release process, it will be very similar to the Daffodil
> repo.  We'll have to do the normal vote, which will be for the source
> and convenience binaries. Once that vote passes, we can publish the
> convenience binaries to the VSCode marketplace. We might need to work
> with Apache Infra if there's an "ASF" thing to publish under.
>
> We might want a Release Workflow page dedicated to the vscode repo.
>
>
> On 10/6/21 10:40 AM, Mike Beckerle wrote:
> > With the IP-clearance now complete, next steps (I think) are:
> >
> > 1) push to https://github.com/apache/daffodil-vscode repository.
> > I believe the existing repo main branch should be pushed here as is,
> i.e.,
> > no need to squash anything.
> > Note the main branch is named "main", not master.
> > Tag it at the current point on the main branch. (suggest tag name
> > apache-ip-clearance ? or happy-apache-birthday ?)
> > 2) move over github issues to the new repo issues
> > 3) move wiki pages/doc to the github wiki associated with the new
> repository
> > 4) archive the old original github repo (for posterity).
> > 5) update main daffodil-site pages to mention/highlight the new vscode
> > debugger and link to its issues and wiki.
> > 6) whatever else I forgot
> >
> > and
> >
> > 7) start planning for release 1.0.0.
> >
> > I am not sure what additional things are needed in order to meet Apache
> > criteria for release, given the vscode marketplace as a means of
> > distribution. Perhaps we don't need to solve that yet?
> >
> > I think we covered almost everything else during the IP-clearance
> process.
> >
> > If there are things, let's discuss them here on the dev list.
> >
>
>


Re: github issues vs. JIRA for VSCode Debugger - Fw: Apache JIRA vs. github "issues"

2021-09-23 Thread John Wass
+1 for GitHub Issues

- gain conveniences by staying within GitHub
- avoid complexity involved with separating things
- better experience for new contributors


On Thu, Sep 23, 2021 at 10:29 AM Beckerle, Mike <
mbecke...@owlcyberdefense.com> wrote:

> So, I inquired about whether we need to use JIRA, or can just use github
> issues.
>
> I got a reply basically saying we can do what we prefer. (reply is below.
> Apache Airflow uses github issues)
>
> The regular Apache Daffodil repo has a pretty big investment in using
> JIRA. I'm not suggesting we consider switching that.
>
> For VSCode, we could stick with using JIRA, but that would mix its issues
> into the ~390 other Apache Daffodil issues.
>
> There are pros and cons to this.
>
> I am wondering if for the VSCode repo (once established), we should just
> use github issues instead.
>
> Thoughts?
>
> -mikeb
>
> 
> From: Jarek Potiuk 
> Sent: Thursday, September 23, 2021 10:21 AM
> To: Beckerle, Mike 
> Cc: us...@infra.apache.org 
> Subject: Re: Apache JIRA vs. github "issues"
>
> It's quite OK to only use Github Issues/Discussions - we switched to GH in
> Apache Airflow ~ 2 years ago I think.
>
> And a comment from our perspective as a big project that used GitHub
> Issues at its inception, switched to JIRA, and finally returned to
> GitHub Issues once they matured. Others might have a different
> experience, but this is ours (and I am pretty sure I am representing the
> view of pretty much the whole Airflow community).
>
> I witnessed just the last switch - from JIRA to GitHub. We stopped using
> JIRA in Apache Airflow in favour of GitHub Issues and Discussions and we
> NEVER looked back. Not a minute. Not even a second. Absolutely no-one
> missed JIRA. Not by far.
>
> That was such an amazing improvement in the overall workflow and
> contributors' engagement. I can't even imagine how we would be able to run
> the project with JIRA.
>
> The overall experience, integration level, overhead needed to manage JIRA
> issues, dual-logging-in and syncing between the two were absolutely
> unmanageable for us. With GitHub Issues we chose to base our "change
> tracking" on PR # rather than Issue #, making issue numbers optional, and
> it made a whole world of difference.
>
> Especially recently with GitHub Discussions added to the mix and the ability to
> convert issues into discussions (and back) if they are not real issues.
>
> J.
>
>
> On Thu, Sep 23, 2021 at 4:01 PM Beckerle, Mike <
> mbecke...@owlcyberdefense.com>
> wrote:
> I read an old blog post from Infra about increasing GitHub integration.
>
> I am wondering about Apache JIRA, vs. using the issues feature of github,
> for an Apache project repo.
>
> Can we use github's issues feature, or do we have to use Apache's JIRA? Is
> there a policy, or even strong preference on this issue?
>
> Thanks
>
> Mike Beckerle
> Apache Daffodil PMC
>
>
>
>


Re: verify licenses on dependencies for vscode debugger

2021-09-20 Thread John Wass
JS dependencies here; this should include all transitives too

https://github.com/jw3/example-daffodil-vscode/wiki/js-dependencies
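One way to enumerate the licenses of installed npm packages, transitive ones included, is to walk node_modules and read the `license` field of each package's package.json. This is a rough sketch under the assumption that packages declare their license there (most do; some older ones use an object form or omit it entirely, which still need manual review):

```python
import collections
import json
import os

def licenses_in(node_modules: str) -> dict[str, list[str]]:
    """Group installed npm packages (transitives included) by license field."""
    by_license = collections.defaultdict(list)
    for root, _dirs, files in os.walk(node_modules):
        if "package.json" not in files:
            continue
        try:
            with open(os.path.join(root, "package.json")) as f:
                meta = json.load(f)
        except (OSError, ValueError):
            continue
        name = meta.get("name")
        if not name:  # skip stray package.json files with no package name
            continue
        lic = meta.get("license", "UNKNOWN")
        if isinstance(lic, dict):  # legacy {"type": ..., "url": ...} form
            lic = lic.get("type", "UNKNOWN")
        by_license[lic].append(name)
    return dict(by_license)
```

Printing the groups yields a report like the linked wiki page; anything landing under UNKNOWN should be cross-checked by hand, as was done for the JVM dependencies below.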

On Mon, Sep 20, 2021 at 7:42 AM Steve Lawrence  wrote:

> These all look compatible with the Apache license and shouldn't be a
> problem. The EPL 1.0 dependencies will require some extra labeling in
> the distributed binary, but that's not a big deal.
>
> package.json also lists some dependencies; I think these are all
> javascript/npm dependencies? We probably need to verify the full
> transitive graph of these dependencies as well.
>
> On 9/19/21 4:11 PM, Adam Rosien wrote:
> >   From sbt, run core/dependencyLicenseInfo (see
> > https://github.com/sbt/sbt-dependency-graph
> >  for instructions):
> >
> > ---
> > No license specified
> > Concurrent Technologies Corporation, Nteligen
> > LLC:daffodil-debugger_2.12:0.0.15-18-g091ad23-SNAPSHOT
> > commons-io:commons-io:2.8.0
> > com.google.code.gson:gson:2.7
> > com.microsoft.java:com.microsoft.java.debug.core:0.31.1
> > ch.qos.logback:logback-classic:1.2.3
> > org.apache.commons:commons-lang3:3.6
> > xml-resolver:xml-resolver:1.2
> > ch.qos.logback:logback-core:1.2.3
> > org.slf4j:slf4j-api:1.7.30
> >
> > Apache 2.0
> > org.typelevel:simulacrum-scalafix-annotations_2.12:0.5.4
> >
> > Apache License, Version 2.0
> > org.apache.daffodil:daffodil-core_2.12:3.1.0
> > org.apache.daffodil:daffodil-sapi_2.12:3.1.0
> > org.apache.daffodil:daffodil-runtime1-unparser_2.12:3.1.0
> > org.apache.daffodil:daffodil-runtime1_2.12:3.1.0
> > org.apache.daffodil:daffodil-io_2.12:3.1.0
> > org.apache.daffodil:daffodil-udf_2.12:3.1.0
> > org.apache.daffodil:daffodil-lib_2.12:3.1.0
> >
> > Apache-2.0
> > com.typesafe:config:1.4.1
> > org.scala-lang.modules:scala-xml_2.12:1.3.0
> > org.typelevel:log4cats-slf4j_2.12:2.1.0
> > org.typelevel:log4cats-core_2.12:2.1.0
> > org.scala-lang.modules:scala-parser-combinators_2.12:1.1.2
> > org.typelevel:cats-effect_2.12:3.1.1
> > org.typelevel:cats-effect-kernel_2.12:3.1.1
> > com.monovore:decline_2.12:2.1.0
> > org.typelevel:cats-effect-std_2.12:3.1.1
> > com.monovore:decline-effect_2.12:2.1.0
> > com.comcast:ip4s-core_2.12:3.0.3
> > org.typelevel:literally_2.12:1.0.2
> >
> > BSD-3-Clause
> > org.scodec:scodec-bits_2.12:1.1.27
> >
> > CC0
> > org.reactivestreams:reactive-streams:1.0.0
> >
> > MIT
> > org.typelevel:cats-core_2.12:2.6.1
> > co.fs2:fs2-io_2.12:3.0.4
> > com.lihaoyi:os-lib_2.12:0.7.6
> > com.lihaoyi:geny_2.12:0.6.9
> > org.typelevel:cats-kernel_2.12:2.6.1
> > co.fs2:fs2-core_2.12:3.0.4
> >
> > Similar to Apache License but with the acknowledgment clause removed
> > org.jdom:jdom2:2.0.6
> >
> > The Apache License, Version 2.0
> > com.fasterxml.woodstox:woodstox-core:6.2.6
> >
> > The Apache Software License, Version 2.0
> > xml-apis:xml-apis:1.4.01
> > xerces:xercesImpl:2.12.1
> > com.fasterxml.jackson.core:jackson-core:2.12.3
> > io.reactivex.rxjava2:rxjava:2.1.1
> >
> > The BSD License
> > org.codehaus.woodstox:stax2-api:4.2.1
> >
> > Unicode/ICU License
> > com.ibm.icu:icu4j:69.1
> > ---
> >
> > Notes:
> >
> >   From the "No license specified" entries, I looked at either the actual pom.xml
> files or
> > the source repository, and determined the actual licenses are:
> >
> > - APL 2.0
> > - commons-io:commons-io:2.8.0
> > - com.google.code.gson:gson:2.7
> > - org.apache.commons:commons-lang3:3.6
> > - xml-resolver:xml-resolver:1.2
> > - Eclipse Public License - v 1.0
> > - com.microsoft.java:com.microsoft.java.debug.core:0.31.1
> > - ch.qos.logback:logback-classic:1.2.3
> > - ch.qos.logback:logback-core:1.2.3
> > - MIT
> > - org.slf4j:slf4j-api:1.7.30
> >
> > On Fri, Sep 17, 2021 at 4:45 PM Adam Rosien  > > wrote:
> >
> >  I said I'd do it, but completely forgot! I'll get this out this
> weekend.
> >
> >  .. Adam
> >
> >  On Fri, Sep 17, 2021 at 3:24 PM Beckerle, Mike
> >   mbecke...@owlcyberdefense.com>> wrote:
> >
> >  I recall someone verifying the licenses on dependencies. I
> can't find
> >  that message now.
> >
> >  However, this must be a transitive verification, so there's
> quite a few.
> >
> >  The build.sbt has only:
> >
> > "ch.qos.logback" % "logback-classic" % "1.2.3",
> > "com.microsoft.java" % "com.microsoft.java.debug.core" %
> "0.31.1",
> > "co.fs2" %% "fs2-io" % "3.0.4",
> > "com.monovore" %% "decline-effect" % "2.1.0",
> > "org.typelevel" %% "log4cats-slf4j" % "2.1.0",
> >
> >  for the typescript code, I see a bunch in package.json.
> >
> >  Action Required: Can someone please verify the licenses of all
> the
> >  dependencies transitively and send me the list?
> >
> >  This is specifically what the IP Clearance checklist asks:
> >
> > Check and make sure that all items depended
> upon by the
> >   

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-16 Thread John Wass
> I know of one file in the repo which will have to be removed which is the
jpeg.dfdl.xsd file, which is there just as an example workspace.

I assume this issue remains, and needs to be addressed prior to giving this
the done stamp.

We could just remove that sample workspace; the setup is trivial and is
addressed in the docs. But that schema and jpg also exist for unit tests.

Looking through the test resources in Daffodil now; any suggestions on a
good candidate are welcome.



On Thu, Sep 9, 2021 at 2:11 PM Beckerle, Mike 
wrote:

> I know of one file in the repo which will have to be removed which is the
> jpeg.dfdl.xsd file, which is there just as an example workspace.
>
> The copyright and provisions of that are not compatible with Apache
> licensing.
>
> We can find a DFDL schema that we created that has Apache license to use
> instead.
>
> For the other files under src, server, and build, can we generate a list
> of files identifying which are:
>
> (a) original MIT-licensed, unmodified
> (b) new - can be ASL
> (c) blended - started from MIT-licensed source, modified with
> daffodil-vscode-specific changes.
>
> It is these blended files that are the problematic ones.
>
>
>
> 
> From: Steve Lawrence 
> Sent: Thursday, September 9, 2021 1:38 PM
> To: dev@daffodil.apache.org 
> Subject: Re: daffodil-vscode - how to package and identify the
> contribution - some git questions
>
> Correct. For more information about Apache license compatibility:
>
>   https://www.apache.org/legal/resolved.html
>
> MIT is Category A and is fine. EPL is Category B and is also okay, but
> generally only in its binary form. So these top-level dependencies look
> okay, assuming their transitive dependencies are also okay.
>
> We'll also need to verify the licenses of all code in the repo.
> Hopefully little of that is original microsoft MIT and can be granted to
> ASF and relicensed.
>
>
> On 9/9/21 1:30 PM, Beckerle, Mike wrote:
> > The requirement, is that the entire dependency tree (transitively)
> cannot depend on any software that has an Apache-incompatible (aka
> restrictive) license.
> >
> > So we need the transitive closure of all dependencies.
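The transitive closure Mike describes is a plain graph walk once each package's direct dependencies are known (from sbt, npm, or whatever build tool). A sketch, with made-up package names for illustration:

```python
def transitive_deps(direct: dict[str, set[str]], roots: set[str]) -> set[str]:
    """Everything reachable from `roots` via direct-dependency edges."""
    seen: set[str] = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(direct.get(pkg, ()))
    return seen - roots  # the dependencies themselves, excluding the roots

# Hypothetical direct-dependency graph (names for illustration only).
graph = {
    "daffodil-debugger": {"fs2-io", "decline-effect"},
    "fs2-io": {"fs2-core"},
    "fs2-core": {"cats-effect"},
    "decline-effect": {"cats-effect"},
}
print(sorted(transitive_deps(graph, {"daffodil-debugger"})))
```

Every package in the resulting set, not just the direct dependencies listed in build.sbt or package.json, must have an Apache-compatible license.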
> >
> >
> > 
> > From: Adam Rosien 
> > Sent: Thursday, September 9, 2021 12:44 PM
> > To: dev@daffodil.apache.org 
> > Subject: Re: daffodil-vscode - how to package and identify the
> contribution - some git questions
> >
> > (I don't understand the requirements of licensing + transitive
> > dependencies, so I'm giving some surface level license info)
> >
> > "ch.qos.logback" % "logback-classic" % "1.2.3" - EPL
> > http://logback.qos.ch/license.html
> > "com.microsoft.java" % "com.microsoft.java.debug.core" % "0.31.1" - EPL
> 1.0
> > "co.fs2" %% "fs2-io" % "3.0.4" - MIT
> > "com.monovore" %% "decline-effect" % "2.1.0" - APL 2.0
> > "org.typelevel" %% "log4cats-slf4j" % "2.1.0" - APL 2.0
> >
> > On Thu, Sep 9, 2021 at 9:35 AM Adam Rosien  wrote:
> >
> >> I can relay the list of dependencies and their licenses.
> >>
> >> On Thu, Sep 9, 2021 at 9:20 AM Steve Lawrence 
> >> wrote:
> >>
> >>> I personally don't care too much about having the existing git history
> >>> once it's part of ASF, especially if it makes things any easier (as you
> >>> mention, squash/rebase can be difficult through merges). So I'd say we
> >>> just do plan B--create a tarball of the current state (without the git
> >>> history), and the content of that tarball is what goes through the IP
> >>> clearance process, and is the content of the initial commit when adding
> >>> to the apache/daffodil-vscode repo.
> >>>
> >>> Note that I think the incubator will still want access to the existing
> >>> repo so they can view the full git history. Understanding where
> >>> everything came from and verifying the provenance is important to
> >>> ensuring we have all the appropriate CLA's. So while the tarball is
> >>> maybe what is officially voted on, they will want access to the repo.
> >>>
> >>> That said, I don't think we are going to get CLA's for any Microsoft
> >>> contributed code. So either all Microsoft contributed code will need to
> >>> be kept MIT, or removed from the codebase. And it feels a bit odd to
> >>> grant something to ASF where the original codebase stays MIT and isn't
> >>> part of that grant.
> >>>
> >>> I think understanding how much code still exists that is Microsoft/MIT
> >>> is going to be important to getting this through the IP clearance
> process.
> >>>
> >>> So I'm curious how much of that original Microsoft code still exists? I
> >>> assume since it was just example code it has mostly been replaced? If
> >>> that's the case, we could potentially say Microsoft has no ownership of
> >>> this code, and so their CLA and MIT license aren't necessary?
> >>>
> >>> We should also have a good understanding of the dependencies. If any of
> >>> them are not compatible with ALv2, then going through this process
> isn't
> >>> even worth it until they are replaced. Do you have a list of the
> >>> dependencies?

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-13 Thread John Wass
> How hard is it to refactor these 6 files so that all new code is in
separate files from all preserved original code?

Will take a look at this today.

On Fri, Sep 10, 2021 at 1:47 PM Beckerle, Mike <
mbecke...@owlcyberdefense.com> wrote:

> How hard is it to refactor these 6 files so that all new code is in
> separate files from all preserved original code?
>
> Assume one-liner changes to original files (like calling MockDebugger
> changed to call DaffodilDebugger) are allowed.
>
> We either have to separate these 6 blended files, or convince legal and
> the incubator-pmc that blended files are ok because they originally had the
> MIT license.
>
> I definitely don't want to bother with that unless the refactoring
> exercise here is hard.
> 
> From: John Wass 
> Sent: Friday, September 10, 2021 1:02 PM
> To: dev@daffodil.apache.org 
> Subject: Re: daffodil-vscode - how to package and identify the
> contribution - some git questions
>
> Mike - Those were renames from the original versions that had "mock" in
> their names.
>
> commit 383fd4882a8fe51adf21b5ae31fe252056800447
>
> On Fri, Sep 10, 2021 at 12:54 PM Beckerle, Mike <
> mbecke...@owlcyberdefense.com> wrote:
>
> >
> > John Wass said:
> >
> > I had a few more (6) source files as modified..
> >
> > extension.ts
> > debugAdapter.ts
> > daffodilRuntime.ts
> > daffodilDebug.ts
> > adapter.test.ts
> > activateDaffodilDebug.ts
> >
> > The 3 files with daffodil or Daffodil in their names, aren't those new
> > files? Or were those based on provided files, but the file was renamed as
> > well as the content modified?
> >
> > ...mikeb
> >
> >
>


Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-10 Thread John Wass
Mike - Those were renames from the original versions that had "mock" in
their names.

commit 383fd4882a8fe51adf21b5ae31fe252056800447

On Fri, Sep 10, 2021 at 12:54 PM Beckerle, Mike <
mbecke...@owlcyberdefense.com> wrote:

>
> John Wass said:
>
> I had a few more (6) source files as modified..
>
> extension.ts
> debugAdapter.ts
> daffodilRuntime.ts
> daffodilDebug.ts
> adapter.test.ts
> activateDaffodilDebug.ts
>
> The 3 files with daffodil or Daffodil in their names, aren't those new
> files? Or were those based on provided files, but the file was renamed as
> well as the content modified?
>
> ...mikeb
>
>


Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread John Wass
I had a few more (6) source files as modified..

extension.ts
debugAdapter.ts
daffodilRuntime.ts
daffodilDebug.ts
adapter.test.ts
activateDaffodilDebug.ts

> It would seem an IDE (probably vscode!) decided to restyle/reindent this
code.

We added opinionated code formatting... apparently trying to make this
process as hard as possible :/

That reformat commit was done on 08/25/2021, title of PR was Prettier.
Looking prior to that commit might give a little better idea of what
changed.


> squash/rebase can be difficult through merges

Here is a quick pass on (1) squashing the MS source into a single commit, (2)
placing that commit on top of an init commit in a repo, and (3) then
rewriting our commits on top of all of that.

It preserves our authorship.  Can be cleaned up a little bit still but I am
not going to put time into it if we don't want this.  I just wanted to note
how it could look.

https://github.com/jw3/rewrite-daffodil-vscode-1

One issue I could see here is the linking of the example repo PR IDs in the
commit messages will conflict once we start adding PRs in the new repo.
Now would be the time to rewrite these commit messages and strip/modify
those #ID tags.

Thoughts on that rewrite repo?





On Thu, Sep 9, 2021 at 5:42 PM Beckerle, Mike 
wrote:

> So via some git trickery I was able to determine the "blended" files.
>
> I'm ignoring the various configuration files which are generally json
> files.
>
> Of the ".ts" files only 3 are blended:
>
> src/debugAdapter.ts - 72 lines - only maybe 6 lines are different
> src/extension.ts - 179 lines
> src/tests/adapter.test.ts - 137 lines (50 of which are commented-out code)
>
> The delta between these files and the original files of the same name are
> larger than expected due to changes in whitespace, and removal of ";" at
> end of line (which I guess are optional in many places in typescript).
>
> It would seem an IDE (probably vscode!) decided to restyle/reindent this
> code.
>
> So it's a bit hard to figure out what the "real" deltas are.
>
> src/debugAdapter.ts appears to be only trivially different. The name
> MockDebugSession was replaced by DaffodilDebugSession, and "./mockDebug"
> was changed to "./daffodilDebug".
>
> The other two files do appear to be where all the real blended code is.
>
>
>
> 
> From: Beckerle, Mike 
> Sent: Thursday, September 9, 2021 4:21 PM
> To: dev@daffodil.apache.org 
> Subject: Re: daffodil-vscode - how to package and identify the
> contribution - some git questions
>
> Whether it's a PR or series of PRs, or a software grant, that still
> doesn't resolve the issue of the blended files which are part MIT-licensed
> original code, and part new code deltas by the daffodil-vscode contributors.
>
> We need to understand whether those blended files can be teased apart
> somehow so that it is clear going forward what is an MIT-licensed library
> and what is Apache Licensed.
>
> I just did a grep -R -i microsoft  in a clone of the
> openwhisk-vscode-extension and got zero hits. So no files still carry
> microsoft copyright and in fact their NOTICES.txt file does not indicate
> any dependency on MIT-licensed code at all.  So I think
> openwhisk-vscode-extension is not going to help us figure out how to surf
> this issue.
>
>
> 
> From: Steve Lawrence 
> Sent: Thursday, September 9, 2021 3:54 PM
> To: dev@daffodil.apache.org 
> Subject: Re: daffodil-vscode - how to package and identify the
> contribution - some git questions
>
> The concern is that this code was developed outside of Apache and so
> didn't follow standard Apache process. From the IP clearance page:
>
> https://incubator.apache.org/ip-clearance/
>
> > Any code that was developed outside of the ASF SVN repository and
> > our public mailing lists must be processed like this, even if the
> > external developer is already an ASF committer.
>
> I suppose that submitting it as a PR does follow some of that process,
> but there is maybe less assurance of ownership. Because it was not
> developed in an ASF repository, that code is presumed to be owned by
> you, multiple developers, or a company, and so that ownership must be
> granted to ASF via the IP clearance process, with appropriate software
> grant, CLA's, etc. (At least, that's my admittedly limited understanding
> of the process).
>
> - Steve
>
>
> On 9/9/21 3:34 PM, John Wass wrote:
> > Couldn't we (the vscode contributors) submit a series of PRs against the
> > new repo to move the code, and just archive the example repo as-is?
> >
> > I noted some thoughts on that a while back
> > https://github.com/jw3/example-daffodil-vscode/issues/77

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread John Wass
Yeah I was thinking of the example repo as a prototype, just as if I was
working on a feature in my fork of Daffodil.  The main project doesn't own
the feature until it crosses the PR threshold, and once it does cross over
the state of my fork is of no concern to it.



On Thu, Sep 9, 2021 at 3:54 PM Steve Lawrence  wrote:

> The concern is that this code was developed outside of Apache and so
> didn't follow standard Apache process. From the IP clearance page:
>
> https://incubator.apache.org/ip-clearance/
>
> > Any code that was developed outside of the ASF SVN repository and
> > our public mailing lists must be processed like this, even if the
> > external developer is already an ASF committer.
>
> I suppose that submitting it as a PR does follow some of that process,
> but there is maybe less assurance of ownership. Because it was not
> developed in an ASF repository, that code is presumed to be owned by
> you, multiple developers, or a company, and so that ownership must be
> granted to ASF via the IP clearance process, with appropriate software
> grant, CLA's, etc. (At least, that's my admittedly limited understanding
> of the process).
>
> - Steve
>
>
> On 9/9/21 3:34 PM, John Wass wrote:
> > Couldn't we (the vscode contributors) submit a series of PRs against the
> > new repo to move the code, and just archive the example repo as-is?
> >
> > I noted some thoughts on that a while back
> > https://github.com/jw3/example-daffodil-vscode/issues/77
> >
> >
> >
> > On Thu, Sep 9, 2021 at 2:11 PM Beckerle, Mike <
> mbecke...@owlcyberdefense.com>
> > wrote:
> >
> >> I know of one file in the repo which will have to be removed which is
> the
> >> jpeg.dfdl.xsd file, which is there just as an example workspace.
> >>
> >> The copyright and provisions of that are not compatible with Apache
> >> licensing.
> >>
> >> We can find a DFDL schema that we created that has Apache license to use
> >> instead.
> >>
> >> For the other files under src, server, and build, can we generate a list
> >> of files identifying which are:
> >>
> >> (a) original MIT-licensed, unmodified
> >> (b) new - can be ASL
> >> (c) blended - started from MIT-licensed source, modified with
> >> daffodil-vscode-specific changes.
> >>
> >> It is these blended files that are the problematic ones.
> >>
> >>
> >>
> >> 
> >> From: Steve Lawrence 
> >> Sent: Thursday, September 9, 2021 1:38 PM
> >> To: dev@daffodil.apache.org 
> >> Subject: Re: daffodil-vscode - how to package and identify the
> >> contribution - some git questions
> >>
> >> Correct. For more information about Apache license compatibility:
> >>
> >>   https://www.apache.org/legal/resolved.html
> >>
> >> MIT is Category A and is fine. EPL is Category B and is also okay, but
> >> generally only in its binary form. So these top-level dependencies look
> >> okay, assuming their transitive dependencies are also okay.
> >>
> >> We'll also need to verify the licenses of all code in the repo.
> >> Hopefully little of that is original Microsoft MIT, and the rest can be
> >> granted to ASF and relicensed.
> >>
> >>
> >> On 9/9/21 1:30 PM, Beckerle, Mike wrote:
> >>> The requirement, is that the entire dependency tree (transitively)
> >> cannot depend on any software that has an Apache-incompatible (aka
> >> restrictive) license.
> >>>
> >>> So we need the transitive closure of all dependencies.
> >>>
> >>>
> >>> 
> >>> From: Adam Rosien 
> >>> Sent: Thursday, September 9, 2021 12:44 PM
> >>> To: dev@daffodil.apache.org 
> >>> Subject: Re: daffodil-vscode - how to package and identify the
> >> contribution - some git questions
> >>>
> >>> (I don't understand the requirements of licensing + transitive
> >>> dependencies, so I'm giving some surface level license info)
> >>>
> >>> "ch.qos.logback" % "logback-classic" % "1.2.3" - EPL
> >>> http://logback.qos.ch/license.html
> >>> "com.microsoft.java" % "com.microsoft.java.debug.core" % "0.31.1" - EPL
> >> 1.0
> >>> "co.fs2" %% "fs2-io" % "3.0.4" - MIT
> >>> "com.monovore" %% "decline-effect" % "2.1.0" - APL 2.0
> >>> "org.typelevel" %% "log4cats-slf4j" % "2.1.0" - APL 2.0

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread John Wass
Couldn't we (the vscode contributors) submit a series of PRs against the
new repo to move the code, and just archive the example repo as-is?

I noted some thoughts on that a while back
https://github.com/jw3/example-daffodil-vscode/issues/77



On Thu, Sep 9, 2021 at 2:11 PM Beckerle, Mike 
wrote:

> I know of one file in the repo which will have to be removed which is the
> jpeg.dfdl.xsd file, which is there just as an example workspace.
>
> The copyright and provisions of that are not compatible with Apache
> licensing.
>
> We can find a DFDL schema that we created that has Apache license to use
> instead.
>
> For the other files under src, server, and build, can we generate a list
> of files identifying which are:
>
> (a) original MIT-licensed, unmodified
> (b) new - can be ASL
> (c) blended - started from MIT-licensed source, modified with
> daffodil-vscode-specific changes.
>
> It is these blended files that are the problematic ones.
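
One mechanical way to build that (a)/(b)/(c) list is to hash both trees and compare. A sketch, assuming local checkouts of both the original MIT-licensed template and the current tree (the paths and hashes in the usage note are hypothetical):

```python
import hashlib
from pathlib import Path

def hash_tree(root):
    """Map each file's path (relative to root) to a SHA-256 of its contents."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in Path(root).rglob("*")
        if p.is_file()
    }

def categorize(original, current):
    """Split current files into (a) unmodified, (b) new, (c) blended,
    given {path: hash} maps for the original and current trees."""
    unmodified = {f for f in current if original.get(f) == current[f]}
    new = {f for f in current if f not in original}
    blended = {f for f in current if f in original and original[f] != current[f]}
    return unmodified, new, blended
```

For example, with `original = {"src/extension.ts": "aaa"}` and `current = {"src/extension.ts": "bbb"}`, `extension.ts` lands in the blended set. Note the comparison is by path, so renamed files (like the mock-to-daffodil renames mentioned elsewhere in the thread) would show up as new rather than blended; `git log --follow` can catch those.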
>
>
>
> 
> From: Steve Lawrence 
> Sent: Thursday, September 9, 2021 1:38 PM
> To: dev@daffodil.apache.org 
> Subject: Re: daffodil-vscode - how to package and identify the
> contribution - some git questions
>
> Correct. For more information about Apache license compatibility:
>
>   https://www.apache.org/legal/resolved.html
>
> MIT is Category A and is fine. EPL is Category B and is also okay, but
> generally only in its binary form. So these top-level dependencies look
> okay, assuming their transitive dependencies are also okay.
>
> We'll also need to verify the licenses of all code in the repo.
> Hopefully little of that is original Microsoft MIT, and the rest can be
> granted to ASF and relicensed.
>
>
> On 9/9/21 1:30 PM, Beckerle, Mike wrote:
> > The requirement, is that the entire dependency tree (transitively)
> cannot depend on any software that has an Apache-incompatible (aka
> restrictive) license.
> >
> > So we need the transitive closure of all dependencies.
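
As an illustration of what "transitive closure of all dependencies" means operationally, here is a small sketch over a toy dependency graph. The graph and license table below are made up for illustration, not a real resolution of these artifacts:

```python
from collections import deque

def transitive_deps(roots, graph):
    """BFS over a {dep: [deps]} graph, returning every dependency reachable
    from the roots, i.e. the transitive closure."""
    seen, queue = set(), deque(roots)
    while queue:
        dep = queue.popleft()
        if dep in seen:
            continue
        seen.add(dep)
        queue.extend(graph.get(dep, []))
    return seen

# Illustrative graph and license table only.
graph = {
    "decline-effect": ["decline", "cats-effect"],
    "decline": ["cats"],
    "fs2-io": ["fs2-core", "cats-effect"],
}
licenses = {
    "decline-effect": "APL 2.0", "decline": "APL 2.0", "cats": "MIT",
    "cats-effect": "APL 2.0", "fs2-io": "MIT", "fs2-core": "MIT",
}

closure = transitive_deps(["decline-effect", "fs2-io"], graph)
# Flag anything whose license is not in the Category A allow-list.
flagged = sorted(d for d in closure
                 if licenses.get(d, "unknown") not in ("APL 2.0", "MIT"))
```

In practice the flat graph would come from the build tool (e.g. sbt's `dependencyTree` task) rather than being written by hand.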
> >
> >
> > 
> > From: Adam Rosien 
> > Sent: Thursday, September 9, 2021 12:44 PM
> > To: dev@daffodil.apache.org 
> > Subject: Re: daffodil-vscode - how to package and identify the
> contribution - some git questions
> >
> > (I don't understand the requirements of licensing + transitive
> > dependencies, so I'm giving some surface level license info)
> >
> > "ch.qos.logback" % "logback-classic" % "1.2.3" - EPL
> > http://logback.qos.ch/license.html
> > "com.microsoft.java" % "com.microsoft.java.debug.core" % "0.31.1" - EPL
> 1.0
> > "co.fs2" %% "fs2-io" % "3.0.4" - MIT
> > "com.monovore" %% "decline-effect" % "2.1.0" - APL 2.0
> > "org.typelevel" %% "log4cats-slf4j" % "2.1.0" - APL 2.0
> >
> > On Thu, Sep 9, 2021 at 9:35 AM Adam Rosien  wrote:
> >
> >> I can relay the list of dependencies and their licenses.
> >>
> >> On Thu, Sep 9, 2021 at 9:20 AM Steve Lawrence 
> >> wrote:
> >>
> >>> I personally don't care too much about having the existing git history
> >>> once it's part of ASF, especially if it makes things any easier (as you
> >>> mention, squash/rebase can be difficult through merges). So I'd say we
> >>> just do plan B--create a tarball of the current state (without the git
> >>> history), and the content of that tarball is what goes through the IP
> >>> clearance process, and is the content of the initial commit when adding
> >>> to the apache/daffodil-vscode repo.
> >>>
> >>> Note that I think the incubator will still want access to the existing
> >>> repo so they can view the full git history. Understanding where
> >>> everything came from and verifying the provenance is important to
> >>> ensuring we have all the appropriate CLA's. So while the tarball is
> >>> maybe what is officially voted on, they will want access to the repo.
> >>>
> >>> That said, I don't think we are going to get CLA's for any Microsoft
> >>> contributed code. So either all Microsoft contributed code will need to
> >>> be kept MIT, or removed from the codebase. And it feels a bit odd to
> >>> grant something to ASF where the original codebase stays MIT and isn't
> >>> part of that grant.
> >>>
> >>> I think understanding how much code still exists that is Microsoft/MIT
> >>> is going to be important to getting this through the IP clearance
> process.
> >>>
> >>> So I'm curious how much of that original Microsoft code still exists? I
> >>> assume since it was just example code it has mostly been replaced? If
> >>> that's the case, we could potentially say Microsoft has no ownership of
> >>> this code, and so their CLA and MIT license aren't necessary?
> >>>
> >>> We should also have a good understanding of the dependencies. If any of
> >>> them are not compatible with ALv2, then going through this process
> isn't
> >>> even worth it until they are replaced. Do you have a list of the
> >>> dependencies?
> >>>
> >>>
> >>> On 9/9/21 11:16 AM, Beckerle, Mike wrote:
>  So the daffodil-vscode code-base wants to be granted to become part of
> >>> the
>  Daffodil proj

Re: Use GitHub Releases

2021-06-10 Thread John Wass
This sounds good.  Knowing it is possible allows us to continue on in the
same direction.

We can discuss and document the details of the approach in this thread.

On Wed, Jun 9, 2021 at 6:52 PM Beckerle, Mike 
wrote:

> I think it is fine to have github releases and convenience binaries served
> from there, with a couple constraints based on not undermining the
> important ASF policies that provide for verifiable software supply chain.
>
> If the github releases and artifacts correspond to official Apache
> releases, then:
>
> 1) they have to be identical bit-for-bit to those provided from ASF and
> maven central.
>
> 2) both we and our users have to be able to readily verify that this is
> the case (same file names, same hashes, easy to find links to the official
> ASF locations that store the hashes, have the signer keys to verify
> against, etc.)
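
Checking constraint (1) reduces to comparing digests of the GitHub-served file against the hash published on the ASF side. A minimal sketch (the file path and expected-digest argument are placeholders):

```python
import hashlib

def sha512_of(path):
    """Compute the SHA-512 hex digest of a file, streaming in 1 MiB chunks
    so large artifacts don't have to fit in memory."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(github_copy, asf_sha512_hex):
    """True iff the GitHub-served artifact matches the digest published in
    the official ASF .sha512 sidecar file, i.e. is bit-for-bit identical."""
    return sha512_of(github_copy) == asf_sha512_hex.strip().lower()
```

PGP signature verification against the KEYS file would be a separate step on top of this.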
>
> If these github-based "releases" are intermediate/snapshot kinds of
> things, then I think the only requirement is that it's clear that's what
> they are (distinct file names, etc.) so they're not able to be confused
> with any official release.
>
> I think experimentation to see what works well for the debugger/IDE is
> very sensible.
>
> 
> From: John Wass 
> Sent: Wednesday, June 9, 2021 2:35 PM
> To: dev@daffodil.apache.org 
> Subject: Re: Use GitHub Releases
>
> > GitHub does automatically create "Releases" when we create a new tag.
>
> The UI rolls them together, but they are two separate things in the API.
> Daffodil has no releases according to the API.
>
> https://api.github.com/repos/apache/daffodil/tags
> https://api.github.com/repos/apache/daffodil/releases
> https://docs.github.com/en/rest/reference/repos#list-releases
>
>
> > Is there some API that's not available unless we manually create
> releases?
>
> We can't attach assets to a tag, only a release.
>
>
> > Are you looking to have convenience binaries also published to these
> release?
>
> Yes, asset fetching along with version lookup was the point of the post, I
> should have mentioned that ;)
>
> Do all Daffodil artifacts need to be published? No; there is Maven Central
> for the jars. But what about publishing the applications as assets? That
> would be the CLI and, in the future, a debugger backend.
>
>
> > What kinds of information are you looking to query from the releases?
>
> At first the available releases and their assets, but there is additional
> metadata in a release object that might be interesting at some point.
>
>
> > That has some basic version and release date information. And as I
> mentioned before, it requires that projects keep it up to date.
>
> The GitHub Release API does provide a nice single entrypoint for query and
> fetch of assets (and metadata for future use).  Looking at these Apache
> references, it doesn't appear to be as robust.
>
>
>
>
> On Wed, Jun 9, 2021 at 12:54 PM Steve Lawrence 
> wrote:
>
> > GitHub does automatically create "Releases" when we create a new tag.
> >
> >   https://github.com/apache/daffodil/releases
> >
> > Is there some API that's not available unless we manually create
> > releases? Are you looking to have convenience binaries also published to
> > these release?
> >
> > What kinds of information are you looking to query from the releases?
> >
> > I know some projects (including Daffodil) keep an updated "Description
> > Of A Project" (doap) file, which is parsed by Apache to fill out project
> > information that can be queried here:
> >
> >   https://projects.apache.org/project.html
> >
> > This is our doap file:
> >
> >   https://daffodil.apache.org/doap.rdf
> >
> > And this is the project page that is generated from that file:
> >
> >   https://projects.apache.org/project.html?daffodil
> >
> > That has some basic version and release date information. And as I
> > mentioned before, it requires that projects keep it up to date. I'm not
> > sure how many do if you're interested about other projects.
> >
> >
> > On 6/9/21 12:36 PM, John Wass wrote:
> > >> the simplest is to ask
> > >
> > > Well the simplest for __me__ is to ask, this will add some overhead to
> > the
> > > release process for someone.  It looks like some Apache projects do
> > GitHub
> > > releases, most don't.
> > >
> > > Also looking for an Apache API to query releases and their artifacts.
> > >
> > >
> > > On Wed, Jun 9, 2021 at 12:13 PM John Wass  wrote:
> > >
> > >> We have been using the GitHub API to collect (representative) releases
> > of
> > >> Daffodil during some prototype work.  However when looking at the main
> > >> Daffodil repo I see there are no releases published there.
> > >>
> > >> There are probably some other ways to work around this, but the
> simplest
> > >> is to ask if publishing releases to GitHub is something that can be
> done
> > >> going forward?
> > >>
> > >>
> > >
> >
> >
>


Re: Use GitHub Releases

2021-06-09 Thread John Wass
> GitHub does automatically create "Releases" when we create a new tag.

The UI rolls them together, but they are two separate things in the API.
Daffodil has no releases according to the API.

https://api.github.com/repos/apache/daffodil/tags
https://api.github.com/repos/apache/daffodil/releases
https://docs.github.com/en/rest/reference/repos#list-releases


> Is there some API that's not available unless we manually create releases?

We can't attach assets to a tag, only a release.
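
The difference shows up directly in the payloads of the two endpoints: a tag object has no assets, while a release object does. A sketch of pulling asset download URLs out of a release, using an abbreviated sample object rather than a live API call (the download URL is a placeholder):

```python
import json

# The two endpoints being compared; tags carry no assets, releases do.
TAGS_URL = "https://api.github.com/repos/apache/daffodil/tags"
RELEASES_URL = "https://api.github.com/repos/apache/daffodil/releases"

# Abbreviated sample of one element of the releases-endpoint response.
sample_release = json.loads("""
{
  "tag_name": "v3.1.0",
  "name": "3.1.0",
  "assets": [
    {"name": "daffodil-3.1.0.tgz",
     "browser_download_url": "https://example.invalid/daffodil-3.1.0.tgz"}
  ]
}
""")

def asset_urls(release):
    """Map asset file names to their download URLs for one release object."""
    return {a["name"]: a["browser_download_url"] for a in release["assets"]}
```

A client doing version lookup plus asset fetch only needs `tag_name` and this `assets` array, which is exactly what the tags endpoint can't provide.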


> Are you looking to have convenience binaries also published to these
release?

Yes, asset fetching along with version lookup was the point of the post, I
should have mentioned that ;)

Do all Daffodil artifacts need to be published? No; there is Maven Central
for the jars. But what about publishing the applications as assets? That
would be the CLI and, in the future, a debugger backend.


> What kinds of information are you looking to query from the releases?

At first the available releases and their assets, but there is additional
metadata in a release object that might be interesting at some point.


> That has some basic version and release date information. And as I
mentioned before, it requires that projects keep it up to date.

The GitHub Release API does provide a nice single entrypoint for query and
fetch of assets (and metadata for future use).  Looking at these Apache
references, it doesn't appear to be as robust.




On Wed, Jun 9, 2021 at 12:54 PM Steve Lawrence  wrote:

> GitHub does automatically create "Releases" when we create a new tag.
>
>   https://github.com/apache/daffodil/releases
>
> Is there some API that's not available unless we manually create
> releases? Are you looking to have convenience binaries also published to
> these release?
>
> What kinds of information are you looking to query from the releases?
>
> I know some projects (including Daffodil) keep an updated "Description
> Of A Project" (doap) file, which is parsed by Apache to fill out project
> information that can be queried here:
>
>   https://projects.apache.org/project.html
>
> This is our doap file:
>
>   https://daffodil.apache.org/doap.rdf
>
> And this is the project page that is generated from that file:
>
>   https://projects.apache.org/project.html?daffodil
>
> That has some basic version and release date information. And as I
> mentioned before, it requires that projects keep it up to date. I'm not
> sure how many do if you're interested about other projects.
>
>
> On 6/9/21 12:36 PM, John Wass wrote:
> >> the simplest is to ask
> >
> > Well the simplest for __me__ is to ask, this will add some overhead to
> the
> > release process for someone.  It looks like some Apache projects do
> GitHub
> > releases, most don't.
> >
> > Also looking for an Apache API to query releases and their artifacts.
> >
> >
> > On Wed, Jun 9, 2021 at 12:13 PM John Wass  wrote:
> >
> >> We have been using the GitHub API to collect (representative) releases
> of
> >> Daffodil during some prototype work.  However when looking at the main
> >> Daffodil repo I see there are no releases published there.
> >>
> >> There are probably some other ways to work around this, but the simplest
> >> is to ask if publishing releases to GitHub is something that can be done
> >> going forward?
> >>
> >>
> >
>
>


Re: Use GitHub Releases

2021-06-09 Thread John Wass
> the simplest is to ask

Well the simplest for __me__ is to ask, this will add some overhead to the
release process for someone.  It looks like some Apache projects do GitHub
releases, most don't.

Also looking for an Apache API to query releases and their artifacts.


On Wed, Jun 9, 2021 at 12:13 PM John Wass  wrote:

> We have been using the GitHub API to collect (representative) releases of
> Daffodil during some prototype work.  However when looking at the main
> Daffodil repo I see there are no releases published there.
>
> There are probably some other ways to work around this, but the simplest
> is to ask if publishing releases to GitHub is something that can be done
> going forward?
>
>


Use GitHub Releases

2021-06-09 Thread John Wass
We have been using the GitHub API to collect (representative) releases of
Daffodil during some prototype work.  However when looking at the main
Daffodil repo I see there are no releases published there.

There are probably some other ways to work around this, but the simplest is
to ask if publishing releases to GitHub is something that can be done going
forward?


Re: The future of the daffodil DFDL schema debugger?

2021-05-26 Thread John Wass
e a user would want to select an infoset
> item and jump to the associated schema element, or query information
> about that infoset item (e.g.. what bit position did it start at, what
> was the length). We don't have this right now, but would be really nice
> to have. This suggests that we need metadata associated with each of the
> variables. Does DAP have a concept of that and do IDE's have a way to
> show it?
>
> On 4/21/21 7:52 PM, Adam Rosien wrote:
> > I've been reading up on DAP and wanted to share...
> >
> >> There are many areas though that are unique to Daffodil that have no
> > representation in the spec.  These things (like InputStream, Infoset,
> PoU,
> > different variable types, backtracking, etc) will need an extension to
> > DAP.  This really boils down to defining these things to fit under the
> DAP
> > BaseProtocol and enabling handling of those objects on both the front and
> > back ends.
> >
> > To me, much of the current state exposed by the (Daffodil) Debugger
> > translates directly to a DAP Variable[1]. DAP Variables can be
> > nested/hierarchical, so they could (potentially) model larger data like
> the
> > infoset. I can imagine shoving all the current state into Variables as a
> > proof-of-concept.
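
For reference, the DAP Variable shape being proposed here is small. A sketch that flattens a nested parse-state snapshot into DAP-style Variable records (the state values are made up; the field names follow the DAP spec's Variable type):

```python
from itertools import count

def build_variables(obj, registry, refs):
    """Turn a nested dict into a list of DAP-style Variable records.
    Containers get a nonzero variablesReference, registered so a later
    "variables" request can expand them; leaves use reference 0."""
    out = []
    for name, value in obj.items():
        if isinstance(value, dict):
            ref = next(refs)
            registry[ref] = build_variables(value, registry, refs)
            out.append({"name": name, "value": "", "variablesReference": ref})
        else:
            out.append({"name": name, "value": str(value),
                        "variablesReference": 0})
    return out

# Made-up snapshot of the kind of state the Daffodil debugger prints today.
state = {"bitPosition": 128,
         "infoset": {"record": {"length": 4, "type": "int"}}}

registry = {}
top = build_variables(state, registry, count(1))
```

Here `top` is what a DAP `variables` response would carry, and `registry` is the server-side table the adapter consults when the client asks to expand a nonzero reference.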
> >
> > It also seems like the processing stack maintained by the Daffodil
> PState,
> > where each item references the relevant schema element, could translate
> to
> > the DAP StackFrame type [2]. That is, the path from the schema root to
> the
> > currently processing schema element becomes the "call stack". (Apologies
> if
> > I don't have all the Daffodil terms lined up correctly.)
> >
> > For displaying the input data and processing progress, I looked at a few
> > existing VS Code extensions that provided non-builtin views, some of
> which
> > interact with their DAP debugger code [3] [4] [5] [6].
> >
> > Finally, I took a cursory look at scala-debug-adapter [7], which, for
> > reference, wraps Microsoft's java-debug implementation of DAP. I was
> > curious about the set of request/response and event types. Additionally,
> > the Typescript API to VS Code offers custom DAP requests and responses,
> but
> > I couldn't find the equivalent notion in the java-debug project.
> >
> > .. Adam
> >
> > [1]
> >
> https://microsoft.github.io/debug-adapter-protocol/specification#Types_Variable
> > [2]
> >
> https://microsoft.github.io/debug-adapter-protocol/specification#Types_StackFrame
> > [3] https://github.com/scalameta/metals-vscode (provides a debugger and
> > non-debugger custom UI)
> > [4] https://github.com/microsoft/vscode-cpptools (debugger + memory
> view)
> > [5]
> https://marketplace.visualstudio.com/items?itemName=marus25.cortex-debug
> > (debugger + memory view,
> >
> https://github.com/Marus/cortex-debug/blob/master/src/frontend/memory_content_provider.ts
> > )
> > [6]
> >
> https://marketplace.visualstudio.com/items?itemName=slevesque.vscode-hexdump
> > (extension for hexdumps that could be controlled by other extensions)
> > [7] https://github.com/scalacenter/scala-debug-adapter
> > [8] https://github.com/microsoft/java-debug
> >
> > On Tue, Apr 20, 2021 at 7:08 AM John Wass  wrote:
> >
> >>> Going to look deeper into how DAP might fit with Daffodil
> >>
> >> Have been looking over DAP and getting a good feeling about it. The
> >> specification [1] seems general enough that it could be applied to
> Daffodil
> >> and cover a swath of common operations (like start, stop, break,
> continue,
> >> code locations, variables, etc).
> >>
> >> There are many areas though that are unique to Daffodil that have no
> >> representation in the spec.  These things (like InputStream, Infoset,
> PoU,
> >> different variable types, backtracking, etc) will need an extension to
> >> DAP.  This really boils down to defining these things to fit under the
> DAP
> >> BaseProtocol and enabling handling of those objects on both the front
> and
> >> back ends.
> >>
> >> On the backend we need a Daffodil DAP protocol server.  Existing JVM
> >> implementations (like Java [2], Scala [3]) are tied closely to JDI and
> >> would bring a lot of extra baggage to work around that.  Developing a
> >> Daffodil specific implementation is no small task, but feasible.  There
> are
> >> several existing implementations on the JVM that are close and can be
> >> looked at for reference.
> >>

Re: [VOTE] Release Apache Daffodil 3.1.0-rc2

2021-05-18 Thread John Wass
+1

I checked

[OK] RPM install in centos 8 and fedora 32
[OK] spot check schematron validation and svrl output
[OK] misc CLI operations
[OK] hash of each download matches
[OK] rat check passes
[OK] use of staged artifacts in existing applications
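(For anyone repeating the digest check above, here is a minimal sketch. The
file name is illustrative only; the real artifacts and their .sha512 files
come from the dist.apache.org URL in Steve's mail below.)

```shell
# Hypothetical artifact name; substitute the real downloads from the
# dist.apache.org staging area.
printf 'release bytes' > apache-daffodil-3.1.0-src.tgz

# Apache releases ship a .sha512 next to each artifact; here we generate
# one locally just so the check below has something to verify.
sha512sum apache-daffodil-3.1.0-src.tgz > apache-daffodil-3.1.0-src.tgz.sha512

# The actual verification step: prints "<file>: OK" when the digest matches.
sha512sum -c apache-daffodil-3.1.0-src.tgz.sha512

# Signature check would additionally use the .asc file and the KEYS file:
# gpg --verify apache-daffodil-3.1.0-src.tgz.asc apache-daffodil-3.1.0-src.tgz
```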



On Fri, May 14, 2021 at 3:26 PM Steve Lawrence  wrote:

> Hi all,
>
> I'd like to call a vote to release Apache Daffodil 3.1.0-rc2.
>
> All distribution packages, including signatures, digests, etc. can be
> found at:
>
> https://dist.apache.org/repos/dist/dev/daffodil/3.1.0-rc2/
>
> Staging artifacts can be found at:
>
> https://repository.apache.org/content/repositories/orgapachedaffodil-1023/
>
> This release has been signed with PGP key 36F3494B033AE661,
> corresponding to slawre...@apache.org, which is included in the KEYS
> file here:
>
> https://downloads.apache.org/daffodil/KEYS
>
> The release candidate has been tagged in git with v3.1.0-rc2.
>
> For reference, here is a list of all closed JIRAs tagged with 3.1.0:
>
> https://s.apache.org/daffodil-issues-3.1.0
>
> For a summary of the changes in this release, see:
>
> https://daffodil.apache.org/releases/3.1.0/
>
> Please review and vote. The vote will be open for at least 72 hours
> (Monday, 17 May 2021, 4pm EST).
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
>


Re: The future of the daffodil DFDL schema debugger?

2021-04-22 Thread John Wass
> dig a bit to see if the DAP-only hooks can be reused without JDI coming
along for the ride

Cool, that would be good to dig at.  Big win if we can reuse it.


Re: The future of the daffodil DFDL schema debugger?

2021-04-21 Thread John Wass
Thanks Adam, the DAP variable angle is interesting.  So are you thinking
all aspects are covered without defining any new DAP interfaces?

What about the backend, do you think a Daffodil debug server implementation
is needed?

When looking at the Java Debug server, for both Scala and Java, it looked
very much tied to JDI and debugging a virtual machine.  Did you see
anything at all that could be reused there?

It seemed to me that, whether we extend DAP or not, custom backend server
components need to be implemented to provide Daffodil debug sessions rather
than the JDI JVM sessions.




On Wed, Apr 21, 2021 at 7:52 PM Adam Rosien  wrote:

> I've been reading up on DAP and wanted to share...
>
> > There are many areas though that are unique to Daffodil that have no
> representation in the spec.  These things (like InputStream, Infoset, PoU,
> different variable types, backtracking, etc) will need an extension to
> DAP.  This really boils down to defining these things to fit under the DAP
> BaseProtocol and enabling handling of those objects on both the front and
> back ends.
>
> To me, much of the current state exposed by the (Daffodil) Debugger
> translates directly to a DAP Variable[1]. DAP Variables can be
> nested/hierarchical, so they could (potentially) model larger data like the
> infoset. I can imagine shoving all the current state into Variables as a
> proof-of-concept.
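For concreteness, a DAP Variable is a small JSON object, and a non-zero
`variablesReference` is how the spec expresses nesting. The envelope fields
below follow the DAP Variable type; the Daffodil state names and values are
invented for illustration:

```json
{
  "variables": [
    { "name": "bitPosition", "value": "128", "type": "daffodil.bitPos",
      "variablesReference": 0 },
    { "name": "infoset", "value": "<record>...</record>",
      "variablesReference": 42 }
  ]
}
```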
>
> It also seems like the processing stack maintained by the Daffodil PState,
> where each item references the relevant schema element, could translate to
> the DAP StackFrame type [2]. That is, the path from the schema root to the
> currently processing schema element becomes the "call stack". (Apologies if
> I don't have all the Daffodil terms lined up correctly.)
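A hedged sketch of how one PState step might surface as a DAP StackFrame —
`id`, `name`, `source`, `line`, and `column` are the spec's fields, while the
schema path and locations are invented:

```json
{
  "id": 3,
  "name": "element: header",
  "source": { "path": "com/example/format.dfdl.xsd" },
  "line": 42,
  "column": 7
}
```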
>
> For displaying the input data and processing progress, I looked at a few
> existing VS Code extensions that provided non-builtin views, some of which
> interact with their DAP debugger code [3] [4] [5] [6].
>
> Finally, I took a cursory look at scala-debug-adapter [7], which, for
> reference, wraps Microsoft's java-debug implementation of DAP. I was
> curious about the set of request/response and event types. Additionally,
> the Typescript API to VS Code offers custom DAP requests and responses, but
> I couldn't find the equivalent notion in the java-debug project.
>
> .. Adam
>
> [1]
>
> https://microsoft.github.io/debug-adapter-protocol/specification#Types_Variable
> [2]
>
> https://microsoft.github.io/debug-adapter-protocol/specification#Types_StackFrame
> [3] https://github.com/scalameta/metals-vscode (provides a debugger and
> non-debugger custom UI)
> [4] https://github.com/microsoft/vscode-cpptools (debugger + memory view)
> [5]
> https://marketplace.visualstudio.com/items?itemName=marus25.cortex-debug
> (debugger + memory view,
>
> https://github.com/Marus/cortex-debug/blob/master/src/frontend/memory_content_provider.ts
> )
> [6]
>
> https://marketplace.visualstudio.com/items?itemName=slevesque.vscode-hexdump
> (extension for hexdumps that could be controlled by other extensions)
> [7] https://github.com/scalacenter/scala-debug-adapter
> [8] https://github.com/microsoft/java-debug
>
> On Tue, Apr 20, 2021 at 7:08 AM John Wass  wrote:
>
> > > Going to look deeper into how DAP might fit with Daffodil
> >
> > Have been looking over DAP and getting a good feeling about it. The
> > specification [1] seems general enough that it could be applied to
> Daffodil
> > and cover a swath of common operations (like start, stop, break,
> continue,
> > code locations, variables, etc).
> >
> > There are many areas though that are unique to Daffodil that have no
> > representation in the spec.  These things (like InputStream, Infoset,
> PoU,
> > different variable types, backtracking, etc) will need an extension to
> > DAP.  This really boils down to defining these things to fit under the
> DAP
> > BaseProtocol and enabling handling of those objects on both the front and
> > back ends.
> >
> > On the backend we need a Daffodil DAP protocol server.  Existing JVM
> > implementations (like Java [2], Scala [3]) are tied closely to JDI and
> > would bring a lot of extra baggage to work around that.  Developing a
> > Daffodil specific implementation is no small task, but feasible.  There
> are
> > several existing implementations on the JVM that are close and can be
> > looked at for reference.
> >
> > The backend implementation would look similar to what was described in an
> > earlier post.  We could use ZIO/Akka/etc to implement the backend
> Protocol
> > Server to enable the IO between the Daffodil process and t

Re: Forgot to squash commits

2021-04-21 Thread John Wass
Might want to note that new commit in the PR.

On Wed, Apr 21, 2021 at 4:53 PM Interrante, John A (GE Research, US) <
john.interra...@ge.com> wrote:

> Yep.  My pull request had no conflicts and I just merged it (after all the
> checks passed) without any problem.
>
> -Original Message-
> From: Beckerle, Mike 
> Sent: Wednesday, April 21, 2021 4:34 PM
> To: dev@daffodil.apache.org
> Subject: EXT: Re: Forgot to squash commits
>
> I decided to force-push them, but just in case I do have the branch with
> the other 3 commits saved and we could recreate the other 3-commit scenario
> if necessary.
>
> So master is now what it is supposed to be. The bug fix (which was
> just adding test cases) has been squashed from 3 commits into 1 (our
> usual workflow practice).
>
> Outstanding pull requests still have to rebase on top, and conflict
> detection should still do the right thing. I checked a couple PRs and they
> still show no-conflicts with the base.
>
> 
> From: John Wass 
> Sent: Wednesday, April 21, 2021 4:26 PM
> To: dev@daffodil.apache.org 
> Subject: Re: Forgot to squash commits
>
> I'd let them be.
>
> On Wed, Apr 21, 2021 at 4:13 PM Beckerle, Mike <
> mbecke...@owlcyberdefense.com> wrote:
>
> > I ended up committing 3 tiny commits to master, forgot to squash them.
> >
> > Should I fix this by force push?
> >
> > Mike Beckerle | Principal Engineer
> >
> > mbecke...@owlcyberdefense.com  P
> > +1-781-330-0412
> >
> > Connect with us!
> >
> > <https://www.linkedin.com/company/owlcyberdefense/>
> > <https://twitter.com/owlcyberdefense>
> >
> > <https://owlcyberdefense.com/resources/events/>
> >
> >
> >
> > The information contained in this transmission is for the personal and
> > confidential use of the individual or entity to which it is addressed.
> > If the reader is not the intended recipient, you are hereby notified
> > that any review, dissemination, or copying of this communication is
> > strictly prohibited. If you have received this transmission in error,
> > please notify the sender immediately
> >
>


Re: Forgot to squash commits

2021-04-21 Thread John Wass
I'd let them be.

On Wed, Apr 21, 2021 at 4:13 PM Beckerle, Mike <
mbecke...@owlcyberdefense.com> wrote:

> I ended up committing 3 tiny commits to master, forgot to squash them.
>
> Should I fix this by force push?
>
> Mike Beckerle | Principal Engineer
>
> mbecke...@owlcyberdefense.com 
> P +1-781-330-0412
>
> Connect with us!
>
> 
> 
>
> 
>
>
>
> The information contained in this transmission is for the personal and
> confidential use of the individual or entity to which it is addressed. If
> the reader is not the intended recipient, you are hereby notified that any
> review, dissemination, or copying of this communication is strictly
> prohibited. If you have received this transmission in error, please notify
> the sender immediately
>


Re: editconfig

2021-04-21 Thread John Wass
> As a Scala project, however, how about using Scalafmt?

I'm in favor of scalafmt also.

> But I assume scalafmt won't cover other files like XML/schema/tdml/text
files.

Take a look at https://github.com/diffplug/spotless

Spotless says it could support all of those, and a quick search says the
SBT plugin is backed by scalafmt.

(I haven't used Spotless, just saw it today and thought of this thread)
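For reference, scalafmt reads a `.scalafmt.conf` at the repo root. A minimal
sketch — the values are illustrative starting points, not a proposed Daffodil
style:

```hocon
version = 2.7.5
maxColumn = 100
align.preset = none
```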



On Mon, Apr 19, 2021 at 3:17 PM Interrante, John A (GE Research, US) <
john.interra...@ge.com> wrote:

> I concur with Steve; we're going to need both a scalafmt configuration
> file and an .editorconfig file to cover all source code files unless the
> day comes when scalafmt understands .editorconfig and we're happy with
> scalafmt's default formatting options.
>
> Daffodil's existing code style is supposed to be very close to
> scalariform's default formatting options.  Does anyone know how different
> scalafmt's default formatting options are from scalariform's?  If they're
> not that different, eventually we might end up with just .editorconfig.
>
> -Original Message-
> From: Adam Rosien 
> Sent: Monday, April 19, 2021 12:16 PM
> To: dev@daffodil.apache.org
> Subject: EXT: Re: editconfig
>
> Ah, thanks for the extra context. I'll check out the JIRA issue.
>
> FYI there's an editorconfig integration issue open for scalafmt:
> https://github.com/scalameta/scalafmt/issues/1458.
>
> .. Adam
>
> On Mon, Apr 19, 2021 at 8:51 AM Steve Lawrence 
> wrote:
>
> > As long as scalafmt covers everything editconfig supports and the
> > popular IDE's support it, we'd probably get better results for our
> > scala files. But I assume scalafmt won't cover other files like
> > XML/schema/tdml/text files. We might need a combination of the two to
> > cover all files?
> >
> > See https://issues.apache.org/jira/browse/DAFFODIL-2133 for related
> issue.
> >
> > - Steve
> >
> > On 4/19/21 11:37 AM, Adam Rosien wrote:
> > > As a Scala project, however, how about using Scalafmt? [1] It's
> > > become standard in all the projects I've been involved with; it's
> > > supported by
> > the
> > > language creators and matches the previously mentioned features.
> > >
> > > .. Adam
> > >
> > > [1] https://scalameta.org/scalafmt/
> > >
> > > On Mon, Apr 19, 2021 at 8:20 AM Interrante, John A (GE Research, US)
> > > < john.interra...@ge.com> wrote:
> > >
> > >> I agree, an .editorconfig file at the root of daffodil coupled with
> > >> IDE plugins (some IDEs such as IDEA already support .editorconfig
> > >> without
> > any
> > >> plugin needed) could autoconfigure the following IDE settings
> > automatically
> > >> (if we felt we needed to specify all of these settings):
> > >>
> > >> root = true
> > >> # All files (risky - could change bin/dat files inadvertently)
> > >> [*]
> > >> end_of_line = lf
> > >> charset = utf-8
> > >> trim_trailing_whitespace = true
> > >> insert_final_newline = true
> > >> indent_style = space
> > >> indent_size = 2
> > >> # Can narrow scope to only source code files
> > >> [*.{java,scala,xml}]
> > >> indent_style = space
> > >> indent_size = 2
> > >>
> > >> EditorConfig plugins format only newly typed lines with these
> > >> settings; they do not reformat existing files, meaning only files
> > >> actually
> > changed by
> > >> one's commit will be affected by these settings.  There are
> > >> separate command-line tools that can check, infer, or fix
> > >> EditorConfig rules
> > across
> > >> one or more directories/files in a repository manually.  I think
> > >> using
> > one
> > >> of these tools such as eclint would be essential for writing a
> > >> proper .editorconfig that narrows its scope as much as possible
> > >> (e.g., we don't want to change existing bin or dat or tdml files
> > >> inadvertently when
> > editing
> > >> a single character within them in Emacs or IDEA because many of
> > >> them use other charsets and are not source code).
> > >>
> > >> There's a nice long list of projects already using EditorConfig
> > >> with
> > links
> > >> to their .editorconfig files.  We also can look for similar
> > >> projects to Daffodil to see how they scope their .editorconfig
> > >> rules for their own files, but again, using "eclint infer" and
> > >> "eclint check" seems the
> > safest
> > >> way to me.
> > >>
> > >> John
> > >>
> > >> -Original Message-
> > >> From: Beckerle, Mike 
> > >> Sent: Monday, April 19, 2021 9:56 AM
> > >> To: dev@daffodil.apache.org
> > >> Subject: EXT: editconfig
> > >>
> > >> https://editorconfig.org/
> > >>
> > >> This is interesting and we should consider adding these files to
> > >> the
> > root
> > >> of daffodil both as a declaration of the code-style, and a way that
> > >> auto-configures many IDEs and tools (like github).
> > >>
> > >> This does not appear to be sophisticated enough to really cover
> > code-style
> > >> issues at all, but at least basic whitespace stuff like spaces not
> > >> tabs, 2-space indenting, etc. would be covered.
> > >>
> > >>
> > >>
> > >
> >
> >
>


Re: all this github spam ?

2021-04-21 Thread John Wass
The trick is being able to modify the CI workflow in the PR to inject new
behavior.  If there were some kind of limit on that, it would decrease the
usefulness of this attack vector.

On Wed, Apr 21, 2021 at 9:33 AM Beckerle, Mike <
mbecke...@owlcyberdefense.com> wrote:

> This has to do with crypto mining?  Gaaak.
>
> So their PR contains crypto mining code, and they are doing this to get
> the CI to run it as part of the way CI checks any PR?
>
> Sounds like submitting a PR has to require a Capcha or 2-FA.
>
>
> 
> From: Steve Lawrence 
> Sent: Wednesday, April 21, 2021 9:22 AM
> To: dev@daffodil.apache.org 
> Subject: Re: all this github spam ?
>
> Unfortunately, I'm not sure there's anything we can do about it.
>
> GitHub doesn't give any controls over who can/can't open a PR. We can't
> even temporarily close PR's completely.
>
> We could maybe make it so GitHub actions on PRs must be manually
> triggered so the spammers cryptocurrency mining stuff would never run.
> But that's a bit of a pain, and it relies on the spammers to realize
> their stuff isn't being run anymore and take us off their list. My guess
> is we're stuck on their list forever now.
>
> These crypto mining attacks are a known issue for GitHub, hopefully
> they're working on a solution. Though, GitHub is eventually detecting
> these as spam and closing the accounts and deleting the PRs, but not
> until after the PR is created.
>
> As to the archive issue, we could maybe ask infra to remove archives
> that are clearly spam (all of them so far say "Demo titles Add
> files...", so unique and consistent). But it doesn't solve the
> underlying issue.
>
>
> On 4/21/21 8:59 AM, Beckerle, Mike wrote:
> > We seem to be fending off maybe 10 a day github spam attacks where people
> > open/close pull requests.
> >
> > Is there something systematic we can do to avoid this?
> >
> > This pollutes our mailing lists. I know we can manually purge the PRs
> from
> > github, but these things will live forever in the mail archives, adding
> a bunch
> > of random emails/account names to them, and generally making them less
> useful.
> >
> > Mike Beckerle | Principal Engineer
> >
> > mbecke...@owlcyberdefense.com 
> >
> > P +1-781-330-0412
> >
> > Connect with us!
> >
> > <
> https://twitter.com/owlcyberdefense>
> >
> > 
> >
> > **
> >
> > The information contained in this transmission is for the personal and
> > confidential use of the individual or entity to which it is addressed.
> If the
> > reader is not the intended recipient, you are hereby notified that any
> review,
> > dissemination, or copying of this communication is strictly prohibited.
> If you
> > have received this transmission in error, please notify the sender
> immediately
> >
>
>


Re: The future of the daffodil DFDL schema debugger?

2021-04-20 Thread John Wass
> Next step is to refine these thoughts with a prototype.

Another next step is to collect feedback on this research and proposed
approach.  Any discussion is appreciated.



On Tue, Apr 20, 2021 at 10:00 AM John Wass  wrote:

> > Going to look deeper into how DAP might fit with Daffodil
>
> Have been looking over DAP and getting a good feeling about it. The
> specification [1] seems general enough that it could be applied to Daffodil
> and cover a swath of common operations (like start, stop, break, continue,
> code locations, variables, etc).
>
> There are many areas though that are unique to Daffodil that have no
> representation in the spec.  These things (like InputStream, Infoset, PoU,
> different variable types, backtracking, etc) will need an extension to
> DAP.  This really boils down to defining these things to fit under the DAP
> BaseProtocol and enabling handling of those objects on both the front and
> back ends.
>
> On the backend we need a Daffodil DAP protocol server.  Existing JVM
> implementations (like Java [2], Scala [3]) are tied closely to JDI and
> would bring a lot of extra baggage to work around that.  Developing a
> Daffodil specific implementation is no small task, but feasible.  There are
> several existing implementations on the JVM that are close and can be
> looked at for reference.
>
> The backend implementation would look similar to what was described in an
> earlier post.  We could use ZIO/Akka/etc to implement the backend Protocol
> Server to enable the IO between the Daffodil process and the DAP clients.
> This implementation would now be guided by the DAP specification.
>
> With the protocol and backend extended to fit Daffodil that leaves the
> frontend.  In theory an existing IDE plugin should get pretty close to
> being able to perform the common debug operations mentioned above.  To
> support the Daffodil extensions there will need to be handling of the
> extended protocol into whatever views are desired/applicable.
>
> > Also looking into the Java Debug Interface (JDI) for comparison.
>
> JDI appears to be the wrong level of abstraction for what we are talking
> about in debugging Daffodil for schema development.  While DAP does do JVM
> debugging (through a JDI DAP impl) it also generalizes to many other
> debugging scenarios.  JDI on the other hand is very tied to the JVM.
>
> Extending the JDI appears to be more complex than dealing with DAP, and
> even though the JDI API is mostly defined with interfaces, there are choke
> points that limit it to JVM concepts.  For example, jdi.Value has a finite set
> of JVM types that it works with; it's not clear where Daffodil types would
> plug in, if even possible.
>
> The final note is that unique Daffodil features wouldn’t get to IDE
> support any faster with JDI.  In some cases, like VS Code, you would still need
> an extended DAP to support these features.
>
> > and depending on how it shakes out will update the example to show
> integration
>
> It would appear wise to investigate DAP further.  Next step is to refine
> these thoughts with a prototype. I started an implementation in the example
> debugger project [4] to try to run the current example on a _minimal_ DAP
> implementation.
>
>
> [1] https://microsoft.github.io/debug-adapter-protocol/specification
> [2] https://github.com/Microsoft/java-debug
> [3] https://github.com/scalacenter/scala-debug-adapter
> [4] https://github.com/jw3/example-daffodil-debug
>
>
> On Mon, Apr 12, 2021 at 9:58 AM John Wass  wrote:
>
>> > the code is here https://github.com/jw3/example-daffodil-debug
>>
>> There is now a complete console based example for Zio that demonstrates
>> controlling the debug flow while distributing the current state to three
>> "displays".
>> 1. infoset at current step
>> 2. diff of infoset against previous step
>> 3. bit position and value of data.
>>
>> These displays are very rudimentary but demonstrate the ability to
>> asynchronously populate multiple views while synchronously controlling the
>> debug loop.
>>
>> > - The new protocol being informed by existing debugger and DAP is key
>>
>> Going to look deeper into how DAP might fit with Daffodil, and depending
>> on how it shakes out will update the example to show integration.
>>
>> Some interesting links to start with
>> - https://github.com/scalacenter/scala-debug-adapter
>> -
>> https://scalameta.org/metals/docs/integrations/debug-adapter-protocol.html
>> - https://github.com/microsoft/java-debug
>>
>> Also looking into the Java Debug Interface (JDI) for comparison.
>>
>>
>> On Thu, Apr 8, 2021 at 12:36 PM John Wass

Re: The future of the daffodil DFDL schema debugger?

2021-04-20 Thread John Wass
> Going to look deeper into how DAP might fit with Daffodil

Have been looking over DAP and getting a good feeling about it. The
specification [1] seems general enough that it could be applied to Daffodil
and cover a swath of common operations (like start, stop, break, continue,
code locations, variables, etc).

There are many areas though that are unique to Daffodil that have no
representation in the spec.  These things (like InputStream, Infoset, PoU,
different variable types, backtracking, etc) will need an extension to
DAP.  This really boils down to defining these things to fit under the DAP
BaseProtocol and enabling handling of those objects on both the front and
back ends.

On the backend we need a Daffodil DAP protocol server.  Existing JVM
implementations (like Java [2], Scala [3]) are tied closely to JDI and
would bring a lot of extra baggage to work around that.  Developing a
Daffodil specific implementation is no small task, but feasible.  There are
a several existing implementations on the JVM that are close and can be
looked at for reference.

The backend implementation would look similar to what was described in an
earlier post.  We could use ZIO/Akka/etc to implement the backend Protocol
Server to enable the IO between the Daffodil process and the DAP clients.
This implementation would now be guided by the DAP specification.

With the protocol and backend extended to fit Daffodil that leaves the
frontend.  In theory an existing IDE plugin should get pretty close to
being able to perform the common debug operations mentioned above.  To
support the Daffodil extensions there will need to be handling of the
extended protocol into whatever views are desired/applicable.

> Also looking into the Java Debug Interface (JDI) for comparison.

JDI appears to be the wrong level of abstraction for what we are talking
about in debugging Daffodil for schema development.  While DAP does do JVM
debugging (through a JDI DAP impl) it also generalizes to many other
debugging scenarios.  JDI on the other hand is very tied to the JVM.

Extending the JDI appears to be more complex than dealing with DAP, and
even though the JDI API is mostly defined with interfaces, there are choke
points that limit it to JVM concepts.  For example, jdi.Value has a finite set
of JVM types that it works with; it's not clear where Daffodil types would
plug in, if even possible.

The final note is that unique Daffodil features wouldn’t get to IDE support
any faster with JDI.  In some cases, like VS Code, you would still need an
extended DAP to support these features.

> and depending on how it shakes out will update the example to show
integration

It would appear wise to investigate DAP further.  Next step is to refine
these thoughts with a prototype. I started an implementation in the example
debugger project [4] to try to run the current example on a _minimal_ DAP
implementation.


[1] https://microsoft.github.io/debug-adapter-protocol/specification
[2] https://github.com/Microsoft/java-debug
[3] https://github.com/scalacenter/scala-debug-adapter
[4] https://github.com/jw3/example-daffodil-debug
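To make the "minimal DAP implementation" concrete: every DAP message shares a
small JSON envelope, and extensions ride in custom events and requests. The
envelope fields below follow the specification; the `daffodil/dataPosition`
event name and its body are purely hypothetical examples of the proposed
Daffodil extension:

```json
{ "seq": 1, "type": "request", "command": "initialize",
  "arguments": { "adapterID": "daffodil", "linesStartAt1": true } }

{ "seq": 2, "type": "response", "request_seq": 1, "success": true,
  "command": "initialize",
  "body": { "supportsConfigurationDoneRequest": true } }

{ "seq": 3, "type": "event", "event": "daffodil/dataPosition",
  "body": { "bitPos": 1024, "bytePos": 128 } }
```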


On Mon, Apr 12, 2021 at 9:58 AM John Wass  wrote:

> > the code is here https://github.com/jw3/example-daffodil-debug
>
> There is now a complete console based example for Zio that demonstrates
> controlling the debug flow while distributing the current state to three
> "displays".
> 1. infoset at current step
> 2. diff of infoset against previous step
> 3. bit position and value of data.
>
> These displays are very rudimentary but demonstrate the ability to
> asynchronously populate multiple views while synchronously controlling the
> debug loop.
>
> > - The new protocol being informed by existing debugger and DAP is key
>
> Going to look deeper into how DAP might fit with Daffodil, and depending
> on how it shakes out will update the example to show integration.
>
> Some interesting links to start with
> - https://github.com/scalacenter/scala-debug-adapter
> -
> https://scalameta.org/metals/docs/integrations/debug-adapter-protocol.html
> - https://github.com/microsoft/java-debug
>
> Also looking into the Java Debug Interface (JDI) for comparison.
>
>
> On Thu, Apr 8, 2021 at 12:36 PM John Wass  wrote:
>
>> Revisiting this post after doing some debugger related work and thinking
>> about debug protocol/adapters to connect external tooling to the debug
>> process.
>>
>> This comment is good
>>
>> > This also makes me wonder if an approach worth taking for the future of
>> Daffodil schema debugging is developing a sort of "Daffodil Debug
>> Protocol". I imagine it would be loosely based on DAP (which is
>> essentially JSON message based) but could be targeted to the things that a
>> DFDL schema debugger would really need. An added benefit with some  sort of
>&

Re: [Discuss] creating Release 3.1.0 and 96 JIRA tickets marked "Major" or higher

2021-04-12 Thread John Wass
> I believe we need to do a release very soon regardless of these 96 issues

I would like DAFFODIL-2482 to get into it:
https://github.com/apache/daffodil/pull/520

Will increase priority on wrapping this up.



On Mon, Apr 12, 2021 at 12:43 PM Beckerle, Mike <
mbecke...@owlcyberdefense.com> wrote:

> I'd like to discuss our need to create a new release of Daffodil, which
> would be 3.1.0.
>
> We have added enough new functionality certainly to justify a release.
> There are important features already complete, there is the new Runtime2
> backend, etc.
>
> The challenge is that we have 96 JIRA tickets specifically for bugs that
> are marked "major" or above in priority.  6 are marked critical, so 90 are
> "major". (I am excluding all "improvement" and "new-feature" tickets in
> this count. Just bugs.) Obviously we're not going to fix 96 issues super
> quickly.
>
> Some people advocate a set of criteria for releases which stipulate there
> can be no critical/blocker issues, and no major issues, only minor issues.
> However, the status of critical/major/minor on our JIRA tickets is
> subjective, most bugs are found and reported by us.
>
> Exactly two bugs have "votes" more than zero, which reflects that we've
> not been using the votes field to prioritize anything, but perhaps we
> should use votes moving forward, rather than bumping priorities up and down
> based on our subjective assessment of importance.
>
> I believe we need to do a release very soon regardless of these 96 issues.
> In scrolling through them, evaluating them as "are they more important than
> doing our first TLP release", none of them rise to that level of importance
> to me.
>
> Most of these issues are part of release 3.0.0 and before that as well, so
> 3.1.0 would still be an improvement.
>
> One way to deal with the critical issues is to specifically discuss them
> in a release note.
>
> Please let's discuss openly. What do you believe must​ be in 3.1.0, that
> we would hold up a release over?
>
> -mike beckerle
>
>
>


Re: The future of the daffodil DFDL schema debugger?

2021-04-12 Thread John Wass
> the code is here https://github.com/jw3/example-daffodil-debug

There is now a complete console based example for Zio that demonstrates
controlling the debug flow while distributing the current state to three
"displays".
1. infoset at current step
2. diff of infoset against previous step
3. bit position and value of data.

These displays are very rudimentary but demonstrate the ability to
asynchronously populate multiple views while synchronously controlling the
debug loop.
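The synchronous-loop/asynchronous-display split above can be sketched in
plain Java (the actual example uses ZIO; all names here are illustrative,
not real Daffodil types):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class DebugLoopSketch {
    // A display could be the infoset view, the infoset diff, or the
    // bit-position view from the example.
    interface Display { void onStep(String state); }

    // Advance the debug loop synchronously; fan each state out to displays.
    static List<String> run(List<String> steps, List<Display> displays) {
        List<String> visited = new ArrayList<>();
        for (String state : steps) {
            List<CompletableFuture<Void>> updates = new ArrayList<>();
            for (Display d : displays) {
                // Displays render asynchronously...
                updates.add(CompletableFuture.runAsync(() -> d.onStep(state)));
            }
            // ...but the loop stays in control, stepping only after the
            // fan-out for this state has been issued (join keeps the
            // sketch deterministic; a real impl could let displays lag).
            updates.forEach(CompletableFuture::join);
            visited.add(state);
        }
        return visited;
    }
}
```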

> - The new protocol being informed by existing debugger and DAP is key

Going to look deeper into how DAP might fit with Daffodil, and depending on
how it shakes out will update the example to show integration.

Some interesting links to start with
- https://github.com/scalacenter/scala-debug-adapter
- https://scalameta.org/metals/docs/integrations/debug-adapter-protocol.html
- https://github.com/microsoft/java-debug

Also looking into the Java Debug Interface (JDI) for comparison.


On Thu, Apr 8, 2021 at 12:36 PM John Wass  wrote:

> Revisiting this post after doing some debugger related work and thinking
> about debug protocol/adapters to connect external tooling to the debug
> process.
>
> This comment is good
>
> > This also makes me wonder if an approach worth taking for the future of
> Daffodil schema debugging is developing a sort of "Daffodil Debug Protocol".
> I imagine it would be loosely based on DAP (which is  essentially JSON
> message based) but could be targeted to the things that a DFDL schema
> debugger would really need. An added benefit with some  sort of protocol
> is the debugger interface can be uncoupled from Daffodil itself, so we
> could implement a TUI/GUI/whatever in any  language/GUI framework and just
> have it communicate the protocol over some form of IPC. Another benefit
> is that any future backends could implement this protocol and so a single
> debugger could hook into different backends without much issue.
> Unfortunately, defining such a protocol might be a large task, but we do
> have our existing debug infrastructure and things like DAP to guide its
> development/design.
>
> Some thoughts on this
> - Defining the protocol will be a large task, but a minimal version should
> get up and round tripping quickly with a minimal subset of the protocol.
> - The new protocol being informed by existing debugger and DAP is key
> - Uncoupling from Daffodil is key
> - Adapt the Daffodil protocol to produce DAP after the fact so as not to
> constrain Daffodil debugging capability
> - We dont need to tie the protocol or adapters to a single framework,
> implementations of the IO layer should be simple enough to support multiple
> things (eg Akka, Zio, "basic" ...)
> - The current debugger lives in runtime1, but can we make an abstract API
> that any runtime would implement?
>
> Maybe a solution is structured like this
> - daffodil-debug-api:
>   - protocol model
>   - interfaces: debugger / IO adapter / etc
>   - lives in daffodil repo (new subproject?)
> - daffodil-debug-io-NAME
>   - provides implementation of a specific IO adapter
>   - multiple projects possible (daffodil-debugger-akka,
> daffodil-debugger-zio, etc)
>   - supported ones live in their own subprojects, but others can be plugged
> in from external sources
>   - ability to support multiple implementations reduces risk of lock-in
> - debugger applications
>   - maintained in external repositories
>   - depending on the IO implementation these could execute in a separate
> process or on a separate machine
>   - like Steve said, could be any language / framework
>
> Three types of reference implementations / sample applications could also
> guide the development of the API
>   1. a replacement for the existing TUI debugger, expected to end up with
> at minimum the same functionality as the current one.
>   2. a standalone GUI (JavaFX, Scala.js, ..) debugger
>   3. an IDE integration
>
> Thoughts?
>
> Also I'm working on some reference implementations of these concepts using
> Akka and Zio.  Not quite ready to talk through it yet, but the code is here
> https://github.com/jw3/example-daffodil-debug
>
>
>
> On Wed, Jan 6, 2021 at 1:42 PM Steve Lawrence 
> wrote:
>
>> Yep, something like that seems very reasonable for dealing with large
>> infosets. But it still feels like we run into usability issues.
>> For example, what if a user wants to see more? We need some
>> configuration options to increase what we've elided. It's not big, but
>> every new thing that needs configuration adds complexity and decreases
>> usability.
>>
>> And I think the only reason we are trying to spend effort eliding
>> things is because we're lim

Re: The future of the daffodil DFDL schema debugger?

2021-04-08 Thread John Wass
> lives in daffodil repo (new subproject?)

Not asking a question here, meant to snip out those parens.

The daffodil-debug-api and any daffodil-debug-io-NAME projects do represent
new subprojects.

Just wanted to clarify; I never see those things until send is hit.



On Thu, Apr 8, 2021 at 12:36 PM John Wass  wrote:

> Revisiting this post after doing some debugger related work and thinking
> about debug protocol/adapters to connect external tooling to the debug
> process.
>
> This comment is good
>
> > This also makes me wonder if an approach worth taking for the future of
> Daffodil schema debugging is developing a sort of "Daffodil Debug Protocol".
> I imagine it would be loosely based on DAP (which is  essentially JSON
> message based) but could be targeted to the things that a DFDL schema
> debugger would really need. An added benefit with some  sort of protocol
> is the debugger interface can be uncoupled from Daffodil itself, so we
> could implement a TUI/GUI/whatever in any  language/GUI framework and just
> have it communicate the protocol over some form of IPC. Another benefit
> is that any future backends could implement this protocol and so a single
> debugger could hook into different backends without much issue.
> Unfortunately, defining such a protocol might be a large task, but we do
> have our existing debug infrastructure and things like DAP to guide its
> development/design.
>
> Some thoughts on this
> - Defining the protocol will be a large task, but a minimal version should
> get us up and round-tripping quickly with a small subset of the protocol.
> - The new protocol being informed by the existing debugger and DAP is key
> - Uncoupling from Daffodil is key
> - Adapt the Daffodil protocol to produce DAP after the fact so as not to
> constrain Daffodil debugging capability
> - We don't need to tie the protocol or adapters to a single framework,
> implementations of the IO layer should be simple enough to support multiple
> things (eg Akka, Zio, "basic" ...)
> - The current debugger lives in runtime1, but can we make an abstract API
> that any runtime would implement?
>
> Maybe a solution is structured like this
> - daffodil-debug-api:
>   - protocol model
>   - interfaces: debugger / IO adapter / etc
>   - lives in daffodil repo (new subproject?)
> - daffodil-debug-io-NAME
>   - provides implementation of a specific IO adapter
>   - multiple projects possible (daffodil-debugger-akka,
> daffodil-debugger-zio, etc)
>   - supported ones live in their own subprojects, but others can be plugged
> in from external sources
>   - ability to support multiple implementations reduces risk of lock-in
> - debugger applications
>   - maintained in external repositories
>   - depending on the IO implementation these could execute in a separate
> process or on a separate machine
>   - like Steve said, could be any language / framework
>
> Three types of reference implementations / sample applications could also
> guide the development of the API
>   1. a replacement for the existing TUI debugger, expected to end up with
> at minimum the same functionality as the current one.
>   2. a standalone GUI (JavaFX, Scala.js, ..) debugger
>   3. an IDE integration
>
> Thoughts?
>
> Also I'm working on some reference implementations of these concepts using
> Akka and Zio.  Not quite ready to talk through it yet, but the code is here
> https://github.com/jw3/example-daffodil-debug
>
>
>
> On Wed, Jan 6, 2021 at 1:42 PM Steve Lawrence 
> wrote:
>
>> Yep, something like that seems very reasonable for dealing with large
>> infosets. But it still feels like we run into usability issues.
>> For example, what if a user wants to see more? We need some
>> configuration options to increase what we've elided. It's not big, but
>> every new thing that needs configuration adds complexity and decreases
>> usability.
>>
>> And I think the only reason we are trying to spend effort eliding
>> things is because we're limited to this gdb-like interface where you can
>> only print out a little information at a time.
>>
>> I think what would really help is to dump this gdb interface and instead use
>> multiple windows/views. As a really close example to what I imagine, I
>> recently came across this hex editor:
>>
>> https://www.synalysis.net/
>>
>> The screenshots are a bit small so it's not super clear, but this tool
>> has one view for the data in hex, and one view for a tree of parsed
>> results (which is very similar to our infoset). The "infoset" view has
>> information like offset/length/value, and can be related back to the
>> data 

Re: The future of the daffodil DFDL schema debugger?

2021-04-08 Thread John Wass
Revisiting this post after doing some debugger related work and thinking
about debug protocol/adapters to connect external tooling to the debug
process.

This comment is good

> This also makes me wonder if an approach worth taking for the future of
Daffodil schema debugging is developing a sort of "Daffodil Debug Protocol".
I imagine it would be loosely based on DAP (which is  essentially JSON
message based) but could be targeted to the things that a DFDL schema
debugger would really need. An added benefit with some  sort of protocol is
the debugger interface can be uncoupled from Daffodil itself, so we could
implement a TUI/GUI/whatever in any  language/GUI framework and just have
it communicate the protocol over some form of IPC. Another benefit is that
any future backends could implement this protocol and so a single debugger
could hook into different backends without much issue. Unfortunately,
defining such a protocol might be a large task, but we do have our existing
debug infrastructure and things like DAP to guide its development/design.

Some thoughts on this
- Defining the protocol will be a large task, but a minimal version should
get us up and round-tripping quickly with a small subset of the protocol.
- The new protocol being informed by the existing debugger and DAP is key
- Uncoupling from Daffodil is key
- Adapt the Daffodil protocol to produce DAP after the fact so as not to
constrain Daffodil debugging capability
- We don't need to tie the protocol or adapters to a single framework,
implementations of the IO layer should be simple enough to support multiple
things (eg Akka, Zio, "basic" ...)
- The current debugger lives in runtime1, but can we make an abstract API
that any runtime would implement?

Maybe a solution is structured like this
- daffodil-debug-api:
  - protocol model
  - interfaces: debugger / IO adapter / etc
  - lives in daffodil repo (new subproject?)
- daffodil-debug-io-NAME
  - provides implementation of a specific IO adapter
  - multiple projects possible (daffodil-debugger-akka,
daffodil-debugger-zio, etc)
  - supported ones live in their own subprojects, but others can be plugged
in from external sources
  - ability to support multiple implementations reduces risk of lock-in
- debugger applications
  - maintained in external repositories
  - depending on the IO implementation these could execute in a separate
process or on a separate machine
  - like Steve said, could be any language / framework
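To make the layering concrete, here is a minimal, self-contained sketch of one way it could look. Every name below (DebugEvent, Debugger, IoAdapter, BufferingAdapter) is invented purely for illustration; none of this is an existing Daffodil API.

```java
import java.util.ArrayList;
import java.util.List;

public class DebugApiSketch {
    // Protocol model: one event per parse-state change, serializable over any IPC.
    record DebugEvent(long id, long parentId, String parser, long bitPos) {
        String toJson() {
            return String.format(
                "{\"id\":%d,\"parent\":%d,\"parser\":\"%s\",\"bitPos\":%d}",
                id, parentId, parser, bitPos);
        }
    }

    // Abstract API any runtime could implement; runtime1's debugger would adapt to it.
    interface Debugger {
        void onEvent(DebugEvent e);
    }

    // IO adapter: ships events over some transport (Akka, ZIO, plain socket, ...).
    interface IoAdapter extends Debugger {}

    // A "basic" adapter that just buffers JSON lines, standing in for real IPC.
    static class BufferingAdapter implements IoAdapter {
        final List<String> sent = new ArrayList<>();
        public void onEvent(DebugEvent e) { sent.add(e.toJson()); }
    }

    public static void main(String[] args) {
        BufferingAdapter io = new BufferingAdapter();
        io.onEvent(new DebugEvent(1, 0, "DelimiterParser", 128));
        io.sent.forEach(System.out::println);
    }
}
```

A DAP adapter would then sit on top of such a protocol, translating events after the fact, so Daffodil's own debugging capability is not constrained by DAP.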

Three types of reference implementations / sample applications could also
guide the development of the API
  1. a replacement for the existing TUI debugger, expected to end up with
at minimum the same functionality as the current one.
  2. a standalone GUI (JavaFX, Scala.js, ..) debugger
  3. an IDE integration

Thoughts?

Also I'm working on some reference implementations of these concepts using
Akka and Zio.  Not quite ready to talk through it yet, but the code is here
https://github.com/jw3/example-daffodil-debug



On Wed, Jan 6, 2021 at 1:42 PM Steve Lawrence  wrote:

> Yep, something like that seems very reasonable for dealing with large
> infosets. But it still feels like we run into usability issues.
> For example, what if a user wants to see more? We need some
> configuration options to increase what we've elided. It's not big, but
> every new thing that needs configuration adds complexity and decreases
> usability.
>
> And I think the only reason we are trying to spend effort eliding
> things is because we're limited to this gdb-like interface where you can
> only print out a little information at a time.
>
> I think what would really help is to dump this gdb interface and instead use
> multiple windows/views. As a really close example to what I imagine, I
> recently came across this hex editor:
>
> https://www.synalysis.net/
>
> The screenshots are a bit small so it's not super clear, but this tool
> has one view for the data in hex, and one view for a tree of parsed
> results (which is very similar to our infoset). The "infoset" view has
> information like offset/length/value, and can be related back to the
> data view to find the actual bits.
>
> I imagine the "next generation daffodil debugger" to look much like
> this. As data is parsed, the infoset view fills up. This view could act
> like a standard GUI tree so you could collapse sections or scroll around
> to show just the parts you care about, and have search capabilities to
> quickly jump around. The advantage here is you no longer really need
> automated eliding or heuristics for what the user *might* care about.
> You just show the whole thing and let user scroll around. As daffodil
> parses and backtracks, this tree grows or shrinks.
>
> I also imagine you could have a cursor moving around the hex view, so as
> daffodil moves around (e.g. scanning for delimiters, extracting
> integers), one could update this data view to show what daffodil is
> doing and where it is.
>
> I also imagine there could be other views

Re: Daffodil JIRA ticket DAFFODIL-2491

2021-04-06 Thread John Wass
Dave,  The Jira issue has been updated.  Thanks for the link to the
workflow.

john

On Tue, Apr 6, 2021 at 8:15 AM Thompson, Dave 
wrote:

> Good morning John.
>
> I am working to verify/close the "Resolved" Daffodil v3.1.0 JIRA tickets
> for the upcoming release.
>
> I see you made a recent commit to the daffodil repo that addressed JIRA
> ticket DAFFODIL-2491. I reviewed the v3.1.0 resolved tickets and saw that
> DAFFODIL-2491 was not listed. I viewed the ticket and found it had not been
> updated with a comment on the changes, its status changed to "Resolved", or
> its fix version set to v3.1.0.
>
> If the issue is fully resolved could you please update the ticket per the
> "Code Contributor Workflow" step 16 at the following Apache Daffodil page:
>
>
> https://cwiki.apache.org/confluence/display/DAFFODIL/Code+Contributor+Workflow
>
> I use the ticket and commit comments to determine what/how  I need to
> review/verify the resolution.
>
> Thanks,
>
> Dave
>
> -Original Message-
> From: j...@apache.org
> Sent: Wednesday, March 31, 2021 1:21 PM
> To: comm...@daffodil.apache.org
> Subject: [daffodil] branch master updated: Allow custom debuggers through
> SAPI and JAPI
>
> This is an automated email from the ASF dual-hosted git repository.
>
> jw3 pushed a commit to branch master
> in repository https://gitbox.apache.org/repos/asf/daffodil.git
>
>
> The following commit(s) were added to refs/heads/master by this push:
>  new 7faeb04  Allow custom debuggers through SAPI and JAPI
> 7faeb04 is described below
>
> commit 7faeb04aa17337487848f5f61141a74d7d82484b
> Author: John Wass 
> AuthorDate: Wed Mar 31 10:45:51 2021 -0400
>
> Allow custom debuggers through SAPI and JAPI
>
> DAFFODIL-2491
> ---
>  .../scala/org/apache/daffodil/japi/Daffodil.scala  | 20 --
>  .../daffodil/example/TestCustomDebuggerAPI.java| 79
> ++
>  .../org/apache/daffodil/example/TestJavaAPI.java   | 14 ++--
>  .../scala/org/apache/daffodil/sapi/Daffodil.scala  | 20 --
>  .../daffodil/example/TestCustomDebuggerAPI.scala   | 62 +
>  .../org/apache/daffodil/example/TestScalaAPI.scala | 16 ++---
>  6 files changed, 188 insertions(+), 23 deletions(-)
>
> diff --git
> a/daffodil-japi/src/main/scala/org/apache/daffodil/japi/Daffodil.scala
> b/daffodil-japi/src/main/scala/org/apache/daffodil/japi/Daffodil.scala
> index 83918dc..ba48c30 100644
> --- a/daffodil-japi/src/main/scala/org/apache/daffodil/japi/Daffodil.scala
> +++ b/daffodil-japi/src/main/scala/org/apache/daffodil/japi/Daffodil.scala
> @@ -58,6 +58,7 @@ import java.net.URI
>
>  import org.apache.daffodil.api.URISchemaSource
>  import org.apache.daffodil.api.Validator
> +import org.apache.daffodil.debugger.Debugger
>  import org.apache.daffodil.util.Maybe
>  import org.apache.daffodil.util.Maybe._
>  import org.apache.daffodil.util.MaybeULong
> @@ -532,7 +533,8 @@ class DataProcessor private[japi] (private var dp:
> SDataProcessor)
>/**
> * Enable/disable debugging.
> *
> -   * Before enabling, [[DataProcessor#setDebugger(DebuggerRunner)]] must
> be called with a non-null debugger.
> +   * Before enabling, [[DataProcessor#withDebugger]] or
> [[DataProcessor#withDebuggerRunner(DebuggerRunner)]] must be
> +   * called with a non-null debugger.
> *
> * @param flag true to enable debugging, false to disabled
> */
> @@ -544,7 +546,8 @@ class DataProcessor private[japi] (private var dp:
> SDataProcessor)
>/**
> * Obtain a new [[DataProcessor]] instance with debugging enabled or
> disabled.
> *
> -   * Before enabling, [[DataProcessor#withDebugger(DebuggerRunner)]] must
> be called to obtain a [[DataProcessor]] with a non-null debugger.
> +   * Before enabling, [[DataProcessor#withDebugger(Debugger)]] or
> [[DataProcessor#withDebuggerRunner(DebuggerRunner)]]
> +   * must be called to obtain a [[DataProcessor]] with a non-null
> debugger.
> *
> * @param flag true to enable debugging, false to disabled
> */
> @@ -557,7 +560,7 @@ class DataProcessor private[japi] (private var dp:
> SDataProcessor)
> *
> * @param dr debugger runner
> */
> -  @deprecated("Use withDebugger.", "2.6.0")
> +  @deprecated("Use withDebuggerRunner.", "2.6.0")
>def setDebugger(dr: DebuggerRunner): Unit = {
>  val debugger = newDebugger(dr)
>  dp = dp.withDebugger(debugger)
> @@ -568,11 +571,20 @@ class DataProcessor private[japi] (private var dp:
> SDataProcessor)
> *
> * @param dr debugger runner
> */
> -  def withDebugger(dr: DebuggerRunner): DataProcesso

Re: Output SVRL from Schematron Validator

2021-04-05 Thread John Wass
Thanks.

> Do we need API-level access to this? E.g. in SAPI/JAPI? I would imagine
so.

Yeah, good call, I'll add it.



On Mon, Apr 5, 2021 at 1:31 PM Beckerle, Mike 
wrote:

> I looked at the PR for this feature. I think it's fine to have the CLI
> provide an option with a file to write it to, and API-wise, if we decide we
> have to expose this, then a parseResult.validationResult.raw member, or
> something like that, is fine with me.
>
> Do we need API-level access to this? E.g. in SAPI/JAPI? I would imagine so.
> ____
> From: John Wass 
> Sent: Monday, March 29, 2021 1:55 PM
> To: dev@daffodil.apache.org 
> Subject: Re: Output SVRL from Schematron Validator
>
> The thought with the OutputStream was it would be dumped directly to a file
> or log or stdout, definitely more of a logging effect than for more
> processing, since the structured results from a validator are already
> returned as ValidationResult.  That idea looks and sounds worse today than
> it did initially.
>
> > What about if each ParseResult has a member
>
> Ah, what if the ParseResult hangs on to the ValidationResult and makes it
> accessible that way?
>
>   def validationResult(): Option[ValidationResult]
>
> To support this, ValidationResult would become a trait which lets validator
> implementations attach custom data and interfaces to the result, which
> clients can get to through the ParseResult accessor.
>
> Something like this:
> https://github.com/jw3/daffodil/tree/validator_result_refactor
>
> Thoughts?
>
>
> On Fri, Mar 26, 2021 at 10:30 AM Steve Lawrence 
> wrote:
>
> > What about if each ParseResult has a member that's something like
> >
> >   val validationData: Option[AnyRef]
> >
> > Each validator can optionally return some validation data which is then
> > stored in this member. The user could then access this validation data
> > through the ParseResult and cast it to what it should be, as documented
> > by the validator.
> >
> > This allows each validator a way to provide whatever additional data they
> > want in whatever format makes the most sense for them.
> >
> > There's the downside that a user needs to know how to cast this AnyRef
> > based on which validator was used. But a similar issue exists if this is
> > just an InputStream--you still need to know how to interpret that
> > InputStream data. But with this approach, it lets a Validator return
> > complex structures that can provide richer information than an
> > InputStream could.
> >
> > On 3/26/21 10:16 AM, John Wass wrote:
> > > Reference implementation here
> > > https://github.com/jw3/daffodil/tree/validator_outputstream
> > >
> > > Currently has changes sketched in from the parse result on down.  Need
> to
> > > wire things in through DP and CLI yet.
> > >
> > > Haven't thought of an alternative that works yet.
> > >
> > >
> > > On Tue, Mar 23, 2021 at 12:59 PM John Wass  wrote:
> > >
> > >> Looking at DAFFODIL-2482 that came up due to a gap that's blocking
> > >> integration of the schematron validation functionality into some workflows
> > >> that require the full SVRL output, not just the pass/fail status.
> > >>
> > >> So what needs to happen here is the SVRL that we currently just parse for
> > >> errors and discard needs to be output in a predictable way. I've tried a
> > >> couple of things, intent on minimizing the footprint of the impl, but keep
> > >> coming up empty, mainly due to violating the reusable-validator principle.
> > >>
> > >> So another unminimized approach would be to provide an additional stream
> > >> to all validators for raw output to be written, the implementation of that
> > >> stream is determined by configuration from the DataProcessor.  The new
> > >> output stream is passed at validation-time, which requires changing the
> > >> signature of the validate call to accept this output stream in addition to
> > >> the existing input stream (or we could add another interface, but I'm not
> > >> convinced of the usefulness of that currently).
> > >>
> > >> Looking for some thoughts on this approach.
> > >>
> > >>
> > >> [1] https://issues.apache.org/jira/browse/DAFFODIL-2482
> > >>
> > >>
> > >
> >
> >
>


Re: Output SVRL from Schematron Validator

2021-03-29 Thread John Wass
The thought with the OutputStream was it would be dumped directly to a file
or log or stdout, definitely more of a logging effect than for more
processing, since the structured results from a validator are already
returned as ValidationResult.  That idea looks and sounds worse today than
it did initially.

> What about if each ParseResult has a member

Ah, what if the ParseResult hangs on to the ValidationResult and makes it
accessible that way?

  def validationResult(): Option[ValidationResult]

To support this, ValidationResult would become a trait which lets validator
implementations attach custom data and interfaces to the result, which
clients can get to through the ParseResult accessor.
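A toy, self-contained mock of that shape, for illustration only. The class names below are stand-ins, not Daffodil's real API:

```java
import java.util.Optional;

public class ValidationResultSketch {
    // The proposed trait: all validators return a common base type...
    interface ValidationResult {
        boolean isValid();
    }

    // ...and each implementation attaches its own custom data, e.g. raw SVRL.
    record SvrlResult(boolean valid, String rawSvrl) implements ValidationResult {
        public boolean isValid() { return valid; }
    }

    // ParseResult hangs on to the ValidationResult and exposes it to clients.
    record ParseResult(Optional<ValidationResult> validationResult) {}

    public static void main(String[] args) {
        ParseResult pr = new ParseResult(
            Optional.of(new SvrlResult(false, "<svrl:schematron-output/>")));
        // Clients narrow to the validator-specific type to reach custom data.
        pr.validationResult()
          .filter(r -> r instanceof SvrlResult)
          .map(r -> ((SvrlResult) r).rawSvrl())
          .ifPresent(System.out::println);
    }
}
```

The downcast is the same trade-off Steve noted for AnyRef, but a trait at least gives every result a common, documented base.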

Something like this:
https://github.com/jw3/daffodil/tree/validator_result_refactor

Thoughts?


On Fri, Mar 26, 2021 at 10:30 AM Steve Lawrence 
wrote:

> What about if each ParseResult has a member that's something like
>
>   val validationData: Option[AnyRef]
>
> Each validator can optionally return some validation data which is then
> stored in this member. The user could then access this validation data
> through the ParseResult and cast it to what it should be, as documented
> by the validator.
>
> This allows each validator a way to provide whatever additional data they
> want in whatever format makes the most sense for them.
>
> There's the downside that a user needs to know how to cast this AnyRef
> based on which validator was used. But a similar issue exists if this is
> just an InputStream--you still need to know how to interpret that
> InputStream data. But with this approach, it lets a Validator return
> complex structures that can provide richer information than an
> InputStream could.
>
> On 3/26/21 10:16 AM, John Wass wrote:
> > Reference implementation here
> > https://github.com/jw3/daffodil/tree/validator_outputstream
> >
> > Currently has changes sketched in from the parse result on down.  Need to
> > wire things in through DP and CLI yet.
> >
> > Haven't thought of an alternative that works yet.
> >
> >
> > On Tue, Mar 23, 2021 at 12:59 PM John Wass  wrote:
> >
> >> Looking at DAFFODIL-2482 that came up due to a gap that's blocking
> >> integration of the schematron validation functionality into some workflows
> >> that require the full SVRL output, not just the pass/fail status.
> >>
> >> So what needs to happen here is the SVRL that we currently just parse for
> >> errors and discard needs to be output in a predictable way. I've tried a
> >> couple of things, intent on minimizing the footprint of the impl, but keep
> >> coming up empty, mainly due to violating the reusable-validator principle.
> >>
> >> So another unminimized approach would be to provide an additional stream
> >> to all validators for raw output to be written, the implementation of that
> >> stream is determined by configuration from the DataProcessor.  The new
> >> output stream is passed at validation-time, which requires changing the
> >> signature of the validate call to accept this output stream in addition to
> >> the existing input stream (or we could add another interface, but I'm not
> >> convinced of the usefulness of that currently).
> >>
> >> Looking for some thoughts on this approach.
> >>
> >>
> >> [1] https://issues.apache.org/jira/browse/DAFFODIL-2482
> >>
> >>
> >
>
>


Re: Output SVRL from Schematron Validator

2021-03-26 Thread John Wass
Reference implementation here
https://github.com/jw3/daffodil/tree/validator_outputstream

Currently has changes sketched in from the parse result on down.  Need to
wire things in through DP and CLI yet.

Haven't thought of an alternative that works yet.


On Tue, Mar 23, 2021 at 12:59 PM John Wass  wrote:

> Looking at DAFFODIL-2482 that came up due to a gap that's blocking
> integration of the schematron validation functionality into some workflows
> that require the full SVRL output, not just the pass/fail status.
>
> So what needs to happen here is the SVRL that we currently just parse for
> errors and discard needs to be output in a predictable way. I've tried a
> couple of things, intent on minimizing the footprint of the impl, but keep
> coming up empty, mainly due to violating the reusable-validator principle.
>
> So another unminimized approach would be to provide an additional stream
> to all validators for raw output to be written, the implementation of that
> stream is determined by configuration from the DataProcessor.  The new
> output stream is passed at validation-time, which requires changing the
> signature of the validate call to accept this output stream in addition to
> the existing input stream (or we could add another interface, but I'm not
> convinced of the usefulness of that currently).
>
> Looking for some thoughts on this approach.
>
>
> [1] https://issues.apache.org/jira/browse/DAFFODIL-2482
>
>


Output SVRL from Schematron Validator

2021-03-23 Thread John Wass
Looking at DAFFODIL-2482 that came up due to a gap that's blocking
integration of the schematron validation functionality into some workflows
that require the full SVRL output, not just the pass/fail status.

So what needs to happen here is the SVRL that we currently just parse for
errors and discard needs to be output in a predictable way. I've tried a
couple of things, intent on minimizing the footprint of the impl, but keep
coming up empty, mainly due to violating the reusable-validator principle.

So another unminimized approach would be to provide an additional stream to
all validators for raw output to be written, the implementation of that
stream is determined by configuration from the DataProcessor.  The new
output stream is passed at validation-time, which requires changing the
signature of the validate call to accept this output stream in addition to
the existing input stream (or we could add another interface, but I'm not
convinced of the usefulness of that currently).
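A rough, self-contained mock of that signature change; the Validator interface and FakeSchematronValidator below are invented for illustration only and are not the actual Daffodil types:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class ValidatorStreamSketch {
    interface Validator {
        // validate() gains a second, write-only stream for raw output like SVRL;
        // configuration from the DataProcessor would decide where it points.
        boolean validate(InputStream data, OutputStream raw) throws IOException;
    }

    static class FakeSchematronValidator implements Validator {
        public boolean validate(InputStream data, OutputStream raw) throws IOException {
            // A real implementation would run the Schematron pipeline here,
            // writing the full SVRL report while still returning pass/fail.
            raw.write("<svrl:schematron-output/>".getBytes(StandardCharsets.UTF_8));
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream svrl = new ByteArrayOutputStream();
        boolean ok = new FakeSchematronValidator()
            .validate(InputStream.nullInputStream(), svrl);
        System.out.println(ok + " " + svrl.toString(StandardCharsets.UTF_8));
    }
}
```

The validator itself stays reusable; only the destination of the raw stream varies per DataProcessor configuration.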

Looking for some thoughts on this approach.


[1] https://issues.apache.org/jira/browse/DAFFODIL-2482


Re: Possible road map for GUI debugger

2021-02-16 Thread John Wass
> I am conceiving of it as a standalone utility.

I'm interested in the reasoning there.  No strong opinions, I can see some
advantages either way, but an IDE integration does seem to have a head
start in several areas related to debugging.



On Mon, Feb 15, 2021 at 1:27 PM Sloane, Brandon 
wrote:

> We are at the very early stages. I am conceiving of it as a standalone
> utility.
>
>
>
> Brandon Sloane | Engineer
>
> bslo...@owlcyberdefense.com
>
> Connect with us!
>
> <https://www.linkedin.com/company/owlcyberdefense/>
> <https://twitter.com/owlcyberdefense>
>
> <https://owlcyberdefense.com/resources/events/>
>
>
>
> The information contained in this transmission is for the personal and
> confidential use of the individual or entity to which it is addressed. If
> the reader is not the intended recipient, you are hereby notified that any
> review, dissemination, or copying of this communication is strictly
> prohibited. If you have received this transmission in error, please notify
> the sender immediately
>
> --
> *From:* John Wass 
> *Sent:* Monday, February 15, 2021 7:39 AM
> *To:* dev@daffodil.apache.org 
> *Subject:* Re: Possible road map for GUI debugger
>
> Is the gui envisioned to be an ide plugin or a standalone application (ie
> swing, fx, ...)?
>
>
>
> On Sun, Jan 24, 2021 at 2:08 PM Sloane, Brandon <
> bslo...@owlcyberdefense.com>
> wrote:
>
> > Following up on the recent discussion about the Daffodil debugger, I would
> > like to propose a roadmap for creating a graphical debugger.
> > Organizationally, the GUI debugger is a separate application (maintained
> > in the normal Daffodil repository) that may include any of the current
> > Daffodil libraries as a dependency.
> >
> > Phases 1 and 2 focus on developing an enhanced trace functionality; where
> > Daffodil outputs a machine readable trace of the parse/unparse process.
> The
> > GUI then allows a user to view this trace offline.
> >
> > Phase 3 adds support for interactive debugging, by allowing the
> > Daffodil-cli and Daffodil-gui-debugger to act in a client/server
> > relationship
> >
> > Phase 4 focuses on backend debugger enhancements.
> >
> > In more detail:
> >
> > Phase 1:
> >
> >- Add a --machine-trace flag to Daffodil CLI, that is analogous to the
> >current --trace option. Output is a stream of XML parse-state nodes,
> each
> >one describing the current state of the parse, including:
> >   - Current infoset
> >   - Current bit position
> >   - Current parser
> >   - A UID of the current state
> >   - A UID of the parent state (in event of backtracking, multiple
> >   states will have the same parent).
> >- The machine-trace output stream also includes input-data nodes, which
> >include:
> >   - A region of (byte aligned) input data (hex encoded?)
> >   - The byte position of the start of the region
> >   - The relative location of the input data nodes is undefined, but
> >   given the complete trace, it should be possible to reconstruct the
> >   original dataset by stitching together the data in the nodes. The goal
> >   here is that we can just output data as Daffodil reads it in.
> >- --machine-trace takes an optional argument of a file path. If
> >provided, its output will be directed to a file at said location
> >(otherwise stderr)
> >- Create a GUI program that can ingest the machine trace and display
> >it in a 4-pane view:
> >   - Pane 1 - trace navigator
> >  - Represents the entire parse in a tree format. Each parse state
> >  corresponds to a node on the tree
> >  - The user can select a node on the tree to load it into the
> >  other panes
> >   - Pane 2 - Infoset viewer
> >  - Shows the complete infoset corresponding to the node selected
> >  in pane 1.
> >  - Indicates the "current" position of the parser within the
> >  infoset
> >  - Highlights changes in the infoset relative to the parent node.
> >   - Pane 3 - binary viewer
> >  - Shows a complete hex-dump of the input data, indicating
> >  current location within the parse
> >   - Pane 4 -- parser view
> >  - Shows the current parser. Probably will be a dumping ground
> >  for additional information that does not warrant the addition of a
> >  new pane.
> >
> >
> > Phase 2:

Re: Possible road map for GUI debugger

2021-02-15 Thread John Wass
Is the gui envisioned to be an ide plugin or a standalone application (ie
swing, fx, ...)?



On Sun, Jan 24, 2021 at 2:08 PM Sloane, Brandon 
wrote:

> Following up on the recent discussion about the Daffodil debugger, I would
> like to propose a roadmap for creating a graphical debugger.
> Organizationally, the GUI debugger is a separate application (maintained
> in the normal Daffodil repository) that may include any of the current
> Daffodil libraries as a dependency.
>
> Phases 1 and 2 focus on developing an enhanced trace functionality; where
> Daffodil outputs a machine readable trace of the parse/unparse process. The
> GUI then allows a user to view this trace offline.
>
> Phase 3 adds support for interactive debugging, by allowing the
> Daffodil-cli and Daffodil-gui-debugger to act in a client/server
> relationship
>
> Phase 4 focuses on backend debugger enhancements.
>
> In more detail:
>
> Phase 1:
>
>- Add a --machine-trace flag to Daffodil CLI, that is analogous to the
>current --trace option. Output is a stream of XML parse-state nodes, each
>one describing the current state of the parse, including:
>   - Current infoset
>   - Current bit position
>   - Current parser
>   - A UID of the current state
>   - A UID of the parent state (in event of backtracking, multiple
>   states will have the same parent).
>- The machine-trace output stream also includes input-data nodes, which
>include:
>   - A region of (byte aligned) input data (hex encoded?)
>   - The byte position of the start of the region
>   - The relative location of the input data nodes is undefined, but
>   given the complete trace, it should be possible to reconstruct the 
> original
>   dataset by stitching together the data in the nodes. The goal here is 
> that
>   we can just output data as Daffodil reads it in.
>    - --machine-trace takes an optional argument of a file path. If
>    provided, its output will be directed to a file at said location (otherwise
>    stderr)
>- Create a GUI program that can ingest the machine trace and display
>it in a 4-pane view:
>   - Pane 1 - trace navigator
>  - Represents the entire parse in a tree format. Each parse state
>  corresponds to a node on the tree
>  - The user can select a node on the tree to load it into the
>  other panes
>   - Pane 2 - Infoset viewer
>  - Shows the complete infoset corresponding to the node selected
>  in pane 1.
>  - Indicates the "current" position of the parser within the
>  infoset
>  - Highlights changes in the infoset relative to the parent node.
>   - Pane 3 - binary viewer
>  - Shows a complete hex-dump of the input data, indicating
>  current location within the parse
>   - Pane 4 -- parser view
>  - Shows the current parser. Probably will be a dumping ground
>  for additional information that does not warrant the addition of a 
> new pane.
>
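To make the parse-state format concrete, one possible shape for these XML nodes is sketched below; every element and attribute name here is hypothetical, since the proposal does not fix a schema:

```xml
<!-- Hypothetical parse-state node; all names are illustrative, not a fixed format.
     uid/parentUid carry the state identity and backtracking lineage. -->
<parseState uid="42" parentUid="41">
  <bitPosition>1024</bitPosition>
  <parser>DelimitedParser</parser>
  <infoset><!-- snapshot of the current infoset --></infoset>
</parseState>

<!-- Hypothetical input-data node: a byte-aligned region plus its start offset,
     emitted as Daffodil reads the data in -->
<inputData byteStart="128">DEADBEEF0102</inputData>
```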
>
> Phase 2:
>
>- Support showing suspensions on unparse
>- Display additional data about the parse/unparse state:
>   - Delimiter stack
>   - Variables
>- Support additional formats for displaying input data
>   - Text (of selectable encoding)
>   - Binary (not hex; actual 1s and 0s)
>- Support generating traces through the API as well.
>
> Phase 2.5:
>
> Gather user feedback and iterate on the trace viewer functionality
>
> Phase 3:
>
>- Define protocol for the debugger to talk with the CLI.
>- Debugger can load schema, and input data.
>- Debugger can step through the parse/unparse
>- Viewer built in phases 1 and 2 continually updates as the parse
>proceeds
>- User can send CLI debugger commands and get the textual response;
>but only for the current parse/unparse state.
>- User can set breakpoints (possibly using textual commands). Debugger
>supports continue until breakpoint
>
> Phase 3.5:
>
>
>- Not debugger related, but as long as the Daffodil CLI can operate as
>a server, support server mode operations for non-debug use cases.
>
>
> Phase 4:
>
>- Support time travel debugging, with an interface in both the CLI and
>GUI debuggers
>- Support associating a trace with schema source, and allowing a "go
>to source" functionality where the user can navigate directly to the
>location in source corresponding to the current parser
>- Support dynamically editing/reloading schema and input data
>
> Thoughts?
> Brandon Sloane | Engineer
>
> bslo...@owlcyberdefense.com
>

Re: [VOTE] Contributors - Graduate Apache Daffodil (Incubating) to a top-level project

2021-01-29 Thread John Wass
+1

On Fri, Jan 29, 2021 at 10:35 AM Thompson, Dave <
dthomp...@owlcyberdefense.com> wrote:

> +1
>
> -Original Message-
> From: Mike Beckerle
> Sent: Thursday, January 28, 2021 6:09 PM
> To: dev@daffodil.apache.org
> Subject: [VOTE] Contributors - Graduate Apache Daffodil (Incubating) to a
> top-level project
>
> One of the first steps in graduating from the Apache incubator to top
> level is making sure we have consensus among our contributors that this
> makes sense now. This is an initial first step.
>
> Please reply with your vote (+1, 0, or -1 with reasons)
>
> This vote will be open for at least 72 hours (non-weekend), so until at
> least 5pm US.ET (UTC-5) on Tuesday, Feb 2.
>
> You can review our wiki page about how we meet the ASF project maturity
> model which is something projects do to self-assess before moving forward.
> Your comments on this wiki page are also welcome.
>
>
> https://cwiki.apache.org/confluence/display/DAFFODIL/Apache+Daffodil+Maturity+Model+Assesment
>
> About voting: see The Apache Voting Process:
> https://www.apache.org/foundation/voting.html if you'd like to review the
> process/guidance.
>
> My vote +1.
>


Re: Embedded Schematron progress

2021-01-18 Thread John Wass
Implemented that and added a test for it in
section02.schema_definition_errors.TestSDE.  The CLI tests are in there as
well.


On Mon, Jan 18, 2021 at 9:57 AM John Wass  wrote:

> > I think this warning (or something like it) is actually somewhat useful
> and we shouldn't get rid of it completely.
>
> Concur.  Warn just seems a bit high of a "log level".
>
> > Seems like this should be good practice in general, but maybe we don't
> want to require it?
>
> So far it has only been a problem with embedded schematron.  Was trying to
> think of a way to only eliminate the warn in that case...
>
> > Change the logic to only output this warning if there is no source
> attribute AND the appinfo contains any elements with the dfdl namespace.
>
> I like this.  Will incorporate that to get rid of my local modifications
> and get tests passing in CI and we can see any other/better suggestions
> come about.
>
>
>
>
> On Mon, Jan 18, 2021 at 9:37 AM Steve Lawrence 
> wrote:
>
>> I think this warning (or something like it) is actually somewhat useful
>> and we shouldn't get rid of it completely. If someone tries to add some
>> DFDL annotations but forgets the source in the appinfo, then removing
>> this would just cause Daffodil to silently ignore the block, which could
>> be confusing. Some potential solutions:
>>
>> 1) We recommend that a source attribute be used for all appinfo elements
>> in DFDL schemas, even if not necessary. Seems like this should be good
>> practice in general, but maybe we don't want to require it?
>>
>> 2) Change the logic to only output this warning if there is no source
>> attribute AND the appinfo contains any elements with the dfdl namespace.
>> This way, if someone does accidentally leave off the source attribute
>> for DFDL annotations, they'll get a helpful warning. We might even want
>> to go a step further and give a warning if an appinfo element contains
>> elements in the dfdl namespace but that does not have the dfdl source
>> attribute. This catches both the case where the source is left off, but
>> also catches the case where someone just typos the source. This way we
>> still allow appinfo's without sources, but we only silently ignore them
>> if they do not contain any dfdl elements.
>>
>>
>> On 1/18/21 7:08 AM, John Wass wrote:
>> > Writing some CLI tests showed a diagnostic warning has gone undetected
>> in
>> > the embedded schematron parse.  Using xs:appinfo to add schematron hits
>> > [AnnotatedSchemaComponent#L403](
>> >
>> https://github.com/apache/incubator-daffodil/blob/master/daffodil-core/src/main/scala/org/apache/daffodil/dsom/AnnotatedSchemaComponent.scala#L403
>> > ).
>> >
>> > `this.SDW(WarnID.AppinfoNoSource, """xs:appinfo without source
>> attribute.
>> > Is source="http://www.ogf.org/dfdl/"; missing?""")`
>> >
>> > Thoughts on silencing or lessening the significance of this (e.g., log
>> instead
>> > of diagnostic)?
>> >
>> >
>> > On Wed, Jan 6, 2021 at 9:19 AM John Wass  wrote:
>> >
>> >> The schema and tests for BMP/GIF/JPEG were moved into branches on those
>> >> DFDLSchemas repos.  After this PR is merged and the next release is
>> >> published those tests could be added to each of those repos.  I suppose
>> the
>> >> embedded schematron schema could be merged any time without the tests.
>> Those
>> >> repos would be a good context to continue and resolve the "best
>> practices
>> >> in the schematron" discussions.
>> >>
>> >> On Tue, Dec 22, 2020 at 9:53 AM John Wass  wrote:
>> >>
>> >>>> The second one is similar to examples in the GIF schema
>> >>> <
>> https://github.com/DFDLSchemas/GIF/blob/master/src/main/resources/com/mitre/gif/sch/GIF.sch#L74-L76
>> >.
>> >>> That schema can be added in the PR unit tests, to go along with the
>> BMP and
>> >>> JPEG.
>> >>>
>> >>> Added the gif schema to the tests, looking good.  Specifically looked
>> at
>> >>> rule `count(/GIF/Global_Color_Table/RGB) eq math:pow(2,
>> >>> ../number(Size_of_Global_Color_Table) + 1)`.
>> >>>
>> >>> Working on embedding the bmp schema now as the final integration test.
>> >>>
>> >>>
>> >>> On Mon, Dec 21, 2020 at 7:49 AM John Wass  wrote:
>> >>>
>> >>>>

Re: Embedded Schematron progress

2021-01-18 Thread John Wass
> I think this warning (or something like it) is actually somewhat useful
and we shouldn't get rid of it completely.

Concur.  Warn just seems a bit high of a "log level".

> Seems like this should be good practice in general, but maybe we don't
want to require it?

So far it has only been a problem with embedded schematron.  Was trying to
think of a way to only eliminate the warn in that case...

> Change the logic to only output this warning if there is no source
attribute AND the appinfo contains any elements with the dfdl namespace.

I like this.  Will incorporate that to get rid of my local modifications
and get tests passing in CI and we can see any other/better suggestions
come about.
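For discussion's sake, here is a quick Python sketch of that heuristic; the namespace URIs come from the DFDL and ISO Schematron specs, but the function itself is only an illustration of the proposed rule, not Daffodil's actual implementation:

```python
import xml.etree.ElementTree as ET

# Namespace of DFDL annotation elements (per the DFDL spec); note the
# xs:appinfo source attribute itself uses "http://www.ogf.org/dfdl/".
DFDL_NS = "http://www.ogf.org/dfdl/dfdl-1.0/"

def should_warn(appinfo: ET.Element) -> bool:
    """Proposed heuristic: warn only when the appinfo has no source
    attribute AND contains elements in the DFDL namespace."""
    has_source = "source" in appinfo.attrib
    has_dfdl_child = any(
        child.tag.startswith("{" + DFDL_NS + "}")
        for child in appinfo.iter()
        if child is not appinfo
    )
    return not has_source and has_dfdl_child

# Embedded Schematron without a source attribute: silently allowed.
schematron = ET.fromstring(
    '<appinfo><pattern xmlns="http://purl.oclc.org/dsdl/schematron"/></appinfo>')
# DFDL annotations without a source attribute: still warned about.
dfdl = ET.fromstring(
    '<appinfo><format xmlns="http://www.ogf.org/dfdl/dfdl-1.0/"/></appinfo>')

print(should_warn(schematron))  # False
print(should_warn(dfdl))        # True
```

Under this sketch the embedded-schematron case stops tripping the warning while a forgotten DFDL source attribute still does.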




On Mon, Jan 18, 2021 at 9:37 AM Steve Lawrence  wrote:

> I think this warning (or something like it) is actually somewhat useful
> and we shouldn't get rid of it completely. If someone tries to add some
> DFDL annotations but forgets the source in the appinfo, then removing
> this would just cause Daffodil to silently ignore the block, which could
> be confusing. Some potential solutions:
>
> 1) We recommend that a source attribute be used for all appinfo elements
> in DFDL schemas, even if not necessary. Seems like this should be good
> practice in general, but maybe we don't want to require it?
>
> 2) Change the logic to only output this warning if there is no source
> attribute AND the appinfo contains any elements with the dfdl namespace.
> This way, if someone does accidentally leave off the source attribute
> for DFDL annotations, they'll get a helpful warning. We might even want
> to go a step further and give a warning if an appinfo element contains
> elements in the dfdl namespace but that does not have the dfdl source
> attribute. This catches both the case where the source is left off, but
> also catches the case where someone just typos the source. This way we
> still allow appinfo's without sources, but we only silently ignore them
> if they do not contain any dfdl elements.
>
>
> On 1/18/21 7:08 AM, John Wass wrote:
> > Writing some CLI tests showed a diagnostic warning has gone undetected in
> > the embedded schematron parse.  Using xs:appinfo to add schematron hits
> > [AnnotatedSchemaComponent#L403](
> >
> https://github.com/apache/incubator-daffodil/blob/master/daffodil-core/src/main/scala/org/apache/daffodil/dsom/AnnotatedSchemaComponent.scala#L403
> > ).
> >
> > `this.SDW(WarnID.AppinfoNoSource, """xs:appinfo without source attribute.
> > Is source="http://www.ogf.org/dfdl/"; missing?""")`
> >
> > Thoughts on silencing or lessening the significance of this (e.g., log instead
> > of diagnostic)?
> >
> >
> > On Wed, Jan 6, 2021 at 9:19 AM John Wass  wrote:
> >
> >> The schema and tests for BMP/GIF/JPEG were moved into branches on those
> >> DFDLSchemas repos.  After this PR is merged and the next release is
> >> published those tests could be added to each of those repos.  I suppose
> the
> >> embedded schematron schema could be merged any time without the tests.
> Those
> >> repos would be a good context to continue and resolve the "best
> practices
> >> in the schematron" discussions.
> >>
> >> On Tue, Dec 22, 2020 at 9:53 AM John Wass  wrote:
> >>
> >>>> The second one is similar to examples in the GIF schema
> >>> <
> https://github.com/DFDLSchemas/GIF/blob/master/src/main/resources/com/mitre/gif/sch/GIF.sch#L74-L76
> >.
> >>> That schema can be added in the PR unit tests, to go along with the
> BMP and
> >>> JPEG.
> >>>
> >>> Added the gif schema to the tests, looking good.  Specifically looked
> at
> >>> rule `count(/GIF/Global_Color_Table/RGB) eq math:pow(2,
> >>> ../number(Size_of_Global_Color_Table) + 1)`.
> >>>
> >>> Working on embedding the bmp schema now as the final integration test.
> >>>
> >>>
> >>> On Mon, Dec 21, 2020 at 7:49 AM John Wass  wrote:
> >>>
> >>>>> Does the process create SVRL files when it completes?
> >>>>
> >>>> No, the svrl is consumed and converted into Daffodil diagnostics.
> >>>>
> >>>>
> >>>>>  Is there a commandline option to direct the SVRL file to a specific
> >>>> path and name?
> >>>>
> >>>> It doesn't, but is a good idea and certainly could.  Passing a flag
> >>>> through the validator config could trigger writing the file.
> >>>>
> >>>> Pro

Re: Embedded Schematron progress

2021-01-18 Thread John Wass
Writing some CLI tests showed a diagnostic warning has gone undetected in
the embedded schematron parse.  Using xs:appinfo to add schematron hits
[AnnotatedSchemaComponent#L403](
https://github.com/apache/incubator-daffodil/blob/master/daffodil-core/src/main/scala/org/apache/daffodil/dsom/AnnotatedSchemaComponent.scala#L403
).

`this.SDW(WarnID.AppinfoNoSource, """xs:appinfo without source attribute.
Is source="http://www.ogf.org/dfdl/"; missing?""")`

Thoughts on silencing or lessening the significance of this (e.g., log instead
of diagnostic)?
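For reference, the two cases side by side — a sketch only; the namespace URIs are from the DFDL and ISO Schematron specs, and the element layout is illustrative:

```xml
<xs:annotation xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- DFDL annotations carry the source attribute the warning asks about -->
  <xs:appinfo source="http://www.ogf.org/dfdl/">
    <dfdl:format xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"/>
  </xs:appinfo>
  <!-- Embedded Schematron has no DFDL source attribute, which currently
       trips WarnID.AppinfoNoSource -->
  <xs:appinfo>
    <sch:pattern xmlns:sch="http://purl.oclc.org/dsdl/schematron"/>
  </xs:appinfo>
</xs:annotation>
```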


On Wed, Jan 6, 2021 at 9:19 AM John Wass  wrote:

> The schema and tests for BMP/GIF/JPEG were moved into branches on those
> DFDLSchemas repos.  After this PR is merged and the next release is
> published those tests could be added to each of those repos.  I suppose the
> embedded schematron schema could be merged any time without the tests.  Those
> repos would be a good context to continue and resolve the "best practices
> in the schematron" discussions.
>
> On Tue, Dec 22, 2020 at 9:53 AM John Wass  wrote:
>
>> > The second one is similar to examples in the GIF schema
>> <https://github.com/DFDLSchemas/GIF/blob/master/src/main/resources/com/mitre/gif/sch/GIF.sch#L74-L76>.
>> That schema can be added in the PR unit tests, to go along with the BMP and
>> JPEG.
>>
>> Added the gif schema to the tests, looking good.  Specifically looked at
>> rule `count(/GIF/Global_Color_Table/RGB) eq math:pow(2,
>> ../number(Size_of_Global_Color_Table) + 1)`.
>>
>> Working on embedding the bmp schema now as the final integration test.
>>
>>
>> On Mon, Dec 21, 2020 at 7:49 AM John Wass  wrote:
>>
>>> > Does the process create SVRL files when it completes?
>>>
>>> No, the svrl is consumed and converted into Daffodil diagnostics.
>>>
>>>
>>> >  Is there a commandline option to direct the SVRL file to a specific
>>> path and name?
>>>
>>> It doesn't, but is a good idea and certainly could.  Passing a flag
>>> through the validator config could trigger writing the file.
>>>
>>> Probably be in a follow up PR.
>>>
>>>
>>> > I'm curious of those type of tests will work with this process.
>>>
>>> They should.  The first can be checked in a unit test that matches a
>>> byte.  The second one is similar to examples in the GIF schema
>>> <https://github.com/DFDLSchemas/GIF/blob/master/src/main/resources/com/mitre/gif/sch/GIF.sch#L74-L76>.
>>> That schema can be added in the PR unit tests, to go along with the BMP and
>>> JPEG.
>>>
>>>
>>>
>>>
>>> On Fri, Dec 18, 2020 at 2:43 PM Rege Nteligen 
>>> wrote:
>>>
>>>> I took a look at the sample xsd's with the embedded schematron
>>>> asserts.  It looks good.  Does the process create SVRL files when it
>>>> completes?  Is there a commandline option to direct the SVRL file to a
>>>> specific path and name?
>>>>
>>>> I was recently working with a modified daffodil GIF schema and
>>>> schematron to report various findings with GIF files.  Several tests
>>>> involved testing that keywords were not in hex blob fields.  I'm curious if
>>>> those types of tests will work with this process.  This is a sample assert:
>>>>  
>>>> GIF: FAIL: LSD_Blob: AFTER-HDR-REF-SQL: Possible
>>>> malicious SQL reference between segments
>>>> 
>>>>
>>>> I've also done tests to see if the count of bytes in one field matched
>>>> the size of the field value from another field:
>>>> 
>>>> GIF: RED: LSG_GCL: GCL-RGB-CNT: There must be
>>>> Size_of_Global_Color_Table RGB values.
>>>> 
>>>>
>>>>
>>>>
>>>> On 2020/12/18 17:21:02, John Wass  wrote:
>>>> > The Embedded Schematron PR is moving along, hoping to get it out of
>>>> WIP
>>>> > soon.  https://github.com/apache/incubator-daffodil/pull/463
>>>> >
>>>> > The JPEG and BMP schema repos are being used for testing now, and the
>>>> PNG
>>>> > looks like it would provide some great coverage.. maybe too great :/
>>>> Any
>>>> > other noteworthy sources of sch+data that might be beneficial to test
>>>> with?
>>>> >
>>>> > Observations on embedding
>>>> > - Behavior has been predictable, and errors have been clear
>>>> > - There are multiple placement options for schematron rules in a
>>>> schema
>>>> > - The Validator API has held up well, but might be one issue to come
>>>> out of
>>>> > this effort
>>>> >
>>>> > Examples at
>>>> >
>>>> https://github.com/jw3/incubator-daffodil/tree/validator_spi/embedded_schematron/daffodil-schematron/src/test/resources/xsd
>>>> >
>>>>
>>>


Re: rfc: sbt-tdml

2021-01-12 Thread John Wass
> Nice idea, it will be good to see how far it can reduce the size of the
> Scala test boilerplate code!  Oh, it occurs to me that sometimes it's
> useful to debug a TDML test case directly from the IDE by running a Scala
> test method under the debugger.  How often do developers need to debug
> existing or new TDML test cases, which would require them to write
> throw-away Scala test methods by hand whenever necessary?  Maybe we can
> code up some sort of universal Scala entry point we can run from the IDE
> which can run any desired TDML test case by editing (at most) one or two
> lines of code.
>
> Yeah, I think this is probably the most important functionality to
> have, and probably isn't viable if IDE support isn't there. How does
> this integrate with IDEs? Maybe it "just works" for IDEs that support
> sbt? What about if an IDE supports BSP since our current version of SBT
> supports BSP now? But what about IDEs like Eclipse? I'm not sure if any
> devs use Eclipse anymore since IntelliJ seems much better, but maybe
> something to consider? Can we claim we only support IDEs that support
> SBT/BSP?
>
> Also, SBT has the testOnly command to run a single testsuite/test. Is
> that supported with this? For example, currently we can run
>
>   sbt testOnly org.apache.daffodilTestSuite -- --tests=test_regex
>
> So it takes a test suite and an optional regex of tests in that suite to
> run. Having this similar functionality would be pretty critical.
> Presumably, it would be very similar to the above, but maybe just a path
> to the TDML file instead, e.g.
>
>   sbt testOnly org/apache/daffodil/.../TestSuite.tdml -- --tests=test_regex
>
> Using the path to the tdml file as the test suite name avoids issues
> with duplicate test suite names in TDML files. Also makes for easily
> copy/pasting a path when you want to test a tdml file.
>
>
> Some other thoughts:
>
> Does anything output when a test fails? Or is that still to be implemented?
>
>
> I'm also curious how this works regarding Daffodil versions. Presumably
> this depends on a certain Daffodil version? What happens if a repo has a
> different Daffodil dependency? Does this override that? Or does this
> just expect that Daffodil is already a dependency and uses that version?
> And hopefully our TDMLRunner API is stable enough that things just work?
> We definitely have cases where we need to run tests with different
> Daffodil versions, it would be nice if we didn't have to also change the
> plugin version to match.
>
>
> We also need to consider how this gets published. Do you plan to
> maintain and publish this separately as an outside plugin that Daffodil
> just depends on and uses? Or do you plan to have it merged into a
> Daffodil repo? If so, I wonder if it lives in the Daffodil repo so
> Daffodil always has the latest, and then this is published when there
> are Daffodil releases? Or maybe it's a separate Apache repo that follows
> the same Apache release process/requirements, but can be on a different
> schedule than Daffodil?
>
>
> If all the above issue can potentially be resolved, I think there's a
> good chance this could replace all of our boilerplate, which I'm in big
> favor of.
>
>
> > -Original Message-
> > From: John Wass 
> > Sent: Friday, January 8, 2021 4:27 PM
> > To: dev@daffodil.apache.org
> > Subject: EXT: rfc: sbt-tdml
> >
> > https://github.com/jw3/sbt-tdml
> > I needed to get smarter on SBT plugins for a task and tinkered with
> something in Daffodil to do it.
> >
> > The goal here was to remove the TDML test boilerplate. I threw this at
> the daffodil-test project with surprisingly sane results on the first try.
> See the bottom of the readme.
> >
> > The learning process is pretty much complete, so not sure where this
> goes, but wanted to see if there are any comments.
> >
>
>


Re: The future of the daffodil DFDL schema debugger?

2021-01-08 Thread John Wass
What other features could find a nice home in an IDE integration?  Having a
single convenient entry point (the IDE) for such things would be nice, IMO.

Things like...

- Rich set of actions for TDML
  - Run a single test from a TDML file
  - Debug/Run TDML
- Run/Debug a data file with a schema from the project
  - i.e., right-click on a JPG and get a context menu for Run with Daffodil ->
pick from list of dfdl.xsd
...



On Fri, Jan 8, 2021 at 2:47 PM Beckerle, Mike 
wrote:

> Use cases or quasi-requirements. This is my summary so far.
>
> 1) capture a human-readable trace of parse/unparse information to a single
> text file (might be same as 2 if machine-readable is sufficiently human
> readable)
>
> 2) capture a machine-readable trace of parse/unparse information to a
> single text file (might be same as 1 if human readable form is also machine
> readable)
>
> 3) interactive debug from a command line - each display of information is
> requested by a specific command (1 and 2 above might be using this with a
> specific canned set of commands auto-issued to display various information,
> and capturing all to an output stream)
>
> 4) interactive debug with multi-panel display where displays are
> updated/animated automatically as debug context changes. (This is intended
> to mean more than just opening all the schema files in different editor
> windows - more than just gdb-style debug under Emacs.)
>
> 5) interactive debug time-machine - ability to backup to prior
> parser/unparser states, move forward again, or just backup and re-check
> something, but then jump forward to proceed from where one left off.
>
> 6) Non Use Case: IDE for DFDL with rich semantic model (akin to the DSOM
> object model) of the schema.
> This is here just to point out that it's really out of scope. There are
> many questions about the schema (e.g., "can I add this property to this
> element?") that are not? required for the debugger. A full and powerful IDE
> is great, but that's really entirely different than our goals for debugging
> that we're trying to discuss here.
>
>
> 
> From: Sloane, Brandon 
> Sent: Thursday, January 7, 2021 1:25 PM
> To: dev@daffodil.apache.org 
> Subject: Re: The future of the daffodil DFDL schema debugger?
>
> We could also create a new flag for --trace that would format the trace
> output in a more machine readable manner. This should let us accomplish
> Larry's goals, and most of mine, with relatively little effort within
> Daffodil (but still all the effort on the GUI side), and would allow for
> off-site analysis in cases where it is not practical to attach a debugger
> while Daffodil is running.
> 
> From: Sloane, Brandon 
> Sent: Thursday, January 7, 2021 1:21 PM
> To: dev@daffodil.apache.org 
> Subject: Re: The future of the daffodil DFDL schema debugger?
>
> I've been thinking about a tool along similar lines (although more
> integrated with Daffodil than post-processing the trace output).
>
> One thing to keep in mind is that, although the trace output is presented
> as a linear log (since we do not have much choice), the actual process is
> more of a tree, due to backtracking.
>
> Ideally, we would have a multi-pane window showing:
>
>
>   *   The hex/binary data
>   *   The infoset
>   *   A time-axis parse tree, with a "major" node at every point of
> uncertainty and parse error, and "minor" nodes at every parse step
>   *   A view of the DFDL schema
>   *   An interactive terminal debugger (e.g. what we currently have)
>   *   Breakpoints/variables/delimiter-stack/etc
>
> Within these panes, you ought to be able to select a given region/element,
> and highlight all the corresponding elements in the other panes.
>
> I think that exporting the necessary information from Daffodil to
> implement all of this would be relatively straightforward. The only
> potentially problematic parts I see are:
>
>   *   The interactive debugger would require some form of time-travel to
> implement (I think most of the work for this is done to support backtracking)
>   *   The memory requirements when used on large infosets
>
> 
> From: Larry Barber 
> Sent: Thursday, January 7, 2021 1:08 PM
> To: dev@daffodil.apache.org 
> Subject: RE: The future of the daffodil DFDL schema debugger?
>
> When I was doing strange and unusual things with DFDL and generating a lot
> of errors, I envisioned how helpful it would be to have a tool that would
> post-process the --trace output and use it to display a dual pane window
> (like the editor referenced below) with the schema on one side and hex
> version on the other, with a slider that would allow me to flow through the
> parsing action and see pointers as to where the parser was in both the
> schema and input files. In other words just convert the information from
> the --trace into a more useful graphical display.
> Perhaps breakpoint-like markers could be added to both files to quickl

rfc: sbt-tdml

2021-01-08 Thread John Wass
https://github.com/jw3/sbt-tdml
I needed to get smarter on SBT plugins for a task and tinkered with
something in Daffodil to do it.

The goal here was to remove the TDML test boilerplate. I threw this at the
daffodil-test project with surprisingly sane results on the first try.  See
the bottom of the readme.

The learning process is pretty much complete, so not sure where this goes,
but wanted to see if there are any comments.


Re: Embedded Schematron progress

2021-01-06 Thread John Wass
The schema and tests for BMP/GIF/JPEG were moved into branches on those
DFDLSchemas repos.  After this PR is merged and the next release is
published those tests could be added to each of those repos.  I suppose the
embedded schematron schema could be merged any time without the tests.  Those
repos would be a good context to continue and resolve the "best practices
in the schematron" discussions.

On Tue, Dec 22, 2020 at 9:53 AM John Wass  wrote:

> > The second one is similar to examples in the GIF schema
> <https://github.com/DFDLSchemas/GIF/blob/master/src/main/resources/com/mitre/gif/sch/GIF.sch#L74-L76>.
> That schema can be added in the PR unit tests, to go along with the BMP and
> JPEG.
>
> Added the gif schema to the tests, looking good.  Specifically looked at
> rule `count(/GIF/Global_Color_Table/RGB) eq math:pow(2,
> ../number(Size_of_Global_Color_Table) + 1)`.
>
> Working on embedding the bmp schema now as the final integration test.
>
>
> On Mon, Dec 21, 2020 at 7:49 AM John Wass  wrote:
>
>> > Does the process create SVRL files when it completes?
>>
>> No, the svrl is consumed and converted into Daffodil diagnostics.
>>
>>
>> >  Is there a commandline option to direct the SVRL file to a specific
>> path and name?
>>
>> It doesn't, but is a good idea and certainly could.  Passing a flag
>> through the validator config could trigger writing the file.
>>
>> Probably be in a follow up PR.
>>
>>
>> > I'm curious of those type of tests will work with this process.
>>
>> They should.  The first can be checked in a unit test that matches a
>> byte.  The second one is similar to examples in the GIF schema
>> <https://github.com/DFDLSchemas/GIF/blob/master/src/main/resources/com/mitre/gif/sch/GIF.sch#L74-L76>.
>> That schema can be added in the PR unit tests, to go along with the BMP and
>> JPEG.
>>
>>
>>
>>
>> On Fri, Dec 18, 2020 at 2:43 PM Rege Nteligen 
>> wrote:
>>
>>> I took a look at the sample xsd's with the embedded schematron asserts.
>>> It looks good.  Does the process create SVRL files when it completes?  Is
>>> there a commandline option to direct the SVRL file to a specific path and
>>> name?
>>>
>>> I was recently working with a modified daffodil GIF schema and
>>> schematron to report various findings with GIF files.  Several tests
>>> involved testing that keywords were not in hex blob fields.  I'm curious if
>>> those types of tests will work with this process.  This is a sample assert:
>>>  
>>> GIF: FAIL: LSD_Blob: AFTER-HDR-REF-SQL: Possible
>>> malicious SQL reference between segments
>>> 
>>>
>>> I've also done tests to see if the count of bytes in one field matched
>>> the size of the field value from another field:
>>> 
>>> GIF: RED: LSG_GCL: GCL-RGB-CNT: There must be
>>> Size_of_Global_Color_Table RGB values.
>>> 
>>>
>>>
>>>
>>> On 2020/12/18 17:21:02, John Wass  wrote:
>>> > The Embedded Schematron PR is moving along, hoping to get it out of WIP
>>> > soon.  https://github.com/apache/incubator-daffodil/pull/463
>>> >
>>> > The JPEG and BMP schema repos are being used for testing now, and the
>>> PNG
>>> > looks like it would provide some great coverage.. maybe too great :/
>>> Any
>>> > other noteworthy sources of sch+data that might be beneficial to test
>>> with?
>>> >
>>> > Observations on embedding
>>> > - Behavior has been predictable, and errors have been clear
>>> > - There are multiple placement options for schematron rules in a schema
>>> > - The Validator API has held up well, but might be one issue to come
>>> out of
>>> > this effort
>>> >
>>> > Examples at
>>> >
>>> https://github.com/jw3/incubator-daffodil/tree/validator_spi/embedded_schematron/daffodil-schematron/src/test/resources/xsd
>>> >
>>>
>>


Re: Embedded Schematron progress

2020-12-22 Thread John Wass
> The second one is similar to examples in the GIF schema
<https://github.com/DFDLSchemas/GIF/blob/master/src/main/resources/com/mitre/gif/sch/GIF.sch#L74-L76>.
That schema can be added in the PR unit tests, to go along with the BMP and
JPEG.

Added the gif schema to the tests, looking good.  Specifically looked at
rule `count(/GIF/Global_Color_Table/RGB) eq math:pow(2,
../number(Size_of_Global_Color_Table) + 1)`.

Working on embedding the bmp schema now as the final integration test.
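For anyone following along, the quoted rule sits inside Schematron boilerplate roughly like the following; the rule context and queryBinding are assumptions based on the linked GIF.sch, not copied from it:

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:math="http://www.w3.org/2005/xpath-functions/math"
            queryBinding="xslt2">
  <sch:pattern>
    <!-- Assumed context; "eq" and math:pow need an XPath 2.0+ query binding -->
    <sch:rule context="/GIF/Global_Color_Table">
      <sch:assert test="count(RGB) eq math:pow(2, number(../Size_of_Global_Color_Table) + 1)">
        There must be 2^(Size_of_Global_Color_Table + 1) RGB values.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```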


On Mon, Dec 21, 2020 at 7:49 AM John Wass  wrote:

> > Does the process create SVRL files when it completes?
>
> No, the svrl is consumed and converted into Daffodil diagnostics.
>
>
> >  Is there a commandline option to direct the SVRL file to a specific
> path and name?
>
> It doesn't, but is a good idea and certainly could.  Passing a flag
> through the validator config could trigger writing the file.
>
> Probably be in a follow up PR.
>
>
> > I'm curious of those type of tests will work with this process.
>
> They should.  The first can be checked in a unit test that matches a
> byte.  The second one is similar to examples in the GIF schema
> <https://github.com/DFDLSchemas/GIF/blob/master/src/main/resources/com/mitre/gif/sch/GIF.sch#L74-L76>.
> That schema can be added in the PR unit tests, to go along with the BMP and
> JPEG.
>
>
>
>
> On Fri, Dec 18, 2020 at 2:43 PM Rege Nteligen 
> wrote:
>
>> I took a look at the sample xsd's with the embedded schematron asserts.
>> It looks good.  Does the process create SVRL files when it completes?  Is
>> there a commandline option to direct the SVRL file to a specific path and
>> name?
>>
>> I was recently working with a modified daffodil GIF schema and schematron
>> to report various findings with GIF files.  Several tests involved testing
>> that keywords were not in HEX blob fields.  I'm curious whether those types of
>> tests will work with this process.  This is a sample assert:
>>  
>> GIF: FAIL: LSD_Blob: AFTER-HDR-REF-SQL: Possible
>> malicious SQL reference between segments
>> 
>>
>> I've also done tests to see if the count of bytes in one field matched the
>> size of the field value from another field:
>> 
>> GIF: RED: LSG_GCL: GCL-RGB-CNT: There must be
>> Size_of_Global_Color_Table RGB values.
>> 
>>
>>
>>
>> On 2020/12/18 17:21:02, John Wass  wrote:
>> > The Embedded Schematron PR is moving along, hoping to get it out of WIP
>> > soon.  https://github.com/apache/incubator-daffodil/pull/463
>> >
>> > The JPEG and BMP schema repos are being used for testing now, and the
>> PNG
>> > looks like it would provide some great coverage.. maybe too great :/
>> Any
>> > other noteworthy sources of sch+data that might be beneficial to test
>> with?
>> >
>> > Observations on embedding
>> > - Behavior has been predictable, and errors have been clear
>> > - There are multiple placement options for schematron rules in a schema
>> > - The Validator API has held up well, but might be one issue to come
>> out of
>> > this effort
>> >
>> > Examples at
>> >
>> https://github.com/jw3/incubator-daffodil/tree/validator_spi/embedded_schematron/daffodil-schematron/src/test/resources/xsd
>> >
>>
>


Re: Daffodil schema file extension

2020-12-21 Thread John Wass
> The advantage of a daf:dfdl element would be that it wouldn't matter
whether the tooling is smart.

So if the tooling were smart, there would be no need to consider this, and we
would just stick with xs:schema?

(smart: recognizes a .dfdl extension as DFDL _and_ knows the DFDL language,
actions, etc)


On Fri, Dec 18, 2020 at 12:45 PM Beckerle, Mike <
mbecke...@owlcyberdefense.com> wrote:

> Well, admittedly some of my observations may be dated, from when I used
> Eclipse. I agree IntelliJ does seem to do a generally better job on all forms of
> XML/XSD/DFDL.
>
> The advantage of a daf:dfdl element would be that it wouldn't matter
> whether the tooling is smart. Any XML-aware tool would treat these as XML
> documents, and use the schemas we provide to give full support for all
> aspects.
>
> If I was still using Eclipse I would probably have built daf:DFDL by now.
> With IntelliJ I've been able to avoid the need, I guess.
>
> 
> From: John Wass 
> Sent: Friday, December 18, 2020 12:37 PM
> To: dev@daffodil.apache.org 
> Subject: Re: Daffodil schema file extension
>
> > If you have edited tdml in an XML aware editor, you know that the support
> for embedded dfdl schemas is better than it is in xsd editors for a dfdl
> schema file.
>
Better in what way?  They looked pretty similar to me, in IntelliJ.
>
>
> On Fri, Dec 18, 2020 at 12:35 PM John Wass  wrote:
>
> > Thanks for the feedback.  It was in line with what I figured.  There is a
> > chance the ambiguity I am seeing between .dfdl.xsd and .xsd is self
> > inflicted, but just wanted to put this out there.
> >
> >
> >
> > On Fri, Dec 18, 2020 at 10:14 AM Beckerle, Mike <
> > mbecke...@owlcyberdefense.com> wrote:
> >
> >> If you have edited tdml in an XML aware editor, you know that the
> support
> >> for embedded dfdl schemas is better than it is in xsd editors for a dfdl
> >> schema file.
> >>
> >> For that reason I thought maybe we should use the ".dfdl" extension for
> a
> >> daffodil feature which replaces the xs:schema element of a dfdl schema
> with
> >> a daf:dfdl element. It would otherwise take all the same attributes as
> >> xs:schema, but by being our own outermost element the editor support
> would
> >> treat it more like tdml than xsd.
> >>
> >> This breaks a daf:dfdl schema from being exactly an XML schema, but the
> >> transform to get back is trivial. This might be worth it for the
> superior
> >> IDE support we would get with almost no effort.
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> Get Outlook for Android<https://aka.ms/ghei36>
> >> 
> >> From: Steve Lawrence 
> >> Sent: Friday, December 18, 2020 9:31:58 AM
> >> To: dev@daffodil.apache.org 
> >> Subject: Re: Daffodil schema file extension
> >>
> >> I think the main reason for the .xsd extension is so IDEs/editors
> >> recognize the file as a normal XML Schema file, and so you get all the
> >> benefits that come with that (e.g. autocompletion, syntax highlighting,
> >> error checking), since most tools aren't going to know about a DFDL
> >> schema, at least by default.
> >>
> >> The .dfdl.xsd extension is so that it makes it possible to configure
> >> IDEs/editors to know specifically about DFDL schemas (e.g. DFDL specific
> >> annotations/properties). But that requires a little IDE configuration,
> >> and not all IDEs/editors support this kind of thing. So even if you
> >> can't do that, you still at least get the XML Schema capabilities with
> >> the .xsd extension.
> >>
> >> As for changing it, that should be fine from a Daffodil perspective. It
> >> doesn't care at all about the extension--it is purely a convention to
> >> make authoring schemas easier.
> >>
> >> Though, one thing to keep in mind is that although these are "DFDL
> >> Schemas", they are still valid XML Schemas and can be used anywhere an
> >> XML Schema can be used. For example, it's not uncommon to parse a file
> >> with a DFDL schema and then use that exact same schema to validate the
> >> resulting infoset. It's possible some XML validation tools/systems
> >> expect XML validation schemas to end in .xsd, though I'm not aware of
>> any.
> >>
>> In general, I think the benefit of .dfdl.xsd is that things that only
> >> care about XML schemas can view t

Re: Embedded Schematron progress

2020-12-21 Thread John Wass
> Does the process create SVRL files when it completes?

No, the SVRL is consumed and converted into Daffodil diagnostics.
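As a rough sketch of what that conversion might look like (class and field names here are hypothetical, not the actual Daffodil implementation): each failed assert reported in the SVRL becomes one diagnostic.

```scala
// Hypothetical sketch of SVRL -> diagnostics conversion; names are
// illustrative, not the real Daffodil classes. Real code would first
// parse the svrl:failed-assert elements out of the XML report.
object SvrlToDiagnostics {
  final case class FailedAssert(location: String, text: String)
  final case class Diagnostic(message: String)

  def convert(failures: Seq[FailedAssert]): Seq[Diagnostic] =
    failures.map { f =>
      Diagnostic(s"Schematron assertion failed at ${f.location}: ${f.text}")
    }
}
```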


>  Is there a commandline option to direct the SVRL file to a specific path
and name?

It doesn't, but that is a good idea and it certainly could.  Passing a flag
through the validator config could trigger writing the file.

It would probably be in a follow-up PR.


> I'm curious whether those types of tests will work with this process.

They should.  The first can be checked in a unit test that matches a byte.
The second one is similar to examples in the GIF schema
<https://github.com/DFDLSchemas/GIF/blob/master/src/main/resources/com/mitre/gif/sch/GIF.sch#L74-L76>.
That schema can be added in the PR unit tests, to go along with the BMP and
JPEG.




On Fri, Dec 18, 2020 at 2:43 PM Rege Nteligen  wrote:

> I took a look at the sample xsd's with the embedded schematron asserts.
> It looks good.  Does the process create SVRL files when it completes?  Is
> there a commandline option to direct the SVRL file to a specific path and
> name?
>
> I was recently working with a modified daffodil GIF schema and schematron
> to report various findings with GIF files.  Several tests involved testing
> that keywords were not in HEX blob fields.  I'm curious whether those types of
> tests will work with this process.  This is a sample assert:
>  
> GIF: FAIL: LSD_Blob: AFTER-HDR-REF-SQL: Possible malicious
> SQL reference between segments
> 
>
> I've also done tests to see if the count of bytes in one field matched the
> size of the field value from another field:
> 
> GIF: RED: LSG_GCL: GCL-RGB-CNT: There must be
> Size_of_Global_Color_Table RGB values.
> 
>
>
>
> On 2020/12/18 17:21:02, John Wass  wrote:
> > The Embedded Schematron PR is moving along, hoping to get it out of WIP
> > soon.  https://github.com/apache/incubator-daffodil/pull/463
> >
> > The JPEG and BMP schema repos are being used for testing now, and the PNG
> > looks like it would provide some great coverage.. maybe too great :/  Any
> > other noteworthy sources of sch+data that might be beneficial to test
> with?
> >
> > Observations on embedding
> > - Behavior has been predictable, and errors have been clear
> > - There are multiple placement options for schematron rules in a schema
> > - The Validator API has held up well, but might be one issue to come out
> of
> > this effort
> >
> > Examples at
> >
> https://github.com/jw3/incubator-daffodil/tree/validator_spi/embedded_schematron/daffodil-schematron/src/test/resources/xsd
> >
>


Re: Daffodil schema file extension

2020-12-18 Thread John Wass
> If you have edited tdml in an XML aware editor, you know that the support
for embedded dfdl schemas is better than it is in xsd editors for a dfdl
schema file.

Better in what way?  They looked pretty similar to me, in IntelliJ.


On Fri, Dec 18, 2020 at 12:35 PM John Wass  wrote:

> Thanks for the feedback.  It was in line with what I figured.  There is a
> chance the ambiguity I am seeing between .dfdl.xsd and .xsd is self
> inflicted, but just wanted to put this out there.
>
>
>
> On Fri, Dec 18, 2020 at 10:14 AM Beckerle, Mike <
> mbecke...@owlcyberdefense.com> wrote:
>
>> If you have edited tdml in an XML aware editor, you know that the support
>> for embedded dfdl schemas is better than it is in xsd editors for a dfdl
>> schema file.
>>
>> For that reason I thought maybe we should use the ".dfdl" extension for a
>> daffodil feature which replaces the xs:schema element of a dfdl schema with
>> a daf:dfdl element. It would otherwise take all the same attributes as
>> xs:schema, but by being our own outermost element the editor support would
>> treat it more like tdml than xsd.
>>
>> This breaks a daf:dfdl schema from being exactly an XML schema, but the
>> transform to get back is trivial. This might be worth it for the superior
>> IDE support we would get with almost no effort.
>>
>>
>>
>>
>>
>>
>>
>>
>> 
>> From: Steve Lawrence 
>> Sent: Friday, December 18, 2020 9:31:58 AM
>> To: dev@daffodil.apache.org 
>> Subject: Re: Daffodil schema file extension
>>
>> I think the main reason for the .xsd extension is so IDEs/editors
>> recognize the file as a normal XML Schema file, and so you get all the
>> benefits that come with that (e.g. autocompletion, syntax highlighting,
>> error checking), since most tools aren't going to know about a DFDL
>> schema, at least by default.
>>
>> The .dfdl.xsd extension is so that it makes it possible to configure
>> IDEs/editors to know specifically about DFDL schemas (e.g. DFDL specific
>> annotations/properties). But that requires a little IDE configuration,
>> and not all IDEs/editors support this kind of thing. So even if you
>> can't do that, you still at least get the XML Schema capabilities with
>> the .xsd extension.
>>
>> As for changing it, that should be fine from a Daffodil perspective. It
>> doesn't care at all about the extension--it is purely a convention to
>> make authoring schemas easier.
>>
>> Though, one thing to keep in mind is that although these are "DFDL
>> Schemas", they are still valid XML Schemas and can be used anywhere an
>> XML Schema can be used. For example, it's not uncommon to parse a file
>> with a DFDL schema and then use that exact same schema to validate the
>> resulting infoset. It's possible some XML validation tools/systems
>> expect XML validation schemas to end in .xsd, though I'm not aware of
>> any.
>>
>> In general, I think the benefit of .dfdl.xsd is that things that only
>> care about XML schemas can view these files as normal XML Schemas due to
>> the .xsd extension. But things that also care about DFDL schemas can
>> have a special case to treat files with .dfdl.xsd extensions differently.
>>
>> Also, I think I have seen .xml and plain .xsd (without .dfdl) extensions
>> used for DFDL schemas, likely for the IDE support. But .dfdl.xsd gets
>> you the possibility of that extra customization.
>>
>>
>> On 12/18/20 8:21 AM, John Wass wrote:
>> > Doing a little work with software that cares about file extensions,
>> > resulting in a couple questions about the history and future of the dfdl
>> > file extension.
>> >
>> > 1. Why was the extension of .dfdl.xsd used?
>> > 2. What issues would arise by dropping the xsd part?
>> > 3. Are there any other extensions being used, or were there others in
>> the
>> > past?
>> >
>> > Interested in Daffodil and DFDL answers, if they diverge somehow.
>> >
>> > Thanks!
>> >
>>
>>


Re: Daffodil schema file extension

2020-12-18 Thread John Wass
Thanks for the feedback.  It was in line with what I figured.  There is a
chance the ambiguity I am seeing between .dfdl.xsd and .xsd is self
inflicted, but just wanted to put this out there.



On Fri, Dec 18, 2020 at 10:14 AM Beckerle, Mike <
mbecke...@owlcyberdefense.com> wrote:

> If you have edited tdml in an XML aware editor, you know that the support
> for embedded dfdl schemas is better than it is in xsd editors for a dfdl
> schema file.
>
> For that reason I thought maybe we should use the ".dfdl" extension for a
> daffodil feature which replaces the xs:schema element of a dfdl schema with
> a daf:dfdl element. It would otherwise take all the same attributes as
> xs:schema, but by being our own outermost element the editor support would
> treat it more like tdml than xsd.
>
> This breaks a daf:dfdl schema from being exactly an XML schema, but the
> transform to get back is trivial. This might be worth it for the superior
> IDE support we would get with almost no effort.
>
>
>
>
>
>
>
>
> 
> From: Steve Lawrence 
> Sent: Friday, December 18, 2020 9:31:58 AM
> To: dev@daffodil.apache.org 
> Subject: Re: Daffodil schema file extension
>
> I think the main reason for the .xsd extension is so IDEs/editors
> recognize the file as a normal XML Schema file, and so you get all the
> benefits that come with that (e.g. autocompletion, syntax highlighting,
> error checking), since most tools aren't going to know about a DFDL
> schema, at least by default.
>
> The .dfdl.xsd extension is so that it makes it possible to configure
> IDEs/editors to know specifically about DFDL schemas (e.g. DFDL specific
> annotations/properties). But that requires a little IDE configuration,
> and not all IDEs/editors support this kind of thing. So even if you
> can't do that, you still at least get the XML Schema capabilities with
> the .xsd extension.
>
> As for changing it, that should be fine from a Daffodil perspective. It
> doesn't care at all about the extension--it is purely a convention to
> make authoring schemas easier.
>
> Though, one thing to keep in mind is that although these are "DFDL
> Schemas", they are still valid XML Schemas and can be used anywhere an
> XML Schema can be used. For example, it's not uncommon to parse a file
> with a DFDL schema and then use that exact same schema to validate the
> resulting infoset. It's possible some XML validation tools/systems
> expect XML validation schemas to end in .xsd, though I'm not aware of
> any.
>
> In general, I think the benefit of .dfdl.xsd is that things that only
> care about XML schemas can view these files as normal XML Schemas due to
> the .xsd extension. But things that also care about DFDL schemas can
> have a special case to treat files with .dfdl.xsd extensions differently.
>
> Also, I think I have seen .xml and plain .xsd (without .dfdl) extensions
> used for DFDL schemas, likely for the IDE support. But .dfdl.xsd gets
> you the possibility of that extra customization.
>
>
> On 12/18/20 8:21 AM, John Wass wrote:
> > Doing a little work with software that cares about file extensions,
> > resulting in a couple questions about the history and future of the dfdl
> > file extension.
> >
> > 1. Why was the extension of .dfdl.xsd used?
> > 2. What issues would arise by dropping the xsd part?
> > 3. Are there any other extensions being used, or were there others in the
> > past?
> >
> > Interested in Daffodil and DFDL answers, if they diverge somehow.
> >
> > Thanks!
> >
>
>


Embedded Schematron progress

2020-12-18 Thread John Wass
The Embedded Schematron PR is moving along, hoping to get it out of WIP
soon.  https://github.com/apache/incubator-daffodil/pull/463

The JPEG and BMP schema repos are being used for testing now, and the PNG
looks like it would provide some great coverage.. maybe too great :/  Any
other noteworthy sources of sch+data that might be beneficial to test with?

Observations on embedding
- Behavior has been predictable, and errors have been clear
- There are multiple placement options for schematron rules in a schema
- The Validator API has held up well, but might be one issue to come out of
this effort

Examples at
https://github.com/jw3/incubator-daffodil/tree/validator_spi/embedded_schematron/daffodil-schematron/src/test/resources/xsd
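For readers unfamiliar with the Validator API mentioned above, the pluggable shape is roughly the following. This is a sketch with invented names, not the actual trait; in Daffodil the lookup is backed by java.util.ServiceLoader (SPI), while a plain map keeps this sketch self-contained.

```scala
// Sketch only -- invented names, not the actual Daffodil Validator API.
// The idea: validators are small pluggable components keyed by name, so
// schematron, xerces, or custom validators can be selected by configuration.
trait Validator {
  def name: String
  def validate(document: String): Seq[String] // diagnostics; empty = valid
}

object Validators {
  // Stand-in registry; the real implementation discovers providers via SPI.
  private val registry: Map[String, Validator] = Map(
    "limitless" -> new Validator {
      val name = "limitless"
      def validate(document: String): Seq[String] = Seq.empty // accepts anything
    }
  )
  def find(name: String): Option[Validator] = registry.get(name)
}
```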


Daffodil schema file extension

2020-12-18 Thread John Wass
Doing a little work with software that cares about file extensions,
resulting in a couple questions about the history and future of the dfdl
file extension.

1. Why was the extension of .dfdl.xsd used?
2. What issues would arise by dropping the xsd part?
3. Are there any other extensions being used, or were there others in the
past?

Interested in Daffodil and DFDL answers, if they diverge somehow.

Thanks!


Re: Fw: [SonarCloud] incubator-daffodil: 1 new issue (new debt: 5min)

2020-10-29 Thread John Wass
I had the dead-code cleanup about 95% complete, plus some work on another round
of unused warnings (maybe locals) back in June, but the branch will be stale now.

Scalafix helps greatly with these tasks but it is still a bit of work.

Will take a look and see where that was once this validators stuff is
wrapped up.
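For context on the compiler-side knobs for this kind of cleanup: unused-code warnings are scalac options set in the build. A hedged build.sbt fragment (flag names vary by Scala version; 2.12 uses -Ywarn-unused, 2.13 renames it to -Wunused):

```scala
// build.sbt fragment (sketch; exact flag names depend on Scala version).
scalacOptions ++= Seq(
  "-Ywarn-unused",    // warn on unused locals, privates, and imports (2.12)
  "-Xfatal-warnings"  // promote warnings to errors so they never reach master
)
```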


On Mon, Oct 26, 2020 at 2:59 PM Steve Lawrence  wrote:

> We removed a bunch of warnings a while ago when adding Scala 2.12 support:
>
> https://issues.apache.org/jira/browse/DAFFODIL-2145
>
> I suspect it's just a decent amount of work to re-enable these warnings
> and fix whatever is broken. Something that definitely needs to be done,
> it's just been relatively low priority.
>
> Agreed that we definitely don't want to wait until things make it into
> master and the sonar scan is run to find these problems. Getting these
> fixed so they never make it to master is ideal.
>
> - Steve
>
>
> On 10/26/20 2:49 PM, Beckerle, Mike wrote:
> > So I got this from sonarcloud. It identifies an unused variable for
> sure. It doesn't help me understand who introduced it or why it got into testing.
> >
> > Question: shouldn't our scala compiler settings be giving us a warning
> about this, and aren't warnings fatal per our current practices?
> >
> > I wouldn't expect we'd need SonarCloud to find this for us. I'd expect
> every developer to see these on every compile.
> >
> > 
> > From: SonarCloud 
> > Sent: Monday, October 26, 2020 2:45 PM
> > To: Beckerle, Mike 
> > Subject: [SonarCloud] incubator-daffodil: 1 new issue (new debt: 5min)
> >
> > Project: incubator-daffodil
> > Version: not provided
> >
> > 1 new issue (new debt: 5min)
> >
> > Type
> > Bug: 0Vulnerability: 0Code Smell: 1
> >
> > Rules
> > Unused local variables should be removed (scala): 1
> >
> > Most impacted files
> > ConvertTextNumberUnparser.scala: 1
> >
> > More details at:
> https://sonarcloud.io/project/issues?id=apache_incubator-daffodil&createdAt=2020-10-26T18%3A44%3A35%2B
> >
>
>


Re: Validator SPI proposal

2020-10-13 Thread John Wass
PR is in at https://github.com/apache/incubator-daffodil/pull/431



On Wed, Oct 7, 2020 at 9:43 AM john wass  wrote:

> Based on the feedback it sounds like the approach is sane enough to put
> together a PR.
>
> Thanks all for the reviews and feedback.
>
>
> On Thu, Oct 1, 2020 at 6:11 PM Beckerle, Mike <
> mbecke...@owlcyberdefense.com> wrote:
>
>> FYI: John Wass - I am also going to surf your code a bit so may have more
>> comments. I've fetched from your ctc-oss repositories.
>>
>> One thing the UDF code did work through is how to define a SPI-based
>> feature for Daffodil and also include test-specific instances of it and
>> test them, all in the daffodil source tree.
>>
>> 
>> From: Beckerle, Mike 
>> Sent: Thursday, October 1, 2020 5:52 PM
>> To: dev@daffodil.apache.org 
>> Subject: Re: Validator SPI proposal
>>
>> A few thoughts on top of John Interrante's review.
>>
>> The validator code/classes being found via SPI seems good. Sharing
>> code/library with the existing usage for UDFs would be nice if it works out.
>>
>> The validator code reads in various "specs".
>>
>> For XML Schema validation with xerces, this is the XML Schema (which is
>> also the DFDL schema).
>>
>> For Schematron validation, I know some people have asked for the ability
>> to express the schematron rules on the DFDL schema as added schema
>> annotation elements, positioning them on elements and having the "." path
>> expression refer to the element corresonding to the element declaration
>> upon which the rule is placed. They end up looking somewhat like DFDL's
>> assertions, but the schematron rules use full XPath, and so can do somewhat
>> more, and they are operating on the XML Infoset, not the DFDL Infoset.
>>
>> But regardless of whether the schematron rules are extracted from the
>> DFDL schema or from another file, the schematron validator, just like
>> xerces, effectively has to compile that "spec" information into an internal
>> data structure that enables fast validation.
>>
>> So a requirement is that this happens once only, at startup time
>> regardless of how many times parse/unparse are called.
>>
>> Ideally, one would be able to serialize the result of this compilation
>> i.e., save and serialize the validator so that it needn't be recompiled at
>> all if reloaded. If this compiled validator is serializable, then just
>> making that value a member of the SchemaSetRuntimeData class should do it,
>> as that object and all its members get serialized now.
>>
>> So if possible the validator API should accommodate this
>> compile/save/reload cycle.
>>
>> btw: daffodil has validation options for parse, but not for unparse
>> currently. It should have the option to validate the incoming infoset
>> before unparsing as well.
>>
>> Re: Your "unknowns"
>>
>>  - How to approach breaking changes in the Validator API
>>
>> This is a general issue with Daffodil APIs. I think we have previously
>> adopted a posture of that we would support API change by retaining existing
>> but deprecated APIs for a release or two before phasing them out. We try to
>> sort these out in design discussions of APIs or in Pull-Request reviews
>> that have API changes in them.
>>
>> - How to evolve serialized API objects to prevent breakage in existing
>> serialized objects (specifically from daffodil.api.ValidationMode)
>>
>> We have heretofore punted this in Daffodil generally. Saved
>> parser/unparsers are version specific. If we want to fix this we should use
>> a general approach for all Daffodil's serialized objects.  You are
>> proposing to change ValidationMode so that really, it's not an enum any
>> more, it can use identifiers that are pulled from classpath/SPI objects
>> found.
>>
>> In that case the code that uses ValidationMode  will have to change to
>> use something more general. Probably ValidationMode itself has to go away
>> as a concept replaced by a ValidatorSpec class which can be constructed
>> from a string.
>>
>> Then maybe ValidationMode.on isn't an enum at all any more, but a method
>> that returns a singleton ValidationSpec for the xerces built in validator?
>>
>> - Is there a better overall approach to this  :P
>>
>> Gotta start somewhere.
>>
>> -mikeb
>>
>> 
>> From: Wass, John L 
>> Sent: Wednesday, September 30

Re: Validator SPI proposal

2020-10-07 Thread john wass
Based on the feedback it sounds like the approach is sane enough to put
together a PR.

Thanks all for the reviews and feedback.
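One point from the quoted discussion below worth illustrating: compiling the validation spec once and serializing the result, so a reloaded processor skips recompilation. A rough Scala sketch with invented names (not Daffodil's SchemaSetRuntimeData or Validator API):

```scala
import java.io._

// Sketch of the compile-once / save / reload cycle for a validator.
// Names are invented for illustration. The compiled validator is
// Serializable so it can ride along with saved parser/unparser objects.
final case class CompiledValidator(rules: List[String]) extends Serializable {
  // Toy "validation": a document passes if it contains every rule string.
  def validate(doc: String): Boolean = rules.forall(doc.contains)
}

object ValidatorStore {
  def save(v: CompiledValidator, f: File): Unit = {
    val out = new ObjectOutputStream(new FileOutputStream(f))
    try out.writeObject(v) finally out.close()
  }
  def load(f: File): CompiledValidator = {
    val in = new ObjectInputStream(new FileInputStream(f))
    try in.readObject().asInstanceOf[CompiledValidator] finally in.close()
  }
}
```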


On Thu, Oct 1, 2020 at 6:11 PM Beckerle, Mike 
wrote:

> FYI: John Wass - I am also going to surf your code a bit so may have more
> comments. I've fetched from your ctc-oss repositories.
>
> One thing the UDF code did work through is how to define a SPI-based
> feature for Daffodil and also include test-specific instances of it and
> test them, all in the daffodil source tree.
>
> 
> From: Beckerle, Mike 
> Sent: Thursday, October 1, 2020 5:52 PM
> To: dev@daffodil.apache.org 
> Subject: Re: Validator SPI proposal
>
> A few thoughts on top of John Interrante's review.
>
> The validator code/classes being found via SPI seems good. Sharing
> code/library with the existing usage for UDFs would be nice if it works out.
>
> The validator code reads in various "specs".
>
> For XML Schema validation with xerces, this is the XML Schema (which is
> also the DFDL schema).
>
> For Schematron validation, I know some people have asked for the ability
> to express the schematron rules on the DFDL schema as added schema
> annotation elements, positioning them on elements and having the "." path
> expression refer to the element corresponding to the element declaration
> upon which the rule is placed. They end up looking somewhat like DFDL's
> assertions, but the schematron rules use full XPath, and so can do somewhat
> more, and they are operating on the XML Infoset, not the DFDL Infoset.
>
> But regardless of whether the schematron rules are extracted from the DFDL
> schema or from another file, the schematron validator, just like xerces,
> effectively has to compile that "spec" information into an internal data
> structure that enables fast validation.
>
> So a requirement is that this happens once only, at startup time
> regardless of how many times parse/unparse are called.
>
> Ideally, one would be able to serialize the result of this compilation
> i.e., save and serialize the validator so that it needn't be recompiled at
> all if reloaded. If this compiled validator is serializable, then just
> making that value a member of the SchemaSetRuntimeData class should do it,
> as that object and all its members get serialized now.
>
> So if possible the validator API should accommodate this
> compile/save/reload cycle.
>
> btw: daffodil has validation options for parse, but not for unparse
> currently. It should have the option to validate the incoming infoset
> before unparsing as well.
>
> Re: Your "unknowns"
>
>  - How to approach breaking changes in the Validator API
>
> This is a general issue with Daffodil APIs. I think we have previously
> adopted a posture that we would support API changes by retaining existing
> but deprecated APIs for a release or two before phasing them out. We try to
> sort these out in design discussions of APIs or in Pull-Request reviews
> that have API changes in them.
>
> - How to evolve serialized API objects to prevent breakage in existing
> serialized objects (specifically from daffodil.api.ValidationMode)
>
> We have heretofore punted this in Daffodil generally. Saved
> parser/unparsers are version specific. If we want to fix this we should use
> a general approach for all Daffodil's serialized objects.  You are
> proposing to change ValidationMode so that, really, it's not an enum any
> more; it can use identifiers that are pulled from classpath/SPI objects
> found.
>
> In that case the code that uses ValidationMode  will have to change to use
> something more general. Probably ValidationMode itself has to go away as a
> concept replaced by a ValidatorSpec class which can be constructed from a
> string.
>
> Then maybe ValidationMode.on isn't an enum at all any more, but a method
> that returns a singleton ValidationSpec for the xerces built in validator?
>
> - Is there a better overall approach to this  :P
>
> Gotta start somewhere.
>
> -mikeb
>
> 
> From: Wass, John L 
> Sent: Wednesday, September 30, 2020 3:19 PM
> To: dev@daffodil.apache.org 
> Subject: Re: Validator SPI proposal
>
> Thanks for the review John.
>
> Then how do you combine your forked Daffodil and sample schematron
> implementation/application together so that your simplest usage example
> actually works?
>
> Sure.  The missing instructions are below and they were also added to the
> readme in the sample app repo.
>
> ---
>
> 1. From the root of daffodil; stage the cli package
> `sbt daffodil-cli/universal:stage`
>
> 2. Fro

Re: Subproject renaming - check for consensus

2020-10-07 Thread john wass
OK.  Agreed on shortening it.  The prefix altogether seemed unnecessary to
me.  My only comment on that would be perhaps to invert the prefixing so that
it is only present on the special cases.

Just thoughts, no strong opinions here either.  Also relatively new to the
code, so could be missing some other considerations.

On Wed, Oct 7, 2020 at 9:22 AM Steve Lawrence  wrote:

> I think the concern was the "daffodil-" prefix is a bit long and a bit
> unnecessary, but it is potentially nice since it helps distinguish code
> related subprojects from non-code (e.g. tests, infrastructure). So
> "daf-" is a decent middle ground.
>
> I think the "daf-" was suggested for me, and I don't feel strong about
> it, so I'm not against dropping the daf- prefix entirely if others don't
> like it.
>
> On 10/7/20 9:16 AM, john wass wrote:
> > What was the reasoning on the daf- prefix again?
> >
> > I looked back through but didn't see what the value add of the prefix was.
> >
> >
> > On Wed, Oct 7, 2020 at 9:09 AM Interrante, John A (GE Research, US) <
> > inter...@research.ge.com> wrote:
> >
> >> I've created an issue (
> https://issues.apache.org/jira/browse/DAFFODIL-2406)
> >> to rename Daffodil subprojects for better clarity.  I'd like to ask
> devs if
> >> the issue has everyone's consensus and ask for a volunteer to perform
> the
> >> renaming since I'm going to be refactoring classes and changing code in
> my
> >> own pull request (daffodil-2202-runtime2) for at least the next few
> days.
> >>
> >> Does this proposal look acceptable to everyone?
> >> Phase 1
> >>
> >>   *   containers
> >>   *   daf-backend-c-generator (ignore - not in main yet, but will be
> later)
> >>   *   daf-backend-scala-parser
> >>   *   daf-backend-scala-unparser
> >>   *   daf-cli
> >>   *   daf-io
> >>   *   daf-lib
> >>   *   daf-macro-lib
> >>   *   daf-propgen
> >>   *   daf-sapi
> >>   *   daf-schema-compiler (was daffodil-core)
> >>   *   daf-tdml-lib
> >>   *   daf-tdml-processor
> >>   *   project
> >>   *   test-suite-daf
> >>   *   test-suite-ibm
> >>   *   test-suite-std
> >>   *   tutorials
> >> Phase 2
> >> Merge daf-backend-scala-parser and daf-backend-scala-unparser together
> >> into daf-backend-scala.
> >>
> >
>
>


Re: Subproject renaming - check for consensus

2020-10-07 Thread john wass
What was the reasoning on the daf- prefix again?

I looked back through but didn't see what the value add of the prefix was.


On Wed, Oct 7, 2020 at 9:09 AM Interrante, John A (GE Research, US) <
inter...@research.ge.com> wrote:

> I've created an issue (https://issues.apache.org/jira/browse/DAFFODIL-2406)
> to rename Daffodil subprojects for better clarity.  I'd like to ask devs if
> the issue has everyone's consensus and ask for a volunteer to perform the
> renaming since I'm going to be refactoring classes and changing code in my
> own pull request (daffodil-2202-runtime2) for at least the next few days.
>
> Does this proposal look acceptable to everyone?
> Phase 1
>
>   *   containers
>   *   daf-backend-c-generator (ignore - not in main yet, but will be later)
>   *   daf-backend-scala-parser
>   *   daf-backend-scala-unparser
>   *   daf-cli
>   *   daf-io
>   *   daf-lib
>   *   daf-macro-lib
>   *   daf-propgen
>   *   daf-sapi
>   *   daf-schema-compiler (was daffodil-core)
>   *   daf-tdml-lib
>   *   daf-tdml-processor
>   *   project
>   *   test-suite-daf
>   *   test-suite-ibm
>   *   test-suite-std
>   *   tutorials
> Phase 2
> Merge daf-backend-scala-parser and daf-backend-scala-unparser together
> into daf-backend-scala.
>