I was thinking of approaching DAP integration via scala-debug-adapter, but as you say, it is intended to provide JDI-via-DAP, so I'll dig a bit to see if the DAP-only hooks can be reused without JDI coming along for the ride.
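
As a rough sketch of the Variables idea from my earlier message (quoted below), here is how nested debugger state might be exposed as DAP-style Variable objects. The field names `name`, `value`, and `variablesReference` are from the DAP spec; the `VariableStore` class and the sample state are made up purely for illustration and are not Daffodil, java-debug, or scala-debug-adapter APIs.

```python
from itertools import count


class VariableStore:
    """Hands out DAP-style variablesReference ids for nested values (hypothetical helper)."""

    def __init__(self):
        self._ids = count(1)   # references must be > 0; 0 means "no children"
        self._children = {}    # variablesReference -> dict of child values

    def _register(self, children):
        ref = next(self._ids)
        self._children[ref] = children
        return ref

    def to_variable(self, name, value):
        """Build one Variable; nested dicts get a reference instead of being inlined."""
        if isinstance(value, dict):
            return {"name": name, "value": "{...}",
                    "variablesReference": self._register(value)}
        return {"name": name, "value": str(value), "variablesReference": 0}

    def variables(self, ref):
        """Roughly what a handler for the DAP 'variables' request would return."""
        return [self.to_variable(k, v) for k, v in self._children[ref].items()]


# Invented sample state; a real adapter would read this from the Daffodil debugger.
state = {"bitPosition": 64,
         "infoset": {"header": {"version": 1}, "payload": "89ab782"}}

store = VariableStore()
root = store.to_variable("state", state)
top_level = store.variables(root["variablesReference"])
```

The point of the reference indirection is that the client only fetches children when the user expands a node, which is what would make a large infoset tolerable in a stock Variables view.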

On Wed, Apr 21, 2021 at 5:58 PM John Wass <[email protected]> wrote:

> Thanks Adam, the DAP variable angle is interesting. So are you thinking all aspects are covered without defining any new DAP interfaces?
>
> What about the backend, do you think a Daffodil debug server implementation is needed?
>
> When looking at the Java Debug server, for both Scala and Java, it looked very much tied to JDI and debugging a virtual machine. Did you see anything at all that could be reused there?
>
> It seemed to me that whether we extend DAP or not custom backend server components need to be implemented to provide Daffodil debug sessions rather than the JDI JVM sessions.
>
> On Wed, Apr 21, 2021 at 7:52 PM Adam Rosien <[email protected]> wrote:
>
> > I've been reading up on DAP and wanted to share...
> >
> > > There are many areas though that are unique to Daffodil that have no representation in the spec. These things (like InputStream, Infoset, PoU, different variable types, backtracking, etc) will need an extension to DAP. This really boils down to defining these things to fit under the DAP BaseProtocol and enabling handling of those objects on both the front and back ends.
> >
> > To me, much of the current state exposed by the (Daffodil) Debugger translates directly to a DAP Variable[1]. DAP Variables can be nested/hierarchical, so they could (potentially) model larger data like the infoset. I can imagine shoving all the current state into Variables as a proof-of-concept.
> >
> > It also seems like the processing stack maintained by the Daffodil PState, where each item references the relevant schema element, could translate to the DAP StackFrame type [2]. That is, the path from the schema root to the currently processing schema element becomes the "call stack". (Apologies if I don't have all the Daffodil terms lined up correctly.)
> >
> > For displaying the input data and processing progress, I looked at a few existing VS Code extensions that provided non-builtin views, some of which interact with their DAP debugger code [3] [4] [5] [6].
> >
> > Finally, I took a cursory look at scala-debug-adapter [7], which, for reference, wraps Microsoft's java-debug implementation of DAP. I was curious about the set of request/response and event types. Additionally, the Typescript API to VS Code offers custom DAP requests and responses, but I couldn't find the equivalent notion in the java-debug project.
> >
> > .. Adam
> >
> > [1] https://microsoft.github.io/debug-adapter-protocol/specification#Types_Variable
> > [2] https://microsoft.github.io/debug-adapter-protocol/specification#Types_StackFrame
> > [3] https://github.com/scalameta/metals-vscode (provides a debugger and non-debugger custom UI)
> > [4] https://github.com/microsoft/vscode-cpptools (debugger + memory view)
> > [5] https://marketplace.visualstudio.com/items?itemName=marus25.cortex-debug (debugger + memory view, https://github.com/Marus/cortex-debug/blob/master/src/frontend/memory_content_provider.ts )
> > [6] https://marketplace.visualstudio.com/items?itemName=slevesque.vscode-hexdump (extension for hexdumps that could be controlled by other extensions)
> > [7] https://github.com/scalacenter/scala-debug-adapter
> > [8] https://github.com/microsoft/java-debug
> >
> > On Tue, Apr 20, 2021 at 7:08 AM John Wass <[email protected]> wrote:
> >
> > > > Going to look deeper into how DAP might fit with Daffodil
> > >
> > > Have been looking over DAP and getting a good feeling about it. The specification [1] seems general enough that it could be applied to Daffodil and cover a swath of common operations (like start, stop, break, continue, code locations, variables, etc).
> > >
> > > There are many areas though that are unique to Daffodil that have no representation in the spec. These things (like InputStream, Infoset, PoU, different variable types, backtracking, etc) will need an extension to DAP. This really boils down to defining these things to fit under the DAP BaseProtocol and enabling handling of those objects on both the front and back ends.
> > >
> > > On the backend we need a Daffodil DAP protocol server. Existing JVM implementations (like Java [2], Scala [3]) are tied closely to JDI and would bring a lot of extra baggage to work around that. Developing a Daffodil specific implementation is no small task, but feasible. There are several existing implementations on the JVM that are close and can be looked at for reference.
> > >
> > > The backend implementation would look similar to what was described in an earlier post. We could use ZIO/Akka/etc to implement the backend Protocol Server to enable the IO between the Daffodil process and the DAP clients. This implementation would now be guided by the DAP specification.
> > >
> > > With the protocol and backend extended to fit Daffodil that leaves the frontend. In theory an existing IDE plugin should get pretty close to being able to perform the common debug operations mentioned above. To support the Daffodil extensions there will need to be handling of the extended protocol into whatever views are desired/applicable.
> > >
> > > > Also looking into the Java Debug Interface (JDI) for comparison.
> > >
> > > JDI appears to be the wrong level of abstraction for what we are talking about in debugging Daffodil for schema development. While DAP does do JVM debugging (through a JDI DAP impl) it also generalizes to many other debugging scenarios. JDI on the other hand is very tied to the JVM.
> > >
> > > Extending the JDI appears to be more complex than dealing with DAP, and even though the JDI API is mostly defined with interfaces, there are choke points that limit it to JVM concepts. For example jdi.Value has a finite set of JVM types that it works with, and it's not clear where Daffodil types would plug in, if that is even possible.
> > >
> > > The final note is that unique Daffodil features wouldn’t get to IDE support any faster with JDI. In some cases, like VS Code, you would still need an extended DAP to support these features.
> > >
> > > > and depending on how it shakes out will update the example to show integration
> > >
> > > It would appear wise to investigate DAP further. Next step is to refine these thoughts with a prototype. I started an implementation in the example debugger project [4] to try to run the current example on a _minimal_ DAP implementation.
> > >
> > > [1] https://microsoft.github.io/debug-adapter-protocol/specification
> > > [2] https://github.com/Microsoft/java-debug
> > > [3] https://github.com/scalacenter/scala-debug-adapter
> > > [4] https://github.com/jw3/example-daffodil-debug
> > >
> > > On Mon, Apr 12, 2021 at 9:58 AM John Wass <[email protected]> wrote:
> > >
> > > > > the code is here https://github.com/jw3/example-daffodil-debug
> > > >
> > > > There is now a complete console based example for Zio that demonstrates controlling the debug flow while distributing the current state to three "displays".
> > > > 1. infoset at current step
> > > > 2. diff of infoset against previous step
> > > > 3. bit position and value of data.
> > > >
> > > > These displays are very rudimentary but demonstrate the ability to asynchronously populate multiple views while synchronously controlling the debug loop.
> > > >
> > > > > - The new protocol being informed by existing debugger and DAP is key
> > > >
> > > > Going to look deeper into how DAP might fit with Daffodil, and depending on how it shakes out will update the example to show integration.
> > > >
> > > > Some interesting links to start with
> > > > - https://github.com/scalacenter/scala-debug-adapter
> > > > - https://scalameta.org/metals/docs/integrations/debug-adapter-protocol.html
> > > > - https://github.com/microsoft/java-debug
> > > >
> > > > Also looking into the Java Debug Interface (JDI) for comparison.
> > > >
> > > > On Thu, Apr 8, 2021 at 12:36 PM John Wass <[email protected]> wrote:
> > > >
> > > >> Revisiting this post after doing some debugger related work and thinking about debug protocol/adapters to connect external tooling to the debug process.
> > > >>
> > > >> This comment is good
> > > >>
> > > >> > This also makes me wonder if an approach worth taking for the future of Daffodil schema debugging is developing a sort of "Daffodil Debug Protocol". I imagine it would be loosely based on DAP (which is essentially JSON message based) but could be targeted to the things that a DFDL schema debugger would really need. An added benefit with some sort of protocol is the debugger interface can be uncoupled from Daffodil itself, so we could implement a TUI/GUI/whatever in any language/GUI framework and just have it communicate the protocol over some form of IPC. Another benefit is that any future backends could implement this protocol and so a single debugger could hook into different backends without much issue. Unfortunately, defining such a protocol might be a large task, but we do have our existing debug infrastructure and things like DAP to guide its development/design.
> > > >>
> > > >> Some thoughts on this
> > > >> - Defining the protocol will be a large task, but a minimal version should get up and round tripping quickly with a minimal subset of the protocol.
> > > >> - The new protocol being informed by existing debugger and DAP is key
> > > >> - Uncoupling from Daffodil is key
> > > >> - Adapt the Daffodil protocol to produce DAP after the fact so as not to constrain Daffodil debugging capability
> > > >> - We don't need to tie the protocol or adapters to a single framework, implementations of the IO layer should be simple enough to support multiple things (eg Akka, Zio, "basic" ...)
> > > >> - The current debugger lives in runtime1, but can we make an abstract API that any runtime would implement?
> > > >>
> > > >> Maybe a solution is structured like this
> > > >> - daffodil-debug-api:
> > > >>   - protocol model
> > > >>   - interfaces: debugger / IO adapter / etc
> > > >>   - lives in daffodil repo (new subproject?)
> > > >> - daffodil-debug-io-NAME
> > > >>   - provides implementation of a specific IO adapter
> > > >>   - multiple projects possible (daffodil-debugger-akka, daffodil-debugger-zio, etc)
> > > >>   - supported ones live in their own subprojects, but others can be plugged in from external sources
> > > >>   - ability to support multiple implementations reduces risk of lock-in
> > > >> - debugger applications
> > > >>   - maintained in external repositories
> > > >>   - depending on the IO implementation these could execute in a separate process or on a separate machine
> > > >>   - like Steve said, could be any language / framework
> > > >>
> > > >> Three types of reference implementations / sample applications could also guide the development of the API
> > > >> 1. a replacement for the existing TUI debugger, expected to end up with at minimum the same functionality as the current one.
> > > >> 2. a standalone GUI (JavaFX, Scala.js, ..) debugger
> > > >> 3. an IDE integration
> > > >>
> > > >> Thoughts?
> > > >>
> > > >> Also I'm working on some reference implementations of these concepts using Akka and Zio. Not quite ready to talk through it yet, but the code is here https://github.com/jw3/example-daffodil-debug
> > > >>
> > > >> On Wed, Jan 6, 2021 at 1:42 PM Steve Lawrence <[email protected]> wrote:
> > > >>
> > > >>> Yep, something like that seems very reasonable for dealing with large infosets. But it still feels like we run into usability issues. For example, what if a user wants to see more? We need some configuration options to increase what we've elided. It's not big, but every new thing that needs configuration adds complexity and decreases usability.
> > > >>>
> > > >>> And I think the only reason we are trying to spend effort eliding things is because we're limited to this gdb-like interface where you can only print out a little information at a time.
> > > >>>
> > > >>> I think what would really help is to dump this gdb interface and instead use multiple windows/views. As a really close example to what I imagine, I recently came across this hex editor:
> > > >>>
> > > >>> https://www.synalysis.net/
> > > >>>
> > > >>> The screenshots are a bit small so it's not super clear, but this tool has one view for the data in hex, and one view for a tree of parsed results (which is very similar to our infoset). The "infoset" view has information like offset/length/value, and can be related back to the data view to find the actual bits.
> > > >>>
> > > >>> I imagine the "next generation daffodil debugger" to look much like this. As data is parsed, the infoset view fills up. This view could act like a standard GUI tree so you could collapse sections or scroll around to show just the parts you care about, and have search capabilities to quickly jump around. The advantage here is you no longer really need automated eliding or heuristics for what the user *might* care about. You just show the whole thing and let the user scroll around. As daffodil parses and backtracks, this tree grows or shrinks.
> > > >>>
> > > >>> I also imagine you could have a cursor moving around the hex view, so as daffodil moves around (e.g. scanning for delimiters, extracting integers), one could update this data view to show what daffodil is doing and where it is.
> > > >>>
> > > >>> I also imagine there could be other views as well. For example, a schema view to show where in the schema daffodil is, and to add/remove breakpoints. And an information view for things like variables, in-scope delimiters, PoU's, etc.
> > > >>>
> > > >>> The only reason I mention a debug protocol is that it would allow this GUI to be more easily written in something other than Java/Scala to take advantage of other GUI toolkits. It's been a long while since I've done anything with Java GUIs, but they seemed pretty poor the last time I looked at them. It would even allow for a TUI, which Java has little/no support for. It also enables things like remote debugging if a socket IPC was used. Though I'm not sure all of that is necessary. Just thinking what would be ideal, and it can always be pared back.
> > > >>>
> > > >>> On 1/6/21 12:44 PM, Beckerle, Mike wrote:
> > > >>> > I don't think of it as a daffodil debug protocol, but just a separation of concerns between display of information and the behaviors of parse/unparse that need to be points where users can pause, and data structures available to display.
> > > >>> >
> > > >>> > E.g., it is 100% a display issue that the infoset (shown as XML) is clumsy, too big, etc. The infoset is available in the processor state, and one can examine the current node, enclosing node, prior sibling(s), following sibling(s), etc. One can elide contents that are too big for hexBinary, etc.
> > > >>> >
> > > >>> > I think this problem, how to display the infoset with sensible limits on sizing, is fairly easy to come up with some design for, that will at least be (1) always fairly small (2) much more useful in more cases. It won't be perfect but can be much better than what we do now.
> > > >>> >
> > > >>> > One sensible display "mode" should be that displaying the context surrounding the current element (when parsing or unparsing) displays at most N lines (N/2 before, N/2 after) with a maximum length of L characters (settable within reason?)
> > > >>> >
> > > >>> > Sibling and enclosing nodes would be displayed eliding their contents to at most 1 line.
> > > >>> >
> > > >>> > Here's an example of what I mean. Displaying up to N=10 lines total:
> > > >>> >
> > > >>> > ...
> > > >>> > <enclosingParent1>
> > > >>> >   ...
> > > >>> >   <priorSibling2>89ab782 ...</...>
> > > >>> >   <priorSibling1>some text is here and some more text</...>
> > > >>> >   <currentNode>value might be some big thing which needs to be elided ...</...>
> > > >>> >   <followingSibling1> ... </...>
> > > >>> >   ???
> > > >>> > </enclosingParent1>
> > > >>> > ???
> > > >>> >
> > > >>> > The </...> is just an idea to reduce XML matching end-tag clutter.
> > > >>> >
> > > >>> > The ... on a line alone or where element content would appear generally means 1 or more other siblings. The way the display above starts with ... means that this is a relative inner nest, not starting from the absolute root.
> > > >>> >
> > > >>> > The ... within simple content means that content is elided to fit on one line. Always follows some text characters to differentiate from the child-element context.
> > > >>> >
> > > >>> > The ??? means zero or more other siblings.
> > > >>> >
> > > >>> > I used bold italic above to point out that the current node would be highlighted somehow. Probably a way to do this that doesn't require display modes would be useful. E.g., a text marker like ">>>" as in:
> > > >>> >
> > > >>> > >>> <currentNode>value .... </...>
> > > >>> >
> > > >>> > might be better, particularly for a trace output being dumped to a text file.
> > > >>> >
> > > >>> > I made the above example an unparser kind of example by showing a following sibling that exists that is after the current node.
> > > >>> >
> > > >>> > I think the key concept is that any sibling node is displayed in a way that fits on one line.
> > > >>> > E.g., even if the element name was really long, I'd suggest:
> > > >>> >
> > > >>> > <hereIsAnElementWithASuperLongName...>abcd ... </...>
> > > >>> >
> > > >>> > Where the element name itself gets elided because it is too long.
> > > >>> >
> > > >>> > A thought. Note that the above presentation is shown as quasi-XML, but there's nothing XML-specific about it. A JSON-friendly equivalent could be done as well:
> > > >>> >
> > > >>> > enclosingParent1 = {
> > > >>> >   ...
> > > >>> >   priorSibling2 = "89ab782..."
> > > >>> >   priorSibling1 = "some text is here and some more text"
> > > >>> >   currentNode = "value might be some big thing which needs to be elided ..."
> > > >>> >   followingSibling1 = { ... }
> > > >>> >   ???
> > > >>> > }
> > > >>> >
> > > >>> > That's enough for 1 email thread on this debug topic.
> > > >>> >
> > > >>> > ________________________________
> > > >>> > From: Steve Lawrence <[email protected]>
> > > >>> > Sent: Tuesday, January 5, 2021 2:26 PM
> > > >>> > To: [email protected] <[email protected]>
> > > >>> > Subject: The future of the daffodil DFDL schema debugger?
> > > >>> >
> > > >>> > Now that we're in a new year, I'd like to start a discussion about the Daffodil DFDL Schema debugger and how it might be improved to be more useful.
> > > >>> >
> > > >>> > Note that this is not the capabilities to debug Daffodil itself in something like Eclipse/IntelliJ, but the ability for Daffodil to provide enough extra information during a parse/unparse so that a schema developer can get an idea of what Daffodil is doing. This makes it easier for users (rather than developers) to determine why a schema isn't giving the expected parse/unparse result (either because of bad data or a faulty schema).
> > > >>> >
> > > >>> > The current state of the debugger is enabled by providing the --debug or --trace flags in the CLI. More information about that here:
> > > >>> >
> > > >>> > https://daffodil.apache.org/debugger/
> > > >>> >
> > > >>> > This enables a TUI and commands somewhat similar to GDB, providing things like breakpoints, steps, displaying the current infoset, displaying a dump of the data, etc.
> > > >>> >
> > > >>> > Although I find this tool pretty useful, it definitely has some glaring issues.
> > > >>> >
> > > >>> > The most glaring to me is that it really isn't useful at all for debugging unparse. The data dumps only include the main output stream, so determining things like suspensions and buffered output is impossible.
> > > >>> >
> > > >>> > Another issue is the infoset output. When outputting the infoset, the debugger currently just walks the entire thing and converts it to XML and displays the XML. For large infosets, this is excessive and can make it impossible to use, even with some configurations that limit how much of that infoset is actually printed to the screen. Also things like large hex binary blobs create excessive and unusable output.
> > > >>> >
> > > >>> > Another thing I feel is missing is a schema view. Right now it's very difficult to know where in the schema Daffodil actually is.
> > > >>> >
> > > >>> > I think these issues just need some thought and improvement. One could imagine a better way to stringify our unparse buffers for debug. One could imagine a way to receive infoset state changes so the debugger can track things like backtracks and remove infosets. One could imagine a way to display the schema.
> > > >>> >
> > > >>> > We just need a better way to stringify the current state of the unparse data including buffers, and we need a way for the debugger to receive state change information about the infoset so it can update displays rather than just constantly printing the entire infoset.
> > > >>> >
> > > >>> > However, I think another big issue is just usability in general. I think the CLI usage is reasonable, but it's not always user friendly, and it is difficult to view multiple things at the same time. I think because of this very few people even use this tool. So this is perhaps something worth focusing on.
> > > >>> >
> > > >>> > My first thought to improving this usability issue would be to implement the Debug Adapter Protocol (DAP) (https://microsoft.github.io/debug-adapter-protocol/) for Daffodil, which many IDEs implement. With this implemented, Daffodil could be plugged in to any IDE that supports it and essentially get debugging for free, without the need to worry about the GUI elements.
> > > >>> >
> > > >>> > I do have concerns that this just wouldn't have enough functionality that we'd really need. For example, DAP really only has the ability to show code (Daffodil's equivalent is the DFDL schema). There isn't a way to show a live view of the infoset or data. Most DAP IDEs do have a console output, so we could potentially make it so the console output is a live view of infoset/data. But I'm not even sure most DAP friendly IDEs could support this kind of console output. Does anyone have familiarity with DAP IDEs and what kinds of console capabilities are available?
> > > >>> >
> > > >>> > I also looked into TUI libraries with the idea that we could just extend our current debugger user interface to be a bit friendlier. Unfortunately, there aren't too many Java/Scala TUI libraries and those that do exist don't have Apache friendly licenses. We also want to be careful about increasing dependencies just for a debugger that many people might not use, so large graphics libraries are probably out of the question.
> > > >>> >
> > > >>> > This also makes me wonder if an approach worth taking for the future of Daffodil schema debugging is developing a sort of "Daffodil Debug Protocol". I imagine it would be loosely based on DAP (which is essentially JSON message based) but could be targeted to the things that a DFDL schema debugger would really need. An added benefit with some sort of protocol is the debugger interface can be uncoupled from Daffodil itself, so we could implement a TUI/GUI/whatever in any language/GUI framework and just have it communicate the protocol over some form of IPC. Another benefit is that any future backends could implement this protocol and so a single debugger could hook into different backends without much issue. Unfortunately, defining such a protocol might be a large task, but we do have our existing debug infrastructure and things like DAP to guide its development/design.
> > > >>> >
> > > >>> > Thoughts? Does such a Daffodil Debug Protocol seem worth it? Perhaps we really just need the few improvements mentioned to the existing debugger. Is that enough to make it usable? Or is an entirely different approach needed to debugging schemas?
