Thanks for doing this investigation.
I'll take a look at what else is in the PoU Parse State that might be worth
playing similar copy-on-write tricks on.
But I believe profiling is the next step on this.
On Tue, Jan 9, 2024 at 5:34 PM Steve Lawrence wrote:
And here's where we do some logic and a more detailed comment about it:
https://github.com/apache/daffodil/blob/main/daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala#L346-L362
So I think we do already do copy-on-write for variables when parsing.
On
There's actually a comment in the PState captureFrom() method used to
capture state during PoUs:
// Note that this is intentionally a shallow copy. This normally would
// not work because the variable map is mutable so other state changes
// could mutate this snapshot. This is avoided by
Actually, I haven't measured it, but there are 4 built in variables, so
even if a schema introduces no new variables of its own there is overhead
to deal with copying the state of 4 variables just in case you need to
backtrack them, and this overhead occurs for every point of uncertainty.
Also
It's definitely worth considering.
DAFFODIL-2852 showed that variable copies can definitely have a lot of
overhead. Though the commit to resolve that issue reduced it pretty
substantially, and I believe that change made variable copies disappear
from profiling (but I'm not positive).
In my
Seems like the benefit would only be significant if you were dealing with lots
of variables.
-----Original Message-----
From: Mike Beckerle
Sent: Tuesday, January 9, 2024 1:39 PM
To: dev@daffodil.apache.org
Subject: Thoughts on on demand copying of parser state
Right now we copy the state of the parser as every point of uncertainty is
reached.
I am speculating that we could copy on demand. So, for example, if no
variable modifying operation occurs then there would be no overhead to
copy the variable state.
This comes at the cost of each variable doing
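
To make the copy-on-demand idea concrete, here is a minimal sketch in Scala. It is not Daffodil code: CowVariables, mark, reset, discard and the copies counter are all hypothetical names invented for illustration, and variables are simplified to a String-to-String map. The assumption is that a point of uncertainty only records a marker, and the snapshot is taken the first time a variable is written after that marker.

```scala
import scala.collection.mutable

// Hypothetical illustration only; none of these names exist in Daffodil.
final class CowVariables(initial: Map[String, String]) {
  private val current = mutable.Map.empty[String, String] ++= initial

  // One entry per open point of uncertainty, innermost first.
  // None means no variable has been written since that mark, so nothing was copied.
  private var marks: List[Option[Map[String, String]]] = Nil

  // Number of snapshots actually taken, to make the saving visible.
  var copies: Int = 0

  // Entering a point of uncertainty: record a marker, copy nothing.
  def mark(): Unit = marks = None :: marks

  def get(name: String): Option[String] = current.get(name)

  def set(name: String, value: String): Unit = {
    marks match {
      case None :: rest =>
        // First write since the innermost mark: snapshot now, on demand.
        marks = Some(current.toMap) :: rest
        copies += 1
      case _ => () // already snapshotted, or no open mark
    }
    current(name) = value
  }

  // Backtrack: restore the snapshot if one was taken, then drop the mark.
  def reset(): Unit = marks match {
    case top :: rest =>
      top.foreach { snap => current.clear(); current ++= snap }
      marks = rest
    case Nil => throw new IllegalStateException("no mark to reset")
  }

  // Success: drop the mark. If the enclosing mark has no snapshot yet, it
  // inherits this one, because no write happened between the two marks.
  def discard(): Unit = marks match {
    case (inner @ Some(_)) :: None :: rest => marks = inner :: rest
    case _ :: rest => marks = rest
    case Nil => throw new IllegalStateException("no mark to discard")
  }
}
```

With this shape, a mark() followed by discard() or reset() and no intervening set() takes no snapshot at all, while every set() pays for one extra check of the innermost mark. Whether that trade is a win in practice is exactly what profiling would need to show.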