All fair points. After becoming a bit more familiar with how layers work, I have some additional thoughts.

Currently, the way layers work is that we create a single LayerTransformerFactory during schema compilation, and each time parse/unparse encounters this layer it calls factory.newInstance() to allocate a new LayerTransformer. When we call newInstance, we pass in a LayerRuntimeInfo, which contains things like layerLengthEv (which started this whole discussion) and can be used to configure the new LayerTransformer instance.
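
A minimal sketch of that lifecycle (LayerTransformerFactory, LayerTransformer, newInstance, and LayerRuntimeInfo are the names from the thread; everything else below is a toy stand-in, not the real API):

    import java.io.InputStream

    // Toy stand-in: the real LayerRuntimeInfo carries evaluatables
    // such as layerLengthEv.
    case class LayerRuntimeInfo(layerLengthEv: () => Long)

    trait LayerTransformer {
      def wrapLayerDecoder(in: InputStream): InputStream
    }

    // Built once at schema compile time; newInstance() is called for
    // each parse/unparse that encounters the layer.
    trait LayerTransformerFactory {
      def newInstance(info: LayerRuntimeInfo): LayerTransformer
    }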

But because each parse/unparse gets a new instance, there are really only
so many opportunities for a LayerTransformer to be non-thread-safe, for
example (both sketched below):

1. The LayerTransformerFactory stores state and passes it into the
LayerTransformers it creates, effectively sharing state among the
different instances.

2. A LayerTransformer uses a singleton or some other globally shared
resource, so all instances share state.
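
Contrived sketches of both hazards, reusing the toy types from the sketch
above:

    // Hazard #1: the factory holds mutable state and hands it to every
    // instance it creates, so all transformers share one buffer.
    class SharedStateTransformer(buf: Array[Byte]) extends LayerTransformer {
      def wrapLayerDecoder(in: InputStream): InputStream = in // details elided
    }

    class BadFactory extends LayerTransformerFactory {
      private val sharedBuffer = new Array[Byte](4096) // one buffer, many instances
      def newInstance(info: LayerRuntimeInfo): LayerTransformer =
        new SharedStateTransformer(sharedBuffer)
    }

    // Hazard #2: a process-wide singleton; even fresh instances share it.
    object GlobalScratch {
      val buffer = new Array[Byte](4096)
    }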

We could potentially avoid #1 if each LayerTransformerFactory were wrapped in a ThreadLocal, but there's not much we can do about #2. Even ThreadLocals don't solve that problem.
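
The ThreadLocal wrapping I have in mind would look roughly like this (using
the toy BadFactory from above); it confines #1's factory state to a thread
but does nothing about a global like GlobalScratch:

    // One factory (and hence one copy of its state) per thread.
    val perThreadFactory: ThreadLocal[LayerTransformerFactory] =
      ThreadLocal.withInitial[LayerTransformerFactory](() => new BadFactory())

    // Per parse/unparse, on whatever thread is running:
    // val transformer = perThreadFactory.get().newInstance(info)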

And with changes to the API, we could maybe even avoid #1 without ThreadLocals. For example, instead of using a LayerTransformerFactory to create new LayerTransformer instances, the framework could just instantiate the LayerTransformer directly, the same way it instantiates the factory now. The LayerTransformerFactory doesn't provide anything right now except newInstance, so maybe it can be removed altogether?
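
Something like this (hypothetical; the no-arg-constructor convention and the
init step are assumptions, not the current API):

    // The framework holds the transformer's Class, found at schema
    // compile time, and allocates a fresh instance per parse/unparse,
    // with no factory involved.
    def makeTransformer(
        layerClass: Class[_ <: LayerTransformer],
        info: LayerRuntimeInfo
    ): LayerTransformer = {
      val t = layerClass.getDeclaredConstructor().newInstance()
      // any setup a factory does today would move here, driven by `info`
      // (a hypothetical init step, not part of the current API)
      t
    }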

Also, one thing I've learned is that LayerTransformers do currently need mutable state: they may read input variables and store the values after being initialized, and they may need to buffer state while reading (e.g. ByteSwapTransformer must read and store multiple bytes before it can return a single byte).
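
A sketch of why that buffering implies per-instance mutable state (this is
illustrative, not the actual ByteSwapTransformer):

    import java.io.InputStream

    // Swaps byte order within fixed-size words. It must read and hold a
    // whole word before it can hand back the first (i.e. last) byte, so
    // the partially-consumed word is unavoidable per-parse mutable state.
    class ByteSwapSketch(wordSize: Int, in: InputStream) {
      private val word = new Array[Byte](wordSize)
      private var pos = wordSize // wordSize means "need to read a new word"

      def readByte(): Int = {
        if (pos == wordSize) {
          var i = 0
          while (i < wordSize) {
            val b = in.read()
            if (b < 0) return -1 // EOF mid-word handling elided
            word(i) = b.toByte
            i += 1
          }
          pos = 0
        }
        val result = word(wordSize - 1 - pos) & 0xff // reversed order
        pos += 1
        result
      }
    }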

Another thought about ThreadLocals: if instances were to be shared within a thread, then they would need a reset() function called before each use. One could argue that a reset() function is more fragile than just allocating a new instance--it's very easy to add new mutable state and forget to make the corresponding change to the reset function. New instances avoid this issue.
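
A minimal sketch of that failure mode (hypothetical class):

    // Shared-within-a-thread instance that must be reset between uses.
    class ReusableTransformer {
      private var bytesConsumed: Long = 0
      private var checksum: Int = 0 // field added later...

      def reset(): Unit = {
        bytesConsumed = 0
        // ...but reset() was never updated, so checksum from the
        // previous parse silently leaks into the next one.
      }
    }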


On 2023-09-13 09:48 AM, Mike Beckerle wrote:
(Reviving this discussion thread - it went quiet the last few weeks... and I
forgot to hit 'send' on this reply)

I would suggest that layer transformers, UDFs, and charset definitions are
not very much like NiFi processors.

They're generally embedded in DFDL "component" schemas that are used by
other DFDL "assembly" schemas.

A goal from a big adopter of Daffodil, in cybersecurity devices, is that one
can add a DFDL schema to a configuration without triggering all the
recertifications that make these devices slow to market and too expensive.

That kind of fluidity isn't required of NiFi processors AFAIK.

For a code plugin to pass muster easily and not trigger a whole
re-certification, layer transformers have to be easy to scrutinize and
approve. They have to be obviously correct and secure. Most should
be 1 page of code.  In an ideal world they would not contain any iteration
constructs. Just methods consisting of only straight-line code, executed by
a framework. Think: next() and hasNext() methods of an Iterator, but since
those usually involve state, something even simpler like a getNext()
returning an Option[T] to combine the next and hasNext behaviors into one
method.
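
To sketch what I mean (the names here are illustrative, not a proposed API):

    // The plugin author writes only straight-line code: one method that
    // fuses hasNext/next by returning Option. The framework owns all
    // iteration.
    trait SimpleLayerStep[T] {
      def getNext(): Option[T] // None signals end of the layer's data
    }

    // Framework-side driver loop; plugin code never iterates.
    def drive[T](step: SimpleLayerStep[T])(emit: T => Unit): Unit = {
      var n = step.getNext()
      while (n.isDefined) {
        emit(n.get)
        n = step.getNext()
      }
    }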

We have a ways to go to get the layer transforms API to enable this high
level of abstraction, but I do think it is possible for many layer
transformers.

I also think this means all issues about thread-safety, etc. cannot be left
up to the author of the layer, because they make the code anything but
obviously correct, and far too embedded in subtle algorithmic issues.
Rather I believe all such issues must be handled by the framework the layer
transformers execute in.

The overhead you mention is important for fine-grained layer transformers -
e.g., checkDigits in some formats are computed for individual fields, so a
layer would have to underlie the digits of a single number. But that's an
outlier, and a UDF may be a better approach for that since the digits of
one field are available as a value to a UDF. Layers are typically for
situations involving more than one field.






On Wed, Aug 16, 2023 at 1:04 PM Steve Lawrence <slawre...@apache.org> wrote:

I'm thinking the layer framework needs to allocate the layer
transformer instance in a thread-local manner so that if it
constructs mutable state, that state will be thread-local also

One downside to this is that layer transformers which are already
thread-safe unnecessarily incur some overhead, both ThreadLocal
overhead and memory overhead for each new transformer instance created.
I'm not sure how much overhead that really is or whether it matters, but
it's maybe something to consider.

I also wonder how other systems handle this kind of thing. For example,
I'm somewhat familiar with Apache NiFi and I don't think it uses
ThreadLocals or allocates multiple instances of Processors. And I don't
think it does anything to actually ensure that Processors are thread-safe.

I think NiFi just documents that XYZ functions must be thread safe, and
it's up to Processor implementations to not do anything like store
state. And if they do need to store state or something, then they need
to use ThreadLocals or some thread-safe data structure for that state.

And actually this is what the Daffodil NiFi processor does. It has a
mutable cache of compiled parsers so it doesn't need to recompile them all
the time, and this cache is stored in a thread-safe data structure.
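
Roughly like this (a sketch; CompiledParser and compileSchema are stand-ins
for the real Daffodil compile step, and the actual processor's cache may
differ):

    import java.util.concurrent.ConcurrentHashMap

    class CompiledParser // stand-in for a compiled Daffodil DataProcessor
    def compileSchema(path: String): CompiledParser = new CompiledParser

    // Thread-safe cache: each schema is compiled at most once, and the
    // compiled result is shared by all processor threads.
    val parserCache = new ConcurrentHashMap[String, CompiledParser]()

    def parserFor(schemaPath: String): CompiledParser =
      parserCache.computeIfAbsent(schemaPath, path => compileSchema(path))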

On 2023-08-16 12:39 PM, Mike Beckerle wrote:
In the CRC32 layer, the "optimization" was really just laziness. The use
case we have only needs constant known explicit lengths, so we wanted to
hard-wire that and not bother to support or test any other cases.

But, had that worked, I would bet we would have made the mistake and
pre-allocated and saved the buffer of that size, resulting in
non-thread-safe code.

But turns out there's a base class that handles all this for us, and we
get the ability for the explicit length to be computed from an expression
also, for free.

I'm thinking the layer framework needs to allocate the layer transformer
instance in a thread-local manner, so that if it constructs mutable state,
that state will be thread-local also, because the natural thing to do is
to look at the layerLength information and allocate a buffer in a member
of the class. It's just the first and most obvious thing to do, and it's
dead wrong for thread safety.




On Wed, Aug 16, 2023 at 9:48 AM Steve Lawrence <slawre...@apache.org>
wrote:

I don't remember why NoCacheEvaluatable was used, but I suspect you're
right that it was done so implementations are less likely to mess things
up. Note that NoCacheEvaluatable is also used for ImplicitLengthEv. I'm
not sure why it's used there either. Maybe LayerLengthEv just copied from
that?

Note that using NoCacheEvaluatable still doesn't really ensure
thread-safe implementations. An implementation could just as easily
allocate a buffer based on the first layer length it gets and then store
that buffer for reuse later (with some logic to grow it if a future layer
length is bigger). That wouldn't be thread safe regardless of no-cache or
not.
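
For example (contrived):

    // Still broken with NoCacheEvaluatable: the length is re-evaluated
    // every time, but the buffer itself is cached on the instance and
    // grown on demand, so concurrent parses share (and corrupt) it.
    class GrowOnDemandBuffer {
      private var buf = new Array[Byte](0)
      def bufferFor(layerLength: Int): Array[Byte] = {
        if (buf.length < layerLength) buf = new Array[Byte](layerLength)
        buf
      }
    }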

So I don't think we should be relying on NoCacheEvaluatable to ensure
thread-safe implementations. API changes seem reasonable if we want to
enforce that, or maybe we just need to improve the documentation about
what parts of the layer API are thread-safe and when things can and
cannot be shared.

That said, I'm curious what optimizations the CRC32 layer is trying to
do with a constant layer length?

On 2023-08-15 05:49 PM, Mike Beckerle wrote:
Lola Kilo and I were debugging a layer for CRC32 and found that the
LayerLengthEv is a NoCacheEvaluatable, so even if the schema gives a
constant like dfdlx:layerLength="1240", layerLengthEv.optConstant is
None, because NoCacheEvaluatable refuses constant optimization. This
results in calling the LayerLengthEv at runtime every time.

I am not sure I understand the entire rationale for LayerLengthEv to be
a NoCacheEvaluatable.

I can think of one reason why... so that people can't allocate mutable
things based on that length and store them in the layer transformer class
instance. Because that is a shared runtime structure, that wouldn't be
thread safe. If they are forced to evaluate the layerLengthEv every time,
then their code will naturally allocate any data structure based on the
length at runtime rather than at layer compile time.
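
In sketch form (the function type stands in for evaluating the real
Evaluatable):

    // With evaluation forced to runtime, the obvious code allocates a
    // fresh local buffer per parse -- which happens to be thread safe --
    // instead of caching one in the shared transformer instance.
    def onParse(layerLengthEv: () => Int): Array[Byte] = {
      val len = layerLengthEv()  // must be evaluated every time
      new Array[Byte](len)       // fresh local allocation per parse
    }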

In other words, it's a hack to trick people into being more likely to
write
thread-safe code.

Based on this (and other experience with it) I think the layer API is in
serious need of revisiting.

We really should NOT just cast existing practice into JAPI/SAPI, but
should rethink so that, for example, a thread-local is created from the
layer transformer instance.

Comments?







