This feels like the right approach. Thoughts on instead of having to manually clone() and having some sort of locking mechanism when parse is called, we just make a DataProcessor completely immutable, and setting a variable will implicitly create a copy with modified state, something like:
val dp = dataProcessor .withTunable("tunableFoo", "1") .withVariable("variableFoo", 2") .withVariable("variableBar", 2") It's not as java-y, but feels safer. I can imagine a case where someone sets a bunch of state, calls parse and locks things, and then way down the line they get a "data processor is locked" error because they tried to change some state. On 3/25/20 11:50 AM, Beckerle, Mike wrote: > Here's my concrete proposal to fix DataProcesor's API. > > We add a clone() method. It copies the object sharing the big/expensive stuff > like the compiled schema but gives you a separate copy of all the state that > is settable via setValidationMode, setExternalDFDLVariable, and setTunable. > > So you get a DataProcessor either from a ProcessorFactory, or from > Compiler.reload(). > > You set state. > You can call parse/unparse on various threads > > If you want to change state settings for another parse call, you must call > clone() to get another instance of the DataProcessor. This will share the > expensive/big stuff like the compiled schema, but give you a separate copy of > all the state that is settable via setValidationMode, > setExternalDFDLVariable, and setTunable. > > Then you set state > You can call parse/unparse on various threads. > > I hate to get into locking and such, but once you call parse/unparse, that > should lock the object state, and any further setter calls should fail/throw. > You should have to clone it to "change" the settings. > > This makes the use of the setters a java-ish way of achieving the same > reentrancy one would get if there were no setters and simply more arguments > to the parse/unparse calls. > > ________________________________ > From: Beckerle, Mike <mbecke...@tresys.com> > Sent: Wednesday, March 25, 2020 11:29 AM > To: dev@daffodil.apache.org <dev@daffodil.apache.org> > Subject: Re: Compiler.setExternalDFDLVariable(s) considered challenged > > To respond to my own thread here, I think given that we allow multiple > simultaneous calls to parse/unparse from different threads, we must say that > the DataProcessor object is immutable once parse or unparse are called. > > I suppose we could say that it is mutable, but behavior is undetermined if > any parse or unparse calls are active on any thread. > > But this is just asking for trouble IMHO. > > I think we started out with a stateful, non-thread capable API. The idea is > that one thread would be invoking a data processor at a time. A data procesor > was the state-block of an execution. > > The need to share compiled processor reloads, because the compile schemas > were expensive to create, tempted us to allow multiple parse/unparse calls on > different threads. > > Fact is, I think we should have said no to this, provided a > DataProcessor.clone() to create instances that shared the reloaded compiled > schema binary, but otherwise had separate state, and said that parse/unparse > were synchronized methods on their DP instance. > > Instead we're in a 1/2 way world where we don't have a thread-reasonable API > due to mutable state in what turns out to be a cross-thread shared object. > > ________________________________ > From: Beckerle, Mike <mbecke...@tresys.com> > Sent: Wednesday, March 25, 2020 10:55 AM > To: dev@daffodil.apache.org <dev@daffodil.apache.org> > Subject: Compiler.setExternalDFDLVariable(s) considered challenged > > Why does the API for Daffodil have > > Compiler.setExternalDFDLVariable(...) > and > Compiler.setExternalDFDLVariables(...) > > on it. > > I believe we should deprecate this. > > Compilers are parameterized by some of the tunables I understand. > > But the external DFDL variables? These cannot affect compilation. The schema > compiler needs to know statically the information about variables found in > the schema itself in the dfdl:defineVariable statement annotations. > > But the compiler doesn't need external variable bindings. In fact if it did > know and use them, it would be building assumptions into the compiled schema > that it shouldn't be building in. > > Setting external var bindings on the Compiler just causes problems when that > compiler instance is reused in a context where those settings aren't > appropriate. (JIRA DAFFODIL-2302 is one such problem) > > I believe setExternalDFDLVariable(s) methods should be deprecated, and > external variables bindings should be an optional argument to the > parse/unparse methods. > > The setters cause thread safety issues because the DP is stateful, even > though we want multiple calls to parse/unparse to be executable on different > threads. > > Consider: if we allow ordinary setExternalDFDLVariables and add a > resetExternalDFDLVariables to clear them, then imagine one wants to make two > parse calls on separate threads with different external variables bindings: > > so on main thread..... > dp.setExternalVariables(...bindings 1...) > spawn the thread 1 > on the thread 1 > dp.parse(....) > back on main thread > dp.resetExternalVariables() // race condition. Did the parse call read > the external variables before this reset or not? > dp.setExternalVariables(...bindings 2....) > ..... > > However, if we make the external variable bindings an argument to parse, we > avoid all of this. > > Alternatively, since DataProcessor has setExternalDFDLVariable, we can > prohibit multiple calls on the same DataProcessor object simultaneously. We > can provide a clone() method that preserves the loaded/reloaded processor, > but constructs another DataProcessor object, thereby allowing separate > external variables state per DataProcessor instance. > > > Comments? > > > > > > Mike Beckerle | Principal Engineer > [Owl Cyber Defense]<http://owlcyberdefense.com> > [cid:2a423dec-0558-414e-b369-14a24becc40f] is now a part of > Owl<https://owlcyberdefense.com/news/owl-cyber-defense-tresys-technology-announce-merger/> > P +1-781-330-0412 > W owlcyberdefense.com<http://www.owlcyberdefense.com> >