On 11.08.2016, at 21:53, Richard Eckart de Castilho <[email protected]> wrote:
>
> On 11.08.2016, at 21:34, Marshall Schor <[email protected]> wrote:
>>
>> load ( InputStream casInputStream, TypeSystem compressedForm6originalTS) ?
>>
>> Form 6 is always "lenient", if it is passed an original (meaning the one that
>> corresponds to the serialized form) type system.
>>
>> Would this cover all the cases more directly (as an API)?
>
> That would defeat the content auto-detection on the input stream, wouldn't it?
> I'd have to reimplement the logic to detect if the stream is
> form 6 or something else. If it is form 6, I'd use your suggested
> method. If it is not, I'd use the regular method. Also doesn't sound
> very attractive.

Actually... I currently use COMPRESSED_FILTERED_TSI only in lenient mode.
So the line in DKPro actually always sets "leniently" to "true":

  SerialFormat format = CasIOUtils.load(bis, casMgr, aCAS, true);

So presently, introducing the signature that you suggest should indeed in
principle work.

> As it is implemented right now, form 6 is not always lenient:
>
> - If "leniently" is true, then it is loaded leniently.
> - If "leniently" is false and a TSI is specified, then the CAS is reinitialized.

I'm not entirely hung up on the way this is presently implemented.
Basically, "lenient" only says that FSes for types present in the
serialized CAS but not present in the target CAS can be dropped.
We never really agreed that "!lenient" amounts to simply overwriting
the TS in the CAS. So we should maybe drop that behavior... and if
we really want it, then we should probably introduce a second
parameter "reinit CAS"?
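
To make the distinction concrete, here is a rough, self-contained sketch
of how the two flags could be kept independent. It is not UIMA code:
the class LoadModeSketch, its load method, and the Result type are
hypothetical, type systems are modeled as plain sets of type names, and
failing on an unknown type when neither flag is set is my assumption,
not something we have agreed on.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of "lenient" and "reinit CAS" as separate switches. */
public class LoadModeSketch {

    /** Type system in effect after loading, plus the types of the FSes kept. */
    public static final class Result {
        public final Set<String> typeSystem;
        public final List<String> keptFsTypes;

        Result(Set<String> typeSystem, List<String> keptFsTypes) {
            this.typeSystem = typeSystem;
            this.keptFsTypes = keptFsTypes;
        }
    }

    public static Result load(List<String> serializedFsTypes,
                              Set<String> serializedTs,
                              Set<String> targetTs,
                              boolean lenient,
                              boolean reinitCas) {
        // "reinit CAS": the target adopts the type system of the serialized CAS.
        Set<String> effective =
                new HashSet<>(reinitCas ? serializedTs : targetTs);

        List<String> kept = new ArrayList<>();
        for (String type : serializedFsTypes) {
            if (effective.contains(type)) {
                kept.add(type);
            } else if (!lenient) {
                // Assumption: a strict load without reinit fails on unknown types.
                throw new IllegalStateException(
                        "Type not in target type system: " + type);
            }
            // lenient: the FS of the unknown type is silently dropped.
        }
        return new Result(effective, kept);
    }
}
```

With this split, "lenient" never touches the target type system, and
overwriting it becomes something the caller has to request explicitly.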

Cheers,
-- Richard