On 10.08.2016, at 22:54, Richard Eckart de Castilho <r...@apache.org> wrote:
> 
> On 10.08.2016, at 22:37, Richard Eckart de Castilho <r...@apache.org> wrote:
>> 
>> On 05.08.2016, at 14:18, Richard Eckart de Castilho <r...@apache.org> wrote:
>>> 
>>> Ok, then I think there is agreement that we keep COMPRESSED_FILTERED 
>>> (form 6) and COMPRESSED_FILTERED_TSI (form 6 + TS). 
>> 
>> Hm, I don't see a COMPRESSED_FILTERED_TSI in the SerialFormat.
>> 
>>> But for the time being we only support lenient loading (filter on load) - 
>>> i.e. the TS in the serialized form corresponds to the original TS from the 
>>> CAS.
>> 
>> It looks like the following features are currently not supported by 
>> CasIOUtils:
>> 
>> - storing TS along with form 6 in a single file: I see no code path where a 
>> TS is stored in a COMPRESSED_FILTERED binary file and none where it is 
>> loaded from a COMPRESSED_FILTERED binary file
>> 
>> - lenient loading of COMPRESSED_FILTERED: I do not even see a path where a 
>> separately specified TS input stream is used when reading a 
>> COMPRESSED_FILTERED file. The TS only seems to be used when reading a 
>> SERIALIZED file.
>> 
>> Am I missing something? If we still miss the two features listed above, then 
>> we kind of lost them in translation. I am pretty sure they were there in the 
>> initial code that I provided. Why did we lose them?
> 
> Looks like they were removed in #1755237:
> 
> [UIMA-4685] refactoring, moving some common stuff into more core UIMA, 
> augmenting JavaDocs, removing compressed form 6 with type system and 
> definitions, correcting deserialization process when installing type system 
> (and index defs) - by using the core code paths for this.  Fixed the test 
> cases to account for some renaming and removal of compressed form 6 with type 
> sys. Made error message using standard UIMA msg things.
> 
> Can we undo the removal of COMPRESSED_FILTERED_TS from that commit please? 
> It is basically *the* core functionality from my perspective since it enables
> lenient loading of binary CASes.

We are also lacking a method with a signature like this:

  load(InputStream casInputStream, CASMgrSerializer tsi, CAS aCAS, boolean leniently)

Such a method is important when bulk-reading multiple CASes in order to avoid 
having to read the TSI over and over again.
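
To illustrate the bulk-reading case, here is a rough sketch of the pattern. It
is not actual UIMA code: ParsedTsi, SimpleCas, readTsi, utf and bulkLoad are
hypothetical stand-ins invented for this example, and the "serialized" data is
just a UTF string. Only the shape of load(...) follows the signature proposed
above, where the real method would take a CASMgrSerializer and a CAS.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BulkLoadSketch {

  /** Hypothetical stand-in for an already-parsed TSI (type system + index definitions). */
  static final class ParsedTsi {
    final String name;
    ParsedTsi(String name) { this.name = name; }
  }

  /** Hypothetical stand-in for a CAS; it only records what was loaded into it. */
  static final class SimpleCas {
    String payload;
    ParsedTsi loadedWith;
    boolean lenient;
  }

  /** Counts how often the TSI was parsed, to make the saving visible. */
  static int tsiParseCount = 0;

  /** Parses the TSI from a stream. This is the step we want to do only once. */
  static ParsedTsi readTsi(InputStream tsiInputStream) throws IOException {
    tsiParseCount++;
    return new ParsedTsi(new DataInputStream(tsiInputStream).readUTF());
  }

  /** Same parameter order as the proposed method, but on the stand-in types. */
  static void load(InputStream casInputStream, ParsedTsi tsi, SimpleCas aCAS,
      boolean leniently) throws IOException {
    aCAS.payload = new DataInputStream(casInputStream).readUTF();
    aCAS.loadedWith = tsi;      // the pre-parsed TSI is reused, not re-read
    aCAS.lenient = leniently;   // only recorded here; no filtering is modelled
  }

  /** Reads the TSI once, then loads every CAS with that same instance. */
  static List<SimpleCas> bulkLoad(byte[] tsiBytes, List<byte[]> casFiles)
      throws IOException {
    ParsedTsi tsi = readTsi(new ByteArrayInputStream(tsiBytes));
    List<SimpleCas> result = new ArrayList<>();
    for (byte[] casBytes : casFiles) {
      SimpleCas cas = new SimpleCas();
      load(new ByteArrayInputStream(casBytes), tsi, cas, true);
      result.add(cas);
    }
    return result;
  }

  /** Helper that produces fake serialized data for the example. */
  static byte[] utf(String s) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    new DataOutputStream(bos).writeUTF(s);
    return bos.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    List<SimpleCas> loaded = bulkLoad(utf("my-type-system"),
        Arrays.asList(utf("doc 1"), utf("doc 2"), utf("doc 3")));
    System.out.println(loaded.size() + " CASes loaded, TSI parsed "
        + tsiParseCount + " time(s)");
  }
}
```

The point is only that the caller owns the parsed TSI and passes it in, so the
cost of reading it does not grow with the number of CASes.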

-- Richard
