The main difference is the moment you know you have to write the file.
If there's zero interpolation/filtering, then there's no need to write the file.
As soon as you detect interpolation/filtering you can start overwriting the 
original file.

The challenge is probably to fit this concept in the current codebase.
However, the result would be much more elegant and efficient than the 
temporary files.

Robert
 
On 30-4-2020 09:54:55, Rob Oxspring <[email protected]> wrote:

> On 30 Apr 2020, at 07:38, Robert Scholte wrote:
>
> I prefer to see an in memory solution.

Well if it’s reasonable to assume that filtered files are always small then we 
could replace the temporary file in my solution with an in memory buffer... 
but I’m not sure that’s what you’re shooting at?
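
A minimal sketch of the in-memory variant, assuming the filtered output has already been collected into a byte array. The class and method names are hypothetical and not part of the existing codebase; the target is only written when its content would actually change.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class InMemoryFilterWrite {

    /** Writes the filtered bytes to target only if they differ; returns true if it wrote. */
    static boolean writeIfDifferent(byte[] filtered, Path target) throws IOException {
        // Costs one full read of the existing target, all held in memory.
        if (Files.exists(target) && Arrays.equals(filtered, Files.readAllBytes(target))) {
            return false; // identical content: leave the file and its timestamp alone
        }
        Files.write(target, filtered);
        return true;
    }
}
```

This only works while both the filtered output and the target fit comfortably in memory.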

> Key should be to detect if filtering is applied, which is done in the 
> MultiDelimiterInterpolatorFilterReaderLineEnding[1]
> Once a value has been interpolated, you must rewrite the file, otherwise you 
> shouldn't.

Again though, this appears to miss the subtlety: “if filtering is applied” is 
insufficient, the condition needs to be “if filtering is applied with different 
results than the previous run”. This requires storing some state between runs, 
in one form or another.

We could scan the source file for filtered values and store just that state (or 
checksum) in a file between runs. The cost would be an extra read of the source 
file + state comparison + writing out the state of each filtering. Is this what 
you’re thinking?
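
A rough sketch of the checksum idea, with some simplifying assumptions: it hashes the whole filtered output rather than only the interpolated values, and it keeps the digest in a separate state file whose location is left to the caller. All names here are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class ChecksumState {

    /** Rewrites target only when the digest differs from the one stored by the previous run. */
    static boolean writeIfChecksumChanged(byte[] filtered, Path target, Path stateFile)
            throws IOException, NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(filtered);
        // The target must still exist, otherwise the stored state is stale.
        if (Files.exists(target) && Files.exists(stateFile)
                && Arrays.equals(digest, Files.readAllBytes(stateFile))) {
            return false; // same result as the previous run: nothing to write
        }
        Files.write(target, filtered);
        Files.write(stateFile, digest); // the extra write per filtered file
        return true;
    }
}
```

Note that this trusts the state file: a target edited by hand between runs would go unnoticed.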

The alternative is to just use the target file as the (not minimal) state. We 
could read and filter the source file once, while reading and comparing with 
the target file in parallel. As soon as the contents start to differ then 
truncate the target file and append the rest of the filtered source to it. The 
cost here would be an extra full read of the target. Is this what you’re after?
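
To make the second alternative concrete, here is a minimal sketch, assuming the filtered source is available as an InputStream and Java 9+ (for readNBytes and Arrays.mismatch). The class and method names are hypothetical; this is not how the existing code is structured.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class CompareAndTruncate {

    /** Streams filtered content over target, touching it only from the first difference on. */
    static boolean writeIfChanged(InputStream filtered, Path target) throws IOException {
        byte[] expected = new byte[8192];
        byte[] actual = new byte[8192];
        boolean changed = false;
        long pos = 0; // number of filtered bytes processed so far
        try (RandomAccessFile out = new RandomAccessFile(target.toFile(), "rw")) {
            int n;
            while ((n = filtered.readNBytes(expected, 0, expected.length)) > 0) {
                if (changed) {
                    out.write(expected, 0, n); // already diverged: just append
                } else {
                    int m = readUpTo(out, actual, n);
                    // -1 means both ranges are identical; otherwise index of first difference
                    int diff = Arrays.mismatch(expected, 0, n, actual, 0, m);
                    if (diff >= 0) {
                        changed = true;
                        out.setLength(pos + diff); // truncate at the first differing byte
                        out.seek(pos + diff);
                        out.write(expected, diff, n - diff);
                    }
                }
                pos += n;
            }
            if (out.length() > pos) { // old target had trailing content
                out.setLength(pos);
                changed = true;
            }
        }
        return changed;
    }

    // Reads up to len bytes, stopping early only at end of file.
    private static int readUpTo(RandomAccessFile in, byte[] buf, int len) throws IOException {
        int total = 0;
        while (total < len) {
            int r = in.read(buf, total, len - total);
            if (r < 0) break;
            total += r;
        }
        return total;
    }
}
```

When nothing differs the target is only read, never written, so its content is left untouched; the price is the extra full read of the target mentioned above.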

Otherwise I’m at a loss to understand what would be acceptable.

Thanks!

Rob

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
