[ 
https://issues.apache.org/jira/browse/LUCENE-10143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17424037#comment-17424037
 ] 

Uwe Schindler edited comment on LUCENE-10143 at 10/4/21, 5:02 PM:
------------------------------------------------------------------

Hi,
in general I am fine with the PR, my complaints were more about verbosity. So 
if we intend to start and rewrite all readPrimitive/writePrimitive in separate 
issues, the change is fine.

But the issue here was opened about {{RateLimitedIndexOutput}}. The only code 
to change to "fix" this issue is the following change: 
https://github.com/apache/lucene/pull/348/files#diff-b3a615b272ca539d83f1b48f760d2fe5fc51a6297f9b2e5b6d78b93e9440a57d
 - and we should do this in this issue; nothing more.

The issue is the other DataInputs/Outputs are not all necessarily "slow" (the 
name of protected slowWriteBytes() is IMHO really a horrible name, sorry!). 
Most of them optimize fairly easy by hotspot and I see no reason to optimize 
them. Adding now a call to the protected slowMethods() looks like a horrible 
API design to me, especially as IndexOutput/IndexInput are classes often 
implemented by 3rd party.

RateLimitedIndexOutput is special, because it does "heavy" (to some extent) 
checks on each written primitive or byte unit. By default all methods, also 
writeVInt/writeVLong are implemented and convert to a series of singular 
writes, each with a check. As the check is not easily optimizable (it depends 
on the current file position / number of written bytes), I think the design of 
RateLimitedIndexOutput should be rethought: For sure it has to delegate all 
methods to be efficient (also writeVInt/writeVLong/...), but just because of 
this we would not make all those methods abstract.

About the PR: We should really change the DataInput/Dataoutput API to clarify 
that we write Little Endian. A reference to the VarHandle or ByteBuffer as 
{{@see}} if a good hint, but we must specifiy how the method  must behave - if 
it is abstract!

One idea to maybe make it easier is to switch DataInput/DataOutput to be an 
interface with default methods. In the base interface (DataOutput), only the 
"hard" ones are implemented. But primitive writes aren't. For subclasses who 
are happy with default impl, we can add a sub-interface like DefaultDataOutput 
that has above slow impls ("implements SlowDataOutputDefaultImpl").


was (Author: thetaphi):
Hi,
in general I am fine with the PR, my complaints were more about verbosity. So 
if we intend to start and rewrite all readPrimitive/writePrimitive in separate 
issues, the change is fine.

But the issue here was opened about {{RateLimitedIndexOutput}}. The only code 
to change to "fix" this issue is the following change: 
https://github.com/apache/lucene/pull/348/files#diff-b3a615b272ca539d83f1b48f760d2fe5fc51a6297f9b2e5b6d78b93e9440a57d
 - and we should do this in this issue; nothing more.

The issue is the other DataInputs/Outputs are not all necessarily "slow" (the 
name of protected slowWriteBytes() is IMHO really a horrible name, sorry!). 
Most of them optimize fairly easy by hotspot and I see no reason to optimize 
them. Adding now a call to the protected sowMethods() looks like a horrible API 
design to me, especially as IndexOutput/IndexInput are classes often 
implemented by 3rd party.

RateLimitedIndexOutput is special, because it does "heavy" (to some extent) 
checks on each written primitive or byte unit. By default all methods, also 
writeVInt/writeVLong are implemented and convert to a series of singular 
writes, each with a check. As the check is not easily optimizable (it depends 
on the current file position / number of written bytes), I think the design of 
RateLimitedIndexOutput should be rethought: For sure it has to delegate all 
methods to be efficient (also writeVInt/writeVLong/...), but just because of 
this we would not make all those methods abstract.

About the PR: We should really change the DataInput/Dataoutput API to clarify 
that we write Little Endian. A reference to the VarHandle or ByteBuffer as 
{{@see}} if a good hint, but we must specifiy how the method  must behave - if 
it is abstract!

One idea to maybe make it easier is to switch DataInput/DataOutput to be an 
interface with default methods. In the base interface (DataOutput), only the 
"hard" ones are implemented. But primitive writes aren't. For subclasses who 
are happy with default impl, we can add a sub-interface like DefaultDataOutput 
that has above slow impls ("implements SlowDataOutputDefaultImpl").

> RateLimitedIndexOutput should delegate writeShort/writeInt/writeLong
> --------------------------------------------------------------------
>
>                 Key: LUCENE-10143
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10143
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Otherwise merges are not taking advantage of LUCENE-10125.
> cc [~uschindler]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to