rondagostino commented on PR #13280: URL: https://github.com/apache/kafka/pull/13280#issuecomment-1500342559
Thanks for the review, @cmccabe. No problem with the package change and the more precise import controls. Regarding the wrapper suggestion: It is true that the wrapper implementation in the PR as it stands right now always creates a new object when creating a new persistent collection (either by creating an empty/singleton or by adding/removing from an existing one). The overhead is low -- object creation generally isn't onerous in time or space in general, and the work created by having the additional object will be a very low percentage of the work that the underlying persistent collection has to do anyway. So it doesn't concern me very much in that regard, but I do agree we should avoid it it is unnecessary. PCollections and Paguro both support using the `java.util` interfaces directly on their implementation classes, but Vavr (or whatever else we might decide to abstract away like this) does not. Such libraries would require the use of a wrapper class as is proposed here. The `PersistentCollectionFactory` you proposed would still work -- it would just take wrapper instances and delegate to those, which would in turn delegate to the underlying wrapped persistent collection. So there would be 2 levels of indirection instead of 1 for those libraries (the factory always has to do the downcast and perform the delegation, but in the Vavr-like cases it would cast the input to a wrapper and delegate to that, which would then in turn do the delegation again to the underlying library object). But delegation is cheap, so I would have no problem with that. In short, being required to introduce a wrapper for some libraries doesn't necessarily require us to introduce one for PCollections. The last issue I can think of that we should discuss here is type safety. I would prefer to be able to know that something is in fact a persistent collection. Unfortunately there is no way to keep that distinction with your proposed factory definition. This drawback gives me pause. So I think we have a situation where the current PR introduces a small amount of overhead that is not expected to be significant but that we would prefer to eliminate of possible. But the cost is a lack of type safety -- the compiler cannot know if a collection is a persistent one or not. Is this a fair discussion of the tradeoffs? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org