where do you see insert-in-the-middle/replace being commonly used? lineage tracing, as you call it, would probably be implemented by way of:
1. every "stop" along the way appending itself (at the end) 2. some replication technologies, instead of just doing #1, may clear out everything when they replicate (starting from a clean slate) On Mon, Mar 6, 2017 at 11:00 AM, Colin McCabe <cmcc...@apache.org> wrote: > As others have mentioned, it seems clear that we want to preserve the > ordering of message headers, so that we can implement things like > lineage tracing. (For example, each stage could add a "lineage:" > header.) I also think that we want the ability to add and remove > headers as needed. It would be really unfortunate if the only way to > remove a message header or add a header at a certain position was the > duplicate the whole message and re-create everything. > > So how about implementing ListIterator? > https://docs.oracle.com/javase/7/docs/api/java/util/ListIterator.html > It supports adding and removing things at arbitrary positions. For > people who want to use it as a simple Iterator, it is one (and you can > use all the fancy syntax such as Java's foreach with it). For people > who want to add and remove things at arbitrary locations, they can. And > it doesn't expose the implementation, so that can be changed later. We > can materialize things in memory lazily if we want to, and so forth. I > think using the standard interface is better than rolling our own > nonstandard collection or iterator interfaces. > > regards, > Colin > > > On Wed, Mar 1, 2017, at 12:59, Becket Qin wrote: > > Hi Ismael, > > > > Thanks for the reply. Please see the comments inline. > > > > On Wed, Mar 1, 2017 at 6:47 AM, Ismael Juma <ism...@juma.me.uk> wrote: > > > > > Hi Becket, > > > > > > Thanks for sharing your thoughts. More inline. > > > > > > On Wed, Mar 1, 2017 at 2:54 AM, Becket Qin <becket....@gmail.com> > wrote: > > > > > > > As you can imagine if the ProducerRecord has a value as a List and > the > > > > Interceptor.onSend() can actually add an element to the List. If the > > > > producer.send() is called on the same ProducerRecord again, the value > > > list > > > > would have one more element than the previously sent ProducerRecord > even > > > > though the ProducerRecord itself is not mutable, right? Same thing > can > > > > apply to any modifiable class type. > > > > > > > > > > The difference is that the user chooses the value type. They are free > to > > > choose a mutable or immutable type. A generic interceptor cannot > mutate the > > > value because it doesn't know the type (and it could be immutable). One > > > could write an interceptor that checked the type of the value at > runtime > > > and did things based on that. But again, the user who creates the > record is > > > in control. > > > > > But there is no generic interceptor, right? The interceptor always takes > > specific K, V type. > > > > > > > From this standpoint allowing headers to be mutable doesn't really > weaken > > > > the mutability we already have. Admittedly a mutable header is kind > of > > > > guiding user towards to change the headers in the existing object > instead > > > > of creating a new one. > > > > > > > > > Yes, with headers, we are providing a model for the user (the user > doesn't > > > get to choose it like with keys and values) and for the interceptors. > So, I > > > think it's not the same. > > > > > > > > > > > But I think reusing an object while it is possible > > > > to be modified by user code is a risk that users themselves are > willing > > > to > > > > take. And we do not really protect against that. > > > > > > > > > If users want to take that risk, it's fine. But we give them the > option to > > > protect themselves. With mutable headers, there is no option. > > > > If we want to let the users control the mutability, users can always call > > headers.close() before calling producer.send() and that will force the > > interceptor to create new ProducerRecord object. > > > > Because the headers are mostly useful for interceptors, unless the users > > do > > not want the interceptors to change their records, it seems reasonable to > > say that by default modification of headers are allowed for the > > interceptors. > > > > > > > > > > > > But there still seems > > > > value to allow the users to not pay the overhead of creating tons of > > > > objects if they do not reuse an object to send it twice, which is > > > probably > > > > a more common case. > > > > > > > > > > I think the assumption that there would be tons of objects created is > > > incorrect (I suggested a solution that would only involve one > additional > > > reference in the `Header` instance). The usability of the immutable API > > > seemed to be more of an issue. > > > > > If we do not allow the users to add headers on existing ProducerRecord > > objects, each interceptor who wants to add headers will have to create a > > new ProducerRecord. So we will have to create NUM_INTERCEPTOR times of > > ProducerRecord, if a producer is sending 100K messages per second, it > > would > > be a lot of new objects, right? > > > > > > > > In any case, if we do add the `close()` method, then we need to add a > note > > > to the compatibility section stating that once a producer record is > sent, > > > it cannot be sent again as this would cause interceptors that add > headers > > > to fail. > > > > > Agreed, clear documentation is important. > > > > > > > > Ismael > > > >