A "custom cell type" was never part of the story. (Sorry for misleading you.)
Here is the story. I add some custom cells (to save memory) to a Put via Put#add(Cell). The pseudocode of the custom cell is shown below.

{code}
class MyObject {
  Cell toCell() {
    return CellBuilderFactory.create(CellBuilderType.SHALLOW_COPY)
        .setRow(sharedBuffer, myRowOffset, myRowLength)
        // We call the IA.Private class to get the valid code of Put
        .setType(KeyValue.Type.Put.getCode())
        // set other fields
        .build();
  }
}

put.add(myObject.toCell());
{code}

Then I noticed that Put#add is not optimized for our heavy table (a chunk of cells in a single row), so I also extended Put with some #add methods that avoid resizing the collection.

That was the story: I am trying to reduce the cost of converting our objects to Put/Cell. Another story I mentioned is building a custom write path via an Endpoint, but it is unrelated to this topic.

All the classes we use are shown below:
1) Cell -> IA.Public
2) CellBuilder -> IA.Public
3) CellBuilderFactory -> IA.Public
4) Put -> IA.Public
5) Put#add(Cell) -> IA.Public
6) KeyValue#Type -> IA.Private

That is why I want to make KeyValue#Type IA.Public.

--
Chia-Ping

On 2017-10-01 00:34, Andrew Purtell <andrew.purt...@gmail.com> wrote:
> Thanks for sharing these details. They are intriguing. If possible could you
> explain why the custom type is needed?
>
> Something has to be deployed on the server or the custom cell type isn't
> guaranteed to be handled correctly. It may work now by accident. I'm a
> little surprised a custom cell type doesn't cause an abort. Did you patch
> the code to handle it?
>
>
> > On Sep 30, 2017, at 1:06 AM, Chia-Ping Tsai <chia7...@apache.org> wrote:
> >
> > Thanks for the nice suggestions. Andrew. Sorry for delay response. Busy
> > today.
> >
> > The root reason we must build own Cell on client side is that the data are
> > located on shared memory which is similar with MSLAB.
> >
> > You are right. We can use attribute to carry our data but the byte[] is not
> > acceptable because we can't assign the offset and length.
> > In fact, the endpoint is a better way for our case because our object can
> > be directly converted to PB object. Also it is easy to apply shared memory
> > to manage our object. However, it will be easier and more readable to
> > follow regular Put operation. All we have to do is to build own cell and
> > extended Put. Nothing have to be deployed on server.
> >
> > I agree the custom cell is low level thing, and it should be used by
> > advanced users. What I concern is the classes related to custom Cell have
> > different IA declaration. I'm fine to make them IA.Private but building
> > the custom cell may be a common case.
> >
> > --
> > Chia-Ping
> >
> >> On 2017-09-30 06:05, Andrew Purtell <apurt...@apache.org> wrote:
> >> Construct a normal put or delete or batch mutation, add whatever extra
> >> state you need in one or more operation attributes, and use a
> >> regionobserver to extend normal processing to handle the extra state. I'm
> >> curious what dispatching to extension code because of a custom cell type
> >> buys you over dispatching to extension code because of the presence of an
> >> attribute (or cell tag). For example, in security coprocessors we take
> >> attribute data and attach it to the cell using cell tags. Later we check
> >> for cell tag(s) to determine if we have to take special action when the
> >> cell is accessed by a scanner, or during some operations (e.g. appends or
> >> increments have to do extra handling for cell security tags).
> >>
> >>
> >> On Fri, Sep 29, 2017 at 2:43 PM, Chia-Ping Tsai <chia7...@apache.org>
> >> wrote:
> >>
> >>>> Instead of a custom cell, could you use a regular cell with a custom
> >>>> operation attribute (see OperationWithAttributes).
> >>> Pardon me, I didn't get what you said.
> >>>
> >>>
> >>>
> >>>> On 2017-09-30 04:31, Andrew Purtell <apurt...@apache.org> wrote:
> >>>> Instead of a custom cell, could you use a regular cell with a custom
> >>>> operation attribute (see OperationWithAttributes).
> >>>>
> >>>> On Fri, Sep 29, 2017 at 1:28 PM, Chia-Ping Tsai <chia7...@apache.org>
> >>>> wrote:
> >>>>
> >>>>> The custom cell help us to save memory consumption. We don't have own
> >>>>> serialization/deserialization mechanism, hence to transform data from
> >>>>> client to server needs many conversion phase (user data -> Put/Cell ->
> >>>>> pb object). The cost of conversion is large in transferring bulk data.
> >>>>> In fact, we also have custom mutation to manage the memory usage of
> >>>>> inner cell collection.
> >>>>>
> >>>>>> On 2017-09-30 02:43, Andrew Purtell <apurt...@apache.org> wrote:
> >>>>>> What are the use cases for a custom cell? It seems a dangerously low
> >>>>>> level thing to attempt and perhaps we should unwind support for it.
> >>>>>> But perhaps there is a compelling justification.
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Sep 28, 2017 at 10:20 PM, Chia-Ping Tsai
> >>>>>> <chia7...@apache.org> wrote:
> >>>>>>
> >>>>>>> Thanks for all comment.
> >>>>>>>
> >>>>>>> The problem i want to resolve is the valid code should be exposed as
> >>>>>>> IA.Public. Otherwise, end user have to access the IA.Private class to
> >>>>>>> build the custom cell.
> >>>>>>>
> >>>>>>> For example, I have a use case which plays a streaming role in our
> >>>>>>> application. It applies the CellBuilder (HBASE-18519) to build custom
> >>>>>>> cells. These cells have many same fields so they are put in
> >>>>>>> shared-memory for avoiding GC pause. Everything is wonderful. However,
> >>>>>>> we have to access the IA.Private class - KeyValue#Type - to get the
> >>>>>>> valid code of Put.
> >>>>>>>
> >>>>>>> I believe there are many use cases of custom cell, and consequently
> >>>>>>> it is worth adding a way to get the valid type via IA.Public class.
> >>>>>>> Otherwise, it may imply that the custom cell is based on an unstable
> >>>>>>> way, because the related code can be changed at any time.
> >>>>>>> --
> >>>>>>> Chia-Ping
> >>>>>>>
> >>>>>>>> On 2017-09-29 00:49, Andrew Purtell <apurt...@apache.org> wrote:
> >>>>>>>> I agree with Stack. Was typing up a reply to Anoop but let me move
> >>>>>>>> it down here.
> >>>>>>>>
> >>>>>>>> The type code exposes some low level details of how our current
> >>>>>>>> stores are architected. But what if in the future you could swap out
> >>>>>>>> HStore implements Store with PStore implements Store, where HStore
> >>>>>>>> is backed by HFiles and PStore is backed by Parquet? Just as a
> >>>>>>>> hypothetical example. I know there would be larger issues if this
> >>>>>>>> were actually attempted. Bear with me. You can imagine some
> >>>>>>>> different new Store implementation that has some advantages but is
> >>>>>>>> not a design derived from the log structured merge tree if you like.
> >>>>>>>> Most values from a new Cell.Type based on KeyValue.Type wouldn't
> >>>>>>>> apply to cells from such a thing because they are particular to how
> >>>>>>>> LSMs work. I'm sure such a project if attempted would make a number
> >>>>>>>> of changes requiring a major version increment and low level details
> >>>>>>>> could be unwound from Cell then, but if we could avoid doing it in
> >>>>>>>> the first place, I think it would be better for maintainability.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On Thu, Sep 28, 2017 at 9:39 AM, Stack <st...@duboce.net> wrote:
> >>>>>>>>>
> >>>>>>>>> On Thu, Sep 28, 2017 at 2:25 AM, Chia-Ping Tsai
> >>>>>>>>> <chia7...@apache.org> wrote:
> >>>>>>>>>
> >>>>>>>>>> hi folks,
> >>>>>>>>>>
> >>>>>>>>>> User is allowed to create custom cell but the valid code of type -
> >>>>>>>>>> KeyValue#Type - is declared as IA.Private. As i see it, we should
> >>>>>>>>>> expose KeyValue#Type as Public Client. Three possible ways are
> >>>>>>>>>> shown below:
> >>>>>>>>>> 1) Change declaration of KeyValue#Type from IA.Private to IA.Public
> >>>>>>>>>> 2) Move KeyValue#Type into Cell.
> >>>>>>>>>> 3) Move KeyValue#Type to upper level
> >>>>>>>>>>
> >>>>>>>>>> Any suggestions?
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>> What is the problem that we are trying to solve Chia-Ping? You want
> >>>>>>>>> to make Cells of a new Type?
> >>>>>>>>>
> >>>>>>>>> My first reaction is that KV#Type is particular to the KV
> >>>>>>>>> implementation. Any new Cell implementation should not have to
> >>>>>>>>> adopt the KeyValue typing mechanism.
> >>>>>>>>>
> >>>>>>>>> S
> >>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Chia-Ping
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Best regards,
> >>>>>>>> Andrew
> >>>>>>>>
> >>>>>>>> Words like orphans lost among the crosstalk, meaning torn from
> >>>>>>>> truth's decrepit hands
> >>>>>>>>    - A23, Crosstalk
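
To make the shared-buffer point from this thread concrete (a plain byte[] attribute cannot carry an offset and length, whereas a builder call such as setRow(buffer, offset, length) can), here is a minimal sketch in plain Java with no HBase dependency. RowView and pack are hypothetical names invented for illustration; they are not HBase classes and only mimic the "array + offset + length" shape of a shallow-copy cell.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustration only: views over one shared array instead of one array per row. */
public class SharedBufferSketch {

  /** Hypothetical stand-in for the row part of a cell: backing array, offset, length. */
  static final class RowView {
    final byte[] buffer;
    final int offset;
    final int length;

    RowView(byte[] buffer, int offset, int length) {
      if (offset < 0 || length < 0 || length > buffer.length - offset) {
        throw new IllegalArgumentException("view is outside the backing array");
      }
      this.buffer = buffer;
      this.offset = offset;
      this.length = length;
    }

    /** Decodes the viewed bytes; the backing array itself is never copied per view. */
    String asString() {
      return new String(buffer, offset, length, StandardCharsets.UTF_8);
    }
  }

  /** Packs all row keys into a single shared array and returns one view per row. */
  static List<RowView> pack(String... rows) {
    byte[][] encoded = new byte[rows.length][];
    int total = 0;
    for (int i = 0; i < rows.length; i++) {
      encoded[i] = rows[i].getBytes(StandardCharsets.UTF_8);
      total += encoded[i].length;
    }
    byte[] shared = new byte[total];
    List<RowView> views = new ArrayList<>(rows.length);
    int offset = 0;
    for (byte[] row : encoded) {
      System.arraycopy(row, 0, shared, offset, row.length);
      views.add(new RowView(shared, offset, row.length));
      offset += row.length;
    }
    return views;
  }

  public static void main(String[] args) {
    for (RowView view : pack("row-a", "row-bb")) {
      System.out.println(view.asString() + " @" + view.offset + "+" + view.length);
    }
  }
}
```

Every view returned by pack references the same backing array, which is the property a whole-array byte[] attribute cannot express.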