There was never a "custom cell type" in this story (sorry for misleading you).
Here is what actually happens: I add custom cells (to save memory) to a Put via
Put#add(Cell). The pseudocode for the custom cell is shown below.
{code}
class MyObject {
  // sharedBuffer, myRowOffset and myRowLength locate this object's row in shared memory
  Cell toCell() {
    return CellBuilderFactory.create(CellBuilderType.SHALLOW_COPY)
        .setRow(sharedBuffer, myRowOffset, myRowLength)
        // We call the IA.Private KeyValue.Type to get the valid code of Put
        .setType(KeyValue.Type.Put.getCode())
        // set other fields
        .build();
  }
}
put.add(myObject.toCell());
{code}
I then noticed that Put#add is not optimized for our heavy tables (a large
chunk of cells in a single row), so I also extended Put with some extra #add
methods that avoid resizing the underlying collection.
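
To make the "avoid resizing" idea concrete, here is a minimal, self-contained
sketch. It does not use the HBase classes: SharedRowCell and PresizedRowMutation
are hypothetical stand-ins for our custom cell and our extended Put, and their
method names are assumptions made up for illustration. The only point is that
the cell list is sized once up front, and that each cell references a slice of
the shared buffer instead of copying it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a custom cell: it references a slice of a shared
// buffer (offset + length) instead of owning a private copy of the row bytes.
final class SharedRowCell {
  final byte[] buffer;
  final int rowOffset;
  final int rowLength;

  SharedRowCell(byte[] buffer, int rowOffset, int rowLength) {
    if (rowOffset < 0 || rowLength < 0 || rowLength > buffer.length - rowOffset) {
      throw new IllegalArgumentException("row slice is outside the shared buffer");
    }
    this.buffer = buffer;
    this.rowOffset = rowOffset;
    this.rowLength = rowLength;
  }
}

// Hypothetical stand-in for the extended Put: the caller states how many cells
// the row will hold, so the backing list is allocated once instead of growing.
final class PresizedRowMutation {
  private final ArrayList<SharedRowCell> cells;

  PresizedRowMutation(int expectedCells) {
    this.cells = new ArrayList<>(expectedCells);
  }

  // Adds one cell; no reallocation happens while size stays within expectedCells.
  PresizedRowMutation add(SharedRowCell cell) {
    cells.add(cell);
    return this;
  }

  // Adds a whole batch, growing the backing array at most once.
  PresizedRowMutation addAll(List<SharedRowCell> batch) {
    cells.ensureCapacity(cells.size() + batch.size());
    cells.addAll(batch);
    return this;
  }

  int size() {
    return cells.size();
  }

  SharedRowCell get(int index) {
    return cells.get(index);
  }
}
```

With the real classes the same idea would live in a Put subclass; whether the
gain is measurable depends on how many cells a single row carries.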
That was the story: I am trying to reduce the cost of converting our objects to
Put/Cell. Another approach I mentioned is building a custom write path via an
Endpoint, but that is unrelated to this topic.
All the classes we use are listed below:
1) Cell -> IA.Public
2) CellBuilder -> IA.Public
3) CellBuilderFactory -> IA.Public
4) Put -> IA.Public
5) Put#add(Cell) -> IA.Public
6) KeyValue#Type -> IA.Private
That is why I want to make KeyValue#Type IA.Public.
--
Chia-Ping
On 2017-10-01 00:34, Andrew Purtell <[email protected]> wrote:
> Thanks for sharing these details. They are intriguing. If possible could you
> explain why the custom type is needed?
>
> Something has to be deployed on the server or the custom cell type isn't
> guaranteed to be handled correctly. It may work now by accident. I'm a
> little surprised a custom cell type doesn't cause an abort. Did you patch
> the code to handle it?
>
>
> > On Sep 30, 2017, at 1:06 AM, Chia-Ping Tsai <[email protected]> wrote:
> >
> > Thanks for the nice suggestions, Andrew. Sorry for the delayed response.
> > Busy today.
> >
> > The root reason we must build our own Cell on the client side is that the
> > data is located in shared memory, which is similar to MSLAB.
> >
> > You are right. We could use an attribute to carry our data, but a byte[] is
> > not acceptable because we can't specify the offset and length. In fact, the
> > endpoint is a better fit for our case because our object can be converted
> > directly to a PB object, and it is easy to use shared memory to manage
> > our object. However, it is easier and more readable to follow the regular
> > Put operation. All we have to do is build our own cell and extend Put.
> > Nothing has to be deployed on the server.
> >
> > I agree the custom cell is a low-level thing that should only be used by
> > advanced users. My concern is that the classes related to custom Cells have
> > different IA declarations. I'm fine with making them all IA.Private, but
> > building a custom cell may be a common case.
> >
> > --
> > Chia-Ping
> >
> >> On 2017-09-30 06:05, Andrew Purtell <[email protected]> wrote:
> >> Construct a normal put or delete or batch mutation, add whatever extra
> >> state you need in one or more operation attributes, and use a
> >> regionobserver to extend normal processing to handle the extra state. I'm
> >> curious what dispatching to extension code because of a custom cell type
> >> buys you over dispatching to extension code because of the presence of an
> >> attribute (or cell tag). For example, in security coprocessors we take
> >> attribute data and attach it to the cell using cell tags. Later we check
> >> for cell tag(s) to determine if we have to take special action when the
> >> cell is accessed by a scanner, or during some operations (e.g. appends or
> >> increments have to do extra handling for cell security tags).
> >>
> >>
> >> On Fri, Sep 29, 2017 at 2:43 PM, Chia-Ping Tsai <[email protected]>
> >> wrote:
> >>
> >>>> Instead of a custom cell, could you use a regular cell with a custom
> >>>> operation attribute (see OperationWithAttributes).
> >>> Pardon me, I didn't get what you said.
> >>>
> >>>
> >>>
> >>>> On 2017-09-30 04:31, Andrew Purtell <[email protected]> wrote:
> >>>> Instead of a custom cell, could you use a regular cell with a custom
> >>>> operation attribute (see OperationWithAttributes).
> >>>>
> >>>> On Fri, Sep 29, 2017 at 1:28 PM, Chia-Ping Tsai <[email protected]>
> >>> wrote:
> >>>>
> >>>>> The custom cell helps us reduce memory consumption. We don't have our own
> >>>>> serialization/deserialization mechanism, so transferring data from the
> >>>>> client to the server needs several conversion phases (user data ->
> >>>>> Put/Cell -> pb object). The cost of conversion is large when
> >>>>> transferring bulk data. In fact, we also have a custom mutation to
> >>>>> manage the memory usage of the inner cell collection.
> >>>>>
> >>>>>> On 2017-09-30 02:43, Andrew Purtell <[email protected]> wrote:
> >>>>>> What are the use cases for a custom cell? It seems a dangerously low level
> >>>>>> thing to attempt and perhaps we should unwind support for it. But perhaps
> >>>>>> there is a compelling justification.
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Sep 28, 2017 at 10:20 PM, Chia-Ping Tsai <
> >>> [email protected]>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Thanks for all the comments.
> >>>>>>>
> >>>>>>> The problem I want to resolve is that the valid type code should be
> >>>>>>> exposed as IA.Public. Otherwise, end users have to access an IA.Private
> >>>>>>> class to build a custom cell.
> >>>>>>>
> >>>>>>> For example, I have a use case that plays a streaming role in our
> >>>>>>> application. It uses CellBuilder (HBASE-18519) to build custom cells.
> >>>>>>> These cells share many fields, so they are kept in shared memory to
> >>>>>>> avoid GC pauses. Everything is wonderful. However, we have to access the
> >>>>>>> IA.Private class KeyValue#Type to get the valid code of Put.
> >>>>>>>
> >>>>>>> I believe there are many use cases for custom cells, and consequently it
> >>>>>>> is worth adding a way to get the valid type via an IA.Public class.
> >>>>>>> Otherwise, it may imply that custom cells rest on an unstable
> >>>>>>> foundation, because the related code can change at any time.
> >>>>>>> --
> >>>>>>> Chia-Ping
> >>>>>>>
> >>>>>>>> On 2017-09-29 00:49, Andrew Purtell <[email protected]> wrote:
> >>>>>>>> I agree with Stack. Was typing up a reply to Anoop but let me move it
> >>>>>>>> down here.
> >>>>>>>>
> >>>>>>>> The type code exposes some low level details of how our current stores
> >>>>>>>> are architected. But what if in the future you could swap out HStore
> >>>>>>>> implements Store with PStore implements Store, where HStore is backed by
> >>>>>>>> HFiles and PStore is backed by Parquet? Just as a hypothetical example.
> >>>>>>>> I know there would be larger issues if this were actually attempted.
> >>>>>>>> Bear with me. You can imagine some different new Store implementation
> >>>>>>>> that has some advantages but is not a design derived from the log
> >>>>>>>> structured merge tree if you like. Most values from a new Cell.Type
> >>>>>>>> based on KeyValue.Type wouldn't apply to cells from such a thing because
> >>>>>>>> they are particular to how LSMs work. I'm sure such a project if
> >>>>>>>> attempted would make a number of changes requiring a major version
> >>>>>>>> increment and low level details could be unwound from Cell then, but if
> >>>>>>>> we could avoid doing it in the first place, I think it would be better
> >>>>>>>> for maintainability.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On Thu, Sep 28, 2017 at 9:39 AM, Stack <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>> On Thu, Sep 28, 2017 at 2:25 AM, Chia-Ping Tsai <
> >>>>> [email protected]>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> hi folks,
> >>>>>>>>>>
> >>>>>>>>>> Users are allowed to create custom cells, but the valid type code,
> >>>>>>>>>> KeyValue#Type, is declared as IA.Private. As I see it, we should
> >>>>>>>>>> expose KeyValue#Type as Public Client. Three possible ways are shown
> >>>>>>>>>> below:
> >>>>>>>>>> 1) Change the declaration of KeyValue#Type from IA.Private to IA.Public
> >>>>>>>>>> 2) Move KeyValue#Type into Cell.
> >>>>>>>>>> 3) Move KeyValue#Type to an upper level
> >>>>>>>>>>
> >>>>>>>>>> Any suggestions?
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>> What is the problem that we are trying to solve Chia-Ping? You want to
> >>>>>>>>> make Cells of a new Type?
> >>>>>>>>>
> >>>>>>>>> My first reaction is that KV#Type is particular to the KV
> >>>>>>>>> implementation. Any new Cell implementation should not have to adopt
> >>>>>>>>> the KeyValue typing mechanism.
> >>>>>>>>>
> >>>>>>>>> S
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Chia-Ping
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Best regards,
> >>>>>>>> Andrew
> >>>>>>>>
> >>>>>>>> Words like orphans lost among the crosstalk, meaning torn from
> >>>>> truth's
> >>>>>>>> decrepit hands
> >>>>>>>> - A23, Crosstalk
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >>
>