I opened the following Jira epic for this contribution: https://issues.apache.org/jira/browse/THRIFT-5443
On Mon, Jul 12, 2021 at 11:58 PM Duru Can Celasun <dcela...@apache.org> wrote:

> On Tue, 13 Jul 2021, at 01:07, Bhalchandra Pandit wrote:
> > Thanks for showing interest and for raising valid questions.
> >
> > I do not currently have a forked branch to show how it is done. However, I
> > can certainly make it happen. It will take me a couple of weeks or so to do
> > it. I will need to get myself familiarized with the contribution process.
>
> The short version is that you need to open a ticket on Jira [1] and then
> send a PR against the master branch on Github [2] with the title
> "THRIFT-NNNN: Your title". Everything else can be figured out later.
>
> [1] https://issues.apache.org/jira/projects/THRIFT/issues
> [2] https://github.com/apache/thrift/
>
> > To answer other questions:
> >
> >    1. The current implementation is in Java. However, it can also be easily
> >    ported to other languages. Happy to help in that direction as well.
> >    2. The solution does not modify the definition of any structure (eg,
> >    Payload => SlimPayload). The output instance has the same type regardless
> >    of whether we use full deserialization or partial deserialization. In that
> >    sense, it is 100% compatible in both directions (partial <=> full). The
> >    solution enables selectively deserializing any arbitrary nested subset.
> >
> > Kumar
> >
> > On Mon, Jul 12, 2021 at 4:10 PM Duru Can Celasun <dcela...@apache.org>
> > wrote:
> >
> > > This is definitely an interesting problem at scale and it'd be great to
> > > have a solution upstream.
> > >
> > > I second Yuxuan's questions. From the blog post it seems you have an
> > > implementation for Java, but it would be great to have at least one more.
> > >
> > > On Tue, 13 Jul 2021, at 00:00, Yuxuan Wang wrote:
> > > > Hi Bhalchandra,
> > > >
> > > > Do you have any open source code (e.g. forked thrift) to show how you
> > > > implemented it? The blog post stated what's the problem but really lack
> > > > information on the solution side:
> > > >
> > > > 1. How did you solve the problem?
> > > > 2. In which language(s) did you implement the solution?
> > > >
> > > > Without that information it's really not much we can do here.
> > > >
> > > > Also just based on the problem you described, a simple solution would be
> > > > just to duplicate the struct definition and remove the fields you don't
> > > > need, for example:
> > > >
> > > > // Original struct
> > > > struct Payload {
> > > >   1: optional Type1 field1,
> > > >   2: optional Type2 field2,
> > > >   3: optional Type3 field3,
> > > >   ...
> > > > }
> > > >
> > > > // Slim struct
> > > > struct SlimPayload {
> > > >   1: optional Type1 field1,
> > > >   // field2 removed because we don't care about it in this use case
> > > >   3: optional SlimType3 field3,
> > > >   ...
> > > > }
> > > >
> > > > But of course it's hard to keep SlimPayload in sync with the original
> > > > Payload so I can see there are some values to have some helpers to help
> > > > that, but as long as you don't make breaking changes into Payload "keep
> > > > them in sync" is a false problem.
> > > >
> > > > On Mon, Jul 12, 2021 at 12:56 PM Bhalchandra Pandit
> > > > <kpan...@pinterest.com.invalid> wrote:
> > > >
> > > > > Hi All,
> > > > > I work for Pinterest. I developed a technique for partial deserialization
> > > > > of Thrift that has been very useful in significantly improving efficiency
> > > > > of the data processing at Pinterest. I would like to contribute that
> > > > > feature to Apache Thrift. More details on this technique are available in
> > > > > this blog I recently wrote:
> > > > >
> > > > > https://medium.com/pinterest-engineering/improving-data-processing-efficiency-using-partial-deserialization-of-thrift-16bc3a4a38b4
> > > > >
> > > > > I would like to know if any of you are interested in helping with
> > > > > contributing this work to the main branch.
> > > > >
> > > > > Kumar