Got it. Thanks to all of you!

On Sat, Apr 6, 2019 at 4:24 AM Karthikeyan Manivannan <kmanivan...@mapr.com> wrote:
> Hi Weijie,
>
> You are right. Before DRILL-6340 the purpose of the hasRemainder() logic
> was not clear. projector.projectRecords() always took in the
> incomingRowCount as the argument and returned the same value in
> non-exceptional paths. So I think the whole hasRemainder() path was dead
> code then. I did not investigate it further because I knew that under
> DRILL-6340 that code would definitely become necessary.
>
> Karthik
>
> On Fri, Apr 5, 2019 at 9:27 AM Sorabh Hamirwasia <sohami.apa...@gmail.com> wrote:
>
> > Hi Weijie,
> > I think the only case in which that line will be executed is if there is
> > a UDF, such as a flatten operation, that produces multiple rows for each
> > input row. Even though Flatten is currently a separate operator in Drill,
> > I think that code is there to handle such cases.
> >
> > Thanks,
> > Sorabh
> >
> > On Fri, Apr 5, 2019 at 6:08 AM weijie tong <tongweijie...@gmail.com> wrote:
> >
> > > The first appearance of the comparison code is in DRILL-620:
> > > https://github.com/apache/drill/commit/a2355d42dbff51b858fc28540915cf793f1c0fac#diff-e87beb3f2aa0fbc06b07b1d55c3d3536
> > > Before DRILL-6340, judging from the ProjectorTemplate's projectRecords
> > > method and its actual input parameter values, I think line 234 of
> > > ProjectRecordBatch could never be executed. Only since DRILL-6340,
> > > where we control the output batch memory size, has that part of the
> > > code finally come into use.
> > >
> > > If I am wrong, please let me know.
> > >
> > > On Fri, Apr 5, 2019 at 12:15 AM weijie tong <tongweijie...@gmail.com> wrote:
> > >
> > > > Thanks for the reply, but it seems the code was there even before
> > > > DRILL-6340.
> > > >
> > > > On Thu, Apr 4, 2019 at 10:45 PM Vova Vysotskyi <vvo...@gmail.com> wrote:
> > > >
> > > > > Hi Weijie,
> > > > >
> > > > > It is possible if maxOutputRecordCount (received from
> > > > > memoryManager.getOutputRowCount()) is less than incomingRecordCount.
> > > > > For more details please see DRILL-6340
> > > > > <https://issues.apache.org/jira/browse/DRILL-6340> and the design
> > > > > document
> > > > > <https://docs.google.com/document/d/1h0WsQsen6xqqAyyYSrtiAniQpVZGmQNQqC1I2DJaxAA/edit?usp=sharing>
> > > > > attached to that Jira.
> > > > >
> > > > > Kind regards,
> > > > > Volodymyr Vysotskyi
> > > > >
> > > > > On Thu, Apr 4, 2019 at 5:17 PM weijie tong <tongweijie...@gmail.com> wrote:
> > > > >
> > > > > > I have a doubt about the ProjectRecordBatch implementation and hope
> > > > > > someone could explain it. At line 234 of ProjectRecordBatch, in
> > > > > > what case is the projector's output row count less than the input
> > > > > > row count?
> > > > > >
> > > > > > On Thu, Apr 4, 2019 at 5:11 PM weijie tong <tongweijie...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Igor:
> > > > > > > That's a good idea! It could resolve that issue, so the basic
> > > > > > > question is solved. To use the official Arrow, there are still
> > > > > > > two issues that need to be contributed to Arrow, which I will do:
> > > > > > > 1. Make the gcc lib statically linked into the JNI dynamic lib.
> > > > > > >    Without this, the platform must have the right gcc version
> > > > > > >    installed.
> > > > > > > 2. Add a convertToNull function to Gandiva.
> > > > > > >    This would let project expressions containing the
> > > > > > >    convertToNull function be executed by Gandiva.
> > > > > > >
> > > > > > > Of course, even without these two issues solved, I could still
> > > > > > > provide an integration implementation.
> > > > > > >
> > > > > > > BTW, once the integration is done, how do we supply the Gandiva
> > > > > > > JNI lib? Leave it to users to build it, or supply distributions
> > > > > > > for different platforms?
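The behavior discussed above — projectRecords() capped by the memory manager's output row count, with a remainder carried over into the next output batch — can be sketched roughly as follows. This is a simplified, hypothetical illustration; the class and method names here are invented and are not the actual ProjectRecordBatch code.

```java
// Hypothetical sketch of the hasRemainder pattern: after DRILL-6340 the
// projector may emit fewer rows than it receives, because the memory
// manager caps the output batch size. All names below are invented.
public class RemainderSketch {

    // Pretend "projection": emits at most maxOutputRows rows per call,
    // starting at startIndex. Returns the number of rows projected.
    static int projectRecords(int startIndex, int incomingCount, int maxOutputRows) {
        return Math.min(incomingCount - startIndex, maxOutputRows);
    }

    // Counts how many output batches are needed to drain one incoming batch.
    static int batchesNeeded(int incomingCount, int maxOutputRows) {
        int index = 0;
        int batches = 0;
        while (index < incomingCount) {
            int projected = projectRecords(index, incomingCount, maxOutputRows);
            // hasRemainder would be true here whenever index + projected
            // is still less than incomingCount.
            index += projected;
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        // 10 incoming rows, memory manager allows 4 rows per output batch:
        System.out.println(batchesNeeded(10, 4)); // prints 3
    }
}
```

Before DRILL-6340 the cap always equaled the incoming row count, so the loop above would run exactly once and the remainder branch was unreachable — which matches the "dead code" observation in the thread.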
> > > > > > >
> > > > > > > On Thu, Apr 4, 2019 at 3:53 PM Igor Guzenko <ihor.huzenko....@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hello Weijie,
> > > > > > > >
> > > > > > > > Did you try to create the same package as in Arrow, but in
> > > > > > > > Drill, and use a wrapper class around the target to publish
> > > > > > > > the desired package-access methods?
> > > > > > > >
> > > > > > > > Thanks, Igor
> > > > > > > >
> > > > > > > > On Thu, Apr 4, 2019 at 9:51 AM weijie tong <tongweijie...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi:
> > > > > > > > >
> > > > > > > > > Gandiva is a subproject of Arrow. Using LLVM codegen and
> > > > > > > > > SIMD techniques, Gandiva can achieve better query
> > > > > > > > > performance. Arrow and Drill have similar columnar memory
> > > > > > > > > formats; the main difference now is the null
> > > > > > > > > representation. Arrow has also made great changes to the
> > > > > > > > > ValueVector. Adopting Arrow to replace Drill's ValueVector
> > > > > > > > > has been discussed before; that would be a big job. But
> > > > > > > > > leveraging Gandiva by working at the physical memory
> > > > > > > > > address level should be relatively little work.
> > > > > > > > >
> > > > > > > > > I have now done the integration work on our own branch by
> > > > > > > > > making some changes to the Arrow branch, and filed
> > > > > > > > > DRILL-7087 and ARROW-4819. The main change in ARROW-4819 is
> > > > > > > > > to make some package-level methods public, but the Arrow
> > > > > > > > > community does not seem to plan to accept this change.
> > > > > > > > > Their advice is to keep an Arrow branch.
> > > > > > > > >
> > > > > > > > > So what do you think?
> > > > > > > > >
> > > > > > > > > 1. Keep our own branch of Arrow.
> > > > > > > > > 2. Wait for the Arrow integration to be completed.
> > > > > > > > > Or some other ideas?
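The null-representation difference mentioned in the thread is that Drill's nullable vectors carry a separate "bits" vector with one byte per value (1 = set, 0 = null), while Arrow uses a validity bitmap with one bit per value (1 = valid, LSB-first). A rough conversion between the two can be sketched as below; the helper is hypothetical and is not actual Drill or Arrow API.

```java
// Hypothetical sketch of converting Drill-style per-value null bytes
// into an Arrow-style validity bitmap. Illustration only.
public class NullBitsSketch {

    // drillBits: one byte per value, 1 = not null, 0 = null.
    // Returns an Arrow-style bitmap: one bit per value, LSB-first.
    static byte[] drillBitsToArrowValidity(byte[] drillBits) {
        byte[] validity = new byte[(drillBits.length + 7) / 8];
        for (int i = 0; i < drillBits.length; i++) {
            if (drillBits[i] != 0) {
                validity[i / 8] |= (1 << (i % 8)); // set bit i, LSB-first
            }
        }
        return validity;
    }

    public static void main(String[] args) {
        // Values 0, 2, and 3 are valid; value 1 is null:
        byte[] validity = drillBitsToArrowValidity(new byte[]{1, 0, 1, 1});
        System.out.println(validity[0]); // prints 13 (binary 1101)
    }
}
```

Bridging this representation gap (per-value bytes vs. packed bitmaps) is one reason the Gandiva integration can work at the physical memory address level without a full ValueVector replacement.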