Hi Nitin, As it turns out, I just had to fix a bug in the windowing operator. I'm not an expert on this operator, but perhaps I can offer a suggestion or two.
We have a few existing unit tests for window functions in TestWindowFrame. They are a bit hard to follow, however. Take a look at testFix3605(), which does: select col2, lead(col2) over(partition by col2 order by col0) as lead_col2 from dfs.`window/fewRowsAllData.parquet` When executed, we use the NoFrameSupportTemplate [1] class to do the work. Specifically, processPartition() contains code that handles the lead/lag by 1 case. I found it useful to enable saving of the generated code: [2]. When you step into the generated code, in Eclipse, you can set the source path to include /tmp/drill/codegen (on Linux/Mac). You can then see the contents of the generated copyPrev() and copyNext() functions. If you step into these generated functions, you can see that the code simply takes two indexes: a to and a from, then copies the data from one to the other. As a result, I suspect that you do not need to change the generated code to achieve your goal. Instead, you may want to change the processPartition() function. Instead of the simple +/-1 logic it currently has, use your lead/lag offset instead. By the way, a handy way to share work is simply to push your work to your private GitHub repo, then link to that code. Thanks, - Paul [1] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/NoFrameSupportTemplate.java#L139 [2] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java#L366 On Wednesday, November 6, 2019, 10:04:12 PM PST, Nitin Pawar <nitinpawar...@gmail.com> wrote: any help on this? On Tue, Nov 5, 2019 at 7:09 PM Nitin Pawar <nitinpawar...@gmail.com> wrote: > Ohh ok > let me provide a google drive url > Here > <https://drive.google.com/file/d/1no-xq40Q2HEoWx5xG1t5dOx5BKTFH72F/view?usp=sharing> > is the link. Can you check if can access it. > > Thanks, > Nitin > > On Tue, Nov 5, 2019 at 7:02 PM Charles Givre <cgi...@gmail.com> wrote: > >> Hi Nitin, >> It seems to have been filtered out. >> >> >> > On Nov 5, 2019, at 8:29 AM, Nitin Pawar <nitinpawar...@gmail.com> >> wrote: >> > >> > Hi Charles, >> > >> > I have attached git patch. >> > I was currently doing for lag function only for testing purposes >> > >> > Thanks, >> > Nitin >> > >> > On Tue, Nov 5, 2019 at 6:34 PM Charles Givre <cgi...@gmail.com <mailto: >> cgi...@gmail.com>> wrote: >> > Hi Nitin, >> > Thanks for your question. Could you/did you share your code? If not, >> could you please post a draft PR so that we can take a look and offer >> suggestions? >> > Thanks, >> > -- C >> > >> > >> > > On Nov 5, 2019, at 7:27 AM, Nitin Pawar <nitinpawar...@gmail.com >> <mailto:nitinpawar...@gmail.com>> wrote: >> > > >> > > Hi Devs, >> > > >> > > I had sent request for this almost 2.5 years ago. Trying it again now. >> > > >> > > Currently Apache drill window functions LEAD and LAG support offset >> as 1. >> > > In another words in a given window these functions can return either >> > > previous or next row only. >> > > >> > > >> > > I am trying modify the behavior these function and allow offset >=1 in >> > > query such as >> > > select employee_id, department_id,salary, lag(salary,*4*) >> over(partition by >> > > department_id order by salary asc) from cp.`employee.json`; >> > > >> > > I have managed to remove the limitation which fails the query can not >> have >> > > offset > 1 and able to pass the offset to actual function >> implementation. >> > > >> > > Currently I am stuck where the record processor is crossing the window >> > > boundary of department_id and gets row from next/previous window in >> > > lead/lag function >> > > >> > > For eg: If you notice in row 2 for department_id=2, it is getting >> previous >> > > windows of department_id=1 >> > > >> > > Here is sample output for below query >> > > apache drill> select employee_id, department_id,salary, lag(salary,4) >> > > over(partition by department_id order by salary asc) from >> > > cp.`employee.json` where department_id <=3; >> > > +-------------+---------------+---------+----------+ >> > > | employee_id | department_id | salary | EXPR$3 | >> > > +-------------+---------------+---------+----------+ >> > > | 20 | 1 | 30000.0 | null | >> > > | 5 | 1 | 35000.0 | null | >> > > | 22 | 1 | 35000.0 | null | >> > > | 21 | 1 | 35000.0 | null | >> > > | 2 | 1 | 40000.0 | 30000.0 | >> > > | 4 | 1 | 40000.0 | 35000.0 | >> > > | 1 | 1 | 80000.0 | 35000.0 | >> > > | 37 | 2 | 6700.0 | null | >> > > | 38 | 2 | 8000.0 | 40000.0 | >> > > | 39 | 2 | 10000.0 | 40000.0 | >> > > | 40 | 2 | 10000.0 | 80000.0 | >> > > | 6 | 2 | 25000.0 | 6700.0 | >> > > | 42 | 3 | 5000.0 | null | >> > > | 41 | 3 | 8500.0 | 10000.0 | >> > > | 7 | 3 | 15000.0 | 10000.0 | >> > > | 36 | 3 | 45000.0 | 25000.0 | >> > > +-------------+---------------+---------+----------+ >> > > >> > > >> > > Thanks, >> > > Nitin Pawar >> > >> > >> > >> > -- >> > Nitin Pawar >> >> > > -- > Nitin Pawar > -- Nitin Pawar