On Wed, May 13, 2020 at 8:04 PM Ashutosh Bapat <ashutosh.bapat....@gmail.com> wrote:
> On Fri, May 8, 2020 at 7:27 AM Andy Fan <zhihui.fan1...@gmail.com> wrote: > > >> + else if (inner_is_onerow) > >> + { > >> + /* Since rows in innerrel can't be duplicated AND if > >> innerrel is onerow, > >> + * the join result will be onerow also as well. Note: > >> onerow implies > >> + * multi_nullvals = false. > >> > >> I don't understand this comment. Why is there only one row in the other > >> relation which can join to this row? > > > > > > I guess you may miss the onerow special case if I understand correctly. > > inner_is_onerow means something like "SELECT xx FROM t1 where uk = 1". > > innerrel can't be duplicated means: t1.y = t2.pk; so the finally > result is onerow > > as well. One of the overall query is SELECT .. FROM t1, t2 where t2.y > = t2.pk; > > > > > > I explained more about onerow in the v7 README.unqiuekey document, just > copy > > it here. > For some reason this mail remained in my drafts without being sent. > Sending it now. Sorry. > > My impression about the one row stuff, is that there is too much > special casing around it. We should somehow structure the UniqueKey > data so that one row unique keys come naturally rather than special > cased. E.g every column in such a case is unique in the result so > create as many UniqueKeys are the number of columns This is the beginning state of the UniqueKey, later David suggested this as an optimization[1], I buy-in the idea and later I found it mean more than the original one [2], so I think onerow is needed actually. > or create one > unique key with no column as you have done but handle it more > gracefully rather than spreading it all over the place. I think this is what I do now, but it is possible that I spread it more than necessary, if so, please let me know. I maintained the README.uniquekey carefully since v7 and improved a lot in v8, it may be a good place to check it. > Also, the amount of code that these patches changes seems to be much > larger than the feature's worth arguably. But it indicates that we are > modifying/adding more code than necessary. Some of that code can be > merged into existing code which does similar things as I have pointed > out in my previous comment. > > I have reused the code select_mergejoin_clause rather than maintaining my own copies in v8. Thanks a lot about that suggestion. This happened mainly because I didn't read enough code. I will go through more to see if I have similar issues. > Thanks for working on the expanded scope of the initial feature you > proposed. But it makes the feature more useful, I think. > > That's mainly because your suggestions are always insightful which makes me willing to continue to work on it, so thank you all! === In fact, I was hesitated that how to reply an email when I send an new version of the patch. One idea is I should explain clear what is the difference between Vn and Vn-1. The other one is not many people read the Vn-1, so I would like to keep the email simplified and keep the README clearly to save the reviewer's time. Actually there are more changes in v8 than I stated above. for example: for the UniqueKey on baserelation, we will reduce the expr from the UniqueKey if the expr is a const. Unique on (A, B). query is SELECT b FROM t WHERE a = 1; in v7, the UniqueKey is (a, b). In v8, the UniqueKey is (b) only. But since most people still not start to read it, so I add such information to README rather than echo the same in email thread. I will try more to understand how to communicate more smooth. But any suggestion on this part is appreciated. Best Regards Andy Fan