On Sat, Dec 19, 2009 at 03:42:32PM +0100, Sjoerd Mullender wrote:
> On 2009-12-19 15:28, Lefteris wrote:
> > On Sat, Dec 19, 2009 at 2:59 PM, Niels Nes <[email protected]> wrote:
> >> On Sat, Dec 19, 2009 at 12:21:09PM +0100, Wouter Alink wrote:
> >>> Lefteris, you are correct in that I meant 'second time the query was
> >>> run' when I wrote 'hot run'.
> >>>
> >>> I see that at the GDK level reuse cannot be estimated. Still, given
> >>> that current hardware has an abundance of memory, and that strings
> >>> take up much more storage than a single BUN (so a hash entry is
> >>> usually relatively small compared to its data), GDK might weigh the
> >>> additional costs differently. GDK also decides which things to keep
> >>> in memory and which to throw out, which in turn is also based on
> >>> reuse.
> >>> The cost of performing the initial join is dominated by the strHash
> >>> function, and building the hashtable on the big BAT or the smaller
> >>> BAT makes (almost) no difference, except for the additional memory
> >>> use. If another join is later performed on such a big BAT, it will
> >>> be beneficial to have the hashtable already in place.
> >>>
> >>> What I was hoping for were explanations of situations where it makes
> >>> no sense to build the hashtable on the bigger string BAT, but a good
> >>> counter-example I haven't seen. In general, I can see it would not
> >>> be beneficial if the big BAT is not joined twice, but if it doesn't
> >>> hurt too much, couldn't it just be the default?
> >>
> >> It cannot be the default, as hashtables are simply too big. Assume a
> >> single column which barely fits into your main memory. In that case
> >> the hashtable will not fit, leading to suboptimal performance.
> >>
> >> One global rule could be:
> >>   build the hash on the largest if it still fits into L2;
> >>   build the hash on the smallest once we run out of L2.
> >>
> >> Niels
> >>
> >
> > Fair enough, but if the big BAT fits in L2 then it is not so big, and
> > the small one is in L2 too; I think you will not see much difference
> > there, since the entire join will be almost in L2, right?
> >
> > Wouter, is your big string BAT sorted, while the small one is
> > unsorted? (That often happens to me, because the big one is persistent
> > and the small one is an intermediate result.) In such cases you can
> > also trigger a sort on the small one and then do a merge join. There
> > is some code in GDK for that, but it is too strict in my opinion.
>
> We should also consider building the hash table on both sides. Since
> the hashes have to be calculated, no matter what, it makes sense to
> store those values. This doesn't necessarily have to be done up front.
> The hashes for the second table could be stored while doing the scan
> for the join.
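(For illustration only: a standalone C sketch of the L2 rule quoted above,
deciding on which side of a hash join to build the hash table. This is not
GDK code; the cache size, the side_t struct and the sizes used are
assumptions.)

#include <stdio.h>
#include <stddef.h>

#define L2_BYTES (4u * 1024 * 1024)     /* assumed L2 cache size */

/* Hypothetical per-side statistics; real GDK keeps such information in the
   BAT descriptor. */
typedef struct {
    size_t count;   /* number of BUNs */
    size_t bytes;   /* payload size, including the string heap */
} side_t;

/* Return 1 to build the hash table on the larger side, 0 on the smaller. */
static int hash_on_larger(side_t small, side_t large)
{
    /* Rough hash-table footprint: a bucket plus a chain link per BUN. */
    size_t hash_bytes = large.count * 2 * sizeof(size_t);

    /* Build on the largest while hash table plus probe side still fit in
       L2; once we run out of L2, build on the smallest instead. */
    return hash_bytes + small.bytes <= L2_BYTES;
}

int main(void)
{
    side_t probe = { 26, 26 * 16 };                   /* the 26 lookup strings */
    side_t dict  = { 9000000, (size_t)9000000 * 16 }; /* 9M-entry dictionary  */
    printf("build hash on larger side: %s\n",
           hash_on_larger(probe, dict) ? "yes" : "no");
    return 0;
}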
All true if you have the space to store them, but having calculated a hash
does not make it free to store: it will cut the memory-to-cache bandwidth
by about a factor of 2.

Niels

> >
> > Lefteris
> >
> >>>
> >>> Eventually I would like to be using the SQL layer only. There would
> >>> be plenty of tables with string columns, and some will be joined
> >>> against. Should a MAL optimizer detect that I am about to join two
> >>> string BATs, that one BAT is bigger than the other and has many
> >>> different values, and therefore build a hashtable on the bigger one?
> >>> The MAL optimizer can only guess about my next query (although I
> >>> agree that it could do a better job at guessing), and calculating
> >>> heapsize/batsize seems to be an operation that is also difficult to
> >>> do at the MAL layer.
> >>>
> >>> Is really nobody in favour of changing the behaviour of joining
> >>> string BATs for large BATs with many different values? Well, then I
> >>> give up.
> >>> Wouter
> >>>
> >>>
> >>> 2009/12/18 Stefan Manegold <[email protected]>:
> >>>> Hi Wouter,
> >>>>
> >>>> in the lines of Lefteris' reply:
> >>>> for a single join with no hash table present a priori, the number
> >>>> of hash function calls is equal to the number of BUNs in both BATs;
> >>>> for each inner BUN the function needs to be called to build the
> >>>> hash table, for each outer BUN it needs to be called to probe the
> >>>> hash table. Hence, the pure hashing costs are independent of which
> >>>> BAT is inner and which outer.
> >>>> Given that, the reason to choose the smaller as inner is indeed to
> >>>> increase spatial locality (and thus performance) of the inherently
> >>>> random access while building and probing the hashtable.
> >>>>
> >>>> As Lefteris pointed out, the "operational optimization" in GDK is a
> >>>> pure peephole optimization dealing only with the very operation at
> >>>> hand. I.e., in general it cannot anticipate the benefits of future
> >>>> re-use of efforts, like investing in the (more expensive) building
> >>>> of a larger hash table to be able to re-use it in several later
> >>>> operations --- which IMHO is independent of the data type. Such
> >>>> decisions need to be made at higher levels, either in MAL
> >>>> optimizers or in the front-end that generates the MAL plan.
> >>>>
> >>>> Stefan
> >>>>
> >>>>
> >>>> On Fri, Dec 18, 2009 at 05:01:07PM +0100, Lefteris wrote:
> >>>>> Hi Wouter,
> >>>>>
> >>>>> Funny thing, I had the exact same problem and we were thinking
> >>>>> about this issue. The idea here is that this observation for
> >>>>> strings might not always be true, and it is a situation that
> >>>>> cannot always be determined at the kernel level. Correct me if I
> >>>>> am wrong, but your benefit on the query comes because the hash on
> >>>>> the large BAT is already there; that's why the second time you get
> >>>>> 0.01? You mention a hot run, so I assume the BAT is already there
> >>>>> with a hash index, whereas in the original situation the hash is
> >>>>> on the small BAT, so you don't benefit from the hot run. But
> >>>>> whether a big BAT of strings is going to be used again is unknown
> >>>>> at the GDK level. So I solved the problem by forcing the hash
> >>>>> index on the big BAT at a higher level (in Monet5), where it knows
> >>>>> something more about the application (in my case an RDF store).
> >>>>> Can you do that instead: force the hash index at a higher level
> >>>>> for your application?
> >>>>> If GDK sees a hash index already there, then it will choose it,
> >>>>> independent of the size.
> >>>>>
> >>>>> lefteris
> >>>>>
> >>>>> On Fri, Dec 18, 2009 at 4:22 PM, Wouter Alink <[email protected]>
> >>>>> wrote:
> >>>>>> Dear developers,
> >>>>>>
> >>>>>> I would like to propose a change in GDK and hear opinions. It is
> >>>>>> about the following issue:
> >>>>>>
> >>>>>> In the BATjoin code, if there is no possibility to do a fetch or
> >>>>>> merge join, a hashjoin is performed. A hashtable is created for
> >>>>>> the smallest BAT. The reasons (I could think of) for choosing the
> >>>>>> smallest BAT for the hashtable are that less space is required
> >>>>>> for the hashtable (which in turn causes fewer cache misses when
> >>>>>> doing a lookup) and also that the hash function used is assumed
> >>>>>> to be very inexpensive (it needs to be calculated for each item
> >>>>>> in the large BAT each time a join is performed).
> >>>>>> I can see that the hash function can be very efficient for data
> >>>>>> types without indirection, but I feel that for data types like
> >>>>>> strings this is in some cases a little different. If a string
> >>>>>> BAT, for example, contains many different values (i.e. is not a
> >>>>>> BAT which contains enumeration values), the hash function will
> >>>>>> not be inexpensive anymore (many cache misses), as each hash
> >>>>>> function call needs to hash a whole (arbitrarily long) string at
> >>>>>> an arbitrary place in the heap.
> >>>>>>
> >>>>>> Is it perhaps possible to specify that, when a BAT of type 'str'
> >>>>>> has many different values, a hashtable may be built on the large
> >>>>>> BAT instead of on the small BAT?
> >>>>>>
> >>>>>> The reason I ask this: I was analysing the costs of a query in
> >>>>>> which I had a few short strings (26 tuples, 1-column table:
> >>>>>> varchar) which I wanted to look up in a dictionary (9M tuples,
> >>>>>> 2-column table: int,varchar). "SELECT a.id FROM longlist AS a
> >>>>>> JOIN smalllist AS b ON a.strvalue=b.strvalue;"
> >>>>>> The result is a small list of integers (26 or fewer tuples). This
> >>>>>> operation currently takes roughly 1.5 seconds for a hot run,
> >>>>>> mostly due to 9M strHash operations. By applying the patch below,
> >>>>>> the execution time for a hot run dropped to 0.01 seconds. The
> >>>>>> performance gain comes from only having to perform strHash on the
> >>>>>> items in the small BAT once the hashtable for the large BAT has
> >>>>>> been created.
> >>>>>>
> >>>>>> Any suggestions whether such a change is useful? Which benchmarks
> >>>>>> will be influenced?
> >>>>>>
> >>>>>> I guess this code change is probably not useful for large string
> >>>>>> BATs with only a few different values, but perhaps a guess could
> >>>>>> be made at how diverse the strings in a BAT are (by taking a
> >>>>>> sample, or perhaps simply by looking at the ratio
> >>>>>> batsize/heapsize), and based on that determine whether to build
> >>>>>> it on the large or small BAT?
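(For illustration only: a standalone C sketch of the batsize/heapsize idea
mentioned just above, guessing whether a string column holds many different
values. The struct, the 8-byte threshold and the example numbers are
assumptions, not GDK code.)

#include <stdio.h>
#include <stddef.h>

/* Hypothetical column statistics; in GDK these would come from the BAT
   descriptor and its string heap, here they are plain parameters. */
typedef struct {
    size_t bun_count;   /* number of tuples in the BAT */
    size_t heap_bytes;  /* size of the string heap     */
} str_col_t;

/* Guess whether a string column holds "many different values": if the heap
   is large relative to the number of tuples, the strings are probably not
   a small enumeration. The 8-byte threshold is an arbitrary assumption. */
static int looks_diverse(str_col_t col)
{
    if (col.bun_count == 0)
        return 0;
    return col.heap_bytes / col.bun_count > 8;
}

int main(void)
{
    str_col_t dictionary = { 9000000, (size_t)9000000 * 20 }; /* varied strings */
    str_col_t enumlike   = { 9000000, 64 };                   /* a few codes    */
    printf("dictionary diverse: %d\n", looks_diverse(dictionary));
    printf("enum-like diverse:  %d\n", looks_diverse(enumlike));
    return 0;
}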
> >>>>>>
> >>>>>> Greetings,
> >>>>>> Wouter
> >>>>>>
> >>>>>>
> >>>>>> Index: src/gdk/gdk_relop.mx
> >>>>>> ===================================================================
> >>>>>> RCS file: /cvsroot/monetdb/MonetDB/src/gdk/gdk_relop.mx,v
> >>>>>> retrieving revision 1.167.2.4
> >>>>>> diff -u -r1.167.2.4 gdk_relop.mx
> >>>>>> --- src/gdk/gdk_relop.mx        20 Nov 2009 13:04:06 -0000      1.167.2.4
> >>>>>> +++ src/gdk/gdk_relop.mx        18 Dec 2009 14:59:13 -0000
> >>>>>> @@ -1232,7 +1232,12 @@
> >>>>>>  @-
> >>>>>>  hash join: the bread&butter join of monet
> >>>>>>  @c
> >>>>>> -       /* Simple rule, always build hash on the smallest */
> >>>>>> +       /* Simple rule, always build hash on the smallest,
> >>>>>> +          except when it is a string-join, then we do the opposite */
> >>>>>> +       if (swap && rcount < lcount && l->ttype == TYPE_str) {
> >>>>>> +               ALGODEBUG THRprintf(GDKout, "#BATjoin: BATmirror(BAThashjoin(BATmirror(r), BATmirror(l)," BUNFMT "));\n", estimate);
> >>>>>> +               return BATmirror(BAThashjoin(BATmirror(r), BATmirror(l), estimate));
> >>>>>> +       }
> >>>>>>         if (swap && rcount > lcount) {
> >>>>>>                 ALGODEBUG THRprintf(GDKout, "#BATjoin: BATmirror(BAThashjoin(BATmirror(r), BATmirror(l)," BUNFMT "));\n", estimate);
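(For illustration only: a self-contained C toy of the cost argument behind
the patch above, counting string-hash calls. It is not the GDK
implementation; the FNV-style hash, the table layout and the scaled-down
sizes are all made up. Once the hash table on the big side exists, a
repeated join only hashes the small side.)

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for strHash (FNV-1a); counts how often it is called. */
static unsigned long hash_calls;
static uint64_t toy_hash(const char *s)
{
    uint64_t h = 14695981039346656037ull;
    hash_calls++;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 1099511628211ull;
    }
    return h;
}

#define NBUCKETS 16384
typedef struct entry { const char *key; struct entry *next; } entry;

static void insert(entry **tab, entry *e, const char *key)
{
    uint64_t b = toy_hash(key) % NBUCKETS;
    e->key = key;
    e->next = tab[b];
    tab[b] = e;
}

static int probe(entry **tab, const char *key)
{
    for (entry *e = tab[toy_hash(key) % NBUCKETS]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return 1;
    return 0;
}

int main(void)
{
    enum { NBIG = 9000, NSMALL = 26 };   /* scaled-down stand-in for 9M vs. 26 */
    static char big[NBIG][16];
    static entry pool[NBIG];
    static entry *tab[NBUCKETS];
    const char *small[NSMALL];
    size_t i;

    for (i = 0; i < NBIG; i++)
        snprintf(big[i], sizeof big[i], "str-%zu", i);
    for (i = 0; i < NSMALL; i++)
        small[i] = big[i * 37];

    /* First join: build the hash table on the big side, probe with the small. */
    for (i = 0; i < NBIG; i++)
        insert(tab, &pool[i], big[i]);
    for (i = 0; i < NSMALL; i++)
        probe(tab, small[i]);
    printf("first join : %lu hash calls\n", hash_calls);

    /* Later join: the table on the big side is already there, so only the
       small side needs to be hashed again. */
    hash_calls = 0;
    for (i = 0; i < NSMALL; i++)
        probe(tab, small[i]);
    printf("second join: %lu hash calls\n", hash_calls);
    return 0;
}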
--
Niels Nes, Centrum Wiskunde & Informatica (CWI)
Science Park 123, 1098 XG Amsterdam, The Netherlands
room L3.14, phone ++31 20 592-4098
url: http://www.cwi.nl/~niels   e-mail: [email protected]
[email protected] https://lists.sourceforge.net/lists/listinfo/monetdb-developers
