The redbook [1] deserves a mention. It also has a chapter (collection of papers) dedicated to query optimization [2].
[1] http://www.redbook.io/
[2] http://www.redbook.io/ch7-queryoptimization.html

On Tue, Jan 22, 2019 at 4:16 AM Joel Pfaff wrote:
> Hello,
>
> Thanks for this initiative.
> A couple of years ago I found this page of links from Reynold Xin:
> https://github.com/rxin/db-readings
>
> And it is full of nice things.
>
> Regards, Joel
>
> On Tue, Jan 22, 2019 at 9:01 AM weijie tong wrote:
>
> > Hi Paul:
> > Thanks for sharing. I would like to share another good recent paper
> > here, "Everything you always wanted to know about compiled and
> > vectorized queries but were afraid to ask":
> > http://www.vldb.org/pvldb/vol11/p2209-kersten.pdf
> >
> > It explains the two kinds of database execution architecture:
> > vectorized and compiled. It can also answer the often-asked question
> > about the difference between Spark's whole-stage codegen and Drill's
> > codegen.
> >
> > On Tue, Jan 22, 2019 at 10:51 AM Paul Rogers wrote:
> >
> > > Hi All,
> > >
> > > Wanted to pass along some good foundational material about databases.
> > > We find ourselves immersed day-to-day in the details of Drill's
> > > implementation. It is helpful to occasionally step back and look at
> > > the larger DB tradition in which Drill resides. This material is
> > > especially good for anyone who didn't study DB theory in college.
> > >
> > > "Architecture of a Database System":
> > > http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf - By
> > > Stonebraker et al. While focused on "classic" DB systems, the ideas
> > > readily apply to "Big Data" distributed engines such as Drill. Walks
> > > through many of the basic architectural choices. You'll find yourself
> > > saying, "I see, Drill chose the shared-nothing, OS thread model but
> > > random heap allocation rather than a buffer pool." That is, you can
> > > see Drill's design choices in the context of the overall DB solution
> > > space.
> > >
> > > "Database Management Systems", 3e by Ramakrishnan & Gehrke. A
> > > textbook-length overview of DB theory. I used the second edition
> > > years ago to design and build a complete embedded hybrid DB and
> > > object store. I keep returning to the book any time I need a
> > > refresher on some topic or other.
> > >
> > > What other favorites do people have? Anyone know of any good
> > > references that explain the rule-based architecture of a planner such
> > > as Calcite? (R&G, 2e, mostly discuss the classic "dynamic
> > > programming" style of planner.)
> > >
> > > Thanks,
> > > - Paul
