On Jan 7, 2021, 13:42 -0800, Florents Tselai <florents.tse...@gmail.com>, wrote: > Apologies for the shameless self-promotion :) > > Around a year ago I collected my thoughts on this topic. You can read the > post here Modern Data Practice and the SQL Tradition . > It looks like it resonated with a lot of folks in the community. HN > Discussion https://news.ycombinator.com/item?id=21482114 > > I would specifically underline the fact that the newer generation of > programmers & data pros (my former self included) don't really appreciate > things like triggers and server-side programming. Triggers and DB-side > functions are considered something like Assembly code. > > Not many neophytes have been shown with use cases why for example writing 2-3 > lines of PL/SQL can save you a huge overhead of back and forth and > environment set up to write the same thing in say Pandas. > > I would focus on Triggers, Indices on expressions, and time-related > functions. Probably on generated columns too. They may be considered a new > feature, but the reasoning of building successively columns on top of a few > base ones is quite appealing nowadays, especially for ML purposes. > > I also wouldn't hesitate to discuss advanced topics. They are usually > considered obscure because people don't present useful examples, rather toy > and arbitrary ones. > > In a recent O'Reilly training, I was skeptical of talking about triggers for > optimization but it looks like it was probably the most useful part of my > training as students could actually "steal and tweak" my code.
Thanks for this. May I steal some of your examples if they prove useful? I’ll credit you of course. I’m planning on somewhat emphasizing that a relational database is a logic engine. Viewed through this lens, a query or view is a “backward” implication and a trigger is a “forward” one. This leads to considering triggers (and the moral equivalent in external code) as requiring “truth maintenance”, and is a great way to think about when the database is the appropriate place for some bit of logic. > On Thu, Jan 7, 2021 at 11:21 PM Guyren Howe <guy...@gmail.com> wrote: > > On Jan 7, 2021, 13:07 -0800, Kevin Brannen <kbran...@efji.com>, wrote: > > > From: Guyren Howe <guy...@gmail.com> > > > > > > >Most folks, in my experience, who use relational databases don’t really > > > >understand the basic theory or even more important the why - the > > > >philosophy - of what a relational database is and how to get the most > > > >out of them. I see a lot of folks trying to use SQL in an imperative > > > >manner - make this temp table, then update it some, then make this other > > > >temp table, etc... > > > > Actually, I’m mostly going to talk about the relational model, rather than > > SQL. Our industry seems to always settle for third-best, and SQL is the > > worst of all the examples of this. The world desperately needs a good > > relational database based on a better query language — datalog, for example. > > > > I put up with SQL so I can use the relational model, and I think that > > understanding SQL has to start with that. > > > > Anyhow. > > > An example of this is that we have a report we're trying to write that I'd > > > like to think can be done in SQL, but I can't think of a way to do it. > > > Yet, > > > if I do the base query and pull the data back into my application, I can > > > do > > > the last bit with 3 lines of Perl very easily. The problem here revolves > > > around comparing a row of data to the previous row to know whether the > > > data > > > changed "significantly enough" to keep the new row. > > > > > > Another example is doing running totals. A couple of years ago I would > > > have > > > said SQL can't do that. Now I know about the OVER clause, something that I > > > would categorize as somewhat obscure, I can do it as needed. > > > > Actually, Window functions might be “advanced”, but are certainly not > > obscure. Your example sounds like it’s trivially solved with LAG(). > > > As Michael Lewis pointed out, large dataset can also cause you to choose > > > not > > > to use SQL in 1 big statemet for everything (i.e. advocating the use to > > > temp > > > tables). In some ways, using a CTE is a type of temp table, or at least I > > > view it as such. That allows a person to solve a problem in bite-sized > > > chunks. > > > I will agree that optimization can do it better at times, but the code > > > also has > > > to be maintained as well – a balancing act. > > > > This appears to be good advice with SQL Server, which I’m coming to learn > > has a fairly poor query optimizer. But I would have thought Postgres’s > > optimizer would usually use a temporary table where appropriate. > > > > Curious to hear if that’s wrong. > > > I think your idea is a good one, but I do hope you present that SQL can't > > > solve everything ... else why do we have plpgsql. :) You’re correct > > > though, > > > SQL isn’t used as much as it should be in many places. > > > > An important consideration will be when relational is inappropriate. My > > biggest goal, though, is to get folks to understand how much relations > > *can* do — far too many devs in my experience don’t use the power of SQL > > because they don’t understand it. > > > > Thanks for taking the time to give me this feedback.