On Sat, 3 Nov 2007, Ted Byers wrote:

As one of these programmers, where is the best place to find the information I need to get it right...I ask you where I can learn what you believe a good DBA needs to know.

What a DBA should know in general is a little different from the question I think you want an answer to, which is "what should a programmer know so that they can effectively work like/without a DBA?"

There's an academic answer to that question. I could tell you to learn something about data normalization, indexing, what happens on the server when you join two tables, and how cursors work in your language of choice. But without some practice, I don't know how much of that would stick.

The most valuable exercise I think someone with a good programming background, but is relatively new to databases, can go through is to work on a dramatically larger data set than you would normally encounter. The main thing I've seen developers do wrong is writing code or designing tables that don't scale well. Since coding works better when you can get quick feedback after changes, it's very easy to settle into working with only small test cases, and that can turn into a serious problem when such code runs into the real world.

The only way to really understand how to think more like a DBA is to try and write something that works well against a big pile of data. To throw out some simple guidelines, you want to be working with a database that's at least 10X as big as the amount of RAM on your system, and if you do something that scans the full table like "select * from x" that should take at least a couple of minutes to complete.

Now, try to build a program that operates quickly on subsets of this data. Working on this scale will let you discover very quickly if you've made any of the blatant mistakes that programmers unfamiliar with working on full-size data sets tend to make. Most importantly, you don't ever want to write queries that assume it's OK to send all the data to the client to sort through, rather than pushing as much as possible toward the database server. Second, you'll end up needing to construct indexes properly to optimize the server side. And if there's more than one big table involved, you'll be forced to come to grips with how joins work and probably learn something about normalization.

You'll know you're learning something valuable whenver you run something that you expect to return almost instantly, but instead it churns away for minutes before finishing. Resist the urge to stop it too quickly, and instead spend that time thinking about what's gone wrong, or monitoring the client and/or server for clues.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

Reply via email to