Hi Greg,

My exams are over now, so I can commit my full time to the project. I
have started compiling a personal TODO list. I agree with your advice to
start the project with small steps. (I have a copy of the code and am
trying to learn as much from it as I can.)

I would really appreciate your reply to Josh's thoughts. It would help me
understand the variety of tasks and a possible order in which to attempt
them.

Josh's comments:

"What would you list as the main things pgtune doesn't cover right now? I
have my own list, but I suspect that yours is somewhat different.

I do think that autotuning based on interrogating the database is
possible. However, I think the way to make it not be a tar baby is to
tackle it one setting at a time, and start with ones we have the most
information for. One of the real challenges there is that some data can be
gleaned from pg_* views, but a *lot* of useful performance data only shows
up in the activity log, and then only if certain settings are enabled."

Regards,
Shiv

On Thu, Apr 28, 2011 at 9:34 PM, Shiv <[email protected]> wrote:

> That's some great starting advice there. I have a couple of final exams in
> the next 36 hours. Will get to work almost immediately after that.
> I will definitely take small steps before going for some of the tougher
> tasks. I would of course like this conversation to go on, so I can see a
> more comprehensive TODO list.
> One of my first tasks on GSoC is to make sure I create a good project
> specification document, so there can be definite expectations and targets.
> This conversation helps me do that!
> Regards,
> Shiv
>
>
> On Thu, Apr 28, 2011 at 9:50 AM, Greg Smith <[email protected]> wrote:
>
>> Shiv wrote:
>>
>>> On the program I hope to learn as much about professional software
>>> engineering principles as PostgreSQL. My project is aimed towards extending
>>> and hopefully improving upon pgtune. If any of you have some ideas or
>>> thoughts to share, I am all ears!!
>>>
>>
>> Well, first step on the software engineering side is to get a copy of the
>> code in a form you can modify. I'd recommend grabbing it from
>> https://github.com/gregs1104/pgtune ; while there is a copy of the
>> program on git.postgresql.org, it's easier to work with the one on github
>> instead. I can push updates over to the copy on postgresql.org easily
>> enough, and that way you don't have to worry about getting an account on
>> that server.
>>
>> There's a long list of suggested improvements to make at
>> https://github.com/gregs1104/pgtune/blob/master/TODO
>>
>> Where I would recommend getting started is doing some of the small items
>> on there, some of which I have already put comments into the code about but
>> just not finished yet. Some examples:
>>
>> -Validate against min/max
>> -Show original value in output
>> -Limit shared memory use on Windows (see notes on shared_buffers at
>> http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server for more
>> information)
>> -Look for postgresql.conf file using PGDATA environment variable
>> -Look for settings files based on path of the pgtune executable
>> -Save settings reference files for newer versions of PostgreSQL (right
>> now I only target 8.4) and allow passing in the version you're configuring.
>>
>> A common mistake made by GSoC students is to dive right into trying to
>> make big changes. You'll be more successful if you get practice at things
>> like preparing and sharing patches on smaller changes first.
>>
>> At the next level, there are a few larger features that I would consider
>> valuable that are not really addressed by the program yet:
>>
>> -Estimate how much shared memory is used by the combination of settings.
>> See Table 17-2 at
>> http://www.postgresql.org/docs/9.0/static/kernel-resources.html ; those
>> numbers aren't perfect, and improving that table is its own useful project.
>> But it gives an idea how they fit together. I have some notes at the end
>> of the TODO file on how I think the information needed to produce this needs
>> to be passed around the inside of pgtune.
>>
>> -Use that estimate to produce a sysctl.conf file for one platform; Linux
>> is the easiest one to start with. I've attached a prototype showing how to
>> do that, written in bash.
>>
>> -Write a Python-TK or web-based front-end for the program.
>>
>> Now that I know someone is going to work on this program again, I'll see
>> what I can do to clean some parts of it up. There are a couple of things
>> it's easier for me to just fix rather than to describe, like the way I
>> really want to change how it adds comments to the settings it changes.
>>
>> --
>> Greg Smith 2ndQuadrant US [email protected] Baltimore, MD
>> PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
>>
>>
>>
>> #!/bin/bash
>>
>> # Output lines suitable for sysctl configuration based
>> # on total amount of RAM on the system. The output
>> # will allow up to 50% of physical memory to be allocated
>> # into shared memory.
>>
>> # On Linux, you can use it as follows (as root):
>> #
>> # ./shmsetup >> /etc/sysctl.conf
>> # sysctl -p
>>
>> # Early FreeBSD versions do not support the sysconf interface
>> # used here. The exact version where this works hasn't
>> # been confirmed yet.
>>
>> page_size=$(getconf PAGE_SIZE)
>> phys_pages=$(getconf _PHYS_PAGES)
>>
>> # Errors go to stderr so they never end up appended to sysctl.conf
>> if [ -z "$page_size" ]; then
>>   echo "Error: cannot determine page size" >&2
>>   exit 1
>> fi
>>
>> if [ -z "$phys_pages" ]; then
>>   echo "Error: cannot determine number of memory pages" >&2
>>   exit 2
>> fi
>>
>> # Half of physical memory, in pages and then in bytes
>> shmall=$((phys_pages / 2))
>> shmmax=$((shmall * page_size))
>>
>> echo "# Maximum shared segment size in bytes"
>> echo "kernel.shmmax = $shmmax"
>> echo "# Maximum total shared memory in pages"
>> echo "kernel.shmall = $shmall"
>>
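
As a first small step, the TODO item about looking for postgresql.conf
through the PGDATA environment variable could be sketched in bash roughly
as below. This is only an illustration of the lookup logic, not pgtune's
actual code: the function name find_pg_conf is hypothetical, and it
assumes postgresql.conf sits directly inside the PGDATA directory, which
is not true of every packaging layout.

```shell
#!/bin/bash

# Hypothetical helper (not part of pgtune): print the path of
# postgresql.conf located through the PGDATA environment variable.
# Assumes the file lives directly under $PGDATA.
find_pg_conf() {
  if [ -z "$PGDATA" ]; then
    echo "Error: PGDATA is not set" >&2
    return 1
  fi

  local conf="$PGDATA/postgresql.conf"
  if [ ! -f "$conf" ]; then
    echo "Error: $conf not found" >&2
    return 2
  fi

  # Only the path goes to stdout, so callers can capture it safely
  echo "$conf"
}
```

A caller could then write something like `conf=$(find_pg_conf) || exit 1`
and fall back to an explicit command-line argument when the lookup fails.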
