Re: [GENERAL] postgres generates too much processes per minute
On Thu, Jun 19, 2008 at 8:17 AM, Bill Moran <[EMAIL PROTECTED]> wrote:
> In response to "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>:
>
>> Hello,
>>
>> i have the problem that postgres ist starting and stopping several (up
>> to 4) processes per minute, so that the error log in windows is
>> running full, with more than 14 entries every minute.
>>
>> Does someone know, how to reduce the start and the end of so many
>> processes, is there a variable or something ?
>
> A process is started for every connection and ends when the connection
> is closed. If you're opening/closing connections frequently, you'll
> see this. It's by design.

You might consider using a connection pooler as well. The point of such a
thing is to allow applications requiring lots of connections to share a
smaller set of real connections to the database by proxying connections.
Look for pgBouncer and pgPool as example connection pooling applications.

- Josh / eggyknap

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
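For the archives, a minimal pgBouncer setup might look like the following sketch. The database name, paths, and pool size are invented for illustration, and I haven't tested this exact file; consult the pgBouncer documentation before using it.

```ini
; pgbouncer.ini -- hypothetical minimal configuration
[databases]
; clients connect to pgbouncer on port 6432; it proxies to the real server
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; transaction pooling releases a server connection at each commit/rollback
pool_mode = transaction
; cap on real connections to the backend, regardless of client count
default_pool_size = 20
```

The application then points at port 6432 instead of 5432, and pgBouncer keeps the number of real backend processes (and hence log entries) down.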
Re: [GENERAL] Server not listening
On Mon, May 12, 2008 at 4:53 PM, D Galen <[EMAIL PROTECTED]> wrote:
> If this isn't the right place to post this, please advise.
>
> I've spent a week trying to get PostgreSQL 8.3 to install correctly on
> WIN2K. Server will load & I see the server processes loaded but none of
> them have any open ports. I keep getting the message the server isn't
> listening. Server set up to connect to default port 5432 on localhost but
> doesn't appear to be opening the port when it loads.
>
> Any help would be welcome.
> Thanks,
>
> Dennis

Well, easy ones first, I guess: what does postgresql.conf look like,
specifically "listen_addresses" and "port"? What are their values, and are
they commented out?

- Josh / eggyknap

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Problem returning strings with pgsql 8.3.x
While developing PL/LOLCODE, I've found something wrong with returning
strings from LOLCODE functions using 8.3.0 or greater. Using 8.4beta from a
few days ago, for instance, a function that should return "test string"
returns "\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F" in
pgsql (sometimes the number of \x7F characters varies). In 8.2.4 it works
fine.

Here's the code involved, from pl_lolcode_call_handler, the call handler
function for PL/LOLCODE. First, the bit that finds the FmgrInfo structure
and typioparam for the result type:

    procTup = SearchSysCache(PROCOID,
                             ObjectIdGetDatum(fcinfo->flinfo->fn_oid),
                             0, 0, 0);
    if (!HeapTupleIsValid(procTup))
        elog(ERROR, "Cache lookup failed for procedure %u",
             fcinfo->flinfo->fn_oid);
    procStruct = (Form_pg_proc) GETSTRUCT(procTup);

    typeTup = SearchSysCache(TYPEOID,
                             ObjectIdGetDatum(procStruct->prorettype),
                             0, 0, 0);
    if (!HeapTupleIsValid(typeTup))
        elog(ERROR, "Cache lookup failed for type %u",
             procStruct->prorettype);
    typeStruct = (Form_pg_type) GETSTRUCT(typeTup);
    resultTypeIOParam = getTypeIOParam(typeTup);
    fmgr_info_cxt(typeStruct->typinput, &flinfo,
                  TopMemoryContext); /* CurTransactionContext); */
    ReleaseSysCache(typeTup);

Here's the code that converts the return value into a Datum later on in the
function:

    if (returnTypeOID != VOIDOID)
    {
        if (returnVal != NULL)
        {
            if (returnVal->type == ident_NOOB)
                fcinfo->isnull = true;
            else
            {
                SPI_push();
                if (returnTypeOID == BOOLOID)
                    retval = InputFunctionCall(&flinfo,
                                 lolVarGetTroof(returnVal) == lolWIN ?
                                     "TRUE" : "FALSE",
                                 resultTypeIOParam, -1);
                else
                {
                    /* elog(NOTICE, lolVarGetString(returnVal, true)); */
                    retval = InputFunctionCall(&flinfo,
                                 lolVarGetString(returnVal, true),
                                 resultTypeIOParam, -1);
                }
                SPI_pop();
            }
        }
        else
        {
            fcinfo->isnull = true;
        }
    }

    SPI_finish();
    /* elog(NOTICE, "PL/LOLCODE ending"); */
    return retval;

returnVal is an instance of the struct PL/LOLCODE uses to store its
variables. The key line in this case is the one after the commented-out
call to elog. retval is a Datum; lolVarGetString() returns the string value
the returnVal struct represents -- I'm certain of that thanks to gdb and
other testing. All other data types PL/LOLCODE knows about internally seem
to return just fine. I'm fairly certain I'm screwing up memory somewhere,
but I can't see what I've done wrong.

I'm glad to provide further details, but those included above are all the
ones I thought were relevant. Thanks in advance for any help you can
provide.

- Josh / eggyknap

Note: The -hackers list seemed like the place for this post, but its list
description gives instructions to try another list first, hence the post
here.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using PL/R for predictive analysis of data.
On Mon, Mar 17, 2008 at 2:27 AM, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Hi Sam,
>
> Thankyou for the suggestions. They make perfect sense to me. I
> appreciate your time and input. The lack of optimiser usage was
> something that I had not considered, and I thank you for making me
> aware of it.
>
> Cheers
>
> The Frog

On the subject of the planner and optimizer, as of 8.3 (I think it's new to
8.3...) you can tell the planner somewhat about how it might expect your
function to behave. See
http://www.postgresql.org/docs/8.3/interactive/sql-createfunction.html

- Josh/eggyknap

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
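For instance, the volatility categories plus the COST and ROWS clauses on CREATE FUNCTION let the planner estimate what a function will cost. The function below is an invented example, not anything from the thread:

```sql
-- Hypothetical example: promise the planner this function is cheap,
-- side-effect free, and returns the same output for the same input.
CREATE FUNCTION celsius_to_fahrenheit(c numeric) RETURNS numeric AS $$
    SELECT $1 * 9 / 5 + 32;
$$ LANGUAGE sql
IMMUTABLE                 -- never reads the database; safe to pre-evaluate
RETURNS NULL ON NULL INPUT
COST 1;                   -- cheap relative to the plpgsql default of 100

-- A set-returning function can also hint at how many rows it produces:
-- CREATE FUNCTION ... RETURNS SETOF record ... ROWS 1000;
```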
[GENERAL] Plotting with PL/R on *NIX - A HOWTO
People tend to like the idea of PL/R, because R draws some awfully nice
(and frequently useful) eye candy, and it sounds cool to be able to make
such things from inside PostgreSQL. In my (probably very limited)
experience those people often find that it's harder than it looks to make
these plots work, and if Google and my ability to come up with applicable
search terms are any indication, instructions are not readily available.
Hence this HOWTO. I hope I'll be forgiven for the cross post.

This HOWTO covers two topics: "How to get PL/R to plot something other
than PDFs", and "How to make a function return the image data".

HOW TO GET PL/R TO PLOT SOMETHING OTHER THAN PDFS

The PL/R initiate will often try to create a plot as follows:

CREATE FUNCTION plr_plot_simple() RETURNS VOID AS $$
    png('test.png')
    plot(c(1:10))
    dev.off()
$$ LANGUAGE plr;

Simple enough, but when called, this happens:

eggyknap=# select plr_plot_simple();
ERROR:  R interpreter expression evaluation error
DETAIL:  Error in X11(paste("png::", filename, sep = ""), width, height,
pointsize,  : unable to start device PNG
CONTEXT:  In PL/R function plr_plot_simple

This means R is trying to connect to an X server, and can't. R requires an
X server to handle most plot devices, and PostgreSQL typically doesn't
have a DISPLAY environment variable set to tell it where to find such a
server. R's built-in pdf() device is a notable exception, but the PDFs R
will create when asked to plot anything particularly complex are very
large, and slow to render, making them an unattractive option.

The answer is to tell PostgreSQL where to find an X server, and here there
are two alternatives: connect to an existing X server, or make a new one.
The casual desktop experimenter will probably have an X server running on
the same machine as PostgreSQL, and can simply set the DISPLAY variable
PostgreSQL uses to point to this existing server.
Production PostgreSQL servers run X servers less commonly, so an
alternative is to use Xvfb, a "virtual" X server designed for just such an
occasion. This is simple enough:

$ Xvfb :5 -screen 0 1024x768x24

This will start Xvfb as server 5, defining screen 0 on that server as
having a resolution of 1024x768 with 24-bit color depth. Users may also
need to configure Xvfb's authorization properly, using the -ac option
(allow all hosts to connect to this server), or the -auth option, whose
argument is a file that contains a list of hosts allowed to connect to
this Xvfb server. If this is running on the PostgreSQL server itself, most
likely this file just contains "localhost" on one line.

The PostgreSQL server will have to know where to connect to this server;
for the Xvfb command line given above, and assuming Xvfb runs on the same
server as PostgreSQL, the DISPLAY variable is ":5.0". Most of the time, a
script in /etc/init.d calls pg_ctl to start the database when the system
boots, and the DISPLAY variable can be set there. Users of Debian packages
for PostgreSQL may find that the packaging abstracts pg_ctl far enough
away that it's difficult to tell in what script the variable should be
set; as it turns out the packagers have created a new file for the sole
purpose of managing environment variables. On my box it's called
/etc/postgresql/8.2/main/environment, and can just contain the following
line:

DISPLAY = ":5.0"

Note that since PL/R requires a properly set R_HOME environment variable,
PL/R users may already have had to deal with setting environment variables
inside of PostgreSQL. In fact, PL/R provides a nice function to examine
the server's environment variables, called plr_environ().

eggyknap=# select * from plr_environ() where name = 'DISPLAY';
  name   | value
---------+-------
 DISPLAY | :5.0
(1 row)

With that environment variable set, the original function works:

eggyknap=# select plr_plot_simple();
 plr_plot_simple
-----------------

(1 row)

But where's the image?
The function put it in a file called test.png, and it turns out that file
is in the server's data directory by default. If we want it somewhere
else, we need to code the png() call accordingly. Note that the PL/R
process runs as the same user as the PostgreSQL server; that user must
have proper permissions to create the specified file in the specified
location, or it won't work.

HOW TO MAKE A FUNCTION RETURN THE IMAGE DATA

Telling PostgreSQL to make exciting graphs is all well and good, but what
if my end goal is to have some client somewhere (for instance, a web
application) display the graphs? It's possible to define a function like
plr_plot_simple() above, and use pg_read_file() to read the image from the
filesystem, but only superusers can call pg_read_file(), so the client
would have to connect as a superuser, which is a deal breaker for those
web applications interested in staying decently secured. Another option is
to make the PL/R function return the binary data.

WARNING: The method shown here is fairly convoluted; I'm unconvinced that
there exists no s
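As a sketch of the "code the png() call accordingly" step, here's a variant that writes to an explicit path; /tmp/plr_test.png is purely illustrative, and any directory the postgres user can write to would do:

```sql
-- Hypothetical variant of plr_plot_simple() that writes outside the
-- data directory. The path is invented for the example.
CREATE FUNCTION plr_plot_to_tmp() RETURNS void AS $$
    png('/tmp/plr_test.png')   -- must be writable by the postgres user
    plot(c(1:10))
    dev.off()                  -- flush and close the device
$$ LANGUAGE plr;
```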
Re: [GENERAL] COPY ... FROM and index usage
On 11/4/07, Reg Me Please <[EMAIL PROTECTED]> wrote:
> Hi all.
>
> I'd like to know whether the indexes on a table are updated or not during
> a "COPY ... FROM" request.
>
> That is, should I drop all indexes during a "COPY ... FROM" in order to gain
> the maximum speed to load data?
>
> Thanks.

Although questions of "which is faster" often depend very heavily on the
data involved, the database schema, the hardware, etc., typically people
find it best to drop all indexes during a large import and recreate them
afterward.

- Josh/eggyknap

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
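In outline, assuming a hypothetical table big_table with one index on col, the drop-and-recreate pattern looks like this (table, index, and file names are all invented):

```sql
-- Hypothetical sketch of the drop/load/recreate pattern.
DROP INDEX big_table_col_idx;          -- drop before the bulk load

COPY big_table FROM '/path/to/data.csv' WITH CSV;

-- Rebuild in one pass over the loaded data, typically much faster
-- than maintaining the index row by row during the COPY.
CREATE INDEX big_table_col_idx ON big_table (col);

ANALYZE big_table;                     -- refresh planner statistics
```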
Re: [GENERAL] PostgreSQL and AutoCad
On 10/24/07, Bob Pawley <[EMAIL PROTECTED]> wrote:
> Is there any way of converting text from an AutoCad (.dwg ot .dxf) file into
> a PostgreSQL Database??
>
> Bob Pawley

I know nothing of AutoCad, but your message has been sitting for a while
without response, so I'll throw out the suggestion that you probably want
AutoCad to export the text to some more common format (like an ASCII or
UTF8 file or some such) and import that.

-Josh/eggyknap

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
Re: [GENERAL] Partitioned tables, rules, triggers
On 10/6/07, Goboxe <[EMAIL PROTECTED]> wrote:
> Josh,
>
> Thanks for sharing a very good info on partitioning.

Don't thank me -- this comes from Robert Treat. I'm just the messenger :)

-Josh/eggyknap

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Partitioned tables, rules, triggers
On 10/3/07, Goboxe <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have two questions on the above:
>
> 1. I found in some postings recommended to use triggers instead of
> rules. Is this documented somewhere?

A howto is available at
http://images.omniti.net/omniti.com/talks/partitions-public.pdf

- Josh/eggyknap

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match
Re: [GENERAL] Design Question (Time Series Data)
On 10/4/07, Ted Byers <[EMAIL PROTECTED]> wrote:
> --- Michael Glaesemann <[EMAIL PROTECTED]> wrote:
>
>> On Oct 4, 2007, at 9:30 , Ted Byers wrote:
>>
>>> I do not know if PostgreSQL, or any other RDBMS,
>>> includes the ability to call on software such as "R"
>>
>> See PL/R:
>>
>> http://www.joeconway.com/plr/
>
> Thanks. Good to know.

See also RdbiPgSQL
(http://bioconductor.org/packages/2.0/bioc/html/RdbiPgSQL.html). PL/R lets
you write R functions as procedural functions you can call from pgsql
(e.g. select my_r_function(myfield) from mytable). RdbiPgSQL creates R
functions you can use to query pgsql from within R.

-Josh/eggyknap

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
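As an illustration of the PL/R direction, a toy function wrapping R's mean() might look like the following; the function name, argument name, and the usage query are all invented for the example:

```sql
-- Hypothetical PL/R helper: hands a float8 array to R, which sees it
-- as a numeric vector, and lets R compute the mean.
CREATE FUNCTION r_mean(vals float8[]) RETURNS float8 AS $$
    mean(vals)
$$ LANGUAGE plr;

-- Usage sketch:
-- SELECT r_mean(ARRAY(SELECT myfield FROM mytable));
```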
Re: [GENERAL] Feature Request - Defining default table space for Indexes in Conf file
On 10/1/07, S Sharma <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> The default table space defined in db conf file is used for all database
> tables as well as indexes. So putting the indexes on another table space
> requires manually dropping and re-creating indexes.
> It would be nice to have a feature to define a default table space for
> indexes in db conf file and all indexed are created in that table space.
> This would allow creating a good database architecture to avoid disc
> contention easily.
>
> Thanks
> Data_arch

Although the most basic optimization suggested when using tablespaces is
always "Put indexes on one and data on another to avoid disk contention",
I doubt that's the ideal optimization for many workloads, which means
sticking such a thing in a config file might not be such a good idea. In
other words, a DBA probably ought to think harder about optimizing his/her
use of tablespaces than just "I'll put indexes on this one and data on
another". See
http://www.depesz.com/index.php/2007/09/30/finding-optimum-tables-placement-in-2-tablespace-situation/
and http://people.planetpostgresql.org/xzilla/ for two recent blog posts
on the subject. But now I'll be quiet, because I have no evidence to prove
any of the above :)

- Josh

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
Re: [GENERAL] Open Source Application Server (PostgreSQL Perspective)
On 9/1/07, john_sm <[EMAIL PROTECTED]> wrote:
>
> Hey Guys, can you suggest any Open Source Application Server for SOA
> deployments. Also, wondering, if our needs are somewhat lightweight, is
> there some lightweight open source Application Server you could Suggest.

What language are your applications written in? What services do you
expect your app server to provide?

- Josh/eggyknap

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
Re: [GENERAL] Obtaining random rows from a result set
On 8/31/07, Alban Hertroys <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I've recently been busy improving a query that yields a fixed number of
> random records matching certain conditions. I have tried all the usual
> approaches, and although they do work, they're all limited in some way
> and don't translate really well to what you "want". They're kludges, IMHO.
>
> The methods I've tried are explained quite well on
> http://people.planetpostgresql.org/greg/index.php?/archives/40-Getting-random-rows-from-a-database-table.html
>
> All these methods involve calculating a random number for every record
> in the result set at some point in time, which is really not what I'm
> trying to model. I think the database should provide some means to get
> those records, so...
>
> Dear Santa,
>
> I'd like my database to have functionality analogue to how LIMIT works,
> but for other - non-sequential - algorithms.
>
> I was thinking along the lines of:
>
> SELECT *
> FROM table
> WHERE condition = true
> RANDOM 5;
>
> Which would (up to) return 5 random rows from the result set, just as
> LIMIT 5 returns (up to) the first 5 records in the result set.
>
> Or maybe even with a custom function, so that you could get non-linear
> distributions:
>
> SELECT *
> FROM table
> WHERE condition = true
> LIMIT 5 USING my_func();
>
> Where my_func() could be a user definable function accepting a number
> that should be (an estimate of?) the number of results being returned so
> that it can provide pointers to which rows in the resultset will be
> returned from the query.
>
> Examples:
> * random(maxrows) would return random rows from the resultset.
> * median() would return the rows in the middle of the result set (this
>   would require ordering to be meaningful).
>
> What do people think, is this feasable? Desirable? Necessary?
>
> If I'd have time I'd volunteer for at least looking into this, but I'm
> working on three projects simultaneously already. Alas...
>
> Regards,
> Alban Hertroys.
It seems to me that anything that wants to return a random set of rows
will need to calculate a random number for all the rows it processes,
unless you change how the database scans rows in indexes or tables, which
if at all possible will probably make things *really* slow. If it's a
given that the database will always sequentially scan whatever it is the
query plan tells it to scan, you're pretty much stuck with the rows in
your result set being in the same order unless you start picking random
numbers.

One possible alternative not mentioned on the site you linked to is as
follows:

select [whatever] from [table]
where random() < [some number between 0 and 1]
limit [limit value]

That doesn't require assigning a random number for *every* row in the
table, nor does it require sorting everything. It does mean that rows
encountered earlier in the query processing have a higher likelihood of
being returned, and it also means that there's some chance you won't
actually get as many as [limit value] rows returned.

jtolley=# create table a (i integer);
CREATE TABLE
jtolley=# insert into a (i) select * from generate_series(1, 100);
INSERT 0 100
jtolley=# select * from a where random() < .1 limit 3;
 i
----
 22
 23
 25
(3 rows)

Hope this helps...

-Josh

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [GENERAL] URGENT: Whole DB down ("no space left on device")
On 8/31/07, Zoltan Boszormenyi <[EMAIL PROTECTED]> wrote:
> Phoenix Kiula wrote:
>> I am getting this message when I start the DB:
>>
>> psql: FATAL:  could not access status of transaction 0
>> DETAIL:  Could not write to file "pg_subtrans/01F8" at offset 221184:
>> No space left on device.
>>
>> What is this about and how do I solve this? A "df -h" on my system shows
>> this:
>>
>> Filesystem    Type    Size  Used Avail Use% Mounted on
>> ...
>> /dev/sda2     ext3    9.9G  9.5G     0 100% /var

In addition to what others have already said, when things calm down you
should consider implementing some sort of monitoring system that is
configured to start screaming before you run into problems like this. At
my place of work, we've set up Nagios to monitor the space left on various
partitions, and email us when a partition gets above 90% full.

- eggyknap

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match
Re: [GENERAL] Can this function be declared IMMUTABLE?
On 8/27/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I have a question about whether I can safely declare a function IMMUTABLE.
> Citing the documentation under "Function Volatility Categories" in the
> section on "Extending SQL":
>
> It is generally unwise to select from database tables within an IMMUTABLE
> function at all, since the immutability will be broken if the table
> contents ever change.
>
> Well, I am considering a function that does read from a table, but the
> table contents change extremely infrequently (the table is practically a
> list of constants). Would it be safe to declare the function IMMUTABLE
> provided that the table itself is endowed with a trigger that will drop
> and recreate the function any time the table contents are modified? In
> this way, it seems that the database would gain the performance benefit of
> an immutable function for the long stretches of time in between changes to
> the table.
>
> I apologize that I don't have any details -- it is still very early in the
> development of the database design, and I was just hoping to get a better
> understanding of whether an immutable function would safely offer any
> benefit in this scenario.

Lemme see if I can embarrass myself trying to answer something like this.
It seems like your function really ought to be declared STABLE, because
during a single transaction MVCC will make sure your function sees the
same values in its reference table each time you call it, but between
transactions the reference table might change, changing the result of the
function.

The benefits of an IMMUTABLE function over a STABLE one, as far as I know,
are these:

1) The planner can take advantage of the fact that this function is
IMMUTABLE to evaluate it only once if its arguments are constant, and keep
this result throughout the life of the query plan. If you don't cache the
query plan (e.g. with PREPARE) this is identical to STABLE, but if you
PREPARE a query, for instance, involving a call to an IMMUTABLE function
with constant arguments, the system can evaluate the function only once
during planning, and never again, whereas for STABLE you'd have to execute
the function each time it was called. For this to be a big win over
STABLE, you have to both call your function with constant arguments and
cache the query plan somehow, such as by having the query inside another
pl/pgsql function or by using PREPARE.

2) You can use IMMUTABLE functions, but not STABLE ones, in
expression-based indexes. If you declared your function IMMUTABLE, you
could build an index on a bunch of data using an index expression
involving your function, have your referenced table change somehow, and
end up not being able to correctly use that index anymore.

There may well be other advantages of IMMUTABLE over STABLE that I don't
know about. Were we talking about data I was supposed to care for, I'd
make the function STABLE, not IMMUTABLE, because that's the most
appropriate volatility for the function.

-Josh

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
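To make point 2 concrete, here's a made-up sketch of how an IMMUTABLE declaration interacts with an expression index; the tables, columns, and function are all invented:

```sql
-- Hypothetical lookup function reading a near-constant table.
-- Declaring it IMMUTABLE is what makes the index below legal -- and
-- what makes the index silently wrong if country_codes ever changes.
CREATE FUNCTION country_name(code text) RETURNS text AS $$
    SELECT name FROM country_codes WHERE iso = $1;
$$ LANGUAGE sql IMMUTABLE;

-- Expression index: country_name() values are computed at index build
-- time and frozen into the index.
CREATE INDEX orders_country_idx ON orders (country_name(ship_code));

-- If country_codes changes afterward, index entries no longer match
-- what the function would return today; a REINDEX would be required.
```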
Re: [GENERAL] Writing most code in Stored Procedures
On 8/18/07, Ron Johnson <[EMAIL PROTECTED]> wrote:
> Interesting. Does PG have to initiate the Perl interpreter every
> time you call a Perl-written SP?

IIRC PostgreSQL should only load the Perl interpreter once per session.

- Josh

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
Re: [GENERAL] Partitioning
On 8/18/07, Julio Cesar Sánchez González <[EMAIL PROTECTED]> wrote:
> Hi guys,
>
> It's natural what master table in the partitioning table contain data
> (http://www.postgresql.org/docs/8.1/interactive/ddl-partitioning.html) ?
> or to be empty.

I'm no partitioning expert, but I would say most of the time the parent
table is supposed to be empty. That said, partitioning schemes need to be
developed based on what data you have and what exactly you're trying to
do. If you're using the common example of keeping a bunch of historical
data where partitions represent a distinct chunk of time (for instance,
one partition for every month of data), you probably want to keep the
parent table empty: every row has a date, so each row naturally belongs in
one of the child tables, and it doesn't make sense to put anything in the
parent table. But I'm sure someone can come up with a scheme where having
data in the parent is also useful.

- Josh

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match
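As a sketch of the empty-parent, month-per-child layout using the inheritance-based partitioning of that era (table and rule names are invented for the example):

```sql
-- Hypothetical month-per-child layout; the parent holds no rows itself.
CREATE TABLE measurements (
    logged_at  date NOT NULL,
    value      integer
);

-- One child per month, with a CHECK constraint so the planner can skip
-- irrelevant partitions when constraint_exclusion is enabled.
CREATE TABLE measurements_2007_08 (
    CHECK (logged_at >= DATE '2007-08-01' AND logged_at < DATE '2007-09-01')
) INHERITS (measurements);

-- A rule (or a trigger) redirects inserts away from the parent:
CREATE RULE measurements_insert_2007_08 AS
    ON INSERT TO measurements
    WHERE (logged_at >= DATE '2007-08-01' AND logged_at < DATE '2007-09-01')
    DO INSTEAD
    INSERT INTO measurements_2007_08 VALUES (NEW.*);
```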
Re: [GENERAL] Writing most code in Stored Procedures
On 8/15/07, Rohit <[EMAIL PROTECTED]> wrote:
> I have few queries regarding the use of Stored Procedures, Functions
> and Triggers in an RDBMS.
>
> (1) When to use Stored Procedure? Writing an INSERT query in a Stored
> Procedure is better or firing it from the application level?
>
> (2) Can a Trigger call a Stored Procedure?
>
> (3) What type of code must reside in application and what type in
> RDBMS?
>
> (4) Is it faster to work at application level or at the database level?

One of the major advantages of stored procedures over application code is
that stored procedures can get to the data much more quickly than an
application can, in general. An application needs to talk to PostgreSQL
through some sort of driver (JDBC, libpq, etc.), and typically data need
to traverse a network to get from the database machine to the application
server. Stored procedures, on the other hand, don't have the overhead of
either the network or the driver.

However, stored procedures can be more difficult to debug, profile, etc.,
so they might not be the best for really complex logic. They tend to be
really useful if you're doing something that requires lots and lots of
queries to the database, and don't need anything else (data from other
sources, user interaction, etc.), because that takes greatest advantage of
their quick connection to the data. Richard Huxton's point that stored
procedures are typically best for data integrity types of functions,
whereas business logic should often be in application code, is an
excellent rule of thumb.

-Josh

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
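On question (2): a trigger in PostgreSQL always executes a function, which may in turn call any other function it likes. As an invented illustration of the data-integrity use Richard mentioned (table and function names are hypothetical):

```sql
-- Hypothetical integrity check: the trigger fires a PL/pgSQL function
-- on every insert or update.
CREATE FUNCTION check_price() RETURNS trigger AS $$
BEGIN
    IF NEW.price < 0 THEN
        RAISE EXCEPTION 'price may not be negative: %', NEW.price;
    END IF;
    RETURN NEW;   -- allow the row through unchanged
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER items_check_price
    BEFORE INSERT OR UPDATE ON items
    FOR EACH ROW EXECUTE PROCEDURE check_price();
```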
Re: [GENERAL] language interface in postgresql
On 8/15/07, Magnus Hagander <[EMAIL PROTECTED]> wrote:
> Trevor Talbot wrote:
>> On 8/14/07, Jasbinder Singh Bali <[EMAIL PROTECTED]> wrote:
>>
>>> Let me fine tune my question here. What I mean to say is the way we can
>>> write stored procedures in C, perl etc in Postgres specifying the language
>>> parameter at the end of stored procedure, compared to that, in SQL Server
>>> 2000 I've seen SP writing in pure SQL only.
>>> Can you write Stored Procedures in SQL Server 2000 or Oracle in different
>>> programing languages as well ?
>>
>> AFAIK SQL Server 2000 only has a C interface as the other option; CLR
>> hosting was added in SQL Server 2005. Because the CLR is a virtual
>> machine that runs compiled bytecode, and compilers for all of the
>> available languages are not necessarily available at runtime, it
>> doesn't make sense to specify such code in source form. The process
>> is more like creating a function in C in PostgreSQL (compile and load
>> a shared library). Details here, if you're curious:
>> http://msdn2.microsoft.com/en-us/library/ms345136.aspx
>>
>> I don't know what Oracle supports.
>
> I believe Oracle support Java in the same way MSSQL supports .net, give
> or take.

I don't know specifics of what exactly you can do with it nor how exactly
you go about it, but I know at least Oracle 10g supports extension in
Java. We had grand plans to improve some sort of processing by writing a
comparison function in Java for one of the Oracle databases I've been
cursed to associate with. I don't know of any other languages supported by
Oracle for stored procedures and the like, though it does have an
ECPG-like system for C, C++, COBOL, FORTRAN, and PL/1 of all things.

- Josh

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
Re: [GENERAL] query help
On 8/14/07, Kirk Wythers <[EMAIL PROTECTED]> wrote:
>
> I need some help with rewriting a query. I have a fairly complicated query
> (for me anyway) that dumps daily climate data, filling in missing data with
> monthly averages (one line per day).
>
> I want to output monthly averages (one line per month). I am having a hard
> time wrapping my head around this. Particularly how to deal with the doy
> column (day of year). I have tried several approaches and my forehead is
> starting to get my keyboard bloody.

I think this came up on IRC today, so perhaps this is only for the
archives' sake, but you want to do something like this. Assuming you have
a table as follows:

CREATE TABLE climate_data (
    measurement_time timestamp,
    measurement_value integer);

...and you insert data into it regularly, you can get the average
measurement over a period of time with date_trunc(), which will truncate a
date or timestamp value to match whatever precision you specify. For
example, see the following:

eggyknap=# select date_trunc('month', now());
       date_trunc
------------------------
 2007-08-01 00:00:00-06
(1 row)

Note: the -06 at the end means I'm in mountain time. So if you want to get
the average measurement over a month's time, you need to do something like
this:

SELECT DATE_TRUNC('MONTH', measurement_time), AVG(measurement_value)
FROM climate_data
GROUP BY DATE_TRUNC('MONTH', measurement_time);

This will chop all the measurement_time values down to the month the
measurement was taken in, put all measurements in groups based on the
resulting value, and take the average measurement_value from each group.

- Josh

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
Re: [GENERAL] Intelligent Database in postgreSQL
On 8/6/07, Rodrigo Faccioli <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm a new user's postgreSQL and I have a question: Is possible add function
> of intelligent System like Neural Networks or Fuzzy Logic within postgre
> SGDB?
>
> For example: I have a database and I want to execute a select instruction.
> But, this select constains there is column that uses a Perceptron Neural
> Networks for return a result, follow below:
>
> Select perceptron(vendas_cli)
> from client
> Where vendas_cli > 1000
>
> Thanks,

PostgreSQL allows you to add your own functions in any of several
different languages, including Perl and C. If you can code your perceptron
function in one of the languages PostgreSQL understands, you can do
exactly what you suggest. Refer to
http://www.postgresql.org/docs/8.2/static/xplang.html for starter
documentation.

- Josh

---(end of broadcast)---
TIP 6: explain analyze is your friend
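As a made-up sketch of the shape such a function could take -- a single-input perceptron with a hard-coded weight and bias, nothing like a trained network:

```sql
-- Hypothetical single-input perceptron with a step activation; the
-- weight and bias are invented purely to show where user code plugs in.
CREATE FUNCTION perceptron(x numeric) RETURNS integer AS $$
DECLARE
    weight numeric := 0.5;    -- invented weight
    bias   numeric := -400;   -- invented bias
BEGIN
    IF weight * x + bias > 0 THEN
        RETURN 1;             -- neuron fires
    ELSE
        RETURN 0;
    END IF;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- Then the query from the original post works as written:
-- SELECT perceptron(vendas_cli) FROM client WHERE vendas_cli > 1000;
```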
Re: [GENERAL] PG for DataWarehouse type Queries
On 8/3/07, Ow Mun Heng <[EMAIL PROTECTED]> wrote:
> Can anyone shed some light on this. I just would like to know if queries
> for raw data (not aggregregates) is expected to take a long time.
> Running times between 30 - 2 hours for large dataset pulls.
>
> Involves lots of joins on very large tables (min 1 millon rows each
> table, 300 columns per table)
>
> Joins are done in the form of Left joins (sometimes on the same tables,
> due to normalisation)
>
> Is 30min - 2hours too long or is this considered "normal"??

That depends entirely on your query, your dataset, the machine you're
using, etc. Your best bet is probably to post an EXPLAIN ANALYZE of your
query along with some description of what's going on and what you're
running this all on (pg version, machine specs, etc.) and see if someone
can come up with an acceptable way to make the query faster.

- Josh

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
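For the archives, getting that plan is just a matter of prefixing the query; the query below is invented, but the shape is what matters:

```sql
-- Hypothetical example: EXPLAIN ANALYZE actually runs the query and
-- reports estimated vs. actual row counts and timing per plan node.
EXPLAIN ANALYZE
SELECT a.id, b.val
FROM big_table_a a
LEFT JOIN big_table_b b ON b.a_id = a.id
WHERE a.created > '2007-01-01';
```

Comparing the estimated and actual row counts in the output is usually the quickest way to spot where the planner has gone wrong.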
Re: [GENERAL] What do people like to monitor (or in other words, what might be nice in pgsnmpd)?
On 8/2/07, Gavin M. Roy <[EMAIL PROTECTED]> wrote: > Are you contemplating providing access to data that's currently not stored > in the pg_ catalog tables? I currently monitor the statio data, > transactions per second, and active/idle backends. Things that I think > would be useful would be average query execution time, longest execution > time, etc. Other pie in the sky ideas would include current level of total > bloat in a database, total size on disk of a database broken down by tables, > indexes, etc. > > Regards, > > Gavin My own goal is to have pgsnmpd able, as much as possible, to fill the same role as the set of scripts a typical PostgreSQL DBA sets up on a production server. That includes statistics tables and catalog tables, but certainly isn't limited to just that. So things like categorizing total sessions in interesting and useful ways (for instance, # of idle connections, # of active connections, max transaction length, etc.) are certainly within pgsnmpd's purview. In short, all the suggestions you listed are useful and, provided the framework allows us to get reasonably good values for them, worthy of implementation in pgsnmpd. Thanks. -Josh
[GENERAL] What do people like to monitor (or in other words, what might be nice in pgsnmpd)?
Work is beginning on pgsnmpd v 2.0, and I figured it would be a good time to ask folks what they typically like to monitor, so we can make sure pgsnmpd instruments it properly. The current version of pgsnmpd supports something called RDBMS-MIB, which is a set of data designed to be applicable to any relational database, so it doesn't get very PostgreSQL-specific. The next version will augment that with PGSQL-MIB, which we have yet to write. PGSQL-MIB should ideally contain data elements for anything PostgreSQL-specific that someone might reasonably want to monitor in a generic PostgreSQL installation. Things like CPU load, available disk space, and total system memory would not be included, because they're not PostgreSQL-specific, but things like CPU and memory usage of individual PostgreSQL processes are very good candidates for inclusion in PGSQL-MIB. Current plans have us including SNMP representations of all the statistics tables as well as the system catalogs, runtime information about PostgreSQL processes (such as CPU and RAM usage), shared memory usage information, and potentially mechanisms to easily include administrator-specified queries and generate SNMP traps based on LISTEN/NOTIFY. So please respond, if you feel so inclined, describing things you like to monitor in your PostgreSQL instances as well as things you would like to be able to easily monitor in a more ideal world. Many thanks, and apologies for any breach of netiquette I may have committed in posting to two lists simultaneously. - Josh Tolley
Re: [GENERAL] Difference between PRIMARY KEY index and UNIQUE-NOT NULL index
On 7/20/07, Michael Glaesemann <[EMAIL PROTECTED]> wrote: On Jul 20, 2007, at 17:54 , Vincenzo Romano wrote: > In an inner join involving a 16M+ rows table and a 100+ rows table > performances got drastically improved by 100+ times by replacing a > UNIQUE-NOT NULL index with a PRIMARY KEY on the very same columns in > the very same order. The query has not been modified. There should be no difference in query performance, AIUI. If I read the documentation correctly, PRIMARY KEY is simply syntactic sugar equivalent to UNIQUE + NOT NULL, the only difference being that a PRIMARY KEY is reported as such to someone looking at the table structure, which becomes more intuitive than seeing UNIQUE + NOT NULL. > In the older case, thanks to the EXPLAIN command, I saw that the join > was causing a sort on the index elements, while the primary key was > not. Might it just be that the original UNIQUE + NOT NULL index was bloated or otherwise degraded, and reindexing it would have resulted in the same performance gain? That's just a guess. -Josh ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] The speed problem of Varchar vs. Char
On 7/8/07, Crystal <[EMAIL PROTECTED]> wrote: Hi All, Our company need to save contact details into the PostgreSQL database. I just begin to learn it, so I got many questions. I am not sure which data type I should choose for website address, varchar or char. The website address may be very long, and we also don't want to lose the speed. Thus, the question is: if we have a large contact database, how much slowdown or speed up will be expected if we choose variable length rather than fixed length? Thanks forward. Best Wishes, Crystal ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/ See http://www.postgresql.org/docs/8.2/interactive/datatype-character.html (or whatever the corresponding page is for the version you're using). Specifically, the following: Tip: There are no performance differences between these three types, apart from the increased storage size when using the blank-padded type. While character(n) has performance advantages in some other database systems, it has no such advantages in PostgreSQL. In most situations text or character varying should be used instead. - Josh ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] ORDER BY with exception
On 6/21/07, brian <[EMAIL PROTECTED]> wrote: I have a lookup table with a bunch of disciplines:

# SELECT id, name FROM discipline;
 id |        name
----+---------------------
  1 | writing
  2 | visual arts
  3 | music
  4 | dance
  5 | film and television
  6 | theatre
  7 | media arts
  8 | community
  9 | fine craft
 10 | other
(10 rows)

and a function that returns each discipline name along with the total number of records in another table (showcase) that are related to each discipline. Each showcase entry may have 0 or more items (showcase_item) related to it, so ones that have no items are disregarded here. Also, only showcases that have been accepted should be counted. First, here's the working function:

CREATE FUNCTION getshowcasetotalsbydiscipline(OUT name text, OUT total integer)
RETURNS SETOF record AS $$
DECLARE
    rec record;
BEGIN
    FOR rec IN EXECUTE 'SELECT id, name, 1 AS total FROM discipline' LOOP
        name := rec.name;
        SELECT INTO rec.total
            -- a showcase may be in the DB but not accepted by an admin
            SUM(CASE s.accepted WHEN TRUE THEN 1 ELSE 0 END)
        FROM showcase AS s
        WHERE s.id IN
            -- a showcase may exist with no items, so should be ignored
            (SELECT si.showcase_id FROM showcase_item AS si
             WHERE si.discipline_id = rec.id);
        -- If no showcase items have this discipline,
        -- give it a total of zero
        IF rec.total IS NULL THEN
            SELECT INTO total 0;
        ELSE
            total := rec.total;
        END IF;
        RETURN NEXT;
    END LOOP;
    RETURN;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

test=# SELECT * FROM getShowcaseTotalsByDiscipline();
        name         | total
---------------------+-------
 writing             |   130
 visual arts         |   252
 music               |   458
 dance               |   131
 film and television |   102
 theatre             |   271
 media arts          |    83
 community           |    20
 fine craft          |    78
 other               |    59
(10 rows)

Works fine, but i'd like to order the disciplines alphabetically *except* have 'other' fall at the end. So, should i loop a second time, after summing the totals, and keep the 'other' row aside, then add it to the end?

(btw, the output of this function is cached until a new showcase is accepted)

Or, should i re-order the disciplines alphabetically in the lookup table, keeping 'other' to be last? I could do the latter, although it would mean a fair bit of work because the disciplines table relates to a bunch of other stuff, as well. Also, there's always the chance that a new discipline will be added in the future. I suppose i could write a trigger that bumped the 'other' id above that of the new entry, then re-relate everything else in the DB that's connected to the 'other' discipline. But that strikes me as kind of a hack. The third option is to re-order the resultset in the PHP script that displays this. But that wasn't why i chose Postgres for this app ;-)

brian

It seems to me you could replace it all with one query, something like this:

SELECT discipline, COUNT(1)
FROM showcase
WHERE accepted
  AND EXISTS (SELECT 1 FROM showcase_item
              WHERE showcase_id = showcase.id)
GROUP BY discipline
ORDER BY (discipline = 'other'), discipline;

Since false sorts before true, ordering on (discipline = 'other') first pushes the 'other' row to the end, with the remaining rows ordered alphabetically.

- Josh
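The sort-on-a-boolean trick works in most SQL engines. A minimal runnable sketch using Python's built-in sqlite3 module (where comparisons yield 0/1 rather than false/true, so the same ordering falls out), with hypothetical sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE discipline (name TEXT)")
conn.executemany("INSERT INTO discipline VALUES (?)",
                 [("writing",), ("other",), ("dance",), ("music",)])

# (name = 'other') evaluates to 0 for ordinary rows and 1 for 'other',
# so sorting on it first pushes 'other' to the end; ties within each
# group are then broken alphabetically by the second sort key.
rows = conn.execute(
    "SELECT name FROM discipline ORDER BY (name = 'other'), name"
).fetchall()
print([r[0] for r in rows])  # ['dance', 'music', 'writing', 'other']
```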
Re: [GENERAL] finding items with 0 rels for a 0 to many relationship
On 6/21/07, danmcb <[EMAIL PROTECTED]> wrote: Hi I have two tables, say A and B, that have a many-to-many relationship, implemented in the usual way with a join table A_B. How can I economically find all the rows in table A whose id's are not in A_B at all (i.e. they have zero instances of B associated)? Use a left join. For instance, say there are a.id and b.id columns, which are the primary keys in A and B respectively. Also say A_B contains columns aid and bid which reference a.id and b.id respectively. SELECT * FROM A LEFT JOIN A_B ON (A.ID = A_B.AID) WHERE A_B.BID IS NULL; - Josh ---(end of broadcast)--- TIP 6: explain analyze is your friend
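The left-join anti-join pattern described above can be demonstrated end to end with Python's built-in sqlite3 module. The table and column names match the hypothetical A/B/A_B schema from the reply; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a   (id INTEGER PRIMARY KEY);
    CREATE TABLE b   (id INTEGER PRIMARY KEY);
    CREATE TABLE a_b (aid INTEGER, bid INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (10);
    INSERT INTO a_b VALUES (1, 10);   -- only row 1 of a has any b
""")

# A LEFT JOIN keeps every row of a; rows with no match in a_b get NULLs
# in the a_b columns, so testing an a_b column for NULL finds exactly
# the rows of a with zero associated rows in b.
orphans = conn.execute("""
    SELECT a.id FROM a
    LEFT JOIN a_b ON a.id = a_b.aid
    WHERE a_b.bid IS NULL
    ORDER BY a.id
""").fetchall()
print(orphans)  # [(2,), (3,)]
```

The alternative formulation `WHERE a.id NOT IN (SELECT aid FROM a_b)` gives the same rows here, though the left-join form behaves more predictably when the join column can contain NULLs.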