Re: [HACKERS] SELECT constant; takes 15x longer on 9.0?

2010-04-07 Thread Pavel Stehule
2010/4/7 Greg Smith g...@2ndquadrant.com:
 Merlin Moncure wrote:

 On Tue, Apr 6, 2010 at 3:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:


 Greg has the right idea: show debug_assertions.


 why not the entire set of configure options?


 Given that the standard way to find those is pg_config, there's a couple of
 reasons why not to prefer that, on top of those Tom already mentioned:

 1) pg_config isn't in the standard PostgreSQL package set in some
 distributions (it's sometimes in the -devel package), so it may not be
 available; debug_assertions is always there if you have psql.  For my goals,
 which include benchmarking scripts I often distribute to other people, that
 matters.

 2) It's easy to get pg_config output from your client that doesn't actually
 match the running server, particularly when developing.  That's been the
 source of more than one of the times I was running a debug build on the
 server but didn't notice it, and therefore would have produced worthless
 performance numbers.  Given that the main slowdowns from having assertions
 turned on are server side, whether or not the local client running things
 like psql have them turned on or not doesn't worry me as much.

 3) It's a little easier to check the value of show in a script to confirm
 you're not running a bad build than to parse the output from pg_config.
 Here's the recipe I use for shell scripts:

 #!/bin/sh
 DEBUG=`psql -At -c show debug_assertions`
 if [ $DEBUG = on ] ; then
   echo Debug build - aborting performance test
   exit 1
 fi

 Pushing this data into something like version() would solve the first two
 items above, while making the issue of how to parse the results in a test
 client even harder, given there's already too much junk in one big string
 there.  You couldn't make the above check much simpler, which makes it hard
 to justify any alternative approach to grab this data.


good idea. Can do it to 9.0? It has zero impact on behave and can help
to protect us against same bug.

I plan one night test fest on begin of may in Prague, and wouldn't do
same mistake :). But it is really import feature - maybe can signaled
in promt in future.

Regards
Pavel Stehule
 --
 Greg Smith  2ndQuadrant US  Baltimore, MD
 PostgreSQL Training, Services and Support
 g...@2ndquadrant.com   www.2ndQuadrant.us


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: pending patch: Re: [HACKERS] Streaming replication and pg_xlogfile_name()

2010-04-07 Thread Heikki Linnakangas
Fujii Masao wrote:
 On Tue, Apr 6, 2010 at 11:48 PM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Fujii Masao wrote:
 On Tue, Apr 6, 2010 at 8:02 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Should we throw an error in pg_xlogfile_name() if called during
 recovery? It's not doing anything useful as it is.
 I have no objection for now.
 Here is the patch.
 ...
 + if (RecoveryInProgress())
 + ereport(ERROR,
 + 
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 +  errmsg(recovery is in progress),
 +  errhint(WAL control functions cannot be 
 executed during recovery.)));
 +
 The hint is a bit confusing for pg_xlogfile_name(). pg_xlogfile_name()
 is hardly a WAL control function like pg_start/stop_backup() are. How
 about pg_xlogfile_name() cannot be executed during recovery.?
 
 OK. Here is the revised patch.

Thanks, committed.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set LC_COLLATE to de_DE_phoneb

2010-04-07 Thread Takahiro Itagaki

Frank Jagusch fr...@jagusch-online.de wrote:

 The german phone book order has the windows language setting
 de_DE_phoneb an the value 00010407 in the registry. Unfortunately I was
 not able to find a corresponding string for the LC_COLLATE setting.

I cannot find any resources for de_DE_phoneb in Web. What is the true
name for it? Locale names should be in Country_Language.CodePage
format on Windows. If you can find the counterpart name for it, you can
initialize PostgreSQL DB with the locale, and CodePage or UTF-8 encoding.

 Background: I moved an old application from a borland paradox database
 to postgesql. The speed gain is great but the sorting order isn't the
 usual to the user. I can't change the order by clauses of the select
 statements because they are generated by the borland database engine.

I'm afraid of de_DE_phoneb is an original locale implementation in your
old database. If so, PostgreSQL cannot support it because postgres depends
on locale libraries in each platform. (i.e., msvcrt on Windows)

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fallback_application_name and pgbench

2010-04-07 Thread Takahiro Itagaki

Fujii Masao masao.fu...@gmail.com wrote:

 By default, the application_name of pgbench is [unknown]. But I think
 that pgbench should use fallback_application_name as psql, pg_dump,
 etc does. Is it worth creating the patch?

If we will take care of fallback_application_name for contrib modules,
we need to fix not only pgbench but also oid2name and dblink.
I think those fixes would be worth; at least for telling third-party
developers to handle the new parameter.

It might be better to set fallback_application_name automatically
in libpq, but it requires argv[0] away from main() function.
GetModuleFilename() is available on Windows for the purpose,
but I don't know what API is available on other platforms.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fallback_application_name and pgbench

2010-04-07 Thread Magnus Hagander
On Wed, Apr 7, 2010 at 10:21, Takahiro Itagaki
itagaki.takah...@oss.ntt.co.jp wrote:

 Fujii Masao masao.fu...@gmail.com wrote:

 By default, the application_name of pgbench is [unknown]. But I think
 that pgbench should use fallback_application_name as psql, pg_dump,
 etc does. Is it worth creating the patch?

 If we will take care of fallback_application_name for contrib modules,
 we need to fix not only pgbench but also oid2name and dblink.
 I think those fixes would be worth; at least for telling third-party
 developers to handle the new parameter.

Uh, why fallback_application_name? Isn't this the *exact* usecase
where application_name should be used? At least for the two apps -
fallback_app_name might be correct for dblink.

And yes, I think it's a good idea to set it for at least pgbench and oid2name.


 It might be better to set fallback_application_name automatically
 in libpq, but it requires argv[0] away from main() function.
 GetModuleFilename() is available on Windows for the purpose,
 but I don't know what API is available on other platforms.

I think that's a pretty bad idea in general. But if we do, then it
should at least never override something that's specified - we need to
keep the ability for tools like psql and pgadmin to set it, regardless
of what they happen to have as argv[0].

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fallback_application_name and pgbench

2010-04-07 Thread Dave Page
On Wed, Apr 7, 2010 at 10:16 AM, Magnus Hagander mag...@hagander.net wrote:
 Uh, why fallback_application_name? Isn't this the *exact* usecase
 where application_name should be used? At least for the two apps -
 fallback_app_name might be correct for dblink.

 And yes, I think it's a good idea to set it for at least pgbench and oid2name.

For pgbench, I can imagine scenarios where you might want to override
the name in a script - for example, if you're trying to simulate an
environment with multiple workload types running against the same
database.

 I think that's a pretty bad idea in general. But if we do, then it
 should at least never override something that's specified - we need to
 keep the ability for tools like psql and pgadmin to set it, regardless
 of what they happen to have as argv[0].

We discussed that during the development of the patch - the original
idea was to default to argv[0] if no other value was set, but
apparently there's no portable way to get at argv[0] from within
libpq, even if you ignore Windows.


-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming replication and a disk full in primary

2010-04-07 Thread Heikki Linnakangas
This task has been languishing for a long time, so I took a shot at it.
I took the approach I suggested before, keeping a variable in shared
memory to track the latest removed WAL segment. After walsender has read
a bunch of WAL records from a WAL file, it checks that what it read is
after the latest removed WAL segment, otherwise the data it read might
have came from a file that was already recycled and overwritten with new
data, and an error is thrown.

This changes the behavior so that if a standby server doing streaming
replication falls behind too much, the primary will remove/recycle a WAL
segment needed by the standby server. The previous behavior was that WAL
segments still needed by any connected standby server were never
removed, at the risk of filling the disk in the primary if a standby
server behaves badly.

In your version of this patch, the default was still the current
behavior where the primary retains WAL files that are still needed by
connected stadby servers indefinitely. I think that's a dangerous
default, so I changed it so that if you don't set standby_keep_segments,
the primary doesn't retain any extra segments; the number of WAL
segments available for standby servers is determined only by the
location of the previous checkpoint, and the status of WAL archiving.
That makes the code a bit simpler too, as we never care how far the
walsenders are. In fact, the GetOldestWALSenderPointer() function is now
dead code.

Fujii Masao wrote:
 Thanks for the review! And, sorry for the delay.
 
 On Thu, Jan 21, 2010 at 11:10 PM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 I don't think we should do the check XLogWrite(). There's really no
 reason to kill the standby connections before the next checkpoint, when
 the old WAL files are recycled. XLogWrite() is in the critical path of
 normal operations, too.
 
 OK. I'll remove that check from XLogWrite().
 
 There's another important reason for that: If archiving is not working
 for some reason, the standby can't obtain the old segments from the
 archive either. If we refuse to stream such old segments, and they're
 not getting archived, the standby has no way to catch up until archiving
 is fixed. Allowing streaming of such old segments is free wrt. disk
 space, because we're keeping the files around anyway.
 
 OK. We should terminate the walsender whose currently-opened WAL file
 has been already archived, isn't required for crash recovery AND is
 'max-lag' older than the currently-written one. I'll change so.
 
 Walreceiver will get an error if it tries to open a segment that's been
 deleted or recycled already. The dangerous situation we need to avoid is
 when walreceiver holds a file open while bgwriter recycles it.
 Walreceiver will merrily continue streaming data from it, even though
 it's be overwritten by new data already.
 
 s/walreceiver/walsender ?
 
 Yes, that's the problem that I'll have to fix.
 
 A straightforward fix is to keep an newest recycled XLogRecPtr in
 shared memory that RemoveOldXlogFiles() updates. Walreceiver checks it
 right after read()ing from a file, before sending it to the client, and
 throws an error if the data it read() was already recycled.
 
 I prefer this. But I don't think such an aggressive check of a newest
 recycled XLogRecPtr is required if the bgwriter always doesn't delete
 the WAL file which is newer than or equal to the walsenders' oldest WAL
 file. In other words, the WAL files which the walsender is reading (or
 will read) are not removed at the moment.
 
 Regards,
 


-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 1823,1828  archive_command = 'copy %p C:\\server\\archivedir\\%f'  # Windows
--- 1823,1856 
 /para
 /listitem
/varlistentry
+ 
+   varlistentry id=guc-replication-lag-segments xreflabel=replication_lag_segments
+termvarnamestandby_keep_segments/varname (typeinteger/type)/term
+indexterm
+ primaryvarnamestandby_keep_segments/ configuration parameter/primary
+/indexterm
+listitem
+para
+ Specifies the number of log file segments kept in filenamepg_xlog/
+ directory, in case a standby server needs to fetch them via streaming
+ replciation. Each segment is normally 16 megabytes. If a standby
+ server connected to the primary falls behind more than
+ varnamestandby_keep_segments/ segments, the primary might remove
+ a WAL segment still needed by the standby and the replication
+ connection will be terminated.
+ 
+ This sets only the minimum number of segments retained for standby
+ purposes, the system might need to retain more segments for WAL
+ archival or to recover from a checkpoint. If standby_keep_segments
+ is zero (the default), the system doesn't keep any extra segments
+ for 

Re: [HACKERS] [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()

2010-04-07 Thread Heikki Linnakangas
(moving to pgsql-hackers)

Simon Riggs wrote:
 On Wed, 2010-04-07 at 06:12 +, Heikki Linnakangas wrote:
 Log Message:
 ---
 Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset() during
 recovery. We might want to relax this in the future, but ThisTimeLineID
 isn't currently correct in backends during recovery, so the filename
 returned was wrong.
 
 Any reason why we couldn't just do this:
 
 if (RecoveryInProgress())
 {
   volatile XLogCtlData *xlogctl = XLogCtl;
   XLogFileName(xlogfilename, xlogctl-ThisTimeLineID, 
   xlogid, xlogseg);
 }
 else
   XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
 
 
 rather than preventing access to those functions completely?

Because if you do something like
pg_xlogfile_name(pg_last_xlog_receive_location()),
xloctl-ThisTimeLineId would not necessarily be the timeline
corresponding the last received location. Even with
pg_xlogfile_name(pg_last_xlog_replay_location()), there's a small race
condition between those calls; if a checkpoint record is replayed in
between that changes timeline, the returned filename doesn't correspond
the name of the file where the replayed WAL record was read from, as you
would expect.

This commit is a stop-gap solution until we figure out what exactly to
do about that. Masao-san wrote a patch that included the TLI in the
string returned by pg_last_xlog_receive/replay_location() (see
http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa424930...@mail.gmail.com
and
http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf4...@mail.gmail.com),
but it still wasn't clear it did the right thing in corner-cases where
the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by
pg_last_receive_location() seems bogus, at least.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #5394: invalid declspec for PG_MODULE_MAGIC

2010-04-07 Thread Magnus Hagander
On Tue, Apr 6, 2010 at 21:55, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Mon, Mar 29, 2010 at 11:47 AM, Takahiro Itagaki
 itagaki.takah...@oss.ntt.co.jp wrote:
 A patch attached. The name of PGMODULEEXPORT might be arguable.

 I agree with this in principle, but won't this break every single
 add-on module out there that supports Win32?

 The patch didn't touch the contrib modules, so why would it break
 third-party sources?

Oh, d'oh. I was clearly too tired when reading that thing. I was
thinking we required changes in the contrib modules/third party
modules, but they don't actually use DLLEXPORT, do they?

In that case, that argument clearly falls, and I'm in favour of the
patch :-) Just make sure you test it on both current and semi-old
versions of mingw, they tend to not always act the same way. Or just
get it in and the buildfarm can figure that part out for us.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Win32 timezone matching

2010-04-07 Thread Magnus Hagander
On Wed, Apr 7, 2010 at 00:48, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Wed, Apr 7, 2010 at 00:02, Tom Lane t...@sss.pgh.pa.us wrote:
 Oh, another thought here: what is the effect of the combination of this
 with your other proposal to add more timezones to the list?

 [ none ]

 Ah, right, I hadn't looked closely at the logic before.  Still have
 two comments though:

 * There are other error conditions that cause breaks in that loop.
 Should we change the others?

The only other error condition is if we find a key but can't open it.
That indicates something that's *seriously* wrong, so I'm inclined to
keep that one. The other two break statements are actually for when we
*find* a match, so they just indicate that we pick the first match
that we find.

 * There is a comment at the head of win32_tzmap[] explaining where the
 data came from; you need to update that.

Right.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Thoughts on pg_hba.conf rejection

2010-04-07 Thread Simon Riggs

When there is a specific reject rule, why does the server say 

FATAL:  no pg_hba.conf entry

That sounds like an administrative error, rather than a specific
decision on the part of an admin to reject the connection. Suggested
message would be

FATAL: connection rejected for host xxx, user , database xxx

Clearly needs to be secure. Does the second message give any information
to a would-be hacker than the first? I don't think so, but it certainly
helps an admin work out if they've missed something.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Quoting in recovery.conf

2010-04-07 Thread Heikki Linnakangas
Fujii Masao wrote:
 On Wed, Apr 7, 2010 at 1:48 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Ok, here's what I came up with.
 
 Looks OK to me.

Committed.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()

2010-04-07 Thread Simon Riggs
On Wed, 2010-04-07 at 13:23 +0300, Heikki Linnakangas wrote:
 (moving to pgsql-hackers)
 
 Simon Riggs wrote:
  On Wed, 2010-04-07 at 06:12 +, Heikki Linnakangas wrote:
  Log Message:
  ---
  Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset() during
  recovery. We might want to relax this in the future, but ThisTimeLineID
  isn't currently correct in backends during recovery, so the filename
  returned was wrong.
  
  Any reason why we couldn't just do this:
  
  if (RecoveryInProgress())
  {
  volatile XLogCtlData *xlogctl = XLogCtl;
  XLogFileName(xlogfilename, xlogctl-ThisTimeLineID, 
  xlogid, xlogseg);
  }
  else
  XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
  
  
  rather than preventing access to those functions completely?
 
 Because if you do something like
 pg_xlogfile_name(pg_last_xlog_receive_location()),
 xloctl-ThisTimeLineId would not necessarily be the timeline
 corresponding the last received location. Even with
 pg_xlogfile_name(pg_last_xlog_replay_location()), there's a small race
 condition between those calls; if a checkpoint record is replayed in
 between that changes timeline, the returned filename doesn't correspond
 the name of the file where the replayed WAL record was read from, as you
 would expect.

If timelineId changed in normal operation, I'd be inclined to agree this
was a problem. It hardly ever changes, and can only change on standby
when that server is not yet streaming.

I'd rather have a function with a rare and documented weirdness, than no
function at all.

 This commit is a stop-gap solution until we figure out what exactly to
 do about that. Masao-san wrote a patch that included the TLI in the
 string returned by pg_last_xlog_receive/replay_location() (see
 http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa424930...@mail.gmail.com
 and
 http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf4...@mail.gmail.com),
 but it still wasn't clear it did the right thing in corner-cases where
 the TLI changes. 

 Using GetRecoveryTargetTLI() for the tli returned by
 pg_last_receive_location() seems bogus, at least.

Agree with that, using the current value makes most sense 
xlogctl-ThisTimeLineID

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining Streaming Replication Open Items

2010-04-07 Thread Robert Haas
On Tue, Apr 6, 2010 at 3:14 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Tue, 2010-04-06 at 11:06 -0400, Robert Haas wrote:

      * Redefine smart shutdown in standby mode?
  Drop. Too big a change at this point.
 
  We have a working patch for this - I want to commit it.  I don't think
  it's a big change, and the current behavior is extremely pathological.
 
  Oh, ok. I didn't look at the latest patch, if it looks good to you, fine
  with me.

 I'll commit it tonight.

 I don't see this on hackers. Have you posted it? I'd like to see what
 you do before it gets committed. Thanks.

It's the same patch Fujii Masao posted previously, for which I
previously said I would fix up the comments and docs and commit.  But
here is the adjusted version, which is hopefully more clear about what
we're doing at why we're doing it.

...Robert


smart-shutdown.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fallback_application_name and pgbench

2010-04-07 Thread Robert Haas
On Wed, Apr 7, 2010 at 5:30 AM, Dave Page dp...@pgadmin.org wrote:
 On Wed, Apr 7, 2010 at 10:16 AM, Magnus Hagander mag...@hagander.net wrote:
 Uh, why fallback_application_name? Isn't this the *exact* usecase
 where application_name should be used? At least for the two apps -
 fallback_app_name might be correct for dblink.

 And yes, I think it's a good idea to set it for at least pgbench and 
 oid2name.

 For pgbench, I can imagine scenarios where you might want to override
 the name in a script - for example, if you're trying to simulate an
 environment with multiple workload types running against the same
 database.

Agreed.

 I think that's a pretty bad idea in general. But if we do, then it
 should at least never override something that's specified - we need to
 keep the ability for tools like psql and pgadmin to set it, regardless
 of what they happen to have as argv[0].

 We discussed that during the development of the patch - the original
 idea was to default to argv[0] if no other value was set, but
 apparently there's no portable way to get at argv[0] from within
 libpq, even if you ignore Windows.

That's true, and I also agree with Magnus's commenet that it's a bad
idea anyway.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining Streaming Replication Open Items

2010-04-07 Thread Simon Riggs
On Wed, 2010-04-07 at 07:40 -0400, Robert Haas wrote:
 On Tue, Apr 6, 2010 at 3:14 PM, Simon Riggs si...@2ndquadrant.com wrote:
  On Tue, 2010-04-06 at 11:06 -0400, Robert Haas wrote:
 
   * Redefine smart shutdown in standby mode?
   Drop. Too big a change at this point.
  
   We have a working patch for this - I want to commit it.  I don't think
   it's a big change, and the current behavior is extremely pathological.
  
   Oh, ok. I didn't look at the latest patch, if it looks good to you, fine
   with me.
 
  I'll commit it tonight.
 
  I don't see this on hackers. Have you posted it? I'd like to see what
  you do before it gets committed. Thanks.
 
 It's the same patch Fujii Masao posted previously, for which I
 previously said I would fix up the comments and docs and commit.  But
 here is the adjusted version, which is hopefully more clear about what
 we're doing at why we're doing it.

OK, that looks a lot less risky than I had understood from discussions.
The main thing for me is it doesn't interfere with Startup or
WalReceiver, so assuming it works I've got no objections. Thanks for
chasing this down, good addition.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] system table/view and sequence

2010-04-07 Thread Olivier Baheux
i'm trying to find where are stored sequence definition
(increment,minvalue,maxvalue,start,cache) in system tables. Atm I
found everything exept sequence.

Thanks in advance.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Thoughts on pg_hba.conf rejection

2010-04-07 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 When there is a specific reject rule, why does the server say 
 FATAL:  no pg_hba.conf entry

It's intentional.  We try to expose the minimum amount of knowledge
about the contents of pg_hba.conf to potential attackers.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Thoughts on pg_hba.conf rejection

2010-04-07 Thread Jaime Casanova
On Wed, Apr 7, 2010 at 10:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 When there is a specific reject rule, why does the server say
 FATAL:  no pg_hba.conf entry

 It's intentional.  We try to expose the minimum amount of knowledge
 about the contents of pg_hba.conf to potential attackers.


i just tried it in CVS and in 8.4 and when i put a reject rule on
pg_hba.conf what i get is:
psql: FATAL:  no pg_hba.conf entry for host 127.0.0.1, user mic,
database mic

so we are giving a lot of info already changing no pg_hba.conf entry
for connection rejected doesn't seem like a lot more and the change
could be useful for a DBA understanding what happens

-- 
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Thoughts on pg_hba.conf rejection

2010-04-07 Thread Tom Lane
Jaime Casanova jcasa...@systemguards.com.ec writes:
 On Wed, Apr 7, 2010 at 10:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 It's intentional.  We try to expose the minimum amount of knowledge
 about the contents of pg_hba.conf to potential attackers.

 i just tried it in CVS and in 8.4 and when i put a reject rule on
 pg_hba.conf what i get is:
 psql: FATAL:  no pg_hba.conf entry for host 127.0.0.1, user mic,
 database mic

 so we are giving a lot of info already

All three of those data values are known to the client; they don't add
knowledge about what is in pg_hba.conf.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system table/view and sequence

2010-04-07 Thread Nicolas Barbier
2010/4/7 Olivier Baheux olivierbah...@gmail.com:

 i'm trying to find where are stored sequence definition
 (increment,minvalue,maxvalue,start,cache) in system tables. Atm I
 found everything exept sequence.

It's in the sequence itself (which can be accessed like a table). The
fact that this table is in fact a sequence is stored in pg_class:

8
itsme=# CREATE TABLE a (i serial);
HINWEIS:  CREATE TABLE erstellt implizit eine Sequenz »a_i_seq« für
die »serial«-Spalte »a.i«
CREATE TABLE
itsme=# SELECT * FROM a_i_seq;
 sequence_name | last_value | increment_by |  max_value  |
min_value | cache_value | log_cnt | is_cycled | is_called
---++--+-+---+-+-+---+---
 a_i_seq   |  1 |1 | 9223372036854775807 |
1 |   1 |   1 | f | f
(1 Zeile)

itsme=# SELECT relkind FROM pg_class WHERE relname = 'a_i_seq';
 relkind
-
 S
(1 Zeile)
8

Nicolas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Win32 timezone matching

2010-04-07 Thread Stefan Kaltenbrunner

Magnus Hagander wrote:

On Wed, Apr 7, 2010 at 00:48, Tom Lane t...@sss.pgh.pa.us wrote:

Magnus Hagander mag...@hagander.net writes:

On Wed, Apr 7, 2010 at 00:02, Tom Lane t...@sss.pgh.pa.us wrote:

Oh, another thought here: what is the effect of the combination of this
with your other proposal to add more timezones to the list?

[ none ]

Ah, right, I hadn't looked closely at the logic before.  Still have
two comments though:

* There are other error conditions that cause breaks in that loop.
Should we change the others?


The only other error condition is if we find a key but can't open it.
That indicates something that's *seriously* wrong, so I'm inclined to
keep that one. The other two break statements are actually for when we
*find* a match, so they just indicate that we pick the first match
that we find.


hmm all that code makes me wonder a bit about a more general issue - is 
the fallback to GMT if we fail to actually make sense of the right 
imezone to use actually a good idea?
In the case of the person I helped diagnosing the issue pg was bundled 
into some commercial app and the only reason the user immediately 
noticed that something was wrong was because the app used something 
like select now() to display the current time in the GUI.
I would consider the failure to make sense of the registry on windows or 
 failure to figure timezone information out a more serious issue than a 
mere WARNING because depending on how you look at the issue it might 
actually cause silent data corruption.



Stefan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Win32 timezone matching

2010-04-07 Thread Tom Lane
Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes:
 hmm all that code makes me wonder a bit about a more general issue - is 
 the fallback to GMT if we fail to actually make sense of the right 
 imezone to use actually a good idea?

What alternative are you proposing?  Failing to start the server doesn't
seem like an attractive choice.

 I would consider the failure to make sense of the registry on windows or 
   failure to figure timezone information out a more serious issue than a 
 mere WARNING because depending on how you look at the issue it might 
 actually cause silent data corruption.

Somehow, if you're running a database on windoze, I doubt your data
integrity standards are that high.  In any case I'd rather get a bleat
about why is the server running in GMT than my database won't start.
The former will be a lot easier to narrow down.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Win32 timezone matching

2010-04-07 Thread Stefan Kaltenbrunner

Tom Lane wrote:

Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes:
hmm all that code makes me wonder a bit about a more general issue - is 
the fallback to GMT if we fail to actually make sense of the right 
imezone to use actually a good idea?


What alternative are you proposing?  Failing to start the server doesn't
seem like an attractive choice.


why not? we do error out in a lot of other cases as well... Personally I 
find a hard and clear something is wrong please fix much more 
convinient than defaulting to something that is more or less completely 
arbitrary but well...




I would consider the failure to make sense of the registry on windows or 
  failure to figure timezone information out a more serious issue than a 
mere WARNING because depending on how you look at the issue it might 
actually cause silent data corruption.


Somehow, if you're running a database on windoze, I doubt your data
integrity standards are that high.  In any case I'd rather get a bleat
about why is the server running in GMT than my database won't start.
The former will be a lot easier to narrow down.


heh - except that we fail in that department - The only (not really 
useful hint) that pg logged was:


WARNUNG:  could not query value for 'std' to identify Windows timezone: 2

which says nothing about I failed to figure something sane out and so I 
have to fallback to GMT (which is what the !WIN32 code path seems to be 
actually doing but not the WIN32 code).
And even from the vendor perspective getting a support call on uhm the 
database for your app is not starting what logs should I look at seems 
better than hmm we are now 2 weeks in production and just noticed that 
all the timestamps are off by a few hours how can we fix our data?.
PostgreSQL is bundled with a lot of apps on windows these days so the 
enduser might not even aware of it (and look into the eventlog only to 
find a rather oddly phrased WARNING) unless it fails hard...





Stefan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Thoughts on pg_hba.conf rejection

2010-04-07 Thread Josh Berkus

 Clearly needs to be secure. Does the second message give any information
 to a would-be hacker than the first? I don't think so, but it certainly
 helps an admin work out if they've missed something.

I think this question needs a bona fide network security geek to decide,
rather than us database geeks.  Hello!  Is there a security hacker in
the house?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Win32 timezone matching

2010-04-07 Thread Robert Haas
On Wed, Apr 7, 2010 at 12:20 PM, Stefan Kaltenbrunner
ste...@kaltenbrunner.cc wrote:
 Tom Lane wrote:

 Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes:

 hmm all that code makes me wonder a bit about a more general issue - is
 the fallback to GMT if we fail to actually make sense of the right imezone
 to use actually a good idea?

 What alternative are you proposing?  Failing to start the server doesn't
 seem like an attractive choice.

 why not? we do error out in a lot of other cases as well... Personally I
 find a hard and clear something is wrong please fix much more convinient
 than defaulting to something that is more or less completely arbitrary but
 well...

While I can understand why someone might want that behavior in some
cases, in other cases it might be a severe overreaction.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Thoughts on pg_hba.conf rejection

2010-04-07 Thread Robert Haas
On Wed, Apr 7, 2010 at 10:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 When there is a specific reject rule, why does the server say
 FATAL:  no pg_hba.conf entry

 It's intentional.  We try to expose the minimum amount of knowledge
 about the contents of pg_hba.conf to potential attackers.

The problem with the message is not that it's uninformative, but that
it's counterfactual.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Win32 timezone matching

2010-04-07 Thread Magnus Hagander
On Wed, Apr 7, 2010 at 7:04 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Apr 7, 2010 at 12:20 PM, Stefan Kaltenbrunner
 ste...@kaltenbrunner.cc wrote:
 Tom Lane wrote:

 Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes:

 hmm all that code makes me wonder a bit about a more general issue - is
 the fallback to GMT if we fail to actually make sense of the right imezone
 to use actually a good idea?

 What alternative are you proposing?  Failing to start the server doesn't
 seem like an attractive choice.

 why not? we do error out in a lot of other cases as well... Personally I
 find a hard and clear something is wrong please fix much more convinient
 than defaulting to something that is more or less completely arbitrary but
 well...

 While I can understand why someone might want that behavior in some
 cases, in other cases it might be a severe overreaction.

I think the dangerous scenario is if it worked, and then stopped
working. In that case, the database will change it's behavior and it
might go unnoticed. If it's wrong on first install, it'll likely get
noticed..

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming replication and a disk full in primary

2010-04-07 Thread Robert Haas
On Wed, Apr 7, 2010 at 6:02 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 This task has been languishing for a long time, so I took a shot at it.
 I took the approach I suggested before, keeping a variable in shared
 memory to track the latest removed WAL segment. After walsender has read
 a bunch of WAL records from a WAL file, it checks that what it read is
 after the latest removed WAL segment, otherwise the data it read might
 have came from a file that was already recycled and overwritten with new
 data, and an error is thrown.

 This changes the behavior so that if a standby server doing streaming
 replication falls behind too much, the primary will remove/recycle a WAL
 segment needed by the standby server. The previous behavior was that WAL
 segments still needed by any connected standby server were never
 removed, at the risk of filling the disk in the primary if a standby
 server behaves badly.

 In your version of this patch, the default was still the current
 behavior where the primary retains WAL files that are still needed by
 connected stadby servers indefinitely. I think that's a dangerous
 default, so I changed it so that if you don't set standby_keep_segments,
 the primary doesn't retain any extra segments; the number of WAL
 segments available for standby servers is determined only by the
 location of the previous checkpoint, and the status of WAL archiving.
 That makes the code a bit simpler too, as we never care how far the
 walsenders are. In fact, the GetOldestWALSenderPointer() function is now
 dead code.

This seems like a very useful feature, but I can't speak to the code
quality without a good deal more study.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Default libpq application name

2010-04-07 Thread Bruce Momjian
I just emailed Dave Page and it seems we don't set a default application
name in libpq.  Should we do:

extern char *argv[];

and reference argv[0] in libpq to set a default application name?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Default libpq application name

2010-04-07 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 I just emailed Dave Page and it seems we don't set a default application
 name in libpq.  Should we do:
   extern char *argv[];
 and reference argv[0] in libpq to set a default application name?

No.  This is not portable and it's not a good idea even if it were.

The issue was discussed extensively when the patch was committed,
and does not need to be revisited now.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Win32 timezone matching

2010-04-07 Thread Stefan Kaltenbrunner

Magnus Hagander wrote:

On Wed, Apr 7, 2010 at 7:04 PM, Robert Haas robertmh...@gmail.com wrote:

On Wed, Apr 7, 2010 at 12:20 PM, Stefan Kaltenbrunner
ste...@kaltenbrunner.cc wrote:

Tom Lane wrote:

Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes:

hmm all that code makes me wonder a bit about a more general issue - is
the fallback to GMT if we fail to actually make sense of the right imezone
to use actually a good idea?

What alternative are you proposing?  Failing to start the server doesn't
seem like an attractive choice.

why not? we do error out in a lot of other cases as well... Personally I
find a hard and clear something is wrong please fix much more convinient
than defaulting to something that is more or less completely arbitrary but
well...

While I can understand why someone might want that behavior in some
cases, in other cases it might be a severe overreaction.


I think the dangerous scenario is if it worked, and then stopped
working. In that case, the database will change it's behavior and it
might go unnoticed. If it's wrong on first install, it'll likely get
noticed..


yeah that is one aspect - and in talking to the OP he would have 
prefered the database not starting up at all, logging an error and a 
hint on setting a fixed timezone in the conf.
Even if if keep the current fallback behaviour we should at least fix 
the windows codepath to do the same as the unix codepath does - as in 
actually logging that the fallback to GMT happened...




Stefan


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Win32 timezone matching

2010-04-07 Thread Tom Lane
Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes:
 yeah that is one aspect - and in talking to the OP he would have 
 prefered the database not starting up at all, logging an error and a 
 hint on setting a fixed timezone in the conf.

Well, you started from the statement that this was an embedded copy
of postgres ... so most users might not even know it was there,
much less to check its postmaster log for problems.  I'm still of
the opinion that refusing to start is an overreaction.

 Even if if keep the current fallback behaviour we should at least fix 
 the windows codepath to do the same as the unix codepath does - as in 
 actually logging that the fallback to GMT happened...

+1 for that anyway.  There already are WARNING messages for the various
Windows failure cases, but compared to the Unix code

ereport(LOG,
(errmsg(could not determine system time zone, defaulting to 
\%s\, GMT),
errhint(You can specify the correct timezone in postgresql.conf.)));

they lack either the note about defaulting to GMT or the hint.  I guess
we should add both of those to the failure cases in the Windows version
of identify_system_timezone.  Should we also change the WARNING errlevel
to LOG?  I think the latter is more likely to actually get into the log.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Win32 timezone matching

2010-04-07 Thread Tom Lane
I wrote:
 ereport(LOG,
 (errmsg(could not determine system time zone, defaulting to 
 \%s\, GMT),

BTW, does anyone remember the reason for making GMT nonlocalizable
in these messages?  It seems more straightforward to do

(errmsg(could not determine system time zone, defaulting to 
\GMT\),

I suppose we had a reason for doing it the first way but I can't see
what.  GMT seems a fairly English-centric way of referring to UTC
anyhow; translators might wish to put in UTC instead, or some other
spelling.  Shouldn't we let them?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Enhancing phonetic search support for more languages - GSoC 2010

2010-04-07 Thread Dhiraj Lohiya
Hello

I am Dhiraj Lohiya, Computer Science undergraduate from BITS Pilani. I
wanted to propose idea to improvise upon the *phonetic search support,
*initially
for some Indian languages like Hindi and Marathi with a framework for
extending it to other languages easily by contributing the rules in a simple
format. I am looking to take it forward as a* GSoC project*. Check out if
you find this interesting enough:

I plan to customize the soundex algorithm for all languages where each
language could have a different phonetic equivalent class of rules
(Generally around 20 rules for most Indian languages I have worked with).  I
would keep the approach layered so that support for multiple language rules
could be easily contributed and more languages could be added by others.

Moreover, since it is important that once a base set of rules are defined by
someone, the rules could themselves be added/evolve based on the user input
and usage.
For instance, if many users(above a threshold set by us) insert some
search string
for which no wanted search result is retrieved, we could track what he
finally selects and then accordingly append/modify our set of phonetic rules
based on the phonetic mismatch amongst the  query inserted and result wanted
according to our set of rules. Using this, the* rule sets it could evolve
itself when we collect usage statistics from users based on their
experience. *This feature would add a new dimension to the searchfunctionality
and would surely stand out.

Initially I plan to code this for few Indian languages like Hindi, Marathi
etc. and define a simple way (probably a gui on concept based on
GoogleImageLabeler http://images.google.com/imagelabeler/, wherein two
words which sound similar will be mapped for improving upon the rules set)
in which rules for different languages can be directly added and then people
knowing those languages could contribute.

*
Samples:*

   - Some case of Hindi songs,
   - if I search for a song which has word naiyya but I spell the word as
   ''nayya, presently no result would be returned since this is not in the
  playlist.
  - Moreover, if pyar is searched, the results vary than when pyaar
  is searched but it is easy to realize that both are the same and hence
  should give the same results.

*Some background on this:*
I have already worked out a basic customized version of soundex algorithm as
a part of my intern project at
PennyWiseSolutionshttp://www.pennywisesolutions.com/and implemented
it in java (which had features of self improving upon its
rule set based on the 2 input phonetically similar words as well). Right
now, the rule sets are designed only for Hindi and Marathi. The results are
narrowed down pretty well with much less false positives and this works well
with Marath and Hindi. Now since the algorithm part remains same (almost
equivalent to soundex) and only the rule set of other languages is to be
contributed which would be used by the algorithm to process, I guess this
could do. Some specific customization that was done included not to take
care of silent letters like in soundex since when spelling a Hindi word in
English, users don't really use silent letters.

I would be glad to have more input on this.

--
Regards
Dhiraj Lohiya


[HACKERS] FM suffix in to_char Y/YY/YYY still screwy

2010-04-07 Thread Tom Lane
I thought we'd fixed this ...

regression=# select to_char('2009-01-01'::date, 'YY');
 to_char 
-
 09
(1 row)

regression=# select to_char('2009-01-01'::date, 'FMYY');
 to_char 
-
 09
(1 row)

Not a lot of zero suppression happening there :-(.

I believe the correct fix is to reduce the year mod 100 (or 10 or 1000)
before feeding it to snprintf, rather than playing games with printing
only part of the result string as the original and current code try to
do.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enhancing phonetic search support for more languages - GSoC 2010

2010-04-07 Thread Josh Berkus
Dhiraj,

 For instance, if many users(above a threshold set by us) insert some 
 search  string for which no wanted  search  result is retrieved, we
 could track what he finally selects and then accordingly append/modify
 our set of phonetic rules based on the phonetic mismatch amongst the
  query inserted and result wanted according to our set of rules. Using
 this, the *  rule sets it could evolve itself when we collect usage
 statistics from users based on their experience.  * This feature would
 add a new dimension to the  search functionality and would surely stand
 out.

You're mixing two completely different kinds of features here.  One is a
backend function and the other is an application for building soundex
rules.  While both of these are interesting projects, it is unlikely you
can complete both in one summer.

What I'd suggest focussing on for SoC is creating a new soundex funciton
(suggested name: soundex_ml) which includes a facility for loadable
algorithms and callability on a per-language basis.  That would be
plenty of work by itself.  From there, you could then continue your
undergraduate work on the tool to build the algorithms in the first place.

I'm also curious why you chose to focus on the extremely imprecise
soundex instead of the more discriminating metaphone.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enhancing phonetic search support for more languages - GSoC 2010

2010-04-07 Thread Robert Haas
On Wed, Apr 7, 2010 at 4:24 PM, Dhiraj Lohiya lohiya.dhi...@gmail.com wrote:
 For instance, if many users(above a threshold set by us) insert
 some search string for which no wanted search result is retrieved, we could
 track what he finally selects and then accordingly append/modify our set of
 phonetic rules based on the phonetic mismatch amongst the  query inserted
 and result wanted according to our set of rules. Using this, the rule
 sets it could evolve itself when we collect usage statistics from users
 based on their experience. This feature would add a new dimension to
 the searchfunctionality and would surely stand out.

This is really more of an application than something you're going to
be able to build into the database.  It might be an interesting
project, but it isn't really a PostgreSQL project (though you might
choose to use PostgreSQL to implement it).

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [ADMIN] Compile Problem for Alpha 5 in /src/backend/utils/error/elog.c

2010-04-07 Thread Alvaro Herrera
Kiswono Prayogo escribió:
 Just reporting that Alpha 4 was working just fine..
 http://pastebin.com/ri2gXJDN
 
 but Alpha 5 didn't:

Yeah, known packaging bug.  Please remove src/Makefile.custom and try
again (notice the -Werror that shouldn't be there)


 elog.c:1698: error: ignoring return value of ‘write’, declared with
 attribute warn_unused_result
 elog.c: In function ‘write_pipe_chunks’:
 elog.c:2390: error: ignoring return value of ‘write’, declared with
 attribute warn_unused_result
 elog.c:2399: error: ignoring return value of ‘write’, declared with
 attribute warn_unused_result

Though, is there value in silencing these warnings?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] Compile Problem for Alpha 5 in /src/backend/utils/error/elog.c

2010-04-07 Thread Kiswono Prayogo
oic, thanks ^^

-- 
Regards,
Kiswono P
GB

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set LC_COLLATE to de_DE_phoneb

2010-04-07 Thread Takahiro Itagaki

Frank Jagusch fr...@jagusch-online.de wrote:

 de_DE_phoneb is the name of an alternative sorting in german (only a
 few languages have alternate sorting). You may find some information
 when you search the MSDN for de_DE_phoneb, i.e.
 http://msdn.microsoft.com/en-en/library/ms404373.aspx
 These alternate sorting is supported by the OS, but I don't know how it
 is supported in the msvcrt.

Hmmm, I found de-DE_phoneb in MSDN:
http://msdn.microsoft.com/en-us/library/dd374060
but setlocale(de-DE_phoneb) always fails at least on my machine.

The doc says de-DE_phoneb is a locale name for 
MAKELCID(MAKELANGID(LANG_GERMAN, SUBLANG_GERMAN), SORT_GERMAN_PHONE_BOOK).
Some of native Win32 APIs could accept the locale and sort-order
combination, but setlocale() in CRT seems to reject it.

So, you could use the locale if you find a setlocale-compatible name of
de-DE_phoneb. Or, you cannot use it, unless we modify PostgreSQL to
use Win32 locale functions instead of standard libc ones -- but it is
hardly acceptable.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining Streaming Replication Open Items

2010-04-07 Thread Robert Haas
On Wed, Apr 7, 2010 at 8:17 AM, Simon Riggs si...@2ndquadrant.com wrote:
 OK, that looks a lot less risky than I had understood from discussions.
 The main thing for me is it doesn't interfere with Startup or
 WalReceiver, so assuming it works I've got no objections. Thanks for
 chasing this down, good addition.

Thanks.  Committed.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining Streaming Replication Open Items

2010-04-07 Thread Robert Haas
On Tue, Apr 6, 2010 at 4:09 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Tue, 2010-04-06 at 10:09 +0300, Heikki Linnakangas wrote:

      *  Walsender and dblink are not interruptible on win32. - related 
  thread

 I'd actually be happy to just leave it for 9.0, but it seems like
 consensus has been reached on how to fix it, and Fujii is working on a
 patch, so let's follow that through.

 That one is a must, for me.

 I would put relaying easily above any of the other stuff. That is a
 truly useful feature that we are very close to being able to have in
 this release. Adding things like quotes is not moving us forwards in any
 important sense.

+1.  I think this is easily the most important remaining issue that we
need to fix, with the possible exception of the shutdown checkpoint
issue.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()

2010-04-07 Thread Fujii Masao
On Wed, Apr 7, 2010 at 7:23 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 This commit is a stop-gap solution until we figure out what exactly to
 do about that. Masao-san wrote a patch that included the TLI in the
 string returned by pg_last_xlog_receive/replay_location() (see
 http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa424930...@mail.gmail.com
 and
 http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf4...@mail.gmail.com),
 but it still wasn't clear it did the right thing in corner-cases where
 the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by
 pg_last_receive_location() seems bogus, at least.

Why? The tli of the last WAL record received is always the
recovery target tli currently. So using GetRecoveryTargetTLI()
for pg_last_xlog_receive_location() seems OK for me.
Am I missing something?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enhancing phonetic search support for more languages - GSoC 2010

2010-04-07 Thread Dhiraj Lohiya
I'm also curious why you chose to focus on the extremely imprecise
 soundex instead of the more discriminating metaphone.


The main reason to choose soundex over metaphone/double metaphone is for
Indian languages, soundex itself with some customizations works pretty well.
Use of Double Metaphone only increases upon the processing overhead
 alongwith the need to store 2 hashes but the performance would remain the
same since the way the words are pronounced in Indian languages is based on
the Phonology of Devnagri script in which we don't have silent letters and
other accent related inclusions (which was a major consideration that went
in the design of Double Metaphone). One more customization required with
reference to Indian languages is that the characters in the words aren't
taken one by one but are broken as substrings of continuous vowels and
consonants and accordingly are mapped to the equivalent class. Also, one
rule from metaphone needs to be incorporated wherein in soundex the first
letter of the word is not considered but  we would encode it also for the
corresponding equivalent class.

Now with this approach of Soundex (without consideration for silent letters
and breaking the word into substrings not on a character by character basis)
delivers with almost same performance and much less overhead compared to
Double metaphone with considerations for silent letters, accents etc. which
don't have much impact on Indian languages and hence this would be more
efficient.

For western languages, double metaphone is known to perform with great
results. Hence, it could be used.

My previous  mail was concentrated on soundex since I had also considered
how it would proceed to self improve its rule set of equivalent classes,
which is a little trickier in double metaphone whereas in soundex, we can
map the rules based on the  corresponding mapping that are present. But this
could be looked upon later whether we want to include this functionality as
well.

So for the SoC project, as proposed, probably I could concentrate on the
algorithmic part for multi-lingual support. Once the framework is set ready
with tutorials and wiki as to how to add rules for a new language, this
could be contributed upon for other users for more languages by the
community and after testing for a particular quality threshold, this could
be incorporated.

Thanks for the inputs. More suggestions/reviews please!

-- 
Regards
Dhiraj Lohiya


Re: [HACKERS] FM suffix in to_char Y/YY/YYY still screwy

2010-04-07 Thread Brendan Jurd
On 8 April 2010 06:46, Tom Lane t...@sss.pgh.pa.us wrote:
 Not a lot of zero suppression happening there :-(.

 I believe the correct fix is to reduce the year mod 100 (or 10 or 1000)
 before feeding it to snprintf, rather than playing games with printing
 only part of the result string as the original and current code try to
 do.

I think you're right.  FM should behave as advertised, even for years.

Cheers,
BJ

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] A maze of twisty mailing lists all the same

2010-04-07 Thread Greg Stark
I've often said in the past that we have too many mailing lists with
overlapping and vague charters. I submit the following thread as
evidence that this causes real problems.

http://archives.postgresql.org/message-id/g2o4b46b5f01004010610ib8625426uae6ee90ac1435...@mail.gmail.com

Because the poster chose to send it to pgsql-admin instead of
pgsql-general (or pgsql-bugs) very few of the usual suspects had a
chance to see it. 7 days later a question about a rather serious
database corruption problem had no responses. I've never understand
what the point of pgsql-admin is;  just about every question posted is
an admin question of some sort.

Likewise I don't think we should have pgsql-performance or pgsql-sql
or pgsql-novice -- any thread appropriate for any of these would be
better served by sending it to pgsql-general anyways (with the
exception of pgsql-performance which has a weird combination of hacker
threads and user performance tuning threads). Sending threads to
pgsql-general would get more eyes on them and would avoid a lot of the
cross-posting headaches. What would someone subscribed to one of these
lists but not pgsql-general get anyways but some random sample of
threads that might be vaguely performance or admin related. They would
still miss most of the administration and performance questions and
discussions which happen on -general and -hackers as appropriate.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A maze of twisty mailing lists all the same

2010-04-07 Thread Jaime Casanova
On Thu, Apr 8, 2010 at 1:11 AM, Greg Stark st...@mit.edu wrote:

 Likewise I don't think we should have pgsql-performance or pgsql-sql
 or pgsql-novice -- any thread appropriate for any of these would be
 better served by sending it to pgsql-general anyways (with the

+1

-- 
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Oddly indented raw_expression_tree_walker

2010-04-07 Thread Takahiro Itagaki
I found raw_expression_tree_walker() is oddly indented in 8.4 and HEAD.
I expected pgindent would fix those clutter, but it could not.
Should we cleanup it manually, or leave it as-is?
Also, should we backport such kind of cleanups to previous releases?

Index: src/backend/nodes/nodeFuncs.c
===
@@ -2198,7 +2198,7 @@
  * that could appear under it, but not other statement types.
  */
 bool
-   raw_expression_tree_walker(Node *node, bool (*walker) 
(), void *context)
+raw_expression_tree_walker(Node *node, bool (*walker) (), void *context)
 {
ListCell   *temp;
 

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers