Re: [HACKERS] Hostnames in pg_hba.conf

2010-07-19 Thread Peter Eisentraut
On tor, 2010-02-11 at 14:13 +0100, Bart Samwel wrote:
 I've been working on a patch to add hostname support to pg_hba.conf.
 It's not ready for public display yet, but I would just like to run a
 couple of issues / discussion points past everybody.

What's the status of this?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hostnames in pg_hba.conf

2010-02-12 Thread Bart Samwel
On Fri, Feb 12, 2010 at 02:31, Mark Mielke m...@mark.mielke.cc wrote:

  But once there, it seems clear that packing hostnames or netmasks onto one
 line is just ugly and hard to manage. I'd like to see this extended to any
 of the many ways to allow hostnames to be specified one per line. For
 example:

 set tool_servers {
 127.0.0.1/32
 ::1/128
 1.2.3.4/32
 1.2.3.5/32
 }

 host DATABASE USER $tool_servers md5

 The above features easy parsing capability.

 Of course, then I'll ask for the ability to simplify specifying multiple
 databases:

 set databases {
 db1
 db2
 }

 set users {
 user1
 user2
 }

 host $databases $users $tool_servers md5

 Sorry... :-)


Definitely sounds useful! But I do now see that this is entirely orthogonal
to what I was trying to do -- which means I don't have to do anything about
it. :-)


  I think wildcards are interesting, but I have yet to see an actual use
 case other than it's cool and very generalized. In my mind (tell me if I'm
 wrong), the most common type of PostgreSQL authentication setup is within a
 local network within an organization. There, you either authorize an entire
 subnet (the entire server park or all client PCs) or you authorize
 specific hosts (single IP address). The wildcard case is for replacing the
 first case, but for that case, subnets are usually just fine. I'm trying to
 target the second case here.


 The user case would be an organization with nodes all over the IP space,
 that wants to manage configuration from a single place. DNS would be that
 single place of choice. If moves trust from trust the netmasks to be kept
 up-to-date to trust that DNS will be kept up-to-date. Since DNS has
 important reasons to be up-to-date, it's a pretty safe bet that DNS is equal
 or more up-to-date than pg_hba.conf hard coded netmasks. It makes sense, but
 it can be a later use case. It doesn't have to be in version 1.


DNS is preferred to subnets in that regard, definitely. But again, that
points to the per-hostname route, and it's not a use case for the wildcard
route (unless people explicitly choose to organize their DNS hierarchy so
that they can use it for PostgreSQL authorization -- doubtful.)

Cheers,
Bart


Re: [HACKERS] Hostnames in pg_hba.conf

2010-02-12 Thread Peter Eisentraut
On tor, 2010-02-11 at 14:13 +0100, Bart Samwel wrote:
 I've been working on a patch to add hostname support to pg_hba.conf.
 It's not ready for public display yet, but I would just like to run a
 couple of issues / discussion points past everybody.

It might be good to review Apache's hostname-based access control:
http://httpd.apache.org/docs/2.2/mod/mod_authz_host.html#allow


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hostnames in pg_hba.conf

2010-02-11 Thread Kevin Grittner
Bart Samwel b...@samwel.tk wrote:
 
 I've been working on a patch to add hostname support to
 pg_hba.conf.
 
 At present, I've simply not added caching.
 
Perhaps you could just recommend using nscd (or similar).
 
 There was a suggestion on the TODO list on the wiki, which
 basically said that maybe we could use reverse lookup to find
 the hostname and then check for that hostname in the list. I
 think that won't work, since IPs can go by many names and may not
 support reverse lookup for some hostnames (/etc/hosts anybody?).
 
Right.  Any reverse lookup should be, at best, for display in error
messages or logs.  There can be zero to many names for an IP
address.
 
 Currently, a pg_hba entry lists an IP / netmask combination. I
 would suggest allowing lists of hostnames in the entries, so that
 you can at least mimic the match multiple hosts by a single
 rule. Any reason not to do this?
 
I can't see any reason other than code complexity.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hostnames in pg_hba.conf

2010-02-11 Thread Mark Mielke

On 02/11/2010 08:13 AM, Bart Samwel wrote:

ISSUE #1: Performance / caching

At present, I've simply not added caching. The reasoning for this is 
as follows:

(a) getaddrinfo doesn't tell us about expiry, so when do you refresh?
(b) If you put the cache in the postmaster, it will not work for 
exec-based backends as opposed to fork-based backends, since those 
read pg_hba.conf every time they are exec'ed.
(c) If you put this in the postmaster, the postmaster will have to 
update the cache every once in a while, which may be slow and which 
may prevent new connections while the cache update takes place.
(d) Outdated cache entries may inexplicably and without any logging 
choose the wrong rule for some clients. Big aargh: people will start 
using this to specify 'deny' rules based on host names.


If you COULD get expiry info out of getaddrinfo you could potentially 
store this info in a table or something like that, and have it updated 
by the backends? But that's way over my head for now. ISTM that this 
stuff may better be handled by a locally-running caching DNS server, 
if people have performance issues with the lack of caching. These 
local caching DNS servers can also handle expiry correctly, etcetera.


We should of course still take care to look up a given hostname only 
once for each connection request.


You should cache for some minimal amount of time or some minimal number 
of records - even if it's just one minute, and even if it's a fixed 
length LRU sorted list. This would deal with situations where a new 
connection is raised several times a second (some types of load). For 
connections raised once a minute or less, the benefit of caching is far 
less. But, this can be a feature tagged on later if necessary and 
doesn't need to gate the feature.


Many UNIX/Linux boxes have some sort of built-in cache, sometimes 
persistent, sometimes shared. On my Linux box, I have nscd - name 
server caching daemon - which should be able to cache these sorts of 
lookups. I believe it is used for things as common as mapping uid to 
username in output of /bin/ls -l, so it does need to be pretty fast.


The difference between in process cache and something like nscd is the 
inter-process communication required to use nscd.




ISSUE #2: Reverse lookup?

There was a suggestion on the TODO list on the wiki, which basically 
said that maybe we could use reverse lookup to find the hostname and 
then check for that hostname in the list. I think that won't work, 
since IPs can go by many names and may not support reverse lookup for 
some hostnames (/etc/hosts anybody?). Furthermore, due to the 
top-to-bottom processing of pg_hba.conf, you CANNOT SKIP entries that 
might possibly match. For instance, if the third line is for host 
foo.example.com http://foo.example.com and the fifth line is for 
bar.example.com http://bar.example.com, both lines may apply to 
the same IP, and you still HAVE to check the first one, even if 
reverse lookup turns up the second host name. So it doesn't save you 
any lookups, it just costs an extra one.


I don't see a need to do a reverse lookup. Reverse lookups are sometimes 
done as a verification check, in the sense that it's cheap to get a map 
from NAME - IP, but sometimes it is much harder to get the reverse map 
from IP - NAME. However, it's not a reliable check as many legitimate 
users have trouble getting a reverse map from IP - NAME. It also 
doesn't same anything as IP - NAME lookups are a completely different 
set of name servers, and these name servers are not always optimized for 
speed as IP - NAME lookups are less common than NAME - IP. Finally, if 
one finds a map from IP - NAME, that doesn't prove that a map from NAME 
- IP exists, so using *any* results from IP - NAME is questionable.


I think reverse lookups are unnecessary and undesirable.


ISSUE #3: Multiple hostnames?

Currently, a pg_hba entry lists an IP / netmask combination. I would 
suggest allowing lists of hostnames in the entries, so that you can at 
least mimic the match multiple hosts by a single rule. Any reason 
not to do this?


I'm mixed. In some situations, I've wanted to put multiple IP/netmask. I 
would say that if multiple names are supported, then multiple IP/netmask 
should be supported. But, this does make the lines unwieldy beyond two 
or three. This direction leans towards the capability to define host 
classes, where the rules allows the host class, and the host class can 
have a list of hostnames.


Two other aspects I don't see mentioned:

1) What will you do for hostnames that have multiple IP addresses? Will 
you accept all IP addresses as being valid?
2) What will you do if they specify a hostname and a netmask? This seems 
like a convenient way of saying everybody on the same subnet as NAME.


Cheers,
mark

--
Mark Mielkem...@mielke.cc



Re: [HACKERS] Hostnames in pg_hba.conf

2010-02-11 Thread Tom Lane
Bart Samwel b...@samwel.tk writes:
 I've been working on a patch to add hostname support to pg_hba.conf.

Have you read the previous discussions about that?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hostnames in pg_hba.conf

2010-02-11 Thread Bart Samwel
On Thu, Feb 11, 2010 at 16:36, Mark Mielke m...@mark.mielke.cc wrote:

  On 02/11/2010 08:13 AM, Bart Samwel wrote:

ISSUE #2: Reverse lookup?

 There was a suggestion on the TODO list on the wiki, which basically said
 that maybe we could use reverse lookup to find the hostname and then check
 for that hostname in the list. I think that won't work, since IPs can go by
 many names and may not support reverse lookup for some hostnames (/etc/hosts
 anybody?). Furthermore, due to the top-to-bottom processing of pg_hba.conf,
 you CANNOT SKIP entries that might possibly match. For instance, if the
 third line is for host foo.example.com and the fifth line is for 
 bar.example.com, both lines may apply to the same IP, and you still HAVE
 to check the first one, even if reverse lookup turns up the second host
 name. So it doesn't save you any lookups, it just costs an extra one.


 I don't see a need to do a reverse lookup. Reverse lookups are sometimes
 done as a verification check, in the sense that it's cheap to get a map from
 NAME - IP, but sometimes it is much harder to get the reverse map from IP
 - NAME. However, it's not a reliable check as many legitimate users have
 trouble getting a reverse map from IP - NAME. It also doesn't same anything
 as IP - NAME lookups are a completely different set of name servers, and
 these name servers are not always optimized for speed as IP - NAME lookups
 are less common than NAME - IP. Finally, if one finds a map from IP -
 NAME, that doesn't prove that a map from NAME - IP exists, so using *any*
 results from IP - NAME is questionable.

 I think reverse lookups are unnecessary and undesirable.


 ISSUE #3: Multiple hostnames?

 Currently, a pg_hba entry lists an IP / netmask combination. I would
 suggest allowing lists of hostnames in the entries, so that you can at least
 mimic the match multiple hosts by a single rule. Any reason not to do
 this?


 I'm mixed. In some situations, I've wanted to put multiple IP/netmask. I
 would say that if multiple names are supported, then multiple IP/netmask
 should be supported. But, this does make the lines unwieldy beyond two or
 three. This direction leans towards the capability to define host classes,
 where the rules allows the host class, and the host class can have a list of
 hostnames.


Yes, but before you know it people will ask for being able to specify
multiple host classes. :-) Quite simply put, with a single subnet you can
allow multiple hosts in. Allowing only a single hostname is a step backward
from that, so adding support for multiple hostnames could be useful if
somebody is replacing subnets with hostname-based configuration.

Two other aspects I don't see mentioned:

 1) What will you do for hostnames that have multiple IP addresses? Will you
 accept all IP addresses as being valid?


Yes, all addresses returned by (pg_)getaddrinfo will be considered valid.
Most importantly, this ensures that if a host has an IPv4 and an IPv6
address they are both accepted. Plus, if there are multiple addresses, we
have no clue of figuring out which address is the address. :-)


 2) What will you do if they specify a hostname and a netmask? This seems
 like a convenient way of saying everybody on the same subnet as NAME.


Not supported. Either an IP address / netmask combo, or a hostname, but not
both. I wouldn't want to recommend hardcoding something such as netmasks
(which are definitely subnet dependent) in combination with something as
volatile as a host name -- move it to a different subnet, and you might
allow a whole bigger subnet than you intended. If they want to specify a
netmask, then they should just use hardcoded IPs as well.

Cheers,
Bart


Re: [HACKERS] Hostnames in pg_hba.conf

2010-02-11 Thread Mark Mielke

On 02/11/2010 04:54 PM, Bart Samwel wrote:
On Thu, Feb 11, 2010 at 16:36, Mark Mielke m...@mark.mielke.cc 
mailto:m...@mark.mielke.cc wrote:



ISSUE #3: Multiple hostnames?

Currently, a pg_hba entry lists an IP / netmask combination. I
would suggest allowing lists of hostnames in the entries, so that
you can at least mimic the match multiple hosts by a single
rule. Any reason not to do this?


I'm mixed. In some situations, I've wanted to put multiple
IP/netmask. I would say that if multiple names are supported, then
multiple IP/netmask should be supported. But, this does make the
lines unwieldy beyond two or three. This direction leans towards
the capability to define host classes, where the rules allows
the host class, and the host class can have a list of hostnames.


Yes, but before you know it people will ask for being able to specify 
multiple host classes. :-) Quite simply put, with a single subnet you 
can allow multiple hosts in. Allowing only a single hostname is a step 
backward from that, so adding support for multiple hostnames could be 
useful if somebody is replacing subnets with hostname-based configuration.


This implies two aspects which may not be true:

1) All hosts that I want to allow belong to the same subnet.
2) If I trust one host on the subnet, then I trust all hosts on the 
subnet.


While the above two points are often true, they are not universally true.



2) What will you do if they specify a hostname and a netmask? This
seems like a convenient way of saying everybody on the same
subnet as NAME.


Not supported. Either an IP address / netmask combo, or a hostname, 
but not both. I wouldn't want to recommend hardcoding something such 
as netmasks (which are definitely subnet dependent) in combination 
with something as volatile as a host name -- move it to a different 
subnet, and you might allow a whole bigger subnet than you intended. 
If they want to specify a netmask, then they should just use hardcoded 
IPs as well.


Ah yes, I recall this from a previous thread. I think I also disagreed 
on the other thread. :-)


I thought of a use for reverse lookup - it would allow wild card 
hostnames. Still, that's an advanced feature that might be for later... :-)


Cheers,
mark

--
Mark Mielkem...@mielke.cc



Re: [HACKERS] Hostnames in pg_hba.conf

2010-02-11 Thread Bart Samwel
On Thu, Feb 11, 2010 at 17:21, Tom Lane t...@sss.pgh.pa.us wrote:

 Bart Samwel b...@samwel.tk writes:
  I've been working on a patch to add hostname support to pg_hba.conf.

 Have you read the previous discussions about that?


Yes, mostly.

The previous discussions included all sorts of complex stuff such as
wildcards. Personally, I'd think that in the cases where you'd want
wildcards, then you should use IP / netmask configuration, because that's a
way better indicator of something that comes from the same source network
entity. For instance, wildcards are nice for all our own servers, except
that you'd normally use IP / netmaks  to indicate your own server subnet.

The way I see it, hostname based configuration should be plain and simple.
You suggested in one of the earlier discussions that it should not be much
more than removing the AI_NUMERICHOST hint in the lookup. My current
solution is slightly more involved, since it performs the by-hostname lookup
at check time, not at pg_hba.conf read time -- but there is not much more
complexity involved. If there is a case for more complexity, then we will
hear the actual use cases after this basic support is added, I guess.

Cheers,
Bart


Re: [HACKERS] Hostnames in pg_hba.conf

2010-02-11 Thread Bart Samwel
On Thu, Feb 11, 2010 at 23:01, Mark Mielke m...@mark.mielke.cc wrote:

  On 02/11/2010 04:54 PM, Bart Samwel wrote:

   ISSUE #3: Multiple hostnames?

 Currently, a pg_hba entry lists an IP / netmask combination. I would
 suggest allowing lists of hostnames in the entries, so that you can at least
 mimic the match multiple hosts by a single rule. Any reason not to do
 this?


  I'm mixed. In some situations, I've wanted to put multiple IP/netmask. I
 would say that if multiple names are supported, then multiple IP/netmask
 should be supported. But, this does make the lines unwieldy beyond two or
 three. This direction leans towards the capability to define host classes,
 where the rules allows the host class, and the host class can have a list of
 hostnames.


 Yes, but before you know it people will ask for being able to specify
 multiple host classes. :-) Quite simply put, with a single subnet you can
 allow multiple hosts in. Allowing only a single hostname is a step backward
 from that, so adding support for multiple hostnames could be useful if
 somebody is replacing subnets with hostname-based configuration.


 This implies two aspects which may not be true:

 1) All hosts that I want to allow belong to the same subnet.
 2) If I trust one host on the subnet, then I trust all hosts on the
 subnet.

 While the above two points are often true, they are not universally true.


I don't think we're talking about the same thing here. I wasn't suggesting
doing hostname-plus-netmask. NO! I was suggesting that where a lazy sysadmin
would previously configure by subnet, they might switch to more fine-grained
hostname-based configuration ONLY IF it doesn't require duplicating every
line in pg_hba.conf for every host in the subnet.

 2) What will you do if they specify a hostname and a netmask? This seems
 like a convenient way of saying everybody on the same subnet as NAME.


Not supported. Either an IP address / netmask combo, or a hostname, but not
both. I wouldn't want to recommend hardcoding something such as netmasks
(which are definitely subnet dependent) in combination with something as
volatile as a host name -- move it to a different subnet, and you might
allow a whole bigger subnet than you intended. If they want to specify a
netmask, then they should just use hardcoded IPs as well.


Ah yes, I recall this from a previous thread. I think I also disagreed on
 the other thread. :-)

 I thought of a use for reverse lookup - it would allow wild card hostnames.
 Still, that's an advanced feature that might be for later... :-)


I think wildcards are interesting, but I have yet to see an actual use case
other than it's cool and very generalized. In my mind (tell me if I'm
wrong), the most common type of PostgreSQL authentication setup is within a
local network within an organization. There, you either authorize an entire
subnet (the entire server park or all client PCs) or you authorize
specific hosts (single IP address). The wildcard case is for replacing the
first case, but for that case, subnets are usually just fine. I'm trying to
target the second case here.

Cheers,
Bart


Re: [HACKERS] Hostnames in pg_hba.conf

2010-02-11 Thread Mark Mielke

On 02/11/2010 05:12 PM, Bart Samwel wrote:
On Thu, Feb 11, 2010 at 23:01, Mark Mielke m...@mark.mielke.cc 
mailto:m...@mark.mielke.cc wrote:


On 02/11/2010 04:54 PM, Bart Samwel wrote:



ISSUE #3: Multiple hostnames?

Currently, a pg_hba entry lists an IP / netmask combination.
I would suggest allowing lists of hostnames in the entries,
so that you can at least mimic the match multiple hosts by
a single rule. Any reason not to do this?


I'm mixed. In some situations, I've wanted to put multiple
IP/netmask. I would say that if multiple names are supported,
then multiple IP/netmask should be supported. But, this does
make the lines unwieldy beyond two or three. This direction
leans towards the capability to define host classes, where
the rules allows the host class, and the host class can have
a list of hostnames.


Yes, but before you know it people will ask for being able to
specify multiple host classes. :-) Quite simply put, with a
single subnet you can allow multiple hosts in. Allowing only a
single hostname is a step backward from that, so adding support
for multiple hostnames could be useful if somebody is replacing
subnets with hostname-based configuration.


This implies two aspects which may not be true:

1) All hosts that I want to allow belong to the same subnet.
2) If I trust one host on the subnet, then I trust all hosts
on the subnet.

While the above two points are often true, they are not
universally true.


I don't think we're talking about the same thing here. I wasn't 
suggesting doing hostname-plus-netmask. NO! I was suggesting that 
where a lazy sysadmin would previously configure by subnet, they might 
switch to more fine-grained hostname-based configuration ONLY IF it 
doesn't require duplicating every line in pg_hba.conf for every host 
in the subnet.


Ah yes. You are focusing on allowing a netmask to expand to hostnames. 
I'm focusing on how netmasks were never that great on their own.


You want to allow multiple hosts - I want you to allow multiple 
netmasks. I think the requirement is the same. I also think that same 
line has always been an annoying restriction. I have many duplicated 
lines today just for:


host DATABASE USER 127.0.0.1/32 md5
host DATABASE USER ::1/128 md5

Isn't that a big silly? If you think it's acceptable to allow multiple 
hostname, I'm pointing out that your requirement is not limited to 
hostnames only. Why not?


host DATABASE USER 127.0.0.1/32,::1/128 md5

Same requirements, same syntax (assuming you were suggesting ','), same 
documentation. Why not?


But once there, it seems clear that packing hostnames or netmasks onto 
one line is just ugly and hard to manage. I'd like to see this extended 
to any of the many ways to allow hostnames to be specified one per line. 
For example:


set tool_servers {
127.0.0.1/32
::1/128
1.2.3.4/32
1.2.3.5/32
}

host DATABASE USER $tool_servers md5

The above features easy parsing capability.

Of course, then I'll ask for the ability to simplify specifying multiple 
databases:


set databases {
db1
db2
}

set users {
user1
user2
}

host $databases $users $tool_servers md5

Sorry... :-)


2) What will you do if they specify a hostname and a netmask?
This seems like a convenient way of saying everybody on the same
subnet as NAME.


Not supported. Either an IP address / netmask combo, or a hostname, 
but not both. I wouldn't want to recommend hardcoding something such 
as netmasks (which are definitely subnet dependent) in combination 
with something as volatile as a host name -- move it to a different 
subnet, and you might allow a whole bigger subnet than you intended. 
If they want to specify a netmask, then they should just use 
hardcoded IPs as well.


Ah yes, I recall this from a previous thread. I think I also
disagreed on the other thread. :-)

I thought of a use for reverse lookup - it would allow wild card
hostnames. Still, that's an advanced feature that might be for
later... :-)


I think wildcards are interesting, but I have yet to see an actual use 
case other than it's cool and very generalized. In my mind (tell me 
if I'm wrong), the most common type of PostgreSQL authentication setup 
is within a local network within an organization. There, you either 
authorize an entire subnet (the entire server park or all client 
PCs) or you authorize specific hosts (single IP address). The 
wildcard case is for replacing the first case, but for that case, 
subnets are usually just fine. I'm trying to target the second case here.


The user case would be an organization with nodes all over the IP space, 
that wants to manage configuration from a single place. DNS would be 
that single place of choice. If moves trust from trust the netmasks to 
be kept up-to-date to trust that DNS will be kept 

Re: [HACKERS] Hostnames in pg_hba.conf

2010-02-11 Thread Euler Taveira de Oliveira
Mark Mielke escreveu:
 Of course, then I'll ask for the ability to simplify specifying multiple
 databases:
 
We already support multiple users and/or databases for a single pg_hba.conf
line ...


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hostnames in pg_hba.conf

2010-02-11 Thread Mark Mielke

On 02/11/2010 09:38 PM, Euler Taveira de Oliveira wrote:

Mark Mielke escreveu:
   

Of course, then I'll ask for the ability to simplify specifying multiple
databases:

 

We already support multiple users and/or databases for a single pg_hba.conf
line ...
   


Is there a reason you trimmed out my points about specifying classes 
as a list of data (host, db, user) and using it in host lines? :-)


Simplifying specifying multiple databases in the sense of if a 
particular line has to list 10 databases, 10 users, and 10 hostnames, it 
would be most convenient to specify 3 classes with 10 records each, 
followed by one host line, instead of have 10 lines with 20 entries on 
it or 1000 lines... :-)


Cheers,
mark

--
Mark Mielkem...@mielke.cc


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers