> On Jan 24, 2020, at 10:03 AM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> 
> The other is Mark's point about "expected file pattern", which seems a
> slippery slope to me.  If the pattern is /^[a-zA-Z0-9_.]*$/ then I'm
> okay with it (maybe add a few other punctuation chars); as you say no
> sane extension would use names much weirder than that.  But we should
> not be stricter, such as counting the number of periods/underscores
> allowed or where are alpha chars expected (except maybe disallow period
> at start of filename), or anything too specific like that.

What bothered me about skipping files based only on encoding is that it
creates hard-to-anticipate bugs.  Suppose an extension embeds something, like
a customer name, into a filename.  As long as that something is ASCII, or
valid UTF-8, the file gets backed up.  Then some day the extension embeds
something that is not ASCII/UTF-8, and the file does not get backed up.  Maybe
nobody notices until they actually *need* the backup, and by then it's too late.

We either need to be really strict about what gets backed up, so that nobody 
gets a false sense of security about what gets included in that list, or we 
need to be completely permissive, which would include files named in arbitrary 
encodings.  I don’t see how it does anybody any favors to make the system 
appear to back up everything until you hit this unanticipated case and then it 
fails.
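
To make the difference concrete, here is a minimal sketch in Python rather
than in the backend's C.  It is illustrative only and is not taken from any
patch: the function names and the sample filenames are hypothetical, and the
pattern is the one quoted above.  It compares an encoding-based filter, which
drops a name without complaint, against a strict policy that fails loudly and
a permissive policy that keeps everything.

```python
import re

# Pattern quoted earlier in this thread, applied to raw bytes.
# fullmatch() is used so a trailing newline cannot slip past the "$".
SAFE_NAME = re.compile(rb"^[a-zA-Z0-9_.]*$")


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes decode as UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def select_by_encoding(names):
    """Encoding-based filter: silently skips names that are not valid UTF-8."""
    return [n for n in names if is_valid_utf8(n)]


def select_strict(names):
    """Strict policy: raise an error for any name outside the pattern."""
    rejected = [n for n in names if not SAFE_NAME.fullmatch(n)]
    if rejected:
        raise ValueError(f"unexpected file names: {rejected!r}")
    return list(names)


def select_permissive(names):
    """Permissive policy: keep every name, whatever its encoding."""
    return list(names)


# Hypothetical names: the second embeds a Latin-1 byte (0xE9), which is
# not valid UTF-8 in this position.
names = [b"16384_acme.dat", b"16384_caf\xe9.dat"]

kept = select_by_encoding(names)      # second file is dropped, no error
everything = select_permissive(names) # both files are kept

try:
    select_strict(names)
    strict_failed = False
except ValueError:
    strict_failed = True              # the odd name is reported up front
```

With the encoding-based filter the second file simply vanishes from the
result, which is the case I am worried about.  Either of the other two
policies makes the outcome predictable: the file is kept, or the caller is
told at backup time that it was refused.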

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




