st 2. 12. 2020 v 0:05 odesÃlatel Andrew Dunstan <and...@dunslane.net> napsal:
> > On 11/30/20 8:14 AM, Peter Eisentraut wrote: > > On 2020-11-29 18:36, Pavel Stehule wrote: > >> > >> I don't really get the point of this function. There is AFAICT no > >> function to produce this escaped format, and it's not a recognized > >> interchange format. So under what circumstances would one need to > >> use this? > >> > >> > >> Some corporate data can be in CSV format with escaped unicode > >> characters. Without this function it is not possible to decode these > >> files without external application. > > > > I would like some supporting documentation on this. So far we only > > have one stackoverflow question, and then this implementation, and > > they are not even the same format. My worry is that if there is not > > precise specification, then people are going to want to add things in > > the future, and there will be no way to analyze such requests in a > > principled way. > > > > > > > > > Also, should this be an extension? I'm dubious about including such > marginal uses in the core code unless there's a really good case for it. > I am not sure, and I am inclined so it should be core functionality. 1. Although this use case is marginal, this is related to most used encodings - ascii and unicode. 8 bit encodings enhanced about escaped multibyte chars will be used for a very long time. Unfortunately - this will be worse, because Postgres will be used more in the corporate environment, where there is a bigger press to conserve very legacy technologies without correct multibyte support. The core problem so this issue is out of concept bytea -> text or text -> bytea transformations supported by Postgres. This is text -> text transformation (for almost all encoding based on ascii), that is not supported by Postgres now. 2. Postgres already has this functionality - but unfortunately there is a limit just only literal constants. create or replace function uunescape(text) returns text as $$ declare r text; begin -- don't use this code!!! execute 'select e''' || $1 || '''' into r; return r; end; $$ language plpgsql immutable; But one way how anybody can use it is SQL injection vulnerable and slow. So some simple buildin solution can be protection against some future security issues. Personally I am happy with just this limited function that will be safe (although the design based on introducing new encoding and conversions can be more complete and accurate). I agree so this case is marginal, but it is a fully valid use case, and supporting unicode escaped codes just by parser is a needless limit. 3. there are new disadvantages of extensions in current DBaaS times. Until the extension is not directly accepted by a cloud provider, then the extension is not available for users. The acceptance of extensions is not too agile - so moving this code to extension doesn't solve this problem. Without DBaaS the implementation of this feature as the extensions can be good enough. Regards Pavel > > cheers > > > andrew > >