Hi Dave,

-- 
*Harshal Dhumal*
*Software Engineer*

EnterpriseDB India: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

On Mon, Feb 6, 2017 at 6:48 PM, Dave Page <dp...@pgadmin.org> wrote:

> Hi
>
> On Mon, Feb 6, 2017 at 12:57 PM, Harshal Dhumal
> <harshal.dhu...@enterprisedb.com> wrote:
> > Hi,
> >
> > Please find attached patch for RM 1983.
> >
> > This issue only occurs when database encoding is other than utf-8
> >
> > Also other issue was when we use connection of database with encoding
> other
> > than utf-8 to retrieve data from cluster table/s which has encoding utf-8
> > (e.g. pg_database) then data was not decoded properly.
>
> The code makes an assumption that pg_database is always utf-8 encoded.
> I don't believe that is correct - I believe it's the encoding used in
> the database from which the new database was created. The general
> advice is that users should avoid using non-7bit ASCII characters in
> shared catalogs, e.g. databases and comments etc.
>

Ok.

Let me split this into two issues:
i) RM1983 for which I have attached updated patch. (basically I removed
changes related to decode data retried from pg_database when connection
encoding is other than utf-8)
ii) Support to allow user to use non-&bit ASCII characters in shared
catalogs with the help of pgAdmin4.

Regarding your statement about pg_database *"I believe it's the encoding
used in the database from which the new database was created."**. *I found
it little-bit confusing for me (correct me if i'm wrong); As mentioned here
<https://www.postgresql.org/docs/9.5/static/catalog-pg-database.html> there
is only one copy of pg_database per cluster. So I assume pg_database is
created when we initialize database cluster and not when we create new
database.



>
> See https://www.postgresql.org/message-id/flat/20160216163833.GF31273%
> 40momjian.us#20160216163833.gf31...@momjian.us
> for more info for example.
>
> Did pgAdmin 3 just assume it was UTF-8? I suspect it did - and that
> just happened to work in most cases.
>
> --
> Dave Page
> Blog: http://pgsnake.blogspot.com
> Twitter: @pgsnake
>
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
diff --git a/web/pgadmin/utils/driver/psycopg2/__init__.py b/web/pgadmin/utils/driver/psycopg2/__init__.py
index 4cafeb2..4e460a6 100644
--- a/web/pgadmin/utils/driver/psycopg2/__init__.py
+++ b/web/pgadmin/utils/driver/psycopg2/__init__.py
@@ -26,7 +26,7 @@ from flask import g, current_app, session
 from flask_babel import gettext
 from flask_security import current_user
 from pgadmin.utils.crypto import decrypt
-from psycopg2.extensions import adapt
+from psycopg2.extensions import adapt, encodings
 
 import config
 from pgadmin.model import Server, User
@@ -74,6 +74,7 @@ psycopg2.extensions.register_type(
         'NUMERIC_RANGE_TEXT', psycopg2.STRING)
 )
 
+
 def register_string_typecasters(connection):
     """
     Casts various types to string, resolving issues with out of
@@ -94,6 +95,30 @@ def register_string_typecasters(connection):
     new_type = psycopg2.extensions.new_type(oids, 'RETURN_STRING', return_as_string)
     psycopg2.extensions.register_type(new_type)
 
+    # In python3 when database encoding is other than utf-8 and client encoding
+    # is set to UNICODE then we need to map data from database encoding
+    # to utf-8.
+    # This is required because when client encoding is set to UNICODE then
+    # psycopg assumes database encoding utf-8 and not the actual encoding.
+    # Not sure whether it's bug or feature in psycopg for python3.
+
+    if sys.version_info >= (3,) and connection.encoding != 'UTF8':
+        def return_as_unicode(value, cursor):
+            if value is None:
+                return None
+            # Treat value as byte sequence of database encoding and then decode
+            # it as utf-8 to get correct unicode value.
+            return bytes(
+                value, encodings[cursor.connection.encoding]
+            ).decode('utf-8')
+
+        unicode_type = psycopg2.extensions.new_type(
+            (19, 18, 25, 1042, 1043, 0),
+            'UNICODE', return_as_unicode)
+
+        psycopg2.extensions.register_type(unicode_type)
+
+
 class Connection(BaseConnection):
     """
     class Connection(object)
@@ -568,11 +593,17 @@ WHERE
             query: SQL query to run.
             params: Extra parameters
         """
+
+        if sys.version_info < (3,):
+            if type(query) == unicode:
+                query = query.encode('utf-8')
+        else:
+            query = query.encode('utf-8')
+
         cur.execute(query, params)
         if self.async == 1:
             self._wait(cur.connection)
 
-
     def execute_on_server_as_csv(self, query, params=None, formatted_exception_msg=False, records=2000):
         status, cur = self.__cursor(server_cursor=True)
         self.row_count = 0
@@ -581,11 +612,14 @@ WHERE
             return False, str(cur)
         query_id = random.randint(1, 9999999)
 
+        if sys.version_info < (3,) and type(query) == unicode:
+            query = query.encode('utf-8')
+
         current_app.logger.log(25,
                                u"Execute (with server cursor) for server #{server_id} - {conn_id} (Query-id: {query_id}):\n{query}".format(
                                    server_id=self.manager.sid,
                                    conn_id=self.conn_id,
-                                   query=query,
+                                   query=query.decode('utf-8') if sys.version_info < (3,) else query,
                                    query_id=query_id
                                )
                                )
@@ -703,6 +737,13 @@ WHERE
             params: extra parameters to the function
             formatted_exception_msg: if True then function return the formatted exception message
         """
+
+        if sys.version_info < (3,):
+            if type(query) == unicode:
+                query = query.encode('utf-8')
+        else:
+            query = query.encode('utf-8')
+
         self.__async_cursor = None
         status, cur = self.__cursor()
 
@@ -715,7 +756,7 @@ WHERE
             u"Execute (async) for server #{server_id} - {conn_id} (Query-id: {query_id}):\n{query}".format(
                 server_id=self.manager.sid,
                 conn_id=self.conn_id,
-                query=query,
+                query=query.decode('utf-8'),
                 query_id=query_id
             )
         )
@@ -733,7 +774,7 @@ Failed to execute query (execute_async) for the server #{server_id} - {conn_id}
 """.format(
                 server_id=self.manager.sid,
                 conn_id=self.conn_id,
-                query=query,
+                query=query.decode('utf-8'),
                 errmsg=errmsg,
                 query_id=query_id
             )
-- 
Sent via pgadmin-hackers mailing list (pgadmin-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgadmin-hackers

Reply via email to