On Mon, Oct 16, 2017 at 3:36 PM Daniel Shahaf <d...@daniel.shahaf.name>
wrote:

>
> > It has been quite some time since I was last on here, but as I was
> looking
>
> Almost a decade, according to contribulyzer. :-)
>
> Yeah, you know it has been a while when your feature shows up in RHEL..2
releases ago (RHEL6)!


>
> There hasn't been any suggestion to deprecate swig-py.  Moreover, they
> are our favourite bindings for tools/ scripts, so I don't anticipate
> them to be deprecated, either.
>
> That said, the bindings see few changes nowadays, and we have always had
> few swig-savvy devs around; so any help would be most welcome.
>
> Perfect, that's great to hear.  This is what I assumed, but I wanted to be
sure.


> > I also wanted to know of any partial efforts that might have
> > already been started, or if there were discussions related to the
> > implementation that my searches did not turn up.
>
> There are several separate uses of Python in the source tree.  I recall
> patches to build/, tools/, and subversion/tests/cmdline/ that improve
> 3.x compatibility, but I don't recall any such changes to the bindings.
> Note that we have both SWIG bindings at subversion/bindings/swig/python/
> and ctypes bindings at subversion/bindings/ctypes-python/.
>
> I take it that of all these, you're interested in the swig-py bindings?
> Or are the build system and test suite also within your scope?
>
>
Well I certainly don't want to half way do it, so while my initial target
was the swig bindings, I'll take a look at the full set.

You said the patch was going to be a largeish one.  How large/invasive
> are we talking about?  (This affects how easy it'd be for us to
> review/apply it)
>

Yes, I suspect the effort will touch a fair number of code, but much of it
will be direct substitution, and thus should be reasonable to review.
Plus, it'll certainly be broken up into smaller, more manageable commits.
I'll also be planning to simultaneously build python3 and python2 bindings,
since I'll be eyeing future Fedora (and presumably other distro) packaging,
which will touch a few front end build pieces.

In fact, to get the conversation going I've attached a patch which gives a
sense of the road ahead.  This is where I got to yesterday before deciding
that I should probably start talking to the dev team about desires and
direction.  I believe that it should consist mostly of replacing various
functions deprecated/removed in Py3 with wrappers to consolidate all the
conditional compiling into a common location.  Then substituting the use of
those functions.

My initial assessment is there there are really only a handful of the
deprecated functions in use by the current subversion python bindings.
However, it may make sense to use the py3c project [1], which already
provides most (all?) of the necessary shims and compatibility functions.
However, I'm not sure whether this (header-only) dependency is something
you really want to pick up or not.  As I don't think there is a wide
variety of functions that need wrappers, it may not be worth the new dep.

There is one fairly big decision to be made.

As you may or may not know, one of the primary user facing changes of
Python 3 is the migration of the "string" type to "unicode" types.
Previously in python 2 the "string" type basically represented a sequence
of bytes, and when you printed it or did a few other manipulations you
assumed/hoped/prayed it was actually in a valid encoding (typically assumed
to be ASCII/latin-1/uft8).  Now in Python 3 basically all I/O operation
result in a more accurate 'bytes' object, which you then 'decode' into a
"unicode" object, explicitly indicating the encoding format.

In practice, you can almost always use 'utf8', but of course that is
'almost'.  I'm sure there are scenarios when you know that you have some
other encoding coming in or going out that you would need to use some other
specified encoding, but I suspect that is quite rare.  I believe for this
code-set, which will target being compatible with both Py2 and Py3 at the
same time, it is perfectly reasonable to assume 'utf8'.  Indeed, this is
the same approach the aforementioned py3c project took [2].

So I believe (and it seems the py3c developer agrees) that making the
assumption of utf8 encoding is reasonable, especially given that the
current py2 is likely making that same same assumption implicitly (though
ASCII/latin-1 might be closer to the truth in many py2 scenarios).  There
is actually a Py2 unicode object that could be an option to convert to, but
the level of changes trying to move over to that for Py2 coupled with the
pretty large differences in the necessary unicode conversions, would mean a
lot more code churn, for likely little gain.

The questions to decide on are:
1. Are you generally comfortable with the build changes necessary to
(optionally) build Py2 and Py3 bindings at the same time?
2. Do you want to pick up 'py3c' as a new dep, or implement the handful of
necessary wrappers?
3. Is the assumption of utf8 encoding sufficiently reasonable?

My general plan of attack will be:
1. Replace deprecated Py2 swig functions and syntax with Py2/Py3
cross-compatible versions or wrappers.
2. Ensure existing full Py2 support works correctly.
3. Update build to build both, Py2 and Py3, and get Py3 working.
4. Look at the remainder of the python usage to ensure Py3 compliance.

Cheers,
>
> Daniel
>

That's all I have for now,
Troy Curtis, Jr

[1]:  https://github.com/encukou/py3c
[2]: https://github.com/encukou/py3c/blob/master/include/py3c/compat.h
diff a/subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.h b/subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.h
--- a/subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.h
+++ b/subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.h
@@ -50,6 +50,27 @@ extern "C" {
 apr_status_t svn_swig_py_initialize(void);
 
 
+/*** Python 2/3 Compatibility Functions ***/
+
+/* Checks whether the provided python object is a File instance.
+ * Returns 1 on success, and 0 on failure.
+ */
+int svn_swig_py_file_check(PyObject *pyfile);
+
+/* Returns the provided python object as a FILE *object.
+ * Return the underlying FILE or NULL if the object is not a File instance.
+ */
+FILE *svn_swig_py_as_file(PyObject *pyfile);
+
+int svn_swig_py_string_check(PyObject *pystr);
+
+PyObject *svn_swig_py_string_from_string(const char *str, Py_ssize_t len);
+
+char *svn_swig_py_as_string(PyObject *pystr);
+
+int svn_swig_py_as_string_and_size(PyObject *pystr, char **ret_str, Py_ssize_t *ret_len);
+
+
 
 /* Functions to manage python's global interpreter lock */
 void svn_swig_py_release_py_lock(void);
diff a/subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c b/subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
--- a/subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
+++ b/subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
@@ -60,6 +60,11 @@ typedef int Py_ssize_t;
 # define PY_SSIZE_T_MIN INT_MIN
 #endif
 
+/* Need a reference to the PyIOBase_Type for Python 3 */
+#if PY_VERSION_HEX >= 0x03000000
+extern PyTypeObject PyIOBase_Type;
+#endif
+
 
 /*** Manage the Global Interpreter Lock ***/
 
@@ -114,6 +119,82 @@ void svn_swig_py_acquire_py_lock(void)
 #endif
 }
 
+
+/*** Python 2/3 Compatibility Functions ***/
+int svn_swig_py_file_check(PyObject *pyfile)
+{
+#if PY_VERSION_HEX >= 0x03000000
+  return PyObject_IsInstance(pyfile, (PyObject *)&PyIOBase_Type);
+#else
+  return PyFile_Check(pyfile);
+#endif
+}
+
+FILE *svn_swig_py_as_file(PyObject *pyfile)
+{
+#if PY_VERSION_HEX >= 0x03000000
+  if (!svn_swig_py_file_check(pyfile))
+  {
+    return NULL;
+  }
+  else
+  {
+    int fd = PyObject_AsFileDescriptor(pyfile);
+    PyObject *mode_obj = PyObject_GetAttrString(pyfile, "mode");
+    PyObject *mode_byte_obj = PyUnicode_AsUTF8String(mode_obj);
+    char *mode = PyBytes_AsString(mode_byte_obj);
+    FILE *file = fdopen(fd, mode);
+
+    Py_DECREF(mode_obj);
+    Py_DECREF(mode_byte_obj);
+  }
+#else
+  return PyFile_AsFile(pyfile);
+#endif
+}
+
+int svn_swig_py_string_check(PyObject *pystr)
+{
+#if PY_VERSION_HEX >= 0x03000000
+  return PyUnicode_Check(pystr);
+#else
+  return PyString_Check(pystr);
+#endif
+}
+
+PyObject *svn_swig_py_string_from_string(const char *str, Py_ssize_t len)
+{
+#if PY_VERSION_HEX >= 0x03000000
+  if (len <= 0)
+    return PyUnicode_FromString(str);
+  else
+    return PyUnicode_FromStringAndSize(str, len);
+#else
+  if (len <= 0)
+    return PyString_FromString(str);
+  else
+    return PyString_FromStringAndSize(str, len);
+#endif
+}
+
+char *svn_swig_py_as_string(PyObject *pystr)
+{
+#if PY_VERSION_HEX >= 0x03000000
+  return PyUnicode_AsUTF8(pystr);
+#else
+  return PyString_AsString(pystr);
+#endif
+}
+
+int svn_swig_py_as_string_and_size(PyObject *pystr, char **ret_str, Py_ssize_t *ret_len)
+{
+#if PY_VERSION_HEX >= 0x03000000
+  *ret_str = PyUnicode_AsUTF8AndSize(pystr, ret_len);
+  return (*ret_str == NULL) ? -1 : 0;
+#else
+  return PyString_AsStringAndSize(pystr, ret_str, ret_len);
+#endif
+}
 
 
 /*** Automatic Pool Management Functions ***/
@@ -370,14 +451,14 @@ void svn_swig_py_svn_exception(svn_error_t *error_chain)
           Py_INCREF(Py_None);
           message_ob = Py_None;
         }
-      else if ((message_ob = PyString_FromString(err->message)) == NULL)
+      else if ((message_ob = svn_swig_py_string_from_string(err->message,-1)) == NULL)
         goto finished;
       if (err->file == NULL)
         {
           Py_INCREF(Py_None);
           file_ob = Py_None;
         }
-      else if ((file_ob = PyString_FromString(err->file)) == NULL)
+      else if ((file_ob = svn_swig_py_string_from_string(err->file, -1)) == NULL)
         goto finished;
       if ((line_ob = PyInt_FromLong(err->line)) == NULL)
         goto finished;
@@ -476,23 +557,23 @@ static char *make_string_from_ob(PyObject *ob, apr_pool_t *pool)
 {
   if (ob == Py_None)
     return NULL;
-  if (! PyString_Check(ob))
+  if (! svn_swig_py_string_check(ob))
     {
       PyErr_SetString(PyExc_TypeError, "not a string");
       return NULL;
     }
-  return apr_pstrdup(pool, PyString_AS_STRING(ob));
+  return apr_pstrdup(pool, svn_swig_py_as_string(ob));
 }
 static svn_string_t *make_svn_string_from_ob(PyObject *ob, apr_pool_t *pool)
 {
   if (ob == Py_None)
     return NULL;
-  if (! PyString_Check(ob))
+  if (! svn_swig_py_string_check(ob))
     {
       PyErr_SetString(PyExc_TypeError, "not a string");
       return NULL;
     }
-  return svn_string_create(PyString_AS_STRING(ob), pool);
+  return svn_string_create(svn_swig_py_as_string(ob), pool);
 }
 
 
@@ -553,7 +634,7 @@ static PyObject *convert_svn_string_t(void *value, void *ctx,
   const svn_string_t *s = value;
 
   /* ### gotta cast this thing cuz Python doesn't use "const" */
-  return PyString_FromStringAndSize((void *)s->data, s->len);
+  return svn_swig_py_string_from_string((void *)s->data, s->len);
 }
 
 /* Convert a C string into a Python String object (or a reference to
@@ -566,7 +647,7 @@ static PyObject *cstring_to_pystring(const char *cstring)
       Py_INCREF(Py_None);
       return retval;
     }
-  return PyString_FromString(cstring);
+  return svn_swig_py_string_from_string(cstring, -1);
 }
 
 static PyObject *convert_svn_client_commit_item3_t(void *value, void *ctx)
@@ -643,7 +724,7 @@ static PyObject *convert_string(void *value, void *ctx,
                                 PyObject *py_pool)
 {
   /* ### gotta cast this thing cuz Python doesn't use "const" */
-  return PyString_FromString((const char *)value);
+  return svn_swig_py_string_from_string((const char *)value, -1);
 }
 
 PyObject *svn_swig_py_stringhash_to_dict(apr_hash_t *hash)
@@ -725,7 +806,7 @@ svn_swig_py_propinheriteditemarray_to_dict(const apr_array_header_t *array)
         apr_hash_t *prop_hash = prop_inherited_item->prop_hash;
         PyObject *py_key, *py_value;
 
-        py_key = PyString_FromString(prop_inherited_item->path_or_url);
+        py_key = svn_swig_py_string_from_string(prop_inherited_item->path_or_url, -1);
         if (py_key == NULL)
           goto error;
 
@@ -769,7 +850,7 @@ PyObject *svn_swig_py_proparray_to_dict(const apr_array_header_t *array)
 
         prop = APR_ARRAY_IDX(array, i, svn_prop_t);
 
-        py_key = PyString_FromString(prop.name);
+        py_key = svn_swig_py_string_from_string(prop.name, -1);
         if (py_key == NULL)
           goto error;
 
@@ -780,7 +861,7 @@ PyObject *svn_swig_py_proparray_to_dict(const apr_array_header_t *array)
           }
         else
           {
-             py_value = PyString_FromStringAndSize((void *)prop.value->data,
+             py_value = svn_swig_py_string_from_string((void *)prop.value->data,
                                                    prop.value->len);
              if (py_value == NULL)
                {
@@ -831,7 +912,7 @@ PyObject *svn_swig_py_locationhash_to_dict(apr_hash_t *hash)
             Py_DECREF(dict);
             return NULL;
           }
-        value = PyString_FromString((char *)v);
+        value = svn_swig_py_string_from_string((char *)v, -1);
         if (value == NULL)
           {
             Py_DECREF(key);
@@ -895,7 +976,7 @@ PyObject *svn_swig_py_c_strings_to_list(char **strings)
 
     while ((s = *strings++) != NULL)
       {
-        PyObject *ob = PyString_FromString(s);
+        PyObject *ob = svn_swig_py_string_from_string(s, -1);
 
         if (ob == NULL)
             goto error;
@@ -1252,7 +1333,7 @@ svn_swig_py_unwrap_string(PyObject *source,
                           void *baton)
 {
     const char **ptr_dest = destination;
-    *ptr_dest = PyString_AsString(source);
+    *ptr_dest = svn_swig_py_as_string(source);
 
     if (*ptr_dest != NULL)
         return 0;
@@ -1371,7 +1452,7 @@ PyObject *svn_swig_py_array_to_list(const apr_array_header_t *array)
     for (i = 0; i < array->nelts; ++i)
       {
         PyObject *ob =
-          PyString_FromString(APR_ARRAY_IDX(array, i, const char *));
+        svn_swig_py_string_from_string(APR_ARRAY_IDX(array, i, const char *), -1);
         if (ob == NULL)
           goto error;
         PyList_SET_ITEM(list, i, ob);
@@ -1446,13 +1527,13 @@ static svn_error_t *exception_to_error(PyObject * exc)
 
     if ((message_ob = PyObject_GetAttrString(exc, "message")) == NULL)
         goto finished;
-    message = PyString_AsString(message_ob);
+    message = svn_swig_py_as_string(message_ob);
     if (PyErr_Occurred()) goto finished;
 
     if ((file_ob = PyObject_GetAttrString(exc, "file")) == NULL)
         goto finished;
     if (file_ob != Py_None)
-        file = PyString_AsString(file_ob);
+        file = svn_swig_py_as_string(file_ob);
     if (PyErr_Occurred()) goto finished;
 
     if ((line_ob = PyObject_GetAttrString(exc, "line")) == NULL)
@@ -2436,10 +2517,10 @@ apr_file_t *svn_swig_py_make_file(PyObject *py_file,
   if (py_file == NULL || py_file == Py_None)
     return NULL;
 
-  if (PyString_Check(py_file))
+  if (svn_swig_py_string_check(py_file))
     {
       /* input is a path -- just open an apr_file_t */
-      char* fname = PyString_AS_STRING(py_file);
+      char* fname = svn_swig_py_as_string(py_file);
       apr_err = apr_file_open(&apr_file, fname,
                               APR_CREATE | APR_READ | APR_WRITE,
                               APR_OS_DEFAULT, pool);
@@ -2452,13 +2533,14 @@ apr_file_t *svn_swig_py_make_file(PyObject *py_file,
           return NULL;
         }
     }
-  else if (PyFile_Check(py_file))
+  else if (svn_swig_py_file_check(py_file))
     {
       FILE *file;
       apr_os_file_t osfile;
 
       /* input is a file object -- convert to apr_file_t */
-      file = PyFile_AsFile(py_file);
+      file = svn_swig_py_as_file(py_file);
+
 #ifdef WIN32
       osfile = (apr_os_file_t)_get_osfhandle(_fileno(file));
 #else
@@ -2482,7 +2564,6 @@ read_handler_pyio(void *baton, char *buffer, apr_size_t *len)
 {
   PyObject *result;
   PyObject *py_io = baton;
-  apr_size_t bytes;
   svn_error_t *err = SVN_NO_ERROR;
 
   if (py_io == Py_None)
@@ -2499,18 +2580,24 @@ read_handler_pyio(void *baton, char *buffer, apr_size_t *len)
     {
       err = callback_exception_error();
     }
-  else if (PyString_Check(result))
+  else if (svn_swig_py_string_check(result))
     {
-      bytes = PyString_GET_SIZE(result);
-      if (bytes > *len)
+      char *resultChar;
+      Py_ssize_t resultLen;
+      if (svn_swig_py_as_string_and_size(result,
+                                        &resultChar, &resultLen) == -1)
+        {
+          err == callback_bad_return_error("Failed to get string");
+        }
+      else if (resultLen > *len)
         {
           err = callback_bad_return_error("Too many bytes");
         }
       else
         {
           /* Writeback, in case this was a short read, indicating EOF */
-          *len = bytes;
-          memcpy(buffer, PyString_AS_STRING(result), *len);
+          *len = resultLen;
+          memcpy(buffer, svn_swig_py_as_string(result), *len);
         }
     }
   else
@@ -2948,9 +3035,9 @@ svn_error_t *svn_swig_py_get_commit_log_func(const char **log_msg,
       *log_msg = NULL;
       err = SVN_NO_ERROR;
     }
-  else if (PyString_Check(result))
+  else if (svn_swig_py_string_check(result))
     {
-      *log_msg = apr_pstrdup(pool, PyString_AS_STRING(result));
+      *log_msg = apr_pstrdup(pool, svn_swig_py_as_string(result));
       Py_DECREF(result);
       err = SVN_NO_ERROR;
     }
@@ -3854,7 +3941,7 @@ ra_callbacks_get_wc_prop(void *baton,
     {
       char *buf;
       Py_ssize_t len;
-      if (PyString_AsStringAndSize(result, &buf, &len) == -1)
+      if (svn_swig_py_as_string_and_size(result, &buf, &len) == -1)
         {
           err = callback_exception_error();
         }
@@ -3897,7 +3984,7 @@ ra_callbacks_push_or_set_wc_prop(const char *callback,
       goto finished;
     }
 
-  if ((py_value = PyString_FromStringAndSize(value->data, value->len)) == NULL)
+  if ((py_value = svn_swig_py_string_from_string(value->data, value->len)) == NULL)
     {
       err = callback_exception_error();
       goto finished;
@@ -4096,7 +4183,9 @@ ra_callbacks_get_client_string(void *baton,
     }
   else if (result != Py_None)
     {
-      if ((*name = PyString_AsString(result)) == NULL)
+      /* FIXME: *name is returned outside this function, but decreffed before returning,
+         this is an internal string. */
+      if ((*name = svn_swig_py_as_string(result)) == NULL)
         {
           err = callback_exception_error();
         }
diff a/subversion/bindings/swig/core.i b/subversion/bindings/swig/core.i
--- a/subversion/bindings/swig/core.i
+++ b/subversion/bindings/swig/core.i
@@ -605,7 +605,7 @@
 */
 #ifdef SWIGPYTHON
 %typemap(in) FILE * {
-    $1 = PyFile_AsFile($input);
+    $1 = svn_swig_py_as_file($input);
     if ($1 == NULL) {
         PyErr_SetString(PyExc_ValueError, "Must pass in a valid file object");
         SWIG_fail;

Reply via email to