#29878: GEOSContextHandle leaks probably due to thread local object destructing order
-------------------------------+------------------------------------
     Reporter:  Yong Li        |                    Owner:  nobody
         Type:  Bug            |                   Status:  new
    Component:  GIS            |                  Version:  1.11
     Severity:  Normal         |               Resolution:
     Keywords:  GEOS leak GIS  |             Triage Stage:  Accepted
    Has patch:  0              |      Needs documentation:  0
  Needs tests:  0              |  Patch needs improvement:  0
Easy pickings:  0              |                    UI/UX:  0
-------------------------------+------------------------------------
Changes (by Tim Graham):

 * stage:  Unreviewed => Accepted


Old description:

> I have observed consistent GEOSContextHandle leaks when using Django
> Geometry features in temporary threads. And I can avoid the leaks by
> manually clearing all attributes of
> `django.contrib.gis.geos.prototypes.io.thread_context`. My theory is that
> destructors of attributes in `io.thread_context` call some GEOSFunc
> objects, and that can create new GEOSContextHandle while Python is
> clearing thread local storage.
>
> 1. threadsafe.thread_context.handle is cleared
> 2. io.thread_context attributes are cleared
> 3. io.thread_context attributes are destructed, and then created new
> threadsafe.thread_context.handle.
>
> When I am trying a minimized sample, I also found that it got double free
> or corruption error very often if I do GC in main thread right after the
> thread using GEOS is joined. So this may be another big issue
>
> BTW, I am using Python 2.7.12 and django 1.11.  But after checking the
> latest Django code, I think the issue is still there.
>
> My sample code is:
>
> {{{
> #!div style="font-size: 80%"
> Code highlighting:
>   {{{#!python
> #!/usr/bin/env python
> import gc
> import threading
> from django.contrib.gis.geos import GEOSGeometry
>
> _old_objs = None
> _new_objs = None
> _first_time = True
>

> def gc_objects():
>     gc.collect()
>     objs_counts = {}
>     for obj in gc.get_objects():
>         key = str(type(obj))
>         if key in objs_counts:
>             objs_counts[key] += 1
>         else:
>             objs_counts[key] = 1
>     return objs_counts
>

> def dump_memory_leaks():
>     global _old_objs
>     global _new_objs
>     global _first_time
>     if _first_time:
>         _old_objs = gc_objects()
>         _first_time = False
>     else:
>         _new_objs = gc_objects()
>         leaked = {}
>         for k, v in _new_objs.iteritems():
>             old_v = _old_objs.get(k)
>             if old_v:
>                 diff = _new_objs[k] - old_v
>                 if diff > 0:
>                     leaked[str(k)] = diff
>             else:
>                 leaked[str(k)] = v
>
>         print "Leaks: {}".format(leaked)
>
>         _new_objs = None
>

> def use_geos():
>     GEOSGeometry('POINT(5 23)')
>     # These lines can get rid of the GEOSContextHandle leak
>     # from django.contrib.gis.geos.prototypes.io import thread_context as io_thread_context
>     # io_thread_context.__dict__.clear()
>     dump_memory_leaks()
>

> if __name__ == '__main__':
>     for i in xrange(10):
>         t = threading.Thread(target=use_geos)
>         t.start()
>         t.join()
>         # If I do GC here, it will crash with "double free or corruption" at random `i`
>         # dump_memory_leaks()
>

>   }}}
> }}}
>
> Output:
>
> {{{
> Leaks: {"<type 'dict'>": 2, "<class
> 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 1,
> "<type 'weakref'>": 1, "<class
> 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 1, "<type
> 'frame'>": 1}
> Leaks: {"<type 'weakref'>": 2, "<class
> 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 2,
> "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 2,
> "<type 'dict'>": 4}
> Leaks: {"<type 'weakref'>": 3, "<class
> 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 3,
> "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 3,
> "<type 'dict'>": 6}
> Leaks: {"<type 'weakref'>": 4, "<class
> 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 4,
> "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 4,
> "<type 'dict'>": 8}
> Leaks: {"<type 'frame'>": 1, "<type 'weakref'>": 5, "<type
> 'instancemethod'>": 1, "<class
> 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 5, "<class
> 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 5,
> "<type 'dict'>": 10}
> Leaks: {"<type 'weakref'>": 6, "<class
> 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 6,
> "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 6,
> "<type 'dict'>": 12}
> Leaks: {"<type 'weakref'>": 7, "<class
> 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 7,
> "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 7,
> "<type 'dict'>": 14}
> Leaks: {"<type 'weakref'>": 8, "<class
> 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 8,
> "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 8,
> "<type 'dict'>": 16}
> Leaks: {"<type 'weakref'>": 9, "<class
> 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 9,
> "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 9,
> "<type 'dict'>": 18}
>
> }}}

New description:

 I have observed consistent GEOSContextHandle leaks when using Django
 geometry features in short-lived threads. I can avoid the leaks by
 manually clearing all attributes of
 `django.contrib.gis.geos.prototypes.io.thread_context`. My theory is that
 the destructors of the attributes in `io.thread_context` call GEOSFunc
 objects, which can create a new GEOSContextHandle while Python is
 clearing thread-local storage.

 1. `threadsafe.thread_context.handle` is cleared.
 2. The `io.thread_context` attributes are cleared.
 3. The `io.thread_context` attributes are destructed, which creates a new
 `threadsafe.thread_context.handle`.
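
 The suspected ordering can be modelled without Django or GEOS. The sketch
 below is only an illustration of the theory, not the actual GeoDjango
 code: `FakeHandle`, `FakeWriter`, `get_handle` and `simulate` are
 hypothetical names, two plain dicts stand in for the two thread-local
 storages, and the destructor timing relies on CPython reference counting.

```python
# Pure-Python model of the suspected teardown order. All names here are
# hypothetical stand-ins; nothing is imported from Django or GEOS.


class FakeHandle:
    """Stands in for GEOSContextHandle."""


def get_handle(handle_store):
    """Lazily create the per-thread handle, as a GEOS call would need one."""
    if "handle" not in handle_store:
        handle_store["handle"] = FakeHandle()
    return handle_store["handle"]


class FakeWriter:
    """Stands in for an object cached in io.thread_context."""

    def __init__(self, handle_store):
        self.handle_store = handle_store
        get_handle(handle_store)  # creating the object needs a handle

    def __del__(self):
        # The destructor calls back into "GEOS", which needs a handle again.
        get_handle(self.handle_store)


def simulate(clear_io_first):
    """Return how many handles are left behind after both stores are cleared."""
    handle_store = {}  # models threadsafe.thread_context
    io_store = {}      # models io.thread_context
    io_store["writer"] = FakeWriter(handle_store)

    if clear_io_first:
        io_store.clear()      # destructor runs while the handle still exists
        handle_store.clear()  # the handle is released last
    else:
        handle_store.clear()  # step 1: the handle is cleared
        io_store.clear()      # steps 2-3: the destructor recreates a handle
    return len(handle_store)


if __name__ == "__main__":
    print("handle cleared first:", simulate(clear_io_first=False))
    print("io cleared first:   ", simulate(clear_io_first=True))
```

 In this model a handle is left behind only when the handle store is
 cleared before the io store, which matches the observation that clearing
 `io.thread_context` manually avoids the leak.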

 While minimizing the sample, I also found that it very often fails with a
 "double free or corruption" error if I run GC in the main thread right
 after the thread using GEOS is joined. So this may be another serious
 issue.

 BTW, I am using Python 2.7.12 and Django 1.11. After checking the latest
 Django code, I think the issue is still there.

 My sample code is:

 {{{
 #!div style="font-size: 80%"
 Code highlighting:
   {{{#!python
 #!/usr/bin/env python
 import gc
 import threading
 from django.contrib.gis.geos import GEOSGeometry

 _old_objs = None
 _new_objs = None
 _first_time = True


 def gc_objects():
     gc.collect()
     objs_counts = {}
     for obj in gc.get_objects():
         key = str(type(obj))
         if key in objs_counts:
             objs_counts[key] += 1
         else:
             objs_counts[key] = 1
     return objs_counts


 def dump_memory_leaks():
     global _old_objs
     global _new_objs
     global _first_time
     if _first_time:
         _old_objs = gc_objects()
         _first_time = False
     else:
         _new_objs = gc_objects()
         leaked = {}
         for k, v in _new_objs.items():
             old_v = _old_objs.get(k)
             if old_v:
                 diff = _new_objs[k] - old_v
                 if diff > 0:
                     leaked[str(k)] = diff
             else:
                 leaked[str(k)] = v

         print("Leaks: {}".format(leaked))

         _new_objs = None


 def use_geos():
     GEOSGeometry('POINT(5 23)')
     # These lines can get rid of the GEOSContextHandle leak
     # from django.contrib.gis.geos.prototypes.io import thread_context as io_thread_context
     # io_thread_context.__dict__.clear()
     dump_memory_leaks()


 if __name__ == '__main__':
     for i in range(10):
         t = threading.Thread(target=use_geos)
         t.start()
         t.join()
         # If I do GC here, it will crash with "double free or corruption" at random `i`
         # dump_memory_leaks()


   }}}
 }}}
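
 The commented-out workaround in `use_geos()` could be factored into a
 small helper so that a thread target always clears its thread-local state
 before returning. This is a minimal sketch, assuming a hypothetical
 decorator named `clearing_locals` (it is not part of Django);
 `demo_context` is a stand-in `threading.local()` so the example runs
 without GEOS. For the case in this ticket one would presumably pass
 `django.contrib.gis.geos.prototypes.io.thread_context` instead.

```python
import functools
import threading


def clearing_locals(*locals_to_clear):
    """Hypothetical helper: clear the given thread-locals when func returns."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            finally:
                # This runs in the calling thread, so only that thread's
                # attributes are dropped.
                for local in locals_to_clear:
                    local.__dict__.clear()

        return wrapper

    return decorator


# Stand-in for io.thread_context so the sketch runs without Django.
demo_context = threading.local()
seen = []  # snapshots of the thread-local state taken inside the worker


@clearing_locals(demo_context)
def worker():
    demo_context.value = "cached per-thread object"
    seen.append(dict(demo_context.__dict__))


if __name__ == "__main__":
    t = threading.Thread(target=worker)
    t.start()
    t.join()
    print(seen)
```

 Clearing inside the worker, before the thread exits, means the cached
 objects are destructed while the thread's other state is still intact,
 which is the same effect as the manual `__dict__.clear()` call above.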

 Output:

 {{{
 Leaks: {"<type 'dict'>": 2, "<class
 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 1,
 "<type 'weakref'>": 1, "<class
 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 1, "<type
 'frame'>": 1}
 Leaks: {"<type 'weakref'>": 2, "<class
 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 2,
 "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 2,
 "<type 'dict'>": 4}
 Leaks: {"<type 'weakref'>": 3, "<class
 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 3,
 "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 3,
 "<type 'dict'>": 6}
 Leaks: {"<type 'weakref'>": 4, "<class
 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 4,
 "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 4,
 "<type 'dict'>": 8}
 Leaks: {"<type 'frame'>": 1, "<type 'weakref'>": 5, "<type
 'instancemethod'>": 1, "<class
 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 5, "<class
 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 5,
 "<type 'dict'>": 10}
 Leaks: {"<type 'weakref'>": 6, "<class
 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 6,
 "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 6,
 "<type 'dict'>": 12}
 Leaks: {"<type 'weakref'>": 7, "<class
 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 7,
 "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 7,
 "<type 'dict'>": 14}
 Leaks: {"<type 'weakref'>": 8, "<class
 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 8,
 "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 8,
 "<type 'dict'>": 16}
 Leaks: {"<type 'weakref'>": 9, "<class
 'django.contrib.gis.geos.prototypes.threadsafe.GEOSContextHandle'>": 9,
 "<class 'django.contrib.gis.geos.libgeos.LP_GEOSContextHandle_t'>": 9,
 "<type 'dict'>": 18}

 }}}

--

Comment:

 Tentatively accepting. I'm not a GeoDjango expert. I updated the ticket
 description to make the script compatible with Python 3 as Python 2 is no
 longer supported as of Django 2.0.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/29878#comment:6>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
