[ 
https://issues.apache.org/jira/browse/AVRO-210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779440#action_12779440
 ] 

Thiruvalluvan M. G. commented on AVRO-210:
------------------------------------------

Any scheme that relies only on reference counting will leak when there are 
circular references. One partial solution is to use Boost's shared_ptr and 
weak_ptr. There are two kinds of references between nodes in a schema: 
parent-to-child references and symbolic references. We can use shared_ptr for 
the references from parents to children and weak_ptr for symbolic references. 
The owning references then form no cycles, so there are no leaks.
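
A minimal sketch of the idea, to make the ownership rule concrete. It is not
the proposed patch: Node, makeLongList and liveAfterBuildAndDrop are
hypothetical names, and it uses std::shared_ptr / std::weak_ptr (C++11) as a
stand-in for the Boost classes, which follow the same ownership model. It
builds the LongList schema from the issue description and counts live nodes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a schema node; not Avro's actual Node class.
struct Node {
    static int live;                              // Node objects currently alive
    std::string name;
    std::vector<std::shared_ptr<Node>> children;  // parent -> child: owning
    std::weak_ptr<Node> symbolic;                 // symbolic reference: non-owning

    explicit Node(const std::string& n) : name(n) { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Builds: record LongList { long value; union { null, LongList } next; }
std::shared_ptr<Node> makeLongList() {
    auto rec = std::make_shared<Node>("LongList");
    rec->children.push_back(std::make_shared<Node>("long"));

    auto next = std::make_shared<Node>("union");
    next->children.push_back(std::make_shared<Node>("null"));

    auto link = std::make_shared<Node>("symbolic");
    link->symbolic = rec;  // weak: does not increase rec's use count
    next->children.push_back(link);

    rec->children.push_back(next);
    return rec;
}

// Builds the schema, drops the only owning pointer, and reports how many
// nodes are still alive afterwards.
int liveAfterBuildAndDrop() {
    {
        std::shared_ptr<Node> root = makeLongList();
        assert(Node::live == 5);        // LongList, long, union, null, symbolic
        assert(root.use_count() == 1);  // the back reference is not an owner
    }
    return Node::live;
}
```

Because the only reference back to the record is weak, releasing the root
releases every node. If the symbolic member were a shared_ptr instead, the
record and its union would keep each other alive, which is the leak described
in the issue.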

But this solution has a problem in multi-threaded situations. Suppose one 
thread holds an intermediate node n1 in a temporary (say, during a schema walk) 
and another thread deletes the "root" node. All nodes that are ancestors of n1 
will then be destroyed. But one of those destroyed nodes could be the target of 
a weak pointer held by one of the children of n1, and that weak pointer will 
become invalid. So the thread doing the schema walk will not see the whole 
schema.
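
The scenario can be reproduced in a single thread, which is enough to show the
effect. This is only a sketch under the same assumptions as the scheme above:
Node and linkSurvivesRootRelease are hypothetical names, and std::shared_ptr /
std::weak_ptr stand in for the Boost classes.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for a schema node; not Avro's actual Node class.
struct Node {
    std::vector<std::shared_ptr<Node>> children;  // owning
    std::weak_ptr<Node> symbolic;                 // non-owning
};

// Returns true if the symbolic reference below n1 can still be resolved
// after the root has been released.
bool linkSurvivesRootRelease() {
    auto root = std::make_shared<Node>();
    auto n1 = std::make_shared<Node>();
    auto leaf = std::make_shared<Node>();

    leaf->symbolic = root;         // symbolic reference back to an ancestor
    n1->children.push_back(leaf);
    root->children.push_back(n1);

    std::shared_ptr<Node> held = n1;  // the walker's temporary
    n1.reset();
    leaf.reset();

    root.reset();  // stands in for "another thread deletes the root"

    // n1 and its child are still alive through 'held', but the target of
    // the symbolic reference is gone, so lock() yields an empty pointer.
    std::shared_ptr<Node> target = held->children[0]->symbolic.lock();
    return static_cast<bool>(target);
}
```

A walker therefore has to call lock() on each symbolic reference and be
prepared for an empty result, rather than assume the target still exists.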

I suppose this will not be a big problem and we can live with it.

If there are no big objections to this approach, I'll submit a patch.

> Memory leak with recursive schemas when constructed by hand
> -----------------------------------------------------------
>
>                 Key: AVRO-210
>                 URL: https://issues.apache.org/jira/browse/AVRO-210
>             Project: Avro
>          Issue Type: Bug
>          Components: c++
>            Reporter: Thiruvalluvan M. G.
>
> A schema consists of a node or a bunch of nodes. These nodes are referred to 
> through intrusive pointers (NodePtr). Since intrusive pointers use 
> reference counts, recursive schemas, which result in cycles of intrusive 
> pointers, lead to a memory leak. The following code, when compiled and run, 
> causes the memory to grow steadily:
> {code:title=test.cc|borderStyle=solid}
> #include <unistd.h>
> #include "Schema.hh"
> int main(int argc, char** argv)
> {
>     const int count1 = 10;
>     const int count2 = 1000;
>     for (int i = 0; i < count1; i++) {
>         for (int j = 0; j < count2; j++) {
>             avro::RecordSchema rec("LongList");
>             rec.addField("value", avro::LongSchema());
>             avro::UnionSchema next;
>             next.addType(avro::NullSchema());
>             next.addType(rec);
>             rec.addField("next", next);
>             rec.addField("end", avro::BoolSchema());
>         }
>         sleep(1);
>     }
> }
> {code}
> The leak should not happen when we build the schema by parsing a JSON schema 
> file. This is because the current implementation does not use pointers for 
> symbolic links; it uses symbols, and a symbol table resolves the symbols at 
> runtime. Unfortunately, parsing a nested schema file currently generates an 
> error. I'll file a separate JIRA for that.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.