[ 
https://issues.apache.org/jira/browse/AVRO-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16717803#comment-16717803
 ] 

ASF subversion and git services commented on AVRO-1167:
-------------------------------------------------------

Commit 1f00b1a7a2f76921325ba073de3212a4a22524de in avro's branch 
refs/heads/branch-1.8 from John Gill
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=1f00b1a ]

AVRO-1167, AVRO-766: Fix c AVRO_LINK memory leaks (#217)

* AVRO-1167: Enhance avro_schema_copy for AVRO_LINK

- Add hash of named schemas found during copy
- Find saved named  schema for copy of AVRO_LINK

* AVRO-766: Correct memory leaks in AVRO_LINK copy

- Adds test cases for AVRO-766 & AVRO-1167
- Corrects reference counting for avro_schema_copy

* Enable TEST_AVRO_1167 in test_avro_766

This ensures that both fixes work together and that no valgrind errors are 
produced from a recrusive schema.


> Avro-C: avro_schema_copy() does not copy AVRO_LINKs properly.
> -------------------------------------------------------------
>
>                 Key: AVRO-1167
>                 URL: https://issues.apache.org/jira/browse/AVRO-1167
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: c
>    Affects Versions: 1.7.1
>         Environment: Ubuntu Linux 11.10
>            Reporter: Vivek Nadkarni
>            Priority: Major
>         Attachments: AVRO-1167-TEST.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When avro_schema_copy() encounters an AVRO_LINK from an old_schema to a 
> new_schema, it sets the target of the new_link to the target of the old_link 
> in the old_schema. Thus, the AVRO_LINK in the new_schema points to an element 
> in the old_schema. 
> While this is currently safe, since the reference count of the target in the 
> old_schema is incremented, we are not really making a "copy" of the schema.
> There is a "TODO" in the code, which says that we should make a 
> avro_schema_copy() of the target in old_schema instead of linking directly to 
> it. However, this solution of making a copy would result in a few problems:
> 1. Avro schemas are intended to be self-contained. That implies that 
> AVRO_LINKs are intended to be internal links inside a self-contained schema. 
> The code introduces unnecessary (and potentially disallowed) external 
> dependencies in an Avro schema. 
> 2. The purpose of copying a schema that we want to decouple the old_schema 
> from the new_schema. The two copies may have different owners, we may want to 
> deallocate old schema etc.
> 3. If the schema is recursive, then the code would enter an infinite 
> recursion loop.
> It appears to me that the "correct" solution would be to replicate the entire 
> structure of the current schema, including the internal links. This means 
> that if old_link_A points to old_target_B, then new_link_A should point to 
> new_target_B in the new schema. Note that there should only be one copy of 
> new_target_B in the new schema, even if there are multiple links pointing to 
> new_target_B - i.e. we should not make a new copy for each link.
> In order to implement this proper copying of links, we would need to keep a 
> lookup table of pairs of old and new schemas as they are being created, as 
> well as a list of all the AVRO_LINKs that are copied. Then as a post-copy 
> step, we would go and fix up all the AVRO_LINKs to point to the appropriate 
> targets. This is the way the schema is constructed in the first place in 
> avro_schema_from_json().
> An inefficient way to obtain the correct result from avro_schema_copy() would 
> be to perform an avro_schema_to_json() followed by an avro_schema_from_json().
> Note: I have not implemented a fix for this issue, but I am documenting this 
> issue in AVRO-JIRA because this issue needs to be resolved before AVRO-766 
> can be fixed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to