[jira] [Commented] (AVRO-1173) C++ API for dynamic reading/writing based on schema

2012-09-26 Thread Vivek Nadkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463977#comment-13463977
 ] 

Vivek Nadkarni commented on AVRO-1173:
--

It would be great if you could create a JIRA entry for the issues you fixed 
with the Windows build, and upload a patch file with all your changes in it. 
These changes would be merged by one of the committers (usually Douglas Creager 
for Avro-C) before the next release. 

Also, does anyone on the list know how the automatic build servers work for 
Avro? Would it be possible to add a Windows build for Avro-C, so we could catch 
issues more quickly when things break?


 C++ API for dynamic reading/writing based on schema
 ---

 Key: AVRO-1173
 URL: https://issues.apache.org/jira/browse/AVRO-1173
 Project: Avro
  Issue Type: Wish
  Components: c++
Reporter: Stefan Langer

 When I started looking at Avro I hoped it would offer some API to read values 
 by name/id (or at least get name/id of datum while iterating over all 
 entries).
 When looking at examples for C: 
 http://avro.apache.org/docs/1.6.3/api/c/index.html#_examples
 ... or some Java examples
 There are getters/setters which have name-arguments, and there are 
 Record-objects constructed from schema which help reading/writing data.
 While testing the C++ API, I couldn't find a way to do so with it!
 I'm still not sure if I'm missing some part of the API or if it is just not 
 yet part of the C++ Interface.
 About C API: I could not use it, because it is C99 focused, so it can't be 
 compiled on our VS2008 ... For the C++ API it's just some tiny tweaks to get 
 it running.
 About Generator: I'm not interested in generating code (if I were, there 
 are enough alternatives to Avro ...)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (AVRO-1173) C++ API for dynamic reading/writing based on schema

2012-09-26 Thread Vivek Nadkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13464324#comment-13464324
 ] 

Vivek Nadkarni commented on AVRO-1173:
--

Thanks for the information about the buildbot.



[jira] [Updated] (AVRO-1164) Avro-C: Deallocate resources used in test_avro_schema.c

2012-09-14 Thread Vivek Nadkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Nadkarni updated AVRO-1164:
-

Attachment: AVRO-1164.patch

Patch to fix AVRO-1164.

 Avro-C: Deallocate resources used in test_avro_schema.c
 ---

 Key: AVRO-1164
 URL: https://issues.apache.org/jira/browse/AVRO-1164
 Project: Avro
  Issue Type: Bug
  Components: c
Affects Versions: 1.7.1
 Environment: Ubuntu Linux 11.04
Reporter: Vivek Nadkarni
Priority: Trivial
 Fix For: 1.7.2

 Attachments: AVRO-1164.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 There are a few resources that are allocated in test_avro_schema.c that are 
 not deallocated. This minor fix deallocates these resources. Specifically - a 
 file pointer, a directory entry and a schema that were allocated are now also 
 deallocated.
 These resource leaks can be seen using valgrind. However, fixing these 
 issues doesn't resolve all valgrind issues, since there are other bugs that 
 need to be resolved (e.g. AVRO-766) before all valgrind issues can be 
 resolved.



[jira] [Updated] (AVRO-1164) Avro-C: Deallocate resources used in test_avro_schema.c

2012-09-14 Thread Vivek Nadkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Nadkarni updated AVRO-1164:
-

Status: Patch Available  (was: Open)

Submit patch for review.



[jira] [Created] (AVRO-1165) Avro-C: Memory leak in value iface containing AVRO_LINK.

2012-09-14 Thread Vivek Nadkarni (JIRA)
Vivek Nadkarni created AVRO-1165:


 Summary: Avro-C: Memory leak in value iface containing AVRO_LINK.
 Key: AVRO-1165
 URL: https://issues.apache.org/jira/browse/AVRO-1165
 Project: Avro
  Issue Type: Bug
  Components: c
Affects Versions: 1.7.1
 Environment: Ubuntu Linux 11.10
Reporter: Vivek Nadkarni
 Fix For: 1.7.2


A memory leak can be seen when the following matched pair of commands is 
called, using a schema containing an AVRO_LINK. This pair of commands 
constructs a class (value iface) from a schema and then destroys the 
constructed class.

record_class = avro_generic_class_from_schema( schema );
avro_value_iface_decref( record_class );

If schema contains an AVRO_LINK, then avro_generic_class_from_schema() calls 
avro_generic_link_class(), which calls avro_schema_incref() on the AVRO_LINK 
target schema and assigns the target schema pointer to iface->schema.

When we subsequently call avro_value_iface_decref() to deallocate the class, 
this function calls avro_generic_link_decref_iface(), which frees the memory 
for the link interface without calling avro_schema_decref() on the target 
schema pointed to by iface->schema.

Thus the memory of the target schema is leaked when we create and destroy a 
value interface for an AVRO_LINK. 

Calling avro_schema_decref() on the target schema (iface->schema) before 
calling avro_freet() on the iface fixes this memory leak.

Note: The pair of commands shown above results in a memory leak, when we create 
and destroy a value interface from *any* schema containing an AVRO_LINK, 
regardless of whether it is recursive or not. There is a separate issue 
regarding memory leaks with recursive schemas described in AVRO-766. The fix 
for this issue can only be tested with non-recursive schemas containing 
AVRO_LINKs until AVRO-766 is fixed.
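
The leak described above can be reproduced in miniature without the Avro 
library. The following is only a sketch: toy_schema, toy_link_iface and every 
toy_* function are hypothetical stand-ins, not the Avro-C types or API. The 
release_target flag exists only to contrast the leaking path with the fixed 
path.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted schema. */
typedef struct toy_schema {
    int refcount;
} toy_schema;

/* Hypothetical stand-in for the link value interface. */
typedef struct toy_link_iface {
    int refcount;
    toy_schema *schema;   /* target schema, held by reference */
} toy_link_iface;

static int live_schemas = 0;  /* schemas allocated and not yet freed */

toy_schema *toy_schema_new(void) {
    toy_schema *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->refcount = 1;
    live_schemas++;
    return s;
}

toy_schema *toy_schema_incref(toy_schema *s) {
    s->refcount++;
    return s;
}

void toy_schema_decref(toy_schema *s) {
    assert(s->refcount > 0);
    if (--s->refcount == 0) {
        free(s);
        live_schemas--;
    }
}

/* Creating the link interface takes a reference on the target schema. */
toy_link_iface *toy_link_class(toy_schema *target) {
    toy_link_iface *iface = malloc(sizeof *iface);
    if (iface == NULL) return NULL;
    iface->refcount = 1;
    iface->schema = toy_schema_incref(target);
    return iface;
}

/* release_target == 0 mimics the leak; nonzero mimics the proposed fix. */
void toy_link_decref_iface(toy_link_iface *iface, int release_target) {
    if (--iface->refcount == 0) {
        if (release_target) {
            toy_schema_decref(iface->schema);  /* the missing decrement */
        }
        free(iface);
    }
}

int toy_live_schemas(void) { return live_schemas; }
```

With release_target set to 0, the target keeps one extra reference after the 
interface is destroyed, so the owner's final decref never frees it.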




[jira] [Updated] (AVRO-1165) Avro-C: Memory leak in value iface containing AVRO_LINK.

2012-09-14 Thread Vivek Nadkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Nadkarni updated AVRO-1165:
-

Attachment: AVRO-1165-FIX.patch

This fix (AVRO-1165-FIX.patch) decrements the reference count of the target 
schema of an AVRO_LINK before deallocating the AVRO_LINK value interface.

Before the fix is applied, valgrind shows the following output:

==6697== 1,410 (24 direct, 1,386 indirect) bytes in 1 blocks are definitely lost in loss record 12 of 12
==6697==    at 0x4C28F9F: malloc (vg_replace_malloc.c:236)
==6697==    by 0x4C29019: realloc (vg_replace_malloc.c:525)
==6697==    by 0x413494: avro_default_allocator (allocation.c:36)
==6697==    by 0x40AE84: avro_schema_link (schema.c:663)
==6697==    by 0x40B467: avro_schema_from_json_t (schema.c:786)
==6697==    by 0x40B80A: avro_schema_from_json_t (schema.c:888)
==6697==    by 0x40BE97: avro_schema_from_json_root (schema.c:1083)
==6697==    by 0x40BFA2: avro_schema_from_json (schema.c:1108)
==6697==    by 0x404A1F: main (test_avro_1165.c:56)
==6697== 
==6697== LEAK SUMMARY:
==6697==    definitely lost: 48 bytes in 2 blocks
==6697==    indirectly lost: 1,386 bytes in 14 blocks
==6697==      possibly lost: 0 bytes in 0 blocks
==6697==    still reachable: 0 bytes in 0 blocks
==6697==         suppressed: 0 bytes in 0 blocks


After the fix is applied, valgrind shows the following output:

==7247== HEAP SUMMARY:
==7247==     in use at exit: 0 bytes in 0 blocks
==7247==   total heap usage: 180 allocs, 180 frees, 9,563 bytes allocated
==7247== 
==7247== All heap blocks were freed -- no leaks are possible
==7247== 
==7247== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)


 Avro-C: Memory leak in value iface containing AVRO_LINK.
 

 Key: AVRO-1165
 URL: https://issues.apache.org/jira/browse/AVRO-1165
 Project: Avro
  Issue Type: Bug
  Components: c
Affects Versions: 1.7.1
 Environment: Ubuntu Linux 11.10
Reporter: Vivek Nadkarni
 Fix For: 1.7.2

 Attachments: AVRO-1165-FIX.patch, AVRO-1165-TEST.patch

   Original Estimate: 24h
  Remaining Estimate: 24h



[jira] [Created] (AVRO-1167) Avro-C: avro_schema_copy() does not copy AVRO_LINKs properly.

2012-09-14 Thread Vivek Nadkarni (JIRA)
Vivek Nadkarni created AVRO-1167:


 Summary: Avro-C: avro_schema_copy() does not copy AVRO_LINKs 
properly.
 Key: AVRO-1167
 URL: https://issues.apache.org/jira/browse/AVRO-1167
 Project: Avro
  Issue Type: Bug
  Components: c
Affects Versions: 1.7.1
 Environment: Ubuntu Linux 11.10
Reporter: Vivek Nadkarni


When avro_schema_copy() encounters an AVRO_LINK from an old_schema to a 
new_schema, it sets the target of the new_link to the target of the old_link in 
the old_schema. Thus, the AVRO_LINK in the new_schema points to an element in 
the old_schema. 

While this is currently safe, since the reference count of the target in the 
old_schema is incremented, we are not really making a copy of the schema.

There is a TODO in the code, which says that we should make an 
avro_schema_copy() of the target in old_schema instead of linking directly to 
it. However, this solution of making a copy would result in a few problems:

1. Avro schemas are intended to be self-contained. That implies that AVRO_LINKs 
are intended to be internal links inside a self-contained schema. The code 
introduces unnecessary (and potentially disallowed) external dependencies in an 
Avro schema. 

2. The purpose of copying a schema is to decouple the old_schema from 
the new_schema. The two copies may have different owners, we may want to 
deallocate the old schema, etc.

3. If the schema is recursive, then the code would enter an infinite recursion 
loop.

It appears to me that the correct solution would be to replicate the entire 
structure of the current schema, including the internal links. This means that 
if old_link_A points to old_target_B, then new_link_A should point to 
new_target_B in the new schema. Note that there should only be one copy of 
new_target_B in the new schema, even if there are multiple links pointing to 
new_target_B - i.e. we should not make a new copy for each link.

In order to implement this proper copying of links, we would need to keep a 
lookup table of pairs of old and new schemas as they are being created, as well 
as a list of all the AVRO_LINKs that are copied. Then as a post-copy step, we 
would go and fix up all the AVRO_LINKs to point to the appropriate targets. 
This is the way the schema is constructed in the first place in 
avro_schema_from_json().
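
The lookup-table approach described above can be sketched on a toy schema 
graph. This is a minimal illustration, not the Avro-C implementation: the 
toy_node type, the single-child record shape, the fixed-size table and all 
toy_* names are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

enum toy_kind { TOY_RECORD, TOY_LINK };

/* Hypothetical schema node: a record owns one child, a link points at a
 * node elsewhere in the same schema without owning it. */
typedef struct toy_node {
    enum toy_kind kind;
    struct toy_node *child;   /* TOY_RECORD only, may be NULL */
    struct toy_node *target;  /* TOY_LINK only, non-owning */
} toy_node;

#define TOY_MAX_NODES 64

/* Lookup table of (old node, new node) pairs built during the copy. */
typedef struct toy_copy_map {
    const toy_node *old_nodes[TOY_MAX_NODES];
    toy_node *new_nodes[TOY_MAX_NODES];
    size_t count;
} toy_copy_map;

static toy_node *toy_lookup(const toy_copy_map *map, const toy_node *old) {
    size_t i;
    for (i = 0; i < map->count; i++) {
        if (map->old_nodes[i] == old) return map->new_nodes[i];
    }
    return NULL;  /* target lies outside the copied schema */
}

/* Pass 1: replicate the structure and record each old/new pair.
 * Link targets are left unset here, so recursion cannot loop. */
static toy_node *toy_copy_pass1(const toy_node *old, toy_copy_map *map) {
    toy_node *copy;
    if (old == NULL || map->count == TOY_MAX_NODES) return NULL;
    copy = calloc(1, sizeof *copy);
    if (copy == NULL) return NULL;
    copy->kind = old->kind;
    map->old_nodes[map->count] = old;
    map->new_nodes[map->count] = copy;
    map->count++;
    if (old->kind == TOY_RECORD) {
        copy->child = toy_copy_pass1(old->child, map);
    }
    return copy;
}

toy_node *toy_schema_copy(const toy_node *old_root) {
    toy_copy_map map;
    toy_node *new_root;
    size_t i;
    map.count = 0;
    new_root = toy_copy_pass1(old_root, &map);
    /* Pass 2: point every copied link at the copy of its old target. */
    for (i = 0; i < map.count; i++) {
        if (map.old_nodes[i]->kind == TOY_LINK) {
            map.new_nodes[i]->target =
                toy_lookup(&map, map.old_nodes[i]->target);
        }
    }
    return new_root;
}

/* Frees the owned chain; link targets are never followed. */
void toy_schema_free(toy_node *node) {
    while (node != NULL) {
        toy_node *next = node->child;
        free(node);
        node = next;
    }
}
```

Because links are resolved only after every node has been copied, a recursive 
schema is copied once and each new link lands on the single copy of its 
target. Error handling for a failed allocation midway is left out of the 
sketch.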

An inefficient way to obtain the correct result from avro_schema_copy() would 
be to perform an avro_schema_to_json() followed by an avro_schema_from_json().

Note: I have not implemented a fix for this issue, but I am documenting this 
issue in AVRO-JIRA because this issue needs to be resolved before AVRO-766 can 
be fixed.




[jira] [Updated] (AVRO-766) C: Memory leak from reference count cycles

2012-09-14 Thread Vivek Nadkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Nadkarni updated AVRO-766:


Attachment: AVRO-766.patch

The current AVRO specification requires that AVRO_LINKs are internal to a 
schema. See comments in AVRO-530, which indicate that external links are not 
backwards compatible and may only be introduced in Avro 2.0.

Currently, when an AVRO_LINK is created in a schema, the reference count of the 
target schema is incremented. If the schema is recursive, then the reference 
count of the top-level schema is incremented -- and decrementing the reference 
count of the top-level schema doesn't deallocate the schema. Thus, a memory 
leak is formed.

Since all AVRO_LINKs have targets that are internal to the same top-level 
schema, we could decide that we would not increment the reference count of the 
targets of any links. The targets would be available as long as the top-level 
schema is available. But if the top-level schema is not available, then all the 
internal links would be destroyed too. Therefore, as long as the link itself is 
valid, the targets would also be valid. Using this internal structural 
knowledge of the schema gives us an implicit guarantee of link target 
validity, while breaking the reference count cycles for recursive schemas.

To implement this mechanism, we would need to ensure that no AVRO_LINKs are 
created with targets outside the top-level schema. While we cannot enforce this 
rule, we can document that external link targets would violate the spec, and 
could result in memory leaks. 
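
The effect of not counting link targets can be shown with a small 
self-contained sketch. Nothing here is Avro-C code: toy_schema, the 
link_is_counted flag and the toy_* functions are hypothetical, and the flag is 
there only to compare the current behaviour (counted links) with the proposed 
one (uncounted links).

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_schema {
    int refcount;
    struct toy_schema *child;  /* owned child: this node holds one reference */
    struct toy_schema *link;   /* link target inside the same top-level schema */
    int link_is_counted;       /* nonzero: the link also holds a reference */
} toy_schema;

static int toy_live = 0;  /* nodes allocated and not yet freed */

toy_schema *toy_new(void) {
    toy_schema *s = calloc(1, sizeof *s);
    if (s != NULL) {
        s->refcount = 1;
        toy_live++;
    }
    return s;
}

void toy_decref(toy_schema *s) {
    if (s == NULL) return;
    assert(s->refcount > 0);
    if (--s->refcount > 0) return;
    toy_decref(s->child);
    if (s->link_is_counted) toy_decref(s->link);
    free(s);
    toy_live--;
}

void toy_set_link(toy_schema *from, toy_schema *target, int counted) {
    from->link = target;
    from->link_is_counted = counted;
    if (counted) target->refcount++;
}

/* Builds root -> child, where child links back to root (a recursive schema).
 * The caller receives the only external reference to root. */
toy_schema *toy_build_recursive(int counted) {
    toy_schema *root = toy_new();
    toy_schema *child = toy_new();
    if (root == NULL || child == NULL) {
        toy_decref(root);
        toy_decref(child);
        return NULL;
    }
    root->child = child;  /* root takes over the creator's reference */
    toy_set_link(child, root, counted);
    return root;
}

int toy_live_count(void) { return toy_live; }
```

With uncounted links, releasing the top-level reference frees both nodes. With 
counted links, the back-reference keeps the root's count at one, so neither 
node is ever freed.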

Unfortunately, avro_schema_copy() currently implements a link to an external 
target - described in AVRO-1167. Therefore, AVRO-766 should not be fixed using 
the described mechanism until AVRO-1167 is also fixed.

This patch removes the increment and decrement of reference counts for link 
targets as described above. It also contains a test case test_avro_766.c 
(derived from ref-cycle.c), which shows the memory leak. It also contains a 
macro called TEST_AVRO_1167, that is currently enabled. If the test is 
disabled, you can see that this patch works.

With TEST_AVRO_1167 set to (0):

==21796== HEAP SUMMARY:
==21796==     in use at exit: 0 bytes in 0 blocks
==21796==   total heap usage: 129 allocs, 129 frees, 6,090 bytes allocated
==21796== 
==21796== All heap blocks were freed -- no leaks are possible
==21796== 
==21796== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)

With TEST_AVRO_1167 set to (1):

==21417== 4,240 (32 direct, 4,208 indirect) bytes in 1 blocks are definitely lost in loss record 30 of 30
==21417==    at 0x4C28F9F: malloc (vg_replace_malloc.c:236)
==21417==    by 0x4C29019: realloc (vg_replace_malloc.c:525)
==21417==    by 0x40D34C: avro_default_allocator (allocation.c:36)
==21417==    by 0x404647: avro_schema_union (schema.c:310)
==21417==    by 0x406886: avro_schema_copy (schema.c:1250)
==21417==    by 0x40670A: avro_schema_copy (schema.c:1183)
==21417==    by 0x403EEA: main (test_avro_766.c:64)
==21417== 
==21417== LEAK SUMMARY:
==21417==    definitely lost: 56 bytes in 2 blocks
==21417==    indirectly lost: 4,232 bytes in 44 blocks
==21417==      possibly lost: 0 bytes in 0 blocks
==21417==    still reachable: 0 bytes in 0 blocks
==21417==         suppressed: 0 bytes in 0 blocks
==21417== 
==21417== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 4 from 4)


I am posting this test with TEST_AVRO_1167 set to (1) because I don't know the 
implications of applying this patch before fixing AVRO-1167.



 C: Memory leak from reference count cycles
 --

 Key: AVRO-766
 URL: https://issues.apache.org/jira/browse/AVRO-766
 Project: Avro
  Issue Type: Bug
  Components: c
Affects Versions: 1.5.0
Reporter: Douglas Creager
 Attachments: AVRO-766.patch, ref-cycle.c


 If you parse a recursive Avro schema, you end up with a cycle in the 
 reference graph for the avro_schema_t objects that are created.  The 
 reference counting mechanism that we're using can't detect this, and so you 
 get a memory leak.



[jira] [Commented] (AVRO-1092) avro-c: improving thread safety in error management code

2012-05-17 Thread Vivek Nadkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278201#comment-13278201
 ] 

Vivek Nadkarni commented on AVRO-1092:
--

Thanks for the clarification. 

Cheers,
Vivek

 avro-c: improving thread safety in error management code
 

 Key: AVRO-1092
 URL: https://issues.apache.org/jira/browse/AVRO-1092
 Project: Avro
  Issue Type: Bug
  Components: c
Affects Versions: 1.6.3, 1.7.0
Reporter: Pugachev Maxim
Priority: Critical
 Attachments: AVRO-1092-patch-2.patch, AVRO-1092.patch


 Error management code isn't thread safe at all. I wrote a patch for this 
 issue, but it works only for *nix systems.
 Affected functions: avro_set_error(), avro_prefix_error()
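
One common way to make an error buffer safe across threads is to give each 
thread its own buffer. The sketch below illustrates that general idea only. It 
is not the technique used in AVRO-1092.patch, which this thread does not 
describe in detail. The toy_* names, the buffer size and the 
TOY_THREAD_LOCAL macro are all assumptions, and the sketch relies on 
compiler-specific thread-local storage keywords and on a C99 snprintf().

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Compiler-specific thread-local storage (an assumption of this sketch). */
#if defined(_MSC_VER)
#define TOY_THREAD_LOCAL __declspec(thread)
#else
#define TOY_THREAD_LOCAL __thread
#endif

#define TOY_ERROR_SIZE 256

/* Each thread gets its own zero-initialized copy of this buffer. */
static TOY_THREAD_LOCAL char toy_error_buf[TOY_ERROR_SIZE];

/* Replaces the calling thread's error message. */
void toy_set_error(const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);
    vsnprintf(toy_error_buf, sizeof toy_error_buf, fmt, args);
    va_end(args);
}

/* Prepends a prefix to the calling thread's error message. */
void toy_prefix_error(const char *prefix) {
    char saved[TOY_ERROR_SIZE];
    /* The buffer is always NUL-terminated, so copying it whole is safe. */
    memcpy(saved, toy_error_buf, sizeof saved);
    snprintf(toy_error_buf, sizeof toy_error_buf, "%s%s", prefix, saved);
}

/* Returns the calling thread's current error message. */
const char *toy_strerror(void) {
    return toy_error_buf;
}
```

Since no state is shared between threads, no locking is needed. Messages 
longer than the buffer are truncated.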





[jira] [Updated] (AVRO-1092) avro-c: improving thread safety in error management code

2012-05-16 Thread Vivek Nadkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Nadkarni updated AVRO-1092:
-

Attachment: AVRO-1092-patch-2.patch

I downloaded the patch to test it out in Windows, since you had called
out the potential Windows incompatibility. The code didn't compile
under Windows, but I was able to get the code to compile and pass
tests with a minor modification. 

I am attaching my minor modification as a patch that should be applied
on top of AVRO-1092.patch. This patch should be applied from within
the avro-trunk/lang/c directory.

My patch also updates the file README.maintaining_win32.txt, to
capture the inability of MSVC++ to support structure initialization
using element names.
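
The restriction mentioned above concerns C99 designated initializers. A 
minimal sketch of the portable alternative follows. The toy_error_state 
struct and its members are hypothetical and are not taken from either patch.

```c
#include <stddef.h>

/* Hypothetical struct used only to illustrate the two initializer styles. */
typedef struct toy_error_state {
    int code;
    const char *message;
} toy_error_state;

/* C99 form, which initializes members by name:
 *
 *   static const toy_error_state toy_initial_state =
 *       { .code = 0, .message = "" };
 *
 * Portable form used below: values are listed positionally, in the order
 * the members are declared in the struct. */
static const toy_error_state toy_initial_state = { 0, "" };

/* Returns a fresh copy of the initial state. */
toy_error_state toy_error_state_init(void) {
    return toy_initial_state;
}
```

The positional form compiles everywhere, at the cost of silently breaking if 
the member order of the struct changes.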







[jira] [Commented] (AVRO-1092) avro-c: improving thread safety in error management code

2012-05-16 Thread Vivek Nadkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13277137#comment-13277137
 ] 

Vivek Nadkarni commented on AVRO-1092:
--

A few questions regarding the patch, specifically the CMakeLists.txt
file. I am by no means a CMake expert, so apologies if the answers
should be obvious :-).

1. Are THREAD_LIBRARIES and THREADSAFE cmake intrinsic definitions or
   are they your definitions?

2. I didn't see why the definition _REENTRANT was set. It isn't used
   anywhere in the source. Is it a requirement of pthreads?

3. How do you disable or enable threads (under Linux)? Is there a
   reason you didn't use the syntax similar to the zlib and lzma
   codecs? For example

find_package(Threads)
if (Threads_FOUND)
  message(STATUS "Threads found")
  # Use threads here.
else (Threads_FOUND)
  message(STATUS "Threads not found")
endif (Threads_FOUND)


Thanks,
Vivek





[jira] [Created] (AVRO-1091) Avro-C - Simple scripts to call cmake from windows and linux

2012-05-15 Thread Vivek Nadkarni (JIRA)
Vivek Nadkarni created AVRO-1091:


 Summary: Avro-C - Simple scripts to call cmake from windows and 
linux
 Key: AVRO-1091
 URL: https://issues.apache.org/jira/browse/AVRO-1091
 Project: Avro
  Issue Type: Improvement
  Components: c
Affects Versions: 1.7.0
 Environment: Windows XP, Windows 7, Ubuntu Linux
Reporter: Vivek Nadkarni
Priority: Minor
 Fix For: 1.7.0


New users of Avro-C may not know the specific command-line options to pass to 
cmake to compile the Avro library. I would like to add simple scripts that 
document the cmake commands under Windows and Linux. These scripts treat the 
Avro-C directory as a standalone project and are independent of the Apache 
build system and directory tree. 







[jira] [Updated] (AVRO-1091) Avro-C - Simple scripts to call cmake from windows and linux

2012-05-15 Thread Vivek Nadkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Nadkarni updated AVRO-1091:
-

Attachment: AVRO-1091.patch

This patch should be applied in the avro-trunk/lang/c directory. It contains 
two files.

cmake_avrolib.bat - for Windows
o Run cmake under Windows to generate the Visual C++ 2008 solution file. 
o Does not run the compiler. 

cmake_avrolib.sh - for Linux
o Run cmake to generate the build directory.
o Run make.
o Run the tests.
o Install the avro library to a subdirectory in the local build directory.



 Avro-C - Simple scripts to call cmake from windows and linux
 

 Key: AVRO-1091
 URL: https://issues.apache.org/jira/browse/AVRO-1091
 Project: Avro
  Issue Type: Improvement
  Components: c
Affects Versions: 1.7.0
 Environment: Windows XP, Windows 7, Ubuntu Linux
Reporter: Vivek Nadkarni
Priority: Minor
 Fix For: 1.7.0

 Attachments: AVRO-1091.patch

   Original Estimate: 24h
  Remaining Estimate: 24h





[jira] [Updated] (AVRO-1091) Avro-C - Simple scripts to call cmake from windows and linux

2012-05-15 Thread Vivek Nadkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Nadkarni updated AVRO-1091:
-

Status: Patch Available  (was: Open)

The scripts are now available for inclusion in Avro-C.





[jira] [Created] (AVRO-1088) Avro-C - Add performance tests for schema resolution and arrays.

2012-05-14 Thread Vivek Nadkarni (JIRA)
Vivek Nadkarni created AVRO-1088:


 Summary: Avro-C - Add performance tests for schema resolution and 
arrays.
 Key: AVRO-1088
 URL: https://issues.apache.org/jira/browse/AVRO-1088
 Project: Avro
  Issue Type: Improvement
  Components: c
Affects Versions: 1.7.0
 Environment: Ubuntu Linux 11.10
Reporter: Vivek Nadkarni
 Fix For: 1.7.0


The current performance test in Avro-C measures the performance of
reading and writing Avro values using a complex record schema,
which does not contain any arrays.

We add tests to measure the performance for simple and nested
arrays. We also replicate all tests to measure the performance of the
schema resolution using a resolved reader and a resolved writer.

Specifically we add the following performance tests:

Nested Record
1. Replicating the test nested record value by index, using a helper
   function. Using helper functions adds a little overhead, but it
   allows us to test various schemas, as well as different modes of
   schema resolution much more easily.
2. Using a resolved writer to resolve between (identical) reader and
   writer schemas, while reading a complex record.
3. Using a resolved reader to resolve between (identical) reader and
   writer schemas, while writing a complex record.

Simple Array
4. Test the performance for reading and writing a simple array.
5. Using a resolved writer to resolve between (identical) reader and
   writer schemas, while reading a simple array.
6. Using a resolved reader to resolve between (identical) reader and
   writer schemas, while writing a simple array.

Nested Array
7. Test the performance for reading and writing a nested array.
8. Using a resolved writer to resolve between (identical) reader and
   writer schemas, while reading a nested array.
9. Using a resolved reader to resolve between (identical) reader and
   writer schemas, while writing a nested array.

Additionally we fix a minor bug:
1. The return value of avro_value_equal_fast() was not being
   tested. Test this return value, and fail if it is FALSE.
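
The shape of such a test can be sketched as a timed loop whose body checks 
the result of an equality comparison and fails as soon as it is false. The 
code below is an illustration only and does not use the Avro-C API: 
toy_value, toy_value_equal() and toy_run_timed() are hypothetical, and a 
memcpy() round trip stands in for writing and reading a value.

```c
#include <string.h>
#include <time.h>

/* Hypothetical value type standing in for an Avro value. */
typedef struct toy_value {
    long id;
    double score;
} toy_value;

/* Returns nonzero when the two values are equal. */
int toy_value_equal(const toy_value *a, const toy_value *b) {
    return a->id == b->id && a->score == b->score;
}

/* Test body: round-trips a value through a byte buffer on each iteration.
 * Returns 1 on success and 0 as soon as a comparison fails. */
int toy_roundtrip_test(unsigned long iterations) {
    unsigned char buf[sizeof(toy_value)];
    unsigned long i;
    for (i = 0; i < iterations; i++) {
        toy_value in, out;
        in.id = (long) i;
        in.score = (double) i * 0.5;
        memcpy(buf, &in, sizeof in);    /* stand-in for "write" */
        memcpy(&out, buf, sizeof out);  /* stand-in for "read" */
        if (!toy_value_equal(&in, &out)) {
            return 0;  /* do not ignore the comparison result */
        }
    }
    return 1;
}

typedef int (*toy_test_fn)(unsigned long iterations);

/* Runs one test body and returns the elapsed CPU time in seconds,
 * or -1.0 if the test body reported a failure. */
double toy_run_timed(toy_test_fn fn, unsigned long iterations) {
    clock_t start = clock();
    int ok = fn(iterations);
    clock_t end = clock();
    if (!ok) return -1.0;
    return (double) (end - start) / CLOCKS_PER_SEC;
}
```

Passing the test body as a function pointer is what lets one harness time 
many schema and resolution variants, at the cost of the small call overhead 
noted in item 1.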







[jira] [Updated] (AVRO-1088) Avro-C - Add performance tests for schema resolution and arrays.

2012-05-14 Thread Vivek Nadkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Nadkarni updated AVRO-1088:
-

Attachment: AVRO-1088.patch

Uploading patch file implementing the new performance tests. 


 Avro-C - Add performance tests for schema resolution and arrays.
 

 Key: AVRO-1088
 URL: https://issues.apache.org/jira/browse/AVRO-1088
 Project: Avro
  Issue Type: Improvement
  Components: c
Affects Versions: 1.7.0
 Environment: Ubuntu Linux 11.10
Reporter: Vivek Nadkarni
 Fix For: 1.7.0

 Attachments: AVRO-1088.patch

   Original Estimate: 24h
  Remaining Estimate: 24h






[jira] [Updated] (AVRO-1088) Avro-C - Add performance tests for schema resolution and arrays.

2012-05-14 Thread Vivek Nadkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Nadkarni updated AVRO-1088:
-

Status: Patch Available  (was: Open)

I ran the performance tests and got the results appended below.

The results show that, as expected, there is a slight performance hit
for using a resolved writer or resolved reader for the complex record,
compared to using the matched schemas.

However, the results also show that for the simple array and for the
nested array, the penalty for using the resolved writer is
substantial. Using the resolved writer takes 30 to 50 times longer
than using no schema resolution or using the resolved reader for
simple and nested arrays.

The performance results indicate that there is likely a bug in the
resolved writer when it resolves simple or nested arrays. This bug
will be reported in a separate AVRO JIRA issue.


 Running refcount 
  100000000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.423s
  Tests/sec:41265475
 Running nested record (legacy) 
  100000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.270s
  Tests/sec:44053
 Running nested record (value by index) 
  1000000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.077s
  Tests/sec:481541
 Running nested record (value by name) 
  1000000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.333s
  Tests/sec:428571
 Running nested record (value by index) matched schemas 
  1000000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.147s
  Tests/sec:465839
 Running nested record (value by index) resolved writer 
  1000000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.480s
  Tests/sec:403226
 Running nested record (value by index) resolved reader 
  1000000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.230s
  Tests/sec:448430
 Running simple array matched schemas 
  250000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.123s
  Tests/sec:117739
 Running simple array resolved writer 
  10000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.747s
  Tests/sec:3641
 Running simple array resolved reader 
  250000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.270s
  Tests/sec:110132
 Running nested array matched schemas 
  250000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 3.030s
  Tests/sec:82508
 Running nested array resolved writer 
  10000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 6.650s
  Tests/sec:1504
 Running simple array resolved reader 
  250000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 3.313s
  Tests/sec:75453
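
Each entry in the log above reports a number of tests per run, three runs,
an average time, and a tests/sec figure. A minimal sketch of a harness that
produces figures of this shape follows. It is not the code from the patch:
the names (tests_per_sec, average_run_time, count_up) are hypothetical, and
timing with clock() (CPU time) is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define NUM_RUNS 3  /* the log shows three runs per test */

/* A test body that performs the measured operation `n` times. */
typedef void (*test_func_t)(unsigned long n);

/* Throughput from the number of tests in one run and the average run time. */
static double tests_per_sec(unsigned long tests_per_run, double avg_seconds)
{
    return avg_seconds > 0.0 ? (double) tests_per_run / avg_seconds : 0.0;
}

/* Run the test body NUM_RUNS times and return the average time per run. */
static double average_run_time(test_func_t func, unsigned long tests_per_run)
{
    double total = 0.0;
    for (int run = 1; run <= NUM_RUNS; run++) {
        clock_t start = clock();
        func(tests_per_run);
        clock_t end = clock();
        total += (double) (end - start) / CLOCKS_PER_SEC;
    }
    return total / NUM_RUNS;
}

/* Trivial placeholder workload; a real test would read or write a value. */
static volatile unsigned long sink;

static void count_up(unsigned long n)
{
    for (unsigned long i = 0; i < n; i++) {
        sink += i;
    }
}
```

Calling average_run_time(count_up, n) and passing the result to
tests_per_sec(n, avg) yields the two summary figures printed for each test.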



 Avro-C - Add performance tests for schema resolution and arrays.
 

 Key: AVRO-1088
 URL: https://issues.apache.org/jira/browse/AVRO-1088
 Project: Avro
  Issue Type: Improvement
  Components: c
Affects Versions: 1.7.0
 Environment: Ubuntu Linux 11.10
Reporter: Vivek Nadkarni
 Fix For: 1.7.0

 Attachments: AVRO-1088.patch

   Original Estimate: 24h
  Remaining Estimate: 24h


[jira] [Created] (AVRO-1089) Avro-C - Penalty 30x to 50x for using resolved writer on arrays

2012-05-14 Thread Vivek Nadkarni (JIRA)
Vivek Nadkarni created AVRO-1089:


 Summary: Avro-C - Penalty 30x to 50x for using resolved writer on arrays
 Key: AVRO-1089
 URL: https://issues.apache.org/jira/browse/AVRO-1089
 Project: Avro
  Issue Type: Bug
  Components: c
Affects Versions: 1.6.3, 1.7.0
 Environment: Ubuntu Linux
Reporter: Vivek Nadkarni
 Fix For: 1.7.0


The new performance tests created in AVRO-1088 show that using the
resolved writer takes 30 to 50 times longer than using no schema
resolution or using the resolved reader for simple and nested arrays.

For a simple array, using the resolved writer took ~30x longer than
using the memory reader that assumed a matching schema. For the nested
array, using the resolved writer took ~50x longer.
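
As a quick check of these ratios, the slowdown can be computed from the
Tests/sec figures in the results below. This is only a sketch of the
arithmetic; slowdown() and print_slowdowns() are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>

/* How many times slower a configuration is than the baseline, by throughput. */
static double slowdown(double baseline_tests_per_sec, double measured_tests_per_sec)
{
    return baseline_tests_per_sec / measured_tests_per_sec;
}

static void print_slowdowns(void)
{
    /* matched schemas vs. resolved writer, Tests/sec from the results */
    printf("simple array: %.1fx\n", slowdown(117739.0, 3641.0)); /* 32.3x */
    printf("nested array: %.1fx\n", slowdown(82508.0, 1504.0));  /* 54.9x */
}
```

The measured ratios are roughly 32x and 55x, in line with the approximate
30x and 50x quoted above.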

These results suggest that there is a bug in the resolved writer. I do
not have a proposed fix at this time.


 Running simple array matched schemas 
  250000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.123s
  Tests/sec:117739
 Running simple array resolved writer 
  10000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.747s
  Tests/sec:3641


 Running nested array matched schemas 
  250000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 3.030s
  Tests/sec:82508
 Running nested array resolved writer 
  10000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 6.650s
  Tests/sec:1504







[jira] [Updated] (AVRO-1089) Avro-C - Penalty 30x to 50x for using resolved writer on arrays

2012-05-14 Thread Vivek Nadkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Nadkarni updated AVRO-1089:
-

Attachment: AVRO-1089-performance.png

This screenshot was generated using kcachegrind, after running the
performance test test_simple_array_resolved_writer(). The plot shows
that the majority of the time (97%) is spent in the function
avro_resolved_writer_free_elements() called by
avro_resolved_array_writer_reset(). This information suggests that the
bug lies in one of these two functions. Unfortunately, I still don't
have an explanation of the mechanism or a fix for this issue.



 Avro-C - Penalty 30x to 50x for using resolved writer on arrays
 ---

 Key: AVRO-1089
 URL: https://issues.apache.org/jira/browse/AVRO-1089
 Project: Avro
  Issue Type: Bug
  Components: c
Affects Versions: 1.6.3, 1.7.0
 Environment: Ubuntu Linux
Reporter: Vivek Nadkarni
 Fix For: 1.7.0

 Attachments: AVRO-1089-performance.png

   Original Estimate: 48h
  Remaining Estimate: 48h

