[jira] [Commented] (AVRO-1173) C++ API for dynamic reading/writing based on schema
[ https://issues.apache.org/jira/browse/AVRO-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463977#comment-13463977 ]

Vivek Nadkarni commented on AVRO-1173:
--------------------------------------

It would be great if you could create a JIRA entry for the issues you fixed with the Windows build, and upload a patch file with all your changes in it. One of the committers (usually Douglas Creager for Avro-C) would then merge these changes before the next release.

Also, does anyone on the list know how the automatic build servers work for Avro? Would it be possible to add a Windows build for Avro-C, so we could catch issues more quickly when things break?

C++ API for dynamic reading/writing based on schema
---------------------------------------------------

Key: AVRO-1173
URL: https://issues.apache.org/jira/browse/AVRO-1173
Project: Avro
Issue Type: Wish
Components: c++
Reporter: Stefan Langer

When I started looking at Avro I hoped it would offer some API to read values by name/id (or at least to get the name/id of a datum while iterating over all entries).

Looking at the examples for C (http://avro.apache.org/docs/1.6.3/api/c/index.html#_examples) or at some Java examples, there are getters/setters that take name arguments, and there are Record objects constructed from a schema that help with reading/writing data.

While testing the C++ API, I couldn't find a way to do this! I'm still not sure whether I'm missing some part of the API or whether it is just not yet part of the C++ interface.

About the C API: I could not use it, because it is C99 focused, so it can't be compiled on our VS2008. For the C++ API it takes just some tiny tweaks to get it running.

About the generator: I'm not interested in generating code (if I were, there are enough alternatives to Avro).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (AVRO-1173) C++ API for dynamic reading/writing based on schema
[ https://issues.apache.org/jira/browse/AVRO-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464324#comment-13464324 ]

Vivek Nadkarni commented on AVRO-1173:
--------------------------------------

Thanks for the information about the buildbot.
[jira] [Updated] (AVRO-1164) Avro-C: Deallocate resources used in test_avro_schema.c
[ https://issues.apache.org/jira/browse/AVRO-1164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-1164:
---------------------------------

Attachment: AVRO-1164.patch

Patch to fix AVRO-1164.

Avro-C: Deallocate resources used in test_avro_schema.c
-------------------------------------------------------

Key: AVRO-1164
URL: https://issues.apache.org/jira/browse/AVRO-1164
Project: Avro
Issue Type: Bug
Components: c
Affects Versions: 1.7.1
Environment: Ubuntu Linux 11.04
Reporter: Vivek Nadkarni
Priority: Trivial
Fix For: 1.7.2
Attachments: AVRO-1164.patch
Original Estimate: 1h
Remaining Estimate: 1h

A few resources that are allocated in test_avro_schema.c are never deallocated. This minor fix deallocates them. Specifically, a file pointer, a directory entry and a schema that were allocated are now also deallocated.

These resource leaks can be seen using valgrind. However, fixing them doesn't resolve all valgrind issues, since other bugs (e.g. AVRO-766) need to be resolved before valgrind runs clean.
[jira] [Updated] (AVRO-1164) Avro-C: Deallocate resources used in test_avro_schema.c
[ https://issues.apache.org/jira/browse/AVRO-1164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-1164:
---------------------------------

Status: Patch Available (was: Open)

Submit patch for review.
[jira] [Created] (AVRO-1165) Avro-C: Memory leak in value iface containing AVRO_LINK.
Vivek Nadkarni created AVRO-1165:
------------------------------------

Summary: Avro-C: Memory leak in value iface containing AVRO_LINK.
Key: AVRO-1165
URL: https://issues.apache.org/jira/browse/AVRO-1165
Project: Avro
Issue Type: Bug
Components: c
Affects Versions: 1.7.1
Environment: Ubuntu Linux 11.10
Reporter: Vivek Nadkarni
Fix For: 1.7.2

A memory leak can be seen when the following matched pair of commands is called, using a schema containing an AVRO_LINK. This pair of commands constructs a class (value iface) from a schema and then destroys the constructed class.

    record_class = avro_generic_class_from_schema( schema );
    avro_value_iface_decref( record_class );

If the schema contains an AVRO_LINK, then avro_generic_class_from_schema() calls avro_generic_link_class(), which calls avro_schema_incref() on the AVRO_LINK target schema and assigns the target schema pointer to iface->schema.

When we subsequently call avro_value_iface_decref() to deallocate the class, this function calls avro_generic_link_decref_iface(), which frees the memory for the link interface without calling avro_schema_decref() on the target schema pointed to by iface->schema. Thus the memory of the target schema is leaked when we create and destroy a value interface for an AVRO_LINK.

Calling avro_schema_decref() on the target schema (iface->schema) before calling avro_freet() on the iface fixes this memory leak.

Note: The pair of commands shown above results in a memory leak when we create and destroy a value interface from *any* schema containing an AVRO_LINK, regardless of whether it is recursive or not. There is a separate issue regarding memory leaks with recursive schemas, described in AVRO-766. Until AVRO-766 is fixed, the fix for this issue can only be tested with non-recursive schemas containing AVRO_LINKs.
[jira] [Updated] (AVRO-1165) Avro-C: Memory leak in value iface containing AVRO_LINK.
[ https://issues.apache.org/jira/browse/AVRO-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-1165:
---------------------------------

Attachment: AVRO-1165-FIX.patch

This fix (AVRO-1165-FIX.patch) decrements the reference count of the target schema of an AVRO_LINK before deallocating the AVRO_LINK value interface.

Before the fix is applied, valgrind shows the following output:

==6697== 1,410 (24 direct, 1,386 indirect) bytes in 1 blocks are definitely lost in loss record 12 of 12
==6697==    at 0x4C28F9F: malloc (vg_replace_malloc.c:236)
==6697==    by 0x4C29019: realloc (vg_replace_malloc.c:525)
==6697==    by 0x413494: avro_default_allocator (allocation.c:36)
==6697==    by 0x40AE84: avro_schema_link (schema.c:663)
==6697==    by 0x40B467: avro_schema_from_json_t (schema.c:786)
==6697==    by 0x40B80A: avro_schema_from_json_t (schema.c:888)
==6697==    by 0x40BE97: avro_schema_from_json_root (schema.c:1083)
==6697==    by 0x40BFA2: avro_schema_from_json (schema.c:1108)
==6697==    by 0x404A1F: main (test_avro_1165.c:56)
==6697==
==6697== LEAK SUMMARY:
==6697==    definitely lost: 48 bytes in 2 blocks
==6697==    indirectly lost: 1,386 bytes in 14 blocks
==6697==      possibly lost: 0 bytes in 0 blocks
==6697==    still reachable: 0 bytes in 0 blocks
==6697==         suppressed: 0 bytes in 0 blocks

After the fix is applied, valgrind shows the following output:

==7247== HEAP SUMMARY:
==7247==     in use at exit: 0 bytes in 0 blocks
==7247==   total heap usage: 180 allocs, 180 frees, 9,563 bytes allocated
==7247==
==7247== All heap blocks were freed -- no leaks are possible
==7247==
==7247== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)

Avro-C: Memory leak in value iface containing AVRO_LINK.
--------------------------------------------------------

Key: AVRO-1165
URL: https://issues.apache.org/jira/browse/AVRO-1165
Project: Avro
Issue Type: Bug
Components: c
Affects Versions: 1.7.1
Environment: Ubuntu Linux 11.10
Reporter: Vivek Nadkarni
Fix For: 1.7.2
Attachments: AVRO-1165-FIX.patch, AVRO-1165-TEST.patch
Original Estimate: 24h
Remaining Estimate: 24h
[jira] [Created] (AVRO-1167) Avro-C: avro_schema_copy() does not copy AVRO_LINKs properly.
Vivek Nadkarni created AVRO-1167:
------------------------------------

Summary: Avro-C: avro_schema_copy() does not copy AVRO_LINKs properly.
Key: AVRO-1167
URL: https://issues.apache.org/jira/browse/AVRO-1167
Project: Avro
Issue Type: Bug
Components: c
Affects Versions: 1.7.1
Environment: Ubuntu Linux 11.10
Reporter: Vivek Nadkarni

When avro_schema_copy() encounters an AVRO_LINK while copying an old_schema to a new_schema, it sets the target of the new_link to the target of the old_link in the old_schema. Thus, the AVRO_LINK in the new_schema points to an element in the old_schema. While this is currently safe, since the reference count of the target in the old_schema is incremented, we are not really making a copy of the schema.

There is a TODO in the code, which says that we should make an avro_schema_copy() of the target in old_schema instead of linking directly to it. However, the current code and the TODO's proposed solution have a few problems:

1. Avro schemas are intended to be self-contained. That implies that AVRO_LINKs are intended to be internal links inside a self-contained schema. The current code introduces unnecessary (and potentially disallowed) external dependencies in an Avro schema.

2. The purpose of copying a schema is to decouple the old_schema from the new_schema. The two copies may have different owners, we may want to deallocate the old schema, etc.

3. If the schema is recursive, then copying the link target as the TODO suggests would enter an infinite recursion loop.

It appears to me that the correct solution would be to replicate the entire structure of the current schema, including the internal links. This means that if old_link_A points to old_target_B, then new_link_A should point to new_target_B in the new schema. Note that there should be only one copy of new_target_B in the new schema, even if there are multiple links pointing to new_target_B, i.e. we should not make a new copy for each link.

In order to implement this proper copying of links, we would need to keep a lookup table of pairs of old and new schemas as they are being created, as well as a list of all the AVRO_LINKs that are copied. Then, as a post-copy step, we would go and fix up all the AVRO_LINKs to point to the appropriate targets. This is the way the schema is constructed in the first place in avro_schema_from_json().

An inefficient way to obtain the correct result from avro_schema_copy() would be to perform an avro_schema_to_json() followed by an avro_schema_from_json().

Note: I have not implemented a fix for this issue, but I am documenting it in JIRA because it needs to be resolved before AVRO-766 can be fixed.
[jira] [Updated] (AVRO-766) C: Memory leak from reference count cycles
[ https://issues.apache.org/jira/browse/AVRO-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-766:
--------------------------------

Attachment: AVRO-766.patch

The current Avro specification requires that AVRO_LINKs are internal to a schema. See the comments in AVRO-530, which indicate that external links are not backwards compatible and may only be introduced in Avro 2.0.

Currently, when an AVRO_LINK is created in a schema, the reference count of the target schema is incremented. If the schema is recursive, then the reference count of the top-level schema is incremented, and decrementing the reference count of the top-level schema no longer deallocates the schema. Thus, memory is leaked.

Since all AVRO_LINKs have targets that are internal to the same top-level schema, we could decide not to increment the reference count of the targets of any links. The targets would be available as long as the top-level schema is available. When the top-level schema is destroyed, all of its internal links are destroyed too. Therefore, as long as the link itself is valid, its target is also valid. Using this internal structural knowledge of the schema gives us an implicit guarantee of link target validity, while breaking the reference count cycles for recursive schemas.

To implement this mechanism, we would need to ensure that no AVRO_LINKs are created with targets outside the top-level schema. While we cannot enforce this rule, we can document that external link targets would violate the spec and could result in memory leaks.

Unfortunately, avro_schema_copy() currently creates a link to an external target, as described in AVRO-1167. Therefore, AVRO-766 should not be fixed using the described mechanism until AVRO-1167 is also fixed.

This patch removes the increment and decrement of reference counts for link targets as described above. It also contains a test case, test_avro_766.c (derived from ref-cycle.c), which shows the memory leak. The test case contains a macro called TEST_AVRO_1167, which is currently enabled. If the macro is disabled, you can see that this patch works.

With TEST_AVRO_1167 set to (0):

==21796== HEAP SUMMARY:
==21796==     in use at exit: 0 bytes in 0 blocks
==21796==   total heap usage: 129 allocs, 129 frees, 6,090 bytes allocated
==21796==
==21796== All heap blocks were freed -- no leaks are possible
==21796==
==21796== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)

With TEST_AVRO_1167 set to (1):

==21417== 4,240 (32 direct, 4,208 indirect) bytes in 1 blocks are definitely lost in loss record 30 of 30
==21417==    at 0x4C28F9F: malloc (vg_replace_malloc.c:236)
==21417==    by 0x4C29019: realloc (vg_replace_malloc.c:525)
==21417==    by 0x40D34C: avro_default_allocator (allocation.c:36)
==21417==    by 0x404647: avro_schema_union (schema.c:310)
==21417==    by 0x406886: avro_schema_copy (schema.c:1250)
==21417==    by 0x40670A: avro_schema_copy (schema.c:1183)
==21417==    by 0x403EEA: main (test_avro_766.c:64)
==21417==
==21417== LEAK SUMMARY:
==21417==    definitely lost: 56 bytes in 2 blocks
==21417==    indirectly lost: 4,232 bytes in 44 blocks
==21417==      possibly lost: 0 bytes in 0 blocks
==21417==    still reachable: 0 bytes in 0 blocks
==21417==         suppressed: 0 bytes in 0 blocks
==21417==
==21417== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 4 from 4)

I am posting this test with TEST_AVRO_1167 set to (1) because I don't know the implications of applying this patch before fixing AVRO-1167.

C: Memory leak from reference count cycles
------------------------------------------

Key: AVRO-766
URL: https://issues.apache.org/jira/browse/AVRO-766
Project: Avro
Issue Type: Bug
Components: c
Affects Versions: 1.5.0
Reporter: Douglas Creager
Attachments: AVRO-766.patch, ref-cycle.c

If you parse a recursive Avro schema, you end up with a cycle in the reference graph for the avro_schema_t objects that are created. The reference counting mechanism that we're using can't detect this, and so you get a memory leak.
[jira] [Commented] (AVRO-1092) avro-c: improving thread safety in error management code
[ https://issues.apache.org/jira/browse/AVRO-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278201#comment-13278201 ]

Vivek Nadkarni commented on AVRO-1092:
--------------------------------------

Thanks for the clarification.

Cheers,
Vivek

avro-c: improving thread safety in error management code
--------------------------------------------------------

Key: AVRO-1092
URL: https://issues.apache.org/jira/browse/AVRO-1092
Project: Avro
Issue Type: Bug
Components: c
Affects Versions: 1.6.3, 1.7.0
Reporter: Pugachev Maxim
Priority: Critical
Attachments: AVRO-1092-patch-2.patch, AVRO-1092.patch

Error management code isn't thread safe at all. I wrote a patch for this issue, but it works only for *nix systems.

Affected functions: avro_set_error(), avro_prefix_error()
[jira] [Updated] (AVRO-1092) avro-c: improving thread safety in error management code
[ https://issues.apache.org/jira/browse/AVRO-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-1092:
---------------------------------

Attachment: AVRO-1092-patch-2.patch

I downloaded the patch to test it under Windows, since you had called out the potential Windows incompatibility. The code didn't compile under Windows as posted, but with a minor modification I was able to get it to compile and pass the tests.

I am attaching my modification as a patch to be applied on top of AVRO-1092.patch. It should be applied from within the avro-trunk/lang/c directory.

My patch also updates the file README.maintaining_win32.txt to record that MSVC++ does not support structure initialization using element names.
[jira] [Commented] (AVRO-1092) avro-c: improving thread safety in error management code
[ https://issues.apache.org/jira/browse/AVRO-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13277137#comment-13277137 ]

Vivek Nadkarni commented on AVRO-1092:
--------------------------------------

A few questions regarding the patch, specifically the CMakeLists.txt file. I am by no means a CMake expert, so apologies if the answers should be obvious :-).

1. Are THREAD_LIBRARIES and THREADSAFE CMake intrinsic definitions, or are they your own definitions?

2. I didn't see why the definition _REENTRANT was set. It isn't used anywhere in the source. Is it a requirement of pthreads?

3. How do you disable or enable threads (under Linux)? Is there a reason you didn't use syntax similar to that of the zlib and lzma codecs? For example:

    find_package(Threads)
    if (Threads_FOUND)
        message("Threads_FOUND")
        # Use threads here.
    else (Threads_FOUND)
        message("Threads not FOUND")
    endif (Threads_FOUND)

Thanks,
Vivek
[jira] [Created] (AVRO-1091) Avro-C - Simple scripts to call cmake from windows and linux
Vivek Nadkarni created AVRO-1091:
------------------------------------

Summary: Avro-C - Simple scripts to call cmake from windows and linux
Key: AVRO-1091
URL: https://issues.apache.org/jira/browse/AVRO-1091
Project: Avro
Issue Type: Improvement
Components: c
Affects Versions: 1.7.0
Environment: Windows XP, Windows 7, Ubuntu Linux
Reporter: Vivek Nadkarni
Priority: Minor
Fix For: 1.7.0

New users of Avro-C may not know the specific command-line options that cmake needs to compile the Avro library. I would like to add simple scripts that document the cmake commands under Windows and Linux. These scripts treat the Avro-C directory as a standalone project and are independent of the Apache build system and directory tree.
[jira] [Updated] (AVRO-1091) Avro-C - Simple scripts to call cmake from windows and linux
[ https://issues.apache.org/jira/browse/AVRO-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-1091:
---------------------------------

Attachment: AVRO-1091.patch

This patch should be applied in the avro-trunk/lang/c directory. It contains two files.

cmake_avrolib.bat - for Windows
  o Run cmake under Windows to generate the Visual C++ 2008 solution file.
  o Does not run the compiler.

cmake_avrolib.sh - for Linux
  o Run cmake to generate the build directory.
  o Run make.
  o Run the tests.
  o Install the avro library to a subdirectory in the local build directory.
[jira] [Updated] (AVRO-1091) Avro-C - Simple scripts to call cmake from windows and linux
[ https://issues.apache.org/jira/browse/AVRO-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-1091:
---------------------------------

Status: Patch Available (was: Open)

The scripts are now available for inclusion in Avro-C.
[jira] [Created] (AVRO-1088) Avro-C - Add performance tests for schema resolution and arrays.
Vivek Nadkarni created AVRO-1088:
------------------------------------

Summary: Avro-C - Add performance tests for schema resolution and arrays.
Key: AVRO-1088
URL: https://issues.apache.org/jira/browse/AVRO-1088
Project: Avro
Issue Type: Improvement
Components: c
Affects Versions: 1.7.0
Environment: Ubuntu Linux 11.10
Reporter: Vivek Nadkarni
Fix For: 1.7.0

The current performance test in Avro-C measures the performance of reading and writing Avro values using a complex record schema, which does not contain any arrays. We add tests to measure the performance for simple and nested arrays. We also replicate all tests to measure the performance of schema resolution using a resolved reader and a resolved writer.

Specifically, we add the following performance tests:

Nested Record
1. Replicating the test nested record value by index, using a helper function. Using helper functions adds a little overhead, but it allows us to test various schemas, as well as different modes of schema resolution, much more easily.
2. Using a resolved writer to resolve between (identical) reader and writer schemas, while reading a complex record.
3. Using a resolved reader to resolve between (identical) reader and writer schemas, while writing a complex record.

Simple Array
4. Test the performance for reading and writing a simple array.
5. Using a resolved writer to resolve between (identical) reader and writer schemas, while reading a simple array.
6. Using a resolved reader to resolve between (identical) reader and writer schemas, while writing a simple array.

Nested Array
7. Test the performance for reading and writing a nested array.
8. Using a resolved writer to resolve between (identical) reader and writer schemas, while reading a nested array.
9. Using a resolved reader to resolve between (identical) reader and writer schemas, while writing a nested array.

Additionally we fix a minor bug:
1. The return value of avro_value_equal_fast() was not being tested. Test this return value, and fail if it is FALSE.
[jira] [Updated] (AVRO-1088) Avro-C - Add performance tests for schema resolution and arrays.
[ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-1088:

    Attachment: AVRO-1088.patch

Uploading patch file implementing the new performance tests.

Avro-C - Add performance tests for schema resolution and arrays.

    Key: AVRO-1088
    URL: https://issues.apache.org/jira/browse/AVRO-1088
    Project: Avro
    Issue Type: Improvement
    Components: c
    Affects Versions: 1.7.0
    Environment: Ubuntu Linux 11.10
    Reporter: Vivek Nadkarni
    Fix For: 1.7.0
    Attachments: AVRO-1088.patch
    Original Estimate: 24h
    Remaining Estimate: 24h

The current performance test in Avro-C measures the performance of reading and writing Avro values using a complex record schema, which does not contain any arrays. We add tests to measure the performance for simple and nested arrays. We also replicate all tests to measure the performance of schema resolution using a resolved reader and a resolved writer.

Specifically, we add the following performance tests:

Nested Record
1. Replicating the "nested record (value by index)" test, using a helper function. Using helper functions adds a little overhead, but it allows us to test various schemas, as well as different modes of schema resolution, much more easily.
2. Using a resolved writer to resolve between (identical) reader and writer schemas, while reading a complex record.
3. Using a resolved reader to resolve between (identical) reader and writer schemas, while writing a complex record.

Simple Array
4. Test the performance for reading and writing a simple array.
5. Using a resolved writer to resolve between (identical) reader and writer schemas, while reading a simple array.
6. Using a resolved reader to resolve between (identical) reader and writer schemas, while writing a simple array.

Nested Array
7. Test the performance for reading and writing a nested array.
8. Using a resolved writer to resolve between (identical) reader and writer schemas, while reading a nested array.
9. Using a resolved reader to resolve between (identical) reader and writer schemas, while writing a nested array.

Additionally, we fix a minor bug:
1. The return value of avro_value_equal_fast() was not being tested. Test this return value, and fail if it is FALSE.
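For readers unfamiliar with this style of test, the following is a minimal sketch of a timing harness of the kind described: it runs a test function a fixed number of times per run, repeats that for three runs, and reports the average time and the resulting tests per second. It is not the code from AVRO-1088.patch. The names run_perf_test, perf_result_t, test_func_t and toy_workload are hypothetical, and the workload is a stand-in loop rather than any Avro call.

```c
#include <stdio.h>
#include <time.h>

#define NUM_RUNS 3  /* assumption: three timed runs per test */

/* Hypothetical signature: a test performs num_tests iterations of its work. */
typedef void (*test_func_t)(unsigned long num_tests);

/* Hypothetical result record for one performance test. */
typedef struct {
    double average_seconds;   /* mean CPU time of one run */
    double tests_per_second;  /* num_tests divided by average_seconds */
} perf_result_t;

/* Times NUM_RUNS runs of func and reports the average and the rate. */
perf_result_t run_perf_test(const char *name, unsigned long num_tests,
                            test_func_t func)
{
    perf_result_t result = { 0.0, 0.0 };
    double total_seconds = 0.0;

    printf("Running %s\n  %lu tests per run\n", name, num_tests);
    for (int run = 1; run <= NUM_RUNS; run++) {
        printf("  Run %d\n", run);
        clock_t start = clock();
        func(num_tests);
        clock_t end = clock();
        total_seconds += (double)(end - start) / CLOCKS_PER_SEC;
    }

    result.average_seconds = total_seconds / NUM_RUNS;
    /* Guard against a run too short for clock() to measure. */
    if (result.average_seconds > 0.0) {
        result.tests_per_second = (double)num_tests / result.average_seconds;
    }
    printf("  Average time: %.3fs\n  Tests/sec: %.0f\n",
           result.average_seconds, result.tests_per_second);
    return result;
}

/* Stand-in workload: sums integers so that the loop does some work. */
static volatile unsigned long toy_sink;

void toy_workload(unsigned long num_tests)
{
    unsigned long sum = 0;
    for (unsigned long i = 0; i < num_tests; i++) {
        sum += i;
    }
    toy_sink = sum;
}
```

A caller would invoke, for example, run_perf_test("toy workload", 1000000UL, toy_workload). In a real test, the workload would read or write Avro values and could return a status so that failed comparisons are not silently ignored.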
[jira] [Updated] (AVRO-1088) Avro-C - Add performance tests for schema resolution and arrays.
[ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-1088:

    Status: Patch Available (was: Open)

I ran the performance tests and got the results appended below.

The results show that, as expected, there is a slight performance hit for using a resolved writer or resolved reader for the complex record, compared to using the matched schemas.

However, the results also show that for the simple array and for the nested array, the penalty for using the resolved writer is substantial. Using the resolved writer takes 30 to 50 times longer than using no schema resolution or using the resolved reader for simple and nested arrays.

The performance results indicate that there is a likely bug in the resolved writer, when it is trying to resolve simple or nested arrays. This bug will be reported in a separate AVRO-JIRA issue.

Running refcount
  1 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.423s
  Tests/sec: 41265475
Running nested record (legacy)
  10 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.270s
  Tests/sec: 44053
Running nested record (value by index)
  100 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.077s
  Tests/sec: 481541
Running nested record (value by name)
  100 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.333s
  Tests/sec: 428571
Running nested record (value by index) matched schemas
  100 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.147s
  Tests/sec: 465839
Running nested record (value by index) resolved writer
  100 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.480s
  Tests/sec: 403226
Running nested record (value by index) resolved reader
  100 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.230s
  Tests/sec: 448430
Running simple array matched schemas
  25 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.123s
  Tests/sec: 117739
Running simple array resolved writer
  1 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.747s
  Tests/sec: 3641
Running simple array resolved reader
  25 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.270s
  Tests/sec: 110132
Running nested array matched schemas
  25 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 3.030s
  Tests/sec: 82508
Running nested array resolved writer
  1 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 6.650s
  Tests/sec: 1504
Running simple array resolved reader
  25 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 3.313s
  Tests/sec: 75453
[jira] [Created] (AVRO-1089) Avro-C - Penalty 30x to 50x for using resolved writer on arrays
Vivek Nadkarni created AVRO-1089:

    Summary: Avro-C - Penalty 30x to 50x for using resolved writer on arrays
    Key: AVRO-1089
    URL: https://issues.apache.org/jira/browse/AVRO-1089
    Project: Avro
    Issue Type: Bug
    Components: c
    Affects Versions: 1.6.3, 1.7.0
    Environment: Ubuntu Linux
    Reporter: Vivek Nadkarni
    Fix For: 1.7.0

The new performance tests created in AVRO-1088 show that using the resolved writer takes 30 to 50 times longer than using no schema resolution or using the resolved reader for simple and nested arrays.

For a simple array, using the resolved writer took ~30x longer than using the memory reader that assumed a matching schema. For the nested array, using the resolved writer took ~50x longer.

These results suggest that there is a bug in the resolved writer. I do not have a proposed fix at this time.

Running simple array matched schemas
  25 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.123s
  Tests/sec: 117739
Running simple array resolved writer
  1 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.747s
  Tests/sec: 3641
Running nested array matched schemas
  25 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 3.030s
  Tests/sec: 82508
Running nested array resolved writer
  1 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 6.650s
  Tests/sec: 1504
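The ~30x and ~50x figures follow from the Tests/sec values quoted in the report: the slowdown is the matched-schema rate divided by the resolved-writer rate. The short sketch below reproduces that arithmetic. The function names slowdown_factor and print_array_slowdowns are hypothetical and used only for illustration; the input numbers are the ones reported in this issue.

```c
#include <stdio.h>

/* Slowdown of a test relative to a baseline, computed from "Tests/sec"
 * figures. A lower rate means proportionally more time per test.
 * Returns 0.0 if the slower rate is not positive. */
double slowdown_factor(double baseline_tests_per_sec, double tests_per_sec)
{
    if (tests_per_sec <= 0.0) {
        return 0.0;
    }
    return baseline_tests_per_sec / tests_per_sec;
}

/* Prints the factor for the two array cases, using the reported rates:
 * matched schemas versus resolved writer. */
void print_array_slowdowns(void)
{
    printf("simple array: %.1fx\n", slowdown_factor(117739.0, 3641.0));
    printf("nested array: %.1fx\n", slowdown_factor(82508.0, 1504.0));
}
```

With the reported rates this gives roughly 32x for the simple array and roughly 55x for the nested array, in line with the 30x to 50x range in the summary.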
[jira] [Updated] (AVRO-1089) Avro-C - Penalty 30x to 50x for using resolved writer on arrays
[ https://issues.apache.org/jira/browse/AVRO-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-1089:

    Attachment: AVRO-1089-performance.png

This screenshot was generated using kcachegrind, after running the performance test test_simple_array_resolved_writer().

The plot shows that the majority of the time (97%) is spent in the function avro_resolved_writer_free_elements() called by avro_resolved_array_writer_reset().

This information suggests that the bug lies in one of these two functions. Unfortunately, I still don't have a mechanism or a fix for this issue.

Avro-C - Penalty 30x to 50x for using resolved writer on arrays

    Key: AVRO-1089
    URL: https://issues.apache.org/jira/browse/AVRO-1089
    Project: Avro
    Issue Type: Bug
    Components: c
    Affects Versions: 1.6.3, 1.7.0
    Environment: Ubuntu Linux
    Reporter: Vivek Nadkarni
    Fix For: 1.7.0
    Attachments: AVRO-1089-performance.png
    Original Estimate: 48h
    Remaining Estimate: 48h