[ 
https://issues.apache.org/jira/browse/AVRO-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18030138#comment-18030138
 ] 

ASF subversion and git services commented on AVRO-4172:
-------------------------------------------------------

Commit 54b332161524086dcb6cde8afe097097eed7f3ee in avro's branch 
refs/heads/main from Zhang Jiawei
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=54b3321615 ]

AVRO-4172: [C++] Fix ZSTD codec compatibility (#3457)

* AVRO-4172: [C++] Fix ZSTD codec compatibility

* AVRO-4172: [C++] add codec compatibility test

* fix

* fix

* fix

* add ZstdCodecWrapper for compression

* fix

* fix

* split zstd compress and decompress wrapper

* fix

> [C++] Fix ZSTD codec compatibility with Java Avro
> -------------------------------------------------
>
>                 Key: AVRO-4172
>                 URL: https://issues.apache.org/jira/browse/AVRO-4172
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: c++, compatibility
>            Reporter: Zhang Jiawei
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.13.0
>
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> We have identified two cross-language compatibility issues related to the 
> ZSTD codec in Avro:
>  1. Different codec names
>     Java Avro (and the other language bindings that follow it) writes the
>     codec into the file metadata as "zstandard", while the C++
>     implementation writes "zstd". A data file produced by one language is
>     therefore unreadable by the other.
>     Java: https://github.com/apache/avro/blob/dc7bbd086283bb61dfabd8fcdf980d22f30c7a93/lang/java/avro/src/main/java/org/apache/avro/file/DataFileConstants.java#L40
>     C++: https://github.com/apache/avro/blob/dc7bbd086283bb61dfabd8fcdf980d22f30c7a93/lang/c%2B%2B/impl/DataFile.cc#L57
>  2. Streaming vs. single-shot encoding
>     Java Avro writes ZSTD data in streaming mode, whereas the C++
>     implementation can only decode single-shot ZSTD frames. As a result, a
>     ZSTD-compressed file generated by Java Avro cannot be read by the
>     current C++ library.
>     Reference: https://github.com/apache/avro/blob/dc7bbd086283bb61dfabd8fcdf980d22f30c7a93/lang/c%2B%2B/impl/DataFile.cc#L494
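
To illustrate the codec-name mismatch described in the issue, here is a minimal C++17 sketch of a name lookup that always writes "zstandard" but accepts both spellings when reading. It is not the code from the commit: the `Codec` enum, `codecFromName`, and `codecToName` are hypothetical names chosen for this example, and accepting the legacy "zstd" spelling on read is an assumption about one possible way to stay compatible with files written by older C++ builds.

```cpp
#include <optional>
#include <string_view>

// Hypothetical codec identifier for this sketch; not the Avro C++ library's type.
enum class Codec { Null, Deflate, Zstandard };

// Spelling the issue attributes to Java Avro file metadata.
constexpr std::string_view kZstdCanonicalName = "zstandard";
// Spelling the issue attributes to the C++ writer before the fix.
constexpr std::string_view kZstdLegacyName = "zstd";

// Map a codec name read from file metadata to a codec.
// Assumption: both ZSTD spellings are accepted on read.
std::optional<Codec> codecFromName(std::string_view name) {
    if (name == "null") return Codec::Null;
    if (name == "deflate") return Codec::Deflate;
    if (name == kZstdCanonicalName || name == kZstdLegacyName) {
        return Codec::Zstandard;
    }
    return std::nullopt;  // unknown codec: the caller decides how to fail
}

// Map a codec to the name a writer should put into file metadata.
// Only the canonical ZSTD spelling is ever written.
constexpr std::string_view codecToName(Codec codec) {
    switch (codec) {
        case Codec::Null: return "null";
        case Codec::Deflate: return "deflate";
        case Codec::Zstandard: return kZstdCanonicalName;
    }
    return "null";
}
```

With this shape, a reader would call `codecFromName` on the metadata value and reject the file only when it returns `std::nullopt`, while a writer would always emit `codecToName(Codec::Zstandard)`.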



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
