[ 
https://issues.apache.org/jira/browse/AVRO-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863614#comment-17863614
 ] 

ASF subversion and git services commented on AVRO-1521:
-------------------------------------------------------

Commit 82d864fd3751e77ecd255b6b28914926d72916f9 in avro's branch 
refs/heads/dependabot/maven/lang/java/org.apache.hadoop-hadoop-client-3.4.0 
from José Joaquín Atria
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=82d864fd3 ]

AVRO-1521 [Perl] Fix boolean encoding errors (#2986)

This change fixes a long-standing issue with the binary encoding
of boolean values. Avro::Schema accepted several "smart" values as
valid booleans (e.g. "true" and "no"), but Avro::BinaryEncoder encoded
them as true or false according to their Perl truth value. Both of
those examples were therefore encoded as true, because Perl treats
any non-empty string other than "0" as true.
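
To illustrate the behavior described above, here is a minimal sketch of
truthiness-based encoding. It is not taken from the Avro sources; the
helper name naive_encode_boolean is hypothetical and only mimics the
idea of writing a single byte based on Perl truthiness.

```perl
use strict;
use warnings;

# Hypothetical stand-in for truthiness-based encoding; this is NOT the
# actual Avro::BinaryEncoder code. It returns the single byte that such
# an encoder would write for a boolean.
sub naive_encode_boolean {
    my ($value) = @_;
    return $value ? "\x01" : "\x00";
}

# 'true', 'no' and 'false' are all non-empty strings other than '0',
# so Perl treats every one of them as true.
for my $value ('true', 'no', 'false', '0', '') {
    my $byte = naive_encode_boolean($value);
    printf "%-7s => 0x%02x\n", "'$value'", ord $byte;
}
```

With this approach, 'no' and 'false' produce the same byte as 'true';
only '0' and the empty string produce the false byte.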

With this change, those values are accepted and encoded according to
their intended meaning. The encoder also handles other representations
of boolean values, such as JSON::PP::Boolean references and native
Perl booleans (those returned by e.g. builtin::true).
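
As a rough sketch of what such normalization could look like (an
assumption for illustration, not the code from this commit), the
hypothetical helper to_boolean below maps JSON::PP boolean objects and
the accepted strings to 1 or 0, and rejects everything else. The exact
set of accepted strings and the handling of undef are assumptions.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical normalizer; the name, the accepted strings and the
# error handling are assumptions, not the Avro implementation.
sub to_boolean {
    my ($value) = @_;

    die "Not a boolean: undef\n" unless defined $value;

    # JSON::PP::true and JSON::PP::false overload boolean context.
    return $value ? 1 : 0 if JSON::PP::is_bool($value);

    # Anchored and case-insensitive, so only whole-string matches count.
    return 1 if $value =~ /\A(?:yes|y|true|t)\z/i;
    return 0 if $value =~ /\A(?:no|n|false|f)\z/i;

    die "Not a boolean: $value\n";
}

printf "%s => %d\n", $_, to_boolean($_) for qw(true no Y f);
printf "JSON::PP::false => %d\n", to_boolean(JSON::PP::false);
```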

This also includes a small but possibly breaking bugfix for the
detection of valid boolean values in Avro::Schema. The check used a
non-anchored regular expression, so e.g. any value containing an "n"
anywhere was considered valid. This was most likely unintentional,
so although the fix changes behavior, it seems necessary.
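
The difference between the two kinds of pattern can be seen in a short
sketch. The unanchored pattern is the one quoted in the issue
description below; the anchored form is one possible fix, shown as an
assumption and not necessarily the expression used in the commit.

```perl
use strict;
use warnings;

# Pattern as quoted in the issue description: with no anchors, it
# matches if any alternative occurs anywhere in the string.
my $unanchored = qr{yes|no|y|n|t|f|true|false}i;

# One possible anchored form (an assumption, not necessarily the
# expression used in the actual fix): the whole string must match.
my $anchored = qr{\A(?:yes|no|y|n|t|f|true|false)\z}i;

for my $value ('no', 'banana', 'nothing', 'TRUE') {
    printf "%-9s unanchored=%d anchored=%d\n",
        $value,
        ($value =~ $unanchored ? 1 : 0),
        ($value =~ $anchored   ? 1 : 0);
}
```

Here 'banana' and 'nothing' match the unanchored pattern because they
contain an "n", but they do not match the anchored one.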

> Inconsistent behavior of Perl API with 'boolean' type
> -----------------------------------------------------
>
>                 Key: AVRO-1521
>                 URL: https://issues.apache.org/jira/browse/AVRO-1521
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: perl
>            Reporter: John Karp
>            Assignee: José Joaquín Atria
>            Priority: Major
>             Fix For: 1.12.0
>
>
> The Perl boolean serialization code in BinaryEncoder.pm encodes anything 
> false to Perl, such as 0, '0', '', () and undef, as false, and anything true 
> to Perl, which is literally everything else, as true.
> Inconsistent with the above serialization, the code used in Schema.pm to 
> determine which union branch to use checks for boolean-ness with:
> {noformat}
> m{yes|no|y|n|t|f|true|false}i
> {noformat}
> meaning only those particular strings are considered booleans.
> So all those values, including 'no', 'n', 'f' and 'false', still get 
> serialized to true.
> We could just standardize on one of the two and use it consistently. But 
> neither works that well in unions, because unless you put the boolean type 
> last in the union definition, a wide variety of data will be downcast to 
> the boolean type.
> Perl has no built-in or standardized boolean type, so there's no solution 
> like the one we have in the Avro APIs for other languages. But we could do 
> as the Perl JSON module does, and define objects for true and false.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
