Thiago Romão Barcala created AVRO-4090:
------------------------------------------
Summary: PHP data is validated multiple times for nested schemas
Key: AVRO-4090
URL: https://issues.apache.org/jira/browse/AVRO-4090
Project: Apache Avro
Issue Type: Improvement
Reporter: Thiago Romão Barcala
Consider the test script below:
{code:php}
<?php
use Apache\Avro\Datum\AvroIOBinaryEncoder;
use Apache\Avro\Datum\AvroIODatumWriter;
use Apache\Avro\IO\AvroStringIO;
use Apache\Avro\Schema\AvroSchema;
require_once 'vendor/autoload.php';
$writer = new AvroIODatumWriter();
$schemaJson = <<<'JSON'
{
  "type": "record",
  "name": "A",
  "fields": [
    {
      "name": "a",
      "type": {
        "type": "record",
        "name": "B",
        "fields": [
          {
            "name": "b",
            "type": {
              "type": "record",
              "name": "C",
              "fields": [
                {
                  "name": "c",
                  "type": {
                    "type": "record",
                    "name": "D",
                    "fields": [
                      {
                        "name": "d",
                        "type": {
                          "type": "record",
                          "name": "E",
                          "fields": [
                            {
                              "name": "e",
                              "type": "string"
                            }
                          ]
                        }
                      }
                    ]
                  }
                }
              ]
            }
          }
        ]
      }
    }
  ]
}
JSON
;
$data = ['a' => ['b' => ['c' => ['d' => ['e' => 'value']]]]];
$schema = AvroSchema::parse($schemaJson);
$io = new AvroStringIO();
$writer->writeData($schema, $data, new AvroIOBinaryEncoder($io));
var_dump($io->__toString());
{code}
Running the script above with the command below and inspecting the profiler
output shows that the method AvroSchema::isValidDatum is called 21 times:
{code:bash}
php -dxdebug.start_with_request=true -dxdebug.mode=profile \
  -dxdebug.output_dir=$(pwd) test.php
{code}
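Validation should run only 6 times: once for each of the 5 records and once for
the string value. The extra calls happen because writeData is called for every
field of a record, and each writeData call validates the entire data graph
below that field. The observed count is consistent with this: the top-level
call validates 6 values, the next one 5, and so on, and 6 + 5 + 4 + 3 + 2 + 1 = 21.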
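The arithmetic can be reproduced with a minimal standalone sketch. It does not
use the Avro library: validate, writeValidatingEachLevel and writeValidatingOnce
are hypothetical stand-ins that only count calls, assuming that validation
visits every nested value once and that each nested write validates again.
{code:php}
<?php
// Hypothetical model, NOT the Avro API: it only counts validation calls.

// Validating a datum visits it and, recursively, every nested value.
function validate($datum, int &$calls): void
{
    $calls++;
    if (is_array($datum)) {
        foreach ($datum as $child) {
            validate($child, $calls);
        }
    }
}

// Models the reported behaviour: every write validates its whole subtree,
// then performs another validating write for each field.
function writeValidatingEachLevel($datum, int &$calls): void
{
    validate($datum, $calls);
    if (is_array($datum)) {
        foreach ($datum as $child) {
            writeValidatingEachLevel($child, $calls);
        }
    }
}

// Models the expected behaviour: validate once at the top level only.
function writeValidatingOnce($datum, int &$calls): void
{
    validate($datum, $calls);
}

$data = ['a' => ['b' => ['c' => ['d' => ['e' => 'value']]]]];

$eachLevel = 0;
writeValidatingEachLevel($data, $eachLevel);

$once = 0;
writeValidatingOnce($data, $once);

echo "validating at every level: $eachLevel\n"; // 21
echo "validating once: $once\n";                // 6
{code}
With n nesting levels the repeated validation grows as n(n+1)/2, while a single
top-level validation grows linearly, so the gap widens for deeper schemas.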