Ying Zhou created ARROW-11548:
---------------------------------

             Summary: [C++] RandomArrayGenerator::List size mismatch
                 Key: ARROW-11548
                 URL: https://issues.apache.org/jira/browse/ARROW-11548
             Project: Apache Arrow
          Issue Type: Bug
    Affects Versions: 3.0.0
            Reporter: Ying Zhou
             Fix For: 4.0.0

RandomArrayGenerator::List consistently produces ListArrays whose length is 1 less than the size its documentation promises. Moreover, the validity bitmaps it produces look wrong. Here is a simple test:

    TEST(TestAdapterWriteNested, ListTest) {
      int64_t num_rows = 2;
      static constexpr random::SeedType kRandomSeed2 = 0x0ff1ce;
      arrow::random::RandomArrayGenerator rand(kRandomSeed2);
      // Child values: 2 * num_rows int32 values with null probability 0.2.
      std::shared_ptr<Array> value_array = rand.ArrayOf(int32(), 2 * num_rows, 0.2);
      // Request a list array of size num_rows in which every list is null.
      std::shared_ptr<Array> array = rand.List(*value_array, num_rows, 1);
      // Record the first byte of the validity bitmap, the length and the contents.
      RecordProperty("bitmap", *(array->null_bitmap_data()));
      RecordProperty("length", array->length());
      RecordProperty("array", array->ToString());
    }

Here are the results:

    <testcase name="ListTest" status="run" result="completed" time="0" timestamp="2021-02-07T15:23:16" classname="TestAdapterWriteNested">
      <properties>
        <property name="bitmap" value="3"/>
        <property name="length" value="1"/>
        <property name="array" value="[
      [
        null,
        1074834796,
        551076274,
        1184187771
      ]
    ]"/>
      </properties>
    </testcase>

Here is what RandomArrayGenerator::List should do:

    /// \brief Generate a random ListArray
    ///
    /// \param[in] values The underlying values array
    /// \param[in] size The size of the generated list array
    /// \param[in] null_probability the probability of a list value being null
    /// \param[in] force_empty_nulls if true, null list entries must have 0 length
    ///
    /// \return a generated Array
    std::shared_ptr<Array> List(const Array& values, int64_t size, double null_probability,
                                bool force_empty_nulls = false);

Note that the generator fails in at least three respects:

1. The length of the generated array is too low (1 instead of the requested 2).
2. Even when null_probability is set to 1, there are still 1s in the bitmap (the recorded byte is 3).
3. The size of the bitmap is larger than the size of the Array.
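
To make the bitmap observation easier to check, here is a minimal sketch that decodes the recorded byte. It assumes that the value 3 is the first byte of the validity bitmap and that bits are read least-significant-first, as in the Arrow columnar format. SlotIsValid and ValidityToString are hypothetical helpers written for this illustration; they are not part of Arrow.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not an Arrow API): returns true if slot i is marked
// valid in a validity bitmap whose bits are numbered least-significant-first.
inline bool SlotIsValid(const uint8_t* bitmap, int64_t i) {
  assert(i >= 0);
  return ((bitmap[i / 8] >> (i % 8)) & 1) != 0;
}

// Hypothetical helper: renders the first `length` slots as '1' (valid) or
// '0' (null), slot 0 first.
inline std::string ValidityToString(const uint8_t* bitmap, int64_t length) {
  std::string out;
  for (int64_t i = 0; i < length; ++i) {
    out += SlotIsValid(bitmap, i) ? '1' : '0';
  }
  return out;
}
```

Under those assumptions, calling ValidityToString on a one-byte buffer holding 3 with length 2 yields "11": both of the requested slots are marked valid, whereas a null_probability of 1 should give "00". With the reported length of 1, only slot 0 belongs to the array, so the second set bit lies beyond its end.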