+1 to what Francois said. For this case you want the Append overload that takes an explicit length, or the one that takes a string_view (built with an explicit length, so that it does not stop at the first 0x00): https://github.com/apache/arrow/blob/843e8bb556a03f0e4c18841a623d1a0e9c236ee5/cpp/src/arrow/array/builder_binary.h#L72
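To make the length issue concrete, here is a minimal standard-library-only sketch; it does not call Arrow at all. The helper names (LengthAsCString, PayloadLength, AsBinaryView) are made up for illustration. It shows that a strlen-style conversion of the array from the example sees zero bytes, while the array's own size gives the expected four.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper: the length seen when a char array decays to
// const char* and is treated as a NUL-terminated C string.
inline std::size_t LengthAsCString(const char* s) { return std::strlen(s); }

// Hypothetical helper: the payload length of an array initialized from a
// string literal, excluding only the trailing NUL the compiler appends.
template <std::size_t N>
constexpr std::size_t PayloadLength(const char (&)[N]) {
  return N - 1;
}

// Hypothetical helper: a view with an explicit length, so embedded 0x00
// bytes are kept instead of ending the data.
template <std::size_t N>
constexpr std::string_view AsBinaryView(const char (&buf)[N]) {
  return std::string_view(buf, N - 1);
}
```

With `char d[] = "\x00\x01\xbf\x5b";`, `LengthAsCString(d)` is 0 and `PayloadLength(d)` is 4. Assuming the overloads in the linked header are what they appear to be, the Arrow-side fix would be along the lines of passing `sizeof(d) - 1` as the length, or passing a view like the one `AsBinaryView(d)` returns.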

On Wed, Nov 18, 2020 at 11:05 AM Francois Saint-Jacques < fsaintjacq...@gmail.com> wrote:

> I would say at first sight that it's due to your usage of char[] and
> builder.Append(d) implicitly does a strlen.
>
> François
>
> On Wed, Nov 18, 2020 at 2:00 PM Ying Zhou <yzhou7...@gmail.com> wrote:
> >
> > Sure!
> >
> > BinaryBuilder builder;
> > char d[] = "\x00\x01\xbf\x5b";
> > (void)(builder.Append(d));
> > std::shared_ptr<Array> array;
> > (void)(builder.Finish(&array));
> > int32_t dataLength = 0;
> > auto aarray = std::static_pointer_cast<BinaryArray>(array);
> > const uint8_t* data = aarray->GetValue(0, &dataLength);
> > data = aarray->GetValue(3, &dataLength);
> > RecordProperty("l3", dataLength);
> > RecordProperty("30", data[0]);
> > RecordProperty("31", data[1]);
> > RecordProperty("32", data[2]);
> > RecordProperty("33", data[3]);
> >
> > We need Google Test to use RecordProperty. dataLength is 0 instead of 4
> > and data[i] are 255, 0, 0 and 0 respectively.
> >
> > My JIRA ID is yingzhou474.
> >
> >
> > > On Nov 18, 2020, at 1:49 PM, Antoine Pitrou <anto...@python.org> wrote:
> > >
> > >
> > > Hello,
> > >
> > > Le 18/11/2020 à 19:06, Ying Zhou a écrit :
> > >>
> > >> According to the documentation BINARY is "Variable-length bytes (no
> > >> guarantee of UTF8-ness)". However in practice if I embed 0x00 in the
> > >> middle of a char array and Append it to a BinaryBuilder the 0x00 is
> > >> converted to 0xff, everything after it is not appended and the length
> > >> is computed as if the 0x00 and everything after it don’t exist (i.e.
> > >> standard STRING behavior).
> > >
> > > Can you post some code showing how you build the array?
> > >
> > >> P.S. Please allow me to assign Jira tickets to myself. Really thanks!
> > >
> > > What is your JIRA id?
> > >
> > > Regards
> > >
> > > Antoine.
> > >