n3world commented on a change in pull request #10202:
URL: https://github.com/apache/arrow/pull/10202#discussion_r627054485
##########
File path: cpp/src/arrow/csv/parser.cc
##########
@@ -76,9 +76,45 @@ class PresizedDataWriter {
     parsed_[parsed_size_++] = static_cast<uint8_t>(c);
   }
+  // Push the value of a fully complete field. This should only be used to fill in missing
+  // values. This method can reallocate the buffer if there isn't enough extra space for
+  // the field.
+  Status PushField(const std::string& field) {
+    if (field.length() > extra_allocated_) {
+      // just in case this happens more allocate enough for 10x this amount
+      auto to_allocate = static_cast<uint32_t>(
+          std::max(field.length() * 10, static_cast<std::string::size_type>(128)));

Review comment:
   That being said, I can easily add an AddFields optimization that allocates just enough for all the additional fields in a row and does not attempt to allocate for the future, on the assumption that this case is rare enough that it won't be called often.
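A minimal sketch of the AddFields idea described in the comment, not the code in the PR. `SketchWriter`, its `std::vector`-backed buffer, and the `AddFields(field, count)` signature are all hypothetical names chosen for illustration. The sketch assumes every missing field in the row is filled with the same value. It grows the buffer at most once per call, by exactly the shortfall, instead of reserving 10x headroom as `PushField` does in the hunk above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the writer in the diff; not Arrow's actual class.
class SketchWriter {
 public:
  explicit SketchWriter(std::size_t initial_capacity) : parsed_(initial_capacity) {}

  // Append `count` copies of `field` (the assumed fill value for missing
  // columns). The buffer grows at most once, and only by the exact
  // shortfall: no extra space is reserved for future rows.
  void AddFields(const std::string& field, std::size_t count) {
    assert(parsed_size_ <= parsed_.size());
    const std::size_t needed = field.size() * count;
    const std::size_t available = parsed_.size() - parsed_size_;
    if (needed > available) {
      parsed_.resize(parsed_size_ + needed);
    }
    for (std::size_t i = 0; i < count; ++i) {
      std::copy(field.begin(), field.end(), parsed_.begin() + parsed_size_);
      parsed_size_ += field.size();
      field_ends_.push_back(parsed_size_);  // end offset of each field
    }
  }

  std::size_t size() const { return parsed_size_; }
  std::size_t capacity() const { return parsed_.size(); }
  std::size_t num_fields() const { return field_ends_.size(); }

 private:
  std::vector<std::uint8_t> parsed_;     // backing storage for field bytes
  std::size_t parsed_size_ = 0;          // bytes written so far
  std::vector<std::size_t> field_ends_;  // one entry per appended field
};
```

The trade-off is the one the comment names: if rows with missing fields are rare, an exact single resize per row avoids holding unused memory, at the cost of reallocating again if another short row does appear.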