pitrou commented on code in PR #48252:
URL: https://github.com/apache/arrow/pull/48252#discussion_r2576093506
##########
cpp/src/arrow/csv/fuzz.cc:
##########
@@ -42,10 +42,11 @@ Status FuzzCsvReader(const uint8_t* data, int64_t size) {
auto read_options = ReadOptions::Defaults();
// Make chunking more likely
- read_options.block_size = 4096;
+ read_options.block_size = 1000;
auto parse_options = ParseOptions::Defaults();
auto convert_options = ConvertOptions::Defaults();
convert_options.auto_dict_encode = true;
+ convert_options.auto_dict_max_cardinality = 50;
Review Comment:
The `block_size` change increases the likelihood of chunking and the
number of chunks, so that chunked reading and parallelization get exercised
more. The `auto_dict_max_cardinality` line just sets the option explicitly to
its default value, so it's really a no-op, but it signals a knob that we might
want to turn.