[ https://issues.apache.org/jira/browse/ARROW-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17661683#comment-17661683 ]
Rok Mihevc commented on ARROW-4661:
-----------------------------------

This issue has been migrated to [issue #21192|https://github.com/apache/arrow/issues/21192] on GitHub. Please see the [migration documentation|https://github.com/apache/arrow/issues/14542] for further details.

> [C++] Consolidate random string generators for use in benchmarks and unittests
> ------------------------------------------------------------------------------
>
> Key: ARROW-4661
> URL: https://issues.apache.org/jira/browse/ARROW-4661
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Hatem Helal
> Priority: Minor
>
> This was discussed here:
> [https://github.com/apache/arrow/pull/3721]
> For testing/benchmarking dictionary encoding, it's useful to control the number
> of repeated values, and it would also be good to optionally include null
> values. The ability to provide a custom alphabet would be handy for
> generating strings with Unicode characters.
>
> Also note that a simple PRNG should be used, as the group has observed
> performance trouble with Mersenne Twister.
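
For illustration only, here is a minimal sketch of the kind of generator the issue asks for: a bounded pool of candidate strings (to control repetition), optional nulls, a caller-supplied alphabet, and a lightweight PRNG instead of Mersenne Twister. Every name below (SimpleRng, RandomStringOptions, RandomStrings, GenerateRandomStrings) is hypothetical and not part of Arrow. The issue does not name a specific PRNG; xorshift64* is used here only as one example of a simple one. The alphabet is assumed to be a list of UTF-8 encoded symbols so that multi-byte characters stay intact.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not an Arrow API): xorshift64*, a small and fast PRNG.
class SimpleRng {
 public:
  // The xorshift state must be non-zero, so a zero seed is replaced.
  explicit SimpleRng(std::uint64_t seed)
      : state_(seed != 0 ? seed : 0x9E3779B97F4A7C15ULL) {}

  std::uint64_t Next() {
    state_ ^= state_ >> 12;
    state_ ^= state_ << 25;
    state_ ^= state_ >> 27;
    return state_ * 0x2545F4914F6CDD1DULL;
  }

  // Value in [0, bound). The slight modulo bias is acceptable for test data.
  std::uint64_t Below(std::uint64_t bound) {
    assert(bound > 0);
    return Next() % bound;
  }

 private:
  std::uint64_t state_;
};

// Hypothetical options struct; field names are assumptions for this sketch.
struct RandomStringOptions {
  std::size_t length = 0;         // number of values to generate
  std::size_t unique_count = 1;   // size of the candidate pool
  std::size_t string_length = 8;  // alphabet symbols per string
  double null_probability = 0.0;  // expected fraction of nulls, in [0, 1]
  std::vector<std::string> alphabet = {"a", "b", "c"};  // UTF-8 symbols
  std::uint64_t seed = 42;
};

// Values plus a validity vector; a null slot holds an empty string.
struct RandomStrings {
  std::vector<std::string> values;
  std::vector<bool> is_valid;
};

RandomStrings GenerateRandomStrings(const RandomStringOptions& opts) {
  assert(opts.unique_count > 0);
  assert(!opts.alphabet.empty());
  assert(opts.null_probability >= 0.0 && opts.null_probability <= 1.0);

  SimpleRng rng(opts.seed);

  // Build the candidate pool. Candidates may collide by chance, so the
  // output contains at most unique_count distinct non-null strings.
  std::vector<std::string> pool(opts.unique_count);
  for (std::string& candidate : pool) {
    for (std::size_t i = 0; i < opts.string_length; ++i) {
      candidate += opts.alphabet[rng.Below(opts.alphabet.size())];
    }
  }

  // Compare an integer draw against a threshold to decide nullness.
  const std::uint64_t kScale = 1000000;
  const std::uint64_t null_threshold =
      static_cast<std::uint64_t>(opts.null_probability * 1000000.0);

  RandomStrings out;
  out.values.reserve(opts.length);
  out.is_valid.reserve(opts.length);
  for (std::size_t i = 0; i < opts.length; ++i) {
    if (rng.Below(kScale) < null_threshold) {
      out.values.push_back(std::string());
      out.is_valid.push_back(false);
    } else {
      out.values.push_back(pool[rng.Below(pool.size())]);
      out.is_valid.push_back(true);
    }
  }
  return out;
}
```

Drawing from a fixed pool keeps the number of distinct values bounded, which is the property that matters when exercising dictionary encoding. A fixed seed makes runs reproducible, so a benchmark and a unit test can share the same data.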