[ 
https://issues.apache.org/jira/browse/ARROW-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17661683#comment-17661683
 ] 

Rok Mihevc commented on ARROW-4661:
-----------------------------------

This issue has been migrated to [issue 
#21192|https://github.com/apache/arrow/issues/21192] on GitHub. Please see the 
[migration documentation|https://github.com/apache/arrow/issues/14542] for 
further details.

> [C++] Consolidate random string generators for use in benchmarks and unittests
> ------------------------------------------------------------------------------
>
>                 Key: ARROW-4661
>                 URL: https://issues.apache.org/jira/browse/ARROW-4661
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Hatem Helal
>            Priority: Minor
>
> This was discussed here:
> [https://github.com/apache/arrow/pull/3721]
> For testing/benchmarking dictionary encoding it's useful to control the number 
> of repeated values, and it would also be good to optionally include null 
> values. The ability to provide a custom alphabet would be handy for 
> generating strings with Unicode characters.
>  
> Also note that a simple PRNG should be used as the group has observed 
> performance trouble with Mersenne Twister.
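
As a rough illustration of the request above, here is a minimal C++17 sketch of
such a generator. It is not Arrow code: XorShift64Star, RandomStringOptions and
GenerateRandomStrings are hypothetical names chosen for this example, and
xorshift64* is only one possible "simple PRNG" standing in for Mersenne Twister.
The alphabet is a list of UTF-8 encoded symbols so that multi-byte characters
can be used, and num_unique is an upper bound on the number of distinct values
because two pool entries may happen to collide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper, not part of Arrow: xorshift64*, a small and fast PRNG
// used here in place of std::mt19937.
struct XorShift64Star {
  std::uint64_t state;
  // The state must be non-zero, so a zero seed is replaced by a fixed constant.
  explicit XorShift64Star(std::uint64_t seed)
      : state(seed != 0 ? seed : 0x9E3779B97F4A7C15ULL) {}

  std::uint64_t Next() {
    state ^= state >> 12;
    state ^= state << 25;
    state ^= state >> 27;
    return state * 0x2545F4914F6CDD1DULL;
  }
  // Index in [0, n). The slight modulo bias is acceptable for test data.
  std::size_t Below(std::size_t n) { return static_cast<std::size_t>(Next() % n); }
  // Double in [0, 1) built from the top 53 bits.
  double NextDouble() { return (Next() >> 11) * (1.0 / 9007199254740992.0); }
};

// Hypothetical options struct; all field names are assumptions for this sketch.
struct RandomStringOptions {
  std::size_t length = 0;          // number of values to produce
  std::size_t num_unique = 1;      // size of the pool that values are drawn from
  std::size_t string_length = 8;   // alphabet symbols per generated string
  double null_probability = 0.0;   // chance in [0, 1] that a slot is null
  std::vector<std::string> alphabet = {"a", "b", "c", "d"};  // UTF-8 symbols
  std::uint64_t seed = 42;
};

std::vector<std::optional<std::string>> GenerateRandomStrings(
    const RandomStringOptions& opts) {
  assert(opts.num_unique > 0 && !opts.alphabet.empty());
  XorShift64Star rng(opts.seed);

  // Build the pool of candidate strings first; repeats come from reusing it.
  std::vector<std::string> pool(opts.num_unique);
  for (auto& s : pool) {
    for (std::size_t i = 0; i < opts.string_length; ++i) {
      s += opts.alphabet[rng.Below(opts.alphabet.size())];
    }
  }

  // Fill the output, emitting a null with the requested probability.
  std::vector<std::optional<std::string>> out;
  out.reserve(opts.length);
  for (std::size_t i = 0; i < opts.length; ++i) {
    if (rng.NextDouble() < opts.null_probability) {
      out.push_back(std::nullopt);
    } else {
      out.push_back(pool[rng.Below(pool.size())]);
    }
  }
  return out;
}
```

A benchmark for dictionary encoding could, for example, set num_unique to a
small number and length to a large one to get a highly repetitive column, and
pass the same seed on every run to keep the input reproducible.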



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
