[ https://issues.apache.org/jira/browse/GIRAPH-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16186211#comment-16186211 ]
ASF GitHub Bot commented on GIRAPH-1161:
----------------------------------------
Github user asfgit closed the pull request at:
https://github.com/apache/giraph/pull/50
> implement random sampling for input splits
> ------------------------------------------
>
> Key: GIRAPH-1161
> URL: https://issues.apache.org/jira/browse/GIRAPH-1161
> Project: Giraph
> Issue Type: Improvement
> Reporter: Jianlong Zhong
> Priority: Minor
>
> Currently, if we are reading vertex/edge data from multiple tables and only
> want to read a fraction of the data (via the giraph.inputSplitSamplePercent
> conf option), we always get the first inputSplitSamplePercent percent of the
> input splits. We should instead use a random sample of input splits, so that
> testing on a sample of the data looks closer to an actual full-data run.
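
To illustrate the difference described in the issue, here is a minimal sketch in Java. It is not the code from the pull request: the class SplitSampler, its method names, the rounding rule, and the seed parameter are all hypothetical and chosen only for illustration. It contrasts taking the first N percent of a split list with taking a shuffled N percent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of Giraph; for illustration only. */
final class SplitSampler {
  private SplitSampler() { }

  /**
   * Number of splits to keep for a percentage in [0, 100].
   * Assumption: round down, but keep at least one split when any exist.
   */
  static int sampleSize(int total, float percent) {
    if (percent < 0f || percent > 100f) {
      throw new IllegalArgumentException("percent must be in [0, 100]");
    }
    if (total == 0) {
      return 0;
    }
    return Math.max(1, (int) (total * percent / 100f));
  }

  /** Behavior described as current: always the leading splits. */
  static <T> List<T> firstPercent(List<T> splits, float percent) {
    int n = sampleSize(splits.size(), percent);
    return new ArrayList<>(splits.subList(0, n));
  }

  /** Behavior proposed: shuffle a copy, then take the leading splits. */
  static <T> List<T> randomPercent(List<T> splits, float percent, long seed) {
    List<T> shuffled = new ArrayList<>(splits);
    // A fixed seed keeps the sample reproducible across runs.
    Collections.shuffle(shuffled, new Random(seed));
    int n = sampleSize(shuffled.size(), percent);
    return new ArrayList<>(shuffled.subList(0, n));
  }
}
```

With splits coming from several tables in order, firstPercent would tend to return splits from the first table only, while randomPercent draws from all of them. Whether the seed should be fixed or configurable is a design choice this sketch leaves open.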
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)