Cc: user <user@spark.apache.org>
Subject: Re: A tool to generate simulation data
Date: July 28, 2017, 01:18
I suggest the RandomRDDs API. It provides nice tools for generating random
data, and writing wrappers around it might work well for your case.
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.random.RandomRDDs$
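As a rough idea of what such a wrapper could look like, here is a minimal sketch. The `Transaction` case class, the `transactions` helper, the distribution parameters, and the output path are all made up for illustration; only `RandomRDDs.logNormalRDD`, `zipWithUniqueId`, and `saveAsTextFile` come from Spark itself.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.random.RandomRDDs
import org.apache.spark.rdd.RDD

object MockTransactions {
  // Hypothetical record type, for illustration only (not part of Spark).
  case class Transaction(id: Long, amount: Double)

  // Hypothetical wrapper: draws amounts from a log-normal distribution
  // and pairs each one with a unique id. The mean and std values are
  // arbitrary assumptions, not recommendations.
  def transactions(sc: SparkContext, n: Long, partitions: Int, seed: Long): RDD[Transaction] =
    RandomRDDs
      .logNormalRDD(sc, mean = 3.0, std = 1.0, size = n, numPartitions = partitions, seed = seed)
      .zipWithUniqueId()
      .map { case (amount, id) => Transaction(id, amount) }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("mock-transactions").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val txns = transactions(sc, n = 1000000L, partitions = 8, seed = 42L)
      // Write one CSV line per record. The output directory is an assumed
      // path and must not already exist, or saveAsTextFile will fail.
      txns.map(t => s"${t.id},${t.amount}").saveAsTextFile("/tmp/mock-transactions")
    } finally {
      sc.stop()
    }
  }
}
```

Each partition generates its own slice of the data, so generation is distributed across the cluster. This sketch only touches the transaction and CSV parts of the question; time series and a changing generation rate would need extra logic on top.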
Hello guys,

Is there a tool or an open source project that can mock a large amount of
data quickly and supports the following?

1. transaction data
2. time series data
3. data in a specified format, like CSV or JSON files
4. data generated at a changing speed
5. distributed data generation