Reply: Re: A tool to generate simulation data
Thank you, Suzen. I gave it a try and generated 1 billion records within 1.5 minutes. It is fast, and I will go on to try some other cases.

Thanks
Best regards!
San.Luo

----- Original Message -----
From: "Suzen, Mehmet" <su...@acm.org>
To: luohui20...@sina.com
Cc: user <user@spark.apache.org>
Subject: Re: A tool to generate simulation data
Date: 2017-07-28 01:18

I suggest RandomRDDs API. It provides nice tools. If you write wrappers around that might be good.
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.random.RandomRDDs$

To unsubscribe e-mail: user-unsubscr...@spark.apache.org
Re: A tool to generate simulation data
I suggest the RandomRDDs API. It provides nice tools for generating random data in a distributed way, and writing wrappers around it might work well for your case.
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.random.RandomRDDs$
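For anyone reading the archive, here is a minimal sketch of what such a wrapper could look like in PySpark. It is not from the thread itself: the helper names (`to_csv_row`, `generate_transactions`, `run_example`), the start date, the mean and standard deviation of the amounts, and the output directory are all assumptions made for illustration. It draws standard-normal values with `RandomRDDs.normalRDD`, pairs each value with a row index, and formats every pair as a CSV line with a one-second timestamp step.

```python
import datetime

# Hypothetical defaults, chosen only for this illustration.
START = datetime.datetime(2017, 7, 28)
MEAN_AMOUNT = 100.0
STDDEV_AMOUNT = 15.0


def to_csv_row(index, z):
    """Format (row index, standard-normal draw) as 'id,timestamp,amount'."""
    ts = START + datetime.timedelta(seconds=index)
    amount = MEAN_AMOUNT + STDDEV_AMOUNT * z
    return f"{index},{ts.isoformat()},{amount:.2f}"


def generate_transactions(sc, num_rows, num_partitions, seed=42):
    """Return an RDD of CSV lines built on RandomRDDs.normalRDD."""
    # Imported lazily so the pure formatting helper works without Spark.
    from pyspark.mllib.random import RandomRDDs

    draws = RandomRDDs.normalRDD(
        sc, num_rows, numPartitions=num_partitions, seed=seed
    )
    # zipWithIndex yields (value, index) pairs; the index drives the timestamp.
    return draws.zipWithIndex().map(lambda pair: to_csv_row(pair[1], pair[0]))


def run_example(output_dir="simulated_transactions"):
    """Driver sketch: write one million rows as text files under output_dir."""
    from pyspark import SparkContext

    sc = SparkContext(appName="simulation-data-sketch")
    try:
        # saveAsTextFile fails if output_dir already exists.
        generate_transactions(sc, 1_000_000, 8).saveAsTextFile(output_dir)
    finally:
        sc.stop()
```

Calling `run_example()` from a Spark driver (for example through `spark-submit`) would write one part file per partition. Other distributions from the same API, such as `RandomRDDs.uniformRDD` or `RandomRDDs.poissonRDD`, could be swapped in for different kinds of columns.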
A tool to generate simulation data
Hello guys,

Is there a tool or an open source project that can mock a large amount of data quickly and supports the following?

1. transaction data
2. time series data
3. data in a specified format, such as CSV or JSON files
4. data generated at a changing speed
5. distributed data generation

Thanks
Best regards!
San.Luo