Re: how to generate a large dataset in parallel

15313776907 Fri, 14 Dec 2018 00:40:51 -0800


I have the same problem and hope it can be solved here. Thank you.
On 12/14/2018 10:38, lk_spark <lk_sp...@163.com> wrote:
Hi all,
    I want to generate some test data containing about one hundred 
million rows.
    I created a dataset with ten rows and call df.union on it in a 'for' 
loop, but this causes the operation to happen only on the driver node.
    How can I do this across the whole cluster?
 
2018-12-14
lk_spark
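
For the question above, one common approach is to start from a distributed range of row ids and derive each row from its id, rather than growing a small DataFrame with union in a loop. The following is only a minimal PySpark sketch under assumptions: the helper names synth_row and generate_test_data, the three-column schema, and the partition count are all made up for illustration, and `spark` is assumed to be an existing SparkSession.

```python
def synth_row(i):
    """Build one test row from its id (the schema here is hypothetical)."""
    return (i, "user_%d" % (i % 1000), (i * 37) % 100)


def generate_test_data(spark, n_rows=100_000_000, num_partitions=200):
    """Return a DataFrame of n_rows synthetic rows, computed on the executors.

    `spark` is assumed to be an existing pyspark.sql.SparkSession.
    """
    # range() only records the id interval and the partition count; the ids
    # themselves are produced lazily inside each partition's task.
    ids = spark.range(0, n_rows, 1, num_partitions)
    # Each task maps its own slice of ids to rows, so the rows are not
    # materialized on the driver.
    rows = ids.rdd.map(lambda r: synth_row(r[0]))
    return rows.toDF(["id", "name", "score"])
```

Calling generate_test_data(spark) and then writing the result out (for example with .write.parquet to a path of your choice) should spread the work over the cluster. If the columns can be expressed with built-in column functions on the DataFrame returned by spark.range, that will likely be faster than the Python lambda used here, since it avoids passing every row through the Python workers.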
