In the driver, can I GC myArray after getting an RDD via sparkContext.parallelize(myArray, 100)?

2020-08-31 Thread maqy
```scala
val batchArr: Array[Float] = new Array(1000)
// fill data into tmpArr
val rddBatch = sparkContext.parallelize(batchArr, 100)
rddBatch.cache()
rddBatch.first()
globalRddList.append(rddBatch)
} // closes an enclosing loop whose opening is cut off in this preview
```
Best regards, maqy
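The crux here: parallelize() keeps a reference to the source array inside the resulting RDD's lineage, so the driver cannot GC the array while the RDD can still recompute from it. A minimal sketch of one workaround, assuming a live SparkContext `sc`; the checkpoint path and array size are illustrative, and whether the JVM actually reclaims the array afterward still depends on Spark-internal references, so this is a sketch, not a guarantee:

```scala
import org.apache.spark.SparkContext

val sc: SparkContext = ???              // provided by the application
sc.setCheckpointDir("/tmp/spark-checkpoints")  // illustrative path

var myArray: Array[Float] = Array.fill(1000)(0f)
val rddBatch = sc.parallelize(myArray, 100)
rddBatch.cache()
rddBatch.checkpoint()                   // marks the RDD; the next action writes it out
rddBatch.count()                        // materializes both the cache and the checkpoint
myArray = null                          // lineage is truncated; drop the driver-side reference
```

Checkpointing replaces the recompute-from-the-array lineage with the checkpointed data, which is what makes dropping the driver-side reference safe.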

Re: Re: Can I collect Dataset[Row] to driver without converting it to Array[Row]?

2020-04-23 Thread maqy
Hi Jinxin, Thanks for your suggestions, I will try to use foreachPartition later. Best regards, maqy  From: Tang Jinxin  Sent: April 23, 2020 7:31  To: maqy  Cc: Andrew Melo; user@spark.apache.org  Subject: Re: Can I collect Dataset[Row] to driver without converting it to Array[Row]?  Hi maqy, Thanks for …
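A minimal sketch of the foreachPartition approach being suggested, assuming `ds` is the Dataset[Row] in question and a receiver (e.g., the TensorFlow side) listens on a placeholder host/port; the per-row serialization is a stub. Each executor streams its own partition directly, so nothing is collected on the driver:

```scala
import java.io.DataOutputStream
import java.net.Socket
import org.apache.spark.sql.{Dataset, Row}

val ds: Dataset[Row] = ???              // the dataset from the original question

ds.foreachPartition { (rows: Iterator[Row]) =>
  val socket = new Socket("receiver-host", 9999)   // placeholder endpoint
  val out = new DataOutputStream(socket.getOutputStream)
  try {
    rows.foreach { row =>
      out.writeInt(row.size)            // stub: serialize the row properly here (e.g., Arrow)
    }
    out.flush()
  } finally {
    socket.close()
  }
}
```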

Re: Re: [Spark SQL] [Beginner] Dataset[Row] collect to driver throws java.io.EOFException: Premature EOF: no length prefix available

2020-04-22 Thread maqy
…, and after a few minutes, the shell will report this error. Best regards, maqy  From: Tang Jinxin  Sent: April 22, 2020 23:16  To: maqy  Cc: user@spark.apache.org  Subject: Re: [Spark SQL] [Beginner] Dataset[Row] collect to driver throws java.io.EOFException: Premature EOF: no length prefix available  Maybe …

Re: Can I collect Dataset[Row] to driver without converting it to Array[Row]?

2020-04-22 Thread maqy
… network (using collect()) is too large, and the deserialization seems to take some time. Best wishes, maqy  From: Andrew Melo  Sent: April 22, 2020 21:02  To: maqy  Cc: Michael Artz; user@spark.apache.org  Subject: Re: Can I collect Dataset[Row] to driver without converting it to Array[Row]?  On Wed, Apr 22, 2020 at …
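For reference, two driver-side settings commonly involved when a single collect() ships this much data. The keys are standard Spark configuration; the sizes are placeholders, and spark.driver.memory generally has to be set before the driver JVM starts (e.g., via spark-submit), so it appears here only to name the knob:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("large-collect")
  .config("spark.driver.maxResultSize", "120g")  // default is 1g; collect() fails beyond it
  .config("spark.driver.memory", "64g")          // only effective if set before JVM launch
  .getOrCreate()
```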

Re: [Spark SQL] [Beginner] Dataset[Row] collect to driver throws java.io.EOFException: Premature EOF: no length prefix available

2020-04-22 Thread maqy
Today I met the same problem using rdd.collect(); the RDD's element type is Tuple2[Int, Int]. The problem appears when the amount of data reaches about 100 GB. I guess there may be something wrong with deserialization. Has anyone else encountered this problem? Best regards, maqy
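One way to sidestep a single huge collect() is to pull results one partition per job, which bounds how much the driver deserializes at a time (this is essentially what RDD.toLocalIterator does internally). A minimal sketch, assuming an RDD[(Int, Int)] `rdd` on a live SparkContext `sc`:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

val sc: SparkContext = ???
val rdd: RDD[(Int, Int)] = ???

rdd.cache()
rdd.count()                              // materialize once so per-partition jobs are cheap
rdd.partitions.indices.foreach { i =>
  val part: Array[(Int, Int)] =
    sc.runJob(rdd, (it: Iterator[(Int, Int)]) => it.toArray, Seq(i)).head
  // process `part`, then let it become garbage before fetching the next one
}
```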

Re: Can I collect Dataset[Row] to driver without converting it to Array[Row]?

2020-04-22 Thread maqy
I will traverse this Dataset to convert it to Arrow and send it to TensorFlow through a socket. I tried toLocalIterator() to traverse the dataset instead of collecting it to the driver, but toLocalIterator() will create a lot of jobs and bring a lot of time consumption. Best regards, maqy
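toLocalIterator() runs one Spark job per partition, which is where the many jobs come from. A minimal sketch of reducing the job count by caching and coalescing first, assuming `ds` is the Dataset[Row] in question; the partition count 16 is arbitrary, and fewer partitions means bigger batches held on the driver:

```scala
import org.apache.spark.sql.{Dataset, Row}

val ds: Dataset[Row] = ???

val batched = ds.coalesce(16).cache()    // fewer partitions => fewer per-partition jobs
val it: java.util.Iterator[Row] = batched.toLocalIterator()
while (it.hasNext) {
  val row = it.next()
  // convert `row` to Arrow and push it to TensorFlow here
}
```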

Can I collect Dataset[Row] to driver without converting it to Array[Row]?

2020-04-22 Thread maqy
… driver and keep its data format? Best regards, maqy