IIUC, you can use the mapPartitions transformation and pass it a function f. The
function maps an iterator over the partition's input records to an iterator of
output records. With the input iterator in hand, you can consume and process
multiple records at a time.
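
For example, here is a minimal sketch in Scala; the HDFS path, the batch size
of 4, and the per-batch byte count are placeholders I chose for illustration,
assuming the images were loaded as binary files:

import org.apache.spark.sql.SparkSession

// Minimal sketch: read image files from HDFS and process them in
// batches of 4 inside each partition. The path and batch size are
// placeholders, not values from the original question.
object BatchExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("BatchExample").getOrCreate()
    val sc = spark.sparkContext

    // binaryFiles yields one (path, PortableDataStream) record per file,
    // so each image arrives as a single record.
    val images = sc.binaryFiles("hdfs:///data/images")

    val batchSizes = images.mapPartitions { iter =>
      // grouped(4) hands the function up to 4 records at a time, so the
      // computation sees several images before emitting any output.
      iter.grouped(4).map { batch =>
        val totalBytes =
          batch.map { case (_, stream) => stream.toArray().length.toLong }.sum
        (batch.map(_._1), totalBytes) // (paths in the batch, combined size)
      }
    }

    batchSizes.collect().foreach(println)
    spark.stop()
  }
}

Whatever joint computation you need over several images (e.g. batched model
inference) would go where the byte count is computed, since each iteration of
grouped(4) sees a whole batch rather than a single record.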


> On May 6, 2019, at 2:59 AM, swastik mittal <smitt...@ncsu.edu> wrote:
> 
> From my experience with Spark, when working on data stored in HDFS, Spark
> reads the data as individual records and computes on each record as soon as
> it is read. My data on HDFS consists of multiple images, where each image is
> one record. I want Spark to read multiple records before doing any
> computation. Any idea how I could do this?
> 
