Have a look at sc.wholeTextFiles. You can use it to read the whole CSV
content as the value of a single record, then split it on \n, convert each
line into an object, and return them as a list.
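
Here's a minimal sketch of that approach with the Java API (the Charge
fields and the CSV column layout below are assumptions, since the original
class isn't shown):

import java.util.ArrayList;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;

import scala.Tuple2;

// Hypothetical Charge class, for illustration only.
public class Charge implements java.io.Serializable {
    final String id;
    final double amount;
    public Charge(String id, double amount) {
        this.id = id;
        this.amount = amount;
    }
}

// wholeTextFiles returns a pair RDD of (filePath, fileContent),
// one record per file, so the whole CSV arrives as a single string.
JavaRDD<List<Charge>> rdd = sc.wholeTextFiles("data.csv")
    .map(new Function<Tuple2<String, String>, List<Charge>>() {
        @Override
        public List<Charge> call(Tuple2<String, String> file) {
            List<Charge> charges = new ArrayList<Charge>();
            for (String line : file._2().split("\n")) {
                String[] cols = line.split(",");
                // Assumes a two-column CSV matching the constructor above.
                charges.add(new Charge(cols[0], Double.parseDouble(cols[1])));
            }
            return charges;
        }
    });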

*sc.wholeTextFiles:*

Read a directory of text files from HDFS, a local file system (available on
all nodes), or any Hadoop-supported file system URI. Each file is read as a
single record and returned in a key-value pair, where the key is the path
of each file, the value is the content of each file.

For example, if you have the following files:

hdfs://a-hdfs-path/part-00000

hdfs://a-hdfs-path/part-00001

...

hdfs://a-hdfs-path/part-nnnnn

Do val rdd = sparkContext.wholeTextFiles("hdfs://a-hdfs-path"),

then rdd contains

(a-hdfs-path/part-00000, its content)

(a-hdfs-path/part-00001, its content)

...

(a-hdfs-path/part-nnnnn, its content)

*minPartitions:*

A suggested minimum number of partitions for the input data.

Note: small files are preferred; large files are also allowed, but may
cause bad performance.
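
For example (assuming sc is a JavaSparkContext, as in your snippet), you
can pass minPartitions as the second argument:

// Hint that the input should be split into at least 10 partitions.
JavaPairRDD<String, String> files = sc.wholeTextFiles("hdfs://a-hdfs-path", 10);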


Thanks
Best Regards

On Wed, Jul 1, 2015 at 10:30 PM, Ashish Soni <asoni.le...@gmail.com> wrote:

> Hi ,
>
> How can I use the map function in Java to convert all the lines of a CSV
> file into a list of objects? Can someone please help...
>
> JavaRDD<List<Charge>> rdd = sc.textFile("data.csv").map(
>     new Function<String, List<Charge>>() {
>         @Override
>         public List<Charge> call(String s) {
>
>         }
>     });
>
> Thanks,
>
