How large is your first text file? The idea is: read the first text file, and
if it is not large, collect all of its lines on the driver; then read a text
file for each line and union all of the resulting RDDs.
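
A minimal sketch of that idea in Scala, using a hypothetical list file
records.txt whose lines are HDFS paths:

    import org.apache.spark.{SparkConf, SparkContext}

    object UnionPerLineFiles {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("union-per-line-files"))
        // records.txt is hypothetical: a small file with one HDFS path per line.
        // Because it is small, collecting its lines on the driver is safe.
        val paths = sc.textFile("hdfs:///path/to/records.txt").collect()
        // Open each referenced file as its own RDD (on the driver), then union them all.
        val combined = sc.union(paths.map(p => sc.textFile(p)).toSeq)
        println(s"Total lines across all files: ${combined.count()}")
        sc.stop()
      }
    }

Note that sc.textFile is only ever called on the driver here, never inside a
map(), so there is no need for a SparkContext on the executors (which is not
possible anyway, since SparkContext cannot be serialized to workers).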

On 13 Sep 2016 11:39 p.m., "Saliya Ekanayake" <esal...@gmail.com> wrote:

> Just wonder if this is possible with Spark?
>
> On Mon, Sep 12, 2016 at 12:14 AM, Saliya Ekanayake <esal...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I've got a text file where each line is a record. For each record, I need
>> to process a file in HDFS.
>>
>> So if I represent these records as an RDD and invoke a map() operation on
>> them, how can I access HDFS within that map()? Do I have to create a Spark
>> context within map(), or is there a better solution?
>>
>> Thank you,
>> Saliya
>>
>
>
> --
> Saliya Ekanayake
> Ph.D. Candidate | Research Assistant
> School of Informatics and Computing | Digital Science Center
> Indiana University, Bloomington
>
>
