Local file being referenced in mapper function

2014-05-30 Thread Rahul Bhojwani
Hi, I recently posted a question on Stack Overflow but didn't get any reply, so I have now joined the mailing list. Can anyone guide me toward a solution for the problem described at http://stackoverflow.com/questions/23923966/writing-the-rdd-data-in-excel-file-along-mapping-in-apache-spark ? Thanks in advance.

Re: Local file being referenced in mapper function

2014-05-30 Thread Marcelo Vanzin
Hi Rahul, I'll just copy and paste your question here to aid with context, and reply afterwards. - Can I write the RDD data to an Excel file while mapping in Apache Spark? Is that the correct way? Won't the writing be a local function that can't be passed over the cluster? Below is

Re: Local file being referenced in mapper function

2014-05-30 Thread Marcelo Vanzin
Hello there, On Fri, May 30, 2014 at 9:36 AM, Marcelo Vanzin van...@cloudera.com wrote:
    workbook = xlsxwriter.Workbook('output_excel.xlsx')
    worksheet = workbook.add_worksheet()
    data = sc.textFile('xyz.txt')  # xyz.txt is a file in which each line contains strings delimited by SPACE
    row = 0
    def

Re: Local file being referenced in mapper function

2014-05-30 Thread Jey Kottalam
Hi Rahul, Marcelo's explanation is correct. Here's a possible approach to your program, in pseudo-Python:
    # connect to Spark cluster
    sc = SparkContext(...)
    # load input data
    input_data = load_xls(file('input.xls'))
    input_rows = input_data['Sheet1'].rows
    # create RDD on cluster
    input_rdd =
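The preview above is cut off, but the steps it does show (load the input on the driver, then create an RDD) point toward keeping all file I/O on the driver and shipping only pure functions to the cluster. The following is a minimal sketch of that pattern, not the code from the original message. It assumes the results are small enough to collect to the driver, and it writes CSV with the standard library instead of an Excel file so that it runs without extra packages. The names `parse_line`, `write_rows`, and `run` are hypothetical helpers introduced for illustration.

```python
import csv


def parse_line(line):
    """Pure per-record function: it touches no local files or driver-side
    objects, so it is safe to pass to RDD.map()."""
    return line.split(" ")


def write_rows(rows, path):
    """Runs on the driver only, after the results have been collected."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)


def run(lines, path, sc=None):
    """Process the lines and write the result to a local file on the driver.

    If a SparkContext is supplied, the mapping happens on the cluster;
    otherwise a local list comprehension stands in so the sketch can run
    without Spark (an assumption made purely for illustration).
    """
    if sc is not None:
        # Distribute the data, map on the executors, bring results back.
        rows = sc.parallelize(lines).map(parse_line).collect()
    else:
        rows = [parse_line(line) for line in lines]
    write_rows(rows, path)
    return rows
```

Calling `run(["a b c", "d e"], "output.csv")` exercises the local path; passing an existing SparkContext as `sc` runs the same `parse_line` on the cluster. The key design point is that the output file is opened only in `write_rows`, never inside the function handed to `map`.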

Re: Local file being referenced in mapper function

2014-05-30 Thread Rahul Bhojwani
Thanks Marcelo, it actually made a few concepts clear to me. (y) On Fri, May 30, 2014 at 10:14 PM, Marcelo Vanzin van...@cloudera.com wrote: Hello there, On Fri, May 30, 2014 at 9:36 AM, Marcelo Vanzin van...@cloudera.com wrote: workbook = xlsxwriter.Workbook('output_excel.xlsx') worksheet

Re: Local file being referenced in mapper function

2014-05-30 Thread Rahul Bhojwani
Thanks Jey, it was helpful. On Sat, May 31, 2014 at 12:45 AM, Rahul Bhojwani rahulbhojwani2...@gmail.com wrote: Thanks Marcelo, It actually made my few concepts clear. (y). On Fri, May 30, 2014 at 10:14 PM, Marcelo Vanzin van...@cloudera.com wrote: Hello there, On Fri, May 30, 2014