Re: Output Side Effects for different chain of operations

2016-12-15 Thread Chawla,Sumit
I am already creating these files on the slaves. How can I create an RDD from these slaves?
Regards
Sumit Chawla
On Thu, Dec 15, 2016 at 11:42 AM, Reynold Xin wrote:
> You can just write some files out directly (and idempotently) in your
> map/mapPartitions functions. It is

Re: Output Side Effects for different chain of operations

2016-12-15 Thread Reynold Xin
You can just write some files out directly (and idempotently) in your map/mapPartitions functions. After all, it is just a function in which you can run arbitrary code.
On Thu, Dec 15, 2016 at 11:33 AM, Chawla,Sumit wrote:
> Any suggestions on this one?
>
> Regards
> Sumit
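To make "write files out idempotently" concrete, here is a minimal sketch in plain Python. It is not code from this thread: the function name `write_partition`, the `part-NNNNN.txt` naming scheme, and the output directory are all assumptions chosen for illustration. The idea is that each partition writes to a file name derived only from its partition index, via a temporary file and an atomic rename, so a retried or re-executed task overwrites its own output with identical contents instead of leaving duplicates or partial files.

```python
import os
import tempfile


def write_partition(partition_index, records, out_dir):
    """Write one partition's records to a deterministic path, atomically.

    Hypothetical helper: the name, the file naming scheme, and out_dir
    are illustrative assumptions, not something stated in the thread.
    """
    os.makedirs(out_dir, exist_ok=True)
    # The final name depends only on the partition index, so a task
    # retry produces the same path rather than a second copy.
    final_path = os.path.join(out_dir, f"part-{partition_index:05d}.txt")

    # Write to a temp file in the same directory, then rename over the
    # final path so readers never observe a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=out_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as f:
            for record in records:
                f.write(f"{record}\n")
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

    # Return an iterator, as a mapPartitions-style function is expected to.
    return iter([final_path])
```

With PySpark, a function of this shape could be passed as `rdd.mapPartitionsWithIndex(lambda i, it: write_partition(i, it, out_dir))`, where `out_dir` is whatever directory you choose. Note that the files land on the local disk of whichever executor ran the task, so a later step can only rely on them if that directory is on shared storage.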

Re: Output Side Effects for different chain of operations

2016-12-15 Thread Chawla,Sumit
Any suggestions on this one?
Regards
Sumit Chawla
On Tue, Dec 13, 2016 at 8:31 AM, Chawla,Sumit wrote:
> Hi All
>
> I have a workflow with different steps in my program. Let's say these are
> steps A, B, C, D. Step B produces some temp files on each executor node.
>

Output Side Effects for different chain of operations

2016-12-13 Thread Chawla,Sumit
Hi All,
I have a workflow with different steps in my program. Let's say these are steps A, B, C, D. Step B produces some temp files on each executor node. How can I add another step E which consumes these files? I understand the easiest choice is to copy all these temp files to any shared