Re: fault tolerant dataframe write with overwrite

2017-02-14 Thread Jörn Franke
> successful writing of the new one.
> Thanks,
> Assaf.
>
>
> From: Steve Loughran [mailto:ste...@hortonworks.com]
> Sent: Tuesday, February 14, 2017 3:25 PM
> To: Mendelson, Assaf
> Cc: Jörn Franke; user
> Subject: Re: fault tolerant dataframe write with overwrite

RE: fault tolerant dataframe write with overwrite

2017-02-14 Thread Mendelson, Assaf
Loughran [mailto:ste...@hortonworks.com]
Sent: Tuesday, February 14, 2017 3:25 PM
To: Mendelson, Assaf
Cc: Jörn Franke; user
Subject: Re: fault tolerant dataframe write with overwrite

On 14 Feb 2017, at 11:12, Mendelson, Assaf <assaf.mendel...@rsa.com> wr

Re: fault tolerant dataframe write with overwrite

2017-02-14 Thread Steve Loughran
:jornfra...@gmail.com]
Sent: Tuesday, February 14, 2017 12:54 PM
To: Mendelson, Assaf
Cc: user
Subject: Re: fault tolerant dataframe write with overwrite

Normally you can fetch the filesystem interface from the configuration (I assume you mean URI). Managing to get the last itera

RE: fault tolerant dataframe write with overwrite

2017-02-14 Thread Mendelson, Assaf
: fault tolerant dataframe write with overwrite

Normally you can fetch the filesystem interface from the configuration (I assume you mean URI). Managing to get the last iteration: I do not understand the issue. You can use the current timestamp as the directory name and at the end you simply select

Re: fault tolerant dataframe write with overwrite

2017-02-14 Thread Jörn Franke
Normally you can fetch the filesystem interface from the configuration (I assume you mean URI). Managing to get the last iteration: I do not understand the issue. You can use the current timestamp as the directory name and at the end you simply select the directory with the highest number.
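
For illustration only, here is a minimal sketch of the timestamp-directory idea described above, written against the local filesystem with the Python standard library. The helper names (new_iteration_dir, latest_iteration_dir, write_iteration) and the write_fn callback are hypothetical and not from this thread. The _SUCCESS marker check is an added assumption of the sketch, so that a directory left behind by a failed write is never picked as the latest; on HDFS the same listing would go through the Hadoop filesystem interface rather than os.

```python
import os
import tempfile
import time

MARKER = "_SUCCESS"  # assumption: an empty file that flags a completed write


def new_iteration_dir(base_dir):
    """Return an unused path under base_dir named after the current time in ms."""
    stamp = int(time.time() * 1000)
    # Bump the stamp if two iterations land in the same millisecond.
    while os.path.exists(os.path.join(base_dir, str(stamp))):
        stamp += 1
    return os.path.join(base_dir, str(stamp))


def latest_iteration_dir(base_dir):
    """Return the completed directory with the highest number, or None."""
    done = [
        name for name in os.listdir(base_dir)
        if name.isdecimal()
        and os.path.isfile(os.path.join(base_dir, name, MARKER))
    ]
    return os.path.join(base_dir, max(done, key=int)) if done else None


def write_iteration(base_dir, write_fn):
    """Write one iteration into a fresh directory; never touch older results.

    write_fn(path) is a hypothetical callback that creates `path` and writes
    the results there. The marker is only written if write_fn returns normally.
    """
    target = new_iteration_dir(base_dir)
    write_fn(target)
    open(os.path.join(target, MARKER), "w").close()
    return target


if __name__ == "__main__":
    base = tempfile.mkdtemp()

    def fake_write(path):
        # Stand-in for the real dataframe write.
        os.makedirs(path)
        with open(os.path.join(path, "part-0"), "w") as f:
            f.write("results\n")

    write_iteration(base, fake_write)
    print(latest_iteration_dir(base))
```

With Spark, write_fn could be something like lambda p: df.write.parquet(p) for a path both sides can see; older directories can be deleted once a newer one carries the marker.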

fault tolerant dataframe write with overwrite

2017-02-14 Thread Mendelson, Assaf
Hi, I have an iterative process which overwrites the results of the previous iteration. In every iteration I need to write a dataframe with the results. The problem is that simply overwriting the results of the previous iteration when I write is not fault tolerant. i.e.