Hi Steve,

Thank you for your detailed and patient answer. I understand now.

2012/11/5 Steve Loughran <ste...@hortonworks.com>

> On 4 November 2012 17:25, lei liu <liulei...@gmail.com> wrote:
>
>> I want to know which applications are idempotent and which are not,
>> and why. Could you give me an example?
>
> When you say "idempotent", I presume you mean the operation happens
> "at-most-once"; ignoring the degenerate case where all requests are
> rejected.
>
> You can take operations that fail if their conditions aren't met (delete
> path named="something") being the simplest. The operation can send an
> error back, "file not found", but the client library can then downgrade
> that to an idempotent assertion: "when the acknowledgment was sent from
> the namenode, there was nothing at the end of this path". Which will hold
> on a replay, though if someone creates a file in between, that replay
> could be observable.
>
> Now what about move(src,dest)?
>
> If it succeeds, then there is no src path, as it is now at "dest".
>
> What happens if you call it a second time? There is no src, only dest.
> You can't report that back as a success as it is clearly a failure: no
> src, no dest. It's hard to convert that into an assertion on the
> observable state of the system, as the state doesn't reflect the history,
> so you need some temporal logic in there too: at time t0 there existed a
> directory src, at time t1 the directory src no longer existed and its
> contents were now found under directory "dest".
>
> And again, what happens if, worse, someone else did something in between:
> created a src directory (which it could do, given that the first one has
> been renamed dest), the operation replays and the move takes place twice
> -you've just crossed into at-least-once operations, which is not what you
> wanted.
>
> At this point I'm sure you are thinking of having some kind of
> transaction journal, recording that at time Tn, transaction Xn moved the
> dir. Which means you have to start to collect a transaction log of what
> happened. Now effectively HDFS is a journalled file system, it does
> record a lot of things. It just doesn't record user transactions with it,
> or rescan the log whenever any operation comes in, so as to decide what
> to ignore.
>
> Or you just skip the filesystem changes and have some data structure
> recording "recent" transaction IDs; ignore repeated requests with the
> same IDs. Better, though you'd need to make that failure resistant -its
> state must propagate to the journal and any failover namenodes so that a
> transaction replay will be idempotent even if the filesystem fails over
> between the original and replayed transaction. And of course all of this
> needs to be atomic with the filesystem state changes...
>
> Summary: It gets complicated fast. Throwing errors back to the caller
> makes life a lot simpler and lets the caller choose its own outcome -even
> though that's not always satisfactory.
>
> Alternatively: it's not that people don't want globally distributed
> transactions -it's just hard.
>
>> 2012/10/29 Ted Dunning <tdunn...@maprtech.com>
>>
>>> Create cannot be idempotent because of the problem of watches and
>>> sequential files.
>>>
>>> Similarly, mkdirs, rename and delete cannot generally be idempotent.
>>> In particular applications, you might find it is OK to treat them as
>>> such, but there are definitely applications where they are not
>>> idempotent.
>>>
>>> On Sun, Oct 28, 2012 at 2:40 AM, lei liu <liulei...@gmail.com> wrote:
>>>
>>>> I think these methods should be idempotent: repeated calls by the
>>>> same client should be harmless.
>>>>
>>>> Thanks,
>>>>
>>>> LiuLei
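
To make the points in the quoted thread concrete, here is a minimal sketch using a toy in-memory namespace. It is not HDFS code: `ToyNamespace`, `move_once`, the request ID `"req-1"` and the outcome strings are all hypothetical names made up for illustration. It shows a replayed move failing, and a table of recent request IDs turning the replay into a no-op even after another client re-creates src.

```python
class FileNotFound(Exception):
    """Raised when an operation's source path is missing."""


class ToyNamespace:
    """Toy stand-in for a namenode's namespace (hypothetical, not HDFS)."""

    def __init__(self, paths=()):
        self.paths = set(paths)
        self.outcomes = {}  # request ID -> outcome recorded on first execution

    def delete(self, path):
        # The postcondition "nothing exists at path" holds on a replay too.
        self.paths.discard(path)

    def move(self, src, dest):
        # Unguarded move: a replay fails, or moves a re-created src again.
        if src not in self.paths:
            raise FileNotFound(src)
        self.paths.remove(src)
        self.paths.add(dest)

    def move_once(self, request_id, src, dest):
        # A request ID seen before returns the recorded outcome
        # and leaves the namespace untouched.
        if request_id in self.outcomes:
            return self.outcomes[request_id]
        try:
            self.move(src, dest)
            outcome = "moved"
        except FileNotFound:
            outcome = "not-found"
        self.outcomes[request_id] = outcome
        return outcome


# Unguarded replay: the second call cannot succeed, src is gone.
ns = ToyNamespace({"/src"})
ns.move("/src", "/dest")
try:
    ns.move("/src", "/dest")
    replay_result = "moved"
except FileNotFound:
    replay_result = "not-found"

# Guarded replay: another client re-creates /src between the two calls,
# but the replay of req-1 does not move it a second time.
guarded = ToyNamespace({"/src"})
first = guarded.move_once("req-1", "/src", "/dest")
guarded.paths.add("/src")
second = guarded.move_once("req-1", "/src", "/dest")
```

The table here lives only in memory. As Steve points out, a real version would have to be written atomically with the namespace change and propagated to the journal and any failover namenodes, which is where it gets complicated.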