Yes, the Agent will resend. The checkpoint state will not be advanced until a 200 is received from a collector.
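To make the resend behaviour concrete, here is a toy sketch of the idea, not Chukwa's actual code. The names `Agent`, `send_batch`, `failing_collector` and `healthy_collector` are hypothetical, and the `hdfs` list is just a stand-in for the sink. It assumes a failed collector surfaces to the agent as an exception or a non-200 status.

```python
from typing import Callable, List, Sequence

# Stand-in for HDFS: every chunk a collector manages to write lands here.
hdfs: List[str] = []


def failing_collector(batch: Sequence[str]) -> int:
    """Hypothetical collector that writes one chunk, then loses HDFS."""
    hdfs.append(batch[0])  # partial write before the failure
    raise OSError("HDFS write failed")


def healthy_collector(batch: Sequence[str]) -> int:
    """Hypothetical collector that writes every chunk and acknowledges."""
    hdfs.extend(batch)
    return 200


class Agent:
    """Toy agent: the checkpoint advances only after a collector returns 200."""

    def __init__(self, chunks: Sequence[str]) -> None:
        self.chunks = list(chunks)
        self.checkpoint = 0  # index of the first unacknowledged chunk

    def send_batch(
        self,
        collectors: Sequence[Callable[[Sequence[str]], int]],
        batch_size: int,
    ) -> bool:
        batch = self.chunks[self.checkpoint:self.checkpoint + batch_size]
        if not batch:
            return True  # nothing left to send
        for post in collectors:
            try:
                status = post(batch)
            except OSError:
                continue  # no 200 received: try the next collector
            if status == 200:
                self.checkpoint += len(batch)  # advance only on 200
                return True
        return False  # checkpoint unchanged; the same batch is sent again later


agent = Agent(["c1", "c2", "c3"])
acked = agent.send_batch([failing_collector, healthy_collector], batch_size=3)
```

In this run the first collector writes `c1` and then fails, so the agent resends the whole batch to the second collector. Nothing is lost, but `c1` ends up in the sink twice, which is the kind of duplicate a later processing stage has to remove.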
Yes, the demux processing is intended to remove duplicates; if it doesn't, that's a bug.

On Thu, Oct 28, 2010 at 7:58 AM, Jaydeep Ayachit <[email protected]> wrote:
> As per the collector design, the collector accepts multiple chunks and
> writes each chunk to hdfs. If all the chunks are written to hdfs, collector
> sends back 200 status to agent
>
> If hdfs write fails in between, the collector aborts entire processing and
> sends exception. This could mean that the data is partially written to hdfs.
> I have a couple of questions
>
> 1. The agent does not receive response 200. Does it resend the same
> data to another collector? How does checkpointing works in this case?
>
> 2. If the agent sends same data to another collector and it goes to
> hdfs, there is a duplication of some records. Are those duplicates filtered
> when preprocessor runs?
>
> In summary what data loss happens when hdfs goes down from collector
> perspective?
>
> Thanks,
> Jaydeep
>
> Jaydeep Ayachit | Persistent Systems Ltd
> Cell: +91 9822393963 | Desk: +91 712 3986747

--
Ari Rabkin [email protected]
UC Berkeley Computer Science Department
