Hi,

A clarification on 3.1:

Does "size" here mean the size in MB, the number of events, or both? If both are supported, then the file should be rolled on whichever limit is reached first: size in MB, number of events, or time.

~Dev

On Mon, Mar 7, 2016 at 2:54 PM, Yogi Devendra <[email protected]> wrote:

> Here is the summary of the discussion so far:
>
> 1. The proposed operator is a concrete implementation for writing tuples
>    to HDFS. All tuples will be written to the same file.
> 2. The file copy operation will be handled by a dedicated component for
>    file copy. (The proposal for that will be on another email thread.)
> 3. File rotation is handled in the following way:
>    1. Based on file size
>    2. Based on time (every X windows)
>    3. If both are specified, then based on whichever happens first.
>    4. If nothing is specified, then based on no new data for one
>       application window.
> 4. Conversions to json, csv, avro will not be the responsibility of this
>    operator. Allowed inputs are byte[] or string.
> 5. Custom separators should be allowed. An empty string should be a valid
>    separator.
>
> Note that this is just a first-iteration implementation of this concrete
> operator. We can enhance it in subsequent iterations.
>
> Also, we expect that things will become clearer when we have the first
> iteration of the other related components ready.
>
> Thanks all for your valuable feedback.
>
> ~ Yogi
>
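
A minimal sketch of the rotation rule in item 3, written in Python for brevity rather than in the operator's own language. It assumes that "size" covers both bytes and number of events, which is exactly the open question on 3.1. The class name `RotationPolicy`, its fields, and the `should_rotate` arguments are hypothetical and not part of the proposed operator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RotationPolicy:
    """Hypothetical rotation thresholds; None means "not specified"."""
    max_bytes: Optional[int] = None    # 3.1: size limit in bytes
    max_events: Optional[int] = None   # 3.1 (assumed): size limit in number of events
    max_windows: Optional[int] = None  # 3.2: time limit, in windows

    def should_rotate(self, bytes_written: int, events_written: int,
                      windows_elapsed: int, idle_windows: int) -> bool:
        limits = [
            (self.max_bytes, bytes_written),
            (self.max_events, events_written),
            (self.max_windows, windows_elapsed),
        ]
        configured = [(limit, value) for limit, value in limits if limit is not None]
        if not configured:
            # 3.4: nothing specified, so rotate once one application
            # window has passed with no new data.
            return idle_windows >= 1
        # 3.3: rotate on whichever configured limit is reached first.
        return any(value >= limit for limit, value in configured)
```

Checking the policy after every write (and at every window boundary) means the first limit to be crossed triggers the roll, so no explicit ordering between the limits is needed.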
