Re: Kudu 1.6 release
Hi Mike,

Thank you for taking care of the process for 1.6.0 release. The plan you described looks good to me.

Best regards,
Alexey

On 11/16/17 1:51 AM, Mike Percy wrote:

Hi Kudu dev community,

It's been 2 months since release 1.5.0 and we've got a bunch of valuable improvements and bug fixes waiting in the wings. Based on our usual 2-month cadence, now looks like a good time to start thinking about a Kudu 1.6.0 release.

I'll volunteer to RM this one, unless someone else has a burning desire to do it.

I'll also propose to cut the branch for 1.6.x early in the week *after* the Thanksgiving holiday in the US, and to start a vote on RC1 a couple of days after that.

Devs: That means release notes for notable changes in 1.6 should be up for review and ready to go by Monday, November 27 (the Monday after Thanksgiving) to ensure their inclusion.

Please let me know your thoughts on the above plan.

Thanks!
Mike
Re: could anybody help to explain this question
Hi He,

Answers inline below:

On Fri, Nov 17, 2017 at 1:24 AM, helifu wrote:

> Hi everybody,
>
> I have a question about the MinorDeltaCompactionOp.
>
> In my opinion, the data in redo delta file is in order by row_idx. And at
> the same time there is a tree for mapping row_idx to block pointer(ptr). Is
> it right?

That's correct. It's actually ordered by a tuple of (row_idx, transaction timestamp), since there may be multiple updates for a single row stored in a file.

> Now, I am reading the code about MinorDeltaCompactionOp, especially the
> function 'WriteDeltaIteratorToFile', I find an interesting thing. The new
> redo delta file will be disordered.
>
> 1.prepare n rows in every input redo delta file:
>
>     RETURN_NOT_OK(iter->PrepareBatch(n, DeltaIterator::PREPARE_FOR_COLLECT));

The thing that I think you are missing here is that PrepareBatch(n) doesn't prepare a batch of n deltas, but rather prepares a batch which contains all deltas for the next 'n' rowids. That is to say, if those 'n' rows contained no updates, this would prepare a batch of 0 deltas. If they each contained more than one update, it would prepare more than 'n' deltas.

> 2.filter and collect these rows, and sort them by deltakey:
>
>     RETURN_NOT_OK(iter->FilterColumnIdsAndCollectDeltas(vector(),
>                                                         &cells,
>                                                         &arena));
>
> 3.write them to new redo delta file one by one:
>
>     for (const DeltaKeyAndUpdate& cell : cells) {
>       RowChangeList rcl(cell.cell);
>       RETURN_NOT_OK(out->AppendDelta(cell.key, rcl));
>       RETURN_NOT_OK(stats.UpdateStats(cell.key.timestamp(), rcl));
>     }
>
> 4.next loop.
>
> Well, my question is that the second n rows in input redo delta file A is
> not always larger than the first n rows in input redo delta file B. Thus, it
> will result in failure when MutateRow.

I think given the above explanation this wouldn't be a problem. You can also see various test cases like fuzz-itest that would probably catch this bug if it were to happen.
-Todd

--
Todd Lipcon
Software Engineer, Cloudera
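
To make the point about PrepareBatch(n) concrete, here is a small toy model of the loop discussed above. It is only a sketch and not Kudu's actual code: ToyDeltaKey, ToyMergedIterator, CollectNextRows, and CompactToyFiles are hypothetical names invented for this illustration, and the model assumes that a single merged iterator covers all input files. Each batch holds every delta for the next n row ids across all inputs, so consecutive batches cover disjoint, increasing row ranges, and appending them one after another keeps the output ordered by (row_idx, timestamp).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for a delta key: ordered by (row_idx, timestamp).
struct ToyDeltaKey {
  uint32_t row_idx;
  uint64_t timestamp;
};

inline bool operator<(const ToyDeltaKey& a, const ToyDeltaKey& b) {
  return std::tie(a.row_idx, a.timestamp) < std::tie(b.row_idx, b.timestamp);
}

// One input delta file, modeled as a list of keys already sorted by key.
using ToyDeltaFile = std::vector<ToyDeltaKey>;

// Toy iterator over several input files (an assumption of this sketch).
class ToyMergedIterator {
 public:
  explicit ToyMergedIterator(std::vector<ToyDeltaFile> inputs)
      : inputs_(std::move(inputs)), pos_(inputs_.size(), 0) {
    for (const ToyDeltaFile& f : inputs_) {
      assert(std::is_sorted(f.begin(), f.end()));
    }
  }

  // Models "prepare the next n row ids, then collect": returns ALL deltas
  // whose row_idx falls in [next_row_, next_row_ + n), from every input.
  // The result may hold 0 deltas or more than n deltas.
  std::vector<ToyDeltaKey> CollectNextRows(uint32_t n) {
    const uint32_t end_row = next_row_ + n;
    std::vector<ToyDeltaKey> batch;
    for (std::size_t i = 0; i < inputs_.size(); ++i) {
      const ToyDeltaFile& f = inputs_[i];
      while (pos_[i] < f.size() && f[pos_[i]].row_idx < end_row) {
        batch.push_back(f[pos_[i]++]);
      }
    }
    std::sort(batch.begin(), batch.end());  // sort the batch by delta key
    next_row_ = end_row;
    return batch;
  }

 private:
  std::vector<ToyDeltaFile> inputs_;
  std::vector<std::size_t> pos_;  // read position within each input
  uint32_t next_row_ = 0;         // first row id of the next batch
};

// Models the write loop: append each sorted batch to the output in turn.
std::vector<ToyDeltaKey> CompactToyFiles(std::vector<ToyDeltaFile> inputs,
                                         uint32_t total_rows,
                                         uint32_t rows_per_batch) {
  assert(rows_per_batch > 0);
  ToyMergedIterator iter(std::move(inputs));
  std::vector<ToyDeltaKey> out;
  for (uint32_t done = 0; done < total_rows; done += rows_per_batch) {
    const uint32_t n = std::min(rows_per_batch, total_rows - done);
    const std::vector<ToyDeltaKey> batch = iter.CollectNextRows(n);
    out.insert(out.end(), batch.begin(), batch.end());
  }
  return out;
}
```

For example, with file A holding deltas for rows 0, 0, and 7 and file B holding deltas for rows 1, 2, and 6, a batch size of 2 rows yields batches of 3, 1, 0, and 2 deltas, and the concatenated output is still in key order. The disorder described in the question would only arise if each batch were "the next n deltas of each file" rather than "all deltas for the next n row ids".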
could anybody help to explain this question
Hi everybody,

I have a question about the MinorDeltaCompactionOp.

In my opinion, the data in redo delta file is in order by row_idx. And at the same time there is a tree for mapping row_idx to block pointer(ptr). Is it right?

Now, I am reading the code about MinorDeltaCompactionOp, especially the function 'WriteDeltaIteratorToFile', I find an interesting thing. The new redo delta file will be disordered.

1.prepare n rows in every input redo delta file:

    RETURN_NOT_OK(iter->PrepareBatch(n, DeltaIterator::PREPARE_FOR_COLLECT));

2.filter and collect these rows, and sort them by deltakey:

    RETURN_NOT_OK(iter->FilterColumnIdsAndCollectDeltas(vector(),
                                                        &cells,
                                                        &arena));

3.write them to new redo delta file one by one:

    for (const DeltaKeyAndUpdate& cell : cells) {
      RowChangeList rcl(cell.cell);
      RETURN_NOT_OK(out->AppendDelta(cell.key, rcl));
      RETURN_NOT_OK(stats.UpdateStats(cell.key.timestamp(), rcl));
    }

4.next loop.

Well, my question is that the second n rows in input redo delta file A is not always larger than the first n rows in input redo delta file B. Thus, it will result in failure when MutateRow.

何李夫
2017-04-10 16:06:24