[kudu-CR] KUDU-2529 add a "-tables=" flag to the "kudu table list".

2018-08-30 Thread helifu (Code Review)
helifu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11360


Change subject: KUDU-2529 add a "-tables=" flag to the "kudu table 
list".
..

KUDU-2529
add a "-tables=" flag to the "kudu table list".

Change-Id: I23006a3f3e4f157e6c6c10968b8692e89892d1ce
---
M src/kudu/tools/ksck.cc
M src/kudu/tools/tool_action_cluster.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_table.cc
5 files changed, 23 insertions(+), 18 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/11360/1
--
To view, visit http://gerrit.cloudera.org:8080/11360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I23006a3f3e4f157e6c6c10968b8692e89892d1ce
Gerrit-Change-Number: 11360
Gerrit-PatchSet: 1
Gerrit-Owner: helifu 


[kudu-CR] KUDU-2469 pt 2: fail replicas on CFile corruption

2018-08-30 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11249 )

Change subject: KUDU-2469 pt 2: fail replicas on CFile corruption
..


Patch Set 9:

(5 comments)

I did a quick first pass. The theme of my feedback is that handling the 
corruption at the boundaries of the public api instead of each location of 
internal api should reduce the location we need to add handling code while 
still providing the same coverage.

As long as the lower level calls return Status:Corruption the higher level 
handling can handle and fail the tablet.

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/bloomfile.cc
File src/kudu/cfile/bloomfile.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/bloomfile.cc@257
PS9, Line 257: return Status::Corruption("bloom file is compressed 
(compression not supported)",
I suppose the type of corruption is interesting.

It's clear the a Cfile with a checksum issue is corrupt and should be failed.

How could some of these other corruptions happen?  Missing index, compression 
unsupported, etc. All of these should result in checksum errors when 
initializing in theory. Is this catch-all safe?


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/bloomfile.cc@344
PS9, Line 344: RETURN_NOT_OK(ParseBlockHeader(io_context, dblk_data.data(), 
&hdr, &bloom_data));
Can we rely on the the above reader_->ReadBlock to handle any corruption issues 
instead of handling all cases in the bloomfile code too?


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/cfile_reader.cc@185
PS9, Line 185:   RETURN_NOT_OK(ReadAndParseFooter(io_context));
Can we handle header and footer corruption here with 
RETURN_NOT_OK_HANDLE_CORRUPTION instead of at each location in the internal 
methods?


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/cfile_reader.cc@328
PS9, Line 328: RETURN_NOT_OK(VerifyChecksum(io_context, slices, checksum));
Can we handle the checksum corruption here with RETURN_NOT_OK_HANDLE_CORRUPTION 
instead of at each location in the internal method?


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/tablet/deltafile.cc
File src/kudu/tablet/deltafile.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/tablet/deltafile.cc@274
PS9, Line 274:   RETURN_NOT_OK(reader_->Init(io_context));
Same comment in this file as the ones in the bloomfile code. Can we depend on 
CfileReader correctly handling any corruption so that we don't need to handle 
it in all the wrappers?



--
To view, visit http://gerrit.cloudera.org:8080/11249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I63d541443bc68c83fd0ca6d51315143fee04d50f
Gerrit-Change-Number: 11249
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Thu, 30 Aug 2018 17:13:31 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2469 pt 2: fail replicas on CFile corruption

2018-08-30 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11249 )

Change subject: KUDU-2469 pt 2: fail replicas on CFile corruption
..


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/bloomfile.cc
File src/kudu/cfile/bloomfile.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/bloomfile.cc@257
PS9, Line 257: return Status::Corruption("bloom file is compressed 
(compression not supported)",
> I suppose the type of corruption is interesting.
Yeah, these ones are pretty fuzzy. I think a few of these `return Corruption`s 
could probably be DCHECKs or CHECKs, but returning an error is more 
conservative. If anything, I can just omit this handling since it's a 
corruption that we wouldn't expect, and it's probably worth surfacing as a bug, 
rather than handling it.


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/bloomfile.cc@344
PS9, Line 344: RETURN_NOT_OK(ParseBlockHeader(io_context, dblk_data.data(), 
&hdr, &bloom_data));
> Can we rely on the the above reader_->ReadBlock to handle any corruption is
Hrm, that's a good point. As long as checksums are doing their job, all the 
other `return Corruption` paths _should_ be indicative of a bug. I wonder if 
this extra layer of protection is valuable at all.


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/cfile_reader.cc@185
PS9, Line 185:   RETURN_NOT_OK(ReadAndParseFooter(io_context));
> Can we handle header and footer corruption here with RETURN_NOT_OK_HANDLE_C
I'm not too keen on that approach. You're right that that's what we did for EIO 
handling, but I think this approach makes it much clearer where handling is 
happening, _and_ makes it easier to say that all the places that should be 
covered _are_ covered.

That approach circumvents this extra plumbing, but I'm not sure it reduces the 
number of places we need to cover. With current handling, we have to handle at 
every `return Corruption` site. Relying on RETURN_NOT_OK_HANDLE_CORRUPTION, we 
would have to wrap every call to a function that may return a corruption.



--
To view, visit http://gerrit.cloudera.org:8080/11249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I63d541443bc68c83fd0ca6d51315143fee04d50f
Gerrit-Change-Number: 11249
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Thu, 30 Aug 2018 17:54:25 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2469 pt 2: fail replicas on CFile corruption

2018-08-30 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11249 )

Change subject: KUDU-2469 pt 2: fail replicas on CFile corruption
..


Patch Set 9:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/bloomfile.cc
File src/kudu/cfile/bloomfile.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/bloomfile.cc@257
PS9, Line 257: return Status::Corruption("bloom file is compressed 
(compression not supported)",
> I suppose the type of corruption is interesting.
I'm also wondering why compression or a missing value index are grounds for 
corruption. They seem more like NotSupported to me (i.e backwards incompatible 
on-disk state that is being loaded by an older version of Kudu). Does it really 
make sense to fail these replicas? What happens if reader_->HandleCorruption() 
isn't called; does the server crash?


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/cfile_reader.cc@234
PS9, Line 234:   .CloneAndPrepend("failed to parse CFile pre-header");
Shouldn't the CloneAndPrepend be conditioned on !s.OK()?


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/fs/io_context.h
File src/kudu/fs/io_context.h:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/fs/io_context.h@30
PS9, Line 30: user
s/user/module/ based on the text in the examples.


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/integration-tests/disk_failure-itest.cc
File src/kudu/integration-tests/disk_failure-itest.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/integration-tests/disk_failure-itest.cc@217
PS9, Line 217: ::testing::Values(ErrorType::CFILE_CORRUPTION));
Shouldn't this parameterize on DISK_FAILURE too?


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/integration-tests/disk_failure-itest.cc@240
PS9, Line 240:   // Wait a bit for some errors to occur.
Can we ASSERT_EVENTUALLY() on something here, to ensure that we only run the 
ClusterVerifier when an error has actually occurred? Is there a concern that 
rereplication may kick in and finish too quickly for us to observe an error in 
the test? Maybe we can look at a metric?


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/integration-tests/disk_failure-itest.cc@250
PS9, Line 250:   // Set up a workload that does both reads and writes.
But you're setting the number of write threads to 0...


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/integration-tests/disk_failure-itest.cc@272
PS9, Line 272:   // Wait a bit for some errors to occur.
Same feedback about waiting for an error here.


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/tablet/deltafile.cc
File src/kudu/tablet/deltafile.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/tablet/deltafile.cc@278
PS9, Line 278: return Status::Corruption("file does not have a value 
index!");
Again, seems more like NotSupported than Corruption.


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/tablet/deltafile.cc@291
PS9, Line 291: return Status::Corruption("missing delta stats from the 
delta file metadata");
Same here.


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/tserver/tablet_server-test.cc@1676
PS9, Line 1676: auto log_req = MakeScopedCleanup([&] {
  :   LOG(INFO) << SecureDebugString(req);
  : });
Does SCOPED_TRACE not work in here?



--
To view, visit http://gerrit.cloudera.org:8080/11249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I63d541443bc68c83fd0ca6d51315143fee04d50f
Gerrit-Change-Number: 11249
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Thu, 30 Aug 2018 17:58:18 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-428: add Sentry to thirdparty, mini-sentry

2018-08-30 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11347 )

Change subject: KUDU-428: add Sentry to thirdparty, mini-sentry
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/11347/2/src/kudu/sentry/mini_sentry.cc
File src/kudu/sentry/mini_sentry.cc:

http://gerrit.cloudera.org:8080/#/c/11347/2/src/kudu/sentry/mini_sentry.cc@88
PS2, Line 88: Substitute("$0/sentry-site.xml", tmp_dir)
nit: use JoinPathSegments?


http://gerrit.cloudera.org:8080/#/c/11347/2/src/kudu/sentry/mini_sentry.cc@169
PS2, Line 169: strings::
nit: not needed


http://gerrit.cloudera.org:8080/#/c/11347/2/src/kudu/sentry/mini_sentry.cc@173
PS2, Line 173: JoinPathSegments(tmp_dir, "sentry-site.xml")
Do you foresee CreateSentrySite() being used with any directory besides 
`GetTestDataDirectory()`? If not, maybe this can be a private member/function? 
And use it here and L88


http://gerrit.cloudera.org:8080/#/c/11347/2/src/kudu/util/test_util.cc
File src/kudu/util/test_util.cc:

http://gerrit.cloudera.org:8080/#/c/11347/2/src/kudu/util/test_util.cc@453
PS2, Line 453:   *home_dir = env == nullptr ? JoinPathSegments(bin_dir, 
Substitute("$0-home", name)) : env;
nit: Maybe don't set this if we're returning an error? Or document that it 
always sets `home_dir`



--
To view, visit http://gerrit.cloudera.org:8080/11347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I03f39cf9b2c813c0c305d085e1ad3851636326f5
Gerrit-Change-Number: 11347
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Thu, 30 Aug 2018 18:02:25 +
Gerrit-HasComments: Yes


[kudu-CR] Add some additional info to ScanRequest traces

2018-08-30 Thread Will Berkeley (Code Review)
Will Berkeley has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11361


Change subject: Add some additional info to ScanRequest traces
..

Add some additional info to ScanRequest traces

This patch makes two improvements to ScanRequest traces:
* It adds the tablet id to the trace in a couple of places. This is
  useful because otherwise it's hard to tell which tablet is actually
  being scanned.
* It adds a metric "rowset_iterators" for the number of rowsets the
  scanner will iterate over. This is useful because I've seen a number
  of scan problems where the underlying issue was a lot of uncompacted
  rowsets.

Here's how a ScanRequest trace on /rpcz looks now:

{
"method_name": "kudu.tserver.ScanRequestPB",
"samples": [
{
"header": {
"call_id": 3,
"remote_method": {
"service_name": "kudu.tserver.TabletServerService",
"method_name": "Scan"
},
"timeout_millis": 2
},
"trace": "0830 10:58:57.936549 (+ 0us) service_pool.cc:162] 
Inserting onto call queue\n0830 10:58:57.936585 (+36us) 
service_pool.cc:221] Handling call\n0830 10:58:57.936763 (+   178us) 
tablet_service.cc:1684] Created scanner f4c9f70b5da64a84b3059832fe087362 for 
tablet 1b1fee4fd7b244ec9ccadd55a812f855\n0830 10:58:57.936912 (+   149us) 
tablet_service.cc:1760] Creating iterator\n0830 10:58:57.936952 (+40us) 
tablet_service.cc:2092] Waiting safe time to advance\n0830 10:58:57.936976 (+   
 24us) tablet_service.cc:2100] Waiting for operations to commit\n0830 
10:58:57.936999 (+23us) tablet_service.cc:2114] All operations in snapshot 
committed. Waited for 35 microseconds\n0830 10:58:57.937022 (+23us) 
tablet_service.cc:1790] Iterator created\n0830 10:58:57.937371 (+   349us) 
tablet_service.cc:1804] Iterator init: OK\n0830 10:58:57.937395 (+24us) 
tablet_service.cc:1853] has_more: true\n0830 10:58:57.937403 (+ 8us) 
tablet_service.cc:1868] Continuing scan request\n0830 10:58:57.937438 (+
35us) tablet_service.cc:1916] Found scanner f4c9f70b5da64a84b3059832fe087362 
for tablet 1b1fee4fd7b244ec9ccadd55a812f855\n0830 10:58:57.943861 (+  6423us) 
inbound_call.cc:162] Queueing success response\n",
"duration_ms": 7,
"metrics": [
{
"key": "rowset_iterators",
"value": 1
},
{
"key": "threads_started",
"value": 1
},
{
"key": "thread_start_us",
"value": 72
},
{
"key": "compiler_manager_pool.run_cpu_time_us",
"value": 120300
},
{
"key": "compiler_manager_pool.run_wall_time_us",
"value": 144585
},
{
"key": "compiler_manager_pool.queue_time_us",
"value": 142
}
]
}
]
},

The new/improved trace messages are hidden in the block of text. They are:

10:58:57.936763 (+   178us) tablet_service.cc:1684] Created scanner 
f4c9f70b5da64a84b3059832fe087362 for tablet 1b1fee4fd7b244ec9ccadd55a812f855

and

10:58:57.937438 (+35us) tablet_service.cc:1916] Found scanner 
f4c9f70b5da64a84b3059832fe087362 for tablet 1b1fee4fd7b244ec9ccadd55a812f855

Note that both appear because our scanner code reuses the "continue
scanning" even for the first scan after the scanner is created.

Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
---
M src/kudu/tablet/tablet.cc
M src/kudu/tserver/tablet_service.cc
2 files changed, 5 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/11361/1
--
To view, visit http://gerrit.cloudera.org:8080/11361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
Gerrit-Change-Number: 11361
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley 


[kudu-CR] Add some additional info to ScanRequest traces

2018-08-30 Thread Will Berkeley (Code Review)
Hello Mike Percy, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11361

to look at the new patch set (#2).

Change subject: Add some additional info to ScanRequest traces
..

Add some additional info to ScanRequest traces

This patch makes two improvements to ScanRequest traces:
* It adds the tablet id to the trace in a couple of places. This is
  useful because otherwise it's hard to tell which tablet is actually
  being scanned.
* It adds a metric "rowset_iterators" for the number of rowsets the
  scanner will iterate over. This is useful because I've seen a number
  of scan problems where the underlying issue was a lot of uncompacted
  rowsets.

Here's how a ScanRequest trace on /rpcz looks now:

{
"method_name": "kudu.tserver.ScanRequestPB",
"samples": [
{
"header": {
"call_id": 3,
"remote_method": {
"service_name": "kudu.tserver.TabletServerService",
"method_name": "Scan"
},
"timeout_millis": 2
},
"trace": "0830 10:58:57.936549 (+ 0us) service_pool.cc:162] 
Inserting onto call queue\n0830 10:58:57.936585 (+36us) 
service_pool.cc:221] Handling call\n0830 10:58:57.936763 (+   178us) 
tablet_service.cc:1684] Created scanner f4c9f70b5da64a84b3059832fe087362 for 
tablet 1b1fee4fd7b244ec9ccadd55a812f855\n0830 10:58:57.936912 (+   149us) 
tablet_service.cc:1760] Creating iterator\n0830 10:58:57.936952 (+40us) 
tablet_service.cc:2092] Waiting safe time to advance\n0830 10:58:57.936976 (+   
 24us) tablet_service.cc:2100] Waiting for operations to commit\n0830 
10:58:57.936999 (+23us) tablet_service.cc:2114] All operations in snapshot 
committed. Waited for 35 microseconds\n0830 10:58:57.937022 (+23us) 
tablet_service.cc:1790] Iterator created\n0830 10:58:57.937371 (+   349us) 
tablet_service.cc:1804] Iterator init: OK\n0830 10:58:57.937395 (+24us) 
tablet_service.cc:1853] has_more: true\n0830 10:58:57.937403 (+ 8us) 
tablet_service.cc:1868] Continuing scan request\n0830 10:58:57.937438 (+
35us) tablet_service.cc:1916] Found scanner f4c9f70b5da64a84b3059832fe087362 
for tablet 1b1fee4fd7b244ec9ccadd55a812f855\n0830 10:58:57.943861 (+  6423us) 
inbound_call.cc:162] Queueing success response\n",
"duration_ms": 7,
"metrics": [
{
"key": "rowset_iterators",
"value": 1
},
{
"key": "threads_started",
"value": 1
},
{
"key": "thread_start_us",
"value": 72
},
{
"key": "compiler_manager_pool.run_cpu_time_us",
"value": 120300
},
{
"key": "compiler_manager_pool.run_wall_time_us",
"value": 144585
},
{
"key": "compiler_manager_pool.queue_time_us",
"value": 142
}
]
}
]
},

The new/improved trace messages are hidden in the block of text. They are:

10:58:57.936763 (+   178us) tablet_service.cc:1684] Created scanner 
f4c9f70b5da64a84b3059832fe087362 for tablet 1b1fee4fd7b244ec9ccadd55a812f855

and

10:58:57.937438 (+35us) tablet_service.cc:1916] Found scanner 
f4c9f70b5da64a84b3059832fe087362 for tablet 1b1fee4fd7b244ec9ccadd55a812f855

Note that both appear because our scanner code reuses the "continue
scanning" code path even for the first scan after the scanner is created.

Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
---
M src/kudu/tablet/tablet.cc
M src/kudu/tserver/tablet_service.cc
2 files changed, 5 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/11361/2
--
To view, visit http://gerrit.cloudera.org:8080/11361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
Gerrit-Change-Number: 11361
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] Add some additional info to ScanRequest traces

2018-08-30 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11361 )

Change subject: Add some additional info to ScanRequest traces
..


Patch Set 2: Code-Review+1

great idea, thanks for doing this


--
To view, visit http://gerrit.cloudera.org:8080/11361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
Gerrit-Change-Number: 11361
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 30 Aug 2018 18:43:16 +
Gerrit-HasComments: No


[kudu-CR] bitmap: add equality method

2018-08-30 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11266 )

Change subject: bitmap: add equality method
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11266/4/src/kudu/util/bitmap-test.cc
File src/kudu/util/bitmap-test.cc:

http://gerrit.cloudera.org:8080/#/c/11266/4/src/kudu/util/bitmap-test.cc@231
PS4, Line 231: TEST(TestBitMap, TestEquals) {
> Does it make sense to add a test for overlapping bitmaps as well?
Done


http://gerrit.cloudera.org:8080/#/c/11266/4/src/kudu/util/bitmap-test.cc@254
PS4, Line 254: i
> Should this be (i + 1)?
Oops, yes.



--
To view, visit http://gerrit.cloudera.org:8080/11266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If53e41250435501e158bdbc076dc8e89ea153256
Gerrit-Change-Number: 11266
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 30 Aug 2018 19:24:07 +
Gerrit-HasComments: Yes


[kudu-CR] deltamemstore: support iteration with snap to exclude

2018-08-30 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11029 )

Change subject: deltamemstore: support iteration with snap_to_exclude
..


Patch Set 11: Code-Review-2

Don't merge; I've found an issue with ApplyUpdates() skipping "too old" updates 
that it shouldn't.


--
To view, visit http://gerrit.cloudera.org:8080/11029
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I04af3737960f3fcc7b1921a77ff91e1607b7bc47
Gerrit-Change-Number: 11029
Gerrit-PatchSet: 11
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 30 Aug 2018 19:33:11 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2529 add a "-tables=" flag to the "kudu table list".

2018-08-30 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11360 )

Change subject: KUDU-2529 add a "-tables=" flag to the "kudu table 
list".
..


Patch Set 1:

(4 comments)

Can you add a new test to kudu-tool-test to cover this?

http://gerrit.cloudera.org:8080/#/c/11360/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11360/1//COMMIT_MSG@7
PS1, Line 7: KUDU-2529
   : add a "-tables=" flag to the "kudu table list".
Please reformat this to adhere to our commit message guidelines:

  KUDU-2529: 

  

See 
https://git-scm.com/book/en/v2/Distributed-Git-Contributing-to-a-Project#_commit_guidelines
 for more information.


http://gerrit.cloudera.org:8080/#/c/11360/1/src/kudu/tools/tool_action_common.cc
File src/kudu/tools/tool_action_common.cc:

http://gerrit.cloudera.org:8080/#/c/11360/1/src/kudu/tools/tool_action_common.cc@105
PS1, Line 105: DEFINE_string(tables, "",
 :   "Tables to check (comma-separated list of names). "
 :   "If not specified, checks all tables.");
The new usage of FLAGS_tables doesn't "check" them; it just includes them in 
the results. Short of overriding the description for that action (seems like 
overkill to me), we should rephrase this to be more abstract. Perhaps "Tables 
to include (comma-separated list of names). If not specified, includes all 
tables."?


http://gerrit.cloudera.org:8080/#/c/11360/1/src/kudu/tools/tool_action_common.cc@457
PS1, Line 457:   if (patterns.empty()) return true;
Can you preserve the "// Consider no filter..." comment here?


http://gerrit.cloudera.org:8080/#/c/11360/1/src/kudu/tools/tool_action_table.cc
File src/kudu/tools/tool_action_table.cc:

http://gerrit.cloudera.org:8080/#/c/11360/1/src/kudu/tools/tool_action_table.cc@194
PS1, Line 194:   .AddOptionalParameter(FLAGS_tables)
This should just be "tables", and is likely the reason that all of the tests 
failed.



--
To view, visit http://gerrit.cloudera.org:8080/11360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23006a3f3e4f157e6c6c10968b8692e89892d1ce
Gerrit-Change-Number: 11360
Gerrit-PatchSet: 1
Gerrit-Owner: helifu 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Thu, 30 Aug 2018 19:43:29 +
Gerrit-HasComments: Yes


[kudu-CR] update public api.

2018-08-30 Thread Adar Dembo (Code Review)
Adar Dembo has abandoned this change. ( http://gerrit.cloudera.org:8080/11126 )

Change subject: update public api.
..


Abandoned

I'm guessing this has been superceded by https://gerrit.cloudera.org/c/11333/1; 
please reopen if I'm wrong.
--
To view, visit http://gerrit.cloudera.org:8080/11126
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Ia24abdac343f9699d676170ba3a0b353a944b7cc
Gerrit-Change-Number: 11126
Gerrit-PatchSet: 2
Gerrit-Owner: jinxing6...@126.com
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-2469 pt 2: fail replicas on CFile corruption

2018-08-30 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11249 )

Change subject: KUDU-2469 pt 2: fail replicas on CFile corruption
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/cfile_reader.cc@185
PS9, Line 185:   RETURN_NOT_OK(ReadAndParseFooter(io_context));
> I'm not too keen on that approach. You're right that that's what we did for
I think if we eliminate all the Corruption status that should actually be 
something else. Like "not supported", then we only need to wrap InitOnce and 
the VerifyChecksum call in ReadBlock. These are the private methods that 
validate checksums.



--
To view, visit http://gerrit.cloudera.org:8080/11249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I63d541443bc68c83fd0ca6d51315143fee04d50f
Gerrit-Change-Number: 11249
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Thu, 30 Aug 2018 19:58:51 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-08-30 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 4:

I like the article. One thing I think we should so is mention that this is a 
work-in-progress patch and link to the Gerrit review so people can follow along 
if they want.


--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 4
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Thu, 30 Aug 2018 20:12:47 +
Gerrit-HasComments: No


[kudu-CR] [tests] make master-stress-test more stable

2018-08-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11364


Change subject: [tests] make master-stress-test more stable
..

[tests] make master-stress-test more stable

The master-stress-test has been flaky for some time.  After looking
at those failure closely, I found about five different issues.  This
patch addresses the most prominent one: failures of the test scenario
because of timeouts errors in case of TSAN builds.  The timeout errors
were induced by frequent RPC queue overflows.

The rest of issues behind the flakiness will be addressed separately.

This patch also introduces rpc_negotiation_timeout as a member
for ExternalMiniClusterOptions: that's to customize connection
negotiation timeout for the cluster's utility messenger.

Change-Id: I6b30d8afd4a24acdbd96481cadeaf8f6a9475adf
---
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
3 files changed, 77 insertions(+), 32 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/11364/1
--
To view, visit http://gerrit.cloudera.org:8080/11364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6b30d8afd4a24acdbd96481cadeaf8f6a9475adf
Gerrit-Change-Number: 11364
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [tests] make master-stress-test more stable

2018-08-30 Thread Alexey Serbin (Code Review)
Hello Will Berkeley, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11364

to look at the new patch set (#2).

Change subject: [tests] make master-stress-test more stable
..

[tests] make master-stress-test more stable

The master-stress-test has been flaky for some time.  After looking
at those failure closely, I found about five different issues.  This
patch addresses the most prominent one: failures of the test scenario
because of timeouts errors in case of TSAN builds.  The timeout errors
were induced by frequent RPC queue overflows.

The rest of issues behind the flakiness will be addressed separately.

This patch also introduces rpc_negotiation_timeout as a member
for ExternalMiniClusterOptions: that's to customize connection
negotiation timeout for the cluster's utility messenger.

Change-Id: I6b30d8afd4a24acdbd96481cadeaf8f6a9475adf
---
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
3 files changed, 82 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/11364/2
--
To view, visit http://gerrit.cloudera.org:8080/11364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6b30d8afd4a24acdbd96481cadeaf8f6a9475adf
Gerrit-Change-Number: 11364
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] [location awareness] Add 'location' column in tserver list

2018-08-30 Thread Fengling Wang (Code Review)
Hello Will Berkeley, Alexey Serbin, Kudu Jenkins, Greg Solovyev,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11313

to look at the new patch set (#5).

Change subject: [location_awareness] Add 'location' column in tserver list
..

[location_awareness] Add 'location' column in tserver list

Command:
kudu tserver list -columns=uuid,location 

Example result:
uuid  |   location
--+--
 1259764cdc5f489984900d49b545802f | loc0
 14446895a8bf47cd92e73836de623ffb | 
 9d7a11e19b324f62b2e6d074f6003ca4 | loc1

This command will list the location of each tserver. If the
location of the tserver has not been set, '' will be
displayed.

Change-Id: If6c9dc8bd08b8d907111fac850fc6fd2c2b96fb8
---
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/tools/CMakeLists.txt
M src/kudu/tools/kudu-tool-test.cc
A src/kudu/tools/testdata/first_argument.sh
M src/kudu/tools/tool_action_tserver.cc
6 files changed, 61 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/11313/5
--
To view, visit http://gerrit.cloudera.org:8080/11313
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If6c9dc8bd08b8d907111fac850fc6fd2c2b96fb8
Gerrit-Change-Number: 11313
Gerrit-PatchSet: 5
Gerrit-Owner: Fengling Wang 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Fengling Wang 
Gerrit-Reviewer: Greg Solovyev 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] [tests] make master-stress-test more stable

2018-08-30 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11364 )

Change subject: [tests] make master-stress-test more stable
..


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/11364/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11364/2//COMMIT_MSG@9
PS2, Line 9: The master-stress-test has been flaky for some time.  After looking
   : at those failure closely, I found about five different issues.  
This
   : patch addresses the most prominent one: failures of the test 
scenario
   : because of timeouts errors in case of TSAN builds.  The timeout 
errors
   : were induced by frequent RPC queue overflows.
Thanks for tackling this. Are you able to quantify how much this particular 
patch (vs. the others) deflakes the test?


http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/integration-tests/master-stress-test.cc
File src/kudu/integration-tests/master-stress-test.cc:

http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/integration-tests/master-stress-test.cc@140
PS2, Line 140: // Make the Raft heartbeat interval shorter to allow for 
faster detection
 : // of failed/restarted leader masters.
 : 
opts.extra_master_flags.emplace_back("--raft_heartbeat_interval_ms=500");
Isn't this the default value?


http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/integration-tests/master-stress-test.cc@161
PS2, Line 161: 
opts.extra_tserver_flags.emplace_back("--raft_heartbeat_interval_ms=500");
See above.


http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/integration-tests/master-stress-test.cc@327
PS2, Line 327:   if (s.IsNetworkError()) {
 : continue;
 :   }
 :   if (tablet_ids.empty()) {
 : continue;
 :   }
Should we backoff in both of these cases?


http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/integration-tests/master-stress-test.cc@415
PS2, Line 415: it's crucial to keep the
 :   // master up
All masters? Or just the new one?

If the latter, maybe we should modify this to kill masters in a round robin 
instead of randomly. I'm just a little concerned that a 2s sleep is quite long 
when the test only runs for 5s in normal mode.


http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/integration-tests/master-stress-test.cc@492
PS2, Line 492:   FLAGS_timeout_ms = kDefaultAdminTimeout.ToMilliseconds();
We don't use the CLI in this test, do we? If this is for the LeaderMasterProxy, 
can you note that in a comment?


http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/mini-cluster/external_mini_cluster.h
File src/kudu/mini-cluster/external_mini_cluster.h:

http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/mini-cluster/external_mini_cluster.h@159
PS2, Line 159:   // an incomplete connection negotiation will timeout.
Can you note the default here?



--
To view, visit http://gerrit.cloudera.org:8080/11364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b30d8afd4a24acdbd96481cadeaf8f6a9475adf
Gerrit-Change-Number: 11364
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 30 Aug 2018 20:31:00 +
Gerrit-HasComments: Yes


[kudu-CR] Add some additional info to ScanRequest traces

2018-08-30 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11361 )

Change subject: Add some additional info to ScanRequest traces
..


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tablet/tablet.cc@1790
PS2, Line 1790: TRACE_COUNTER_INCREMENT("rowset_iterators", ret.size());
nit: could these be coalesced and put  down in the Tablet::Iterator right after 
the call to CaptureConsistentIterators at L2323?



--
To view, visit http://gerrit.cloudera.org:8080/11361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
Gerrit-Change-Number: 11361
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 30 Aug 2018 20:37:56 +
Gerrit-HasComments: Yes


[kudu-CR] bitmap: add equality method

2018-08-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11266 )

Change subject: bitmap: add equality method
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11266/4/src/kudu/util/bitmap-test.cc
File src/kudu/util/bitmap-test.cc:

http://gerrit.cloudera.org:8080/#/c/11266/4/src/kudu/util/bitmap-test.cc@231
PS4, Line 231: TEST(TestBitMap, TestEquals) {
> Done
Probably, that's going to appear in the next version (like PS6)?



--
To view, visit http://gerrit.cloudera.org:8080/11266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If53e41250435501e158bdbc076dc8e89ea153256
Gerrit-Change-Number: 11266
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 30 Aug 2018 20:38:13 +
Gerrit-HasComments: Yes


[kudu-CR] [tests] make master-stress-test more stable

2018-08-30 Thread Alexey Serbin (Code Review)
Hello Will Berkeley, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11364

to look at the new patch set (#3).

Change subject: [tests] make master-stress-test more stable
..

[tests] make master-stress-test more stable

The master-stress-test has been flaky for some time.  After looking
at those failure closely, I found about five different issues.  This
patch addresses the most prominent one: failures of the test scenario
because of timeouts errors in case of TSAN builds.  The timeout errors
were induced by frequent RPC queue overflows.

The rest of issues behind the flakiness will be addressed separately.

This patch also introduces rpc_negotiation_timeout as a member
for ExternalMiniClusterOptions: that's to customize connection
negotiation timeout for the cluster's utility messenger.

Change-Id: I6b30d8afd4a24acdbd96481cadeaf8f6a9475adf
---
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
3 files changed, 82 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/11364/3
--
To view, visit http://gerrit.cloudera.org:8080/11364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6b30d8afd4a24acdbd96481cadeaf8f6a9475adf
Gerrit-Change-Number: 11364
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] Improve runtime of slow java client security test.

2018-08-30 Thread Brian McDevitt (Code Review)
Brian McDevitt has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11365


Change subject: Improve runtime of slow java client security test.
..

Improve runtime of slow java client security test.

KUDU-2489 - Shortened the kerberos renew lifetime parameter and the
Thread.sleep parameters.  These changes cut the test runtime by ~50 seconds.

Change-Id: I19fa5185430a6c91fbe050dbc458b7b91e2d5bea
---
M java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java
1 file changed, 5 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/11365/1
--
To view, visit http://gerrit.cloudera.org:8080/11365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I19fa5185430a6c91fbe050dbc458b7b91e2d5bea
Gerrit-Change-Number: 11365
Gerrit-PatchSet: 1
Gerrit-Owner: Brian McDevitt 


[kudu-CR] Add some additional info to ScanRequest traces

2018-08-30 Thread Will Berkeley (Code Review)
Hello Mike Percy, Alexey Serbin, Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11361

to look at the new patch set (#3).

Change subject: Add some additional info to ScanRequest traces
..

Add some additional info to ScanRequest traces

This patch makes two improvements to ScanRequest traces:
* It adds the tablet id to the trace in a couple of places. This is
  useful because otherwise it's hard to tell which tablet is actually
  being scanned.
* It adds a metric "rowset_iterators" for the number of rowsets the
  scanner will iterate over. This is useful because I've seen a number
  of scan problems where the underlying issue was a lot of uncompacted
  rowsets.

Here's how a ScanRequest trace on /rpcz looks now:

{
"method_name": "kudu.tserver.ScanRequestPB",
"samples": [
{
"header": {
"call_id": 3,
"remote_method": {
"service_name": "kudu.tserver.TabletServerService",
"method_name": "Scan"
},
"timeout_millis": 2
},
"trace": "0830 10:58:57.936549 (+ 0us) service_pool.cc:162] 
Inserting onto call queue\n0830 10:58:57.936585 (+36us) 
service_pool.cc:221] Handling call\n0830 10:58:57.936763 (+   178us) 
tablet_service.cc:1684] Created scanner f4c9f70b5da64a84b3059832fe087362 for 
tablet 1b1fee4fd7b244ec9ccadd55a812f855\n0830 10:58:57.936912 (+   149us) 
tablet_service.cc:1760] Creating iterator\n0830 10:58:57.936952 (+40us) 
tablet_service.cc:2092] Waiting safe time to advance\n0830 10:58:57.936976 (+   
 24us) tablet_service.cc:2100] Waiting for operations to commit\n0830 
10:58:57.936999 (+23us) tablet_service.cc:2114] All operations in snapshot 
committed. Waited for 35 microseconds\n0830 10:58:57.937022 (+23us) 
tablet_service.cc:1790] Iterator created\n0830 10:58:57.937371 (+   349us) 
tablet_service.cc:1804] Iterator init: OK\n0830 10:58:57.937395 (+24us) 
tablet_service.cc:1853] has_more: true\n0830 10:58:57.937403 (+ 8us) 
tablet_service.cc:1868] Continuing scan request\n0830 10:58:57.937438 (+
35us) tablet_service.cc:1916] Found scanner f4c9f70b5da64a84b3059832fe087362 
for tablet 1b1fee4fd7b244ec9ccadd55a812f855\n0830 10:58:57.943861 (+  6423us) 
inbound_call.cc:162] Queueing success response\n",
"duration_ms": 7,
"metrics": [
{
"key": "rowset_iterators",
"value": 1
},
{
"key": "threads_started",
"value": 1
},
{
"key": "thread_start_us",
"value": 72
},
{
"key": "compiler_manager_pool.run_cpu_time_us",
"value": 120300
},
{
"key": "compiler_manager_pool.run_wall_time_us",
"value": 144585
},
{
"key": "compiler_manager_pool.queue_time_us",
"value": 142
}
]
}
]
},

The new/improved trace messages are hidden in the block of text. They are:

10:58:57.936763 (+   178us) tablet_service.cc:1684] Created scanner 
f4c9f70b5da64a84b3059832fe087362 for tablet 1b1fee4fd7b244ec9ccadd55a812f855

and

10:58:57.937438 (+35us) tablet_service.cc:1916] Found scanner 
f4c9f70b5da64a84b3059832fe087362 for tablet 1b1fee4fd7b244ec9ccadd55a812f855

Note that both appear because our scanner code reuses the "continue
scanning" code path even for the first scan after the scanner is created.

Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
---
M src/kudu/tablet/tablet.cc
M src/kudu/tserver/tablet_service.cc
2 files changed, 4 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/11361/3
--
To view, visit http://gerrit.cloudera.org:8080/11361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
Gerrit-Change-Number: 11361
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Add some additional info to ScanRequest traces

2018-08-30 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11361 )

Change subject: Add some additional info to ScanRequest traces
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tablet/tablet.cc@1790
PS2, Line 1790: TRACE_COUNTER_INCREMENT("rowset_iterators", ret.size());
> nit: could these be coalesced and put  down in the Tablet::Iterator right a
Done



--
To view, visit http://gerrit.cloudera.org:8080/11361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
Gerrit-Change-Number: 11361
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 30 Aug 2018 21:37:36 +
Gerrit-HasComments: Yes


[kudu-CR] Add some additional info to ScanRequest traces

2018-08-30 Thread Will Berkeley (Code Review)
Hello Mike Percy, Alexey Serbin, Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11361

to look at the new patch set (#4).

Change subject: Add some additional info to ScanRequest traces
..

Add some additional info to ScanRequest traces

This patch makes two improvements to ScanRequest traces:
* It adds the tablet id to the trace in a couple of places. This is
  useful because otherwise it's hard to tell which tablet is actually
  being scanned.
* It adds a metric "rowset_iterators" for the number of rowsets the
  scanner will iterate over. This is useful because I've seen a number
  of scan problems where the underlying issue was a lot of uncompacted
  rowsets.

Here's how a ScanRequest trace on /rpcz looks now:

{
"method_name": "kudu.tserver.ScanRequestPB",
"samples": [
{
"header": {
"call_id": 3,
"remote_method": {
"service_name": "kudu.tserver.TabletServerService",
"method_name": "Scan"
},
"timeout_millis": 2
},
"trace": "0830 10:58:57.936549 (+ 0us) service_pool.cc:162] 
Inserting onto call queue\n0830 10:58:57.936585 (+36us) 
service_pool.cc:221] Handling call\n0830 10:58:57.936763 (+   178us) 
tablet_service.cc:1684] Created scanner f4c9f70b5da64a84b3059832fe087362 for 
tablet 1b1fee4fd7b244ec9ccadd55a812f855\n0830 10:58:57.936912 (+   149us) 
tablet_service.cc:1760] Creating iterator\n0830 10:58:57.936952 (+40us) 
tablet_service.cc:2092] Waiting safe time to advance\n0830 10:58:57.936976 (+   
 24us) tablet_service.cc:2100] Waiting for operations to commit\n0830 
10:58:57.936999 (+23us) tablet_service.cc:2114] All operations in snapshot 
committed. Waited for 35 microseconds\n0830 10:58:57.937022 (+23us) 
tablet_service.cc:1790] Iterator created\n0830 10:58:57.937371 (+   349us) 
tablet_service.cc:1804] Iterator init: OK\n0830 10:58:57.937395 (+24us) 
tablet_service.cc:1853] has_more: true\n0830 10:58:57.937403 (+ 8us) 
tablet_service.cc:1868] Continuing scan request\n0830 10:58:57.937438 (+
35us) tablet_service.cc:1916] Found scanner f4c9f70b5da64a84b3059832fe087362 
for tablet 1b1fee4fd7b244ec9ccadd55a812f855\n0830 10:58:57.943861 (+  6423us) 
inbound_call.cc:162] Queueing success response\n",
"duration_ms": 7,
"metrics": [
{
"key": "rowset_iterators",
"value": 1
},
{
"key": "threads_started",
"value": 1
},
{
"key": "thread_start_us",
"value": 72
},
{
"key": "compiler_manager_pool.run_cpu_time_us",
"value": 120300
},
{
"key": "compiler_manager_pool.run_wall_time_us",
"value": 144585
},
{
"key": "compiler_manager_pool.queue_time_us",
"value": 142
}
]
}
]
},

The new/improved trace messages are hidden in the block of text. They are:

10:58:57.936763 (+   178us) tablet_service.cc:1684] Created scanner 
f4c9f70b5da64a84b3059832fe087362 for tablet 1b1fee4fd7b244ec9ccadd55a812f855

and

10:58:57.937438 (+35us) tablet_service.cc:1916] Found scanner 
f4c9f70b5da64a84b3059832fe087362 for tablet 1b1fee4fd7b244ec9ccadd55a812f855

Note that both appear because our scanner code reuses the "continue
scanning" code path even for the first scan after the scanner is created.

Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
---
M src/kudu/tablet/tablet.cc
M src/kudu/tserver/tablet_service.cc
2 files changed, 3 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/11361/4
--
To view, visit http://gerrit.cloudera.org:8080/11361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
Gerrit-Change-Number: 11361
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] Add some additional info to ScanRequest traces

2018-08-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11361 )

Change subject: Add some additional info to ScanRequest traces
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tablet/tablet.cc@1790
PS2, Line 1790: TRACE_COUNTER_INCREMENT("rowset_iterators", ret.size());
> nit: could these be coalesced and put  down in the Tablet::Iterator right a
This sounds like a good idea in the current implementation, but that would look 
a bit fragile to me if some new parts of the code start calling 
CaptureConsistentIterators.


http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tablet/tablet.cc@1805
PS2, Line 1805: // Swap results into the parameters.
nit: please either move this one line down or remove the comment at all.


http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tserver/tablet_service.cc@1684
PS2, Line 1684: TRACE("Created scanner $0 for tablet $1", scanner->id(), 
scanner->tablet_id());
Wouldn't it be useful to trace registration of the new scanner instead?  In 
case of an error, this scanner would not be registered (i.e. unreg_scanner will 
do its job), so the fact that the scanner was created but not registered will 
not be easy to track in that case, no?



--
To view, visit http://gerrit.cloudera.org:8080/11361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
Gerrit-Change-Number: 11361
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 30 Aug 2018 21:41:03 +
Gerrit-HasComments: Yes


[kudu-CR] Add some additional info to ScanRequest traces

2018-08-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11361 )

Change subject: Add some additional info to ScanRequest traces
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tablet/tablet.cc@1790
PS2, Line 1790: TRACE_COUNTER_INCREMENT("rowset_iterators", ret.size());
> This sounds like a good idea in the current implementation, but that would
It seems with PS4 this is no longer relevant.


http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tablet/tablet.cc@1805
PS2, Line 1805: // Swap results into the parameters.
> nit: please either move this one line down or remove the comment at all.
It seems with PS4 this is no longer relevant.



--
To view, visit http://gerrit.cloudera.org:8080/11361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
Gerrit-Change-Number: 11361
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 30 Aug 2018 21:42:37 +
Gerrit-HasComments: Yes


[kudu-CR] Add some additional info to ScanRequest traces

2018-08-30 Thread Will Berkeley (Code Review)
Will Berkeley has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/11361 )

Change subject: Add some additional info to ScanRequest traces
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/11361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
Gerrit-Change-Number: 11361
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] Add some additional info to ScanRequest traces

2018-08-30 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11361 )

Change subject: Add some additional info to ScanRequest traces
..


Patch Set 4: Verified+1

Unrelated flake.


--
To view, visit http://gerrit.cloudera.org:8080/11361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
Gerrit-Change-Number: 11361
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 30 Aug 2018 22:01:58 +
Gerrit-HasComments: No


[kudu-CR] bitmap: add equality method

2018-08-30 Thread Adar Dembo (Code Review)
Hello Mike Percy, Alexey Serbin, Kudu Jenkins, Grant Henke, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11266

to look at the new patch set (#6).

Change subject: bitmap: add equality method
..

bitmap: add equality method

This is useful for tests.

Change-Id: If53e41250435501e158bdbc076dc8e89ea153256
---
M src/kudu/util/bitmap-test.cc
M src/kudu/util/bitmap.h
2 files changed, 68 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/11266/6
--
To view, visit http://gerrit.cloudera.org:8080/11266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If53e41250435501e158bdbc076dc8e89ea153256
Gerrit-Change-Number: 11266
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] bitmap: add equality method

2018-08-30 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11266 )

Change subject: bitmap: add equality method
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11266/4/src/kudu/util/bitmap-test.cc
File src/kudu/util/bitmap-test.cc:

http://gerrit.cloudera.org:8080/#/c/11266/4/src/kudu/util/bitmap-test.cc@231
PS4, Line 231: TEST(TestBitMap, TestEquals) {
> Probably, that's going to appear in the next version (like PS6)?
Oops, sorry, I messed up my rebase for PS5. See PS6.



--
To view, visit http://gerrit.cloudera.org:8080/11266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If53e41250435501e158bdbc076dc8e89ea153256
Gerrit-Change-Number: 11266
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 30 Aug 2018 22:04:07 +
Gerrit-HasComments: Yes


[kudu-CR] Improve logging of maintenance ops

2018-08-30 Thread Will Berkeley (Code Review)
Will Berkeley has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11367


Change subject: Improve logging of maintenance ops
..

Improve logging of maintenance ops

MRS flushes and rowset compactions
=

MRS flush and rowset compaction logging now includes the number of new rowsets
flushed.

Before:

I0830 14:54:24.353442 2830984064 tablet.cc:1651] T test_tablet_id P 
78c16245dcd84048bf66debf1958c169: Flush successful on 100 rows (9272 bytes)

I0830 14:58:23.309068 2830984064 tablet.cc:1651] T test_tablet_id P 
15853c32ec1d4f70b82aef90da2108a6: Compaction successful on 30 rows (4957 bytes)

After:

I0830 14:54:24.353442 2830984064 tablet.cc:1651] T test_tablet_id P 
78c16245dcd84048bf66debf1958c169: Flush successful on 100 rows (1 rowsets, 9272 
bytes)

I0830 14:58:23.309068 2830984064 tablet.cc:1651] T test_tablet_id P 
15853c32ec1d4f70b82aef90da2108a6: Compaction successful on 30 rows (1 rowsets, 
4957 bytes)

Major delta compactions
===

Major delta compaction had logging that was a little too verbose- for
each delta store compacted, it printed out a separate log message with
mutation counts. Instead, this patch makes the old more detailed output
appear only when verbose logging is on and at INFO level substitutes a
total count of each mutation type and a count of delta files compacted,
as part of the "Finished" message.

Before:

I0830 15:09:31.731609 2830984064 delta_compaction.cc:296] Starting major delta 
compaction for columns val1[int32 NOT NULL] val3[int32 NOT NULL] val4[string 
NOT NULL]
I0830 15:09:31.731645 2830984064 delta_compaction.cc:300] Preparing to major 
compact delta file: 008509586267 (ts range=[101, 150], delete_count=[0], 
reinsert_count=[0], update_counts_by_col_id=[11:50,13:50,14:50])
I0830 15:09:31.731672 2830984064 delta_compaction.cc:300] Preparing to major 
compact delta file: 008509586268 (ts range=[151, 200], delete_count=[0], 
reinsert_count=[0], update_counts_by_col_id=[11:50,13:50,14:50])
I0830 15:09:31.735069 2830984064 delta_compaction.cc:306] Finished major delta 
compaction of columns val1[int32 NOT NULL] val3[int32 NOT NULL] val4[string NOT 
NULL]

After:

I0830 14:45:38.433037 2830984064 delta_compaction.cc:326] Starting major delta 
compaction for columns val1[int32 NOT NULL], val3[int32 NOT NULL], val4[string 
NOT NULL]
I0830 14:45:38.435420 2830984064 delta_compaction.cc:341] Finished major delta 
compaction of columns val1[int32 NOT NULL], val3[int32 NOT NULL], val4[string 
NOT NULL]. Compacted 1 delta files. Overall stats: delete_count=0, 
reinsert_count=0, update_count=150

Minor delta compactions and deltamemstore flushes already include good
information on how much was compacted and flushed, respectively. Their
logging is unchanged.

Change-Id: I43883001c5a1c72ff1ca0c1bc84d24a8533e3891
---
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_stats.cc
M src/kudu/tablet/delta_stats.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/tablet.cc
6 files changed, 64 insertions(+), 13 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11367/1
--
To view, visit http://gerrit.cloudera.org:8080/11367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I43883001c5a1c72ff1ca0c1bc84d24a8533e3891
Gerrit-Change-Number: 11367
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley 


[kudu-CR] Improve logging of maintenance ops

2018-08-30 Thread Will Berkeley (Code Review)
Hello Mike Percy, Alexey Serbin, Andrew Wong, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11367

to look at the new patch set (#2).

Change subject: Improve logging of maintenance ops
..

Improve logging of maintenance ops

MRS flushes and rowset compactions
=

MRS flush and rowset compaction logging now includes the number of new rowsets
flushed.

Before:

I0830 14:54:24.353442 2830984064 tablet.cc:1651] T test_tablet_id P 
78c16245dcd84048bf66debf1958c169: Flush successful on 100 rows (9272 bytes)

I0830 14:58:23.309068 2830984064 tablet.cc:1651] T test_tablet_id P 
15853c32ec1d4f70b82aef90da2108a6: Compaction successful on 30 rows (4957 bytes)

After:

I0830 14:54:24.353442 2830984064 tablet.cc:1651] T test_tablet_id P 
78c16245dcd84048bf66debf1958c169: Flush successful on 100 rows (1 rowsets, 9272 
bytes)

I0830 14:58:23.309068 2830984064 tablet.cc:1651] T test_tablet_id P 
15853c32ec1d4f70b82aef90da2108a6: Compaction successful on 30 rows (1 rowsets, 
4957 bytes)

Major delta compactions
===

Major delta compaction had logging that was a little too verbose- for
each delta store compacted, it printed out a separate log message with
mutation counts. Instead, this patch makes the old more detailed output
appear only when verbose logging is on and at INFO level substitutes a
total count of each mutation type and a count of delta files compacted,
as part of the "Finished" message.

Before:

I0830 15:22:05.918965 2830984064 delta_compaction.cc:296] Starting major delta 
compaction for columns val1[int32 NOT NULL] val3[int32 NOT NULL] val4[string 
NOT NULL]
I0830 15:22:05.919018 2830984064 delta_compaction.cc:300] Preparing to major 
compact delta file: 0145461451762035 (ts range=[101, 150], delete_count=[0], 
reinsert_count=[0], update_counts_by_col_id=[11:50,13:50,14:50])
I0830 15:22:05.919056 2830984064 delta_compaction.cc:300] Preparing to major 
compact delta file: 0145461451762036 (ts range=[151, 200], delete_count=[0], 
reinsert_count=[0], update_counts_by_col_id=[11:50,13:50,14:50])
I0830 15:22:05.921931 2830984064 delta_compaction.cc:306] Finished major delta 
compaction of columns val1[int32 NOT NULL] val3[int32 NOT NULL] val4[string NOT 
NULL]

After:

I0830 15:28:02.737797 2830984064 delta_compaction.cc:326] Starting major delta 
compaction for columns val1[int32 NOT NULL], val3[int32 NOT NULL], val4[string 
NOT NULL]
I0830 15:28:02.740360 2830984064 delta_compaction.cc:341] Finished major delta 
compaction of columns val1[int32 NOT NULL], val3[int32 NOT NULL], val4[string 
NOT NULL]. Compacted 2 delta files. Overall stats: delete_count=0, 
reinsert_count=0, update_count=300

Minor delta compactions and deltamemstore flushes already include good
information on how much was compacted and flushed, respectively. Their
logging is unchanged.

Change-Id: I43883001c5a1c72ff1ca0c1bc84d24a8533e3891
---
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_stats.cc
M src/kudu/tablet/delta_stats.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/tablet.cc
6 files changed, 64 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11367/2
--
To view, visit http://gerrit.cloudera.org:8080/11367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I43883001c5a1c72ff1ca0c1bc84d24a8533e3891
Gerrit-Change-Number: 11367
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] [tests] make master-stress-test more stable

2018-08-30 Thread Alexey Serbin (Code Review)
Hello Will Berkeley, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11364

to look at the new patch set (#4).

Change subject: [tests] make master-stress-test more stable
..

[tests] make master-stress-test more stable

The master-stress-test has been flaky for some time.  After looking
at those failure closely, I found about five different issues.  This
patch addresses the most prominent one: failures of the test scenario
because of timeouts errors in case of TSAN builds: about 9 out of 10
failures were due to the issue fixed by this patch.  The timeout errors
were induced by frequent RPC queue overflows and the timing of master
restarts wrt specifics of KuduClient backoff scheme used for retries.

The rest of issues behind the flakiness will be addressed separately.

This patch also introduces rpc_negotiation_timeout as a member
for ExternalMiniClusterOptions: that's to customize connection
negotiation timeout for the cluster's utility messenger.

Some statistics:

before the fix:
  37 out of 256 failed in TSAN build, where almost all failures are
  due to the issues fixed by this patch:
http://dist-test.cloudera.org//job?job_id=aserbin.1535666928.86597

after the fix:
  2 out of 256 failed in TSAN build, where the failure was due to the
  issue [1] not addressed by this change list
  (it will be addressed separately):
http://dist-test.cloudera.org/job?job_id=aserbin.1535665784.64065

A couple of other issues due to which the test has failed:
[1] https://issues.apache.org/jira/browse/KUDU-2564
[2] https://issues.apache.org/jira/browse/HIVE-19874 (Dan's evaluation)

Change-Id: I6b30d8afd4a24acdbd96481cadeaf8f6a9475adf
---
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
3 files changed, 91 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/11364/4
--
To view, visit http://gerrit.cloudera.org:8080/11364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6b30d8afd4a24acdbd96481cadeaf8f6a9475adf
Gerrit-Change-Number: 11364
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] [tests] make master-stress-test more stable

2018-08-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11364 )

Change subject: [tests] make master-stress-test more stable
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11364/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11364/2//COMMIT_MSG@9
PS2, Line 9: The master-stress-test has been flaky for some time.  After looking
   : at those failure closely, I found about five different issues.  
This
   : patch addresses the most prominent one: failures of the test 
scenario
   : because of timeouts errors in case of TSAN builds.  The timeout 
errors
   : were induced by frequent RPC queue overflows.
> Thanks for tackling this. Are you able to quantify how much this particular
Done


http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/integration-tests/master-stress-test.cc
File src/kudu/integration-tests/master-stress-test.cc:

http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/integration-tests/master-stress-test.cc@140
PS2, Line 140: // Make the Raft heartbeat interval shorter to allow for 
faster detection
 : // of failed/restarted leader masters.
 : 
opts.extra_master_flags.emplace_back("--raft_heartbeat_interval_ms=500");
> Isn't this the default value?
Woops.  Yes, 500 ms is the default value.  I was experimenting with the shorter 
interval here, but then got back to the default since there is no real benefit 
shortening this.  I'll remove this piece.


http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/integration-tests/master-stress-test.cc@161
PS2, Line 161: 
opts.extra_tserver_flags.emplace_back("--raft_heartbeat_interval_ms=500");
> See above.
Yep, that's the default, but I thought it would be nice to list it here just to 
see the difference.  But if everybody remembers the default, then I'll just a 
comment about that and remove this line.


http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/integration-tests/master-stress-test.cc@327
PS2, Line 327:   if (s.IsNetworkError()) {
 : continue;
 :   }
 :   if (tablet_ids.empty()) {
 : continue;
 :   }
> Should we backoff in both of these cases?
Yep, backing off would be nice.  Done.


http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/integration-tests/master-stress-test.cc@415
PS2, Line 415: it's crucial to keep the
 :   // master up
> All masters? Or just the new one?
De facto that's about all masters.  Yes, the rest two can re-elect a new leader 
and be available for requests, but the next cycle the new leader might be 
shutdown.  Otherwise, things get hairy time to time and all retries fail in 
vain eventually.

I think I'll just increase the default run time then.


http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/integration-tests/master-stress-test.cc@492
PS2, Line 492:   FLAGS_timeout_ms = kDefaultAdminTimeout.ToMilliseconds();
> We don't use the CLI in this test, do we? If this is for the LeaderMasterPr
Yes, that's for the SyncRpc() calls on the LeaderMasterProxy.



--
To view, visit http://gerrit.cloudera.org:8080/11364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b30d8afd4a24acdbd96481cadeaf8f6a9475adf
Gerrit-Change-Number: 11364
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 30 Aug 2018 22:31:33 +
Gerrit-HasComments: Yes


[kudu-CR] bitmap: add equality method

2018-08-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11266 )

Change subject: bitmap: add equality method
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If53e41250435501e158bdbc076dc8e89ea153256
Gerrit-Change-Number: 11266
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 31 Aug 2018 01:53:39 +
Gerrit-HasComments: No


[kudu-CR] Add some additional info to ScanRequest traces

2018-08-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11361 )

Change subject: Add some additional info to ScanRequest traces
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/11361/2/src/kudu/tserver/tablet_service.cc@1684
PS2, Line 1684: TRACE("Created scanner $0 for tablet $1", scanner->id(), 
scanner->tablet_id());
> Wouldn't it be useful to trace registration of the new scanner instead?  In
What do you think regarding tracing of registration of a scanner instead of 
just creation?



--
To view, visit http://gerrit.cloudera.org:8080/11361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61792b6989c54a4e0578fe9255d769fe071e52f8
Gerrit-Change-Number: 11361
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Fri, 31 Aug 2018 01:56:23 +
Gerrit-HasComments: Yes


[kudu-CR] bitmap: add equality method

2018-08-30 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11266 )

Change subject: bitmap: add equality method
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If53e41250435501e158bdbc076dc8e89ea153256
Gerrit-Change-Number: 11266
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 31 Aug 2018 01:59:14 +
Gerrit-HasComments: No


[kudu-CR] [tests] make master-stress-test more stable

2018-08-30 Thread Alexey Serbin (Code Review)
Hello Will Berkeley, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11364

to look at the new patch set (#5).

Change subject: [tests] make master-stress-test more stable
..

[tests] make master-stress-test more stable

The master-stress-test has been flaky for some time.  After looking
at those failure closely, I found at least five different issues.

This patch addresses the most prominent one: failures of the test
scenario because of timeout errors in case of TSAN builds.  About 9 out
of 10 failures were due to the issue fixed by this patch.  The timeout
errors were triggered by RPC queue overflow and the timing of master
restarts wrt the retry/back-off pattern used by KuduClient and other
test utility code.

The rest of issues behind the flakiness will be addressed separately.

This patch also introduces rpc_negotiation_timeout as a member for
ExternalMiniClusterOptions: that's to customize connection negotiation
timeout for the cluster's utility messenger.

Some statistics about the flakiness:

before the fix:
  37 out of 256 failed in TSAN build, where almost all failures are
  due to the issues fixed by this patch:
http://dist-test.cloudera.org//job?job_id=aserbin.1535666928.86597

after the fix:
  2 out of 256 failed in TSAN build, where the failure was due to [2]
  (not addressed by this change list, it will be addressed separately):
http://dist-test.cloudera.org/job?job_id=aserbin.1535665784.64065

A few of other issues due to which the test is still a bit flaky:
  [1] https://issues.apache.org/jira/browse/KUDU-2561
  [2] https://issues.apache.org/jira/browse/KUDU-2564
  [3] https://issues.apache.org/jira/browse/HIVE-19874

By my understanding, Dan found [3] to be the reason behind one type
of HMS-related failures; and there two more to evaluate.

Change-Id: I6b30d8afd4a24acdbd96481cadeaf8f6a9475adf
---
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
3 files changed, 93 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/11364/5
--
To view, visit http://gerrit.cloudera.org:8080/11364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6b30d8afd4a24acdbd96481cadeaf8f6a9475adf
Gerrit-Change-Number: 11364
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] [tests] make master-stress-test more stable

2018-08-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11364 )

Change subject: [tests] make master-stress-test more stable
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/mini-cluster/external_mini_cluster.h
File src/kudu/mini-cluster/external_mini_cluster.h:

http://gerrit.cloudera.org:8080/#/c/11364/2/src/kudu/mini-cluster/external_mini_cluster.h@159
PS2, Line 159:   // an incomplete connection negotiation will timeout.
> Can you note the default here?
Done



--
To view, visit http://gerrit.cloudera.org:8080/11364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b30d8afd4a24acdbd96481cadeaf8f6a9475adf
Gerrit-Change-Number: 11364
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Fri, 31 Aug 2018 02:11:26 +
Gerrit-HasComments: Yes


[kudu-CR] Improve runtime of slow java client security test.

2018-08-30 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11365 )

Change subject: Improve runtime of slow java client security test.
..


Patch Set 1:

(1 comment)

This change would be great to have as the test runs about 2x as long as any 
other test.

It looks like the change is fairly flaky with this change. It fails on 
dist-test about 40% of the time with the error below:

1) testRenewAndReacquireKeberosCredentials(org.apache.kudu.client.TestSecurity)
org.apache.kudu.client.NonRecoverableException: server requires authentication, 
but client Kerberos credentials (TGT) have expired. Authentication tokens were 
not used because no token is available
at 
org.apache.kudu.client.KuduException.transformException(KuduException.java:110)
at 
org.apache.kudu.client.KuduClient.joinAndHandleException(KuduClient.java:365)
at 
org.apache.kudu.client.KuduClient.listTabletServers(KuduClient.java:188)
at 
org.apache.kudu.client.TestSecurity.checkClientCanReconnect(TestSecurity.java:137)
at 
org.apache.kudu.client.TestSecurity.testRenewAndReacquireKeberosCredentials(TestSecurity.java:330)

http://gerrit.cloudera.org:8080/#/c/11365/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11365/1//COMMIT_MSG@7
PS1, Line 7: Improve runtime of slow java client security test.
Nit: Put "KUDU-2489:" at the beginning of the subject.



--
To view, visit http://gerrit.cloudera.org:8080/11365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I19fa5185430a6c91fbe050dbc458b7b91e2d5bea
Gerrit-Change-Number: 11365
Gerrit-PatchSet: 1
Gerrit-Owner: Brian McDevitt 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Fri, 31 Aug 2018 03:24:18 +
Gerrit-HasComments: Yes


[kudu-CR] Improve runtime of slow java client security test.

2018-08-30 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11365 )

Change subject: Improve runtime of slow java client security test.
..


Patch Set 1:

> Patch Set 1:
>
> (1 comment)
>
> This change would be great to have as the test runs about 2x as long as any 
> other test.
> 
> It looks like the change is fairly flaky with this change. It fails on 
> dist-test about 40% of the time with the error below:

Yea, maybe moving this to be an integration test would be better rather than 
potentially introducing flakiness? Agreed that long running tests suck, but 
unfortunately it's not so easy to mock out the advancement of time across 
clients, servers, and krb5 libraries without some pretty substantial effort.


--
To view, visit http://gerrit.cloudera.org:8080/11365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I19fa5185430a6c91fbe050dbc458b7b91e2d5bea
Gerrit-Change-Number: 11365
Gerrit-PatchSet: 1
Gerrit-Owner: Brian McDevitt 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 31 Aug 2018 03:36:37 +
Gerrit-HasComments: No


[kudu-CR] Improve logging of maintenance ops

2018-08-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11367 )

Change subject: Improve logging of maintenance ops
..


Patch Set 2:

(13 comments)

looks good, just some nits

http://gerrit.cloudera.org:8080/#/c/11367/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11367/2//COMMIT_MSG@7
PS2, Line 7: Improve logging of maintenance ops
Could you update this to follow the pattern in most of other Kudu commmits:

[tablet] improve logging of maintenance ops

?


http://gerrit.cloudera.org:8080/#/c/11367/2//COMMIT_MSG@12
PS2, Line 12: MRS flush and rowset compaction logging now includes the number 
of new rowsets
nit: could you keep these lines no longer than 72 symbols?  Other lines with 
examples is OK to be as as, but here it would be nice to adhere to Kudu commit 
guidelines.


http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/delta_compaction.cc
File src/kudu/tablet/delta_compaction.cc:

http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/delta_compaction.cc@107
PS2, Line 107: push_back
nit: maybe, use emplace_back(std::move(...)) instead


http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/delta_compaction.cc@109
PS2, Line 109:
nit: indent


http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/delta_compaction.cc@109
PS2, Line 109: push_back
ditto


http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/delta_compaction.cc@316
PS2, Line 316:
nit: add const specifier


http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/delta_compaction.cc@330
PS2, Line 330: const shared_ptr&
nit: would 'const auto&' work here as well?


http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/delta_compaction.cc@341
PS2, Line 341: << "Finished major delta compaction of columns "
 : << ColumnNamesToString() << ". Compacted " << 
included_stores_.size()
 : << " delta files. Overall stats: " << mds.ToString();
nit: consider using strings::Substitute() here


http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/delta_stats.h
File src/kudu/tablet/delta_stats.h:

http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/delta_stats.h@89
PS2, Line 89: int64_t UpdateCount() const;
Maybe, follow the style of already existing update_count_for_col_id() method 
and call this

update_count()

or

update_count_total()

or

update_count_all_columns() ?


http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/delta_tracker.cc@408
PS2, Line 408:   LOG_WITH_PREFIX(INFO) << "Flushing compaction of " << 
compacted_blocks.size()
 : << " redo delta blocks { " << 
compacted_blocks
 : << " } into block " << new_block_id;
I think using strings::Substitute() would help to make it more readable.


http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/diskrowset.h
File src/kudu/tablet/diskrowset.h:

http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/diskrowset.h@220
PS2, Line 220: int64_t
size() returns size_t; is there any specific reason to convert it into signed 
integer?


http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/tablet.cc@1487
PS2, Line 1487: gced_all_input
If this variable is not used in the code below, consider getting rid of it and 
just adding explanatory comment about drsw.rows_written_count() == 0 meaning.


http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/tablet.cc@1651
PS2, Line 1651: << op_name << " successful on " << drsw.rows_written_count()
  : << " rows " << "(" << 
drsw.drs_written_count() << " rowsets, "
  : << drsw.written_size() << " bytes)";
Maybe, use strings::Substitute() to build this message?  That way it will be 
easier to read in the code.



--
To view, visit http://gerrit.cloudera.org:8080/11367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I43883001c5a1c72ff1ca0c1bc84d24a8533e3891
Gerrit-Change-Number: 11367
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Fri, 31 Aug 2018 04:52:08 +
Gerrit-HasComments: Yes


[kudu-CR] Improve logging of maintenance ops

2018-08-30 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11367 )

Change subject: Improve logging of maintenance ops
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/delta_compaction.cc
File src/kudu/tablet/delta_compaction.cc:

http://gerrit.cloudera.org:8080/#/c/11367/2/src/kudu/tablet/delta_compaction.cc@296
PS2, Line 296: // Throwaway struct to combine stats from multiple delta stores.
 : struct MergedDeltaStats {
Do you expect to use this more extensively? Why not just a `string 
DeltaStoreStatsToString(vector)`?



--
To view, visit http://gerrit.cloudera.org:8080/11367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I43883001c5a1c72ff1ca0c1bc84d24a8533e3891
Gerrit-Change-Number: 11367
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Fri, 31 Aug 2018 05:15:41 +
Gerrit-HasComments: Yes


[kudu-CR] [location awareness] Add 'location' column in tserver list

2018-08-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11313 )

Change subject: [location_awareness] Add 'location' column in tserver list
..


Patch Set 5:

(4 comments)

overall looks good, just a few nits

http://gerrit.cloudera.org:8080/#/c/11313/5/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/11313/5/src/kudu/tools/kudu-tool-test.cc@2038
PS5, Line 2038: Listi
Is this a typo?  'List' or 'Listing'?


http://gerrit.cloudera.org:8080/#/c/11313/5/src/kudu/tools/kudu-tool-test.cc@2043
PS5, Line 2043: "--location_mapping_cmd=" +
  : Substitute("$0 $1", kLocationCmdPath, location)
nit: why not just

Substitute("--location_mapping_cmd=$0 $1", kLocationCmdPath, location)

?


http://gerrit.cloudera.org:8080/#/c/11313/5/src/kudu/tools/kudu-tool-test.cc@2057
PS5, Line 2057: Listi
ditto: is this a typo?


http://gerrit.cloudera.org:8080/#/c/11313/5/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/11313/5/src/kudu/tools/tool_action_tserver.cc@144
PS5, Line 144: loc
nit: add std::move:

std::move(loc)



--
To view, visit http://gerrit.cloudera.org:8080/11313
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6c9dc8bd08b8d907111fac850fc6fd2c2b96fb8
Gerrit-Change-Number: 11313
Gerrit-PatchSet: 5
Gerrit-Owner: Fengling Wang 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Fengling Wang 
Gerrit-Reviewer: Greg Solovyev 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Fri, 31 Aug 2018 05:48:18 +
Gerrit-HasComments: Yes