[jira] [Created] (KUDU-3245) Provide Client API to set verbose logging filtered by vmodule
Hao Hao created KUDU-3245: - Summary: Provide Client API to set verbose logging filtered by vmodule Key: KUDU-3245 URL: https://issues.apache.org/jira/browse/KUDU-3245 Project: Kudu Issue Type: Improvement Components: client Reporter: Hao Hao Similar to the [{{client::SetVerboseLogLevel}}|https://github.com/apache/kudu/blob/master/src/kudu/client/client.h#L164] API, it would be nice to add another API that allows enabling verbose logging filtered by module. -- This message was sent by Atlassian Jira (v8.3.4#803005)
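For illustration, a minimal sketch of what such an API could look like. The name {{SetVerboseLogFilter}}, its signature, and the glog-based implementation are assumptions for this example, not part of the existing Kudu client API; per the issue, only {{client::SetVerboseLogLevel}} exists today.
{noformat}
// Hypothetical sketch only: SetVerboseLogFilter() does not exist in the Kudu
// client. It mirrors the existing client::SetVerboseLogLevel() but applies a
// --vmodule-style filter such as "client=2,meta_cache=3" via glog's
// google::SetVLOGLevel().
#include <cstdlib>
#include <sstream>
#include <string>

#include <glog/logging.h>  // brings in google::SetVLOGLevel()

namespace kudu {
namespace client {

void SetVerboseLogFilter(const std::string& vmodule_spec) {
  std::istringstream in(vmodule_spec);
  std::string entry;
  while (std::getline(in, entry, ',')) {
    const size_t eq = entry.find('=');
    if (eq == std::string::npos) continue;  // Skip malformed entries.
    const std::string module = entry.substr(0, eq);
    const int level = std::atoi(entry.substr(eq + 1).c_str());
    google::SetVLOGLevel(module.c_str(), level);
  }
}

} // namespace client
} // namespace kudu
{noformat}
A caller would then write something like {{kudu::client::SetVerboseLogFilter("meta_cache=3");}} instead of raising verbosity for the whole client library.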
[jira] [Created] (KUDU-3237) MaintenanceManagerTest.TestCompletedOpsHistory is flaky
Hao Hao created KUDU-3237: - Summary: MaintenanceManagerTest.TestCompletedOpsHistory is flaky Key: KUDU-3237 URL: https://issues.apache.org/jira/browse/KUDU-3237 Project: Kudu Issue Type: Test Reporter: Hao Hao Came across a test failure in MaintenanceManagerTest.TestCompletedOpsHistory like the following: {noformat} I0125 19:55:10.782884 24454 maintenance_manager.cc:594] P 12345: op5 complete. Timing: real 0.000s user 0.000s sys 0.000s Metrics: {} /data/1/hao/Repositories/kudu/src/kudu/util/maintenance_manager-test.cc:525: Failure Expected: std::min(kHistorySize, i + 1) Which is: 6 To be equal to: status_pb.completed_operations_size() Which is: 5 I0125 19:55:10.783524 24420 test_util.cc:148] --- I0125 19:55:10.783561 24420 test_util.cc:149] Had fatal failures, leaving test files at /tmp/dist-test-task1ofSWE/test-tmp/maintenance_manager-test.0.MaintenanceManagerTest.TestCompletedOpsHistory.1611604508702756-24420 {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KUDU-3118) Validate --tserver_enforce_access_control is set when authorization is enabled in Master
Hao Hao created KUDU-3118: - Summary: Validate --tserver_enforce_access_control is set when authorization is enabled in Master Key: KUDU-3118 URL: https://issues.apache.org/jira/browse/KUDU-3118 Project: Kudu Issue Type: Task Reporter: Hao Hao As mentioned in the code review [https://gerrit.cloudera.org/c/15897/1/docs/security.adoc#476], it would be nice to add some validation (maybe in ksck or something) that --tserver_enforce_access_control is set on the tablet servers if fine-grained authorization is enabled on the master. -- This message was sent by Atlassian Jira (v8.3.4#803005)
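For illustration, a rough sketch of the kind of cross-check such a tool could perform, assuming the flag values have already been fetched from each server; all names below are hypothetical and are not existing ksck code.
{noformat}
// Hypothetical check: warn about every tablet server that does not enforce
// access control while fine-grained authorization is enabled on the master.
// The flag map is assumed to have been collected elsewhere (e.g. from the
// servers' flags endpoints).
#include <map>
#include <string>
#include <vector>

std::vector<std::string> CheckAccessControlEnforcement(
    bool master_authz_enabled,
    const std::map<std::string, std::map<std::string, std::string>>& tserver_flags) {
  std::vector<std::string> warnings;
  if (!master_authz_enabled) return warnings;
  for (const auto& [uuid, flags] : tserver_flags) {
    auto it = flags.find("tserver_enforce_access_control");
    if (it == flags.end() || it->second != "true") {
      warnings.push_back("tablet server " + uuid +
                         " does not set --tserver_enforce_access_control=true "
                         "although authorization is enabled on the master");
    }
  }
  return warnings;
}
{noformat}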
[jira] [Created] (KUDU-3091) Support ownership privilege with Ranger
Hao Hao created KUDU-3091: - Summary: Support ownership privilege with Ranger Key: KUDU-3091 URL: https://issues.apache.org/jira/browse/KUDU-3091 Project: Kudu Issue Type: Task Reporter: Hao Hao Currently, the ownership privilege in Ranger is not available, as Kudu has no concept of an owner and does not store owner information internally. It would be nice to enable it once Kudu introduces owners. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KUDU-3090) Add owner concept in Kudu
[ https://issues.apache.org/jira/browse/KUDU-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-3090: -- Description: As mentioned in the Ranger integration design doc, Ranger supports ownership privilege by creating a default policy that allows the \{OWNER} of a resource to access it without creating an additional policy manually. Unless Kudu actually has full support for owners, the ownership privilege is not possible with the Ranger integration. (was: As mentioned in the Ranger integration design doc, Ranger supports ownership privilege by creating a default policy that allows {OWNER} of a resource to access it without creating additional policy manually. Unless Kudu actually has a full support for owner, ownership privilege is not possible with Ranger integration.) > Add owner concept in Kudu > - > > Key: KUDU-3090 > URL: https://issues.apache.org/jira/browse/KUDU-3090 > Project: Kudu > Issue Type: New Feature >Reporter: Hao Hao >Priority: Major > > As mentioned in the Ranger integration design doc, Ranger supports ownership > privilege by creating a default policy that allows the \{OWNER} of a resource to > access it without creating an additional policy manually. Unless Kudu actually > has full support for owners, the ownership privilege is not possible with the Ranger > integration. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KUDU-3090) Add owner concept in Kudu
Hao Hao created KUDU-3090: - Summary: Add owner concept in Kudu Key: KUDU-3090 URL: https://issues.apache.org/jira/browse/KUDU-3090 Project: Kudu Issue Type: New Feature Reporter: Hao Hao As mentioned in the Ranger integration design doc, Ranger supports ownership privilege by creating a default policy that allows the {OWNER} of a resource to access it without creating an additional policy manually. Unless Kudu actually has full support for owners, the ownership privilege is not possible with the Ranger integration. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KUDU-2971) Add a generic Java library wrapper
[ https://issues.apache.org/jira/browse/KUDU-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao resolved KUDU-2971. --- Fix Version/s: 1.12.0 Resolution: Fixed > Add a generic Java library wrapper > -- > > Key: KUDU-2971 > URL: https://issues.apache.org/jira/browse/KUDU-2971 > Project: Kudu > Issue Type: Sub-task >Affects Versions: 1.11.0 >Reporter: Hao Hao >Assignee: Hao Hao >Priority: Major > Fix For: 1.12.0 > > > For Ranger integration, to call the Java Ranger plugin from masters, we need to > create a wrapper (via a Java subprocess). This should be generic enough to be used by > future integrations (e.g. Atlas) which need to call other Java libraries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KUDU-2972) Add Ranger authorization provider
[ https://issues.apache.org/jira/browse/KUDU-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao resolved KUDU-2972. --- Fix Version/s: 1.12.0 Resolution: Fixed > Add Ranger authorization provider > - > > Key: KUDU-2972 > URL: https://issues.apache.org/jira/browse/KUDU-2972 > Project: Kudu > Issue Type: Sub-task >Affects Versions: 1.11.0 >Reporter: Hao Hao >Assignee: Attila Bukor >Priority: Major > Fix For: 1.12.0 > > > For Ranger integration, we need to create a Ranger authorization provider to > retrieve authorization decisions from the wrapped Ranger plugin. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KUDU-3076) Add a Kudu cli for granting/revoking Ranger privileges
[ https://issues.apache.org/jira/browse/KUDU-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-3076: -- Description: Even though Ranger has a GUI for policies management (and can be accessed via REST API), it probably will be more user friendly to have a Kudu cli tool for granting and revoking privileges. (was: Even though Ranger has a UGI for policies management (and can be accessed via REST API), it probably will be more user friendly to have a Kudu cli tool for granting and revoking privileges.) > Add a Kudu cli for granting/revoking Ranger privileges > -- > > Key: KUDU-3076 > URL: https://issues.apache.org/jira/browse/KUDU-3076 > Project: Kudu > Issue Type: Task >Reporter: Hao Hao >Priority: Major > > Even though Ranger has a GUI for policies management (and can be accessed via > REST API), it probably will be more user friendly to have a Kudu cli tool for > granting and revoking privileges. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KUDU-3076) Add a Kudu cli for granting/revoking Ranger privileges
Hao Hao created KUDU-3076: - Summary: Add a Kudu cli for granting/revoking Ranger privileges Key: KUDU-3076 URL: https://issues.apache.org/jira/browse/KUDU-3076 Project: Kudu Issue Type: Task Reporter: Hao Hao Even though Ranger has a GUI for policies management (and can be accessed via REST API), it probably will be more user friendly to have a Kudu cli tool for granting and revoking privileges. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KUDU-2973) Support semi-database concept even without HMS integration
[ https://issues.apache.org/jira/browse/KUDU-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2973: -- Description: For Ranger integration, we need to continue to have semi-database support. And currently it is tied to the HMS integration, which needs to be separated out. This includes extracting database information from the table name for retrieving the corresponding Ranger policies. For example, "db.table" belongs to 'db'. And as Kudu table names are case sensitive and can contain special characters, a table named "table" will be considered as 'table' belonging to the 'default' database, and "db.table.abc" will be considered as 'table.abc' belonging to the 'db' database. (was: For Ranger integration, we need to continue to have semi-database support. And currently it is tied to the HMS integration, which needs to be separated out.) > Support semi-database concept even without HMS integration > -- > > Key: KUDU-2973 > URL: https://issues.apache.org/jira/browse/KUDU-2973 > Project: Kudu > Issue Type: Sub-task >Affects Versions: 1.11.0 >Reporter: Hao Hao >Assignee: Attila Bukor >Priority: Major > > For Ranger integration, we need to continue to have semi-database support. > And currently it is tied to the HMS integration, which needs to be separated > out. This includes extracting database information from the table name for > retrieving the corresponding Ranger policies. For example, "db.table" belongs to > 'db'. And as Kudu table names are case sensitive and can contain special > characters, a table named "table" will be considered as 'table' belonging to > the 'default' database, and "db.table.abc" will be considered as 'table.abc' > belonging to the 'db' database. -- This message was sent by Atlassian Jira (v8.3.4#803005)
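For illustration, a minimal sketch of the name-splitting rule the examples above describe (everything before the first '.' is the database; names without a '.' fall into 'default'). This is not the actual master code, and the real implementation may also honor a configurable default database; it only mirrors the examples in the description.
{noformat}
// Sketch of splitting a Kudu table name into (database, table), following the
// examples in the description:
//   "table"        -> ("default", "table")
//   "db.table"     -> ("db", "table")
//   "db.table.abc" -> ("db", "table.abc")   // only the first '.' splits
#include <string>
#include <utility>

std::pair<std::string, std::string> SplitTableName(const std::string& table_name) {
  const size_t dot = table_name.find('.');
  if (dot == std::string::npos) {
    return {"default", table_name};
  }
  return {table_name.substr(0, dot), table_name.substr(dot + 1)};
}
{noformat}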
[jira] [Created] (KUDU-3006) RebalanceIgnoredTserversTest.Basic is flaky
Hao Hao created KUDU-3006: - Summary: RebalanceIgnoredTserversTest.Basic is flaky Key: KUDU-3006 URL: https://issues.apache.org/jira/browse/KUDU-3006 Project: Kudu Issue Type: Bug Reporter: Hao Hao Attachments: rebalancer_tool-test.1.txt RebalanceIgnoredTserversTest.Basic of the rebalancer_tool-test sometimes fails with an error like below. I attached full test log. {noformat} /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/rebalancer_tool-test.cc:350: Failure Value of: out Expected: has substring "2dd9365c71c54e5d83294b31046c5478 | 0" Actual: "Per-server replica distribution summary for tservers_to_empty:\n Server UUID| Replica Count\n--+---\n 2dd9365c71c54e5d83294b31046c5478 | 1\n\nPer-server replica distribution summary:\n Statistic | Value\n---+--\n Minimum Replica Count | 0\n Maximum Replica Count | 1\n Average Replica Count | 0.50\n\nPer-table replica distribution summary:\n Replica Skew | Value\n--+--\n Minimum | 1\n Maximum | 1\n Average | 1.00\n\n\nrebalancing is complete: cluster is balanced (moved 0 replicas)\n" (of type std::string) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KUDU-3003) TestAsyncKuduSession.testTabletCacheInvalidatedDuringWrites is flaky
Hao Hao created KUDU-3003: - Summary: TestAsyncKuduSession.testTabletCacheInvalidatedDuringWrites is flaky Key: KUDU-3003 URL: https://issues.apache.org/jira/browse/KUDU-3003 Project: Kudu Issue Type: Bug Reporter: Hao Hao Attachments: test-output.txt testTabletCacheInvalidatedDuringWrites of the org.apache.kudu.client.TestAsyncKuduSession test sometimes fails with an error like below. I attached full test log. {noformat} There was 1 failure: 1) testTabletCacheInvalidatedDuringWrites(org.apache.kudu.client.TestAsyncKuduSession) org.apache.kudu.client.PleaseThrottleException: all buffers are currently flushing at org.apache.kudu.client.AsyncKuduSession.apply(AsyncKuduSession.java:579) at org.apache.kudu.client.TestAsyncKuduSession.testTabletCacheInvalidatedDuringWrites(TestAsyncKuduSession.java:371) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:748) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KUDU-2973) Support semi-database concept even without HMS integration
Hao Hao created KUDU-2973: - Summary: Support semi-database concept even without HMS integration Key: KUDU-2973 URL: https://issues.apache.org/jira/browse/KUDU-2973 Project: Kudu Issue Type: Sub-task Affects Versions: 1.11.0 Reporter: Hao Hao For Ranger integration, we need to continue to have semi-database support. And currently it is tied to the HMS integration, which needs to be separated out. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KUDU-2970) Fine-grained authorization with Ranger
[ https://issues.apache.org/jira/browse/KUDU-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao reassigned KUDU-2970: - Assignee: Hao Hao > Fine-grained authorization with Ranger > --- > > Key: KUDU-2970 > URL: https://issues.apache.org/jira/browse/KUDU-2970 > Project: Kudu > Issue Type: New Feature > Components: security >Affects Versions: 1.11.0 >Reporter: Hao Hao >Assignee: Hao Hao >Priority: Major > > With the completion of Kudu’s integration with Apache Sentry, fine-grained > authorization capabilities have been added to Kudu. However, because Apache > Ranger has wider adoption and provides more comprehensive security features > (such as attribute-based access control, auditing, etc.) than Sentry, it is > important for Kudu to also integrate with Ranger. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KUDU-2972) Add Ranger authorization provider
Hao Hao created KUDU-2972: - Summary: Add Ranger authorization provider Key: KUDU-2972 URL: https://issues.apache.org/jira/browse/KUDU-2972 Project: Kudu Issue Type: Sub-task Affects Versions: 1.11.0 Reporter: Hao Hao For Ranger integration, we need to create a Ranger authorization provider to retrieve authorization decisions from the wrapped Ranger plugin. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KUDU-2970) Fine-grained authorization with Ranger
[ https://issues.apache.org/jira/browse/KUDU-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2970: -- Affects Version/s: (was: 1.10.0) 1.11.0 > Fine-grained authorization with Ranger > --- > > Key: KUDU-2970 > URL: https://issues.apache.org/jira/browse/KUDU-2970 > Project: Kudu > Issue Type: New Feature > Components: security >Affects Versions: 1.11.0 >Reporter: Hao Hao >Priority: Major > > With the completion of Kudu’s integration with Apache Sentry, fine-grained > authorization capabilities have been added to Kudu. However, because Apache > Ranger has wider adoption and provides more comprehensive security features > (such as attribute-based access control, auditing, etc.) than Sentry, it is > important for Kudu to also integrate with Ranger. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KUDU-2971) Add a generic Java library wrapper
[ https://issues.apache.org/jira/browse/KUDU-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao reassigned KUDU-2971: - Assignee: Hao Hao > Add a generic Java library wrapper > -- > > Key: KUDU-2971 > URL: https://issues.apache.org/jira/browse/KUDU-2971 > Project: Kudu > Issue Type: Sub-task >Affects Versions: 1.11.0 >Reporter: Hao Hao >Assignee: Hao Hao >Priority: Major > > For Ranger integration, to call the Java Ranger plugin from masters, we need to > create a wrapper (via a Java subprocess). This should be generic enough to be used by > future integrations (e.g. Atlas) which need to call other Java libraries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KUDU-2971) Add a generic Java library wrapper
Hao Hao created KUDU-2971: - Summary: Add a generic Java library wrapper Key: KUDU-2971 URL: https://issues.apache.org/jira/browse/KUDU-2971 Project: Kudu Issue Type: Sub-task Affects Versions: 1.11.0 Reporter: Hao Hao For Ranger integration, to call the Java Ranger plugin from masters, we need to create a wrapper (via a Java subprocess). This should be generic enough to be used by future integrations (e.g. Atlas) which need to call other Java libraries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
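For illustration, a very rough one-shot sketch of the "call a Java library from the master via a subprocess" idea. The jar name, the argv-based request, and the single-line response are invented for this example; the real wrapper would keep a long-lived Java process and a proper framed message protocol rather than one exec per call.
{noformat}
// Hypothetical sketch: run a (made-up) wrapper jar once, pass the request on
// the command line, and read its stdout as the response, using POSIX popen().
#include <cstdio>
#include <stdexcept>
#include <string>

std::string CallJavaWrapperOnce(const std::string& request) {
  // NOTE: the request is assumed to be shell-safe here; real code would pass
  // it over a pipe instead of the command line.
  const std::string cmd = "java -jar kudu-subprocess-wrapper.jar '" + request + "'";
  FILE* out = popen(cmd.c_str(), "r");
  if (out == nullptr) {
    throw std::runtime_error("failed to start the Java wrapper subprocess");
  }
  std::string response;
  char buf[4096];
  while (fgets(buf, sizeof(buf), out) != nullptr) {
    response += buf;
  }
  if (pclose(out) != 0) {
    throw std::runtime_error("Java wrapper subprocess exited with an error");
  }
  return response;
}
{noformat}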
[jira] [Created] (KUDU-2970) Fine-grained authorization with Ranger
Hao Hao created KUDU-2970: - Summary: Fine-grained authorization with Ranger Key: KUDU-2970 URL: https://issues.apache.org/jira/browse/KUDU-2970 Project: Kudu Issue Type: New Feature Components: security Affects Versions: 1.10.0 Reporter: Hao Hao With the completion of Kudu’s integration with Apache Sentry, fine-grained authorization capabilities have been added to Kudu. However, because Apache Ranger has wider adoption and provides more comprehensive security features (such as attribute-based access control, auditing, etc.) than Sentry, it is important for Kudu to also integrate with Ranger. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KUDU-2191) Hive Metastore Integration
[ https://issues.apache.org/jira/browse/KUDU-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao resolved KUDU-2191. --- Fix Version/s: 1.10.0 Resolution: Fixed > Hive Metastore Integration > -- > > Key: KUDU-2191 > URL: https://issues.apache.org/jira/browse/KUDU-2191 > Project: Kudu > Issue Type: New Feature > Components: server >Affects Versions: 1.5.0 >Reporter: Dan Burkert >Assignee: Hao Hao >Priority: Major > Fix For: 1.10.0 > > > In order to facilitate discovery of Kudu tables, as well as a shared table > namespace, Kudu should register its tables in the Hive Metastore. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (KUDU-2916) Admin.TestDumpMemTrackers is flaky in tsan
Hao Hao created KUDU-2916: - Summary: Admin.TestDumpMemTrackers is flaky in tsan Key: KUDU-2916 URL: https://issues.apache.org/jira/browse/KUDU-2916 Project: Kudu Issue Type: Bug Affects Versions: 1.10.0 Reporter: Hao Hao I saw a tsan failure for AdminCliTest.TestDumpMemTrackers with the following log: {noformat} /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/kudu-admin-test.cc:2162: Failure Value of: s.ok() Actual: false Expected: true Runtime error: /tmp/dist-test-taskUWtx7r/build/tsan/bin/kudu: process exited with non-zero status 66 stdout: {"id":"root","limit":-1,"current_consumption":481,"peak_consumption":481,"child_trackers":[{"id":"server","parent_id":"root","limit":-1,"current_consumption":313,"peak_consumption":313,"child_trackers":[{"id":"result-tracker","parent_id":"server","limit":-1,"current_consumption":0,"peak_consumption":0},{"id":"log_block_manager","parent_id":"server","limit":-1,"current_consumption":48,"peak_consumption":48},{"id":"tablet-9276b163452b4b0399ff2cae579f7251","parent_id":"server","limit":-1,"current_consumption":265,"peak_consumption":265,"child_trackers":[{"id":"DeltaMemStores","parent_id":"tablet-9276b163452b4b0399ff2cae579f7251","limit":-1,"current_consumption":0,"peak_consumption":0},{"id":"MemRowSet-0","parent_id":"tablet-9276b163452b4b0399ff2cae579f7251","limit":-1,"current_consumption":265,"peak_consumption":265},{"id":"txn_tracker","parent_id":"tablet-9276b163452b4b0399ff2cae579f7251","limit":67108864,"current_consumption":0,"peak_consumption":0}]}]},{"id":"ttl-cache-sharded_fifo_cache","parent_id":"root","limit":-1,"current_consumption":0,"peak_consumption":0},{"id":"code_cache-sharded_lru_cache","parent_id":"root","limit":-1,"current_consumption":0,"peak_consumption":0},{"id":"block_cache-sharded_lru_cache","parent_id":"root","limit":-1,"current_consumption":0,"peak_consumption":0},{"id":"lbm-sharded_lru_cache","parent_id":"root","limit":-1,"current_consumption":0,"peak_consumption":0},{"id":"log_cache","parent_id":"root","limit":1073741824,"current_consumption":168,"peak_consumption":168,"child_trackers":[{"id":"log_cache:457a3168758d4f4f8f4c59e8dd179cd3:9276b163452b4b0399ff2cae579f7251","parent_id":"log_cache","limit":10485760,"current_consumption":168,"peak_consumption":168}]}]} stderr: W0803 14:03:00.206982 8443 flags.cc:404] Enabled unsafe flag: --never_fsync=true W0803 14:03:00.849385 8443 thread.cc:599] rpc reactor (reactor) Time spent creating pthread: real 0.590s user 0.230s sys 0.360s W0803 14:03:00.849658 8443 thread.cc:566] rpc reactor (reactor) Time spent starting thread: real 0.591suser 0.230s sys 0.360s == WARNING: ThreadSanitizer: destroy of a locked mutex (pid=8443) #0 pthread_rwlock_destroy /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1313 (kudu+0x4bbb24) #1 glog_internal_namespace_::Mutex::~Mutex() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/glog-0.3.5/src/base/mutex.h:249:30 (libglog.so.0+0x16488) #2 cxa_at_exit_wrapper(void*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:386 (kudu+0x48beb3) and: #0 operator new(unsigned long) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_new_delete.cc:57 (kudu+0x52ae83) #1 __allocate /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/libcxx/include/new:228:10 
(libc++.so.1+0xd63f3) #2 allocate /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/libcxx/include/memory:1793 (libc++.so.1+0xd63f3) #3 allocate /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/libcxx/include/memory:1547 (libc++.so.1+0xd63f3) #4 __init /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/libcxx/include/string:1591 (libc++.so.1+0xd63f3) #5 std::__1::basic_string, std::__1::allocator >::basic_string(std::__1::basic_string, std::__1::allocator > const&) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/libcxx/include/string:1653 (libc++.so.1+0xd63f3) #6 std::__1::pair, std::__1::allocator > const, std::__1::pair >::pair(std::__1::pair, std::__1::allocator > const, std::__1::pair > const&) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/utility:324:5 (libprotobuf.so.14+0x188711) #7 void std::__1::allocator, std::__1::allocator >, std::__1::pair >, void*> >::construct, std::__1::allocator > const, std::__1::pair >, std::__1::pair, std::__1::allocator > const, std::__1::pair > c
[jira] [Created] (KUDU-2883) HMS check/fix tool should accommodate tables without ID for dropping orphan hms tables.
Hao Hao created KUDU-2883: - Summary: HMS check/fix tool should accommodate tables without ID for dropping orphan hms tables. Key: KUDU-2883 URL: https://issues.apache.org/jira/browse/KUDU-2883 Project: Kudu Issue Type: Bug Affects Versions: 1.10.0 Reporter: Hao Hao In cases where a table has no ID (created when HMS integration is disabled), the HMS check/fix tool should accommodate such cases for dropping orphan HMS tables (https://github.com/apache/kudu/blob/master/src/kudu/tools/tool_action_hms.cc#L616). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
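For illustration, a hypothetical sketch of the fallback this issue asks for: when an HMS entry carries no Kudu table ID property (tables created while the integration was disabled), match by table name before declaring it an orphan. The struct and helper names are invented for the sketch and do not correspond to the actual tool_action_hms.cc code.
{noformat}
// Hypothetical orphan check with a name-based fallback for legacy HMS entries.
#include <optional>
#include <set>
#include <string>

struct HmsTableEntry {
  std::string name;                          // e.g. "default.foo"
  std::optional<std::string> kudu_table_id;  // absent for legacy tables
};

bool IsOrphanHmsTable(const HmsTableEntry& entry,
                      const std::set<std::string>& kudu_table_ids,
                      const std::set<std::string>& kudu_table_names) {
  if (entry.kudu_table_id) {
    // Normal case: the HMS entry carries the Kudu table ID.
    return kudu_table_ids.count(*entry.kudu_table_id) == 0;
  }
  // Legacy case this issue is about: no ID, so fall back to the table name.
  return kudu_table_names.count(entry.name) == 0;
}
{noformat}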
[jira] [Updated] (KUDU-2880) TestSecurity is flaky
[ https://issues.apache.org/jira/browse/KUDU-2880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2880: -- Attachment: test-output.txt > TestSecurity is flaky > - > > Key: KUDU-2880 > URL: https://issues.apache.org/jira/browse/KUDU-2880 > Project: Kudu > Issue Type: Test >Reporter: Hao Hao >Priority: Major > Attachments: test-output.txt > > > A recent run of TestSecurity failed with the following error: > {noformat} > There was 1 failure: > 1) > testExternallyProvidedSubjectRefreshedExternally(org.apache.kudu.client.TestSecurity) > org.apache.kudu.client.NonRecoverableException: cannot complete before > timeout: KuduRpc(method=ListTabletServers, tablet=null, attempt=26, > TimeoutTracker(timeout=3, elapsed=29608), Traces: [0ms] refreshing cache > from master, [46ms] Sub RPC ConnectToMaster: sending RPC to server > master-127.0.202.126:46581, [63ms] Sub RPC ConnectToMaster: sending RPC to > server master-127.0.202.124:43241, [69ms] Sub RPC ConnectToMaster: received > response from server master-127.0.202.126:46581: Network error: Failed to > connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection > refused: /127.0.202.126:46581, [70ms] Sub RPC ConnectToMaster: sending RPC to > server master-127.0.202.125:43873, [250ms] Sub RPC ConnectToMaster: received > response from server master-127.0.202.125:43873: Network error: [peer > master-127.0.202.125:43873(127.0.202.125:43873)] unexpected exception from > downstream on [id: 0x2fae7299, /127.0.0.1:57014 => /127.0.202.125:43873], > [282ms] Sub RPC ConnectToMaster: received response from server > master-127.0.202.124:43241: OK, [336ms] delaying RPC due to: Service > unavailable: Master config > (127.0.202.126:46581,127.0.202.124:43241,127.0.202.125:43873) has no leader. > Exceptions received: org.apache.kudu.client.RecoverableException: Failed to > connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection > refused: /127.0.202.126:46581,org.apache.kudu.client.RecoverableException: > [peer master-127.0.202.125:43873(127.0.202.125:43873)] unexpected exception > from downstream on [id: 0x2fae7299, /127.0.0.1:57014 => > /127.0.202.125:43873], [357ms] refreshing cache from master, [358ms] Sub RPC > ConnectToMaster: sending RPC to server master-127.0.202.126:46581, [358ms] > Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.124:43241, > [360ms] Sub RPC ConnectToMaster: received response from server > master-127.0.202.126:46581: Network error: java.net.ConnectException: > Connection refused: /127.0.202.126:46581, [360ms] Sub RPC ConnectToMaster: > sending RPC to server master-127.0.202.125:43873, [361ms] Sub RPC > ConnectToMaster: received response from server master-127.0.202.125:43873: > Network error: Failed to connect to peer > master-127.0.202.125:43873(127.0.202.125:43873): Connection refused: > /127.0.202.125:43873, [363ms] Sub RPC ConnectToMaster: received response from > server master-127.0.202.124:43241: OK, [364ms] delaying RPC due to: Service > unavailable: Master config > (127.0.202.126:46581,127.0.202.124:43241,127.0.202.125:43873) has no leader. 
> Exceptions received: org.apache.kudu.client.RecoverableException: > java.net.ConnectException: Connection refused: > /127.0.202.126:46581,org.apache.kudu.client.RecoverableException: Failed to > connect to peer master-127.0.202.125:43873(127.0.202.125:43873): Connection > refused: /127.0.202.125:43873, [376ms] refreshing cache from master, [377ms] > Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.126:46581, > [377ms] Sub RPC ConnectToMaster: sending RPC to server > master-127.0.202.124:43241, [378ms] Sub RPC ConnectToMaster: sending RPC to > server master-127.0.202.125:43873, [379ms] Sub RPC ConnectToMaster: received > response from server master-127.0.202.126:46581: Network error: Failed to > connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection > refused: /127.0.202.126:46581, [381ms] Sub RPC ConnectToMaster: received > response from server master-127.0.202.125:43873: Network error: > java.net.ConnectException: Connection refused: /127.0.202.125:43873, [382ms] > Sub RPC ConnectToMaster: received response from server > master-127.0.202.124:43241: OK, [383ms] delaying RPC due to: Service > unavailable: Master config > (127.0.202.126:46581,127.0.202.124:43241,127.0.202.125:43873) has no leader. > Exceptions received: org.apache.kudu.client.RecoverableException: Failed to > connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection > refused: /127.0.202.126:46581,org.apache.kudu.client.RecoverableException: > java.net.ConnectException: Connection refused: /127.0.202.125:43873, [397ms] > refreshing cache from master, [397ms] Sub RPC ConnectToMaster: sending RPC to > se
[jira] [Created] (KUDU-2880) TestSecurity is flaky
Hao Hao created KUDU-2880: - Summary: TestSecurity is flaky Key: KUDU-2880 URL: https://issues.apache.org/jira/browse/KUDU-2880 Project: Kudu Issue Type: Test Reporter: Hao Hao A recent run of TestSecurity failed with the following error: {noformat} There was 1 failure: 1) testExternallyProvidedSubjectRefreshedExternally(org.apache.kudu.client.TestSecurity) org.apache.kudu.client.NonRecoverableException: cannot complete before timeout: KuduRpc(method=ListTabletServers, tablet=null, attempt=26, TimeoutTracker(timeout=3, elapsed=29608), Traces: [0ms] refreshing cache from master, [46ms] Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.126:46581, [63ms] Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.124:43241, [69ms] Sub RPC ConnectToMaster: received response from server master-127.0.202.126:46581: Network error: Failed to connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection refused: /127.0.202.126:46581, [70ms] Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.125:43873, [250ms] Sub RPC ConnectToMaster: received response from server master-127.0.202.125:43873: Network error: [peer master-127.0.202.125:43873(127.0.202.125:43873)] unexpected exception from downstream on [id: 0x2fae7299, /127.0.0.1:57014 => /127.0.202.125:43873], [282ms] Sub RPC ConnectToMaster: received response from server master-127.0.202.124:43241: OK, [336ms] delaying RPC due to: Service unavailable: Master config (127.0.202.126:46581,127.0.202.124:43241,127.0.202.125:43873) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: Failed to connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection refused: /127.0.202.126:46581,org.apache.kudu.client.RecoverableException: [peer master-127.0.202.125:43873(127.0.202.125:43873)] unexpected exception from downstream on [id: 0x2fae7299, /127.0.0.1:57014 => /127.0.202.125:43873], [357ms] refreshing cache from master, [358ms] Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.126:46581, [358ms] Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.124:43241, [360ms] Sub RPC ConnectToMaster: received response from server master-127.0.202.126:46581: Network error: java.net.ConnectException: Connection refused: /127.0.202.126:46581, [360ms] Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.125:43873, [361ms] Sub RPC ConnectToMaster: received response from server master-127.0.202.125:43873: Network error: Failed to connect to peer master-127.0.202.125:43873(127.0.202.125:43873): Connection refused: /127.0.202.125:43873, [363ms] Sub RPC ConnectToMaster: received response from server master-127.0.202.124:43241: OK, [364ms] delaying RPC due to: Service unavailable: Master config (127.0.202.126:46581,127.0.202.124:43241,127.0.202.125:43873) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: java.net.ConnectException: Connection refused: /127.0.202.126:46581,org.apache.kudu.client.RecoverableException: Failed to connect to peer master-127.0.202.125:43873(127.0.202.125:43873): Connection refused: /127.0.202.125:43873, [376ms] refreshing cache from master, [377ms] Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.126:46581, [377ms] Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.124:43241, [378ms] Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.125:43873, [379ms] Sub RPC ConnectToMaster: received response from server master-127.0.202.126:46581: Network error: Failed to connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection refused: /127.0.202.126:46581, [381ms] Sub RPC ConnectToMaster: received response from server master-127.0.202.125:43873: Network error: java.net.ConnectException: Connection refused: /127.0.202.125:43873, [382ms] Sub RPC ConnectToMaster: received response from server master-127.0.202.124:43241: OK, [383ms] delaying RPC due to: Service unavailable: Master config (127.0.202.126:46581,127.0.202.124:43241,127.0.202.125:43873) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: Failed to connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection refused: /127.0.202.126:46581,org.apache.kudu.client.RecoverableException: java.net.ConnectException: Connection refused: /127.0.202.125:43873, [397ms] refreshing cache from master, [397ms] Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.126:46581, [398ms] Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.124:43241, [399ms] Sub RPC ConnectToMaster: received response from server master-127.0.202.126:46581: Network error: Failed to connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection refused: /127.0.202.126:46581, [402ms] Sub RPC ConnectToM
[jira] [Commented] (KUDU-1702) Document/Implement read-your-writes for Impala/Spark etc.
[ https://issues.apache.org/jira/browse/KUDU-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865103#comment-16865103 ] Hao Hao commented on KUDU-1702: --- I think adoption of READ_YOUR_WRITES mode on the Impala side is still not done (see IMPALA-7184). For the Spark side, the approach of using READ_AT_SNAPSHOT mode to achieve read-your-writes semantics is done as part of KUDU-1454. > Document/Implement read-your-writes for Impala/Spark etc. > - > > Key: KUDU-1702 > URL: https://issues.apache.org/jira/browse/KUDU-1702 > Project: Kudu > Issue Type: Sub-task > Components: client, tablet, tserver >Affects Versions: 1.1.0 >Reporter: David Alves >Assignee: David Alves >Priority: Major > > Engines like Impala/Spark use many independent client instances, so we should > provide a way to have read-your-writes across many independent client > instances, which translates to providing a way to get linearizable behavior. > At first this can be done using the APIs that are already available. For > instance if the objective is to be sure to have the results of a write in a > following scan, the following steps can be taken: > - After a write the engine should collect the last observed timestamps from > kudu clients > - The engine's coordinator then takes the max of those timestamps, adds 1 and > uses that as a snapshot scan timestamp. > One important pre-requisite of the behavior above is that scans be done in > READ_AT_SNAPSHOT mode. Also the steps above currently don't actually > guarantee the expected behavior, but should once the current anomalies are > taken care of (as part of KUDU-430). > In the immediate future we'll add APIs to the Kudu client so as to make the > inner workings of getting this behavior oblivious to the engine. The steps > will still be the same, i.e. timestamps or timestamp tokens will still be > passed around, but the kudu client will encapsulate the choice of the > timestamp for the scan. > Later we will add a way to obtain this behavior without timestamp > propagation, either by doing a write-side commit-wait, where clients wait out > the clock error after/during the last write thus making sure any future > operation will have a higher timestamp; or by making read-side commit wait, > where we provide an api on the kudu client for the engine to perform a > similar call before the scan call to obtain a scan timestamp. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
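For illustration, a sketch of the timestamp-propagation recipe from the description, written against the C++ client ({{GetLatestObservedTimestamp()}}, {{SetLatestObservedTimestamp()}}, {{SetSnapshotRaw()}} and READ_AT_SNAPSHOT are existing client APIs). Error handling and the scan loop are reduced to a skeleton, and the engine-side plumbing that moves timestamps between processes is assumed, not shown.
{noformat}
// Sketch of read-your-writes across independent clients via timestamp propagation.
#include <algorithm>
#include <cstdint>
#include <vector>

#include "kudu/client/client.h"
#include "kudu/util/status.h"

using kudu::client::KuduClient;
using kudu::client::KuduScanner;
using kudu::client::KuduTable;
using kudu::client::sp::shared_ptr;

// Step 1 (after the writes): each writer reports client->GetLatestObservedTimestamp().
// Step 2: the coordinator takes the max of those timestamps and adds 1.
uint64_t ChooseScanTimestamp(const std::vector<uint64_t>& observed_timestamps) {
  uint64_t max_ts = 0;
  for (uint64_t ts : observed_timestamps) max_ts = std::max(max_ts, ts);
  return max_ts + 1;
}

// Step 3: the reading client scans at that snapshot timestamp.
kudu::Status ScanAtTimestamp(const shared_ptr<KuduClient>& client,
                             const shared_ptr<KuduTable>& table,
                             uint64_t snapshot_ts) {
  // Propagating the timestamp also moves this client's view of time forward.
  client->SetLatestObservedTimestamp(snapshot_ts);
  KuduScanner scanner(table.get());
  KUDU_RETURN_NOT_OK(scanner.SetReadMode(KuduScanner::READ_AT_SNAPSHOT));
  KUDU_RETURN_NOT_OK(scanner.SetSnapshotRaw(snapshot_ts));
  KUDU_RETURN_NOT_OK(scanner.Open());
  // ... iterate with scanner.NextBatch() as usual ...
  return kudu::Status::OK();
}
{noformat}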
[jira] [Resolved] (KUDU-1498) Add support to Java client for read-your-writes consistency
[ https://issues.apache.org/jira/browse/KUDU-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao resolved KUDU-1498. --- Resolution: Duplicate Fix Version/s: n/a > Add support to Java client for read-your-writes consistency > --- > > Key: KUDU-1498 > URL: https://issues.apache.org/jira/browse/KUDU-1498 > Project: Kudu > Issue Type: Sub-task > Components: client >Reporter: Mike Percy >Priority: Major > Fix For: n/a > > > The Java client could use a mode called "read your writes" consistency where > we ensure that we read whatever the leader has committed at the time of the > request. > At the time of writing, the implementation requirements look like the > following: > * Always scan from the leader > * Specify that the leader must apply all operations from previous leaders > before processing the query > In the C++ client, this can be achieved by specifying both of the LEADER_ONLY > and READ_AT_SNAPSHOT options, while not specifying a timestamp to use for the > snapshot when starting the scan. > In the Java client API, we may want to simply expose a scan option called > "read your writes" or something similar. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KUDU-1499) Add friendly C++ API for read-your-writes consistency
[ https://issues.apache.org/jira/browse/KUDU-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao resolved KUDU-1499. --- Resolution: Duplicate Fix Version/s: n/a > Add friendly C++ API for read-your-writes consistency > - > > Key: KUDU-1499 > URL: https://issues.apache.org/jira/browse/KUDU-1499 > Project: Kudu > Issue Type: Sub-task > Components: client >Affects Versions: 0.9.0 >Reporter: Mike Percy >Priority: Major > Fix For: n/a > > > At the time of writing, in order to get read-your-writes consistency in the > C++ client, one must jump through hoops such as specifying LEADER_ONLY + > READ_AT_SNAPSHOT while *not* specifying a timestamp to use for the snapshot. > It would be more friendly to expose a simple API flag or option that enables > a "read your writes" consistency mode. > Another benefit to this approach is that we can change the implementation > later if we come up with a more clever or scalable way of implementing the > underlying consistency mode, such as something involving the use of > timestamps. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
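For illustration, the two approaches side by side in the C++ client: the "hoops" described above (LEADER_ONLY selection plus READ_AT_SNAPSHOT with no explicit snapshot timestamp) and the dedicated READ_YOUR_WRITES read mode later added under KUDU-1704 (see the comment further below). A sketch rather than a complete scan example.
{noformat}
#include "kudu/client/client.h"
#include "kudu/util/status.h"

using kudu::client::KuduClient;
using kudu::client::KuduScanner;

// Before KUDU-1704: combine leader-only replica selection with a snapshot read
// and no explicit snapshot timestamp.
kudu::Status OpenReadYourWritesScannerOldWay(KuduScanner* scanner) {
  KUDU_RETURN_NOT_OK(scanner->SetSelection(KuduClient::LEADER_ONLY));
  KUDU_RETURN_NOT_OK(scanner->SetReadMode(KuduScanner::READ_AT_SNAPSHOT));
  return scanner->Open();
}

// After KUDU-1704: a single read mode expresses the intent directly.
kudu::Status OpenReadYourWritesScanner(KuduScanner* scanner) {
  KUDU_RETURN_NOT_OK(scanner->SetReadMode(KuduScanner::READ_YOUR_WRITES));
  return scanner->Open();
}
{noformat}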
[jira] [Commented] (KUDU-1498) Add support to Java client for read-your-writes consistency
[ https://issues.apache.org/jira/browse/KUDU-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865097#comment-16865097 ] Hao Hao commented on KUDU-1498: --- Yeah, this should be covered by KUDU-1704. > Add support to Java client for read-your-writes consistency > --- > > Key: KUDU-1498 > URL: https://issues.apache.org/jira/browse/KUDU-1498 > Project: Kudu > Issue Type: Sub-task > Components: client >Reporter: Mike Percy >Priority: Major > > The Java client could use a mode called "read your writes" consistency where > we ensure that we read whatever the leader has committed at the time of the > request. > At the time of writing, the implementation requirements look like the > following: > * Always scan from the leader > * Specify that the leader must apply all operations from previous leaders > before processing the query > In the C++ client, this can be achieved by specifying both of the LEADER_ONLY > and READ_AT_SNAPSHOT options, while not specifying a timestamp to use for the > snapshot when starting the scan. > In the Java client API, we may want to simply expose a scan option called > "read your writes" or something similar. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-1499) Add friendly C++ API for read-your-writes consistency
[ https://issues.apache.org/jira/browse/KUDU-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865096#comment-16865096 ] Hao Hao commented on KUDU-1499: --- I think this is a duplicate of KUDU-1704, which has been fixed. > Add friendly C++ API for read-your-writes consistency > - > > Key: KUDU-1499 > URL: https://issues.apache.org/jira/browse/KUDU-1499 > Project: Kudu > Issue Type: Sub-task > Components: client >Affects Versions: 0.9.0 >Reporter: Mike Percy >Priority: Major > > At the time of writing, in order to get read-your-writes consistency in the > C++ client, one must jump through hoops such as specifying LEADER_ONLY + > READ_AT_SNAPSHOT while *not* specifying a timestamp to use for the snapshot. > It would be more friendly to expose a simple API flag or option that enables > a "read your writes" consistency mode. > Another benefit to this approach is that we can change the implementation > later if we come up with a more clever or scalable way of implementing the > underlying consistency mode, such as something involving the use of > timestamps. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KUDU-2590) Master access control enforcement of CREATE/ALTER/DROP table operations
[ https://issues.apache.org/jira/browse/KUDU-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao resolved KUDU-2590. --- Resolution: Fixed Fix Version/s: 1.10.0 > Master access control enforcement of CREATE/ALTER/DROP table operations > --- > > Key: KUDU-2590 > URL: https://issues.apache.org/jira/browse/KUDU-2590 > Project: Kudu > Issue Type: Sub-task > Components: master >Affects Versions: 1.7.1 >Reporter: Dan Burkert >Assignee: Hao Hao >Priority: Major > Fix For: 1.10.0 > > > As described in the 'Master RPC Authorization' section of the [design > doc.|https://docs.google.com/document/d/1SEBtgWwBFqij5CuCZwhOqDNSDVViC0WERq6RzsPCWjQ/edit] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KUDU-2542) Fill-out AuthzToken definition
[ https://issues.apache.org/jira/browse/KUDU-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao resolved KUDU-2542. --- Resolution: Fixed Fix Version/s: 1.10.0 > Fill-out AuthzToken definition > -- > > Key: KUDU-2542 > URL: https://issues.apache.org/jira/browse/KUDU-2542 > Project: Kudu > Issue Type: Sub-task > Components: security >Affects Versions: 1.8.0 >Reporter: Dan Burkert >Assignee: Andrew Wong >Priority: Major > Fix For: 1.10.0 > > > As part of the Sentry integration, it will be necessary to flesh out the > [AuthzTokenPB|https://github.com/apache/kudu/blob/master/src/kudu/security/token.proto#L28] > structure with relevant fields: > # The ID of the table which the token applies to > # The username which the attached privileges belong to > # The privileges > Sentry has its own privilege format > [TSentryPrivilege|https://github.com/apache/sentry/blob/master/sentry-service/sentry-service-api/src/main/resources/sentry_policy_service.thrift#L47-L58], > but we'll probably want to convert this into our own internal Protobuf-based > format for the following reasons: > # The tokens will be used in the tablet servers to authorize client actions. > Currently tablet servers don't use or link to Thrift libraries. > # The Sentry privilege structure references columns by name, whereas we will > need to reference columns by ID in order to be robust to columns being > renamed. > # Having our own format will make it easier to drop in alternate > authorization providers in the future. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-2542) Fill-out AuthzToken definition
[ https://issues.apache.org/jira/browse/KUDU-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860201#comment-16860201 ] Hao Hao commented on KUDU-2542: --- This is done in a series of commits. > Fill-out AuthzToken definition > -- > > Key: KUDU-2542 > URL: https://issues.apache.org/jira/browse/KUDU-2542 > Project: Kudu > Issue Type: Sub-task > Components: security >Affects Versions: 1.8.0 >Reporter: Dan Burkert >Assignee: Andrew Wong >Priority: Major > > As part of the Sentry integration, it will be necessary to flesh out the > [AuthzTokenPB|https://github.com/apache/kudu/blob/master/src/kudu/security/token.proto#L28] > structure with relevant fields: > # The ID of the table which the token applies to > # The username which the attached privileges belong to > # The privileges > Sentry has its own privilege format > [TSentryPrivilege|https://github.com/apache/sentry/blob/master/sentry-service/sentry-service-api/src/main/resources/sentry_policy_service.thrift#L47-L58], > but we'll probably want to convert this into our own internal Protobuf-based > format for the following reasons: > # The tokens will be used in the tablet servers to authorize client actions. > Currently tablet servers don't use or link to Thrift libraries. > # The Sentry privilege structure references columns by name, whereas we will > need to reference columns by ID in order to be robust to columns being > renamed. > # Having our own format will make it easier to drop in alternate > authorization providers in the future. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-2590) Master access control enforcement of CREATE/ALTER/DROP table operations
[ https://issues.apache.org/jira/browse/KUDU-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860200#comment-16860200 ] Hao Hao commented on KUDU-2590: --- This is done in a series of commits. > Master access control enforcement of CREATE/ALTER/DROP table operations > --- > > Key: KUDU-2590 > URL: https://issues.apache.org/jira/browse/KUDU-2590 > Project: Kudu > Issue Type: Sub-task > Components: master >Affects Versions: 1.7.1 >Reporter: Dan Burkert >Assignee: Hao Hao >Priority: Major > > As described in the 'Master RPC Authorization' section of the [design > doc.|https://docs.google.com/document/d/1SEBtgWwBFqij5CuCZwhOqDNSDVViC0WERq6RzsPCWjQ/edit] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-2541) server-side Sentry Client
[ https://issues.apache.org/jira/browse/KUDU-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16857981#comment-16857981 ] Hao Hao commented on KUDU-2541: --- Committed in 14f3e6f60 and ecc4998cb > server-side Sentry Client > - > > Key: KUDU-2541 > URL: https://issues.apache.org/jira/browse/KUDU-2541 > Project: Kudu > Issue Type: Sub-task > Components: server >Affects Versions: 1.8.0 >Reporter: Dan Burkert >Assignee: Dan Burkert >Priority: Major > > As part of the Sentry integration, it will be necessary to have a Sentry > client which can be used by the Kudu master server. This will require > effectively re-implementing the existing Sentry client (plugin) in C++, or at > least the parts of it which we need to authorize operations in Kudu. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KUDU-2541) server-side Sentry Client
[ https://issues.apache.org/jira/browse/KUDU-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao resolved KUDU-2541. --- Resolution: Fixed Fix Version/s: 1.10.0 > server-side Sentry Client > - > > Key: KUDU-2541 > URL: https://issues.apache.org/jira/browse/KUDU-2541 > Project: Kudu > Issue Type: Sub-task > Components: server >Affects Versions: 1.8.0 >Reporter: Dan Burkert >Assignee: Dan Burkert >Priority: Major > Fix For: 1.10.0 > > > As part of the Sentry integration, it will be necessary to have a Sentry > client which can be used by the Kudu master server. This will require > effectively re-implementing the existing Sentry client (plugin) in C++, or at > least the parts of it which we need to authorize operations in Kudu. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-2557) Sometimes the rebalancer-related tests are running for too long
[ https://issues.apache.org/jira/browse/KUDU-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16847733#comment-16847733 ] Hao Hao commented on KUDU-2557: --- Ah, ok, thanks for catching that! > Sometimes the rebalancer-related tests are running for too long > -- > > Key: KUDU-2557 > URL: https://issues.apache.org/jira/browse/KUDU-2557 > Project: Kudu > Issue Type: Bug > Components: CLI, test >Affects Versions: 1.8.0 >Reporter: Alexey Serbin >Assignee: Alexey Serbin >Priority: Minor > Labels: CLI, flaky-test, rebalance, test > Fix For: 1.9.0 > > Attachments: kudu-admin-test.2.txt > > > The rebalancer-related tests in {{kudu-admin-test}} sometimes get wild and > run for too long. That's observed in RELEASE builds at least: > {noformat} > ConcurrentRebalancersTest.TwoConcurrentRebalancers/1: test_main.cc:63] > Maximum unit test time exceeded (900 sec) > {noformat} > {noformat} > TserverGoesDownDuringRebalancingTest.TserverDown/1: test_main.cc:63] Maximum > unit test time exceeded (900 sec) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KUDU-2557) Sometimes the rebalancer-related tests are running for too long
[ https://issues.apache.org/jira/browse/KUDU-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2557: -- Attachment: kudu-admin-test.2.txt > Sometimes the rebalancer-related tests are running for too long > -- > > Key: KUDU-2557 > URL: https://issues.apache.org/jira/browse/KUDU-2557 > Project: Kudu > Issue Type: Bug > Components: CLI, test >Affects Versions: 1.8.0 >Reporter: Alexey Serbin >Assignee: Alexey Serbin >Priority: Minor > Labels: CLI, flaky-test, rebalance, test > Fix For: 1.9.0 > > Attachments: kudu-admin-test.2.txt > > > The rebalancer-related tests in {{kudu-admin-test}} sometimes get wild and > run for too long. That's observed in RELEASE builds at least: > {noformat} > ConcurrentRebalancersTest.TwoConcurrentRebalancers/1: test_main.cc:63] > Maximum unit test time exceeded (900 sec) > {noformat} > {noformat} > TserverGoesDownDuringRebalancingTest.TserverDown/1: test_main.cc:63] Maximum > unit test time exceeded (900 sec) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-2557) Sometimes the rebalancer-related tests are running for too long
[ https://issues.apache.org/jira/browse/KUDU-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16847158#comment-16847158 ] Hao Hao commented on KUDU-2557: --- I think I saw another instance of this error with a debug build. Attached the log > Sometimes the rebalancer-related tests are running for too long > -- > > Key: KUDU-2557 > URL: https://issues.apache.org/jira/browse/KUDU-2557 > Project: Kudu > Issue Type: Bug > Components: CLI, test >Affects Versions: 1.8.0 >Reporter: Alexey Serbin >Assignee: Alexey Serbin >Priority: Minor > Labels: CLI, flaky-test, rebalance, test > Fix For: 1.9.0 > > Attachments: kudu-admin-test.2.txt > > > The rebalancer-related tests in {{kudu-admin-test}} sometimes get wild and > run for too long. That's observed in RELEASE builds at least: > {noformat} > ConcurrentRebalancersTest.TwoConcurrentRebalancers/1: test_main.cc:63] > Maximum unit test time exceeded (900 sec) > {noformat} > {noformat} > TserverGoesDownDuringRebalancingTest.TserverDown/1: test_main.cc:63] Maximum > unit test time exceeded (900 sec) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KUDU-2779) MasterStressTest is flaky when HMS is enabled
[ https://issues.apache.org/jira/browse/KUDU-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2779: -- Description: Encountered failure in master-stress-test.cc when HMS integration is enabled: {noformat} 22:30:11.487 [HMS - ERROR - pool-8-thread-2] (HiveAlterHandler.java:341) Failed to alter table default.table_1529084adeeb48719dd0a1d18572b357 22:30:11.494 [HMS - ERROR - pool-8-thread-3] (HiveAlterHandler.java:341) Failed to alter table default.table_4657eb1f8bbe4b60b03db2cbf07803a3 22:30:11.506 [HMS - ERROR - pool-8-thread-2] (RetryingHMSHandler.java:200) MetaException(message:java.lang.IllegalStateException: Event not set up correctly) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6189) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:4063) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:4020) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) at com.sun.proxy.$Proxy24.alter_table_with_environment_context(Unknown Source) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:11631) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:11615) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:103) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.IllegalStateException: Event not set up correctly at org.apache.hadoop.hive.metastore.messaging.AlterTableMessage.checkValid(AlterTableMessage.java:49) at org.apache.hadoop.hive.metastore.messaging.json.JSONAlterTableMessage.(JSONAlterTableMessage.java:57) at org.apache.hadoop.hive.metastore.messaging.json.JSONMessageFactory.buildAlterTableMessage(JSONMessageFactory.java:115) at org.apache.hive.hcatalog.listener.DbNotificationListener.onAlterTable(DbNotificationListener.java:187) at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$8.notify(MetaStoreListenerNotifier.java:107) at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:175) at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:205) at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:317) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:4049) ... 
16 more Caused by: org.apache.thrift.protocol.TProtocolException: Unexpected character:{ at org.apache.thrift.protocol.TJSONProtocol.readJSONSyntaxChar(TJSONProtocol.java:337) at org.apache.thrift.protocol.TJSONProtocol$JSONPairContext.read(TJSONProtocol.java:246) at org.apache.thrift.protocol.TJSONProtocol.readJSONObjectStart(TJSONProtocol.java:793) at org.apache.thrift.protocol.TJSONProtocol.readStructBegin(TJSONProtocol.java:840) at org.apache.hadoop.hive.metastore.api.Table$TableStandardScheme.read(Table.java:1577) at org.apache.hadoop.hive.metastore.api.Table$TableStandardScheme.read(Table.java:1573) at org.apache.hadoop.hive.metastore.api.Table.read(Table.java:1407) at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:81) at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:67) at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:98) at org.apache.hadoop.hive.metastore.messaging.json.JSONMessageFactory.getTObj(JSONMessageFactory.java:270) at org.apache.hadoop.hive.metastore.messaging.json.JSONAlterTableMessage.getTableObjAfter(JSONAlterTableMessage.java:97) at org.apache.hadoop.hive.metastore.messaging.AlterTableMessage.checkValid(AlterTableMessage.java:41) .
[jira] [Resolved] (KUDU-2804) HmsSentryConfigurations/MasterStressTest is flaky
[ https://issues.apache.org/jira/browse/KUDU-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao resolved KUDU-2804. --- Resolution: Duplicate Fix Version/s: n/a > HmsSentryConfigurations/MasterStressTest is flaky > - > > Key: KUDU-2804 > URL: https://issues.apache.org/jira/browse/KUDU-2804 > Project: Kudu > Issue Type: Bug > Components: hms, master, test >Affects Versions: 1.10.0 >Reporter: Alexey Serbin >Priority: Major > Fix For: n/a > > Attachments: master-stress-test.1.txt.xz > > > The {{HmsSentryConfigurations/MasterStressTest}} seems to be a bit flaky if > running via dist-test with {{--stress_cpu_threads=16}}. A snippet of the > {{master-stress-test}} binary's output (DEBUG build) is below. It seems the > configuration was > {{ HmsMode::ENABLE_METASTORE_INTEGRATION, SentryMode::DISABLED }}. Also, I'm > attaching a full log. > {noformat} > I0426 04:05:42.127689 497 rpcz_store.cc:269] Call > kudu.master.MasterService.AlterTable from 127.0.0.1:39526 (request call id > 112) took 2593ms. Request Metrics: > {"HiveMetastore.queue_time_us":1457223,"Hive > Metastore.run_cpu_time_us":631,"HiveMetastore.run_wall_time_us":98149} > F0426 04:05:42.132196 968 master-stress-test.cc:293] Check failed: _s.ok() > Bad status: Remote error: failed to alter Hive MetaStore table: TException - > service has thrown: MetaException > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-2804) HmsSentryConfigurations/MasterStressTest is flaky
[ https://issues.apache.org/jira/browse/KUDU-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827242#comment-16827242 ] Hao Hao commented on KUDU-2804: --- [~aserbin] I think this is tracked in KUDU-2779, which is due to HIVE-19874. > HmsSentryConfigurations/MasterStressTest is flaky > - > > Key: KUDU-2804 > URL: https://issues.apache.org/jira/browse/KUDU-2804 > Project: Kudu > Issue Type: Bug > Components: hms, master, test >Affects Versions: 1.10.0 >Reporter: Alexey Serbin >Priority: Major > Attachments: master-stress-test.1.txt.xz > > > The {{HmsSentryConfigurations/MasterStressTest}} seems to be a bit flaky if > running via dist-test with {{--stress_cpu_threads=16}}. A snippet of the > {{master-stress-test}} binary's output (DEBUG build) is below. It seems the > configuration was > {{ HmsMode::ENABLE_METASTORE_INTEGRATION, SentryMode::DISABLED }}. Also, I'm > attaching a full log. > {noformat} > I0426 04:05:42.127689 497 rpcz_store.cc:269] Call > kudu.master.MasterService.AlterTable from 127.0.0.1:39526 (request call id > 112) took 2593ms. Request Metrics: > {"HiveMetastore.queue_time_us":1457223,"Hive > Metastore.run_cpu_time_us":631,"HiveMetastore.run_wall_time_us":98149} > F0426 04:05:42.132196 968 master-stress-test.cc:293] Check failed: _s.ok() > Bad status: Remote error: failed to alter Hive MetaStore table: TException - > service has thrown: MetaException > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KUDU-2784) MasterSentryTest.TestTableOwnership is flaky
[ https://issues.apache.org/jira/browse/KUDU-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao reassigned KUDU-2784: - Assignee: Hao Hao > MasterSentryTest.TestTableOwnership is flaky > > > Key: KUDU-2784 > URL: https://issues.apache.org/jira/browse/KUDU-2784 > Project: Kudu > Issue Type: Test >Reporter: Hao Hao >Assignee: Hao Hao >Priority: Major > Attachments: master_sentry-itest.2.txt > > > Encountered a failure in with the following error: > {noformat} > W0423 04:49:43.773183 1862 sentry_authz_provider.cc:269] Action on > table with authorizable scope is not permitted for > user > I0423 04:49:43.773447 1862 rpcz_store.cc:269] Call > kudu.master.MasterService.DeleteTable from 127.0.0.1:44822 (request call id > 6) took 2093ms. Request Metrics: > {"Sentry.queue_time_us":33,"Sentry.run_cpu_time_us":390,"Sentry.run_wall_time_us":18856} > /home/jenkins-slave/workspace/kudu-master/1/src/kudu/integration-tests/master_sentry-itest.cc:446: > Failure > Failed > Bad status: Not authorized: unauthorized action > {noformat} > This could be owner privilege hasn't reflected yet for ? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KUDU-2784) MasterSentryTest.TestTableOwnership is flaky
Hao Hao created KUDU-2784: - Summary: MasterSentryTest.TestTableOwnership is flaky Key: KUDU-2784 URL: https://issues.apache.org/jira/browse/KUDU-2784 Project: Kudu Issue Type: Test Reporter: Hao Hao Attachments: master_sentry-itest.2.txt Encountered a failure with the following error: {noformat} W0423 04:49:43.773183 1862 sentry_authz_provider.cc:269] Action on table with authorizable scope is not permitted for user I0423 04:49:43.773447 1862 rpcz_store.cc:269] Call kudu.master.MasterService.DeleteTable from 127.0.0.1:44822 (request call id 6) took 2093ms. Request Metrics: {"Sentry.queue_time_us":33,"Sentry.run_cpu_time_us":390,"Sentry.run_wall_time_us":18856} /home/jenkins-slave/workspace/kudu-master/1/src/kudu/integration-tests/master_sentry-itest.cc:446: Failure Failed Bad status: Not authorized: unauthorized action {noformat} This could be because the owner privilege hasn't been reflected yet for ? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
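The suspicion above is that the DeleteTable call was authorized before the table-ownership privilege had propagated, so the test saw a transient "Not authorized" error. The usual way to deflake such a timing-dependent check is to poll until the condition holds or a deadline expires, similar in spirit to Kudu's ASSERT_EVENTUALLY test helper. Below is a minimal, self-contained sketch of that polling pattern; the WaitFor helper, its parameters, and the commented usage are illustrative assumptions, not the actual Kudu test API.

{code:cpp}
#include <chrono>
#include <functional>
#include <thread>

// Polls 'condition' until it returns true or 'timeout' elapses, sleeping
// 'poll_interval' between attempts. Returns whether the condition was
// eventually satisfied. Illustrative only; mirrors the shape of a
// poll-until-deadline test helper rather than any real Kudu API.
bool WaitFor(const std::function<bool()>& condition,
             std::chrono::milliseconds timeout,
             std::chrono::milliseconds poll_interval = std::chrono::milliseconds(50)) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (std::chrono::steady_clock::now() < deadline) {
    if (condition()) {
      return true;
    }
    std::this_thread::sleep_for(poll_interval);
  }
  return condition();  // One final check at the deadline.
}

// Hypothetical usage in the test: wait for the owner privilege to be
// visible before issuing DeleteTable.
//   bool ready = WaitFor([&] { return OwnerCanDeleteTable(); },
//                        std::chrono::seconds(30));
{code}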
[jira] [Created] (KUDU-2779) MasterStressTest is flaky when HMS is enabled
Hao Hao created KUDU-2779: - Summary: MasterStressTest is flaky when HMS is enabled Key: KUDU-2779 URL: https://issues.apache.org/jira/browse/KUDU-2779 Project: Kudu Issue Type: Test Reporter: Hao Hao Encountered failure in master-stress-test.cc when HMS integration is enabled: {noformat} 22:30:11.487 [HMS - ERROR - pool-8-thread-2] (HiveAlterHandler.java:341) Failed to alter table default.table_1529084adeeb48719dd0a1d18572b357 22:30:11.494 [HMS - ERROR - pool-8-thread-3] (HiveAlterHandler.java:341) Failed to alter table default.table_4657eb1f8bbe4b60b03db2cbf07803a3 22:30:11.506 [HMS - ERROR - pool-8-thread-2] (RetryingHMSHandler.java:200) MetaException(message:java.lang.IllegalStateException: Event not set up correctly) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6189) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:4063) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:4020) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) at com.sun.proxy.$Proxy24.alter_table_with_environment_context(Unknown Source) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:11631) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:11615) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:103) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.IllegalStateException: Event not set up correctly at org.apache.hadoop.hive.metastore.messaging.AlterTableMessage.checkValid(AlterTableMessage.java:49) at org.apache.hadoop.hive.metastore.messaging.json.JSONAlterTableMessage.(JSONAlterTableMessage.java:57) at org.apache.hadoop.hive.metastore.messaging.json.JSONMessageFactory.buildAlterTableMessage(JSONMessageFactory.java:115) at org.apache.hive.hcatalog.listener.DbNotificationListener.onAlterTable(DbNotificationListener.java:187) at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$8.notify(MetaStoreListenerNotifier.java:107) at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:175) at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:205) at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:317) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:4049) ... 
16 more Caused by: org.apache.thrift.protocol.TProtocolException: Unexpected character:{ at org.apache.thrift.protocol.TJSONProtocol.readJSONSyntaxChar(TJSONProtocol.java:337) at org.apache.thrift.protocol.TJSONProtocol$JSONPairContext.read(TJSONProtocol.java:246) at org.apache.thrift.protocol.TJSONProtocol.readJSONObjectStart(TJSONProtocol.java:793) at org.apache.thrift.protocol.TJSONProtocol.readStructBegin(TJSONProtocol.java:840) at org.apache.hadoop.hive.metastore.api.Table$TableStandardScheme.read(Table.java:1577) at org.apache.hadoop.hive.metastore.api.Table$TableStandardScheme.read(Table.java:1573) at org.apache.hadoop.hive.metastore.api.Table.read(Table.java:1407) at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:81) at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:67) at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:98) at org.apache.hadoop.hive.metastore.messaging.json.JSONMessageFactory.getTObj(JSONMessageFactory.java:270) at org.apache.hadoop.hive.metastore.messaging.json.JSONAlterTableMessage.getTableObjAfter(JSONAlterTableMessage.java:97)
[jira] [Updated] (KUDU-2652) TsRecoveryITest.TestNoBlockIDReuseIfMissingBlocks potentially flaky
[ https://issues.apache.org/jira/browse/KUDU-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2652: -- Attachment: ts_recovery-itest.txt > TsRecoveryITest.TestNoBlockIDReuseIfMissingBlocks potentially flaky > --- > > Key: KUDU-2652 > URL: https://issues.apache.org/jira/browse/KUDU-2652 > Project: Kudu > Issue Type: Bug > Components: test >Reporter: Mike Percy >Assignee: Andrew Wong >Priority: Major > Attachments: ts_recovery-itest.txt, ts_recovery-itest.txt.gz > > > This test failed for me in a Gerrit pre-commit run with an unrelated change @ > [http://jenkins.kudu.apache.org/job/kudu-gerrit/15885] > The error was: > {code:java} > /home/jenkins-slave/workspace/kudu-master/3/src/kudu/integration-tests/ts_recovery-itest.cc:298: > Failure > Value of: !orphaned_block_ids.empty() > Actual: false > Expected: true > /home/jenkins-slave/workspace/kudu-master/3/src/kudu/util/test_util.cc:323: > Failure > Failed > Timed out waiting for assertion to pass. > {code} > I am attaching the error log. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-2652) TsRecoveryITest.TestNoBlockIDReuseIfMissingBlocks potentially flaky
[ https://issues.apache.org/jira/browse/KUDU-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813006#comment-16813006 ] Hao Hao commented on KUDU-2652: --- Is this fixed in commit 114792116? Somehow I am still seeing this error: {noformat} data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/integration-tests/ts_recovery-itest.cc:202: Failure Value of: !orphaned_block_ids.empty() Actual: false Expected: true /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/test_util.cc:308: Failure Failed Timed out waiting for assertion to pass. {noformat} Attached the full log. > TsRecoveryITest.TestNoBlockIDReuseIfMissingBlocks potentially flaky > --- > > Key: KUDU-2652 > URL: https://issues.apache.org/jira/browse/KUDU-2652 > Project: Kudu > Issue Type: Bug > Components: test >Reporter: Mike Percy >Assignee: Andrew Wong >Priority: Major > Attachments: ts_recovery-itest.txt.gz > > > This test failed for me in a Gerrit pre-commit run with an unrelated change @ > [http://jenkins.kudu.apache.org/job/kudu-gerrit/15885] > The error was: > {code:java} > /home/jenkins-slave/workspace/kudu-master/3/src/kudu/integration-tests/ts_recovery-itest.cc:298: > Failure > Value of: !orphaned_block_ids.empty() > Actual: false > Expected: true > /home/jenkins-slave/workspace/kudu-master/3/src/kudu/util/test_util.cc:323: > Failure > Failed > Timed out waiting for assertion to pass. > {code} > I am attaching the error log. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KUDU-2767) Java test TestAuthTokenReacquire is flaky
Hao Hao created KUDU-2767: - Summary: Java test TestAuthTokenReacquire is flaky Key: KUDU-2767 URL: https://issues.apache.org/jira/browse/KUDU-2767 Project: Kudu Issue Type: Bug Components: test Reporter: Hao Hao Attachments: test-output.txt I saw TestAuthTokenReacquire failed with the following error: {noformat} Time: 23.362 There was 1 failure: 1) testBasicMasterOperations(org.apache.kudu.client.TestAuthTokenReacquire) java.lang.AssertionError: test failed: unexpected errors at org.junit.Assert.fail(Assert.java:88) at org.apache.kudu.client.TestAuthTokenReacquire.testBasicMasterOperations(TestAuthTokenReacquire.java:153) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at org.apache.kudu.test.junit.RetryRule$RetryStatement.doOneAttempt(RetryRule.java:195) at org.apache.kudu.test.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:212) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runners.Suite.runChild(Suite.java:128) at org.junit.runners.Suite.runChild(Suite.java:27) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at org.junit.runner.JUnitCore.run(JUnitCore.java:115) at org.junit.runner.JUnitCore.runMain(JUnitCore.java:77) at org.junit.runner.JUnitCore.main(JUnitCore.java:36) FAILURES!!! Tests run: 2, Failures: 1 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KUDU-2765) tsan failure in ToolTest.TestLoadgenAutoFlushBackgroundRandom
Hao Hao created KUDU-2765: - Summary: tsan failure in ToolTest.TestLoadgenAutoFlushBackgroundRandom Key: KUDU-2765 URL: https://issues.apache.org/jira/browse/KUDU-2765 Project: Kudu Issue Type: Test Reporter: Hao Hao Attachments: kudu-tool-test.0.txt ToolTest.TestLoadgenAutoFlushBackgroundRandom failed with the following error in tsan {noformat} == WARNING: ThreadSanitizer: destroy of a locked mutex (pid=1076) #0 pthread_rwlock_destroy /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1313 (kudu+0x4b4474) #1 glog_internal_namespace_::Mutex::~Mutex() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/glog-0.3.5/src/base/mutex.h:249:30 (libglog.so.0+0x16488) #2 cxa_at_exit_wrapper(void*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:386 (kudu+0x484803) and: #0 operator delete(void*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_new_delete.cc:119 (kudu+0x523cf1) #1 google::protobuf::FieldDescriptorProto::~FieldDescriptorProto() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/descriptor.pb.cc:4916:47 (libprotobuf.so.14+0x19c3b1) #2 google::protobuf::internal::GenericTypeHandler::Delete(google::protobuf::FieldDescriptorProto*, google::protobuf::Arena*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/repeated_field.h:615:7 (libprotobuf.so.14+0x1973b1) #3 void google::protobuf::internal::RepeatedPtrFieldBase::Destroy::TypeHandler>() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/repeated_field.h:1429 (libprotobuf.so.14+0x1973b1) #4 google::protobuf::RepeatedPtrField::~RepeatedPtrField() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/repeated_field.h:1892 (libprotobuf.so.14+0x1973b1) #5 google::protobuf::DescriptorProto::~DescriptorProto() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/descriptor.pb.cc:3528 (libprotobuf.so.14+0x1973b1) #6 google::protobuf::DescriptorProto::~DescriptorProto() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/descriptor.pb.cc:3525:37 (libprotobuf.so.14+0x197519) #7 google::protobuf::internal::GenericTypeHandler::Delete(google::protobuf::DescriptorProto*, google::protobuf::Arena*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/repeated_field.h:615:7 (libprotobuf.so.14+0x18e8c1) #8 void google::protobuf::internal::RepeatedPtrFieldBase::Destroy::TypeHandler>() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/repeated_field.h:1429 (libprotobuf.so.14+0x18e8c1) #9 google::protobuf::RepeatedPtrField::~RepeatedPtrField() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/repeated_field.h:1892 (libprotobuf.so.14+0x18e8c1) #10 google::protobuf::FileDescriptorProto::~FileDescriptorProto() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/descriptor.pb.cc:1426 (libprotobuf.so.14+0x18e8c1) #11 google::protobuf::EncodedDescriptorDatabase::Add(void const*, int) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/descriptor_database.cc:322:1 (libprotobuf.so.14+0x182dcd) #12 google::protobuf::DescriptorPool::InternalAddGeneratedFile(void const*, int) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/descriptor.cc:1315:3 (libprotobuf.so.14+0x13b705) #13 google::protobuf::protobuf_google_2fprotobuf_2ftype_2eproto::(anonymous namespace)::AddDescriptorsImpl() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/type.pb.cc:240:3 (libprotobuf.so.14+0x237c10) #14 google::protobuf::internal::FunctionClosure0::Run() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/stubs/callback.h:129:5 (libprotobuf.so.14+0xd330b) #15 google::protobuf::GoogleOnceInitImpl(long*, google::protobuf::Closure*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/stubs/once.cc:83:14 (libprotobuf.so.14+0xd5d6a) #16 google::protobuf::GoogleOnceInit(long*, void (*)()) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4
[jira] [Created] (KUDU-2764) Timeout in sentry_authz_provider-test
Hao Hao created KUDU-2764: - Summary: Timeout in sentry_authz_provider-test Key: KUDU-2764 URL: https://issues.apache.org/jira/browse/KUDU-2764 Project: Kudu Issue Type: Test Reporter: Hao Hao I encountered a sentry_authz_provider-test timeout with the following error: {noformat} 130/379 Test #225: compaction_policy-test .. Passed 2.93 sec Start 226: composite-pushdown-test 131/379 Test #188: sentry_authz_provider-test ..***Timeout 930.23 sec Start 227: delta_compaction-test 132/379 Test #227: delta_compaction-test ... Passed 1.81 sec ... The following tests FAILED: 188 - sentry_authz_provider-test (Timeout) Errors while running CTest + TESTS_FAILED=1 {noformat} We should probably improve the run time of the test. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-2718) master_failover-itest when HMS is enabled is flaky
[ https://issues.apache.org/jira/browse/KUDU-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810218#comment-16810218 ] Hao Hao commented on KUDU-2718: --- DropTable in HMS is a synchronous call, so I think it should be reflected immediately if we succeeded in dropping the table. It may be that DropTable had not taken place before CreateTable was retried? But I don't know how DropTable in HMS could take up to ~2 mins. I also looped the [test 2000 times|http://dist-test.cloudera.org/job?job_id=hao.hao.1554350086.94333], but failed to reproduce the reported error. Instead I encountered the following error: {noformat}/data/1/hao/kudu/src/kudu/integration-tests/master_failover-itest.cc:460: Failure Failed Bad status: Invalid argument: Error creating table default.table_0 on the master: not enough live tablet servers to create a table with the requested replication factor 3; 2 tablet servers are alive{noformat} which seems to be the issue described in KUDU-1358. Until KUDU-1358 is fixed, we can deflake this test by retrying upon such an error. > master_failover-itest when HMS is enabled is flaky > -- > > Key: KUDU-2718 > URL: https://issues.apache.org/jira/browse/KUDU-2718 > Project: Kudu > Issue Type: Bug > Components: test >Affects Versions: 1.9.0 >Reporter: Adar Dembo >Assignee: Hao Hao >Priority: Major > Attachments: master_failover-itest.1.txt > > > This was a failure in > HmsConfigurations/MasterFailoverTest.TestDeleteTableSync/1, where GetParam() > = 2, but it's likely possible in every multi-master test with HMS integration > enabled. > It looks like there was a leader master election at the time that the client > tried to create the table being tested. The master managed to create the > table in HMS, but then there was a failure replicating in Raft because > another master was elected leader. So the client retried the request on a > different master, but the HMS piece of CreateTable failed because the HMS > already knew about the table. > Thing is, there's code to roll back the HMS table creation if this happens, > so I don't see why the retried CreateTable failed at the HMS with "table > already exists". Perhaps this is a case where even though we succeeded in > dropping the table from HMS, it doesn't reflect that immediately? > I'm attaching the full log. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
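As a concrete illustration of the deflaking approach suggested in the comment above (retry table creation when it fails with the transient "not enough live tablet servers" error until KUDU-1358 is addressed), here is a rough, self-contained sketch. The AttemptResult type, the create_table_once callback, and the retry limits are assumptions for illustration; they are not the actual test or client code.

{code:cpp}
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Illustrative stand-in for a Status-like result of one creation attempt.
struct AttemptResult {
  bool ok;
  std::string message;
};

// Retries 'create_table_once' while it fails with a message that looks
// transient (here, "not enough live tablet servers"), up to 'max_attempts'.
// Any other failure is returned to the caller immediately.
AttemptResult CreateTableWithRetry(
    const std::function<AttemptResult()>& create_table_once,
    int max_attempts = 10,
    std::chrono::milliseconds backoff = std::chrono::milliseconds(500)) {
  AttemptResult result{false, "no attempts made"};
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    result = create_table_once();
    if (result.ok) {
      return result;
    }
    const bool transient =
        result.message.find("not enough live tablet servers") != std::string::npos;
    if (!transient) {
      return result;  // Permanent failure: surface it to the test.
    }
    std::this_thread::sleep_for(backoff);
  }
  return result;  // Still failing after exhausting the retries.
}
{code}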
[jira] [Resolved] (KUDU-2757) Retry OpenSSL downloads
[ https://issues.apache.org/jira/browse/KUDU-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao resolved KUDU-2757. --- Resolution: Fixed Assignee: Hao Hao Fix Version/s: 1.10.0 Fixed in commit 984a3e1a1. > Retry OpenSSL downloads > --- > > Key: KUDU-2757 > URL: https://issues.apache.org/jira/browse/KUDU-2757 > Project: Kudu > Issue Type: Bug >Reporter: Hao Hao >Assignee: Hao Hao >Priority: Major > Fix For: 1.10.0 > > > KUDU-2528 added retry for downloading thirdparty, however, we do retry in > downloaing the OpenSSL RPMs. Here's an example: > {noformat} > Building on el6: installing OpenSSL from CentOS 6.4. > Fetching openssl-1.0.0-27.el6.x86_64.rpm from > http://d3dr9sfxru4sde.cloudfront.net/openssl-1.0.0-27.el6.x86_64.rpm > % Total% Received % Xferd Average Speed TimeTime Time > Current > Dload Upload Total SpentLeft Speed > 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- 0 > 0 1392k0 00 0 0 0 --:--:-- 0:00:01 --:--:-- 0 > 0 1392k0 14480 0805 0 0:29:30 0:00:01 0:29:29 808 > 0 1392k0 14480 0517 0 0:45:57 0:00:02 0:45:55 518 > 0 1392k0 14480 0381 0 1:02:21 0:00:03 1:02:18 381 > 0 1392k0 14480 0301 0 1:18:56 0:00:04 1:18:52 301 > 0 1392k0 14480 0249 0 1:35:25 0:00:05 1:35:20 317 > 0 1392k0 14480 0212 0 1:52:04 0:00:06 1:51:58 0 > 0 1392k0 14480 0185 0 2:08:25 0:00:07 2:08:18 0 > 0 1392k0 17040 0199 0 1:59:23 0:00:08 1:59:1554 > 0 1392k0 22160 0232 0 1:42:24 0:00:09 1:42:15 161 > 0 1392k0 22160 0210 0 1:53:08 0:00:10 1:52:58 161 > 0 1392k0 22160 0191 0 2:04:23 0:00:11 2:04:12 161 > 0 1392k0 23800 0187 0 2:07:03 0:00:12 2:06:51 190 > 0 1392k0 35440 0252 0 1:34:17 0:00:14 1:34:03 334 > 0 1392k0 36160 0247 0 1:36:11 0:00:14 1:35:57 276 > 0 1392k0 40480 0260 0 1:31:23 0:00:15 1:31:08 366 > 0 1392k0 50560 0302 0 1:18:40 0:00:16 1:18:24 547 > 0 1392k0 61000 0344 0 1:09:04 0:00:17 1:08:47 743 > 0 1392k0 67840 0359 0 1:06:10 0:00:18 1:05:52 670 > 0 1392k0 70360 0356 0 1:06:44 0:00:19 1:06:25 665 > 0 1392k0 80080 0386 0 1:01:33 0:00:20 1:01:13 764 > 0 1392k0 85120 0387 0 1:01:23 0:00:21 1:01:02 657 > 0 1392k0 86560 0384 0 1:01:52 0:00:22 1:01:30 529 > 0 1392k0 111000 0469 0 0:50:39 0:00:23 0:50:16 900 > 0 1392k0 115680 0462 0 0:51:25 0:00:24 0:51:01 862 > 0 1392k0 122160 0478 0 0:49:42 0:00:25 0:49:17 875 > 0 1392k0 124680 0465 0 0:51:05 0:00:26 0:50:39 829 > 0 1392k0 127560 0462 0 0:51:25 0:00:27 0:50:58 811 > 0 1392k0 136200 0470 0 0:50:33 0:00:28 0:50:05 473 > 0 1392k0 138000 0465 0 0:51:05 0:00:29 0:50:36 479 > 1 1392k1 147360 0481 0 0:49:23 0:00:30 0:48:53 493 > 1 1392k1 165400 0522 0 0:45:30 0:00:31 0:44:59 832 > 1 1392k1 182760 0559 0 0:42:30 0:00:32 0:41:58 1084 > 1 1392k1 187080 0550 0 0:43:11 0:00:34 0:42:37 1009 > 1 1392k1 187440 0540 0 0:43:59 0:00:34 0:43:25 984 > 1 1392k1 196880 0543 0 0:43:45 0:00:36 0:43:09 887 > 1 1392k1 197240 0536 0 0:44:19 0:00:36 0:43:43 621 > 1 1392k1 200840 0532 0 0:44:39 0:00:37 0:44:02 360 > 1 1392k1 207040 0536 0 0:44:19 0:00:38 0:43:41 432 > 1 1392k1 216040 0545 0 0:43:35 0:00:39 0:42:56 580 > 1 1392k1 237800 0586 0 0:40:32 0:00:40 0:39:52 947 > 1 1392k1 251480 0602 0 0:39:28 0:00:41 0:38:47 1097 > 1 1392k1 258320 0607 0 0:39:08 0:00:42 0:38:26 1187 > 1 1392k1 263880 0606 0 0:39:12 0:00:43 0:38:29 1157 > 1 1392k1 268560 0602 0 0:39:28 0:00:44 0:38:44 1059 > 1 1392k1 270720 0594 0 0:39:59 0:00:45 0:39:14 654 > 1 1392k1 273240 0587 0 0:40:28 0:00:46 0:39:42 451 >
[jira] [Updated] (KUDU-2760) kudu-tool-test.TestCopyTable is flaky
[ https://issues.apache.org/jira/browse/KUDU-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2760: -- Attachment: kudu-tool-test.0.txt > kudu-tool-test.TestCopyTable is flaky > - > > Key: KUDU-2760 > URL: https://issues.apache.org/jira/browse/KUDU-2760 > Project: Kudu > Issue Type: Bug >Reporter: Hao Hao >Priority: Major > Attachments: kudu-tool-test.0.txt > > > Encountered a failure of TestCopyTable in kudu-tool-test.cc with the > following error: > {noformat} > I0403 04:31:25.63 7865 catalog_manager.cc:3977] T > 216c0526bd944c7da8c2a62cabe430ba P c80b8c54e90346559bbb413ebdb7d08f reported > cstate change: term changed from 0 to 1, leader changed from to > c80b8c54e90346559bbb413ebdb7d08f (127.3.136.65). New cstate: current_term: 1 > leader_uuid: "c80b8c54e90346559bbb413ebdb7d08f" committed_config { > opid_index: -1 OBSOLETE_local: true peers { permanent_uuid: > "c80b8c54e90346559bbb413ebdb7d08f" member_type: VOTER last_known_addr { host: > "127.3.136.65" port: 35451 } health_report { overall_health: HEALTHY } } } > /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/kudu-tool-test.cc:582: > Failure > Value of: dst_line > Expected: has no substring "key" > Actual: "(int32 key=151, int32 int_val=-325474784, string > string_val=\"7ca8cde3dcca640a\")" (of type std::string) > /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/kudu-tool-test.cc:3067: > Failure > Expected: RunCopyTableCheck(arg) doesn't generate new fatal failures in the > current thread. > Actual: it does. > {noformat} > Attached the full log. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KUDU-2760) kudu-tool-test.TestCopyTable is flaky
Hao Hao created KUDU-2760: - Summary: kudu-tool-test.TestCopyTable is flaky Key: KUDU-2760 URL: https://issues.apache.org/jira/browse/KUDU-2760 Project: Kudu Issue Type: Bug Reporter: Hao Hao Encountered a failure of TestCopyTable in kudu-tool-test.cc with the following error: {noformat} I0403 04:31:25.63 7865 catalog_manager.cc:3977] T 216c0526bd944c7da8c2a62cabe430ba P c80b8c54e90346559bbb413ebdb7d08f reported cstate change: term changed from 0 to 1, leader changed from to c80b8c54e90346559bbb413ebdb7d08f (127.3.136.65). New cstate: current_term: 1 leader_uuid: "c80b8c54e90346559bbb413ebdb7d08f" committed_config { opid_index: -1 OBSOLETE_local: true peers { permanent_uuid: "c80b8c54e90346559bbb413ebdb7d08f" member_type: VOTER last_known_addr { host: "127.3.136.65" port: 35451 } health_report { overall_health: HEALTHY } } } /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/kudu-tool-test.cc:582: Failure Value of: dst_line Expected: has no substring "key" Actual: "(int32 key=151, int32 int_val=-325474784, string string_val=\"7ca8cde3dcca640a\")" (of type std::string) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/kudu-tool-test.cc:3067: Failure Expected: RunCopyTableCheck(arg) doesn't generate new fatal failures in the current thread. Actual: it does. {noformat} Attached the full log. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KUDU-2757) Retry OpenSSL downloads
[ https://issues.apache.org/jira/browse/KUDU-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2757: -- Code Review: https://gerrit.cloudera.org/#/c/12913/ (was: https://gerrit.cloudera.org/#/c/8313/) > Retry OpenSSL downloads > --- > > Key: KUDU-2757 > URL: https://issues.apache.org/jira/browse/KUDU-2757 > Project: Kudu > Issue Type: Bug >Reporter: Hao Hao >Priority: Major > > KUDU-2528 added retry for downloading thirdparty, however, we do retry in > downloaing the OpenSSL RPMs. Here's an example: > {noformat} > Building on el6: installing OpenSSL from CentOS 6.4. > Fetching openssl-1.0.0-27.el6.x86_64.rpm from > http://d3dr9sfxru4sde.cloudfront.net/openssl-1.0.0-27.el6.x86_64.rpm > % Total% Received % Xferd Average Speed TimeTime Time > Current > Dload Upload Total SpentLeft Speed > 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- 0 > 0 1392k0 00 0 0 0 --:--:-- 0:00:01 --:--:-- 0 > 0 1392k0 14480 0805 0 0:29:30 0:00:01 0:29:29 808 > 0 1392k0 14480 0517 0 0:45:57 0:00:02 0:45:55 518 > 0 1392k0 14480 0381 0 1:02:21 0:00:03 1:02:18 381 > 0 1392k0 14480 0301 0 1:18:56 0:00:04 1:18:52 301 > 0 1392k0 14480 0249 0 1:35:25 0:00:05 1:35:20 317 > 0 1392k0 14480 0212 0 1:52:04 0:00:06 1:51:58 0 > 0 1392k0 14480 0185 0 2:08:25 0:00:07 2:08:18 0 > 0 1392k0 17040 0199 0 1:59:23 0:00:08 1:59:1554 > 0 1392k0 22160 0232 0 1:42:24 0:00:09 1:42:15 161 > 0 1392k0 22160 0210 0 1:53:08 0:00:10 1:52:58 161 > 0 1392k0 22160 0191 0 2:04:23 0:00:11 2:04:12 161 > 0 1392k0 23800 0187 0 2:07:03 0:00:12 2:06:51 190 > 0 1392k0 35440 0252 0 1:34:17 0:00:14 1:34:03 334 > 0 1392k0 36160 0247 0 1:36:11 0:00:14 1:35:57 276 > 0 1392k0 40480 0260 0 1:31:23 0:00:15 1:31:08 366 > 0 1392k0 50560 0302 0 1:18:40 0:00:16 1:18:24 547 > 0 1392k0 61000 0344 0 1:09:04 0:00:17 1:08:47 743 > 0 1392k0 67840 0359 0 1:06:10 0:00:18 1:05:52 670 > 0 1392k0 70360 0356 0 1:06:44 0:00:19 1:06:25 665 > 0 1392k0 80080 0386 0 1:01:33 0:00:20 1:01:13 764 > 0 1392k0 85120 0387 0 1:01:23 0:00:21 1:01:02 657 > 0 1392k0 86560 0384 0 1:01:52 0:00:22 1:01:30 529 > 0 1392k0 111000 0469 0 0:50:39 0:00:23 0:50:16 900 > 0 1392k0 115680 0462 0 0:51:25 0:00:24 0:51:01 862 > 0 1392k0 122160 0478 0 0:49:42 0:00:25 0:49:17 875 > 0 1392k0 124680 0465 0 0:51:05 0:00:26 0:50:39 829 > 0 1392k0 127560 0462 0 0:51:25 0:00:27 0:50:58 811 > 0 1392k0 136200 0470 0 0:50:33 0:00:28 0:50:05 473 > 0 1392k0 138000 0465 0 0:51:05 0:00:29 0:50:36 479 > 1 1392k1 147360 0481 0 0:49:23 0:00:30 0:48:53 493 > 1 1392k1 165400 0522 0 0:45:30 0:00:31 0:44:59 832 > 1 1392k1 182760 0559 0 0:42:30 0:00:32 0:41:58 1084 > 1 1392k1 187080 0550 0 0:43:11 0:00:34 0:42:37 1009 > 1 1392k1 187440 0540 0 0:43:59 0:00:34 0:43:25 984 > 1 1392k1 196880 0543 0 0:43:45 0:00:36 0:43:09 887 > 1 1392k1 197240 0536 0 0:44:19 0:00:36 0:43:43 621 > 1 1392k1 200840 0532 0 0:44:39 0:00:37 0:44:02 360 > 1 1392k1 207040 0536 0 0:44:19 0:00:38 0:43:41 432 > 1 1392k1 216040 0545 0 0:43:35 0:00:39 0:42:56 580 > 1 1392k1 237800 0586 0 0:40:32 0:00:40 0:39:52 947 > 1 1392k1 251480 0602 0 0:39:28 0:00:41 0:38:47 1097 > 1 1392k1 258320 0607 0 0:39:08 0:00:42 0:38:26 1187 > 1 1392k1 263880 0606 0 0:39:12 0:00:43 0:38:29 1157 > 1 1392k1 268560 0602 0 0:39:28 0:00:44 0:38:44 1059 > 1 1392k1 270720 0594 0 0:39:59 0:00:45 0:39:14 654 > 1 1392k1 273240 0587 0 0:40:28 0:00:46 0:39:42 451 > 1 1392k1 274680 0576 0 0:41:14 0:00:47 0:4
[jira] [Updated] (KUDU-2757) Retry OpenSSL downloads
[ https://issues.apache.org/jira/browse/KUDU-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2757: -- Code Review: https://gerrit.cloudera.org/#/c/8313/ > Retry OpenSSL downloads > --- > > Key: KUDU-2757 > URL: https://issues.apache.org/jira/browse/KUDU-2757 > Project: Kudu > Issue Type: Bug >Reporter: Hao Hao >Priority: Major > > KUDU-2528 added retry for downloading thirdparty, however, we do retry in > downloaing the OpenSSL RPMs. Here's an example: > {noformat} > Building on el6: installing OpenSSL from CentOS 6.4. > Fetching openssl-1.0.0-27.el6.x86_64.rpm from > http://d3dr9sfxru4sde.cloudfront.net/openssl-1.0.0-27.el6.x86_64.rpm > % Total% Received % Xferd Average Speed TimeTime Time > Current > Dload Upload Total SpentLeft Speed > 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- 0 > 0 1392k0 00 0 0 0 --:--:-- 0:00:01 --:--:-- 0 > 0 1392k0 14480 0805 0 0:29:30 0:00:01 0:29:29 808 > 0 1392k0 14480 0517 0 0:45:57 0:00:02 0:45:55 518 > 0 1392k0 14480 0381 0 1:02:21 0:00:03 1:02:18 381 > 0 1392k0 14480 0301 0 1:18:56 0:00:04 1:18:52 301 > 0 1392k0 14480 0249 0 1:35:25 0:00:05 1:35:20 317 > 0 1392k0 14480 0212 0 1:52:04 0:00:06 1:51:58 0 > 0 1392k0 14480 0185 0 2:08:25 0:00:07 2:08:18 0 > 0 1392k0 17040 0199 0 1:59:23 0:00:08 1:59:1554 > 0 1392k0 22160 0232 0 1:42:24 0:00:09 1:42:15 161 > 0 1392k0 22160 0210 0 1:53:08 0:00:10 1:52:58 161 > 0 1392k0 22160 0191 0 2:04:23 0:00:11 2:04:12 161 > 0 1392k0 23800 0187 0 2:07:03 0:00:12 2:06:51 190 > 0 1392k0 35440 0252 0 1:34:17 0:00:14 1:34:03 334 > 0 1392k0 36160 0247 0 1:36:11 0:00:14 1:35:57 276 > 0 1392k0 40480 0260 0 1:31:23 0:00:15 1:31:08 366 > 0 1392k0 50560 0302 0 1:18:40 0:00:16 1:18:24 547 > 0 1392k0 61000 0344 0 1:09:04 0:00:17 1:08:47 743 > 0 1392k0 67840 0359 0 1:06:10 0:00:18 1:05:52 670 > 0 1392k0 70360 0356 0 1:06:44 0:00:19 1:06:25 665 > 0 1392k0 80080 0386 0 1:01:33 0:00:20 1:01:13 764 > 0 1392k0 85120 0387 0 1:01:23 0:00:21 1:01:02 657 > 0 1392k0 86560 0384 0 1:01:52 0:00:22 1:01:30 529 > 0 1392k0 111000 0469 0 0:50:39 0:00:23 0:50:16 900 > 0 1392k0 115680 0462 0 0:51:25 0:00:24 0:51:01 862 > 0 1392k0 122160 0478 0 0:49:42 0:00:25 0:49:17 875 > 0 1392k0 124680 0465 0 0:51:05 0:00:26 0:50:39 829 > 0 1392k0 127560 0462 0 0:51:25 0:00:27 0:50:58 811 > 0 1392k0 136200 0470 0 0:50:33 0:00:28 0:50:05 473 > 0 1392k0 138000 0465 0 0:51:05 0:00:29 0:50:36 479 > 1 1392k1 147360 0481 0 0:49:23 0:00:30 0:48:53 493 > 1 1392k1 165400 0522 0 0:45:30 0:00:31 0:44:59 832 > 1 1392k1 182760 0559 0 0:42:30 0:00:32 0:41:58 1084 > 1 1392k1 187080 0550 0 0:43:11 0:00:34 0:42:37 1009 > 1 1392k1 187440 0540 0 0:43:59 0:00:34 0:43:25 984 > 1 1392k1 196880 0543 0 0:43:45 0:00:36 0:43:09 887 > 1 1392k1 197240 0536 0 0:44:19 0:00:36 0:43:43 621 > 1 1392k1 200840 0532 0 0:44:39 0:00:37 0:44:02 360 > 1 1392k1 207040 0536 0 0:44:19 0:00:38 0:43:41 432 > 1 1392k1 216040 0545 0 0:43:35 0:00:39 0:42:56 580 > 1 1392k1 237800 0586 0 0:40:32 0:00:40 0:39:52 947 > 1 1392k1 251480 0602 0 0:39:28 0:00:41 0:38:47 1097 > 1 1392k1 258320 0607 0 0:39:08 0:00:42 0:38:26 1187 > 1 1392k1 263880 0606 0 0:39:12 0:00:43 0:38:29 1157 > 1 1392k1 268560 0602 0 0:39:28 0:00:44 0:38:44 1059 > 1 1392k1 270720 0594 0 0:39:59 0:00:45 0:39:14 654 > 1 1392k1 273240 0587 0 0:40:28 0:00:46 0:39:42 451 > 1 1392k1 274680 0576 0 0:41:14 0:00:47 0:40:27 319 > 1 1392k1 275400 0
[jira] [Created] (KUDU-2757) Retry OpenSSL downloads
Hao Hao created KUDU-2757: - Summary: Retry OpenSSL downloads Key: KUDU-2757 URL: https://issues.apache.org/jira/browse/KUDU-2757 Project: Kudu Issue Type: Bug Reporter: Hao Hao KUDU-2528 added retry for downloading thirdparty, however, we do retry in downloaing the OpenSSL RPMs. Here's an example: {noformat} Building on el6: installing OpenSSL from CentOS 6.4. Fetching openssl-1.0.0-27.el6.x86_64.rpm from http://d3dr9sfxru4sde.cloudfront.net/openssl-1.0.0-27.el6.x86_64.rpm % Total% Received % Xferd Average Speed TimeTime Time Current Dload Upload Total SpentLeft Speed 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- 0 0 1392k0 00 0 0 0 --:--:-- 0:00:01 --:--:-- 0 0 1392k0 14480 0805 0 0:29:30 0:00:01 0:29:29 808 0 1392k0 14480 0517 0 0:45:57 0:00:02 0:45:55 518 0 1392k0 14480 0381 0 1:02:21 0:00:03 1:02:18 381 0 1392k0 14480 0301 0 1:18:56 0:00:04 1:18:52 301 0 1392k0 14480 0249 0 1:35:25 0:00:05 1:35:20 317 0 1392k0 14480 0212 0 1:52:04 0:00:06 1:51:58 0 0 1392k0 14480 0185 0 2:08:25 0:00:07 2:08:18 0 0 1392k0 17040 0199 0 1:59:23 0:00:08 1:59:1554 0 1392k0 22160 0232 0 1:42:24 0:00:09 1:42:15 161 0 1392k0 22160 0210 0 1:53:08 0:00:10 1:52:58 161 0 1392k0 22160 0191 0 2:04:23 0:00:11 2:04:12 161 0 1392k0 23800 0187 0 2:07:03 0:00:12 2:06:51 190 0 1392k0 35440 0252 0 1:34:17 0:00:14 1:34:03 334 0 1392k0 36160 0247 0 1:36:11 0:00:14 1:35:57 276 0 1392k0 40480 0260 0 1:31:23 0:00:15 1:31:08 366 0 1392k0 50560 0302 0 1:18:40 0:00:16 1:18:24 547 0 1392k0 61000 0344 0 1:09:04 0:00:17 1:08:47 743 0 1392k0 67840 0359 0 1:06:10 0:00:18 1:05:52 670 0 1392k0 70360 0356 0 1:06:44 0:00:19 1:06:25 665 0 1392k0 80080 0386 0 1:01:33 0:00:20 1:01:13 764 0 1392k0 85120 0387 0 1:01:23 0:00:21 1:01:02 657 0 1392k0 86560 0384 0 1:01:52 0:00:22 1:01:30 529 0 1392k0 111000 0469 0 0:50:39 0:00:23 0:50:16 900 0 1392k0 115680 0462 0 0:51:25 0:00:24 0:51:01 862 0 1392k0 122160 0478 0 0:49:42 0:00:25 0:49:17 875 0 1392k0 124680 0465 0 0:51:05 0:00:26 0:50:39 829 0 1392k0 127560 0462 0 0:51:25 0:00:27 0:50:58 811 0 1392k0 136200 0470 0 0:50:33 0:00:28 0:50:05 473 0 1392k0 138000 0465 0 0:51:05 0:00:29 0:50:36 479 1 1392k1 147360 0481 0 0:49:23 0:00:30 0:48:53 493 1 1392k1 165400 0522 0 0:45:30 0:00:31 0:44:59 832 1 1392k1 182760 0559 0 0:42:30 0:00:32 0:41:58 1084 1 1392k1 187080 0550 0 0:43:11 0:00:34 0:42:37 1009 1 1392k1 187440 0540 0 0:43:59 0:00:34 0:43:25 984 1 1392k1 196880 0543 0 0:43:45 0:00:36 0:43:09 887 1 1392k1 197240 0536 0 0:44:19 0:00:36 0:43:43 621 1 1392k1 200840 0532 0 0:44:39 0:00:37 0:44:02 360 1 1392k1 207040 0536 0 0:44:19 0:00:38 0:43:41 432 1 1392k1 216040 0545 0 0:43:35 0:00:39 0:42:56 580 1 1392k1 237800 0586 0 0:40:32 0:00:40 0:39:52 947 1 1392k1 251480 0602 0 0:39:28 0:00:41 0:38:47 1097 1 1392k1 258320 0607 0 0:39:08 0:00:42 0:38:26 1187 1 1392k1 263880 0606 0 0:39:12 0:00:43 0:38:29 1157 1 1392k1 268560 0602 0 0:39:28 0:00:44 0:38:44 1059 1 1392k1 270720 0594 0 0:39:59 0:00:45 0:39:14 654 1 1392k1 273240 0587 0 0:40:28 0:00:46 0:39:42 451 1 1392k1 274680 0576 0 0:41:14 0:00:47 0:40:27 319 1 1392k1 275400 0561 0 0:42:21 0:00:49 0:41:32 208 1 1392k1 276480 0551 0 0:43:07 0:00:50 0:42:17 142 1 1392k1 277560 0546 0 0:43:30 0:00:50 0:42:40 131 1 1392k1 277920 0533 0 0:44:34 0:00:52 0:43:4284 1 1392k1 277920 0523 0 0:45:25 0:0
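The point of the issue above is that the download retries KUDU-2528 added for thirdparty dependencies do not cover the OpenSSL RPM fetch, so a single stalled mirror can fail the whole build. The sketch below shows the general retry-with-backoff shape such a fix would take; it is written in C++ purely for illustration (the real bootstrap logic lives in shell scripts), and the fetch callback and retry limits are assumptions, not actual build code.

{code:cpp}
#include <chrono>
#include <functional>
#include <thread>

// Invokes 'fetch' (imagine a wrapper around the curl command for one RPM)
// up to 'max_attempts' times, doubling the wait between attempts. Returns
// true as soon as one attempt succeeds. Illustrative sketch only.
bool FetchWithRetry(const std::function<bool()>& fetch,
                    int max_attempts = 3,
                    std::chrono::seconds initial_backoff = std::chrono::seconds(5)) {
  std::chrono::seconds backoff = initial_backoff;
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    if (fetch()) {
      return true;
    }
    if (attempt < max_attempts) {
      std::this_thread::sleep_for(backoff);
      backoff *= 2;  // Exponential backoff before the next attempt.
    }
  }
  return false;  // All attempts exhausted; the caller fails the build step.
}
{code}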
[jira] [Created] (KUDU-2756) RemoteKsckTest.TestClusterWithLocation failed with master consensus conflicts
Hao Hao created KUDU-2756: - Summary: RemoteKsckTest.TestClusterWithLocation failed with master consensus conflicts Key: KUDU-2756 URL: https://issues.apache.org/jira/browse/KUDU-2756 Project: Kudu Issue Type: Test Reporter: Hao Hao Attachments: ksck_remote-test.txt RemoteKsckTest.TestClusterWithLocation is still flaky after the fix for KUDU-2748, failing with the following error: {noformat} I0401 16:42:06.135743 18496 sys_catalog.cc:340] T P 1afc84687f934a5a8055897bbf6c2a92 [sys.catalog]: This master's current role is: LEADER /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/ksck_remote-test.cc:542: Failure Failed Bad status: Corruption: there are master consensus conflicts /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/test_util.cc:326: Failure Failed Timed out waiting for assertion to pass. I0401 16:42:35.964449 12160 tablet_server.cc:165] TabletServer shutting down... {noformat} Attached the full log. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KUDU-2723) TsLocationAssignmentITest.Basic is flaky
Hao Hao created KUDU-2723: - Summary: TsLocationAssignmentITest.Basic is flaky Key: KUDU-2723 URL: https://issues.apache.org/jira/browse/KUDU-2723 Project: Kudu Issue Type: Test Reporter: Hao Hao Attachments: location_assignment-itest.txt I encountered a failure of location_assignment-itest with the following errors: {noformat} /data/1/hao/kudu/src/kudu/integration-tests/location_assignment-itest.cc:185: Failure Failed Bad status: Timed out: ListTabletServers RPC failed: ListTabletServers RPC to 127.0.22.190:34129 timed out after 15.000s (SENT) /data/1/hao/kudu/src/kudu/integration-tests/location_assignment-itest.cc:217: Failure Expected: StartCluster() doesn't generate new fatal failures in the current thread. Actual: it does. {noformat} Attached the full log. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KUDU-2718) master_failover-itest when HMS is enabled is flaky
[ https://issues.apache.org/jira/browse/KUDU-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao reassigned KUDU-2718: - Assignee: Hao Hao > master_failover-itest when HMS is enabled is flaky > -- > > Key: KUDU-2718 > URL: https://issues.apache.org/jira/browse/KUDU-2718 > Project: Kudu > Issue Type: Bug > Components: test >Affects Versions: 1.9.0 >Reporter: Adar Dembo >Assignee: Hao Hao >Priority: Major > Attachments: master_failover-itest.1.txt > > > This was a failure in > HmsConfigurations/MasterFailoverTest.TestDeleteTableSync/1, where GetParam() > = 2, but it's likely possible in every multi-master test with HMS integration > enabled. > It looks like there was a leader master election at the time that the client > tried to create the table being tested. The master managed to create the > table in HMS, but then there was a failure replicating in Raft because > another master was elected leader. So the client retried the request on a > different master, but the HMS piece of CreateTable failed because the HMS > already knew about the table. > Thing is, there's code to roll back the HMS table creation if this happens, > so I don't see why the retried CreateTable failed at the HMS with "table > already exists". Perhaps this is a case where even though we succeeded in > dropping the table from HMS, it doesn't reflect that immediately? > I'm attaching the full log. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KUDU-2703) ITClientStress.testMultipleSessions timeout
Hao Hao created KUDU-2703: - Summary: ITClientStress.testMultipleSessions timeout Key: KUDU-2703 URL: https://issues.apache.org/jira/browse/KUDU-2703 Project: Kudu Issue Type: Test Reporter: Hao Hao Attachments: TEST-org.apache.kudu.client.ITClientStress.xml I recently encountered a timeout of ITClientStress.testMultipleSessions with the following errors: {noformat} 01:28:10.477 [INFO - cluster stderr printer] (MiniKuduCluster.java:543) I0219 01:28:10.477150 7258 mvcc.cc:203] Tried to move safe_time back from 6351128536717332480 to 6351128535804997632. Current Snapshot: MvccSnapshot[committed={T|T < 6351128536717332480}] 01:28:10.495 [INFO - cluster stderr printer] (MiniKuduCluster.java:543) I0219 01:28:10.495507 7259 mvcc.cc:203] Tried to move safe_time back from 6351128536717332480 to 6351128535804997632. Current Snapshot: MvccSnapshot[committed={T|T < 6351128536717332480}] 01:28:11.180 [INFO - cluster stderr printer] (MiniKuduCluster.java:543) I0219 01:28:11.180346 7257 mvcc.cc:203] Tried to move safe_time back from 6351128539811692544 to 6351128535804997632. Current Snapshot: MvccSnapshot[committed={T|T < 6351128539811692544 or (T in {6351128539811692544})}] 01:28:19.969 [DEBUG - New I/O worker #2152] (Connection.java:429) [peer master-127.6.95.253:51702(127.6.95.253:51702)] encountered a read timeout; closing the channel 01:28:19.969 [DEBUG - New I/O worker #2154] (Connection.java:429) [peer master-127.6.95.254:34354(127.6.95.254:34354)] encountered a read timeout; closing the channel 01:28:19.969 [DEBUG - New I/O worker #2154] (Connection.java:688) [peer master-127.6.95.254:34354(127.6.95.254:34354)] cleaning up while in state READY due to: [peer master-127.6.95.254:34354(127.6.95.254:34354)] encountered a read timeout; closing the channel 01:28:19.969 [DEBUG - New I/O worker #2152] (Connection.java:688) [peer master-127.6.95.253:51702(127.6.95.253:51702)] cleaning up while in state READY due to: [peer master-127.6.95.253:51702(127.6.95.253:51702)] encountered a read timeout; closing the channel 01:28:20.328 [DEBUG - New I/O worker #2153] (Connection.java:429) [peer master-127.6.95.252:47527(127.6.95.252:47527)] encountered a read timeout; closing the channel {noformat} Looking at the error, it may relate to safe time advancement. Attached the full log. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-2668) TestKuduClient.readYourWrites tests are flaky
[ https://issues.apache.org/jira/browse/KUDU-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754464#comment-16754464 ] Hao Hao commented on KUDU-2668: --- Fixed in commit c7c902a08. > TestKuduClient.readYourWrites tests are flaky > - > > Key: KUDU-2668 > URL: https://issues.apache.org/jira/browse/KUDU-2668 > Project: Kudu > Issue Type: Bug > Components: java, test >Affects Versions: 1.9.0 >Reporter: Adar Dembo >Assignee: Hao Hao >Priority: Critical > > I looped TestKuduClient 1000 times in dist-test while working on another > problem, and saw the following failures: > {noformat} > 1 testReadYourWritesBatchLeaderReplica > 14 testReadYourWritesSyncClosestReplica > 15 testReadYourWritesSyncLeaderReplica > {noformat} > In all cases, the stack trace of the failure was effectively this: > {noformat} > java.util.concurrent.ExecutionException: java.lang.AssertionError > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.kudu.client.TestKuduClient.readYourWrites(TestKuduClient.java:1113) > ... > Caused by: java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1098) > at > org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1055) > ... > {noformat} > The offending lines: > {code} > AsyncKuduScanner scanner = asyncClient.newScannerBuilder(table) > .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES) > .replicaSelection(replicaSelection) > .build(); > KuduScanner syncScanner = new KuduScanner(scanner); > long preTs = asyncClient.getLastPropagatedTimestamp(); > assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, > asyncClient.getLastPropagatedTimestamp()); > long row_count = countRowsInScan(syncScanner); > long expected_count = 100L * (i + 1); > assertTrue(expected_count <= row_count); > // After the scan, verify that the chosen snapshot timestamp is > // returned from the server and it is larger than the previous > // propagated timestamp. > assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, > scanner.getSnapshotTimestamp()); > --> assertTrue(preTs < scanner.getSnapshotTimestamp()); > {code} > It's possible that this is just test flakiness, but I'm setting a higher > priority so we can understand whether that's the case, or whether there's > something wrong with read-your-writes scans. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KUDU-2668) TestKuduClient.readYourWrites tests are flaky
[ https://issues.apache.org/jira/browse/KUDU-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao resolved KUDU-2668. --- Resolution: Fixed Fix Version/s: 1.9.0 > TestKuduClient.readYourWrites tests are flaky > - > > Key: KUDU-2668 > URL: https://issues.apache.org/jira/browse/KUDU-2668 > Project: Kudu > Issue Type: Bug > Components: java, test >Affects Versions: 1.9.0 >Reporter: Adar Dembo >Assignee: Hao Hao >Priority: Critical > Fix For: 1.9.0 > > > I looped TestKuduClient 1000 times in dist-test while working on another > problem, and saw the following failures: > {noformat} > 1 testReadYourWritesBatchLeaderReplica > 14 testReadYourWritesSyncClosestReplica > 15 testReadYourWritesSyncLeaderReplica > {noformat} > In all cases, the stack trace of the failure was effectively this: > {noformat} > java.util.concurrent.ExecutionException: java.lang.AssertionError > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.kudu.client.TestKuduClient.readYourWrites(TestKuduClient.java:1113) > ... > Caused by: java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1098) > at > org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1055) > ... > {noformat} > The offending lines: > {code} > AsyncKuduScanner scanner = asyncClient.newScannerBuilder(table) > .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES) > .replicaSelection(replicaSelection) > .build(); > KuduScanner syncScanner = new KuduScanner(scanner); > long preTs = asyncClient.getLastPropagatedTimestamp(); > assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, > asyncClient.getLastPropagatedTimestamp()); > long row_count = countRowsInScan(syncScanner); > long expected_count = 100L * (i + 1); > assertTrue(expected_count <= row_count); > // After the scan, verify that the chosen snapshot timestamp is > // returned from the server and it is larger than the previous > // propagated timestamp. > assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, > scanner.getSnapshotTimestamp()); > --> assertTrue(preTs < scanner.getSnapshotTimestamp()); > {code} > It's possible that this is just test flakiness, but I'm setting a higher > priority so we can understand whether that's the case, or whether there's > something wrong with read-your-writes scans. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-2668) TestKuduClient.readYourWrites tests are flaky
[ https://issues.apache.org/jira/browse/KUDU-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752886#comment-16752886 ] Hao Hao commented on KUDU-2668: --- Fixed in commit 847ceb84d. > TestKuduClient.readYourWrites tests are flaky > - > > Key: KUDU-2668 > URL: https://issues.apache.org/jira/browse/KUDU-2668 > Project: Kudu > Issue Type: Bug > Components: java, test >Affects Versions: 1.9.0 >Reporter: Adar Dembo >Assignee: Hao Hao >Priority: Critical > > I looped TestKuduClient 1000 times in dist-test while working on another > problem, and saw the following failures: > {noformat} > 1 testReadYourWritesBatchLeaderReplica > 14 testReadYourWritesSyncClosestReplica > 15 testReadYourWritesSyncLeaderReplica > {noformat} > In all cases, the stack trace of the failure was effectively this: > {noformat} > java.util.concurrent.ExecutionException: java.lang.AssertionError > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.kudu.client.TestKuduClient.readYourWrites(TestKuduClient.java:1113) > ... > Caused by: java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1098) > at > org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1055) > ... > {noformat} > The offending lines: > {code} > AsyncKuduScanner scanner = asyncClient.newScannerBuilder(table) > .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES) > .replicaSelection(replicaSelection) > .build(); > KuduScanner syncScanner = new KuduScanner(scanner); > long preTs = asyncClient.getLastPropagatedTimestamp(); > assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, > asyncClient.getLastPropagatedTimestamp()); > long row_count = countRowsInScan(syncScanner); > long expected_count = 100L * (i + 1); > assertTrue(expected_count <= row_count); > // After the scan, verify that the chosen snapshot timestamp is > // returned from the server and it is larger than the previous > // propagated timestamp. > assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, > scanner.getSnapshotTimestamp()); > --> assertTrue(preTs < scanner.getSnapshotTimestamp()); > {code} > It's possible that this is just test flakiness, but I'm setting a higher > priority so we can understand whether that's the case, or whether there's > something wrong with read-your-writes scans. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Issue Comment Deleted] (KUDU-2668) TestKuduClient.readYourWrites tests are flaky
[ https://issues.apache.org/jira/browse/KUDU-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2668: -- Comment: was deleted (was: Fixed in commit 847ceb84d.) > TestKuduClient.readYourWrites tests are flaky > - > > Key: KUDU-2668 > URL: https://issues.apache.org/jira/browse/KUDU-2668 > Project: Kudu > Issue Type: Bug > Components: java, test >Affects Versions: 1.9.0 >Reporter: Adar Dembo >Assignee: Hao Hao >Priority: Critical > > I looped TestKuduClient 1000 times in dist-test while working on another > problem, and saw the following failures: > {noformat} > 1 testReadYourWritesBatchLeaderReplica > 14 testReadYourWritesSyncClosestReplica > 15 testReadYourWritesSyncLeaderReplica > {noformat} > In all cases, the stack trace of the failure was effectively this: > {noformat} > java.util.concurrent.ExecutionException: java.lang.AssertionError > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.kudu.client.TestKuduClient.readYourWrites(TestKuduClient.java:1113) > ... > Caused by: java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1098) > at > org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1055) > ... > {noformat} > The offending lines: > {code} > AsyncKuduScanner scanner = asyncClient.newScannerBuilder(table) > .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES) > .replicaSelection(replicaSelection) > .build(); > KuduScanner syncScanner = new KuduScanner(scanner); > long preTs = asyncClient.getLastPropagatedTimestamp(); > assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, > asyncClient.getLastPropagatedTimestamp()); > long row_count = countRowsInScan(syncScanner); > long expected_count = 100L * (i + 1); > assertTrue(expected_count <= row_count); > // After the scan, verify that the chosen snapshot timestamp is > // returned from the server and it is larger than the previous > // propagated timestamp. > assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, > scanner.getSnapshotTimestamp()); > --> assertTrue(preTs < scanner.getSnapshotTimestamp()); > {code} > It's possible that this is just test flakiness, but I'm setting a higher > priority so we can understand whether that's the case, or whether there's > something wrong with read-your-writes scans. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KUDU-2668) TestKuduClient.readYourWrites tests are flaky
[ https://issues.apache.org/jira/browse/KUDU-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao reassigned KUDU-2668: - Assignee: Hao Hao > TestKuduClient.readYourWrites tests are flaky > - > > Key: KUDU-2668 > URL: https://issues.apache.org/jira/browse/KUDU-2668 > Project: Kudu > Issue Type: Bug > Components: java, test >Affects Versions: 1.9.0 >Reporter: Adar Dembo >Assignee: Hao Hao >Priority: Critical > > I looped TestKuduClient 1000 times in dist-test while working on another > problem, and saw the following failures: > {noformat} > 1 testReadYourWritesBatchLeaderReplica > 14 testReadYourWritesSyncClosestReplica > 15 testReadYourWritesSyncLeaderReplica > {noformat} > In all cases, the stack trace of the failure was effectively this: > {noformat} > java.util.concurrent.ExecutionException: java.lang.AssertionError > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.kudu.client.TestKuduClient.readYourWrites(TestKuduClient.java:1113) > ... > Caused by: java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1098) > at > org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1055) > ... > {noformat} > The offending lines: > {code} > AsyncKuduScanner scanner = asyncClient.newScannerBuilder(table) > .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES) > .replicaSelection(replicaSelection) > .build(); > KuduScanner syncScanner = new KuduScanner(scanner); > long preTs = asyncClient.getLastPropagatedTimestamp(); > assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, > asyncClient.getLastPropagatedTimestamp()); > long row_count = countRowsInScan(syncScanner); > long expected_count = 100L * (i + 1); > assertTrue(expected_count <= row_count); > // After the scan, verify that the chosen snapshot timestamp is > // returned from the server and it is larger than the previous > // propagated timestamp. > assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, > scanner.getSnapshotTimestamp()); > --> assertTrue(preTs < scanner.getSnapshotTimestamp()); > {code} > It's possible that this is just test flakiness, but I'm setting a higher > priority so we can understand whether that's the case, or whether there's > something wrong with read-your-writes scans. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KUDU-2667) MultiThreadedTabletTest/DeleteAndReinsert is flaky in ASAN
Hao Hao created KUDU-2667: - Summary: MultiThreadedTabletTest/DeleteAndReinsert is flaky in ASAN Key: KUDU-2667 URL: https://issues.apache.org/jira/browse/KUDU-2667 Project: Kudu Issue Type: Test Reporter: Hao Hao Attachments: mt-tablet-test.3.txt I recently came across a failure in MultiThreadedTabletTest/DeleteAndReinsert in an ASAN build. The error message is: {noformat} Error Message mt-tablet-test.cc:378] Check failed: _s.ok() Bad status: Already present: int32 key=2, int32 key_idx=2, int32 val=NULL: key already present Stacktrace mt-tablet-test.cc:378] Check failed: _s.ok() Bad status: Already present: int32 key=2, int32 key_idx=2, int32 val=NULL: key already present @ 0x7f66b32a5c37 gsignal at ??:0 @ 0x7f66b32a9028 abort at ??:0 @ 0x62c995 kudu::tablet::MultiThreadedTabletTest<>::DeleteAndReinsertCycleThread() at /home/jenkins-slave/workspace/kudu-master/0/src/kudu/tablet/mt-tablet-test.cc:378 @ 0x617e63 boost::_bi::bind_t<>::operator()() at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/installed/uninstrumented/include/boost/bind/bind.hpp:1223 @ 0x7f66b92d8dac boost::function0<>::operator()() at ??:0 @ 0x7f66b7792afb kudu::Thread::SuperviseThread() at ??:0 @ 0x7f66bec0e184 start_thread at ??:0 @ 0x7f66b336cffd clone at ??:0 {noformat} The full log is attached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
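The failure above is internal to the C++ tablet test, but a hypothetical, client-level sketch of the same delete-then-reinsert cycle shows where an "Already present" row error would surface to an application. The master address, table name, column names and iteration count are placeholders; this is not the test code.
{code}
// Hypothetical sketch, assuming the usual org.apache.kudu.client imports.
static void deleteAndReinsertCycle(KuduClient client) throws KuduException {
  KuduTable table = client.openTable("example_table");   // placeholder table
  KuduSession session = client.newSession();
  session.setFlushMode(SessionConfiguration.FlushMode.MANUAL_FLUSH);

  for (int i = 0; i < 100; i++) {
    Delete del = table.newDelete();
    del.getRow().addInt("key", 2);
    session.apply(del);

    Insert ins = table.newInsert();
    ins.getRow().addInt("key", 2);
    ins.getRow().addInt("val", i);
    session.apply(ins);

    // If a reinsert is applied while the prior delete has not yet taken effect,
    // the response carries an "Already present" row error, the same status the
    // CHECK in mt-tablet-test.cc:378 trips over.
    for (OperationResponse resp : session.flush()) {
      if (resp.hasRowError()) {
        System.err.println(resp.getRowError());
      }
    }
  }
}
{code}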
[jira] [Resolved] (KUDU-2620) Flaky TestMiniSentryLifecycle
[ https://issues.apache.org/jira/browse/KUDU-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao resolved KUDU-2620. --- Resolution: Fixed > Flaky TestMiniSentryLifecycle > - > > Key: KUDU-2620 > URL: https://issues.apache.org/jira/browse/KUDU-2620 > Project: Kudu > Issue Type: Bug >Reporter: Hao Hao >Assignee: Hao Hao >Priority: Major > Fix For: 1.9.0 > > > I saw TestMiniSentryLifecycle failed with the following error, > {noformat} > /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:64: > Failure > Failed > Bad status: Runtime error: /usr/sbin/lsof: process exited with non-zero > status 1 > Aborted at 1541488030 (unix time) try "date -d @1541488030" if you are using > GNU date *** > PC: @ 0x7f8288d7e7ec std::_shared_ptr<>::_shared_ptr() > SIGSEGV (@0x8) received by PID 19125 (TID 0x7f8282d87980) from PID 8; stack > trace: *** > @ 0x3d0ca0f710 (unknown) at ??:0 > @ 0x7f8288d7e7ec std::_shared_ptr<>::_shared_ptr() at ??:0 > @ 0x7f8288d7e837 std::shared_ptr<>::shared_ptr() at ??:0 > @ 0x7f8288d7edb5 sentry::SentryPolicyServiceClient::getInputProtocol() at ??:0 > @ 0x7f8288d7ba08 kudu::sentry::SentryClient::Stop() at ??:0 > @ 0x4414c9 kudu::sentry::SentryTestBase::TearDown() at > /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:70 > 2018-11-05 23:07:10 > Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.5-b02 mixed mode): > "DestroyJavaVM" #36 prio=5 os_prio=0 tid=0x7f05a1864800 nid=0x4af6 > waiting on condition [0x] > java.lang.Thread.State: RUNNABLE > "BoneCP-pool-watch-thread" #35 daemon prio=5 os_prio=0 tid=0x7f058c431800 > nid=0x4c00 waiting on condition [0x7f057e06d000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > parking to wait for <0xfd5b4478> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403) > at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > "BoneCP-keep-alive-scheduler" #34 daemon prio=5 os_prio=0 > tid=0x7f058cf04000 nid=0x4bff waiting on condition [0x7f057e16e000] > java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > parking to wait for <0xfd5b3c40> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > 
"com.google.common.base.internal.Finalizer" #33 daemon prio=5 os_prio=0 > tid=0x7f058cf03000 nid=0x4bfe in Object.wait() [0x7f05881b9000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > waiting on <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142) > locked <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158) > at com.google.common.base.internal.Finalizer.run(Finalizer.java:127) > "BoneCP-pool-watch-thread" #32 daemon prio=5 os_prio=0 tid=0x7f058cf0e800 > nid=0x4bfd waiting on condition [0x7f05882ba000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > parking to wait for <0xfceca418> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403) > at com.jolbox.bonecp.PoolWat
[jira] [Updated] (KUDU-2620) Flaky TestMiniSentryLifecycle
[ https://issues.apache.org/jira/browse/KUDU-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2620: -- Code Review: https://gerrit.cloudera.org/#/c/11898/ Fix Version/s: 1.9.0 > Flaky TestMiniSentryLifecycle > - > > Key: KUDU-2620 > URL: https://issues.apache.org/jira/browse/KUDU-2620 > Project: Kudu > Issue Type: Bug >Reporter: Hao Hao >Assignee: Hao Hao >Priority: Major > Fix For: 1.9.0 > > > I saw TestMiniSentryLifecycle failed with the following error, > {noformat} > /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:64: > Failure > Failed > Bad status: Runtime error: /usr/sbin/lsof: process exited with non-zero > status 1 > Aborted at 1541488030 (unix time) try "date -d @1541488030" if you are using > GNU date *** > PC: @ 0x7f8288d7e7ec std::_shared_ptr<>::_shared_ptr() > SIGSEGV (@0x8) received by PID 19125 (TID 0x7f8282d87980) from PID 8; stack > trace: *** > @ 0x3d0ca0f710 (unknown) at ??:0 > @ 0x7f8288d7e7ec std::_shared_ptr<>::_shared_ptr() at ??:0 > @ 0x7f8288d7e837 std::shared_ptr<>::shared_ptr() at ??:0 > @ 0x7f8288d7edb5 sentry::SentryPolicyServiceClient::getInputProtocol() at ??:0 > @ 0x7f8288d7ba08 kudu::sentry::SentryClient::Stop() at ??:0 > @ 0x4414c9 kudu::sentry::SentryTestBase::TearDown() at > /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:70 > 2018-11-05 23:07:10 > Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.5-b02 mixed mode): > "DestroyJavaVM" #36 prio=5 os_prio=0 tid=0x7f05a1864800 nid=0x4af6 > waiting on condition [0x] > java.lang.Thread.State: RUNNABLE > "BoneCP-pool-watch-thread" #35 daemon prio=5 os_prio=0 tid=0x7f058c431800 > nid=0x4c00 waiting on condition [0x7f057e06d000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > parking to wait for <0xfd5b4478> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403) > at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > "BoneCP-keep-alive-scheduler" #34 daemon prio=5 os_prio=0 > tid=0x7f058cf04000 nid=0x4bff waiting on condition [0x7f057e16e000] > java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > parking to wait for <0xfd5b3c40> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at 
java.lang.Thread.run(Thread.java:745) > "com.google.common.base.internal.Finalizer" #33 daemon prio=5 os_prio=0 > tid=0x7f058cf03000 nid=0x4bfe in Object.wait() [0x7f05881b9000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > waiting on <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142) > locked <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158) > at com.google.common.base.internal.Finalizer.run(Finalizer.java:127) > "BoneCP-pool-watch-thread" #32 daemon prio=5 os_prio=0 tid=0x7f058cf0e800 > nid=0x4bfd waiting on condition [0x7f05882ba000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > parking to wait for <0xfceca418> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > at java.util.concurrent.ArrayBlockingQueue.take(
[jira] [Assigned] (KUDU-2620) Flaky TestMiniSentryLifecycle
[ https://issues.apache.org/jira/browse/KUDU-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao reassigned KUDU-2620: - Assignee: Hao Hao > Flaky TestMiniSentryLifecycle > - > > Key: KUDU-2620 > URL: https://issues.apache.org/jira/browse/KUDU-2620 > Project: Kudu > Issue Type: Bug >Reporter: Hao Hao >Assignee: Hao Hao >Priority: Major > > I saw TestMiniSentryLifecycle failed with the following error, > {noformat} > /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:64: > Failure > Failed > Bad status: Runtime error: /usr/sbin/lsof: process exited with non-zero > status 1 > * > ** > *** Aborted at 1541488030 (unix time) try "date -d @1541488030" if you are > using GNU date *** > PC: @ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr() > *** SIGSEGV (@0x8) received by PID 19125 (TID 0x7f8282d87980) from PID 8; > stack trace: *** > @ 0x3d0ca0f710 (unknown) at ??:0 > @ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr() at ??:0 > @ 0x7f8288d7e837 std::shared_ptr<>::shared_ptr() at ??:0 > @ 0x7f8288d7edb5 sentry::SentryPolicyServiceClient::getInputProtocol() at > ??:0 > @ 0x7f8288d7ba08 kudu::sentry::SentryClient::Stop() at ??:0 > @ 0x4414c9 kudu::sentry::SentryTestBase::TearDown() at > /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:70 > 2018-11-05 23:07:10 > Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.5-b02 mixed mode): > "DestroyJavaVM" #36 prio=5 os_prio=0 tid=0x7f05a1864800 nid=0x4af6 > waiting on condition [0x] > java.lang.Thread.State: RUNNABLE > "BoneCP-pool-watch-thread" #35 daemon prio=5 os_prio=0 tid=0x7f058c431800 > nid=0x4c00 waiting on condition [0x7f057e06d000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xfd5b4478> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403) > at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > "BoneCP-keep-alive-scheduler" #34 daemon prio=5 os_prio=0 > tid=0x7f058cf04000 nid=0x4bff waiting on condition [0x7f057e16e000] > java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xfd5b3c40> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > 
"com.google.common.base.internal.Finalizer" #33 daemon prio=5 os_prio=0 > tid=0x7f058cf03000 nid=0x4bfe in Object.wait() [0x7f05881b9000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142) > - locked <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158) > at com.google.common.base.internal.Finalizer.run(Finalizer.java:127) > "BoneCP-pool-watch-thread" #32 daemon prio=5 os_prio=0 tid=0x7f058cf0e800 > nid=0x4bfd waiting on condition [0x7f05882ba000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xfceca418> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > at java.util.concurrent.ArrayBlockingQueue.tak
[jira] [Updated] (KUDU-2620) Flaky TestMiniSentryLifecycle
[ https://issues.apache.org/jira/browse/KUDU-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2620: -- Description: I saw TestMiniSentryLifecycle failed with the following error, {noformat} /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:64: Failure Failed Bad status: Runtime error: /usr/sbin/lsof: process exited with non-zero status 1 Aborted at 1541488030 (unix time) try "date -d @1541488030" if you are using GNU date *** PC: @ 0x7f8288d7e7ec std::_shared_ptr<>::_shared_ptr() SIGSEGV (@0x8) received by PID 19125 (TID 0x7f8282d87980) from PID 8; stack trace: *** @ 0x3d0ca0f710 (unknown) at ??:0 @ 0x7f8288d7e7ec std::_shared_ptr<>::_shared_ptr() at ??:0 @ 0x7f8288d7e837 std::shared_ptr<>::shared_ptr() at ??:0 @ 0x7f8288d7edb5 sentry::SentryPolicyServiceClient::getInputProtocol() at ??:0 @ 0x7f8288d7ba08 kudu::sentry::SentryClient::Stop() at ??:0 @ 0x4414c9 kudu::sentry::SentryTestBase::TearDown() at /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:70 2018-11-05 23:07:10 Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.5-b02 mixed mode): "DestroyJavaVM" #36 prio=5 os_prio=0 tid=0x7f05a1864800 nid=0x4af6 waiting on condition [0x] java.lang.Thread.State: RUNNABLE "BoneCP-pool-watch-thread" #35 daemon prio=5 os_prio=0 tid=0x7f058c431800 nid=0x4c00 waiting on condition [0x7f057e06d000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) parking to wait for <0xfd5b4478> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403) at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) "BoneCP-keep-alive-scheduler" #34 daemon prio=5 os_prio=0 tid=0x7f058cf04000 nid=0x4bff waiting on condition [0x7f057e16e000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) parking to wait for <0xfd5b3c40> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) "com.google.common.base.internal.Finalizer" #33 daemon prio=5 os_prio=0 tid=0x7f058cf03000 nid=0x4bfe in Object.wait() [0x7f05881b9000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) waiting on <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142) locked <0xfd5b37d0> 
(a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158) at com.google.common.base.internal.Finalizer.run(Finalizer.java:127) "BoneCP-pool-watch-thread" #32 daemon prio=5 os_prio=0 tid=0x7f058cf0e800 nid=0x4bfd waiting on condition [0x7f05882ba000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) parking to wait for <0xfceca418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403) at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) "BoneCP-keep-alive-scheduler" #31 daemon prio=5 os_prio=0 tid=0x7f058cf0d800 nid=0x4bfc waiting on condition [0x7f0588f0a000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) parking to wait for <0xfcec9be0> (
[jira] [Created] (KUDU-2620) Flaky TestMiniSentryLifecycle
Hao Hao created KUDU-2620: - Summary: Flaky TestMiniSentryLifecycle Key: KUDU-2620 URL: https://issues.apache.org/jira/browse/KUDU-2620 Project: Kudu Issue Type: Bug Reporter: Hao Hao I saw TestMiniSentryLifecycle failed with the following error, {noformat} /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:64: Failure Failed Bad status: Runtime error: /usr/sbin/lsof: process exited with non-zero status 1 * ** *** Aborted at 1541488030 (unix time) try "date -d @1541488030" if you are using GNU date *** PC: @ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr() *** SIGSEGV (@0x8) received by PID 19125 (TID 0x7f8282d87980) from PID 8; stack trace: *** @ 0x3d0ca0f710 (unknown) at ??:0 @ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr() at ??:0 @ 0x7f8288d7e837 std::shared_ptr<>::shared_ptr() at ??:0 @ 0x7f8288d7edb5 sentry::SentryPolicyServiceClient::getInputProtocol() at ??:0 @ 0x7f8288d7ba08 kudu::sentry::SentryClient::Stop() at ??:0 @ 0x4414c9 kudu::sentry::SentryTestBase::TearDown() at /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:70 2018-11-05 23:07:10 Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.5-b02 mixed mode): "DestroyJavaVM" #36 prio=5 os_prio=0 tid=0x7f05a1864800 nid=0x4af6 waiting on condition [0x] java.lang.Thread.State: RUNNABLE "BoneCP-pool-watch-thread" #35 daemon prio=5 os_prio=0 tid=0x7f058c431800 nid=0x4c00 waiting on condition [0x7f057e06d000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xfd5b4478> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403) at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) "BoneCP-keep-alive-scheduler" #34 daemon prio=5 os_prio=0 tid=0x7f058cf04000 nid=0x4bff waiting on condition [0x7f057e16e000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xfd5b3c40> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) "com.google.common.base.internal.Finalizer" #33 daemon prio=5 os_prio=0 tid=0x7f058cf03000 nid=0x4bfe in Object.wait() [0x7f05881b9000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock) at 
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142) - locked <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158) at com.google.common.base.internal.Finalizer.run(Finalizer.java:127) "BoneCP-pool-watch-thread" #32 daemon prio=5 os_prio=0 tid=0x7f058cf0e800 nid=0x4bfd waiting on condition [0x7f05882ba000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xfceca418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403) at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) "BoneCP-keep-alive-scheduler" #31 daemon prio=5 os_prio=0 tid=0x7f058cf0d800
[jira] [Created] (KUDU-2610) TestSimultaneousLeaderTransferAndAbruptStepdown is Flaky
Hao Hao created KUDU-2610: - Summary: TestSimultaneousLeaderTransferAndAbruptStepdown is Flaky Key: KUDU-2610 URL: https://issues.apache.org/jira/browse/KUDU-2610 Project: Kudu Issue Type: Bug Affects Versions: 1.8.0 Reporter: Hao Hao Attachments: kudu-admin-test.5.txt AdminCliTest.TestSimultaneousLeaderTransferAndAbruptStepdown is sometimes flaky in the ASAN build, failing with the following error: {noformat} b01d528fd3c74eb5b42b8d4888591ed2 (127.18.62.194:38185) has failed: Timed out: Write RPC to 127.18.62.194:38185 timed out after 60.000s (SENT) W1017 23:33:47.772014 20038 batcher.cc:348] Timed out: Failed to write batch of 1 ops to tablet 9b4b2dea960941bcb38197b51c55baf4 after 1 attempt(s): Failed to write to server: b01d528fd3c74eb5b42b8d4888591ed2 (127.18.62.194:38185): Write RPC to 127.18.62.194:38185 timed out after 60.000s (SENT) F1017 23:33:47.772820 20042 test_workload.cc:202] Timed out: Failed to write batch of 1 ops to tablet 9b4b2dea960941bcb38197b51c55baf4 after 1 attempt(s): Failed to write to server: b01d528fd3c74eb5b42b8d4888591ed2 (127.18.62.194:38185): Write RPC to 127.18.62.194:38185 timed out after 60.000s (SENT) *** Check failure stack trace: *** *** Aborted at 1539844427 (unix time) try "date -d @1539844427" if you are using GNU date *** PC: @ 0x3c74632625 __GI_raise *** SIGABRT (@0x45248fb) received by PID 18683 (TID 0x7f13ebe5b700) from PID 18683; stack trace: *** @ 0x3c74a0f710 (unknown) at ??:0 @ 0x3c74632625 __GI_raise at ??:0 @ 0x3c74633e05 __GI_abort at ??:0 @ 0x7f13fd43da29 (unknown) at ??:0 @ 0x7f13fd43f31d (unknown) at ??:0 @ 0x7f13fd4411dd (unknown) at ??:0 @ 0x7f13fd43ee59 (unknown) at ??:0 @ 0x7f13fd441c7f (unknown) at ??:0 @ 0x7f1412f7ba6e (unknown) at ??:0 @ 0x3c796b6470 (unknown) at ??:0 @ 0x3c74a079d1 start_thread at ??:0 @ 0x3c746e88fd clone at ??:0 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
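The aborting write above hits the test workload's 60-second write RPC deadline (the failing test is C++). Purely as an illustration of the equivalent knob in the Java client, and not a proposed fix for this ticket, a client could lengthen its operation timeout roughly as follows; the master address is a placeholder.
{code}
// Illustrative sketch: raise the per-operation (write RPC) timeout to 120s.
AsyncKuduClient asyncClient =
    new AsyncKuduClient.AsyncKuduClientBuilder("master-host:7051")
        .defaultOperationTimeoutMs(120000)
        .defaultAdminOperationTimeoutMs(120000)
        .build();
{code}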
[jira] [Created] (KUDU-2608) Memory leak in RaftConsensusStressITest
Hao Hao created KUDU-2608: - Summary: Memory leak in RaftConsensusStressITest Key: KUDU-2608 URL: https://issues.apache.org/jira/browse/KUDU-2608 Project: Kudu Issue Type: Bug Reporter: Hao Hao Attachments: raft_consensus_stress-itest.txt RaftConsensusStressITest.RemoveReplaceInCycle sometimes fails, complaining about memory leaks. The full log is attached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KUDU-2607) master_cert_authority-itest detected memory leaks
Hao Hao created KUDU-2607: - Summary: master_cert_authority-itest detected memory leaks Key: KUDU-2607 URL: https://issues.apache.org/jira/browse/KUDU-2607 Project: Kudu Issue Type: Improvement Reporter: Hao Hao Attachments: master_cert_authority-itest.txt I saw MultiMasterConnectToClusterTest.ConnectToCluster complain about memory leaks in a recent job. The log is attached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KUDU-2491) kudu-admin-test times out
[ https://issues.apache.org/jira/browse/KUDU-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2491: -- Attachment: jenkins_output (1).txt > kudu-admin-test times out > - > > Key: KUDU-2491 > URL: https://issues.apache.org/jira/browse/KUDU-2491 > Project: Kudu > Issue Type: Test >Reporter: Alexey Serbin >Assignee: Alexey Serbin >Priority: Major > Fix For: 1.8.0 > > Attachments: jenkins_output (1).txt, kudu-admin-test.txt, > kudu-admin-test.xml > > > In one of automated runs, the kudu-admin-test reportedly timed out while > running the {{RebalanceParamTest.Rebalance/3}} scenario at revision > {{1ae050e4d57bc333de28bcbc62e072e8bafd04b3}}. > The logs are attached. One small note: the ctest's output claims the test > was running for more than 15 minutes, while in the log there is information > on the test running just for over 5 minutes. > The test was run via {{ctest}}: > {noformat} > ctest -j4 -R > '^kudu\-tool\-test|^linked_list\-test|^master\-stress\-test|^raft_consensus\-itest|^mt\-rpc\-test|^alter_table\-randomized\-test|^delete_tablet\-itest|^minidump_generation\-itest|^kudu\-ts\-cli\-test|^security\-itest|^client\-test|^kudu\-admin\-test|^master_failover\-itest' > Start 12: client-test.0 > Start 13: client-test.1 > 1/32 Test #12: client-test.0 Passed 17.14 sec > Start 14: client-test.2 > 2/32 Test #13: client-test.1 Passed 39.78 sec > Start 15: client-test.3 > 3/32 Test #14: client-test.2 Passed 27.46 sec > Start 16: client-test.4 > 4/32 Test #15: client-test.3 Passed 21.74 sec > Start 17: client-test.5 > 5/32 Test #16: client-test.4 Passed 43.09 sec > Start 18: client-test.6 > 6/32 Test #18: client-test.6 Passed 13.93 sec > Start 19: client-test.7 > 7/32 Test #19: client-test.7 Passed 15.40 sec > Start 100: delete_tablet-itest > 8/32 Test #17: client-test.5 Passed 58.53 sec > Start 142: security-itest > Start 165: minidump_generation-itest > 9/32 Test #100: delete_tablet-itest .. Passed4.08 sec > Start 246: kudu-ts-cli-test > 10/32 Test #165: minidump_generation-itest Passed5.88 sec > 11/32 Test #246: kudu-ts-cli-test . Passed8.45 sec > Start 118: master_failover-itest.0 > 12/32 Test #142: security-itest ... Passed 38.22 sec > 13/32 Test #118: master_failover-itest.0 .. Passed 112.92 sec > Start 79: alter_table-randomized-test.0 > 14/32 Test #79: alter_table-randomized-test.0 Passed 51.54 sec > Start 80: alter_table-randomized-test.1 > 15/32 Test #80: alter_table-randomized-test.1 Passed 66.13 sec > Start 115: linked_list-test > 16/32 Test #115: linked_list-test . Passed 135.89 sec > Start 119: master_failover-itest.1 > 17/32 Test #119: master_failover-itest.1 .. Passed 155.30 sec > Start 120: master_failover-itest.2 > 18/32 Test #120: master_failover-itest.2 .. Passed 53.94 sec > Start 121: master_failover-itest.3 > 19/32 Test #121: master_failover-itest.3 .. Passed 88.64 sec > Start 125: master-stress-test > 20/32 Test #125: master-stress-test ... Passed 105.25 sec > Start 133: raft_consensus-itest.0 > 21/32 Test #133: raft_consensus-itest.0 ... Passed 31.82 sec > Start 134: raft_consensus-itest.1 > 22/32 Test #134: raft_consensus-itest.1 ... Passed 134.83 sec > Start 135: raft_consensus-itest.2 > 23/32 Test #135: raft_consensus-itest.2 ... Passed 149.73 sec > Start 136: raft_consensus-itest.3 > 24/32 Test #136: raft_consensus-itest.3 ... Passed 122.22 sec > Start 137: raft_consensus-itest.4 > 25/32 Test #137: raft_consensus-itest.4 ... Passed 143.62 sec > Start 138: raft_consensus-itest.5 > 26/32 Test #138: raft_consensus-itest.5 ... 
Passed 52.04 sec > Start 174: mt-rpc-test > 27/32 Test #174: mt-rpc-test .. Passed1.69 sec > Start 241: kudu-admin-test > 28/32 Test #241: kudu-admin-test ..***Timeout 930.12 sec > Start 242: kudu-tool-test.0 > 29/32 Test #242: kudu-tool-test.0 . Passed 47.92 sec > Start 243: kudu-tool-test.1 > 30/32 Test #243: kudu-tool-test.1 . Passed 39.39 sec > Start 244: kudu-tool-test.2 > 31/32 Test #244: kudu-tool-test.2 . Passed 90.17 sec > Start 245: kudu-tool-test.3 > 32/32 Test #245: kudu-tool-test.3 ..
[jira] [Commented] (KUDU-2491) kudu-admin-test times out
[ https://issues.apache.org/jira/browse/KUDU-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16649636#comment-16649636 ] Hao Hao commented on KUDU-2491: --- I saw rebalancer_tool-test test time out again. Attached the log. > kudu-admin-test times out > - > > Key: KUDU-2491 > URL: https://issues.apache.org/jira/browse/KUDU-2491 > Project: Kudu > Issue Type: Test >Reporter: Alexey Serbin >Assignee: Alexey Serbin >Priority: Major > Fix For: 1.8.0 > > Attachments: kudu-admin-test.txt, kudu-admin-test.xml > > > In one of automated runs, the kudu-admin-test reportedly timed out while > running the {{RebalanceParamTest.Rebalance/3}} scenario at revision > {{1ae050e4d57bc333de28bcbc62e072e8bafd04b3}}. > The logs are attached. One small note: the ctest's output claims the test > was running for more than 15 minutes, while in the log there is information > on the test running just for over 5 minutes. > The test was run via {{ctest}}: > {noformat} > ctest -j4 -R > '^kudu\-tool\-test|^linked_list\-test|^master\-stress\-test|^raft_consensus\-itest|^mt\-rpc\-test|^alter_table\-randomized\-test|^delete_tablet\-itest|^minidump_generation\-itest|^kudu\-ts\-cli\-test|^security\-itest|^client\-test|^kudu\-admin\-test|^master_failover\-itest' > Start 12: client-test.0 > Start 13: client-test.1 > 1/32 Test #12: client-test.0 Passed 17.14 sec > Start 14: client-test.2 > 2/32 Test #13: client-test.1 Passed 39.78 sec > Start 15: client-test.3 > 3/32 Test #14: client-test.2 Passed 27.46 sec > Start 16: client-test.4 > 4/32 Test #15: client-test.3 Passed 21.74 sec > Start 17: client-test.5 > 5/32 Test #16: client-test.4 Passed 43.09 sec > Start 18: client-test.6 > 6/32 Test #18: client-test.6 Passed 13.93 sec > Start 19: client-test.7 > 7/32 Test #19: client-test.7 Passed 15.40 sec > Start 100: delete_tablet-itest > 8/32 Test #17: client-test.5 Passed 58.53 sec > Start 142: security-itest > Start 165: minidump_generation-itest > 9/32 Test #100: delete_tablet-itest .. Passed4.08 sec > Start 246: kudu-ts-cli-test > 10/32 Test #165: minidump_generation-itest Passed5.88 sec > 11/32 Test #246: kudu-ts-cli-test . Passed8.45 sec > Start 118: master_failover-itest.0 > 12/32 Test #142: security-itest ... Passed 38.22 sec > 13/32 Test #118: master_failover-itest.0 .. Passed 112.92 sec > Start 79: alter_table-randomized-test.0 > 14/32 Test #79: alter_table-randomized-test.0 Passed 51.54 sec > Start 80: alter_table-randomized-test.1 > 15/32 Test #80: alter_table-randomized-test.1 Passed 66.13 sec > Start 115: linked_list-test > 16/32 Test #115: linked_list-test . Passed 135.89 sec > Start 119: master_failover-itest.1 > 17/32 Test #119: master_failover-itest.1 .. Passed 155.30 sec > Start 120: master_failover-itest.2 > 18/32 Test #120: master_failover-itest.2 .. Passed 53.94 sec > Start 121: master_failover-itest.3 > 19/32 Test #121: master_failover-itest.3 .. Passed 88.64 sec > Start 125: master-stress-test > 20/32 Test #125: master-stress-test ... Passed 105.25 sec > Start 133: raft_consensus-itest.0 > 21/32 Test #133: raft_consensus-itest.0 ... Passed 31.82 sec > Start 134: raft_consensus-itest.1 > 22/32 Test #134: raft_consensus-itest.1 ... Passed 134.83 sec > Start 135: raft_consensus-itest.2 > 23/32 Test #135: raft_consensus-itest.2 ... Passed 149.73 sec > Start 136: raft_consensus-itest.3 > 24/32 Test #136: raft_consensus-itest.3 ... Passed 122.22 sec > Start 137: raft_consensus-itest.4 > 25/32 Test #137: raft_consensus-itest.4 ... 
Passed 143.62 sec > Start 138: raft_consensus-itest.5 > 26/32 Test #138: raft_consensus-itest.5 ... Passed 52.04 sec > Start 174: mt-rpc-test > 27/32 Test #174: mt-rpc-test .. Passed1.69 sec > Start 241: kudu-admin-test > 28/32 Test #241: kudu-admin-test ..***Timeout 930.12 sec > Start 242: kudu-tool-test.0 > 29/32 Test #242: kudu-tool-test.0 . Passed 47.92 sec > Start 243: kudu-tool-test.1 > 30/32 Test #243: kudu-tool-test.1 . Passed 39.39 sec > Start 244: kudu-tool-test.2 > 31/32 Test #244: kudu-tool-test.2 . Passed 90.17 sec > Start 245: k
[jira] [Updated] (KUDU-2602) testRandomBackupAndRestore is flaky
[ https://issues.apache.org/jira/browse/KUDU-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2602: -- Description: Saw the following failure with testRandomBackupAndRestore: {noformat} java.lang.AssertionError: expected:<21> but was:<20> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:834) at org.junit.Assert.assertEquals(Assert.java:645) at org.junit.Assert.assertEquals(Assert.java:631) at org.apache.kudu.backup.TestKuduBackup.testRandomBackupAndRestore(TestKuduBackup.scala:99) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at org.apache.kudu.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:72) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66) at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32) at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93) at com.sun.proxy.$Proxy2.processTestClass(Unknown Source) at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155) at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137) at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404) at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63) at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:46) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at org.gradle.internal.concurrent.ThreadFactoryImpl$M
[jira] [Updated] (KUDU-2602) testRandomBackupAndRestore is flaky
[ https://issues.apache.org/jira/browse/KUDU-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2602: -- Issue Type: Bug (was: Improvement) > testRandomBackupAndRestore is flaky > --- > > Key: KUDU-2602 > URL: https://issues.apache.org/jira/browse/KUDU-2602 > Project: Kudu > Issue Type: Bug >Reporter: Hao Hao >Priority: Major > Attachments: TEST-org.apache.kudu.backup.TestKuduBackup.xml > > > Saw the following failure with testRandomBackupAndRestore: > {noformat} > java.lang.AssertionError: > expected:<21> but was:<20> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:645) > at org.junit.Assert.assertEquals(Assert.java:631) > at > org.apache.kudu.backup.TestKuduBackup.testRandomBackupAndRestore(TestKuduBackup.scala:99) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:483) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at org.apache.kudu.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:72) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) > at > org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66) > at > org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:483) > at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) > 
at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) > at > org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32) > at > org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93) > at com.sun.proxy.$Proxy2.processTestClass(Unknown Source) > at > org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:483) > at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) > at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) > at > org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155) > at > org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137) >
[jira] [Commented] (KUDU-2599) Timeout in DefaultSourceTest.testSocketReadTimeoutPropagation
[ https://issues.apache.org/jira/browse/KUDU-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643941#comment-16643941 ] Hao Hao commented on KUDU-2599: --- The same timeout also affects DefaultSourceTest.testTableScanWithProjectionAndPredicateDecimal128. > Timeout in DefaultSourceTest.testSocketReadTimeoutPropagation > - > > Key: KUDU-2599 > URL: https://issues.apache.org/jira/browse/KUDU-2599 > Project: Kudu > Issue Type: Bug > Components: spark >Affects Versions: 1.8.0 >Reporter: Will Berkeley >Priority: Major > Attachments: TEST-org.apache.kudu.spark.kudu.DefaultSourceTest.xml > > > Log attached -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KUDU-2584) Flaky testSimpleBackupAndRestore
[ https://issues.apache.org/jira/browse/KUDU-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643937#comment-16643937 ] Hao Hao edited comment on KUDU-2584 at 10/9/18 6:43 PM: I saw testSimpleBackupAndRestore failed with another error messages: {noformat} 04:26:59.844 [ERROR - Test worker] (RetryRule.java:76) testSimpleBackupAndRestore(org.apache.kudu.backup.TestKuduBackup): failed run 1 java.security.PrivilegedActionException: org.apache.kudu.client.NoLeaderFoundException: Master config (127.17.164.190:54477,127.17.164.189:42043,127.17.164.188:38685) has no leader. at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:360) at org.apache.kudu.spark.kudu.KuduContext.(KuduContext.scala:122) at org.apache.kudu.spark.kudu.KuduContext.(KuduContext.scala:65) at org.apache.kudu.spark.kudu.KuduTestSuite$class.setUpBase(KuduTestSuite.scala:131) at org.apache.kudu.backup.TestKuduBackup.setUpBase(TestKuduBackup.scala:47) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at org.apache.kudu.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:72) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66) at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32) at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93) at com.sun.proxy.$Proxy2.processTestClass(Unknown Source) at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155) at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137) at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404) at org.gradle.inter
[jira] [Updated] (KUDU-2584) Flaky testSimpleBackupAndRestore
[ https://issues.apache.org/jira/browse/KUDU-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2584: -- Attachment: TEST-org.apache.kudu.backup.TestKuduBackup.xml > Flaky testSimpleBackupAndRestore > > > Key: KUDU-2584 > URL: https://issues.apache.org/jira/browse/KUDU-2584 > Project: Kudu > Issue Type: Bug > Components: backup >Reporter: Mike Percy >Assignee: Grant Henke >Priority: Major > Attachments: TEST-org.apache.kudu.backup.TestKuduBackup.xml > > > testSimpleBackupAndRestore is flaky and tends to fail with the following > error: > {code:java} > 04:48:06.604 [ERROR - Test worker] (RetryRule.java:72) > testRandomBackupAndRestore(org.apache.kudu.backup.TestKuduBackup): failed run > 1 > java.lang.AssertionError: expected:<111> but was:<110> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:645) > at org.junit.Assert.assertEquals(Assert.java:631) > at > org.apache.kudu.backup.TestKuduBackup.testRandomBackupAndRestore(TestKuduBackup.scala:99) > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at java.lang.reflect.Method.invoke(Method.java:483) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.apache.kudu.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:68) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106) > > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) > > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) > > at > org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66) > > at > org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at java.lang.reflect.Method.invoke(Method.java:483) > at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) > > at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) > > at > org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32) > > at > org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93) > > at com.sun.proxy.$Proxy2.processTestClass(Unknown Source) > at > org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117) > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at java.lang.reflect.Method.invoke(Method.java:483) > at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) > > at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) > > at
[jira] [Commented] (KUDU-2584) Flaky testSimpleBackupAndRestore
[ https://issues.apache.org/jira/browse/KUDU-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643937#comment-16643937 ] Hao Hao commented on KUDU-2584: --- I saw testSimpleBackupAndRestore failed with another error messages: {noformat} 04:26:59.844 [ERROR - Test worker] (RetryRule.java:76) testSimpleBackupAndRestore(org.apache.kudu.backup.TestKuduBackup): failed run 1 java.security.PrivilegedActionException: org.apache.kudu.client.NoLeaderFoundException: Master config (127.17.164.190:54477,127.17.164.189:42043,127.17.164.188:38685) has no leader. at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:360) at org.apache.kudu.spark.kudu.KuduContext.(KuduContext.scala:122) at org.apache.kudu.spark.kudu.KuduContext.(KuduContext.scala:65) at org.apache.kudu.spark.kudu.KuduTestSuite$class.setUpBase(KuduTestSuite.scala:131) at org.apache.kudu.backup.TestKuduBackup.setUpBase(TestKuduBackup.scala:47) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at org.apache.kudu.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:72) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66) at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32) at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93) at com.sun.proxy.$Proxy2.processTestClass(Unknown Source) at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155) at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137) at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404) at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFai
[jira] [Created] (KUDU-2602) testRandomBackupAndRestore is flaky
Hao Hao created KUDU-2602: - Summary: testRandomBackupAndRestore is flaky Key: KUDU-2602 URL: https://issues.apache.org/jira/browse/KUDU-2602 Project: Kudu Issue Type: Improvement Reporter: Hao Hao Attachments: TEST-org.apache.kudu.backup.TestKuduBackup.xml Saw the following failure with testRandomBackupAndRestore: {noformat} java.lang.AssertionError: expected:<21> but was:<20> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:834) at org.junit.Assert.assertEquals(Assert.java:645) at org.junit.Assert.assertEquals(Assert.java:631) at org.apache.kudu.backup.TestKuduBackup.testRandomBackupAndRestore(TestKuduBackup.scala:99) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at org.apache.kudu.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:72) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66) at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32) at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93) at com.sun.proxy.$Proxy2.processTestClass(Unknown Source) at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155) at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137) at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404) at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63) at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:46) at java.util.concurrent.ThreadPoolExecutor
[jira] [Created] (KUDU-2488) tsan failure in security-itest
Hao Hao created KUDU-2488: - Summary: tsan failure in security-itest Key: KUDU-2488 URL: https://issues.apache.org/jira/browse/KUDU-2488 Project: Kudu Issue Type: Test Reporter: Hao Hao Attachments: security-itest.txt Recent run of master, I encountered a tsan failure of security-itest. Attached the log. {noformat} == WARNING: ThreadSanitizer: data race (pid=12812) Write of size 8 at 0x7b080528 by main thread: #0 operator delete(void*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_new_delete.cc:119 (security-itest+0x4eb7a1) #1 std::__1::__libcpp_deallocate(void*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/new:236:3 (libkrpc.so+0x8d8ba) #2 std::__1::allocator >::deallocate(std::__1::__tree_node*, unsigned long) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/memory:1796 (libkrpc.so+0x8d8ba) #3 std::__1::allocator_traits > >::deallocate(std::__1::allocator >&, std::__1::__tree_node*, unsigned long) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/memory:1555 (libkrpc.so+0x8d8ba) #4 std::__1::__tree, std::__1::allocator >::destroy(std::__1::__tree_node*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:1834 (libkrpc.so+0x8d8ba) #5 std::__1::__tree, std::__1::allocator >::~__tree() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:1821:3 (libkrpc.so+0x8d856) #6 std::__1::set, std::__1::allocator >::~set() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/set:400:28 (libkrpc.so+0x8bc79) #7 cxa_at_exit_wrapper(void*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:386 (security-itest+0x44c2b3)\{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KUDU-2480) tsan failure of master-stress-test
Hao Hao created KUDU-2480: - Summary: tsan failure of master-stress-test Key: KUDU-2480 URL: https://issues.apache.org/jira/browse/KUDU-2480 Project: Kudu Issue Type: Test Reporter: Hao Hao Attachments: master-stress-test.txt master-stress-test recently has been very flaky(~24%). One of the failure log {noformat}WARNING: ThreadSanitizer: data race (pid=26513) Read of size 8 at 0x7ffb5e5b88b8 by thread T65: #0 kudu::Status::operator=(kudu::Status const&) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/status.h:469:7 (libmaster.so+0x10bd00) #1 kudu::Synchronizer::StatusCB(kudu::Status const&) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/async_util.h:44:8 (libmaster.so+0x10bc40) #2 kudu::internal::RunnableAdapter::Run(kudu::Synchronizer*, kudu::Status const&) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/gutil/bind_internal.h:192:12 (libmaster.so+0x10c708) #3 kudu::internal::InvokeHelper, void ()(kudu::Synchronizer*, kudu::Status const&)>::MakeItSo(kudu::internal::RunnableAdapter, kudu::Synchronizer*, kudu::Status const&) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/gutil/bind_internal.h:889:14 (libmaster.so+0x10c5e8) #4 kudu::internal::Invoker<1, kudu::internal::BindState, void ()(kudu::Synchronizer*, kudu::Status const&), void ()(kudu::internal::UnretainedWrapper)>, void ()(kudu::Synchronizer*, kudu::Status const&)>::Run(kudu::internal::BindStateBase*, kudu::Status const&) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/gutil/bind_internal.h:1118:12 (libmaster.so+0x10c51a) #5 kudu::Callback::Run(kudu::Status const&) const /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/gutil/callback.h:436:12 (libmaster.so+0x10b831) #6 kudu::master::HmsNotificationLogListenerTask::RunLoop() /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/master/hms_notification_log_listener.cc:136:10 (libmaster.so+0x108e0a) #7 boost::_mfi::mf0::operator()(kudu::master::HmsNotificationLogListenerTask*) const /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/bind/mem_fn_template.hpp:49:29 (libmaster.so+0x110ea9) #8 void boost::_bi::list1 >::operator(), boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf0&, boost::_bi::list0&, int) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/bind/bind.hpp:259:9 (libmaster.so+0x110dfa) #9 boost::_bi::bind_t, boost::_bi::list1 > >::operator()() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/bind/bind.hpp:1222:16 (libmaster.so+0x110d83) #10 boost::detail::function::void_function_obj_invoker0, boost::_bi::list1 > >, void>::invoke(boost::detail::function::function_buffer&) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/function/function_template.hpp:159:11 (libmaster.so+0x110b79) #11 boost::function0::operator()() const /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/function/function_template.hpp:770:14 (libkrpc.so+0xb64b1) #12 kudu::Thread::SuperviseThread(void*) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/thread.cc:603:3 (libkudu_util.so+0x1bd8b4) Previous write of size 8 at 0x7ffb5e5b88b8 by thread T24 (mutexes: read M1468): #0 boost::intrusive::circular_list_algorithms >::init(boost::intrusive::list_node* const&) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/intrusive/circular_list_algorithms.hpp:72:22 
(libkrpc.so+0x99c92) #1 boost::intrusive::generic_hook >, boost::intrusive::dft_tag, (boost::intrusive::link_mode_type)1, (boost::intrusive::base_hook_type)1>::generic_hook() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/intrusive/detail/generic_hook.hpp:174:10 (libkrpc.so+0xc4669) #2 boost::intrusive::list_base_hook::list_base_hook() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/intrusive/list_hook.hpp:83:7 (libkrpc.so+0xc2049) #3 kudu::rpc::ReactorTask::ReactorTask() /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/reactor.cc:262:14 (libkrpc.so+0xbd5fb) #4 kudu::rpc::QueueTransferTask::QueueTransferTask(gscoped_ptr >, kudu::rpc::Connection*) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/connection.cc:432:3 (libkrpc.so+0x98ea4) #5 kudu::rpc::Connection::QueueResponseForCall(gscoped_ptr >) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/connection.cc:474:33 (libkrpc.so+0x94d79) #6 kudu::rpc::InboundCall::Respond(google::protobuf::MessageLite const&, bool) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/inbound_call.cc:165:10 (libkrpc.so+0xa11b9) #7 kudu::rpc::InboundCall::R
[jira] [Commented] (KUDU-2473) READ_YOUR_WRITES error on snapshot timestamp
[ https://issues.apache.org/jira/browse/KUDU-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510470#comment-16510470 ] Hao Hao commented on KUDU-2473: --- I think this is the same issue as KUDU-2415. Some relevant discussion on slack https://getkudu.slack.com/archives/C0CPXJ3CH/p152461454321. > READ_YOUR_WRITES error on snapshot timestamp > > > Key: KUDU-2473 > URL: https://issues.apache.org/jira/browse/KUDU-2473 > Project: Kudu > Issue Type: Bug > Components: impala >Reporter: Thomas Tauber-Marshall >Priority: Critical > > I'm trying to implement support for READ_YOUR_WRITES for Impala, but I'm > finding that if SetLatestObservedTimestamp() isn't called (eg. because we > haven't interacted with Kudu yet in the current session and don't have a > timestamp to set), attempting to scan tables always fails with an error of > the form: > org.apache.kudu.client.NonRecoverableException: Snapshot timestamp is earlier > than the ancient history mark. Consider increasing the value of the > configuration parameter --tablet_history_max_age_sec. Snapshot timestamp: P: > 0 usec, L: 1 Ancient History Mark: P: 1528845756599966 usec, L: 0 Physical > time difference: -1528845756.600s > Minimal repro: > {noformat} > KuduClientBuilder b = new KuduClient.KuduClientBuilder("localhost"); > KuduClient client = b.build(); > KuduTable table = client.openTable("read_mode_test"); > KuduScannerBuilder scannerBuilder = client.newScannerBuilder(table); > scannerBuilder.readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES); > KuduScanner scanner = scannerBuilder.build(); > while (scanner.hasMoreRows()) { > scanner.nextRows(); > } > {noformat} > I'm using Kudu at git hash a954418 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
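For illustration, a minimal client-side guard that sidesteps this failure until the server handles the missing timestamp: request READ_YOUR_WRITES only when the client has already observed a timestamp, and fall back to READ_LATEST otherwise. This is a sketch, not the agreed fix; it assumes KuduClient.getLastPropagatedTimestamp() is available in the client version in use and returns a negative value when nothing has been observed, and the master address and table name are placeholders taken from the repro above.
{code:java}
import org.apache.kudu.client.AsyncKuduScanner;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduScanner;
import org.apache.kudu.client.KuduTable;

public class ReadModeGuard {
  public static void main(String[] args) throws Exception {
    KuduClient client = new KuduClient.KuduClientBuilder("localhost").build();
    KuduTable table = client.openTable("read_mode_test");

    // Assumed behavior: getLastPropagatedTimestamp() is negative until the
    // client has observed a timestamp from a prior read or write.
    boolean hasObservedTimestamp = client.getLastPropagatedTimestamp() > 0;

    // Only ask for READ_YOUR_WRITES when there is something to "read your
    // writes" against; otherwise READ_LATEST avoids the zero snapshot
    // timestamp that trips the ancient-history-mark check.
    AsyncKuduScanner.ReadMode mode = hasObservedTimestamp
        ? AsyncKuduScanner.ReadMode.READ_YOUR_WRITES
        : AsyncKuduScanner.ReadMode.READ_LATEST;

    KuduScanner scanner = client.newScannerBuilder(table).readMode(mode).build();
    while (scanner.hasMoreRows()) {
      scanner.nextRows();
    }
    client.shutdown();
  }
}
{code}
This only works around the symptom on the caller's side; the underlying server-side behavior is what KUDU-2415 tracks.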
[jira] [Created] (KUDU-2430) Flaky test security-itest
Hao Hao created KUDU-2430: - Summary: Flaky test security-itest Key: KUDU-2430 URL: https://issues.apache.org/jira/browse/KUDU-2430 Project: Kudu Issue Type: Test Components: security Affects Versions: 1.7.0 Reporter: Hao Hao Attachments: security-itest.txt While running on master branch, security-itest failed with 'WARNING: ThreadSanitizer: data race'. Attached the full log. {noformat} WARNING: ThreadSanitizer: data race (pid=785) Write of size 8 at 0x7b080d68 by main thread: #0 operator delete(void*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_new_delete.cc:119 (security-itest+0x4eb7a1) #1 std::__1::__libcpp_deallocate(void*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/new:236:3 (libkrpc.so+0x8d8ba) #2 std::__1::allocator >::deallocate(std::__1::__tree_node*, unsigned long) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/memory:1796 (libkrpc.so+0x8d8ba) #3 std::__1::allocator_traits > >::deallocate(std::__1::allocator >&, std::__1::__tree_node*, unsigned long) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/memory:1555 (libkrpc.so+0x8d8ba) #4 std::__1::__tree, std::__1::allocator >::destroy(std::__1::__tree_node*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:1834 (libkrpc.so+0x8d8ba) #5 std::__1::__tree, std::__1::allocator >::~__tree() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:1821:3 (libkrpc.so+0x8d856) #6 std::__1::set, std::__1::allocator >::~set() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/set:400:28 (libkrpc.so+0x8bc79) #7 cxa_at_exit_wrapper(void*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:386 (security-itest+0x44c2b3) Previous read of size 8 at 0x7b080d68 by thread T5: #0 std::__1::__tree_end_node*>* std::__1::__tree_next_iter*>*, std::__1::__tree_node_base*>(std::__1::__tree_node_base*) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:185:14 (libkrpc.so+0x90356) #1 std::__1::__tree_const_iterator*, long>::operator++() /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:921 (libkrpc.so+0x90356) #2 void std::__1::__tree, std::__1::allocator >::__assign_multi*, long> >(std::__1::__tree_const_iterator*, long>, std::__1::__tree_const_iterator*, long>) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:1667 (libkrpc.so+0x90356) #3 std::__1::__tree, std::__1::allocator >::operator=(std::__1::__tree, std::__1::allocator > const&) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:1575:9 (libkrpc.so+0x901a4) #4 std::__1::set, std::__1::allocator >::operator=(std::__1::set, std::__1::allocator > const&) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/set:485:21 (libkrpc.so+0x870ba) #5 kudu::rpc::ClientNegotiation::SendNegotiate() /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/client_negotiation.cc:306 (libkrpc.so+0x870ba) #6 kudu::rpc::ClientNegotiation::Negotiate(std::__1::unique_ptr >*) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/client_negotiation.cc:171:5 
(libkrpc.so+0x8693a) #7 kudu::rpc::DoClientNegotiation(kudu::rpc::Connection*, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime, std::__1::unique_ptr >*) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/negotiation.cc:217:3 (libkrpc.so+0xb02b4) #8 kudu::rpc::Negotiation::RunNegotiation(scoped_refptr const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/negotiation.cc:294:9 (libkrpc.so+0xaf42b) #9 kudu::internal::RunnableAdapter const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>::Run(scoped_refptr const&, kudu::TriStateFlag const&, kudu::TriStateFlag const&, kudu::MonoTime const&) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/gutil/bind_internal.h:356:12 (libkrpc.so+0xc853d) #10 kudu::internal::InvokeHelper const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>, void ()(kudu::rpc::Connection*, kudu::TriStateFlag const&, kudu::TriStateFlag const&, kudu::MonoTime const&)>::MakeItSo(kudu::internal::RunnableAdapter const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime)>, kudu::rpc::Connection*, kudu::TriStateFlag const&, kudu::TriStateFlag const&, kudu::MonoT
[jira] [Created] (KUDU-2425) Flaky test rpc_server-test
Hao Hao created KUDU-2425: - Summary: Flaky test rpc_server-test Key: KUDU-2425 URL: https://issues.apache.org/jira/browse/KUDU-2425 Project: Kudu Issue Type: Test Affects Versions: 1.7.0 Reporter: Hao Hao Attachments: rpc_server-test.txt While running on master branch, rpc_server-test failed with 'Check failed: 0 == rv (0 vs. 16)'. Attached the full log. {noformat} F0427 23:04:30.496054 497 mutex.cc:77] Check failed: 0 == rv (0 vs. 16) . Device or resource busy *** Check failure stack trace: *** *** Aborted at 1524895470 (unix time) try "date -d @1524895470" if you are using GNU date *** PC: @ 0x397f632625 (unknown) *** SIGABRT (@0x45201f1) received by PID 497 (TID 0x7fa80ffef980) from PID 497; stack trace: *** @ 0x397fa0f710 (unknown) at ??:0 @ 0x397f632625 (unknown) at ??:0 @ 0x397f633e05 (unknown) at ??:0 @ 0x7fa8104c4a29 google::logging_fail() at ??:0 @ 0x7fa8104c631d google::LogMessage::Fail() at ??:0 @ 0x7fa8104c81dd google::LogMessage::SendToLog() at ??:0 @ 0x7fa8104c5e59 google::LogMessage::Flush() at ??:0 F0427 23:04:30.496255 518 mutex.cc:83] Check failed: rv == 0 || rv == 16 . Invalid argument. Owner tid: 0; Self tid: 7; To collect the owner stack trace, enable the flag --debug_mutex_collect_stacktraceF0427 23:04:30.496286 509 mutex.cc:83] Check failed: rv == 0 || rv == 16 . Invalid argument. Owner tid: 0; Self tid: 8; To collect the owner stack trace, enable the flag --debug_mutex_collect_stacktrace *** Check failure stack trace: *** {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KUDU-2352) Add an API to allow bounded staleness scan
[ https://issues.apache.org/jira/browse/KUDU-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao reassigned KUDU-2352: - Assignee: Hao Hao > Add an API to allow bounded staleness scan > -- > > Key: KUDU-2352 > URL: https://issues.apache.org/jira/browse/KUDU-2352 > Project: Kudu > Issue Type: Improvement > Components: api >Affects Versions: 1.7.0 >Reporter: Hao Hao >Assignee: Hao Hao >Priority: Major > > It would be nice to have an API that allows clients to specify a timestamp so > that in either READ_AT_SNAPSHOT or READ_YOUR_WRITES mode the chosen snapshot > timestamp is guaranteed to be higher than the given bound. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
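No such bound API exists yet; as a rough approximation with the current client, a caller can propagate the desired lower bound before a READ_YOUR_WRITES scan, since that mode picks a snapshot timestamp at or above the last propagated timestamp. A minimal sketch, assuming KuduClient.updateLastPropagatedTimestamp() is exposed in the client version in use and that the bound is an encoded hybrid-time value obtained from another Kudu client (not a wall-clock time); the method and table names are illustrative.
{code:java}
import org.apache.kudu.client.AsyncKuduScanner;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduScanner;
import org.apache.kudu.client.KuduTable;

public class BoundedStalenessSketch {
  // Scan 'tableName' with a snapshot no older than 'lowerBoundHT', an encoded
  // hybrid-time timestamp previously obtained from some Kudu client (for
  // example via getLastPropagatedTimestamp() after a write elsewhere).
  static long scanNoStalerThan(KuduClient client, String tableName,
                               long lowerBoundHT) throws Exception {
    // Propagating the bound makes the READ_YOUR_WRITES snapshot timestamp
    // land at or above it.
    client.updateLastPropagatedTimestamp(lowerBoundHT);
    KuduTable table = client.openTable(tableName);
    KuduScanner scanner = client.newScannerBuilder(table)
        .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES)
        .build();
    long rows = 0;
    while (scanner.hasMoreRows()) {
      rows += scanner.nextRows().getNumRows();
    }
    return rows;
  }
}
{code}
A first-class API, as proposed in this issue, would let callers express the bound directly instead of going through timestamp propagation.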
[jira] [Assigned] (KUDU-2415) READ_YOUR_WRITES scan with no prior operation fails
[ https://issues.apache.org/jira/browse/KUDU-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao reassigned KUDU-2415: - Assignee: Hao Hao > READ_YOUR_WRITES scan with no prior operation fails > --- > > Key: KUDU-2415 > URL: https://issues.apache.org/jira/browse/KUDU-2415 > Project: Kudu > Issue Type: Bug > Components: client, tserver >Affects Versions: 1.7.0 >Reporter: Todd Lipcon >Assignee: Hao Hao >Priority: Major > > If I create a new Java client, and then perform a scan in READ_YOUR_WRITES > mode without having done any prior operations from that client, it sends an > RPC with read_mode=READ_YOUR_WRITES but without any propagated or snapshot > timestamp field set. The server seems to interpret this as a value '0' and > then fails with the error: > Snapshot timestamp is earlier than the ancient history mark. Consider > increasing the value of the configuration parameter > --tablet_history_max_age_sec. Snapshot timestamp: P: 0 usec, L: 1 Ancient > History Mark: P: 1524607330402072 usec, L: 0 Physical time difference: > -1524607330.402s -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-2415) READ_YOUR_WRITES scan with no prior operation fails
[ https://issues.apache.org/jira/browse/KUDU-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451414#comment-16451414 ] Hao Hao commented on KUDU-2415: --- I think one way to mitigate this is KUDU-2352. > READ_YOUR_WRITES scan with no prior operation fails > --- > > Key: KUDU-2415 > URL: https://issues.apache.org/jira/browse/KUDU-2415 > Project: Kudu > Issue Type: Bug > Components: client, tserver >Affects Versions: 1.7.0 >Reporter: Todd Lipcon >Priority: Major > > If I create a new Java client, and then perform a scan in READ_YOUR_WRITES > mode without having done any prior operations from that client, it sends an > RPC with read_mode=READ_YOUR_WRITES but without any propagated or snapshot > timestamp field set. The server seems to interpret this as a value '0' and > then fails with the error: > Snapshot timestamp is earlier than the ancient history mark. Consider > increasing the value of the configuration parameter > --tablet_history_max_age_sec. Snapshot timestamp: P: 0 usec, L: 1 Ancient > History Mark: P: 1524607330402072 usec, L: 0 Physical time difference: > -1524607330.402s -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-2390) ITClient fails with "Row count unexpectedly decreased"
[ https://issues.apache.org/jira/browse/KUDU-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418161#comment-16418161 ] Hao Hao commented on KUDU-2390: --- [~tlipcon] Right, it looks like one scanner was expired and retried. I will loop the test to try to reproduce and debug it. > ITClient fails with "Row count unexpectedly decreased" > -- > > Key: KUDU-2390 > URL: https://issues.apache.org/jira/browse/KUDU-2390 > Project: Kudu > Issue Type: Bug > Components: java, test >Affects Versions: 1.7.0 >Reporter: Todd Lipcon >Priority: Critical > Attachments: Stdout.txt.gz > > > On master, hit the following failure of ITClient: > {code} > 20:05:05.407 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:934) > AsyncKuduScanner$Response(scannerId = "6ddf5d0da48241aea4b9eb51645716cc", > data = RowResultIterator for 27600 rows, more = true, responseScanTimestamp = > 6234957022375723008) for scanner > 20:05:05.407 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:447) Scanner > "6ddf5d0da48241aea4b9eb51645716cc" opened on > d78cb5506f6e4e17bd54fdaf1819a8a2@[729d64003e7740cabb650f8f6aea4af6(127.1.76.194:60468),7a2e5f9b2be9497fadc30b81a6a50b24(127.1.76.19 > 20:05:05.409 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:934) > AsyncKuduScanner$Response(scannerId = "", data = RowResultIterator for 7314 > rows, more = false) for scanner > KuduScanner(table=org.apache.kudu.client.ITClient-1522206255318, tablet=d78c > 20:05:05.409 [INFO - Thread-4] (ITClient.java:397) New row count 90114 > 20:05:05.414 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:934) > AsyncKuduScanner$Response(scannerId = "c230614ad13e40478254b785995d1d7c", > data = RowResultIterator for 27600 rows, more = true, responseScanTimestamp = > 6234957022413987840) for scanner > 20:05:05.414 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:447) Scanner > "c230614ad13e40478254b785995d1d7c" opened on > d78cb5506f6e4e17bd54fdaf1819a8a2@[729d64003e7740cabb650f8f6aea4af6(127.1.76.194:60468),7a2e5f9b2be9497fadc30b81a6a50b24(127.1.76.19 > 20:05:05.419 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:934) > AsyncKuduScanner$Response(scannerId = "", data = RowResultIterator for 27600 > rows, more = true) for scanner > KuduScanner(table=org.apache.kudu.client.ITClient-1522206255318, tablet=d78c > 20:05:05.420 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:934) > AsyncKuduScanner$Response(scannerId = "", data = RowResultIterator for 7342 > rows, more = false) for scanner > KuduScanner(table=org.apache.kudu.client.ITClient-1522206255318, tablet=d78c > 20:05:05.421 [ERROR - Thread-4] (ITClient.java:134) Row count unexpectedly > decreased from 90114to 62542 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
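One simple way to loop the test locally, as mentioned in the comment above, is to drive the JUnit 4 runner programmatically until the flaky assertion trips. This is a generic harness sketch, not project tooling; it assumes the Kudu Java test classes and a JUnit 4 runtime are on the classpath.
{code:java}
import org.junit.runner.JUnitCore;
import org.junit.runner.Result;
import org.junit.runner.notification.Failure;

public class LoopFlakyTest {
  public static void main(String[] args) throws Exception {
    // Re-run the flaky test class until it fails (or we give up), so the
    // "Row count unexpectedly decreased" condition can be caught with extra
    // logging or a debugger attached.
    Class<?> testClass = Class.forName("org.apache.kudu.client.ITClient");
    for (int i = 1; i <= 100; i++) {
      Result result = JUnitCore.runClasses(testClass);
      if (!result.wasSuccessful()) {
        System.out.println("Failed on iteration " + i);
        for (Failure failure : result.getFailures()) {
          System.out.println(failure.getTrace());
        }
        return;
      }
      System.out.println("Iteration " + i + " passed");
    }
  }
}
{code}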
[jira] [Updated] (KUDU-2363) Investigate if we should use ServiceCredentialProvider for Spark integration
[ https://issues.apache.org/jira/browse/KUDU-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-2363: -- Description: Spark 2 provides a \{{ServiceCredentialProvider}} implementation for integration with other services on a secure cluster. Here is related [documentation|https://spark.apache.org/docs/2.1.0/running-on-yarn.html#running-in-a-secure-cluster] although lacking in detail. We should probably investigate if we want to use it to avoid asking users to provide the keytab, since it might not be a good practice. was: Spark 2 provides a \{ServiceCredentialProvider} implementation for integration with other services on a secure cluster. Here is related [documentation|https://spark.apache.org/docs/2.1.0/running-on-yarn.html#running-in-a-secure-cluster] although lacking in detail. We should probably investigate if we want to use it to avoid asking users to provide the keytab, since it might not be a good practice. > Investigate if we should use ServiceCredentialProvider for Spark integration > > > Key: KUDU-2363 > URL: https://issues.apache.org/jira/browse/KUDU-2363 > Project: Kudu > Issue Type: Improvement >Reporter: Hao Hao >Priority: Major > > Spark 2 provides a \{{ServiceCredentialProvider}} implementation for > integration with other services on a secure cluster. Here is related > [documentation|https://spark.apache.org/docs/2.1.0/running-on-yarn.html#running-in-a-secure-cluster] > although lacking in detail. > We should probably investigate if we want to use it to avoid asking users to > provide the keytab, since it might not be a good practice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KUDU-2363) Investigate if we should use ServiceCredentialProvider for Spark integration
Hao Hao created KUDU-2363: - Summary: Investigate if we should use ServiceCredentialProvider for Spark integration Key: KUDU-2363 URL: https://issues.apache.org/jira/browse/KUDU-2363 Project: Kudu Issue Type: Improvement Reporter: Hao Hao Spark 2 provides a \{ServiceCredentialProvider} implementation for integration with other services on a secure cluster. Here is related [documentation|https://spark.apache.org/docs/2.1.0/running-on-yarn.html#running-in-a-secure-cluster] although lacking in detail. We should probably investigate if we want to use it to avoid asking users to provide the keytab, since it might not be a good practice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KUDU-1704) Add a new read mode to perform bounded staleness snapshot reads
[ https://issues.apache.org/jira/browse/KUDU-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-1704: -- Description: It would be useful to be able to perform snapshot reads at a timestamp that is higher than client's previous writes (achieve read-your-writes), thus improving recency, but lower than the server's oldest inflight transaction, thus minimizing the scan's chance to block. Such a mode would not guarantee linearizability, but would still allow for client-local read-your-writes, which seems to be one of the properties users care about the most. The detailed design of the mode is available in the linked design doc. This should likely be the new default read mode for scanners. was: It would be useful to be able to perform snapshot reads at a timestamp that is higher than a client previous writes (achieve read-your-writes), thus improving recency, but lower than the server's oldest inflight transaction, thus minimizing the scan's chance to block. Such a mode would not guarantee linearizability, but would still allow for client-local read-your-writes, which seems to be one of the properties users care about the most. The detailed design of the mode is available in the linked design doc. This should likely be the new default read mode for scanners. > Add a new read mode to perform bounded staleness snapshot reads > --- > > Key: KUDU-1704 > URL: https://issues.apache.org/jira/browse/KUDU-1704 > Project: Kudu > Issue Type: Sub-task > Components: client >Affects Versions: 1.1.0 >Reporter: David Alves >Assignee: Hao Hao >Priority: Major > Labels: consistency > Fix For: 1.7.0 > > > It would be useful to be able to perform snapshot reads at a timestamp that > is higher than client's previous writes (achieve read-your-writes), thus > improving recency, but lower than the server's oldest inflight transaction, > thus minimizing the scan's chance to block. > Such a mode would not guarantee linearizability, but would still allow for > client-local read-your-writes, which seems to be one of the properties users > care about the most. > The detailed design of the mode is available in the linked design doc. > This should likely be the new default read mode for scanners. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
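For context, a minimal sketch of how the mode is used from the Java client: rows written through a client are visible to a subsequent READ_YOUR_WRITES scan issued by that same client. The table name and the INT32 "key" column are made-up placeholders, and the snippet assumes the table already exists.
{code:java}
import org.apache.kudu.client.AsyncKuduScanner;
import org.apache.kudu.client.Insert;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduScanner;
import org.apache.kudu.client.KuduSession;
import org.apache.kudu.client.KuduTable;

public class ReadYourWritesExample {
  public static void main(String[] args) throws Exception {
    KuduClient client = new KuduClient.KuduClientBuilder("master-host").build();
    KuduTable table = client.openTable("example_table");

    // Write a row through this client; the client tracks the write's timestamp.
    KuduSession session = client.newSession();
    Insert insert = table.newInsert();
    insert.getRow().addInt("key", 42);
    session.apply(insert);
    session.flush();

    // A READ_YOUR_WRITES scan from the same client picks a snapshot at least as
    // recent as that write, so the inserted row is guaranteed to be visible,
    // without requiring a fully linearizable snapshot.
    KuduScanner scanner = client.newScannerBuilder(table)
        .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES)
        .build();
    long count = 0;
    while (scanner.hasMoreRows()) {
      count += scanner.nextRows().getNumRows();
    }
    System.out.println("Rows visible: " + count);
    client.shutdown();
  }
}
{code}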
[jira] [Created] (KUDU-2352) Add an API to allow bounded staleness scan
Hao Hao created KUDU-2352: - Summary: Add an API to allow bounded staleness scan Key: KUDU-2352 URL: https://issues.apache.org/jira/browse/KUDU-2352 Project: Kudu Issue Type: Improvement Components: api Affects Versions: 1.7.0 Reporter: Hao Hao It would be nice to have an API that allows clients to specify a timestamp so that in either READ_AT_SNAPSHOT or READ_YOUR_WRITES mode the chosen snapshot timestamp is guaranteed to be higher than the given bound. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KUDU-1704) Add a new read mode to perform bounded staleness snapshot reads
[ https://issues.apache.org/jira/browse/KUDU-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated KUDU-1704: -- Description: It would be useful to be able to perform snapshot reads at a timestamp that is higher than a client previous writes (achieve read-your-writes), thus improving recency, but lower that the server's oldest inflight transaction, thus minimizing the scan's chance to block. Such a mode would not guarantee linearizability, but would still allow for client-local read-your-writes, which seems to be one of the properties users care about the most. The detail design of the mode is available in the linked design doc. This should likely be the new default read mode for scanners. was: It would be useful to be able to perform snapshot reads at a timestamp that is higher than a client provided timestamp, thus improving recency, but lower that the server's oldest inflight transaction, thus minimizing the scan's chance to block. Such a mode would not guarantee linearizability, but would still allow for client-local read-your-writes, which seems to be one of the properties users care about the most. This should likely be the new default read mode for scanners. > Add a new read mode to perform bounded staleness snapshot reads > --- > > Key: KUDU-1704 > URL: https://issues.apache.org/jira/browse/KUDU-1704 > Project: Kudu > Issue Type: Sub-task > Components: client >Affects Versions: 1.1.0 >Reporter: David Alves >Assignee: Hao Hao >Priority: Major > Labels: consistency > Fix For: 1.7.0 > > > It would be useful to be able to perform snapshot reads at a timestamp that > is higher than a client previous writes (achieve read-your-writes), thus > improving recency, but lower that the server's oldest inflight transaction, > thus minimizing the scan's chance to block. > Such a mode would not guarantee linearizability, but would still allow for > client-local read-your-writes, which seems to be one of the properties users > care about the most. > The detail design of the mode is available in the linked design doc. > This should likely be the new default read mode for scanners. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KUDU-1704) Add a new read mode to perform bounded staleness snapshot reads
[ https://issues.apache.org/jira/browse/KUDU-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399598#comment-16399598 ] Hao Hao edited comment on KUDU-1704 at 3/15/18 3:33 AM: Fixed in commits 723ced836, 5047f091d, 0c05e8375, 9d233f457. was (Author: hahao): Fixed in commits 723ced836, 5047f091d, 0c05e8375, 48cdaaa17. > Add a new read mode to perform bounded staleness snapshot reads > --- > > Key: KUDU-1704 > URL: https://issues.apache.org/jira/browse/KUDU-1704 > Project: Kudu > Issue Type: Sub-task > Components: client >Affects Versions: 1.1.0 >Reporter: David Alves >Assignee: Hao Hao >Priority: Major > Labels: consistency > Fix For: 1.7.0 > > > It would be useful to be able to perform snapshot reads at a timestamp that > is higher than a client provided timestamp, thus improving recency, but lower > that the server's oldest inflight transaction, thus minimizing the scan's > chance to block. > Such a mode would not guarantee linearizability, but would still allow for > client-local read-your-writes, which seems to be one of the properties users > care about the most. > This should likely be the new default read mode for scanners. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-1704) Add a new read mode to perform bounded staleness snapshot reads
[ https://issues.apache.org/jira/browse/KUDU-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399865#comment-16399865 ] Hao Hao commented on KUDU-1704: --- Hmm, I guess I used my local commit hash. Just verified it should be 9d233f457 instead. I think the description in the jira does not specifically mention that it is snapshot consistency across all tablets, though it does mention that it is bounded staleness. Currently, there is no API to specify the staleness bound yet. So to make this clearer, I linked the design doc and will update the description and file a follow-up Jira to add an API to specify the staleness bound. > Add a new read mode to perform bounded staleness snapshot reads > --- > > Key: KUDU-1704 > URL: https://issues.apache.org/jira/browse/KUDU-1704 > Project: Kudu > Issue Type: Sub-task > Components: client >Affects Versions: 1.1.0 >Reporter: David Alves >Assignee: Hao Hao >Priority: Major > Labels: consistency > Fix For: 1.7.0 > > > It would be useful to be able to perform snapshot reads at a timestamp that > is higher than a client provided timestamp, thus improving recency, but lower > than the server's oldest inflight transaction, thus minimizing the scan's > chance to block. > Such a mode would not guarantee linearizability, but would still allow for > client-local read-your-writes, which seems to be one of the properties users > care about the most. > This should likely be the new default read mode for scanners. -- This message was sent by Atlassian JIRA (v7.6.3#76005)