[jira] [Created] (KUDU-3245) Provide Client API to set verbose logging filtered by vmodule

2021-02-16 Thread Hao Hao (Jira)
Hao Hao created KUDU-3245:
-

 Summary: Provide Client API to set verbose logging filtered by 
vmodule 
 Key: KUDU-3245
 URL: https://issues.apache.org/jira/browse/KUDU-3245
 Project: Kudu
  Issue Type: Improvement
  Components: client
Reporter: Hao Hao


Similar to the 
[{{client::SetVerboseLogLevel}}|https://github.com/apache/kudu/blob/master/src/kudu/client/client.h#L164]
 API, it would be nice to add another API that allows enabling verbose logging 
filtered by module (vmodule).
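
For illustration, a minimal sketch of what such an API could look like if it 
were layered on glog's per-module verbosity (the same mechanism the --vmodule 
flag uses). The function name, signature, and parsing below are hypothetical, 
not an existing part of client.h:
{noformat}
// Hypothetical sketch only -- SetVerboseLogLevelByModule() does not exist in
// the Kudu client today. It applies a glog-style spec such as
// "meta_cache=3,client=2" by calling google::SetVLOGLevel() per module,
// mirroring what the --vmodule flag does on the servers.
#include <cstdlib>
#include <sstream>
#include <string>

#include <glog/logging.h>

namespace kudu {
namespace client {

bool SetVerboseLogLevelByModule(const std::string& vmodule_spec) {
  std::stringstream ss(vmodule_spec);
  std::string entry;
  while (std::getline(ss, entry, ',')) {
    const size_t eq = entry.find('=');
    if (eq == std::string::npos) {
      return false;  // malformed entry, e.g. missing "=level"
    }
    const std::string module = entry.substr(0, eq);
    const int level = std::atoi(entry.substr(eq + 1).c_str());
    google::SetVLOGLevel(module.c_str(), level);
  }
  return true;
}

}  // namespace client
}  // namespace kudu
{noformat}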



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-3237) MaintenanceManagerTest.TestCompletedOpsHistory is flaky

2021-01-25 Thread Hao Hao (Jira)
Hao Hao created KUDU-3237:
-

 Summary: MaintenanceManagerTest.TestCompletedOpsHistory is flaky
 Key: KUDU-3237
 URL: https://issues.apache.org/jira/browse/KUDU-3237
 Project: Kudu
  Issue Type: Test
Reporter: Hao Hao


Came across a test failure in MaintenanceManagerTest.TestCompletedOpsHistory 
like the following:
{noformat}
I0125 19:55:10.782884 24454 maintenance_manager.cc:594] P 12345: op5 complete. 
Timing: real 0.000s  user 0.000s sys 0.000s Metrics: {}
/data/1/hao/Repositories/kudu/src/kudu/util/maintenance_manager-test.cc:525: 
Failure
  Expected: std::min(kHistorySize, i + 1)
  Which is: 6
To be equal to: status_pb.completed_operations_size()
  Which is: 5
I0125 19:55:10.783524 24420 test_util.cc:148] 
---
I0125 19:55:10.783561 24420 test_util.cc:149] Had fatal failures, leaving test 
files at 
/tmp/dist-test-task1ofSWE/test-tmp/maintenance_manager-test.0.MaintenanceManagerTest.TestCompletedOpsHistory.1611604508702756-24420
{noformat}
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-3118) Validate --tserver_enforce_access_control is set when authorization is enabled in Master

2020-05-11 Thread Hao Hao (Jira)
Hao Hao created KUDU-3118:
-

 Summary: Validate --tserver_enforce_access_control is set when 
authorization is enabled in Master 
 Key: KUDU-3118
 URL: https://issues.apache.org/jira/browse/KUDU-3118
 Project: Kudu
  Issue Type: Task
Reporter: Hao Hao


As mentioned in the code review 
[https://gerrit.cloudera.org/c/15897/1/docs/security.adoc#476], it would be 
nice to add some validation (perhaps in ksck) that --tserver_enforce_access_control 
is set on the tablet servers when fine-grained authorization is enabled on the 
master.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-3091) Support ownership privilege with Ranger

2020-03-23 Thread Hao Hao (Jira)
Hao Hao created KUDU-3091:
-

 Summary: Support ownership privilege with Ranger
 Key: KUDU-3091
 URL: https://issues.apache.org/jira/browse/KUDU-3091
 Project: Kudu
  Issue Type: Task
Reporter: Hao Hao


Currently, ownership privilege in Ranger is not available because Kudu has no 
concept of an owner and does not store owner information internally. It would be 
nice to enable it once Kudu introduces owners.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KUDU-3090) Add owner concept in Kudu

2020-03-23 Thread Hao Hao (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-3090:
--
Description: As mentioned in the Ranger integration design doc, Ranger 
supports ownership privilege by creating a default policy that allows \{OWNER} 
of a resource to access it without creating an additional policy manually. 
Unless Kudu actually has full support for owners, ownership privilege is not 
possible with the Ranger integration.  (was: As mentioned in the Ranger 
integration design doc, Ranger supports ownership privilege by creating a 
default policy that allows {OWNER} of a resource to access it without creating 
an additional policy manually. Unless Kudu actually has full support for 
owners, ownership privilege is not possible with the Ranger integration.)

> Add owner concept in Kudu
> -
>
> Key: KUDU-3090
> URL: https://issues.apache.org/jira/browse/KUDU-3090
> Project: Kudu
>  Issue Type: New Feature
>Reporter: Hao Hao
>Priority: Major
>
> As mentioned in the Ranger integration design doc, Ranger supports ownership 
> privilege by creating a default policy that allows \{OWNER} of a resource to 
> access it without creating an additional policy manually. Unless Kudu actually 
> has full support for owners, ownership privilege is not possible with the 
> Ranger integration.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-3090) Add owner concept in Kudu

2020-03-23 Thread Hao Hao (Jira)
Hao Hao created KUDU-3090:
-

 Summary: Add owner concept in Kudu
 Key: KUDU-3090
 URL: https://issues.apache.org/jira/browse/KUDU-3090
 Project: Kudu
  Issue Type: New Feature
Reporter: Hao Hao


As mentioned in the Ranger integration design doc, Ranger supports ownership 
privilege by creating a default policy that allows {OWNER} of a resource to 
access it without creating an additional policy manually. Unless Kudu actually 
has full support for owners, ownership privilege is not possible with the 
Ranger integration.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KUDU-2971) Add a generic Java library wrapper

2020-03-17 Thread Hao Hao (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao resolved KUDU-2971.
---
Fix Version/s: 1.12.0
   Resolution: Fixed

> Add a generic Java library wrapper
> --
>
> Key: KUDU-2971
> URL: https://issues.apache.org/jira/browse/KUDU-2971
> Project: Kudu
>  Issue Type: Sub-task
>Affects Versions: 1.11.0
>Reporter: Hao Hao
>Assignee: Hao Hao
>Priority: Major
> Fix For: 1.12.0
>
>
> For the Ranger integration, to call the Java Ranger plugin from masters, we need 
> to create a wrapper (via a Java subprocess). This should be generic enough to be 
> used by future integrations (e.g. Atlas) that need to call other Java libraries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KUDU-2972) Add Ranger authorization provider

2020-03-17 Thread Hao Hao (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao resolved KUDU-2972.
---
Fix Version/s: 1.12.0
   Resolution: Fixed

> Add Ranger authorization provider
> -
>
> Key: KUDU-2972
> URL: https://issues.apache.org/jira/browse/KUDU-2972
> Project: Kudu
>  Issue Type: Sub-task
>Affects Versions: 1.11.0
>Reporter: Hao Hao
>Assignee: Attila Bukor
>Priority: Major
> Fix For: 1.12.0
>
>
> For the Ranger integration, we need to create a Ranger authorization provider to 
> retrieve authorization decisions from the wrapped Ranger plugin.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KUDU-3076) Add a Kudu cli for granting/revoking Ranger privileges

2020-03-14 Thread Hao Hao (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-3076:
--
Description: Even though Ranger has a GUI for policies management (and can 
be accessed via REST API), it probably will be more user friendly to have a 
Kudu cli tool for granting and revoking privileges.  (was: Even though Ranger 
has a UGI for policies management (and can be accessed via REST API), it 
probably will be more user friendly to have a Kudu cli tool for granting and 
revoking privileges.)

> Add a Kudu cli for granting/revoking Ranger privileges
> --
>
> Key: KUDU-3076
> URL: https://issues.apache.org/jira/browse/KUDU-3076
> Project: Kudu
>  Issue Type: Task
>Reporter: Hao Hao
>Priority: Major
>
> Even though Ranger has a GUI for policies management (and can be accessed via 
> REST API), it probably will be more user friendly to have a Kudu cli tool for 
> granting and revoking privileges.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-3076) Add a Kudu cli for granting/revoking Ranger privileges

2020-03-14 Thread Hao Hao (Jira)
Hao Hao created KUDU-3076:
-

 Summary: Add a Kudu cli for granting/revoking Ranger privileges
 Key: KUDU-3076
 URL: https://issues.apache.org/jira/browse/KUDU-3076
 Project: Kudu
  Issue Type: Task
Reporter: Hao Hao


Even though Ranger has a GUI for policies management (and can be accessed via 
REST API), it probably will be more user friendly to have a Kudu cli tool for 
granting and revoking privileges.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KUDU-2973) Support semi-database concept even without HMS integration

2020-01-13 Thread Hao Hao (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2973:
--
Description: For the Ranger integration, we need to continue to have 
semi-database support. Currently it is tied to the HMS integration and needs 
to be separated out. This includes extracting database information from the 
table name in order to retrieve the corresponding Ranger policies. For 
example, "db.table" belongs to 'db'. And since Kudu table names are case 
sensitive and can contain special characters, a table named "table" will be 
treated as table 'table' in the 'default' database, and "db.table.abc" will be 
treated as table 'table.abc' in the 'db' database.  (was: For Ranger 
integration, we need to continue to have semi-database support. And currently 
it is tied to the HMS integration, which needs to be separated out.)
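
A minimal sketch of the name-splitting rule described above (split on the 
first '.', and unqualified names fall into the 'default' database). The helper 
name is made up for illustration; this is not the actual Kudu implementation:
{noformat}
// Illustrative only: split a Kudu table name into the (database, table) pair
// used to look up the corresponding Ranger policies.
#include <string>
#include <utility>

std::pair<std::string, std::string> SplitKuduTableName(const std::string& name) {
  const size_t dot = name.find('.');
  if (dot == std::string::npos) {
    return {"default", name};      // "table"        -> ('default', 'table')
  }
  return {name.substr(0, dot),     // "db.table.abc" -> ('db', 'table.abc')
          name.substr(dot + 1)};
}
{noformat}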

> Support semi-database concept even without HMS integration
> --
>
> Key: KUDU-2973
> URL: https://issues.apache.org/jira/browse/KUDU-2973
> Project: Kudu
>  Issue Type: Sub-task
>Affects Versions: 1.11.0
>Reporter: Hao Hao
>Assignee: Attila Bukor
>Priority: Major
>
> For the Ranger integration, we need to continue to have semi-database support. 
> Currently it is tied to the HMS integration and needs to be separated out. 
> This includes extracting database information from the table name in order to 
> retrieve the corresponding Ranger policies. For example, "db.table" belongs to 
> 'db'. And since Kudu table names are case sensitive and can contain special 
> characters, a table named "table" will be treated as table 'table' in the 
> 'default' database, and "db.table.abc" will be treated as table 'table.abc' in 
> the 'db' database.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-3006) RebalanceIgnoredTserversTest.Basic is flaky

2019-11-21 Thread Hao Hao (Jira)
Hao Hao created KUDU-3006:
-

 Summary: RebalanceIgnoredTserversTest.Basic is flaky
 Key: KUDU-3006
 URL: https://issues.apache.org/jira/browse/KUDU-3006
 Project: Kudu
  Issue Type: Bug
Reporter: Hao Hao
 Attachments: rebalancer_tool-test.1.txt

RebalanceIgnoredTserversTest.Basic of the rebalancer_tool-test sometimes fails 
with an error like the one below. I attached the full test log.

{noformat}
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/rebalancer_tool-test.cc:350:
 Failure
Value of: out
Expected: has substring "2dd9365c71c54e5d83294b31046c5478 | 0"
  Actual: "Per-server replica distribution summary for tservers_to_empty:\n 
  Server UUID| Replica 
Count\n--+---\n 
2dd9365c71c54e5d83294b31046c5478 | 1\n\nPer-server replica distribution 
summary:\n   Statistic   |  Value\n---+--\n 
Minimum Replica Count | 0\n Maximum Replica Count | 1\n Average Replica Count | 
0.50\n\nPer-table replica distribution summary:\n Replica Skew |  
Value\n--+--\n Minimum  | 1\n Maximum  | 1\n 
Average  | 1.00\n\n\nrebalancing is complete: cluster is balanced 
(moved 0 replicas)\n" (of type std::string)
{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-3003) TestAsyncKuduSession.testTabletCacheInvalidatedDuringWrites is flaky

2019-11-19 Thread Hao Hao (Jira)
Hao Hao created KUDU-3003:
-

 Summary: 
TestAsyncKuduSession.testTabletCacheInvalidatedDuringWrites is flaky
 Key: KUDU-3003
 URL: https://issues.apache.org/jira/browse/KUDU-3003
 Project: Kudu
  Issue Type: Bug
Reporter: Hao Hao
 Attachments: test-output.txt

testTabletCacheInvalidatedDuringWrites of the 
org.apache.kudu.client.TestAsyncKuduSession test sometimes fails with an error 
like the one below. I attached the full test log.
{noformat}
There was 1 failure:
1) 
testTabletCacheInvalidatedDuringWrites(org.apache.kudu.client.TestAsyncKuduSession)
org.apache.kudu.client.PleaseThrottleException: all buffers are currently 
flushing
at 
org.apache.kudu.client.AsyncKuduSession.apply(AsyncKuduSession.java:579)
at 
org.apache.kudu.client.TestAsyncKuduSession.testTabletCacheInvalidatedDuringWrites(TestAsyncKuduSession.java:371)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-2973) Support semi-database concept even without HMS integration

2019-10-13 Thread Hao Hao (Jira)
Hao Hao created KUDU-2973:
-

 Summary: Support semi-database concept even without HMS integration
 Key: KUDU-2973
 URL: https://issues.apache.org/jira/browse/KUDU-2973
 Project: Kudu
  Issue Type: Sub-task
Affects Versions: 1.11.0
Reporter: Hao Hao


For Ranger integration, we need to continue to have semi-database support. And 
currently it is tied to the HMS integration, which needs to be separated out.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KUDU-2970) Fine-grained authorization with Ranger

2019-10-13 Thread Hao Hao (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao reassigned KUDU-2970:
-

Assignee: Hao Hao

>  Fine-grained authorization with Ranger
> ---
>
> Key: KUDU-2970
> URL: https://issues.apache.org/jira/browse/KUDU-2970
> Project: Kudu
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 1.11.0
>Reporter: Hao Hao
>Assignee: Hao Hao
>Priority: Major
>
> With the completion of Kudu’s integration with Apache Sentry, fine-grained 
> authorization capabilities have been added to Kudu. However, because Apache 
> Ranger has wider adoption and provides more comprehensive security features 
> (such as attribute-based access control, auditing, etc.) than Sentry, it is 
> important for Kudu to also integrate with Ranger.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-2972) Add Ranger authorization provider

2019-10-13 Thread Hao Hao (Jira)
Hao Hao created KUDU-2972:
-

 Summary: Add Ranger authorization provider
 Key: KUDU-2972
 URL: https://issues.apache.org/jira/browse/KUDU-2972
 Project: Kudu
  Issue Type: Sub-task
Affects Versions: 1.11.0
Reporter: Hao Hao


For the Ranger integration, we need to create a Ranger authorization provider to 
retrieve authorization decisions from the wrapped Ranger plugin.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KUDU-2970) Fine-grained authorization with Ranger

2019-10-13 Thread Hao Hao (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2970:
--
Affects Version/s: (was: 1.10.0)
   1.11.0

>  Fine-grained authorization with Ranger
> ---
>
> Key: KUDU-2970
> URL: https://issues.apache.org/jira/browse/KUDU-2970
> Project: Kudu
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 1.11.0
>Reporter: Hao Hao
>Priority: Major
>
> With the completion of Kudu’s integration with Apache Sentry, fine-grained 
> authorization capabilities have been added to Kudu. However, because Apache 
> Ranger has wider adoption and provides more comprehensive security features 
> (such as attribute-based access control, auditing, etc.) than Sentry, it is 
> important for Kudu to also integrate with Ranger.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KUDU-2971) Add a generic Java library wrapper

2019-10-13 Thread Hao Hao (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao reassigned KUDU-2971:
-

Assignee: Hao Hao

> Add a generic Java library wrapper
> --
>
> Key: KUDU-2971
> URL: https://issues.apache.org/jira/browse/KUDU-2971
> Project: Kudu
>  Issue Type: Sub-task
>Affects Versions: 1.11.0
>Reporter: Hao Hao
>Assignee: Hao Hao
>Priority: Major
>
> For the Ranger integration, to call the Java Ranger plugin from masters, we need 
> to create a wrapper (via a Java subprocess). This should be generic enough to be 
> used by future integrations (e.g. Atlas) that need to call other Java libraries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-2971) Add a generic Java library wrapper

2019-10-13 Thread Hao Hao (Jira)
Hao Hao created KUDU-2971:
-

 Summary: Add a generic Java library wrapper
 Key: KUDU-2971
 URL: https://issues.apache.org/jira/browse/KUDU-2971
 Project: Kudu
  Issue Type: Sub-task
Affects Versions: 1.11.0
Reporter: Hao Hao


For the Ranger integration, to call the Java Ranger plugin from masters, we need 
to create a wrapper (via a Java subprocess). This should be generic enough to be 
used by future integrations (e.g. Atlas) that need to call other Java libraries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-2970) Fine-grained authorization with Ranger

2019-10-13 Thread Hao Hao (Jira)
Hao Hao created KUDU-2970:
-

 Summary:  Fine-grained authorization with Ranger
 Key: KUDU-2970
 URL: https://issues.apache.org/jira/browse/KUDU-2970
 Project: Kudu
  Issue Type: New Feature
  Components: security
Affects Versions: 1.10.0
Reporter: Hao Hao


With the completion of Kudu’s integration with Apache Sentry, fine-grained 
authorization capabilities have been added to Kudu. However, because Apache 
Ranger has wider adoption and provides more comprehensive security features 
(such as attribute-based access control, auditing, etc.) than Sentry, it is 
important for Kudu to also integrate with Ranger.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KUDU-2191) Hive Metastore Integration

2019-08-30 Thread Hao Hao (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao resolved KUDU-2191.
---
Fix Version/s: 1.10.0
   Resolution: Fixed

> Hive Metastore Integration
> --
>
> Key: KUDU-2191
> URL: https://issues.apache.org/jira/browse/KUDU-2191
> Project: Kudu
>  Issue Type: New Feature
>  Components: server
>Affects Versions: 1.5.0
>Reporter: Dan Burkert
>Assignee: Hao Hao
>Priority: Major
> Fix For: 1.10.0
>
>
> In order to facilitate discovery of Kudu tables, as well as a shared table 
> namespace, Kudu should register its tables in the Hive Metastore.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (KUDU-2916) Admin.TestDumpMemTrackers is flaky in tsan

2019-08-05 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2916:
-

 Summary: Admin.TestDumpMemTrackers is flaky in tsan
 Key: KUDU-2916
 URL: https://issues.apache.org/jira/browse/KUDU-2916
 Project: Kudu
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Hao Hao


I saw a tsan failure for AdminCliTest.TestDumpMemTrackers with the following 
log:
{noformat}
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/kudu-admin-test.cc:2162:
 Failure
Value of: s.ok()
  Actual: false
Expected: true
Runtime error: /tmp/dist-test-taskUWtx7r/build/tsan/bin/kudu: process exited 
with non-zero status 66
stdout: 
{"id":"root","limit":-1,"current_consumption":481,"peak_consumption":481,"child_trackers":[{"id":"server","parent_id":"root","limit":-1,"current_consumption":313,"peak_consumption":313,"child_trackers":[{"id":"result-tracker","parent_id":"server","limit":-1,"current_consumption":0,"peak_consumption":0},{"id":"log_block_manager","parent_id":"server","limit":-1,"current_consumption":48,"peak_consumption":48},{"id":"tablet-9276b163452b4b0399ff2cae579f7251","parent_id":"server","limit":-1,"current_consumption":265,"peak_consumption":265,"child_trackers":[{"id":"DeltaMemStores","parent_id":"tablet-9276b163452b4b0399ff2cae579f7251","limit":-1,"current_consumption":0,"peak_consumption":0},{"id":"MemRowSet-0","parent_id":"tablet-9276b163452b4b0399ff2cae579f7251","limit":-1,"current_consumption":265,"peak_consumption":265},{"id":"txn_tracker","parent_id":"tablet-9276b163452b4b0399ff2cae579f7251","limit":67108864,"current_consumption":0,"peak_consumption":0}]}]},{"id":"ttl-cache-sharded_fifo_cache","parent_id":"root","limit":-1,"current_consumption":0,"peak_consumption":0},{"id":"code_cache-sharded_lru_cache","parent_id":"root","limit":-1,"current_consumption":0,"peak_consumption":0},{"id":"block_cache-sharded_lru_cache","parent_id":"root","limit":-1,"current_consumption":0,"peak_consumption":0},{"id":"lbm-sharded_lru_cache","parent_id":"root","limit":-1,"current_consumption":0,"peak_consumption":0},{"id":"log_cache","parent_id":"root","limit":1073741824,"current_consumption":168,"peak_consumption":168,"child_trackers":[{"id":"log_cache:457a3168758d4f4f8f4c59e8dd179cd3:9276b163452b4b0399ff2cae579f7251","parent_id":"log_cache","limit":10485760,"current_consumption":168,"peak_consumption":168}]}]}

stderr: W0803 14:03:00.206982  8443 flags.cc:404] Enabled unsafe flag: 
--never_fsync=true
W0803 14:03:00.849385  8443 thread.cc:599] rpc reactor (reactor) Time spent 
creating pthread: real 0.590s   user 0.230s sys 0.360s
W0803 14:03:00.849658  8443 thread.cc:566] rpc reactor (reactor) Time spent 
starting thread: real 0.591suser 0.230s sys 0.360s
==
WARNING: ThreadSanitizer: destroy of a locked mutex (pid=8443)
#0 pthread_rwlock_destroy 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1313
 (kudu+0x4bbb24)
#1 glog_internal_namespace_::Mutex::~Mutex() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/glog-0.3.5/src/base/mutex.h:249:30
 (libglog.so.0+0x16488)
#2 cxa_at_exit_wrapper(void*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:386
 (kudu+0x48beb3)

  and:
#0 operator new(unsigned long) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_new_delete.cc:57
 (kudu+0x52ae83)
#1 __allocate 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/libcxx/include/new:228:10
 (libc++.so.1+0xd63f3)
#2 allocate 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/libcxx/include/memory:1793
 (libc++.so.1+0xd63f3)
#3 allocate 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/libcxx/include/memory:1547
 (libc++.so.1+0xd63f3)
#4 __init 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/libcxx/include/string:1591
 (libc++.so.1+0xd63f3)
#5 std::__1::basic_string, 
std::__1::allocator >::basic_string(std::__1::basic_string, std::__1::allocator > const&) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/libcxx/include/string:1653
 (libc++.so.1+0xd63f3)
#6 std::__1::pair, 
std::__1::allocator > const, std::__1::pair 
>::pair(std::__1::pair, std::__1::allocator > const, 
std::__1::pair > const&) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/utility:324:5
 (libprotobuf.so.14+0x188711)
#7 void 
std::__1::allocator, std::__1::allocator >, std::__1::pair >, void*> >::construct, std::__1::allocator > const, 
std::__1::pair >, std::__1::pair, std::__1::allocator > const, 
std::__1::pair > 

[jira] [Created] (KUDU-2883) HMS check/fix tool should accommodate tables without ID for dropping orphan hms tables.

2019-07-01 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2883:
-

 Summary: HMS check/fix tool should accommodate tables without ID 
for dropping orphan hms tables.
 Key: KUDU-2883
 URL: https://issues.apache.org/jira/browse/KUDU-2883
 Project: Kudu
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Hao Hao


In cases where a table has no ID (i.e. it was created when HMS integration was 
disabled), the HMS check/fix tool should accommodate such cases when dropping 
orphan HMS tables 
(https://github.com/apache/kudu/blob/master/src/kudu/tools/tool_action_hms.cc#L616).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2880) TestSecurity is flaky

2019-06-27 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2880:
--
Attachment: test-output.txt

> TestSecurity is flaky
> -
>
> Key: KUDU-2880
> URL: https://issues.apache.org/jira/browse/KUDU-2880
> Project: Kudu
>  Issue Type: Test
>Reporter: Hao Hao
>Priority: Major
> Attachments: test-output.txt
>
>
> A recent run of TestSecurity failed with the following error:
> {noformat}
> There was 1 failure:
> 1) 
> testExternallyProvidedSubjectRefreshedExternally(org.apache.kudu.client.TestSecurity)
> org.apache.kudu.client.NonRecoverableException: cannot complete before 
> timeout: KuduRpc(method=ListTabletServers, tablet=null, attempt=26, 
> TimeoutTracker(timeout=3, elapsed=29608), Traces: [0ms] refreshing cache 
> from master, [46ms] Sub RPC ConnectToMaster: sending RPC to server 
> master-127.0.202.126:46581, [63ms] Sub RPC ConnectToMaster: sending RPC to 
> server master-127.0.202.124:43241, [69ms] Sub RPC ConnectToMaster: received 
> response from server master-127.0.202.126:46581: Network error: Failed to 
> connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection 
> refused: /127.0.202.126:46581, [70ms] Sub RPC ConnectToMaster: sending RPC to 
> server master-127.0.202.125:43873, [250ms] Sub RPC ConnectToMaster: received 
> response from server master-127.0.202.125:43873: Network error: [peer 
> master-127.0.202.125:43873(127.0.202.125:43873)] unexpected exception from 
> downstream on [id: 0x2fae7299, /127.0.0.1:57014 => /127.0.202.125:43873], 
> [282ms] Sub RPC ConnectToMaster: received response from server 
> master-127.0.202.124:43241: OK, [336ms] delaying RPC due to: Service 
> unavailable: Master config 
> (127.0.202.126:46581,127.0.202.124:43241,127.0.202.125:43873) has no leader. 
> Exceptions received: org.apache.kudu.client.RecoverableException: Failed to 
> connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection 
> refused: /127.0.202.126:46581,org.apache.kudu.client.RecoverableException: 
> [peer master-127.0.202.125:43873(127.0.202.125:43873)] unexpected exception 
> from downstream on [id: 0x2fae7299, /127.0.0.1:57014 => 
> /127.0.202.125:43873], [357ms] refreshing cache from master, [358ms] Sub RPC 
> ConnectToMaster: sending RPC to server master-127.0.202.126:46581, [358ms] 
> Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.124:43241, 
> [360ms] Sub RPC ConnectToMaster: received response from server 
> master-127.0.202.126:46581: Network error: java.net.ConnectException: 
> Connection refused: /127.0.202.126:46581, [360ms] Sub RPC ConnectToMaster: 
> sending RPC to server master-127.0.202.125:43873, [361ms] Sub RPC 
> ConnectToMaster: received response from server master-127.0.202.125:43873: 
> Network error: Failed to connect to peer 
> master-127.0.202.125:43873(127.0.202.125:43873): Connection refused: 
> /127.0.202.125:43873, [363ms] Sub RPC ConnectToMaster: received response from 
> server master-127.0.202.124:43241: OK, [364ms] delaying RPC due to: Service 
> unavailable: Master config 
> (127.0.202.126:46581,127.0.202.124:43241,127.0.202.125:43873) has no leader. 
> Exceptions received: org.apache.kudu.client.RecoverableException: 
> java.net.ConnectException: Connection refused: 
> /127.0.202.126:46581,org.apache.kudu.client.RecoverableException: Failed to 
> connect to peer master-127.0.202.125:43873(127.0.202.125:43873): Connection 
> refused: /127.0.202.125:43873, [376ms] refreshing cache from master, [377ms] 
> Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.126:46581, 
> [377ms] Sub RPC ConnectToMaster: sending RPC to server 
> master-127.0.202.124:43241, [378ms] Sub RPC ConnectToMaster: sending RPC to 
> server master-127.0.202.125:43873, [379ms] Sub RPC ConnectToMaster: received 
> response from server master-127.0.202.126:46581: Network error: Failed to 
> connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection 
> refused: /127.0.202.126:46581, [381ms] Sub RPC ConnectToMaster: received 
> response from server master-127.0.202.125:43873: Network error: 
> java.net.ConnectException: Connection refused: /127.0.202.125:43873, [382ms] 
> Sub RPC ConnectToMaster: received response from server 
> master-127.0.202.124:43241: OK, [383ms] delaying RPC due to: Service 
> unavailable: Master config 
> (127.0.202.126:46581,127.0.202.124:43241,127.0.202.125:43873) has no leader. 
> Exceptions received: org.apache.kudu.client.RecoverableException: Failed to 
> connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection 
> refused: /127.0.202.126:46581,org.apache.kudu.client.RecoverableException: 
> java.net.ConnectException: Connection refused: /127.0.202.125:43873, [397ms] 
> refreshing cache from master, [397ms] Sub RPC ConnectToMaster: sending RPC to 
> 

[jira] [Created] (KUDU-2880) TestSecurity is flaky

2019-06-27 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2880:
-

 Summary: TestSecurity is flaky
 Key: KUDU-2880
 URL: https://issues.apache.org/jira/browse/KUDU-2880
 Project: Kudu
  Issue Type: Test
Reporter: Hao Hao


A recent run of TestSecurity failed with the following error:
{noformat}
There was 1 failure:
1) 
testExternallyProvidedSubjectRefreshedExternally(org.apache.kudu.client.TestSecurity)
org.apache.kudu.client.NonRecoverableException: cannot complete before timeout: 
KuduRpc(method=ListTabletServers, tablet=null, attempt=26, 
TimeoutTracker(timeout=3, elapsed=29608), Traces: [0ms] refreshing cache 
from master, [46ms] Sub RPC ConnectToMaster: sending RPC to server 
master-127.0.202.126:46581, [63ms] Sub RPC ConnectToMaster: sending RPC to 
server master-127.0.202.124:43241, [69ms] Sub RPC ConnectToMaster: received 
response from server master-127.0.202.126:46581: Network error: Failed to 
connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection 
refused: /127.0.202.126:46581, [70ms] Sub RPC ConnectToMaster: sending RPC to 
server master-127.0.202.125:43873, [250ms] Sub RPC ConnectToMaster: received 
response from server master-127.0.202.125:43873: Network error: [peer 
master-127.0.202.125:43873(127.0.202.125:43873)] unexpected exception from 
downstream on [id: 0x2fae7299, /127.0.0.1:57014 => /127.0.202.125:43873], 
[282ms] Sub RPC ConnectToMaster: received response from server 
master-127.0.202.124:43241: OK, [336ms] delaying RPC due to: Service 
unavailable: Master config 
(127.0.202.126:46581,127.0.202.124:43241,127.0.202.125:43873) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: Failed to 
connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection 
refused: /127.0.202.126:46581,org.apache.kudu.client.RecoverableException: 
[peer master-127.0.202.125:43873(127.0.202.125:43873)] unexpected exception 
from downstream on [id: 0x2fae7299, /127.0.0.1:57014 => /127.0.202.125:43873], 
[357ms] refreshing cache from master, [358ms] Sub RPC ConnectToMaster: sending 
RPC to server master-127.0.202.126:46581, [358ms] Sub RPC ConnectToMaster: 
sending RPC to server master-127.0.202.124:43241, [360ms] Sub RPC 
ConnectToMaster: received response from server master-127.0.202.126:46581: 
Network error: java.net.ConnectException: Connection refused: 
/127.0.202.126:46581, [360ms] Sub RPC ConnectToMaster: sending RPC to server 
master-127.0.202.125:43873, [361ms] Sub RPC ConnectToMaster: received response 
from server master-127.0.202.125:43873: Network error: Failed to connect to 
peer master-127.0.202.125:43873(127.0.202.125:43873): Connection refused: 
/127.0.202.125:43873, [363ms] Sub RPC ConnectToMaster: received response from 
server master-127.0.202.124:43241: OK, [364ms] delaying RPC due to: Service 
unavailable: Master config 
(127.0.202.126:46581,127.0.202.124:43241,127.0.202.125:43873) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: 
java.net.ConnectException: Connection refused: 
/127.0.202.126:46581,org.apache.kudu.client.RecoverableException: Failed to 
connect to peer master-127.0.202.125:43873(127.0.202.125:43873): Connection 
refused: /127.0.202.125:43873, [376ms] refreshing cache from master, [377ms] 
Sub RPC ConnectToMaster: sending RPC to server master-127.0.202.126:46581, 
[377ms] Sub RPC ConnectToMaster: sending RPC to server 
master-127.0.202.124:43241, [378ms] Sub RPC ConnectToMaster: sending RPC to 
server master-127.0.202.125:43873, [379ms] Sub RPC ConnectToMaster: received 
response from server master-127.0.202.126:46581: Network error: Failed to 
connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection 
refused: /127.0.202.126:46581, [381ms] Sub RPC ConnectToMaster: received 
response from server master-127.0.202.125:43873: Network error: 
java.net.ConnectException: Connection refused: /127.0.202.125:43873, [382ms] 
Sub RPC ConnectToMaster: received response from server 
master-127.0.202.124:43241: OK, [383ms] delaying RPC due to: Service 
unavailable: Master config 
(127.0.202.126:46581,127.0.202.124:43241,127.0.202.125:43873) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: Failed to 
connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection 
refused: /127.0.202.126:46581,org.apache.kudu.client.RecoverableException: 
java.net.ConnectException: Connection refused: /127.0.202.125:43873, [397ms] 
refreshing cache from master, [397ms] Sub RPC ConnectToMaster: sending RPC to 
server master-127.0.202.126:46581, [398ms] Sub RPC ConnectToMaster: sending RPC 
to server master-127.0.202.124:43241, [399ms] Sub RPC ConnectToMaster: received 
response from server master-127.0.202.126:46581: Network error: Failed to 
connect to peer master-127.0.202.126:46581(127.0.202.126:46581): Connection 
refused: /127.0.202.126:46581, [402ms] Sub RPC 

[jira] [Commented] (KUDU-1702) Document/Implement read-your-writes for Impala/Spark etc.

2019-06-16 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865103#comment-16865103
 ] 

Hao Hao commented on KUDU-1702:
---

I think adoption of READ_YOUR_WRITES mode on the Impala side is still not done 
(IMPALA-7184). For the Spark side, the approach of using READ_AT_SNAPSHOT mode to 
achieve read-your-writes semantics is done as part of KUDU-1454.

> Document/Implement read-your-writes for Impala/Spark etc.
> -
>
> Key: KUDU-1702
> URL: https://issues.apache.org/jira/browse/KUDU-1702
> Project: Kudu
>  Issue Type: Sub-task
>  Components: client, tablet, tserver
>Affects Versions: 1.1.0
>Reporter: David Alves
>Assignee: David Alves
>Priority: Major
>
> Engines like Impala/Spark use many independent client instances, so we should 
> provide a way to have read-your-writes across many independent client 
> instances, which translates to providing a way to get linearizable behavior. 
> At first this can be done using the APIs that are already available. For 
> instance if the objective is to be sure to have the results of a write in a 
> following scan, the following steps can be taken:
> - After a write the engine should collect the last observed timestamps from 
> kudu clients
> - The engine's coordinator then takes the max of those timestamps, adds 1 and 
> uses that as a snapshot scan timestamp.
> One important pre-requisite of the behavior above is that scans be done in 
> READ_AT_SNAPSHOT mode. Also, the steps above currently don't actually 
> guarantee the expected behavior, but should once the current anomalies are 
> taken care of (as part of KUDU-430).
> In the immediate future we'll add APIs to the Kudu client so as to make the 
> inner workings of getting this behavior oblivious to the engine. The steps 
> will still be the same, i.e. timestamps or timestamp tokens will still be 
> passed around, but the kudu client will encapsulate the choice of the 
> timestamp for the scan.
> Later we will add a way to obtain this behavior without timestamp 
> propagation, either by doing a write-side commit-wait, where clients wait out 
> the clock error after/during the last write thus making sure any future 
> operation will have a higher timestamp; or by making read-side commit wait, 
> where we provide an api on the kudu client for the engine to perform a 
> similar call before the scan call to obtain a scan timestamp.
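
A minimal sketch of the steps above using the C++ client, assuming the engine 
holds several independent KuduClient instances. The coordinator logic and 
function names are illustrative only; GetLatestObservedTimestamp(), 
READ_AT_SNAPSHOT, and SetSnapshotRaw() are the client calls this recipe appears 
to map to:
{noformat}
// Illustrative only: propagate timestamps across independent clients so a
// later scan observes earlier writes, following the recipe in this issue.
#include <algorithm>
#include <cstdint>
#include <vector>

#include "kudu/client/client.h"

using kudu::Status;
using kudu::client::KuduClient;
using kudu::client::KuduScanner;
using kudu::client::KuduTable;
using kudu::client::sp::shared_ptr;

// Steps 1-2: after the writes, collect each writer client's last observed
// timestamp and let the coordinator take the maximum, plus one.
uint64_t ChooseSnapshotTimestamp(const std::vector<shared_ptr<KuduClient>>& writers) {
  uint64_t max_ts = 0;
  for (const auto& c : writers) {
    max_ts = std::max(max_ts, c->GetLatestObservedTimestamp());
  }
  return max_ts + 1;
}

// Step 3: scan at that timestamp in READ_AT_SNAPSHOT mode, possibly from a
// completely different client instance.
Status OpenSnapshotScanner(KuduTable* table, uint64_t snapshot_ts) {
  KuduScanner scanner(table);
  Status s = scanner.SetReadMode(KuduScanner::READ_AT_SNAPSHOT);
  if (!s.ok()) return s;
  s = scanner.SetSnapshotRaw(snapshot_ts);
  if (!s.ok()) return s;
  return scanner.Open();
}
{noformat}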



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KUDU-1498) Add support to Java client for read-your-writes consistency

2019-06-16 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao resolved KUDU-1498.
---
   Resolution: Duplicate
Fix Version/s: n/a

> Add support to Java client for read-your-writes consistency
> ---
>
> Key: KUDU-1498
> URL: https://issues.apache.org/jira/browse/KUDU-1498
> Project: Kudu
>  Issue Type: Sub-task
>  Components: client
>Reporter: Mike Percy
>Priority: Major
> Fix For: n/a
>
>
> The Java client could use a mode called "read your writes" consistency where 
> we ensure that we read whatever the leader has committed at the time of the 
> request.
> At the time of writing, the implementation requirements look like the 
> following:
> * Always scan from the leader
> * Specify that the leader must apply all operations from previous leaders 
> before processing the query
> In the C++ client, this can be achieved by specifying both of the LEADER_ONLY 
> and READ_AT_SNAPSHOT options, while not specifying a timestamp to use for the 
> snapshot when starting the scan.
> In the Java client API, we may want to simply expose a scan option called 
> "read your writes" or something similar.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KUDU-1499) Add friendly C++ API for read-your-writes consistency

2019-06-16 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao resolved KUDU-1499.
---
   Resolution: Duplicate
Fix Version/s: n/a

> Add friendly C++ API for read-your-writes consistency
> -
>
> Key: KUDU-1499
> URL: https://issues.apache.org/jira/browse/KUDU-1499
> Project: Kudu
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 0.9.0
>Reporter: Mike Percy
>Priority: Major
> Fix For: n/a
>
>
> At the time of writing, in order to get read-your-writes consistency in the 
> C++ client, one must jump through hoops such as specifying LEADER_ONLY + 
> READ_AT_SNAPSHOT while *not* specifying a timestamp to use for the snapshot.
> It would be more friendly to expose a simple API flag or option that enables 
> a "read your writes" consistency mode.
> Another benefit to this approach is that we can change the implementation 
> later if we come up with a more clever or scalable way of implementing the 
> underlying consistency mode, such as something involving the use of 
> timestamps.
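
For reference, a minimal sketch of the workaround described above, per my 
reading of the C++ client API: LEADER_ONLY replica selection plus 
READ_AT_SNAPSHOT with no snapshot timestamp assigned. Illustrative only; per 
the comments elsewhere in this thread, this was later covered by KUDU-1704.
{noformat}
// Illustrative only: the "hoops" version of read-your-writes in the C++ client.
#include "kudu/client/client.h"

using kudu::Status;
using kudu::client::KuduClient;
using kudu::client::KuduScanner;
using kudu::client::KuduTable;

Status OpenReadYourWritesScanner(KuduTable* table) {
  KuduScanner scanner(table);
  // Always scan from the leader replica.
  Status s = scanner.SetSelection(KuduClient::LEADER_ONLY);
  if (!s.ok()) return s;
  // READ_AT_SNAPSHOT, deliberately without SetSnapshotMicros()/SetSnapshotRaw(),
  // so no explicit snapshot timestamp is supplied.
  s = scanner.SetReadMode(KuduScanner::READ_AT_SNAPSHOT);
  if (!s.ok()) return s;
  return scanner.Open();
}
{noformat}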



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-1498) Add support to Java client for read-your-writes consistency

2019-06-16 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865097#comment-16865097
 ] 

Hao Hao commented on KUDU-1498:
---

Yeah, this should be covered by KUDU-1704.

> Add support to Java client for read-your-writes consistency
> ---
>
> Key: KUDU-1498
> URL: https://issues.apache.org/jira/browse/KUDU-1498
> Project: Kudu
>  Issue Type: Sub-task
>  Components: client
>Reporter: Mike Percy
>Priority: Major
>
> The Java client could use a mode called "read your writes" consistency where 
> we ensure that we read whatever the leader has committed at the time of the 
> request.
> At the time of writing, the implementation requirements look like the 
> following:
> * Always scan from the leader
> * Specify that the leader must apply all operations from previous leaders 
> before processing the query
> In the C++ client, this can be achieved by specifying both of the LEADER_ONLY 
> and READ_AT_SNAPSHOT options, while not specifying a timestamp to use for the 
> snapshot when starting the scan.
> In the Java client API, we may want to simply expose a scan option called 
> "read your writes" or something similar.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-1499) Add friendly C++ API for read-your-writes consistency

2019-06-16 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865096#comment-16865096
 ] 

Hao Hao commented on KUDU-1499:
---

I think this is a duplicate of KUDU-1704, which has been fixed.

> Add friendly C++ API for read-your-writes consistency
> -
>
> Key: KUDU-1499
> URL: https://issues.apache.org/jira/browse/KUDU-1499
> Project: Kudu
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 0.9.0
>Reporter: Mike Percy
>Priority: Major
>
> At the time of writing, in order to get read-your-writes consistency in the 
> C++ client, one must jump through hoops such as specifying LEADER_ONLY + 
> READ_AT_SNAPSHOT while *not* specifying a timestamp to use for the snapshot.
> It would be more friendly to expose a simple API flag or option that enables 
> a "read your writes" consistency mode.
> Another benefit to this approach is that we can change the implementation 
> later if we come up with a more clever or scalable way of implementing the 
> underlying consistency mode, such as something involving the use of 
> timestamps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KUDU-2590) Master access control enforcement of CREATE/ALTER/DROP table operations

2019-06-10 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao resolved KUDU-2590.
---
   Resolution: Fixed
Fix Version/s: 1.10.0

> Master access control enforcement of CREATE/ALTER/DROP table operations
> ---
>
> Key: KUDU-2590
> URL: https://issues.apache.org/jira/browse/KUDU-2590
> Project: Kudu
>  Issue Type: Sub-task
>  Components: master
>Affects Versions: 1.7.1
>Reporter: Dan Burkert
>Assignee: Hao Hao
>Priority: Major
> Fix For: 1.10.0
>
>
> As described in the 'Master RPC Authorization' section of the [design 
> doc.|https://docs.google.com/document/d/1SEBtgWwBFqij5CuCZwhOqDNSDVViC0WERq6RzsPCWjQ/edit]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KUDU-2542) Fill-out AuthzToken definition

2019-06-10 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao resolved KUDU-2542.
---
   Resolution: Fixed
Fix Version/s: 1.10.0

> Fill-out AuthzToken definition
> --
>
> Key: KUDU-2542
> URL: https://issues.apache.org/jira/browse/KUDU-2542
> Project: Kudu
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 1.8.0
>Reporter: Dan Burkert
>Assignee: Andrew Wong
>Priority: Major
> Fix For: 1.10.0
>
>
> As part of the Sentry integration, it will be necessary to flesh out the  
> [AuthzTokenPB|https://github.com/apache/kudu/blob/master/src/kudu/security/token.proto#L28]
>  structure with relevant fields:
>  # The ID of the table which the token applies to
>  # The username which the attached privileges belong to
>  # The privileges
> Sentry has its own privilege format 
> [TSentryPrivilege|https://github.com/apache/sentry/blob/master/sentry-service/sentry-service-api/src/main/resources/sentry_policy_service.thrift#L47-L58],
>  but we'll probably want to convert this into our own internal Protobuf-based 
> format for the following reasons:
>  # The tokens will be used in the tablet servers to authorize client actions. 
> Currently tablet servers don't use or link to Thrift libraries.
>  # The Sentry privilege structure references columns by name, whereas we will 
> need to reference columns by ID in order to be robust to columns being 
> renamed.
>  # Having our own format will make it easier to drop in alternate 
> authorization providers in the future.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2542) Fill-out AuthzToken definition

2019-06-10 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860201#comment-16860201
 ] 

Hao Hao commented on KUDU-2542:
---

This is done in a series of commits.

> Fill-out AuthzToken definition
> --
>
> Key: KUDU-2542
> URL: https://issues.apache.org/jira/browse/KUDU-2542
> Project: Kudu
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 1.8.0
>Reporter: Dan Burkert
>Assignee: Andrew Wong
>Priority: Major
>
> As part of the Sentry integration, it will be necessary to flesh out the  
> [AuthzTokenPB|https://github.com/apache/kudu/blob/master/src/kudu/security/token.proto#L28]
>  structure with relevant fields:
>  # The ID of the table which the token applies to
>  # The username which the attached privileges belong to
>  # The privileges
> Sentry has its own privilege format 
> [TSentryPrivilege|https://github.com/apache/sentry/blob/master/sentry-service/sentry-service-api/src/main/resources/sentry_policy_service.thrift#L47-L58],
>  but we'll probably want to convert this into our own internal Protobuf-based 
> format for the following reasons:
>  # The tokens will be used in the tablet servers to authorize client actions. 
> Currently tablet servers don't use or link to Thrift libraries.
>  # The Sentry privilege structure references columns by name, whereas we will 
> need to reference columns by ID in order to be robust to columns being 
> renamed.
>  # Having our own format will make it easier to drop in alternate 
> authorization providers in the future.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2590) Master access control enforcement of CREATE/ALTER/DROP table operations

2019-06-10 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860200#comment-16860200
 ] 

Hao Hao commented on KUDU-2590:
---

This is done in a series of commits.

> Master access control enforcement of CREATE/ALTER/DROP table operations
> ---
>
> Key: KUDU-2590
> URL: https://issues.apache.org/jira/browse/KUDU-2590
> Project: Kudu
>  Issue Type: Sub-task
>  Components: master
>Affects Versions: 1.7.1
>Reporter: Dan Burkert
>Assignee: Hao Hao
>Priority: Major
>
> As described in the 'Master RPC Authorization' section of the [design 
> doc.|https://docs.google.com/document/d/1SEBtgWwBFqij5CuCZwhOqDNSDVViC0WERq6RzsPCWjQ/edit]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2541) server-side Sentry Client

2019-06-06 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16857981#comment-16857981
 ] 

Hao Hao commented on KUDU-2541:
---

Committed in 14f3e6f60 and ecc4998cb

> server-side Sentry Client
> -
>
> Key: KUDU-2541
> URL: https://issues.apache.org/jira/browse/KUDU-2541
> Project: Kudu
>  Issue Type: Sub-task
>  Components: server
>Affects Versions: 1.8.0
>Reporter: Dan Burkert
>Assignee: Dan Burkert
>Priority: Major
>
> As part of the Sentry integration, it will be necessary to have a Sentry 
> client which can be used by the Kudu master server.  This will require 
> effectively re-implementing the existing Sentry client (plugin) in C++, or at 
> least the parts of it which we need to authorize operations in Kudu.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KUDU-2541) server-side Sentry Client

2019-06-06 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao resolved KUDU-2541.
---
   Resolution: Fixed
Fix Version/s: 1.10.0

> server-side Sentry Client
> -
>
> Key: KUDU-2541
> URL: https://issues.apache.org/jira/browse/KUDU-2541
> Project: Kudu
>  Issue Type: Sub-task
>  Components: server
>Affects Versions: 1.8.0
>Reporter: Dan Burkert
>Assignee: Dan Burkert
>Priority: Major
> Fix For: 1.10.0
>
>
> As part of the Sentry integration, it will be necessary to have a Sentry 
> client which can be used by the Kudu master server.  This will require 
> effectively re-implementing the existing Sentry client (plugin) in C++, or at 
> least the parts of it which we need to authorize operations in Kudu.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2557) Sometimes the rebalancer-related tests are running for too long

2019-05-24 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847733#comment-16847733
 ] 

Hao Hao commented on KUDU-2557:
---

Ah, ok, thanks for catching that!

> Sometimes the rebalancer-related tests are running for too long
> --
>
> Key: KUDU-2557
> URL: https://issues.apache.org/jira/browse/KUDU-2557
> Project: Kudu
>  Issue Type: Bug
>  Components: CLI, test
>Affects Versions: 1.8.0
>Reporter: Alexey Serbin
>Assignee: Alexey Serbin
>Priority: Minor
>  Labels: CLI, flaky-test, rebalance, test
> Fix For: 1.9.0
>
> Attachments: kudu-admin-test.2.txt
>
>
> The rebalancer-related tests in {{kudu-admin-test}} sometimes get wild and 
> run for too long.  That's been observed in RELEASE builds at least:
> {noformat}
> ConcurrentRebalancersTest.TwoConcurrentRebalancers/1: test_main.cc:63] 
> Maximum unit test time exceeded (900 sec)
> {noformat}
> {noformat}
> TserverGoesDownDuringRebalancingTest.TserverDown/1: test_main.cc:63] Maximum 
> unit test time exceeded (900 sec)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2557) Sometimes the rebalancer-related tests are running for too long

2019-05-23 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2557:
--
Attachment: kudu-admin-test.2.txt

> Sometimes the rebalancer-related tests are running for too long
> --
>
> Key: KUDU-2557
> URL: https://issues.apache.org/jira/browse/KUDU-2557
> Project: Kudu
>  Issue Type: Bug
>  Components: CLI, test
>Affects Versions: 1.8.0
>Reporter: Alexey Serbin
>Assignee: Alexey Serbin
>Priority: Minor
>  Labels: CLI, flaky-test, rebalance, test
> Fix For: 1.9.0
>
> Attachments: kudu-admin-test.2.txt
>
>
> The rebalancer-related tests in {{kudu-admin-test}} sometimes get wild and 
> run for too long.  That's been observed in RELEASE builds at least:
> {noformat}
> ConcurrentRebalancersTest.TwoConcurrentRebalancers/1: test_main.cc:63] 
> Maximum unit test time exceeded (900 sec)
> {noformat}
> {noformat}
> TserverGoesDownDuringRebalancingTest.TserverDown/1: test_main.cc:63] Maximum 
> unit test time exceeded (900 sec)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2557) Sometimes the rebalancer-related tests are running for too long

2019-05-23 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847158#comment-16847158
 ] 

Hao Hao commented on KUDU-2557:
---

I think I saw another instance of this error with a DEBUG build. Attached is the log.

> Sometimes the rebalancer-related tests are running for too long
> --
>
> Key: KUDU-2557
> URL: https://issues.apache.org/jira/browse/KUDU-2557
> Project: Kudu
>  Issue Type: Bug
>  Components: CLI, test
>Affects Versions: 1.8.0
>Reporter: Alexey Serbin
>Assignee: Alexey Serbin
>Priority: Minor
>  Labels: CLI, flaky-test, rebalance, test
> Fix For: 1.9.0
>
> Attachments: kudu-admin-test.2.txt
>
>
> The rebalancer-related tests in {{kudu-admin-test}} sometimes get wild and 
> run for too long.  That's been observed in RELEASE builds at least:
> {noformat}
> ConcurrentRebalancersTest.TwoConcurrentRebalancers/1: test_main.cc:63] 
> Maximum unit test time exceeded (900 sec)
> {noformat}
> {noformat}
> TserverGoesDownDuringRebalancingTest.TserverDown/1: test_main.cc:63] Maximum 
> unit test time exceeded (900 sec)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2779) MasterStressTest is flaky when HMS is enabled

2019-04-26 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2779:
--
Description: 
Encountered a failure in master-stress-test.cc when HMS integration is enabled: 
{noformat}
22:30:11.487 [HMS - ERROR - pool-8-thread-2] (HiveAlterHandler.java:341) Failed 
to alter table default.table_1529084adeeb48719dd0a1d18572b357
22:30:11.494 [HMS - ERROR - pool-8-thread-3] (HiveAlterHandler.java:341) Failed 
to alter table default.table_4657eb1f8bbe4b60b03db2cbf07803a3
22:30:11.506 [HMS - ERROR - pool-8-thread-2] (RetryingHMSHandler.java:200) 
MetaException(message:java.lang.IllegalStateException: Event not set up 
correctly)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6189)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:4063)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:4020)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at com.sun.proxy.$Proxy24.alter_table_with_environment_context(Unknown 
Source)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:11631)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:11615)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:103)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Event not set up correctly
at 
org.apache.hadoop.hive.metastore.messaging.AlterTableMessage.checkValid(AlterTableMessage.java:49)
at 
org.apache.hadoop.hive.metastore.messaging.json.JSONAlterTableMessage.(JSONAlterTableMessage.java:57)
at 
org.apache.hadoop.hive.metastore.messaging.json.JSONMessageFactory.buildAlterTableMessage(JSONMessageFactory.java:115)
at 
org.apache.hive.hcatalog.listener.DbNotificationListener.onAlterTable(DbNotificationListener.java:187)
at 
org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$8.notify(MetaStoreListenerNotifier.java:107)
at 
org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:175)
at 
org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:205)
at 
org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:317)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:4049)
... 16 more
Caused by: org.apache.thrift.protocol.TProtocolException: Unexpected character:{
at 
org.apache.thrift.protocol.TJSONProtocol.readJSONSyntaxChar(TJSONProtocol.java:337)
at 
org.apache.thrift.protocol.TJSONProtocol$JSONPairContext.read(TJSONProtocol.java:246)
at 
org.apache.thrift.protocol.TJSONProtocol.readJSONObjectStart(TJSONProtocol.java:793)
at 
org.apache.thrift.protocol.TJSONProtocol.readStructBegin(TJSONProtocol.java:840)
at 
org.apache.hadoop.hive.metastore.api.Table$TableStandardScheme.read(Table.java:1577)
at 
org.apache.hadoop.hive.metastore.api.Table$TableStandardScheme.read(Table.java:1573)
at org.apache.hadoop.hive.metastore.api.Table.read(Table.java:1407)
at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:81)
at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:67)
at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:98)
at 
org.apache.hadoop.hive.metastore.messaging.json.JSONMessageFactory.getTObj(JSONMessageFactory.java:270)
at 
org.apache.hadoop.hive.metastore.messaging.json.JSONAlterTableMessage.getTableObjAfter(JSONAlterTableMessage.java:97)
at 
org.apache.hadoop.hive.metastore.messaging.AlterTableMessage.checkValid(AlterTableMessage.java:41)

[jira] [Resolved] (KUDU-2804) HmsSentryConfigurations/MasterStressTest is flaky

2019-04-26 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao resolved KUDU-2804.
---
   Resolution: Duplicate
Fix Version/s: n/a

> HmsSentryConfigurations/MasterStressTest is flaky
> -
>
> Key: KUDU-2804
> URL: https://issues.apache.org/jira/browse/KUDU-2804
> Project: Kudu
>  Issue Type: Bug
>  Components: hms, master, test
>Affects Versions: 1.10.0
>Reporter: Alexey Serbin
>Priority: Major
> Fix For: n/a
>
> Attachments: master-stress-test.1.txt.xz
>
>
> The {{HmsSentryConfigurations/MasterStressTest}} seems to be a bit flaky if 
> running via dist-test with {{--stress_cpu_threads=16}}.  A snippet of the 
> {{master-stress-test}} binary's output (DEBUG build) is below.  It seems the 
> configuration was 
> {{ HmsMode::ENABLE_METASTORE_INTEGRATION, SentryMode::DISABLED }}.  Also, I'm 
> attaching a full log.
> {noformat}
> I0426 04:05:42.127689   497 rpcz_store.cc:269] Call 
> kudu.master.MasterService.AlterTable from 127.0.0.1:39526 (request call id 
> 112) took 2593ms. Request Metrics: 
> {"HiveMetastore.queue_time_us":1457223,"Hive 
> Metastore.run_cpu_time_us":631,"HiveMetastore.run_wall_time_us":98149}
> F0426 04:05:42.132196   968 master-stress-test.cc:293] Check failed: _s.ok() 
> Bad status: Remote error: failed to alter Hive MetaStore table: TException - 
> service has thrown: MetaException
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2804) HmsSentryConfigurations/MasterStressTest is flaky

2019-04-26 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827242#comment-16827242
 ] 

Hao Hao commented on KUDU-2804:
---

[~aserbin] I think this is tracked in KUDU-2779, which is due to HIVE-19874.

> HmsSentryConfigurations/MasterStressTest is flaky
> -
>
> Key: KUDU-2804
> URL: https://issues.apache.org/jira/browse/KUDU-2804
> Project: Kudu
>  Issue Type: Bug
>  Components: hms, master, test
>Affects Versions: 1.10.0
>Reporter: Alexey Serbin
>Priority: Major
> Attachments: master-stress-test.1.txt.xz
>
>
> The {{HmsSentryConfigurations/MasterStressTest}} seems to be a bit flaky if 
> running via dist-test with {{--stress_cpu_threads=16}}.  A snippet of the 
> {{master-stress-test}} binary's output (DEBUG build) is below.  It seems the 
> configuration was 
> {{ HmsMode::ENABLE_METASTORE_INTEGRATION, SentryMode::DISABLED }}.  Also, I'm 
> attaching a full log.
> {noformat}
> I0426 04:05:42.127689   497 rpcz_store.cc:269] Call 
> kudu.master.MasterService.AlterTable from 127.0.0.1:39526 (request call id 
> 112) took 2593ms. Request Metrics: 
> {"HiveMetastore.queue_time_us":1457223,"Hive 
> Metastore.run_cpu_time_us":631,"HiveMetastore.run_wall_time_us":98149}
> F0426 04:05:42.132196   968 master-stress-test.cc:293] Check failed: _s.ok() 
> Bad status: Remote error: failed to alter Hive MetaStore table: TException - 
> service has thrown: MetaException
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2784) MasterSentryTest.TestTableOwnership is flaky

2019-04-22 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao reassigned KUDU-2784:
-

Assignee: Hao Hao

> MasterSentryTest.TestTableOwnership is flaky
> 
>
> Key: KUDU-2784
> URL: https://issues.apache.org/jira/browse/KUDU-2784
> Project: Kudu
>  Issue Type: Test
>Reporter: Hao Hao
>Assignee: Hao Hao
>Priority: Major
> Attachments: master_sentry-itest.2.txt
>
>
> Encountered a failure with the following error:
> {noformat}
> W0423 04:49:43.773183  1862 sentry_authz_provider.cc:269] Action  on 
> table  with authorizable scope  is not permitted for 
> user 
> I0423 04:49:43.773447  1862 rpcz_store.cc:269] Call 
> kudu.master.MasterService.DeleteTable from 127.0.0.1:44822 (request call id 
> 6) took 2093ms. Request Metrics: 
> {"Sentry.queue_time_us":33,"Sentry.run_cpu_time_us":390,"Sentry.run_wall_time_us":18856}
> /home/jenkins-slave/workspace/kudu-master/1/src/kudu/integration-tests/master_sentry-itest.cc:446:
>  Failure
> Failed
> Bad status: Not authorized: unauthorized action
> {noformat}
> This could be because the owner privilege hasn't been reflected yet for the user?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2784) MasterSentryTest.TestTableOwnership is flaky

2019-04-22 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2784:
-

 Summary: MasterSentryTest.TestTableOwnership is flaky
 Key: KUDU-2784
 URL: https://issues.apache.org/jira/browse/KUDU-2784
 Project: Kudu
  Issue Type: Test
Reporter: Hao Hao
 Attachments: master_sentry-itest.2.txt

Encountered a failure with the following error:
{noformat}
W0423 04:49:43.773183  1862 sentry_authz_provider.cc:269] Action  on 
table  with authorizable scope  is not permitted for user 

I0423 04:49:43.773447  1862 rpcz_store.cc:269] Call 
kudu.master.MasterService.DeleteTable from 127.0.0.1:44822 (request call id 6) 
took 2093ms. Request Metrics: 
{"Sentry.queue_time_us":33,"Sentry.run_cpu_time_us":390,"Sentry.run_wall_time_us":18856}
/home/jenkins-slave/workspace/kudu-master/1/src/kudu/integration-tests/master_sentry-itest.cc:446:
 Failure
Failed
Bad status: Not authorized: unauthorized action
{noformat}

This could be because the owner privilege hasn't been reflected yet for the user?
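If that is the case, one way to deflake would be to wait for the privilege to 
propagate before asserting. A minimal sketch of the idea (using the Java client for 
brevity, even though the failing test is C++; the retry budget and the error-message 
match are assumptions, not the actual fix):
{code:java}
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduException;

public class DeleteTableRetryExample {
  // Retry DeleteTable a few times, since the owner privilege granted at
  // CreateTable time may not be visible to the authorization provider right away.
  static void deleteTableWhenAuthorized(KuduClient client, String tableName) throws Exception {
    final int maxAttempts = 60;  // retry budget: an assumption, not a recommendation
    for (int attempt = 1; ; attempt++) {
      try {
        client.deleteTable(tableName);
        return;
      } catch (KuduException e) {
        // Crude check for this sketch: match on the message text; the exact wording
        // ("unauthorized action") is an assumption based on the log above.
        boolean notAuthorized = String.valueOf(e.getMessage()).contains("unauthorized action");
        if (!notAuthorized || attempt >= maxAttempts) {
          throw e;  // a real failure, or out of retry budget
        }
        Thread.sleep(500);  // give the privilege time to propagate, then retry
      }
    }
  }
}
{code}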



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2779) MasterStressTest is flaky when HMS is enabled

2019-04-18 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2779:
-

 Summary: MasterStressTest is flaky when HMS is enabled
 Key: KUDU-2779
 URL: https://issues.apache.org/jira/browse/KUDU-2779
 Project: Kudu
  Issue Type: Test
Reporter: Hao Hao


Encountered a failure in master-stress-test.cc when the HMS integration is enabled: 
{noformat}
22:30:11.487 [HMS - ERROR - pool-8-thread-2] (HiveAlterHandler.java:341) Failed 
to alter table default.table_1529084adeeb48719dd0a1d18572b357
22:30:11.494 [HMS - ERROR - pool-8-thread-3] (HiveAlterHandler.java:341) Failed 
to alter table default.table_4657eb1f8bbe4b60b03db2cbf07803a3
22:30:11.506 [HMS - ERROR - pool-8-thread-2] (RetryingHMSHandler.java:200) 
MetaException(message:java.lang.IllegalStateException: Event not set up 
correctly)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6189)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:4063)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:4020)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at com.sun.proxy.$Proxy24.alter_table_with_environment_context(Unknown 
Source)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:11631)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:11615)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:103)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Event not set up correctly
at 
org.apache.hadoop.hive.metastore.messaging.AlterTableMessage.checkValid(AlterTableMessage.java:49)
at 
org.apache.hadoop.hive.metastore.messaging.json.JSONAlterTableMessage.<init>(JSONAlterTableMessage.java:57)
at 
org.apache.hadoop.hive.metastore.messaging.json.JSONMessageFactory.buildAlterTableMessage(JSONMessageFactory.java:115)
at 
org.apache.hive.hcatalog.listener.DbNotificationListener.onAlterTable(DbNotificationListener.java:187)
at 
org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$8.notify(MetaStoreListenerNotifier.java:107)
at 
org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:175)
at 
org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:205)
at 
org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:317)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:4049)
... 16 more
Caused by: org.apache.thrift.protocol.TProtocolException: Unexpected character:{
at 
org.apache.thrift.protocol.TJSONProtocol.readJSONSyntaxChar(TJSONProtocol.java:337)
at 
org.apache.thrift.protocol.TJSONProtocol$JSONPairContext.read(TJSONProtocol.java:246)
at 
org.apache.thrift.protocol.TJSONProtocol.readJSONObjectStart(TJSONProtocol.java:793)
at 
org.apache.thrift.protocol.TJSONProtocol.readStructBegin(TJSONProtocol.java:840)
at 
org.apache.hadoop.hive.metastore.api.Table$TableStandardScheme.read(Table.java:1577)
at 
org.apache.hadoop.hive.metastore.api.Table$TableStandardScheme.read(Table.java:1573)
at org.apache.hadoop.hive.metastore.api.Table.read(Table.java:1407)
at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:81)
at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:67)
at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:98)
at 
org.apache.hadoop.hive.metastore.messaging.json.JSONMessageFactory.getTObj(JSONMessageFactory.java:270)
at 
org.apache.hadoop.hive.metastore.messaging.json.JSONAlterTableMessage.getTableObjAfter(JSONAlterTableMessage.java:97)
  

[jira] [Updated] (KUDU-2652) TsRecoveryITest.TestNoBlockIDReuseIfMissingBlocks potentially flaky

2019-04-08 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2652:
--
Attachment: ts_recovery-itest.txt

> TsRecoveryITest.TestNoBlockIDReuseIfMissingBlocks potentially flaky
> ---
>
> Key: KUDU-2652
> URL: https://issues.apache.org/jira/browse/KUDU-2652
> Project: Kudu
>  Issue Type: Bug
>  Components: test
>Reporter: Mike Percy
>Assignee: Andrew Wong
>Priority: Major
> Attachments: ts_recovery-itest.txt, ts_recovery-itest.txt.gz
>
>
> This test failed for me in a Gerrit pre-commit run with an unrelated change @ 
> [http://jenkins.kudu.apache.org/job/kudu-gerrit/15885]
> The error was:
> {code:java}
> /home/jenkins-slave/workspace/kudu-master/3/src/kudu/integration-tests/ts_recovery-itest.cc:298:
>  Failure
> Value of: !orphaned_block_ids.empty()
>  Actual: false
> Expected: true
> /home/jenkins-slave/workspace/kudu-master/3/src/kudu/util/test_util.cc:323: 
> Failure
> Failed
> Timed out waiting for assertion to pass.
> {code}
> I am attaching the error log.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2652) TsRecoveryITest.TestNoBlockIDReuseIfMissingBlocks potentially flaky

2019-04-08 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813006#comment-16813006
 ] 

Hao Hao commented on KUDU-2652:
---

Is this fixed in commit 114792116?

Somehow I am still seeing this error:
{noformat}
data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/integration-tests/ts_recovery-itest.cc:202:
 Failure
Value of: !orphaned_block_ids.empty()
  Actual: false
Expected: true
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/test_util.cc:308:
 Failure
Failed
Timed out waiting for assertion to pass.
{noformat}
Attached the full log.

> TsRecoveryITest.TestNoBlockIDReuseIfMissingBlocks potentially flaky
> ---
>
> Key: KUDU-2652
> URL: https://issues.apache.org/jira/browse/KUDU-2652
> Project: Kudu
>  Issue Type: Bug
>  Components: test
>Reporter: Mike Percy
>Assignee: Andrew Wong
>Priority: Major
> Attachments: ts_recovery-itest.txt.gz
>
>
> This test failed for me in a Gerrit pre-commit run with an unrelated change @ 
> [http://jenkins.kudu.apache.org/job/kudu-gerrit/15885]
> The error was:
> {code:java}
> /home/jenkins-slave/workspace/kudu-master/3/src/kudu/integration-tests/ts_recovery-itest.cc:298:
>  Failure
> Value of: !orphaned_block_ids.empty()
>  Actual: false
> Expected: true
> /home/jenkins-slave/workspace/kudu-master/3/src/kudu/util/test_util.cc:323: 
> Failure
> Failed
> Timed out waiting for assertion to pass.
> {code}
> I am attaching the error log.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2767) Java test TestAuthTokenReacquire is flaky

2019-04-08 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2767:
-

 Summary: Java test TestAuthTokenReacquire is flaky
 Key: KUDU-2767
 URL: https://issues.apache.org/jira/browse/KUDU-2767
 Project: Kudu
  Issue Type: Bug
  Components: test
Reporter: Hao Hao
 Attachments: test-output.txt

I saw TestAuthTokenReacquire fail with the following error:
{noformat}
Time: 23.362
There was 1 failure:
1) testBasicMasterOperations(org.apache.kudu.client.TestAuthTokenReacquire)
java.lang.AssertionError: test failed: unexpected errors
at org.junit.Assert.fail(Assert.java:88)
at 
org.apache.kudu.client.TestAuthTokenReacquire.testBasicMasterOperations(TestAuthTokenReacquire.java:153)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at 
org.apache.kudu.test.junit.RetryRule$RetryStatement.doOneAttempt(RetryRule.java:195)
at 
org.apache.kudu.test.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:212)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
at org.junit.runner.JUnitCore.runMain(JUnitCore.java:77)
at org.junit.runner.JUnitCore.main(JUnitCore.java:36)

FAILURES!!!
Tests run: 2,  Failures: 1
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2765) tsan failure in ToolTest.TestLoadgenAutoFlushBackgroundRandom

2019-04-05 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2765:
-

 Summary: tsan failure in 
ToolTest.TestLoadgenAutoFlushBackgroundRandom
 Key: KUDU-2765
 URL: https://issues.apache.org/jira/browse/KUDU-2765
 Project: Kudu
  Issue Type: Test
Reporter: Hao Hao
 Attachments: kudu-tool-test.0.txt

ToolTest.TestLoadgenAutoFlushBackgroundRandom failed with the following error 
under TSAN:
{noformat}
==
WARNING: ThreadSanitizer: destroy of a locked mutex (pid=1076)
#0 pthread_rwlock_destroy 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1313
 (kudu+0x4b4474)
#1 glog_internal_namespace_::Mutex::~Mutex() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/glog-0.3.5/src/base/mutex.h:249:30
 (libglog.so.0+0x16488)
#2 cxa_at_exit_wrapper(void*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:386
 (kudu+0x484803)

  and:
#0 operator delete(void*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_new_delete.cc:119
 (kudu+0x523cf1)
#1 google::protobuf::FieldDescriptorProto::~FieldDescriptorProto() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/descriptor.pb.cc:4916:47
 (libprotobuf.so.14+0x19c3b1)
#2 
google::protobuf::internal::GenericTypeHandler::Delete(google::protobuf::FieldDescriptorProto*,
 google::protobuf::Arena*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/repeated_field.h:615:7
 (libprotobuf.so.14+0x1973b1)
#3 void 
google::protobuf::internal::RepeatedPtrFieldBase::Destroy::TypeHandler>()
 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/repeated_field.h:1429
 (libprotobuf.so.14+0x1973b1)
#4 
google::protobuf::RepeatedPtrField::~RepeatedPtrField()
 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/repeated_field.h:1892
 (libprotobuf.so.14+0x1973b1)
#5 google::protobuf::DescriptorProto::~DescriptorProto() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/descriptor.pb.cc:3528
 (libprotobuf.so.14+0x1973b1)
#6 google::protobuf::DescriptorProto::~DescriptorProto() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/descriptor.pb.cc:3525:37
 (libprotobuf.so.14+0x197519)
#7 
google::protobuf::internal::GenericTypeHandler::Delete(google::protobuf::DescriptorProto*,
 google::protobuf::Arena*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/repeated_field.h:615:7
 (libprotobuf.so.14+0x18e8c1)
#8 void 
google::protobuf::internal::RepeatedPtrFieldBase::Destroy::TypeHandler>()
 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/repeated_field.h:1429
 (libprotobuf.so.14+0x18e8c1)
#9 
google::protobuf::RepeatedPtrField::~RepeatedPtrField()
 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/repeated_field.h:1892
 (libprotobuf.so.14+0x18e8c1)
#10 google::protobuf::FileDescriptorProto::~FileDescriptorProto() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/descriptor.pb.cc:1426
 (libprotobuf.so.14+0x18e8c1)
#11 google::protobuf::EncodedDescriptorDatabase::Add(void const*, int) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/descriptor_database.cc:322:1
 (libprotobuf.so.14+0x182dcd)
#12 google::protobuf::DescriptorPool::InternalAddGeneratedFile(void const*, 
int) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/descriptor.cc:1315:3
 (libprotobuf.so.14+0x13b705)
#13 google::protobuf::protobuf_google_2fprotobuf_2ftype_2eproto::(anonymous 
namespace)::AddDescriptorsImpl() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/type.pb.cc:240:3
 (libprotobuf.so.14+0x237c10)
#14 google::protobuf::internal::FunctionClosure0::Run() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/stubs/callback.h:129:5
 (libprotobuf.so.14+0xd330b)
#15 google::protobuf::GoogleOnceInitImpl(long*, google::protobuf::Closure*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/protobuf-3.4.1/src/google/protobuf/stubs/once.cc:83:14
 (libprotobuf.so.14+0xd5d6a)
#16 google::protobuf::GoogleOnceInit(long*, void (*)()) 

[jira] [Created] (KUDU-2764) Timeout in sentry_authz_provider-test

2019-04-05 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2764:
-

 Summary: Timeout in sentry_authz_provider-test
 Key: KUDU-2764
 URL: https://issues.apache.org/jira/browse/KUDU-2764
 Project: Kudu
  Issue Type: Test
Reporter: Hao Hao


I encountered a timeout in sentry_authz_provider-test with the following error:
{noformat}
130/379 Test #225: compaction_policy-test ..   Passed
2.93 sec
Start 226: composite-pushdown-test
131/379 Test #188: sentry_authz_provider-test ..***Timeout 
930.23 sec
Start 227: delta_compaction-test
132/379 Test #227: delta_compaction-test ...   Passed
1.81 sec
...
The following tests FAILED:
188 - sentry_authz_provider-test (Timeout)
Errors while running CTest
+ TESTS_FAILED=1
{noformat}

We should probably improve the run time of the test.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2718) master_failover-itest when HMS is enabled is flaky

2019-04-04 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810218#comment-16810218
 ] 

Hao Hao commented on KUDU-2718:
---

DropTable in HMS is a synchronous call, so I think the drop should be reflected 
immediately once we have succeeded in dropping the table. Could it be that DropTable 
hadn't taken place before CreateTable was retried? But I don't know how DropTable in 
HMS could take up to ~2 mins.

I also looped the [test 2000 
times|http://dist-test.cloudera.org/job?job_id=hao.hao.1554350086.94333], but 
failed to reproduce the reported error here. Instead I encountered an error like:
{noformat}/data/1/hao/kudu/src/kudu/integration-tests/master_failover-itest.cc:460:
 Failure
Failed
Bad status: Invalid argument: Error creating table default.table_0 on the 
master: not enough live tablet servers to create a table with the requested 
replication factor 3; 2 tablet servers are alive{noformat}
which seems to be the issue described in KUDU-1358. Without the fix for 
KUDU-1358, we can deflake the test by retrying upon such an error, as sketched below.
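A minimal sketch of that retry (using the Java client for brevity, even though 
master_failover-itest is C++; the retry budget and the message match are assumptions, 
not the actual fix):
{code:java}
import org.apache.kudu.Schema;
import org.apache.kudu.client.CreateTableOptions;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduException;

public class CreateTableRetryExample {
  // Retry CreateTable while the cluster reports too few live tablet servers,
  // e.g. right after a tablet server has been restarted by the test harness.
  static void createTableWithRetry(KuduClient client, String name, Schema schema,
                                   CreateTableOptions options) throws Exception {
    final long deadline = System.currentTimeMillis() + 60_000;  // retry budget (assumption)
    while (true) {
      try {
        client.createTable(name, schema, options);
        return;
      } catch (KuduException e) {
        String msg = String.valueOf(e.getMessage());
        // Only retry the specific transient error quoted above.
        if (!msg.contains("not enough live tablet servers")
            || System.currentTimeMillis() > deadline) {
          throw e;
        }
        Thread.sleep(500);
      }
    }
  }
}
{code}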

> master_failover-itest when HMS is enabled is flaky
> --
>
> Key: KUDU-2718
> URL: https://issues.apache.org/jira/browse/KUDU-2718
> Project: Kudu
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.9.0
>Reporter: Adar Dembo
>Assignee: Hao Hao
>Priority: Major
> Attachments: master_failover-itest.1.txt
>
>
> This was a failure in 
> HmsConfigurations/MasterFailoverTest.TestDeleteTableSync/1, where GetParam() 
> = 2, but it's likely possible in every multi-master test with HMS integration 
> enabled.
> It looks like there was a leader master election at the time that the client 
> tried to create the table being tested. The master managed to create the 
> table in HMS, but then there was a failure replicating in Raft because 
> another master was elected leader. So the client retried the request on a 
> different master, but the HMS piece of CreateTable failed because the HMS 
> already knew about the table.
> Thing is, there's code to roll back the HMS table creation if this happens, 
> so I don't see why the retried CreateTable failed at the HMS with "table 
> already exists". Perhaps this is a case where even though we succeeded in 
> dropping the table from HMS, it doesn't reflect that immediately?
> I'm attaching the full log.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KUDU-2757) Retry OpenSSL downloads

2019-04-03 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao resolved KUDU-2757.
---
   Resolution: Fixed
 Assignee: Hao Hao
Fix Version/s: 1.10.0

Fixed in commit 984a3e1a1.

> Retry OpenSSL downloads
> ---
>
> Key: KUDU-2757
> URL: https://issues.apache.org/jira/browse/KUDU-2757
> Project: Kudu
>  Issue Type: Bug
>Reporter: Hao Hao
>Assignee: Hao Hao
>Priority: Major
> Fix For: 1.10.0
>
>
> KUDU-2528 added retries when downloading thirdparty dependencies; however, we 
> don't retry when downloading the OpenSSL RPMs. Here's an example:
> {noformat}
> Building on el6: installing OpenSSL from CentOS 6.4.
> Fetching openssl-1.0.0-27.el6.x86_64.rpm from 
> http://d3dr9sfxru4sde.cloudfront.net/openssl-1.0.0-27.el6.x86_64.rpm
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
>   0 1392k0 00 0  0  0 --:--:--  0:00:01 --:--:-- 0
>   0 1392k0  14480 0805  0  0:29:30  0:00:01  0:29:29   808
>   0 1392k0  14480 0517  0  0:45:57  0:00:02  0:45:55   518
>   0 1392k0  14480 0381  0  1:02:21  0:00:03  1:02:18   381
>   0 1392k0  14480 0301  0  1:18:56  0:00:04  1:18:52   301
>   0 1392k0  14480 0249  0  1:35:25  0:00:05  1:35:20   317
>   0 1392k0  14480 0212  0  1:52:04  0:00:06  1:51:58 0
>   0 1392k0  14480 0185  0  2:08:25  0:00:07  2:08:18 0
>   0 1392k0  17040 0199  0  1:59:23  0:00:08  1:59:1554
>   0 1392k0  22160 0232  0  1:42:24  0:00:09  1:42:15   161
>   0 1392k0  22160 0210  0  1:53:08  0:00:10  1:52:58   161
>   0 1392k0  22160 0191  0  2:04:23  0:00:11  2:04:12   161
>   0 1392k0  23800 0187  0  2:07:03  0:00:12  2:06:51   190
>   0 1392k0  35440 0252  0  1:34:17  0:00:14  1:34:03   334
>   0 1392k0  36160 0247  0  1:36:11  0:00:14  1:35:57   276
>   0 1392k0  40480 0260  0  1:31:23  0:00:15  1:31:08   366
>   0 1392k0  50560 0302  0  1:18:40  0:00:16  1:18:24   547
>   0 1392k0  61000 0344  0  1:09:04  0:00:17  1:08:47   743
>   0 1392k0  67840 0359  0  1:06:10  0:00:18  1:05:52   670
>   0 1392k0  70360 0356  0  1:06:44  0:00:19  1:06:25   665
>   0 1392k0  80080 0386  0  1:01:33  0:00:20  1:01:13   764
>   0 1392k0  85120 0387  0  1:01:23  0:00:21  1:01:02   657
>   0 1392k0  86560 0384  0  1:01:52  0:00:22  1:01:30   529
>   0 1392k0 111000 0469  0  0:50:39  0:00:23  0:50:16   900
>   0 1392k0 115680 0462  0  0:51:25  0:00:24  0:51:01   862
>   0 1392k0 122160 0478  0  0:49:42  0:00:25  0:49:17   875
>   0 1392k0 124680 0465  0  0:51:05  0:00:26  0:50:39   829
>   0 1392k0 127560 0462  0  0:51:25  0:00:27  0:50:58   811
>   0 1392k0 136200 0470  0  0:50:33  0:00:28  0:50:05   473
>   0 1392k0 138000 0465  0  0:51:05  0:00:29  0:50:36   479
>   1 1392k1 147360 0481  0  0:49:23  0:00:30  0:48:53   493
>   1 1392k1 165400 0522  0  0:45:30  0:00:31  0:44:59   832
>   1 1392k1 182760 0559  0  0:42:30  0:00:32  0:41:58  1084
>   1 1392k1 187080 0550  0  0:43:11  0:00:34  0:42:37  1009
>   1 1392k1 187440 0540  0  0:43:59  0:00:34  0:43:25   984
>   1 1392k1 196880 0543  0  0:43:45  0:00:36  0:43:09   887
>   1 1392k1 197240 0536  0  0:44:19  0:00:36  0:43:43   621
>   1 1392k1 200840 0532  0  0:44:39  0:00:37  0:44:02   360
>   1 1392k1 207040 0536  0  0:44:19  0:00:38  0:43:41   432
>   1 1392k1 216040 0545  0  0:43:35  0:00:39  0:42:56   580
>   1 1392k1 237800 0586  0  0:40:32  0:00:40  0:39:52   947
>   1 1392k1 251480 0602  0  0:39:28  0:00:41  0:38:47  1097
>   1 1392k1 258320 0607  0  0:39:08  0:00:42  0:38:26  1187
>   1 1392k1 263880 0606  0  0:39:12  0:00:43  0:38:29  1157
>   1 1392k1 268560 0602  0  0:39:28  0:00:44  0:38:44  1059
>   1 1392k1 270720 0594  0  0:39:59  0:00:45  0:39:14   654
>   1 1392k1 273240 0587  0  0:40:28  0:00:46  0:39:42   451
> 

[jira] [Updated] (KUDU-2760) kudu-tool-test.TestCopyTable is flaky

2019-04-03 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2760:
--
Attachment: kudu-tool-test.0.txt

> kudu-tool-test.TestCopyTable is flaky
> -
>
> Key: KUDU-2760
> URL: https://issues.apache.org/jira/browse/KUDU-2760
> Project: Kudu
>  Issue Type: Bug
>Reporter: Hao Hao
>Priority: Major
> Attachments: kudu-tool-test.0.txt
>
>
> Encountered a failure of TestCopyTable in kudu-tool-test.cc with the 
> following error:
> {noformat}
> I0403 04:31:25.63  7865 catalog_manager.cc:3977] T 
> 216c0526bd944c7da8c2a62cabe430ba P c80b8c54e90346559bbb413ebdb7d08f reported 
> cstate change: term changed from 0 to 1, leader changed from  to 
> c80b8c54e90346559bbb413ebdb7d08f (127.3.136.65). New cstate: current_term: 1 
> leader_uuid: "c80b8c54e90346559bbb413ebdb7d08f" committed_config { 
> opid_index: -1 OBSOLETE_local: true peers { permanent_uuid: 
> "c80b8c54e90346559bbb413ebdb7d08f" member_type: VOTER last_known_addr { host: 
> "127.3.136.65" port: 35451 } health_report { overall_health: HEALTHY } } }
> /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/kudu-tool-test.cc:582:
>  Failure
> Value of: dst_line
> Expected: has no substring "key"
>   Actual: "(int32 key=151, int32 int_val=-325474784, string 
> string_val=\"7ca8cde3dcca640a\")" (of type std::string)
> /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/kudu-tool-test.cc:3067:
>  Failure
> Expected: RunCopyTableCheck(arg) doesn't generate new fatal failures in the 
> current thread.
>   Actual: it does.
> {noformat}
> Attached the full log.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2757) Retry OpenSSL downloads

2019-04-02 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2757:
--
Code Review: https://gerrit.cloudera.org/#/c/12913/  (was: 
https://gerrit.cloudera.org/#/c/8313/)

> Retry OpenSSL downloads
> ---
>
> Key: KUDU-2757
> URL: https://issues.apache.org/jira/browse/KUDU-2757
> Project: Kudu
>  Issue Type: Bug
>Reporter: Hao Hao
>Priority: Major
>
> KUDU-2528 added retries when downloading thirdparty dependencies; however, we 
> don't retry when downloading the OpenSSL RPMs. Here's an example:
> {noformat}
> Building on el6: installing OpenSSL from CentOS 6.4.
> Fetching openssl-1.0.0-27.el6.x86_64.rpm from 
> http://d3dr9sfxru4sde.cloudfront.net/openssl-1.0.0-27.el6.x86_64.rpm
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
>   0 1392k0 00 0  0  0 --:--:--  0:00:01 --:--:-- 0
>   0 1392k0  14480 0805  0  0:29:30  0:00:01  0:29:29   808
>   0 1392k0  14480 0517  0  0:45:57  0:00:02  0:45:55   518
>   0 1392k0  14480 0381  0  1:02:21  0:00:03  1:02:18   381
>   0 1392k0  14480 0301  0  1:18:56  0:00:04  1:18:52   301
>   0 1392k0  14480 0249  0  1:35:25  0:00:05  1:35:20   317
>   0 1392k0  14480 0212  0  1:52:04  0:00:06  1:51:58 0
>   0 1392k0  14480 0185  0  2:08:25  0:00:07  2:08:18 0
>   0 1392k0  17040 0199  0  1:59:23  0:00:08  1:59:1554
>   0 1392k0  22160 0232  0  1:42:24  0:00:09  1:42:15   161
>   0 1392k0  22160 0210  0  1:53:08  0:00:10  1:52:58   161
>   0 1392k0  22160 0191  0  2:04:23  0:00:11  2:04:12   161
>   0 1392k0  23800 0187  0  2:07:03  0:00:12  2:06:51   190
>   0 1392k0  35440 0252  0  1:34:17  0:00:14  1:34:03   334
>   0 1392k0  36160 0247  0  1:36:11  0:00:14  1:35:57   276
>   0 1392k0  40480 0260  0  1:31:23  0:00:15  1:31:08   366
>   0 1392k0  50560 0302  0  1:18:40  0:00:16  1:18:24   547
>   0 1392k0  61000 0344  0  1:09:04  0:00:17  1:08:47   743
>   0 1392k0  67840 0359  0  1:06:10  0:00:18  1:05:52   670
>   0 1392k0  70360 0356  0  1:06:44  0:00:19  1:06:25   665
>   0 1392k0  80080 0386  0  1:01:33  0:00:20  1:01:13   764
>   0 1392k0  85120 0387  0  1:01:23  0:00:21  1:01:02   657
>   0 1392k0  86560 0384  0  1:01:52  0:00:22  1:01:30   529
>   0 1392k0 111000 0469  0  0:50:39  0:00:23  0:50:16   900
>   0 1392k0 115680 0462  0  0:51:25  0:00:24  0:51:01   862
>   0 1392k0 122160 0478  0  0:49:42  0:00:25  0:49:17   875
>   0 1392k0 124680 0465  0  0:51:05  0:00:26  0:50:39   829
>   0 1392k0 127560 0462  0  0:51:25  0:00:27  0:50:58   811
>   0 1392k0 136200 0470  0  0:50:33  0:00:28  0:50:05   473
>   0 1392k0 138000 0465  0  0:51:05  0:00:29  0:50:36   479
>   1 1392k1 147360 0481  0  0:49:23  0:00:30  0:48:53   493
>   1 1392k1 165400 0522  0  0:45:30  0:00:31  0:44:59   832
>   1 1392k1 182760 0559  0  0:42:30  0:00:32  0:41:58  1084
>   1 1392k1 187080 0550  0  0:43:11  0:00:34  0:42:37  1009
>   1 1392k1 187440 0540  0  0:43:59  0:00:34  0:43:25   984
>   1 1392k1 196880 0543  0  0:43:45  0:00:36  0:43:09   887
>   1 1392k1 197240 0536  0  0:44:19  0:00:36  0:43:43   621
>   1 1392k1 200840 0532  0  0:44:39  0:00:37  0:44:02   360
>   1 1392k1 207040 0536  0  0:44:19  0:00:38  0:43:41   432
>   1 1392k1 216040 0545  0  0:43:35  0:00:39  0:42:56   580
>   1 1392k1 237800 0586  0  0:40:32  0:00:40  0:39:52   947
>   1 1392k1 251480 0602  0  0:39:28  0:00:41  0:38:47  1097
>   1 1392k1 258320 0607  0  0:39:08  0:00:42  0:38:26  1187
>   1 1392k1 263880 0606  0  0:39:12  0:00:43  0:38:29  1157
>   1 1392k1 268560 0602  0  0:39:28  0:00:44  0:38:44  1059
>   1 1392k1 270720 0594  0  0:39:59  0:00:45  0:39:14   654
>   1 1392k1 273240 0587  0  0:40:28  0:00:46  0:39:42   451
>   1 1392k1 274680 0576  0  0:41:14  0:00:47  

[jira] [Updated] (KUDU-2757) Retry OpenSSL downloads

2019-04-02 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2757:
--
Code Review: https://gerrit.cloudera.org/#/c/8313/

> Retry OpenSSL downloads
> ---
>
> Key: KUDU-2757
> URL: https://issues.apache.org/jira/browse/KUDU-2757
> Project: Kudu
>  Issue Type: Bug
>Reporter: Hao Hao
>Priority: Major
>
> KUDU-2528 added retries when downloading thirdparty dependencies; however, we 
> don't retry when downloading the OpenSSL RPMs. Here's an example:
> {noformat}
> Building on el6: installing OpenSSL from CentOS 6.4.
> Fetching openssl-1.0.0-27.el6.x86_64.rpm from 
> http://d3dr9sfxru4sde.cloudfront.net/openssl-1.0.0-27.el6.x86_64.rpm
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
>   0 1392k0 00 0  0  0 --:--:--  0:00:01 --:--:-- 0
>   0 1392k0  14480 0805  0  0:29:30  0:00:01  0:29:29   808
>   0 1392k0  14480 0517  0  0:45:57  0:00:02  0:45:55   518
>   0 1392k0  14480 0381  0  1:02:21  0:00:03  1:02:18   381
>   0 1392k0  14480 0301  0  1:18:56  0:00:04  1:18:52   301
>   0 1392k0  14480 0249  0  1:35:25  0:00:05  1:35:20   317
>   0 1392k0  14480 0212  0  1:52:04  0:00:06  1:51:58 0
>   0 1392k0  14480 0185  0  2:08:25  0:00:07  2:08:18 0
>   0 1392k0  17040 0199  0  1:59:23  0:00:08  1:59:1554
>   0 1392k0  22160 0232  0  1:42:24  0:00:09  1:42:15   161
>   0 1392k0  22160 0210  0  1:53:08  0:00:10  1:52:58   161
>   0 1392k0  22160 0191  0  2:04:23  0:00:11  2:04:12   161
>   0 1392k0  23800 0187  0  2:07:03  0:00:12  2:06:51   190
>   0 1392k0  35440 0252  0  1:34:17  0:00:14  1:34:03   334
>   0 1392k0  36160 0247  0  1:36:11  0:00:14  1:35:57   276
>   0 1392k0  40480 0260  0  1:31:23  0:00:15  1:31:08   366
>   0 1392k0  50560 0302  0  1:18:40  0:00:16  1:18:24   547
>   0 1392k0  61000 0344  0  1:09:04  0:00:17  1:08:47   743
>   0 1392k0  67840 0359  0  1:06:10  0:00:18  1:05:52   670
>   0 1392k0  70360 0356  0  1:06:44  0:00:19  1:06:25   665
>   0 1392k0  80080 0386  0  1:01:33  0:00:20  1:01:13   764
>   0 1392k0  85120 0387  0  1:01:23  0:00:21  1:01:02   657
>   0 1392k0  86560 0384  0  1:01:52  0:00:22  1:01:30   529
>   0 1392k0 111000 0469  0  0:50:39  0:00:23  0:50:16   900
>   0 1392k0 115680 0462  0  0:51:25  0:00:24  0:51:01   862
>   0 1392k0 122160 0478  0  0:49:42  0:00:25  0:49:17   875
>   0 1392k0 124680 0465  0  0:51:05  0:00:26  0:50:39   829
>   0 1392k0 127560 0462  0  0:51:25  0:00:27  0:50:58   811
>   0 1392k0 136200 0470  0  0:50:33  0:00:28  0:50:05   473
>   0 1392k0 138000 0465  0  0:51:05  0:00:29  0:50:36   479
>   1 1392k1 147360 0481  0  0:49:23  0:00:30  0:48:53   493
>   1 1392k1 165400 0522  0  0:45:30  0:00:31  0:44:59   832
>   1 1392k1 182760 0559  0  0:42:30  0:00:32  0:41:58  1084
>   1 1392k1 187080 0550  0  0:43:11  0:00:34  0:42:37  1009
>   1 1392k1 187440 0540  0  0:43:59  0:00:34  0:43:25   984
>   1 1392k1 196880 0543  0  0:43:45  0:00:36  0:43:09   887
>   1 1392k1 197240 0536  0  0:44:19  0:00:36  0:43:43   621
>   1 1392k1 200840 0532  0  0:44:39  0:00:37  0:44:02   360
>   1 1392k1 207040 0536  0  0:44:19  0:00:38  0:43:41   432
>   1 1392k1 216040 0545  0  0:43:35  0:00:39  0:42:56   580
>   1 1392k1 237800 0586  0  0:40:32  0:00:40  0:39:52   947
>   1 1392k1 251480 0602  0  0:39:28  0:00:41  0:38:47  1097
>   1 1392k1 258320 0607  0  0:39:08  0:00:42  0:38:26  1187
>   1 1392k1 263880 0606  0  0:39:12  0:00:43  0:38:29  1157
>   1 1392k1 268560 0602  0  0:39:28  0:00:44  0:38:44  1059
>   1 1392k1 270720 0594  0  0:39:59  0:00:45  0:39:14   654
>   1 1392k1 273240 0587  0  0:40:28  0:00:46  0:39:42   451
>   1 1392k1 274680 0576  0  0:41:14  0:00:47  0:40:27   319
>   1 1392k1 275400 0

[jira] [Created] (KUDU-2757) Retry OpenSSL downloads

2019-04-02 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2757:
-

 Summary: Retry OpenSSL downloads
 Key: KUDU-2757
 URL: https://issues.apache.org/jira/browse/KUDU-2757
 Project: Kudu
  Issue Type: Bug
Reporter: Hao Hao


KUDU-2528 added retries when downloading thirdparty dependencies; however, we 
don't retry when downloading the OpenSSL RPMs. Here's an example:
{noformat}
Building on el6: installing OpenSSL from CentOS 6.4.
Fetching openssl-1.0.0-27.el6.x86_64.rpm from 
http://d3dr9sfxru4sde.cloudfront.net/openssl-1.0.0-27.el6.x86_64.rpm
  % Total% Received % Xferd  Average Speed   TimeTime Time  Current
 Dload  Upload   Total   SpentLeft  Speed

  0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
  0 1392k0 00 0  0  0 --:--:--  0:00:01 --:--:-- 0
  0 1392k0  14480 0805  0  0:29:30  0:00:01  0:29:29   808
  0 1392k0  14480 0517  0  0:45:57  0:00:02  0:45:55   518
  0 1392k0  14480 0381  0  1:02:21  0:00:03  1:02:18   381
  0 1392k0  14480 0301  0  1:18:56  0:00:04  1:18:52   301
  0 1392k0  14480 0249  0  1:35:25  0:00:05  1:35:20   317
  0 1392k0  14480 0212  0  1:52:04  0:00:06  1:51:58 0
  0 1392k0  14480 0185  0  2:08:25  0:00:07  2:08:18 0
  0 1392k0  17040 0199  0  1:59:23  0:00:08  1:59:1554
  0 1392k0  22160 0232  0  1:42:24  0:00:09  1:42:15   161
  0 1392k0  22160 0210  0  1:53:08  0:00:10  1:52:58   161
  0 1392k0  22160 0191  0  2:04:23  0:00:11  2:04:12   161
  0 1392k0  23800 0187  0  2:07:03  0:00:12  2:06:51   190
  0 1392k0  35440 0252  0  1:34:17  0:00:14  1:34:03   334
  0 1392k0  36160 0247  0  1:36:11  0:00:14  1:35:57   276
  0 1392k0  40480 0260  0  1:31:23  0:00:15  1:31:08   366
  0 1392k0  50560 0302  0  1:18:40  0:00:16  1:18:24   547
  0 1392k0  61000 0344  0  1:09:04  0:00:17  1:08:47   743
  0 1392k0  67840 0359  0  1:06:10  0:00:18  1:05:52   670
  0 1392k0  70360 0356  0  1:06:44  0:00:19  1:06:25   665
  0 1392k0  80080 0386  0  1:01:33  0:00:20  1:01:13   764
  0 1392k0  85120 0387  0  1:01:23  0:00:21  1:01:02   657
  0 1392k0  86560 0384  0  1:01:52  0:00:22  1:01:30   529
  0 1392k0 111000 0469  0  0:50:39  0:00:23  0:50:16   900
  0 1392k0 115680 0462  0  0:51:25  0:00:24  0:51:01   862
  0 1392k0 122160 0478  0  0:49:42  0:00:25  0:49:17   875
  0 1392k0 124680 0465  0  0:51:05  0:00:26  0:50:39   829
  0 1392k0 127560 0462  0  0:51:25  0:00:27  0:50:58   811
  0 1392k0 136200 0470  0  0:50:33  0:00:28  0:50:05   473
  0 1392k0 138000 0465  0  0:51:05  0:00:29  0:50:36   479
  1 1392k1 147360 0481  0  0:49:23  0:00:30  0:48:53   493
  1 1392k1 165400 0522  0  0:45:30  0:00:31  0:44:59   832
  1 1392k1 182760 0559  0  0:42:30  0:00:32  0:41:58  1084
  1 1392k1 187080 0550  0  0:43:11  0:00:34  0:42:37  1009
  1 1392k1 187440 0540  0  0:43:59  0:00:34  0:43:25   984
  1 1392k1 196880 0543  0  0:43:45  0:00:36  0:43:09   887
  1 1392k1 197240 0536  0  0:44:19  0:00:36  0:43:43   621
  1 1392k1 200840 0532  0  0:44:39  0:00:37  0:44:02   360
  1 1392k1 207040 0536  0  0:44:19  0:00:38  0:43:41   432
  1 1392k1 216040 0545  0  0:43:35  0:00:39  0:42:56   580
  1 1392k1 237800 0586  0  0:40:32  0:00:40  0:39:52   947
  1 1392k1 251480 0602  0  0:39:28  0:00:41  0:38:47  1097
  1 1392k1 258320 0607  0  0:39:08  0:00:42  0:38:26  1187
  1 1392k1 263880 0606  0  0:39:12  0:00:43  0:38:29  1157
  1 1392k1 268560 0602  0  0:39:28  0:00:44  0:38:44  1059
  1 1392k1 270720 0594  0  0:39:59  0:00:45  0:39:14   654
  1 1392k1 273240 0587  0  0:40:28  0:00:46  0:39:42   451
  1 1392k1 274680 0576  0  0:41:14  0:00:47  0:40:27   319
  1 1392k1 275400 0561  0  0:42:21  0:00:49  0:41:32   208
  1 1392k1 276480 0551  0  0:43:07  0:00:50  0:42:17   142
  1 1392k1 277560 0546  0  0:43:30  0:00:50  0:42:40   131
  1 1392k1 277920 0533  0  0:44:34  0:00:52  0:43:4284
  1 1392k1 277920 0523  0  0:45:25  

[jira] [Created] (KUDU-2756) RemoteKsckTest.TestClusterWithLocation failed with master consensus conflicts

2019-04-02 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2756:
-

 Summary: RemoteKsckTest.TestClusterWithLocation failed with master 
consensus conflicts
 Key: KUDU-2756
 URL: https://issues.apache.org/jira/browse/KUDU-2756
 Project: Kudu
  Issue Type: Test
Reporter: Hao Hao
 Attachments: ksck_remote-test.txt

RemoteKsckTest.TestClusterWithLocation is still flaky after the fix for KUDU-2748 
and fails with the following error:

{noformat}
I0401 16:42:06.135743 18496 sys_catalog.cc:340] T 
 P 1afc84687f934a5a8055897bbf6c2a92 
[sys.catalog]: This master's current role is: LEADER
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/ksck_remote-test.cc:542:
 Failure
Failed
Bad status: Corruption: there are master consensus conflicts
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/test_util.cc:326:
 Failure
Failed
Timed out waiting for assertion to pass.
I0401 16:42:35.964449 12160 tablet_server.cc:165] TabletServer shutting down...
{noformat}

Attached the full log.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2723) TsLocationAssignmentITest.Basic is flaky

2019-03-01 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2723:
-

 Summary: TsLocationAssignmentITest.Basic is flaky
 Key: KUDU-2723
 URL: https://issues.apache.org/jira/browse/KUDU-2723
 Project: Kudu
  Issue Type: Test
Reporter: Hao Hao
 Attachments: location_assignment-itest.txt

I encountered a failure of location_assignment-itest with the following errors:

{noformat}
/data/1/hao/kudu/src/kudu/integration-tests/location_assignment-itest.cc:185: 
Failure
Failed
Bad status: Timed out: ListTabletServers RPC failed: ListTabletServers RPC to 
127.0.22.190:34129 timed out after 15.000s (SENT)
/data/1/hao/kudu/src/kudu/integration-tests/location_assignment-itest.cc:217: 
Failure
Expected: StartCluster() doesn't generate new fatal failures in the current 
thread.
  Actual: it does.
{noformat}

Attached the full log.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2718) master_failover-itest when HMS is enabled is flaky

2019-02-27 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao reassigned KUDU-2718:
-

Assignee: Hao Hao

> master_failover-itest when HMS is enabled is flaky
> --
>
> Key: KUDU-2718
> URL: https://issues.apache.org/jira/browse/KUDU-2718
> Project: Kudu
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.9.0
>Reporter: Adar Dembo
>Assignee: Hao Hao
>Priority: Major
> Attachments: master_failover-itest.1.txt
>
>
> This was a failure in 
> HmsConfigurations/MasterFailoverTest.TestDeleteTableSync/1, where GetParam() 
> = 2, but it's likely possible in every multi-master test with HMS integration 
> enabled.
> It looks like there was a leader master election at the time that the client 
> tried to create the table being tested. The master managed to create the 
> table in HMS, but then there was a failure replicating in Raft because 
> another master was elected leader. So the client retried the request on a 
> different master, but the HMS piece of CreateTable failed because the HMS 
> already knew about the table.
> Thing is, there's code to roll back the HMS table creation if this happens, 
> so I don't see why the retried CreateTable failed at the HMS with "table 
> already exists". Perhaps this is a case where even though we succeeded in 
> dropping the table from HMS, it doesn't reflect that immediately?
> I'm attaching the full log.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2703) ITClientStress.testMultipleSessions timeout

2019-02-19 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2703:
-

 Summary: ITClientStress.testMultipleSessions timeout
 Key: KUDU-2703
 URL: https://issues.apache.org/jira/browse/KUDU-2703
 Project: Kudu
  Issue Type: Test
Reporter: Hao Hao
 Attachments: TEST-org.apache.kudu.client.ITClientStress.xml

I recently encountered a timeout in ITClientStress.testMultipleSessions with 
the following errors:
{noformat}
01:28:10.477 [INFO - cluster stderr printer] (MiniKuduCluster.java:543) I0219 
01:28:10.477150  7258 mvcc.cc:203] Tried to move safe_time back from 
6351128536717332480 to 6351128535804997632. Current Snapshot: 
MvccSnapshot[committed={T|T < 6351128536717332480}]
01:28:10.495 [INFO - cluster stderr printer] (MiniKuduCluster.java:543) I0219 
01:28:10.495507  7259 mvcc.cc:203] Tried to move safe_time back from 
6351128536717332480 to 6351128535804997632. Current Snapshot: 
MvccSnapshot[committed={T|T < 6351128536717332480}]
01:28:11.180 [INFO - cluster stderr printer] (MiniKuduCluster.java:543) I0219 
01:28:11.180346  7257 mvcc.cc:203] Tried to move safe_time back from 
6351128539811692544 to 6351128535804997632. Current Snapshot: 
MvccSnapshot[committed={T|T < 6351128539811692544 or (T in 
{6351128539811692544})}]
01:28:19.969 [DEBUG - New I/O worker #2152] (Connection.java:429) [peer 
master-127.6.95.253:51702(127.6.95.253:51702)] encountered a read timeout; 
closing the channel
01:28:19.969 [DEBUG - New I/O worker #2154] (Connection.java:429) [peer 
master-127.6.95.254:34354(127.6.95.254:34354)] encountered a read timeout; 
closing the channel
01:28:19.969 [DEBUG - New I/O worker #2154] (Connection.java:688) [peer 
master-127.6.95.254:34354(127.6.95.254:34354)] cleaning up while in state READY 
due to: [peer master-127.6.95.254:34354(127.6.95.254:34354)] encountered a read 
timeout; closing the channel
01:28:19.969 [DEBUG - New I/O worker #2152] (Connection.java:688) [peer 
master-127.6.95.253:51702(127.6.95.253:51702)] cleaning up while in state READY 
due to: [peer master-127.6.95.253:51702(127.6.95.253:51702)] encountered a read 
timeout; closing the channel
01:28:20.328 [DEBUG - New I/O worker #2153] (Connection.java:429) [peer 
master-127.6.95.252:47527(127.6.95.252:47527)] encountered a read timeout; 
closing the channel
{noformat}

Looking at the error, it may be related to safe time advancement. Attached the full 
log.
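For context, the kind of write pattern the test name suggests, as a hedged 
reconstruction against the public Java client API rather than the actual test code 
(table name, column name, and session/row counts are placeholders):
{code:java}
import org.apache.kudu.client.Insert;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduSession;
import org.apache.kudu.client.KuduTable;

public class MultipleSessionsExample {
  // Several sessions writing to the same table (the real test presumably drives
  // them from multiple threads concurrently). The "Tried to move safe_time back"
  // messages above come from tablet-side MVCC while such writes are applied
  // (my reading of the log, not verified).
  static void writeWithMultipleSessions(KuduClient client, String tableName) throws Exception {
    KuduTable table = client.openTable(tableName);
    for (int s = 0; s < 10; s++) {                  // number of sessions: placeholder
      KuduSession session = client.newSession();
      for (int i = 0; i < 100; i++) {
        Insert insert = table.newInsert();
        insert.getRow().addInt("key", s * 100 + i); // assumes an int32 "key" column
        session.apply(insert);
      }
      session.flush();
      session.close();
    }
  }
}
{code}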



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2668) TestKuduClient.readYourWrites tests are flaky

2019-01-28 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754464#comment-16754464
 ] 

Hao Hao commented on KUDU-2668:
---

Fixed in commit c7c902a08.

> TestKuduClient.readYourWrites tests are flaky
> -
>
> Key: KUDU-2668
> URL: https://issues.apache.org/jira/browse/KUDU-2668
> Project: Kudu
>  Issue Type: Bug
>  Components: java, test
>Affects Versions: 1.9.0
>Reporter: Adar Dembo
>Assignee: Hao Hao
>Priority: Critical
>
> I looped TestKuduClient 1000 times in dist-test while working on another 
> problem, and saw the following failures:
> {noformat}
> 1 testReadYourWritesBatchLeaderReplica
> 14 testReadYourWritesSyncClosestReplica
> 15 testReadYourWritesSyncLeaderReplica
> {noformat}
> In all cases, the stack trace of the failure was effectively this:
> {noformat}
> java.util.concurrent.ExecutionException: java.lang.AssertionError
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at 
> org.apache.kudu.client.TestKuduClient.readYourWrites(TestKuduClient.java:1113)
> ...
> Caused by: java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1098)
> at 
> org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1055)
> ...
> {noformat}
> The offending lines:
> {code}
>   AsyncKuduScanner scanner = asyncClient.newScannerBuilder(table)
>   .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES)
>   .replicaSelection(replicaSelection)
>   .build();
>   KuduScanner syncScanner = new KuduScanner(scanner);
>   long preTs = asyncClient.getLastPropagatedTimestamp();
>   assertNotEquals(AsyncKuduClient.NO_TIMESTAMP,
>   asyncClient.getLastPropagatedTimestamp());
>   long row_count = countRowsInScan(syncScanner);
>   long expected_count = 100L * (i + 1);
>   assertTrue(expected_count <= row_count);
>   // After the scan, verify that the chosen snapshot timestamp is
>   // returned from the server and it is larger than the previous
>   // propagated timestamp.
>   assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, 
> scanner.getSnapshotTimestamp());
> -->   assertTrue(preTs < scanner.getSnapshotTimestamp());
> {code}
> It's possible that this is just test flakiness, but I'm setting a higher 
> priority so we can understand whether that's the case, or whether there's 
> something wrong with read-your-writes scans.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KUDU-2668) TestKuduClient.readYourWrites tests are flaky

2019-01-28 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao resolved KUDU-2668.
---
   Resolution: Fixed
Fix Version/s: 1.9.0

> TestKuduClient.readYourWrites tests are flaky
> -
>
> Key: KUDU-2668
> URL: https://issues.apache.org/jira/browse/KUDU-2668
> Project: Kudu
>  Issue Type: Bug
>  Components: java, test
>Affects Versions: 1.9.0
>Reporter: Adar Dembo
>Assignee: Hao Hao
>Priority: Critical
> Fix For: 1.9.0
>
>
> I looped TestKuduClient 1000 times in dist-test while working on another 
> problem, and saw the following failures:
> {noformat}
> 1 testReadYourWritesBatchLeaderReplica
> 14 testReadYourWritesSyncClosestReplica
> 15 testReadYourWritesSyncLeaderReplica
> {noformat}
> In all cases, the stack trace of the failure was effectively this:
> {noformat}
> java.util.concurrent.ExecutionException: java.lang.AssertionError
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at 
> org.apache.kudu.client.TestKuduClient.readYourWrites(TestKuduClient.java:1113)
> ...
> Caused by: java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1098)
> at 
> org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1055)
> ...
> {noformat}
> The offending lines:
> {code}
>   AsyncKuduScanner scanner = asyncClient.newScannerBuilder(table)
>   .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES)
>   .replicaSelection(replicaSelection)
>   .build();
>   KuduScanner syncScanner = new KuduScanner(scanner);
>   long preTs = asyncClient.getLastPropagatedTimestamp();
>   assertNotEquals(AsyncKuduClient.NO_TIMESTAMP,
>   asyncClient.getLastPropagatedTimestamp());
>   long row_count = countRowsInScan(syncScanner);
>   long expected_count = 100L * (i + 1);
>   assertTrue(expected_count <= row_count);
>   // After the scan, verify that the chosen snapshot timestamp is
>   // returned from the server and it is larger than the previous
>   // propagated timestamp.
>   assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, 
> scanner.getSnapshotTimestamp());
> -->   assertTrue(preTs < scanner.getSnapshotTimestamp());
> {code}
> It's possible that this is just test flakiness, but I'm setting a higher 
> priority so we can understand whether that's the case, or whether there's 
> something wrong with read-your-writes scans.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2668) TestKuduClient.readYourWrites tests are flaky

2019-01-25 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752886#comment-16752886
 ] 

Hao Hao commented on KUDU-2668:
---

Fixed in commit 847ceb84d.

> TestKuduClient.readYourWrites tests are flaky
> -
>
> Key: KUDU-2668
> URL: https://issues.apache.org/jira/browse/KUDU-2668
> Project: Kudu
>  Issue Type: Bug
>  Components: java, test
>Affects Versions: 1.9.0
>Reporter: Adar Dembo
>Assignee: Hao Hao
>Priority: Critical
>
> I looped TestKuduClient 1000 times in dist-test while working on another 
> problem, and saw the following failures:
> {noformat}
> 1 testReadYourWritesBatchLeaderReplica
> 14 testReadYourWritesSyncClosestReplica
> 15 testReadYourWritesSyncLeaderReplica
> {noformat}
> In all cases, the stack trace of the failure was effectively this:
> {noformat}
> java.util.concurrent.ExecutionException: java.lang.AssertionError
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at 
> org.apache.kudu.client.TestKuduClient.readYourWrites(TestKuduClient.java:1113)
> ...
> Caused by: java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1098)
> at 
> org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1055)
> ...
> {noformat}
> The offending lines:
> {code}
>   AsyncKuduScanner scanner = asyncClient.newScannerBuilder(table)
>   .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES)
>   .replicaSelection(replicaSelection)
>   .build();
>   KuduScanner syncScanner = new KuduScanner(scanner);
>   long preTs = asyncClient.getLastPropagatedTimestamp();
>   assertNotEquals(AsyncKuduClient.NO_TIMESTAMP,
>   asyncClient.getLastPropagatedTimestamp());
>   long row_count = countRowsInScan(syncScanner);
>   long expected_count = 100L * (i + 1);
>   assertTrue(expected_count <= row_count);
>   // After the scan, verify that the chosen snapshot timestamp is
>   // returned from the server and it is larger than the previous
>   // propagated timestamp.
>   assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, 
> scanner.getSnapshotTimestamp());
> -->   assertTrue(preTs < scanner.getSnapshotTimestamp());
> {code}
> It's possible that this is just test flakiness, but I'm setting a higher 
> priority so we can understand whether that's the case, or whether there's 
> something wrong with read-your-writes scans.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (KUDU-2668) TestKuduClient.readYourWrites tests are flaky

2019-01-25 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2668:
--
Comment: was deleted

(was: Fixed in commit 847ceb84d.)

> TestKuduClient.readYourWrites tests are flaky
> -
>
> Key: KUDU-2668
> URL: https://issues.apache.org/jira/browse/KUDU-2668
> Project: Kudu
>  Issue Type: Bug
>  Components: java, test
>Affects Versions: 1.9.0
>Reporter: Adar Dembo
>Assignee: Hao Hao
>Priority: Critical
>
> I looped TestKuduClient 1000 times in dist-test while working on another 
> problem, and saw the following failures:
> {noformat}
> 1 testReadYourWritesBatchLeaderReplica
> 14 testReadYourWritesSyncClosestReplica
> 15 testReadYourWritesSyncLeaderReplica
> {noformat}
> In all cases, the stack trace of the failure was effectively this:
> {noformat}
> java.util.concurrent.ExecutionException: java.lang.AssertionError
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at 
> org.apache.kudu.client.TestKuduClient.readYourWrites(TestKuduClient.java:1113)
> ...
> Caused by: java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1098)
> at 
> org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1055)
> ...
> {noformat}
> The offending lines:
> {code}
>   AsyncKuduScanner scanner = asyncClient.newScannerBuilder(table)
>   .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES)
>   .replicaSelection(replicaSelection)
>   .build();
>   KuduScanner syncScanner = new KuduScanner(scanner);
>   long preTs = asyncClient.getLastPropagatedTimestamp();
>   assertNotEquals(AsyncKuduClient.NO_TIMESTAMP,
>   asyncClient.getLastPropagatedTimestamp());
>   long row_count = countRowsInScan(syncScanner);
>   long expected_count = 100L * (i + 1);
>   assertTrue(expected_count <= row_count);
>   // After the scan, verify that the chosen snapshot timestamp is
>   // returned from the server and it is larger than the previous
>   // propagated timestamp.
>   assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, 
> scanner.getSnapshotTimestamp());
> -->   assertTrue(preTs < scanner.getSnapshotTimestamp());
> {code}
> It's possible that this is just test flakiness, but I'm setting a higher 
> priority so we can understand whether that's the case, or whether there's 
> something wrong with read-your-writes scans.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2668) TestKuduClient.readYourWrites tests are flaky

2019-01-23 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao reassigned KUDU-2668:
-

Assignee: Hao Hao

> TestKuduClient.readYourWrites tests are flaky
> -
>
> Key: KUDU-2668
> URL: https://issues.apache.org/jira/browse/KUDU-2668
> Project: Kudu
>  Issue Type: Bug
>  Components: java, test
>Affects Versions: 1.9.0
>Reporter: Adar Dembo
>Assignee: Hao Hao
>Priority: Critical
>
> I looped TestKuduClient 1000 times in dist-test while working on another 
> problem, and saw the following failures:
> {noformat}
> 1 testReadYourWritesBatchLeaderReplica
> 14 testReadYourWritesSyncClosestReplica
> 15 testReadYourWritesSyncLeaderReplica
> {noformat}
> In all cases, the stack trace of the failure was effectively this:
> {noformat}
> java.util.concurrent.ExecutionException: java.lang.AssertionError
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at 
> org.apache.kudu.client.TestKuduClient.readYourWrites(TestKuduClient.java:1113)
> ...
> Caused by: java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1098)
> at 
> org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1055)
> ...
> {noformat}
> The offending lines:
> {code}
>   AsyncKuduScanner scanner = asyncClient.newScannerBuilder(table)
>   .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES)
>   .replicaSelection(replicaSelection)
>   .build();
>   KuduScanner syncScanner = new KuduScanner(scanner);
>   long preTs = asyncClient.getLastPropagatedTimestamp();
>   assertNotEquals(AsyncKuduClient.NO_TIMESTAMP,
>   asyncClient.getLastPropagatedTimestamp());
>   long row_count = countRowsInScan(syncScanner);
>   long expected_count = 100L * (i + 1);
>   assertTrue(expected_count <= row_count);
>   // After the scan, verify that the chosen snapshot timestamp is
>   // returned from the server and it is larger than the previous
>   // propagated timestamp.
>   assertNotEquals(AsyncKuduClient.NO_TIMESTAMP, 
> scanner.getSnapshotTimestamp());
> -->   assertTrue(preTs < scanner.getSnapshotTimestamp());
> {code}
> It's possible that this is just test flakiness, but I'm setting a higher 
> priority so we can understand whether that's the case, or whether there's 
> something wrong with read-your-writes scans.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2667) MultiThreadedTabletTest/DeleteAndReinsert is flaky in ASAN

2019-01-23 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2667:
-

 Summary: MultiThreadedTabletTest/DeleteAndReinsert is flaky in ASAN
 Key: KUDU-2667
 URL: https://issues.apache.org/jira/browse/KUDU-2667
 Project: Kudu
  Issue Type: Test
Reporter: Hao Hao
 Attachments: mt-tablet-test.3.txt

I recently came across a failure in MultiThreadedTabletTest/DeleteAndReinsert in an 
ASAN build. The error message is:

{noformat}
Error Message
mt-tablet-test.cc:378] Check failed: _s.ok() Bad status: Already present: int32 
key=2, int32 key_idx=2, int32 val=NULL: key already present

Stacktrace
mt-tablet-test.cc:378] Check failed: _s.ok() Bad status: Already present: int32 
key=2, int32 key_idx=2, int32 val=NULL: key already present
@ 0x7f66b32a5c37 gsignal at ??:0
@ 0x7f66b32a9028 abort at ??:0
@   0x62c995 
kudu::tablet::MultiThreadedTabletTest<>::DeleteAndReinsertCycleThread() at 
/home/jenkins-slave/workspace/kudu-master/0/src/kudu/tablet/mt-tablet-test.cc:378
@   0x617e63 boost::_bi::bind_t<>::operator()() at 
/home/jenkins-slave/workspace/kudu-master/0/thirdparty/installed/uninstrumented/include/boost/bind/bind.hpp:1223
@ 0x7f66b92d8dac boost::function0<>::operator()() at ??:0
@ 0x7f66b7792afb kudu::Thread::SuperviseThread() at ??:0
@ 0x7f66bec0e184 start_thread at ??:0
@ 0x7f66b336cffd clone at ??:0
{noformat}

The full log is attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KUDU-2620) Flaky TestMiniSentryLifecycle

2018-11-11 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao resolved KUDU-2620.
---
Resolution: Fixed

> Flaky TestMiniSentryLifecycle
> -
>
> Key: KUDU-2620
> URL: https://issues.apache.org/jira/browse/KUDU-2620
> Project: Kudu
>  Issue Type: Bug
>Reporter: Hao Hao
>Assignee: Hao Hao
>Priority: Major
> Fix For: 1.9.0
>
>
>  I saw TestMiniSentryLifecycle fail with the following error:
> {noformat}
> /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:64:
>  Failure
> Failed
> Bad status: Runtime error: /usr/sbin/lsof: process exited with non-zero 
> status 1
> Aborted at 1541488030 (unix time) try "date -d @1541488030" if you are using 
> GNU date ***
> PC: @ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr()
> SIGSEGV (@0x8) received by PID 19125 (TID 0x7f8282d87980) from PID 8; stack 
> trace: ***
> @ 0x3d0ca0f710 (unknown) at ??:0
> @ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr() at ??:0
> @ 0x7f8288d7e837 std::shared_ptr<>::shared_ptr() at ??:0
> @ 0x7f8288d7edb5 sentry::SentryPolicyServiceClient::getInputProtocol() at ??:0
> @ 0x7f8288d7ba08 kudu::sentry::SentryClient::Stop() at ??:0
> @ 0x4414c9 kudu::sentry::SentryTestBase::TearDown() at 
> /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:70
> 2018-11-05 23:07:10
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.5-b02 mixed mode):
> "DestroyJavaVM" #36 prio=5 os_prio=0 tid=0x7f05a1864800 nid=0x4af6 
> waiting on condition [0x]
> java.lang.Thread.State: RUNNABLE
> "BoneCP-pool-watch-thread" #35 daemon prio=5 os_prio=0 tid=0x7f058c431800 
> nid=0x4c00 waiting on condition [0x7f057e06d000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> parking to wait for <0xfd5b4478> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
> at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> "BoneCP-keep-alive-scheduler" #34 daemon prio=5 os_prio=0 
> tid=0x7f058cf04000 nid=0x4bff waiting on condition [0x7f057e16e000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> parking to wait for <0xfd5b3c40> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> "com.google.common.base.internal.Finalizer" #33 daemon prio=5 os_prio=0 
> tid=0x7f058cf03000 nid=0x4bfe in Object.wait() [0x7f05881b9000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> waiting on <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142)
> locked <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158)
> at com.google.common.base.internal.Finalizer.run(Finalizer.java:127)
> "BoneCP-pool-watch-thread" #32 daemon prio=5 os_prio=0 tid=0x7f058cf0e800 
> nid=0x4bfd waiting on condition [0x7f05882ba000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> parking to wait for <0xfceca418> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
> at 

[jira] [Updated] (KUDU-2620) Flaky TestMiniSentryLifecycle

2018-11-11 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2620:
--
  Code Review: https://gerrit.cloudera.org/#/c/11898/
Fix Version/s: 1.9.0

> Flaky TestMiniSentryLifecycle
> -
>
> Key: KUDU-2620
> URL: https://issues.apache.org/jira/browse/KUDU-2620
> Project: Kudu
>  Issue Type: Bug
>Reporter: Hao Hao
>Assignee: Hao Hao
>Priority: Major
> Fix For: 1.9.0
>
>
>  I saw TestMiniSentryLifecycle fail with the following error:
> {noformat}
> /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:64:
>  Failure
> Failed
> Bad status: Runtime error: /usr/sbin/lsof: process exited with non-zero 
> status 1
> Aborted at 1541488030 (unix time) try "date -d @1541488030" if you are using 
> GNU date ***
> PC: @ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr()
> SIGSEGV (@0x8) received by PID 19125 (TID 0x7f8282d87980) from PID 8; stack 
> trace: ***
> @ 0x3d0ca0f710 (unknown) at ??:0
> @ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr() at ??:0
> @ 0x7f8288d7e837 std::shared_ptr<>::shared_ptr() at ??:0
> @ 0x7f8288d7edb5 sentry::SentryPolicyServiceClient::getInputProtocol() at ??:0
> @ 0x7f8288d7ba08 kudu::sentry::SentryClient::Stop() at ??:0
> @ 0x4414c9 kudu::sentry::SentryTestBase::TearDown() at 
> /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:70
> 2018-11-05 23:07:10
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.5-b02 mixed mode):
> "DestroyJavaVM" #36 prio=5 os_prio=0 tid=0x7f05a1864800 nid=0x4af6 
> waiting on condition [0x]
> java.lang.Thread.State: RUNNABLE
> "BoneCP-pool-watch-thread" #35 daemon prio=5 os_prio=0 tid=0x7f058c431800 
> nid=0x4c00 waiting on condition [0x7f057e06d000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> parking to wait for <0xfd5b4478> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
> at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> "BoneCP-keep-alive-scheduler" #34 daemon prio=5 os_prio=0 
> tid=0x7f058cf04000 nid=0x4bff waiting on condition [0x7f057e16e000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> parking to wait for <0xfd5b3c40> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> "com.google.common.base.internal.Finalizer" #33 daemon prio=5 os_prio=0 
> tid=0x7f058cf03000 nid=0x4bfe in Object.wait() [0x7f05881b9000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> waiting on <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142)
> locked <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158)
> at com.google.common.base.internal.Finalizer.run(Finalizer.java:127)
> "BoneCP-pool-watch-thread" #32 daemon prio=5 os_prio=0 tid=0x7f058cf0e800 
> nid=0x4bfd waiting on condition [0x7f05882ba000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> parking to wait for <0xfceca418> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 

[jira] [Assigned] (KUDU-2620) Flaky TestMiniSentryLifecycle

2018-11-06 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao reassigned KUDU-2620:
-

Assignee: Hao Hao

> Flaky TestMiniSentryLifecycle
> -
>
> Key: KUDU-2620
> URL: https://issues.apache.org/jira/browse/KUDU-2620
> Project: Kudu
>  Issue Type: Bug
>Reporter: Hao Hao
>Assignee: Hao Hao
>Priority: Major
>
>  I saw TestMiniSentryLifecycle fail with the following error:
> {noformat} 
> /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:64:
>  Failure
>  Failed
>  Bad status: Runtime error: /usr/sbin/lsof: process exited with non-zero 
> status 1
>  * 
>  ** 
>  *** Aborted at 1541488030 (unix time) try "date -d @1541488030" if you are 
> using GNU date ***
>  PC: @ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr()
>  *** SIGSEGV (@0x8) received by PID 19125 (TID 0x7f8282d87980) from PID 8; 
> stack trace: ***
>  @ 0x3d0ca0f710 (unknown) at ??:0
>  @ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr() at ??:0
>  @ 0x7f8288d7e837 std::shared_ptr<>::shared_ptr() at ??:0
>  @ 0x7f8288d7edb5 sentry::SentryPolicyServiceClient::getInputProtocol() at 
> ??:0
>  @ 0x7f8288d7ba08 kudu::sentry::SentryClient::Stop() at ??:0
>  @ 0x4414c9 kudu::sentry::SentryTestBase::TearDown() at 
> /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:70
>  2018-11-05 23:07:10
>  Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.5-b02 mixed mode):
> "DestroyJavaVM" #36 prio=5 os_prio=0 tid=0x7f05a1864800 nid=0x4af6 
> waiting on condition [0x]
>  java.lang.Thread.State: RUNNABLE
> "BoneCP-pool-watch-thread" #35 daemon prio=5 os_prio=0 tid=0x7f058c431800 
> nid=0x4c00 waiting on condition [0x7f057e06d000]
>  java.lang.Thread.State: WAITING (parking)
>  at sun.misc.Unsafe.park(Native Method)
>  - parking to wait for <0xfd5b4478> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>  at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
>  at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> "BoneCP-keep-alive-scheduler" #34 daemon prio=5 os_prio=0 
> tid=0x7f058cf04000 nid=0x4bff waiting on condition [0x7f057e16e000]
>  java.lang.Thread.State: TIMED_WAITING (parking)
>  at sun.misc.Unsafe.park(Native Method)
>  - parking to wait for <0xfd5b3c40> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>  at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
>  at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> "com.google.common.base.internal.Finalizer" #33 daemon prio=5 os_prio=0 
> tid=0x7f058cf03000 nid=0x4bfe in Object.wait() [0x7f05881b9000]
>  java.lang.Thread.State: WAITING (on object monitor)
>  at java.lang.Object.wait(Native Method)
>  - waiting on <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock)
>  at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142)
>  - locked <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock)
>  at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158)
>  at com.google.common.base.internal.Finalizer.run(Finalizer.java:127)
> "BoneCP-pool-watch-thread" #32 daemon prio=5 os_prio=0 tid=0x7f058cf0e800 
> nid=0x4bfd waiting on condition [0x7f05882ba000]
>  java.lang.Thread.State: WAITING (parking)
>  at sun.misc.Unsafe.park(Native Method)
>  - parking to wait for <0xfceca418> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>  at 

[jira] [Updated] (KUDU-2620) Flaky TestMiniSentryLifecycle

2018-11-06 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2620:
--
Description: 
 I saw TestMiniSentryLifecycle fail with the following error:
{noformat}
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:64:
 Failure
Failed
Bad status: Runtime error: /usr/sbin/lsof: process exited with non-zero status 1

Aborted at 1541488030 (unix time) try "date -d @1541488030" if you are using 
GNU date ***
PC: @ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr()
SIGSEGV (@0x8) received by PID 19125 (TID 0x7f8282d87980) from PID 8; stack 
trace: ***
@ 0x3d0ca0f710 (unknown) at ??:0
@ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr() at ??:0
@ 0x7f8288d7e837 std::shared_ptr<>::shared_ptr() at ??:0
@ 0x7f8288d7edb5 sentry::SentryPolicyServiceClient::getInputProtocol() at ??:0
@ 0x7f8288d7ba08 kudu::sentry::SentryClient::Stop() at ??:0
@ 0x4414c9 kudu::sentry::SentryTestBase::TearDown() at 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:70
2018-11-05 23:07:10
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.5-b02 mixed mode):
"DestroyJavaVM" #36 prio=5 os_prio=0 tid=0x7f05a1864800 nid=0x4af6 waiting 
on condition [0x]
java.lang.Thread.State: RUNNABLE

"BoneCP-pool-watch-thread" #35 daemon prio=5 os_prio=0 tid=0x7f058c431800 
nid=0x4c00 waiting on condition [0x7f057e06d000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)

parking to wait for <0xfd5b4478> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"BoneCP-keep-alive-scheduler" #34 daemon prio=5 os_prio=0 
tid=0x7f058cf04000 nid=0x4bff waiting on condition [0x7f057e16e000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)

parking to wait for <0xfd5b3c40> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"com.google.common.base.internal.Finalizer" #33 daemon prio=5 os_prio=0 
tid=0x7f058cf03000 nid=0x4bfe in Object.wait() [0x7f05881b9000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)

waiting on <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142)
locked <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158)
at com.google.common.base.internal.Finalizer.run(Finalizer.java:127)
"BoneCP-pool-watch-thread" #32 daemon prio=5 os_prio=0 tid=0x7f058cf0e800 
nid=0x4bfd waiting on condition [0x7f05882ba000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)

parking to wait for <0xfceca418> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"BoneCP-keep-alive-scheduler" #31 daemon prio=5 os_prio=0 
tid=0x7f058cf0d800 nid=0x4bfc waiting on condition [0x7f0588f0a000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)

parking to wait for <0xfcec9be0> 

[jira] [Created] (KUDU-2620) Flaky TestMiniSentryLifecycle

2018-11-06 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2620:
-

 Summary: Flaky TestMiniSentryLifecycle
 Key: KUDU-2620
 URL: https://issues.apache.org/jira/browse/KUDU-2620
 Project: Kudu
  Issue Type: Bug
Reporter: Hao Hao


 I saw TestMiniSentryLifecycle fail with the following error:

{noformat} 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:64:
 Failure
 Failed
 Bad status: Runtime error: /usr/sbin/lsof: process exited with non-zero status 
1
 * 
 ** 
 *** Aborted at 1541488030 (unix time) try "date -d @1541488030" if you are 
using GNU date ***
 PC: @ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr()
 *** SIGSEGV (@0x8) received by PID 19125 (TID 0x7f8282d87980) from PID 8; 
stack trace: ***
 @ 0x3d0ca0f710 (unknown) at ??:0
 @ 0x7f8288d7e7ec std::__shared_ptr<>::__shared_ptr() at ??:0
 @ 0x7f8288d7e837 std::shared_ptr<>::shared_ptr() at ??:0
 @ 0x7f8288d7edb5 sentry::SentryPolicyServiceClient::getInputProtocol() at ??:0
 @ 0x7f8288d7ba08 kudu::sentry::SentryClient::Stop() at ??:0
 @ 0x4414c9 kudu::sentry::SentryTestBase::TearDown() at 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/sentry/sentry-test-base.h:70
 2018-11-05 23:07:10
 Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.5-b02 mixed mode):

"DestroyJavaVM" #36 prio=5 os_prio=0 tid=0x7f05a1864800 nid=0x4af6 waiting 
on condition [0x]
 java.lang.Thread.State: RUNNABLE

"BoneCP-pool-watch-thread" #35 daemon prio=5 os_prio=0 tid=0x7f058c431800 
nid=0x4c00 waiting on condition [0x7f057e06d000]
 java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for <0xfd5b4478> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
 at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
 at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
 at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)

"BoneCP-keep-alive-scheduler" #34 daemon prio=5 os_prio=0 
tid=0x7f058cf04000 nid=0x4bff waiting on condition [0x7f057e16e000]
 java.lang.Thread.State: TIMED_WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for <0xfd5b3c40> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
 at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
 at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)

"com.google.common.base.internal.Finalizer" #33 daemon prio=5 os_prio=0 
tid=0x7f058cf03000 nid=0x4bfe in Object.wait() [0x7f05881b9000]
 java.lang.Thread.State: WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 - waiting on <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock)
 at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142)
 - locked <0xfd5b37d0> (a java.lang.ref.ReferenceQueue$Lock)
 at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158)
 at com.google.common.base.internal.Finalizer.run(Finalizer.java:127)

"BoneCP-pool-watch-thread" #32 daemon prio=5 os_prio=0 tid=0x7f058cf0e800 
nid=0x4bfd waiting on condition [0x7f05882ba000]
 java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for <0xfceca418> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
 at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
 at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
 at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)

"BoneCP-keep-alive-scheduler" #31 daemon prio=5 os_prio=0 

[jira] [Created] (KUDU-2610) TestSimultaneousLeaderTransferAndAbruptStepdown is Flaky

2018-10-18 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2610:
-

 Summary: TestSimultaneousLeaderTransferAndAbruptStepdown is Flaky
 Key: KUDU-2610
 URL: https://issues.apache.org/jira/browse/KUDU-2610
 Project: Kudu
  Issue Type: Bug
Affects Versions: 1.8.0
Reporter: Hao Hao
 Attachments: kudu-admin-test.5.txt

AdminCliTest.TestSimultaneousLeaderTransferAndAbruptStepdown is sometimes flaky in 
ASAN builds, failing with the following error:

{noformat}

b01d528fd3c74eb5b42b8d4888591ed2 (127.18.62.194:38185) has failed: Timed out: 
Write RPC to 127.18.62.194:38185 timed out after 60.000s (SENT)
W1017 23:33:47.772014 20038 batcher.cc:348] Timed out: Failed to write batch of 
1 ops to tablet 9b4b2dea960941bcb38197b51c55baf4 after 1 attempt(s): Failed to 
write to server: b01d528fd3c74eb5b42b8d4888591ed2 (127.18.62.194:38185): Write 
RPC to 127.18.62.194:38185 timed out after 60.000s (SENT)
F1017 23:33:47.772820 20042 test_workload.cc:202] Timed out: Failed to write 
batch of 1 ops to tablet 9b4b2dea960941bcb38197b51c55baf4 after 1 attempt(s): 
Failed to write to server: b01d528fd3c74eb5b42b8d4888591ed2 
(127.18.62.194:38185): Write RPC to 127.18.62.194:38185 timed out after 60.000s 
(SENT)
*** Check failure stack trace: ***
*** Aborted at 1539844427 (unix time) try "date -d @1539844427" if you are 
using GNU date ***
PC: @ 0x3c74632625 __GI_raise
*** SIGABRT (@0x45248fb) received by PID 18683 (TID 0x7f13ebe5b700) from 
PID 18683; stack trace: ***
 @ 0x3c74a0f710 (unknown) at ??:0
 @ 0x3c74632625 __GI_raise at ??:0
 @ 0x3c74633e05 __GI_abort at ??:0
 @ 0x7f13fd43da29 (unknown) at ??:0
 @ 0x7f13fd43f31d (unknown) at ??:0
 @ 0x7f13fd4411dd (unknown) at ??:0
 @ 0x7f13fd43ee59 (unknown) at ??:0
 @ 0x7f13fd441c7f (unknown) at ??:0
 @ 0x7f1412f7ba6e (unknown) at ??:0
 @ 0x3c796b6470 (unknown) at ??:0
 @ 0x3c74a079d1 start_thread at ??:0
 @ 0x3c746e88fd clone at ??:0

{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2608) Memory leak in RaftConsensusStressITest

2018-10-17 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2608:
-

 Summary: Memory leak in RaftConsensusStressITest
 Key: KUDU-2608
 URL: https://issues.apache.org/jira/browse/KUDU-2608
 Project: Kudu
  Issue Type: Bug
Reporter: Hao Hao
 Attachments: raft_consensus_stress-itest.txt

RaftConsensusStressITest.RemoveReplaceInCycle sometimes fails with complaints about 
memory leaks. The full log is attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2607) master_cert_authority-itest detected memory leaks

2018-10-15 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2607:
-

 Summary: master_cert_authority-itest detected memory leaks
 Key: KUDU-2607
 URL: https://issues.apache.org/jira/browse/KUDU-2607
 Project: Kudu
  Issue Type: Improvement
Reporter: Hao Hao
 Attachments: master_cert_authority-itest.txt

I saw MultiMasterConnectToClusterTest.ConnectToCluster complain about memory leaks 
in a recent job. The log is attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2491) kudu-admin-test times out

2018-10-14 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2491:
--
Attachment: jenkins_output (1).txt

> kudu-admin-test times out
> -
>
> Key: KUDU-2491
> URL: https://issues.apache.org/jira/browse/KUDU-2491
> Project: Kudu
>  Issue Type: Test
>Reporter: Alexey Serbin
>Assignee: Alexey Serbin
>Priority: Major
> Fix For: 1.8.0
>
> Attachments: jenkins_output (1).txt, kudu-admin-test.txt, 
> kudu-admin-test.xml
>
>
> In one of automated runs, the kudu-admin-test reportedly timed out while 
> running the {{RebalanceParamTest.Rebalance/3}} scenario at revision 
> {{1ae050e4d57bc333de28bcbc62e072e8bafd04b3}}.
> The logs are attached. One small note: ctest's output claims the test ran for more 
> than 15 minutes, while the log shows the test running for just over 5 minutes.
> The test was run via {{ctest}}:
> {noformat}
> ctest -j4 -R 
> '^kudu\-tool\-test|^linked_list\-test|^master\-stress\-test|^raft_consensus\-itest|^mt\-rpc\-test|^alter_table\-randomized\-test|^delete_tablet\-itest|^minidump_generation\-itest|^kudu\-ts\-cli\-test|^security\-itest|^client\-test|^kudu\-admin\-test|^master_failover\-itest'
>   Start  12: client-test.0
>   Start  13: client-test.1
>  1/32 Test  #12: client-test.0    Passed   17.14 sec
>   Start  14: client-test.2
>  2/32 Test  #13: client-test.1    Passed   39.78 sec
>   Start  15: client-test.3
>  3/32 Test  #14: client-test.2    Passed   27.46 sec
>   Start  16: client-test.4
>  4/32 Test  #15: client-test.3    Passed   21.74 sec
>   Start  17: client-test.5
>  5/32 Test  #16: client-test.4    Passed   43.09 sec
>   Start  18: client-test.6
>  6/32 Test  #18: client-test.6    Passed   13.93 sec
>   Start  19: client-test.7
>  7/32 Test  #19: client-test.7    Passed   15.40 sec
>   Start 100: delete_tablet-itest
>  8/32 Test  #17: client-test.5    Passed   58.53 sec
>   Start 142: security-itest
>   Start 165: minidump_generation-itest
>  9/32 Test #100: delete_tablet-itest ..   Passed4.08 sec
>   Start 246: kudu-ts-cli-test
> 10/32 Test #165: minidump_generation-itest    Passed5.88 sec
> 11/32 Test #246: kudu-ts-cli-test .   Passed8.45 sec
>   Start 118: master_failover-itest.0
> 12/32 Test #142: security-itest ...   Passed   38.22 sec
> 13/32 Test #118: master_failover-itest.0 ..   Passed  112.92 sec
>   Start  79: alter_table-randomized-test.0
> 14/32 Test  #79: alter_table-randomized-test.0    Passed   51.54 sec
>   Start  80: alter_table-randomized-test.1
> 15/32 Test  #80: alter_table-randomized-test.1    Passed   66.13 sec
>   Start 115: linked_list-test
> 16/32 Test #115: linked_list-test .   Passed  135.89 sec
>   Start 119: master_failover-itest.1
> 17/32 Test #119: master_failover-itest.1 ..   Passed  155.30 sec
>   Start 120: master_failover-itest.2
> 18/32 Test #120: master_failover-itest.2 ..   Passed   53.94 sec
>   Start 121: master_failover-itest.3
> 19/32 Test #121: master_failover-itest.3 ..   Passed   88.64 sec
>   Start 125: master-stress-test
> 20/32 Test #125: master-stress-test ...   Passed  105.25 sec
>   Start 133: raft_consensus-itest.0
> 21/32 Test #133: raft_consensus-itest.0 ...   Passed   31.82 sec
>   Start 134: raft_consensus-itest.1
> 22/32 Test #134: raft_consensus-itest.1 ...   Passed  134.83 sec
>   Start 135: raft_consensus-itest.2
> 23/32 Test #135: raft_consensus-itest.2 ...   Passed  149.73 sec
>   Start 136: raft_consensus-itest.3
> 24/32 Test #136: raft_consensus-itest.3 ...   Passed  122.22 sec
>   Start 137: raft_consensus-itest.4
> 25/32 Test #137: raft_consensus-itest.4 ...   Passed  143.62 sec
>   Start 138: raft_consensus-itest.5
> 26/32 Test #138: raft_consensus-itest.5 ...   Passed   52.04 sec
>   Start 174: mt-rpc-test
> 27/32 Test #174: mt-rpc-test ..   Passed1.69 sec
>   Start 241: kudu-admin-test
> 28/32 Test #241: kudu-admin-test ..***Timeout 930.12 sec
>   Start 242: kudu-tool-test.0
> 29/32 Test #242: kudu-tool-test.0 .   Passed   47.92 sec
>   Start 243: kudu-tool-test.1
> 30/32 Test #243: kudu-tool-test.1 .   Passed   39.39 sec
>   Start 244: kudu-tool-test.2
> 31/32 Test #244: kudu-tool-test.2 .   Passed   90.17 sec
>   Start 245: kudu-tool-test.3
> 32/32 Test #245: kudu-tool-test.3 

[jira] [Commented] (KUDU-2491) kudu-admin-test times out

2018-10-14 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649636#comment-16649636
 ] 

Hao Hao commented on KUDU-2491:
---

I saw the rebalancer_tool-test time out again. The log is attached.

> kudu-admin-test times out
> -
>
> Key: KUDU-2491
> URL: https://issues.apache.org/jira/browse/KUDU-2491
> Project: Kudu
>  Issue Type: Test
>Reporter: Alexey Serbin
>Assignee: Alexey Serbin
>Priority: Major
> Fix For: 1.8.0
>
> Attachments: kudu-admin-test.txt, kudu-admin-test.xml
>
>
> In one of automated runs, the kudu-admin-test reportedly timed out while 
> running the {{RebalanceParamTest.Rebalance/3}} scenario at revision 
> {{1ae050e4d57bc333de28bcbc62e072e8bafd04b3}}.
> The logs are attached. One small note: ctest's output claims the test ran for more 
> than 15 minutes, while the log shows the test running for just over 5 minutes.
> The test was run via {{ctest}}:
> {noformat}
> ctest -j4 -R 
> '^kudu\-tool\-test|^linked_list\-test|^master\-stress\-test|^raft_consensus\-itest|^mt\-rpc\-test|^alter_table\-randomized\-test|^delete_tablet\-itest|^minidump_generation\-itest|^kudu\-ts\-cli\-test|^security\-itest|^client\-test|^kudu\-admin\-test|^master_failover\-itest'
>   Start  12: client-test.0
>   Start  13: client-test.1
>  1/32 Test  #12: client-test.0    Passed   17.14 sec
>   Start  14: client-test.2
>  2/32 Test  #13: client-test.1    Passed   39.78 sec
>   Start  15: client-test.3
>  3/32 Test  #14: client-test.2    Passed   27.46 sec
>   Start  16: client-test.4
>  4/32 Test  #15: client-test.3    Passed   21.74 sec
>   Start  17: client-test.5
>  5/32 Test  #16: client-test.4    Passed   43.09 sec
>   Start  18: client-test.6
>  6/32 Test  #18: client-test.6    Passed   13.93 sec
>   Start  19: client-test.7
>  7/32 Test  #19: client-test.7    Passed   15.40 sec
>   Start 100: delete_tablet-itest
>  8/32 Test  #17: client-test.5    Passed   58.53 sec
>   Start 142: security-itest
>   Start 165: minidump_generation-itest
>  9/32 Test #100: delete_tablet-itest ..   Passed4.08 sec
>   Start 246: kudu-ts-cli-test
> 10/32 Test #165: minidump_generation-itest    Passed5.88 sec
> 11/32 Test #246: kudu-ts-cli-test .   Passed8.45 sec
>   Start 118: master_failover-itest.0
> 12/32 Test #142: security-itest ...   Passed   38.22 sec
> 13/32 Test #118: master_failover-itest.0 ..   Passed  112.92 sec
>   Start  79: alter_table-randomized-test.0
> 14/32 Test  #79: alter_table-randomized-test.0    Passed   51.54 sec
>   Start  80: alter_table-randomized-test.1
> 15/32 Test  #80: alter_table-randomized-test.1    Passed   66.13 sec
>   Start 115: linked_list-test
> 16/32 Test #115: linked_list-test .   Passed  135.89 sec
>   Start 119: master_failover-itest.1
> 17/32 Test #119: master_failover-itest.1 ..   Passed  155.30 sec
>   Start 120: master_failover-itest.2
> 18/32 Test #120: master_failover-itest.2 ..   Passed   53.94 sec
>   Start 121: master_failover-itest.3
> 19/32 Test #121: master_failover-itest.3 ..   Passed   88.64 sec
>   Start 125: master-stress-test
> 20/32 Test #125: master-stress-test ...   Passed  105.25 sec
>   Start 133: raft_consensus-itest.0
> 21/32 Test #133: raft_consensus-itest.0 ...   Passed   31.82 sec
>   Start 134: raft_consensus-itest.1
> 22/32 Test #134: raft_consensus-itest.1 ...   Passed  134.83 sec
>   Start 135: raft_consensus-itest.2
> 23/32 Test #135: raft_consensus-itest.2 ...   Passed  149.73 sec
>   Start 136: raft_consensus-itest.3
> 24/32 Test #136: raft_consensus-itest.3 ...   Passed  122.22 sec
>   Start 137: raft_consensus-itest.4
> 25/32 Test #137: raft_consensus-itest.4 ...   Passed  143.62 sec
>   Start 138: raft_consensus-itest.5
> 26/32 Test #138: raft_consensus-itest.5 ...   Passed   52.04 sec
>   Start 174: mt-rpc-test
> 27/32 Test #174: mt-rpc-test ..   Passed1.69 sec
>   Start 241: kudu-admin-test
> 28/32 Test #241: kudu-admin-test ..***Timeout 930.12 sec
>   Start 242: kudu-tool-test.0
> 29/32 Test #242: kudu-tool-test.0 .   Passed   47.92 sec
>   Start 243: kudu-tool-test.1
> 30/32 Test #243: kudu-tool-test.1 .   Passed   39.39 sec
>   Start 244: kudu-tool-test.2
> 31/32 Test #244: kudu-tool-test.2 .   Passed   90.17 sec
>   Start 245: kudu-tool-test.3

[jira] [Updated] (KUDU-2602) testRandomBackupAndRestore is flaky

2018-10-11 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2602:
--
Description: 
Saw the following failure with testRandomBackupAndRestore:
{noformat}
java.lang.AssertionError: 
expected:<21> but was:<20>

at org.junit.Assert.fail(Assert.java:88)

at org.junit.Assert.failNotEquals(Assert.java:834)

at org.junit.Assert.assertEquals(Assert.java:645)

at org.junit.Assert.assertEquals(Assert.java:631)

at 
org.apache.kudu.backup.TestKuduBackup.testRandomBackupAndRestore(TestKuduBackup.scala:99)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:483)

at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)

at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)

at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)

at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)

at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)

at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)

at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)

at org.apache.kudu.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:72)

at org.junit.rules.RunRules.evaluate(RunRules.java:20)

at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)

at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)

at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)

at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)

at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)

at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)

at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)

at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)

at org.junit.runners.ParentRunner.run(ParentRunner.java:363)

at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)

at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)

at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)

at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)

at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:483)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)

at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)

at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)

at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)

at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:483)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)

at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155)

at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137)

at 
org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)

at 
org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63)

at 
org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:46)

at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at 

[jira] [Updated] (KUDU-2602) testRandomBackupAndRestore is flaky

2018-10-11 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2602:
--
Issue Type: Bug  (was: Improvement)

> testRandomBackupAndRestore is flaky
> ---
>
> Key: KUDU-2602
> URL: https://issues.apache.org/jira/browse/KUDU-2602
> Project: Kudu
>  Issue Type: Bug
>Reporter: Hao Hao
>Priority: Major
> Attachments: TEST-org.apache.kudu.backup.TestKuduBackup.xml
>
>
> Saw the following failure with testRandomBackupAndRestore:
> {noformat}
> java.lang.AssertionError: 
> expected:<21> but was:<20>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:834)
> at org.junit.Assert.assertEquals(Assert.java:645)
> at org.junit.Assert.assertEquals(Assert.java:631)
> at 
> org.apache.kudu.backup.TestKuduBackup.testRandomBackupAndRestore(TestKuduBackup.scala:99)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> at org.apache.kudu.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:72)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
> at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
> at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
> at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
> at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
> at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
> at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155)
> at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137)
> at 

[jira] [Commented] (KUDU-2599) Timeout in DefaultSourceTest.testSocketReadTimeoutPropagation

2018-10-09 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16643941#comment-16643941
 ] 

Hao Hao commented on KUDU-2599:
---

This also happens to 
DefaultSourceTest.testTableScanWithProjectionAndPredicateDecimal128.

> Timeout in DefaultSourceTest.testSocketReadTimeoutPropagation
> -
>
> Key: KUDU-2599
> URL: https://issues.apache.org/jira/browse/KUDU-2599
> Project: Kudu
>  Issue Type: Bug
>  Components: spark
>Affects Versions: 1.8.0
>Reporter: Will Berkeley
>Priority: Major
> Attachments: TEST-org.apache.kudu.spark.kudu.DefaultSourceTest.xml
>
>
> Log attached



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KUDU-2584) Flaky testSimpleBackupAndRestore

2018-10-09 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16643937#comment-16643937
 ] 

Hao Hao edited comment on KUDU-2584 at 10/9/18 6:43 PM:


I saw testSimpleBackupAndRestore fail with a different error message:
{noformat}
04:26:59.844 [ERROR - Test worker] (RetryRule.java:76) 
testSimpleBackupAndRestore(org.apache.kudu.backup.TestKuduBackup): failed run 1

java.security.PrivilegedActionException: 
org.apache.kudu.client.NoLeaderFoundException: Master config 
(127.17.164.190:54477,127.17.164.189:42043,127.17.164.188:38685) has no leader.

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:360)

at org.apache.kudu.spark.kudu.KuduContext.(KuduContext.scala:122)

at org.apache.kudu.spark.kudu.KuduContext.(KuduContext.scala:65)

at 
org.apache.kudu.spark.kudu.KuduTestSuite$class.setUpBase(KuduTestSuite.scala:131)

at org.apache.kudu.backup.TestKuduBackup.setUpBase(TestKuduBackup.scala:47)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:483)

at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)

at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)

at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)

at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)

at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)

at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)

at org.apache.kudu.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:72)

at org.junit.rules.RunRules.evaluate(RunRules.java:20)

at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)

at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)

at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)

at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)

at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)

at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)

at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)

at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)

at org.junit.runners.ParentRunner.run(ParentRunner.java:363)

at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)

at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)

at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)

at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)

at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:483)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)

at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)

at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)

at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)

at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:483)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)

at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155)

at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137)

at 
org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)

at 

[jira] [Updated] (KUDU-2584) Flaky testSimpleBackupAndRestore

2018-10-09 Thread Hao Hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2584:
--
Attachment: TEST-org.apache.kudu.backup.TestKuduBackup.xml

> Flaky testSimpleBackupAndRestore
> 
>
> Key: KUDU-2584
> URL: https://issues.apache.org/jira/browse/KUDU-2584
> Project: Kudu
>  Issue Type: Bug
>  Components: backup
>Reporter: Mike Percy
>Assignee: Grant Henke
>Priority: Major
> Attachments: TEST-org.apache.kudu.backup.TestKuduBackup.xml
>
>
> testSimpleBackupAndRestore is flaky and tends to fail with the following 
> error:
> {code:java}
> 04:48:06.604 [ERROR - Test worker] (RetryRule.java:72) 
> testRandomBackupAndRestore(org.apache.kudu.backup.TestKuduBackup): failed run 
> 1 
> java.lang.AssertionError: expected:<111> but was:<110> 
> at org.junit.Assert.fail(Assert.java:88) 
> at org.junit.Assert.failNotEquals(Assert.java:834) 
> at org.junit.Assert.assertEquals(Assert.java:645) 
> at org.junit.Assert.assertEquals(Assert.java:631) 
> at 
> org.apache.kudu.backup.TestKuduBackup.testRandomBackupAndRestore(TestKuduBackup.scala:99)
>  
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:483) 
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
> at org.apache.kudu.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:68) 
> at org.junit.rules.RunRules.evaluate(RunRules.java:20) 
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) 
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) 
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) 
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) 
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) 
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) 
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363) 
> at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>  
> at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>  
> at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>  
> at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>  
> at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>  
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:483) 
> at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>  
> at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>  
> at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>  
> at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>  
> at com.sun.proxy.$Proxy2.processTestClass(Unknown Source) 
> at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
>  
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:483) 
> at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>  
> at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>  
> 

[jira] [Commented] (KUDU-2584) Flaky testSimpleBackupAndRestore

2018-10-09 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16643937#comment-16643937
 ] 

Hao Hao commented on KUDU-2584:
---

I saw testSimpleBackupAndRestore fail with a different error message:

{noformat}

04:26:59.844 [ERROR - Test worker] (RetryRule.java:76) 
testSimpleBackupAndRestore(org.apache.kudu.backup.TestKuduBackup): failed run 1

java.security.PrivilegedActionException: 
org.apache.kudu.client.NoLeaderFoundException: Master config 
(127.17.164.190:54477,127.17.164.189:42043,127.17.164.188:38685) has no leader.

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:360)

at org.apache.kudu.spark.kudu.KuduContext.<init>(KuduContext.scala:122)

at org.apache.kudu.spark.kudu.KuduContext.<init>(KuduContext.scala:65)

at 
org.apache.kudu.spark.kudu.KuduTestSuite$class.setUpBase(KuduTestSuite.scala:131)

at org.apache.kudu.backup.TestKuduBackup.setUpBase(TestKuduBackup.scala:47)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:483)

at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)

at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)

at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)

at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)

at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)

at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)

at org.apache.kudu.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:72)

at org.junit.rules.RunRules.evaluate(RunRules.java:20)

at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)

at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)

at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)

at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)

at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)

at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)

at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)

at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)

at org.junit.runners.ParentRunner.run(ParentRunner.java:363)

at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)

at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)

at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)

at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)

at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:483)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)

at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)

at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)

at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)

at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:483)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)

at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155)

at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137)

at 
org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)

at 

[jira] [Created] (KUDU-2602) testRandomBackupAndRestore is flaky

2018-10-09 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2602:
-

 Summary: testRandomBackupAndRestore is flaky
 Key: KUDU-2602
 URL: https://issues.apache.org/jira/browse/KUDU-2602
 Project: Kudu
  Issue Type: Improvement
Reporter: Hao Hao
 Attachments: TEST-org.apache.kudu.backup.TestKuduBackup.xml

Saw the following failure with testRandomBackupAndRestore:

{noformat}

java.lang.AssertionError: 
expected:<21> but was:<20>

at org.junit.Assert.fail(Assert.java:88)

at org.junit.Assert.failNotEquals(Assert.java:834)

at org.junit.Assert.assertEquals(Assert.java:645)

at org.junit.Assert.assertEquals(Assert.java:631)

at 
org.apache.kudu.backup.TestKuduBackup.testRandomBackupAndRestore(TestKuduBackup.scala:99)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:483)

at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)

at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)

at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)

at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)

at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)

at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)

at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)

at org.apache.kudu.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:72)

at org.junit.rules.RunRules.evaluate(RunRules.java:20)

at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)

at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)

at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)

at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)

at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)

at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)

at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)

at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)

at org.junit.runners.ParentRunner.run(ParentRunner.java:363)

at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)

at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)

at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)

at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)

at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:483)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)

at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)

at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)

at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)

at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:483)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)

at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)

at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155)

at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137)

at 
org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)

at 
org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63)

at 
org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:46)

at 

[jira] [Created] (KUDU-2488) tsan failure in security-itest

2018-06-28 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2488:
-

 Summary: tsan failure in security-itest
 Key: KUDU-2488
 URL: https://issues.apache.org/jira/browse/KUDU-2488
 Project: Kudu
  Issue Type: Test
Reporter: Hao Hao
 Attachments: security-itest.txt

In a recent run on master, I encountered a TSAN failure in security-itest. The log is 
attached.

{noformat}

==
WARNING: ThreadSanitizer: data race (pid=12812)
 Write of size 8 at 0x7b080528 by main thread:
 #0 operator delete(void*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_new_delete.cc:119
 (security-itest+0x4eb7a1)
 #1 std::__1::__libcpp_deallocate(void*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/new:236:3
 (libkrpc.so+0x8d8ba)
 #2 std::__1::allocator 
>::deallocate(std::__1::__tree_node*, 
unsigned long) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/memory:1796
 (libkrpc.so+0x8d8ba)
 #3 
std::__1::allocator_traits > 
>::deallocate(std::__1::allocator >&, std::__1::__tree_node*, unsigned 
long) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/memory:1555
 (libkrpc.so+0x8d8ba)
 #4 std::__1::__tree, 
std::__1::allocator 
>::destroy(std::__1::__tree_node*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:1834
 (libkrpc.so+0x8d8ba)
 #5 std::__1::__tree, 
std::__1::allocator >::~__tree() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:1821:3
 (libkrpc.so+0x8d856)
 #6 std::__1::set, 
std::__1::allocator >::~set() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/set:400:28
 (libkrpc.so+0x8bc79)
 #7 cxa_at_exit_wrapper(void*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:386
 (security-itest+0x44c2b3)
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2480) tsan failure of master-stress-test

2018-06-19 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2480:
-

 Summary: tsan failure of master-stress-test
 Key: KUDU-2480
 URL: https://issues.apache.org/jira/browse/KUDU-2480
 Project: Kudu
  Issue Type: Test
Reporter: Hao Hao
 Attachments: master-stress-test.txt

master-stress-test has recently been very flaky (~24%). One of the failure logs:

{noformat}
WARNING: ThreadSanitizer: data race (pid=26513)
 Read of size 8 at 0x7ffb5e5b88b8 by thread T65:
 #0 kudu::Status::operator=(kudu::Status const&) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/status.h:469:7 
(libmaster.so+0x10bd00)
 #1 kudu::Synchronizer::StatusCB(kudu::Status const&) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/async_util.h:44:8
 (libmaster.so+0x10bc40)
 #2 kudu::internal::RunnableAdapter::Run(kudu::Synchronizer*, kudu::Status const&) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/gutil/bind_internal.h:192:12
 (libmaster.so+0x10c708)
 #3 kudu::internal::InvokeHelper, void ()(kudu::Synchronizer*, kudu::Status 
const&)>::MakeItSo(kudu::internal::RunnableAdapter, kudu::Synchronizer*, 
kudu::Status const&) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/gutil/bind_internal.h:889:14
 (libmaster.so+0x10c5e8)
 #4 kudu::internal::Invoker<1, 
kudu::internal::BindState, void ()(kudu::Synchronizer*, 
kudu::Status const&), void 
()(kudu::internal::UnretainedWrapper)>, void 
()(kudu::Synchronizer*, kudu::Status 
const&)>::Run(kudu::internal::BindStateBase*, kudu::Status const&) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/gutil/bind_internal.h:1118:12
 (libmaster.so+0x10c51a)
 #5 kudu::Callback::Run(kudu::Status const&) 
const 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/gutil/callback.h:436:12
 (libmaster.so+0x10b831)
 #6 kudu::master::HmsNotificationLogListenerTask::RunLoop() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/master/hms_notification_log_listener.cc:136:10
 (libmaster.so+0x108e0a)
 #7 boost::_mfi::mf0::operator()(kudu::master::HmsNotificationLogListenerTask*)
 const 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/bind/mem_fn_template.hpp:49:29
 (libmaster.so+0x110ea9)
 #8 void 
boost::_bi::list1
 >::operator(), 
boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf0&, boost::_bi::list0&, int) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/bind/bind.hpp:259:9
 (libmaster.so+0x110dfa)
 #9 boost::_bi::bind_t, 
boost::_bi::list1
 > >::operator()() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/bind/bind.hpp:1222:16
 (libmaster.so+0x110d83)
 #10 
boost::detail::function::void_function_obj_invoker0, 
boost::_bi::list1
 > >, void>::invoke(boost::detail::function::function_buffer&) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/function/function_template.hpp:159:11
 (libmaster.so+0x110b79)
 #11 boost::function0::operator()() const 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/function/function_template.hpp:770:14
 (libkrpc.so+0xb64b1)
 #12 kudu::Thread::SuperviseThread(void*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/thread.cc:603:3
 (libkudu_util.so+0x1bd8b4)

Previous write of size 8 at 0x7ffb5e5b88b8 by thread T24 (mutexes: read M1468):
 #0 
boost::intrusive::circular_list_algorithms
 >::init(boost::intrusive::list_node* const&) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/intrusive/circular_list_algorithms.hpp:72:22
 (libkrpc.so+0x99c92)
 #1 
boost::intrusive::generic_hook
 >, boost::intrusive::dft_tag, (boost::intrusive::link_mode_type)1, 
(boost::intrusive::base_hook_type)1>::generic_hook() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/intrusive/detail/generic_hook.hpp:174:10
 (libkrpc.so+0xc4669)
 #2 boost::intrusive::list_base_hook::list_base_hook() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/boost/intrusive/list_hook.hpp:83:7
 (libkrpc.so+0xc2049)
 #3 kudu::rpc::ReactorTask::ReactorTask() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/reactor.cc:262:14
 (libkrpc.so+0xbd5fb)
 #4 
kudu::rpc::QueueTransferTask::QueueTransferTask(gscoped_ptr >, kudu::rpc::Connection*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/connection.cc:432:3
 (libkrpc.so+0x98ea4)
 #5 
kudu::rpc::Connection::QueueResponseForCall(gscoped_ptr >) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/connection.cc:474:33
 (libkrpc.so+0x94d79)
 #6 kudu::rpc::InboundCall::Respond(google::protobuf::MessageLite const&, bool) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/inbound_call.cc:165:10
 (libkrpc.so+0xa11b9)
 #7 

[jira] [Commented] (KUDU-2473) READ_YOUR_WRITES error on snapshot timestamp

2018-06-12 Thread Hao Hao (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510470#comment-16510470
 ] 

Hao Hao commented on KUDU-2473:
---

I think this is the same issue as KUDU-2415. There is some relevant discussion on Slack: 
https://getkudu.slack.com/archives/C0CPXJ3CH/p152461454321.
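
For reference, a minimal sketch of the kind of workaround implied above, using only the 
public Java client calls that already appear in the repro quoted below (the single INT32 
"key" column is an assumed schema; this is only an illustration, not the eventual fix): 
performing any operation through the client first gives READ_YOUR_WRITES a propagated 
timestamp to anchor on, so the scan no longer falls back to timestamp 0.

{code:java}
import org.apache.kudu.client.*;

public class ReadYourWritesWorkaroundSketch {
  public static void main(String[] args) throws KuduException {
    KuduClient client = new KuduClient.KuduClientBuilder("localhost").build();
    try {
      KuduTable table = client.openTable("read_mode_test");

      // Perform at least one operation first so the client observes a timestamp;
      // without this the READ_YOUR_WRITES scan has no lower bound and the server
      // treats the snapshot timestamp as 0.
      KuduSession session = client.newSession();
      Insert insert = table.newInsert();
      insert.getRow().addInt("key", 0);  // assumed single INT32 "key" column
      session.apply(insert);
      session.close();

      // Now the scan carries the timestamp propagated from the write above.
      KuduScanner scanner = client.newScannerBuilder(table)
          .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES)
          .build();
      while (scanner.hasMoreRows()) {
        scanner.nextRows();
      }
    } finally {
      client.shutdown();
    }
  }
}
{code}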

> READ_YOUR_WRITES error on snapshot timestamp
> 
>
> Key: KUDU-2473
> URL: https://issues.apache.org/jira/browse/KUDU-2473
> Project: Kudu
>  Issue Type: Bug
>  Components: impala
>Reporter: Thomas Tauber-Marshall
>Priority: Critical
>
> I'm trying to implement support for READ_YOUR_WRITES for Impala, but I'm 
> finding that if SetLatestObservedTimestamp() isn't called (e.g. because we 
> haven't interacted with Kudu yet in the current session and don't have a 
> timestamp to set), attempting to scan tables always fails with an error of 
> the form:
> org.apache.kudu.client.NonRecoverableException: Snapshot timestamp is earlier 
> than the ancient history mark. Consider increasing the value of the 
> configuration parameter --tablet_history_max_age_sec. Snapshot timestamp: P: 
> 0 usec, L: 1 Ancient History Mark: P: 1528845756599966 usec, L: 0 Physical 
> time difference: -1528845756.600s
> Minimal repro:
> {noformat}
> KuduClientBuilder b = new KuduClient.KuduClientBuilder("localhost");
> KuduClient client = b.build();
> KuduTable table = client.openTable("read_mode_test");
> KuduScannerBuilder scannerBuilder = client.newScannerBuilder(table);
> scannerBuilder.readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES);
> KuduScanner scanner = scannerBuilder.build();
> while (scanner.hasMoreRows()) {
>   scanner.nextRows();
> }
> {noformat}
> I'm using Kudu at git hash a954418



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2430) Flaky test security-itest

2018-05-07 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2430:
-

 Summary: Flaky test security-itest
 Key: KUDU-2430
 URL: https://issues.apache.org/jira/browse/KUDU-2430
 Project: Kudu
  Issue Type: Test
  Components: security
Affects Versions: 1.7.0
Reporter: Hao Hao
 Attachments: security-itest.txt

While running on the master branch, security-itest failed with 'WARNING: 
ThreadSanitizer: data race'. The full log is attached.

{noformat}

WARNING: ThreadSanitizer: data race (pid=785)
 Write of size 8 at 0x7b080d68 by main thread:
 #0 operator delete(void*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_new_delete.cc:119
 (security-itest+0x4eb7a1)
 #1 std::__1::__libcpp_deallocate(void*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/new:236:3
 (libkrpc.so+0x8d8ba)
 #2 std::__1::allocator 
>::deallocate(std::__1::__tree_node*, 
unsigned long) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/memory:1796
 (libkrpc.so+0x8d8ba)
 #3 
std::__1::allocator_traits > 
>::deallocate(std::__1::allocator >&, std::__1::__tree_node*, unsigned 
long) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/memory:1555
 (libkrpc.so+0x8d8ba)
 #4 std::__1::__tree::destroy(std::__1::__tree_node*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:1834
 (libkrpc.so+0x8d8ba)
 #5 std::__1::__tree::~__tree() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:1821:3
 (libkrpc.so+0x8d856)
 #6 std::__1::set::~set() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/set:400:28
 (libkrpc.so+0x8bc79)
 #7 cxa_at_exit_wrapper(void*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:386
 (security-itest+0x44c2b3)

Previous read of size 8 at 0x7b080d68 by thread T5:
 #0 std::__1::__tree_end_node*>* 
std::__1::__tree_next_iter*>*,
 std::__1::__tree_node_base*>(std::__1::__tree_node_base*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:185:14
 (libkrpc.so+0x90356)
 #1 std::__1::__tree_const_iterator*, long>::operator++() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:921
 (libkrpc.so+0x90356)
 #2 void std::__1::__tree::__assign_multi*, long> 
>(std::__1::__tree_const_iterator*, long>, 
std::__1::__tree_const_iterator*, long>) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:1667
 (libkrpc.so+0x90356)
 #3 std::__1::__tree::operator=(std::__1::__tree const&) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/__tree:1575:9
 (libkrpc.so+0x901a4)
 #4 std::__1::set::operator=(std::__1::set const&) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/set:485:21
 (libkrpc.so+0x870ba)
 #5 kudu::rpc::ClientNegotiation::SendNegotiate() 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/client_negotiation.cc:306
 (libkrpc.so+0x870ba)
 #6 
kudu::rpc::ClientNegotiation::Negotiate(std::__1::unique_ptr*) 
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/rpc/client_negotiation.cc:171:5
 (libkrpc.so+0x8693a)
 #7 

[jira] [Created] (KUDU-2425) Flaky test rpc_server-test

2018-05-01 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2425:
-

 Summary: Flaky test rpc_server-test
 Key: KUDU-2425
 URL: https://issues.apache.org/jira/browse/KUDU-2425
 Project: Kudu
  Issue Type: Test
Affects Versions: 1.7.0
Reporter: Hao Hao
 Attachments: rpc_server-test.txt

While running on the master branch, rpc_server-test failed with 'Check failed: 0 == 
rv (0 vs. 16)'. The full log is attached.

{noformat}

F0427 23:04:30.496054 497 mutex.cc:77] Check failed: 0 == rv (0 vs. 16) . 
Device or resource busy
*** Check failure stack trace: ***
*** Aborted at 1524895470 (unix time) try "date -d @1524895470" if you are 
using GNU date ***
PC: @ 0x397f632625 (unknown)
*** SIGABRT (@0x45201f1) received by PID 497 (TID 0x7fa80ffef980) from PID 
497; stack trace: ***
 @ 0x397fa0f710 (unknown) at ??:0
 @ 0x397f632625 (unknown) at ??:0
 @ 0x397f633e05 (unknown) at ??:0
 @ 0x7fa8104c4a29 google::logging_fail() at ??:0
 @ 0x7fa8104c631d google::LogMessage::Fail() at ??:0
 @ 0x7fa8104c81dd google::LogMessage::SendToLog() at ??:0
 @ 0x7fa8104c5e59 google::LogMessage::Flush() at ??:0
F0427 23:04:30.496255 518 mutex.cc:83] Check failed: rv == 0 || rv == 16 . 
Invalid argument. Owner tid: 0; Self tid: 7; To collect the owner stack trace, 
enable the flag --debug_mutex_collect_stacktrace
F0427 23:04:30.496286 509 mutex.cc:83] Check failed: rv == 0 || rv == 16 . 
Invalid argument. Owner tid: 0; Self tid: 8; To collect the owner stack trace, 
enable the flag --debug_mutex_collect_stacktrace
*** Check failure stack trace: ***

{noformat} 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2352) Add an API to allow bounded staleness scan

2018-04-24 Thread Hao Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao reassigned KUDU-2352:
-

Assignee: Hao Hao

> Add an API to allow bounded staleness scan
> --
>
> Key: KUDU-2352
> URL: https://issues.apache.org/jira/browse/KUDU-2352
> Project: Kudu
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 1.7.0
>Reporter: Hao Hao
>Assignee: Hao Hao
>Priority: Major
>
> It would be nice to have an API that allows clients to specify a timestamp so 
> that in either READ_AT_SNAPSHOT or READ_YOUR_WRITES mode the chosen snapshot 
> timestamp is guaranteed to be higher than the given bound.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2415) READ_YOUR_WRITES scan with no prior operation fails

2018-04-24 Thread Hao Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao reassigned KUDU-2415:
-

Assignee: Hao Hao

> READ_YOUR_WRITES scan with no prior operation fails
> ---
>
> Key: KUDU-2415
> URL: https://issues.apache.org/jira/browse/KUDU-2415
> Project: Kudu
>  Issue Type: Bug
>  Components: client, tserver
>Affects Versions: 1.7.0
>Reporter: Todd Lipcon
>Assignee: Hao Hao
>Priority: Major
>
> If I create a new Java client, and then perform a scan in READ_YOUR_WRITES 
> mode without having done any prior operations from that client, it sends an 
> RPC with read_mode=READ_YOUR_WRITES but without any propagated or snapshot 
> timestamp field set. The server seems to interpret this as a value '0' and 
> then fails with the error:
> Snapshot timestamp is earlier than the ancient history mark. Consider 
> increasing the value of the configuration parameter 
> --tablet_history_max_age_sec. Snapshot timestamp: P: 0 usec, L: 1 Ancient 
> History Mark: P: 1524607330402072 usec, L: 0 Physical time difference: 
> -1524607330.402s



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2415) READ_YOUR_WRITES scan with no prior operation fails

2018-04-24 Thread Hao Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451414#comment-16451414
 ] 

Hao Hao commented on KUDU-2415:
---

I think one way to mitigate this is KUDU-2352.

> READ_YOUR_WRITES scan with no prior operation fails
> ---
>
> Key: KUDU-2415
> URL: https://issues.apache.org/jira/browse/KUDU-2415
> Project: Kudu
>  Issue Type: Bug
>  Components: client, tserver
>Affects Versions: 1.7.0
>Reporter: Todd Lipcon
>Priority: Major
>
> If I create a new Java client, and then perform a scan in READ_YOUR_WRITES 
> mode without having done any prior operations from that client, it sends an 
> RPC with read_mode=READ_YOUR_WRITES but without any propagated or snapshot 
> timestamp field set. The server seems to interpret this as a value '0' and 
> then fails with the error:
> Snapshot timestamp is earlier than the ancient history mark. Consider 
> increasing the value of the configuration parameter 
> --tablet_history_max_age_sec. Snapshot timestamp: P: 0 usec, L: 1 Ancient 
> History Mark: P: 1524607330402072 usec, L: 0 Physical time difference: 
> -1524607330.402s



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2390) ITClient fails with "Row count unexpectedly decreased"

2018-03-28 Thread Hao Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418161#comment-16418161
 ] 

Hao Hao commented on KUDU-2390:
---

[~tlipcon] Right, it looks like one scanner expired and was retried. I will 
loop the test to try to reproduce and debug it.

> ITClient fails with "Row count unexpectedly decreased"
> --
>
> Key: KUDU-2390
> URL: https://issues.apache.org/jira/browse/KUDU-2390
> Project: Kudu
>  Issue Type: Bug
>  Components: java, test
>Affects Versions: 1.7.0
>Reporter: Todd Lipcon
>Priority: Critical
> Attachments: Stdout.txt.gz
>
>
> On master, hit the following failure of ITClient:
> {code}
> 20:05:05.407 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:934) 
> AsyncKuduScanner$Response(scannerId = "6ddf5d0da48241aea4b9eb51645716cc", 
> data = RowResultIterator for 27600 rows, more = true, responseScanTimestamp = 
> 6234957022375723008) for scanner
> 20:05:05.407 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:447) Scanner 
> "6ddf5d0da48241aea4b9eb51645716cc" opened on 
> d78cb5506f6e4e17bd54fdaf1819a8a2@[729d64003e7740cabb650f8f6aea4af6(127.1.76.194:60468),7a2e5f9b2be9497fadc30b81a6a50b24(127.1.76.19
> 20:05:05.409 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:934) 
> AsyncKuduScanner$Response(scannerId = "", data = RowResultIterator for 7314 
> rows, more = false) for scanner 
> KuduScanner(table=org.apache.kudu.client.ITClient-1522206255318, tablet=d78c
> 20:05:05.409 [INFO - Thread-4] (ITClient.java:397) New row count 90114
> 20:05:05.414 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:934) 
> AsyncKuduScanner$Response(scannerId = "c230614ad13e40478254b785995d1d7c", 
> data = RowResultIterator for 27600 rows, more = true, responseScanTimestamp = 
> 6234957022413987840) for scanner
> 20:05:05.414 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:447) Scanner 
> "c230614ad13e40478254b785995d1d7c" opened on 
> d78cb5506f6e4e17bd54fdaf1819a8a2@[729d64003e7740cabb650f8f6aea4af6(127.1.76.194:60468),7a2e5f9b2be9497fadc30b81a6a50b24(127.1.76.19
> 20:05:05.419 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:934) 
> AsyncKuduScanner$Response(scannerId = "", data = RowResultIterator for 27600 
> rows, more = true) for scanner 
> KuduScanner(table=org.apache.kudu.client.ITClient-1522206255318, tablet=d78c
> 20:05:05.420 [DEBUG - New I/O worker #17] (AsyncKuduScanner.java:934) 
> AsyncKuduScanner$Response(scannerId = "", data = RowResultIterator for 7342 
> rows, more = false) for scanner 
> KuduScanner(table=org.apache.kudu.client.ITClient-1522206255318, tablet=d78c
> 20:05:05.421 [ERROR - Thread-4] (ITClient.java:134) Row count unexpectedly 
> decreased from 90114to 62542
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2363) Investigate if we should use ServiceCredentialProvider for Spark integration

2018-03-21 Thread Hao Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-2363:
--
Description: 
Spark 2 provides a \{{ServiceCredentialProvider}} implementation for 
integration with other services on a secure cluster. Here is the related 
[documentation|https://spark.apache.org/docs/2.1.0/running-on-yarn.html#running-in-a-secure-cluster],
 although it is lacking in detail.

We should probably investigate if we want to use it to avoid asking users to 
provide the keytab, since it might not be a good practice.

  was:
Spark 2 provides a \{ServiceCredentialProvider} implementation for integration 
with other service on a secure cluster. Here is a related 
[documentation|https://spark.apache.org/docs/2.1.0/running-on-yarn.html#running-in-a-secure-cluster]
 although lacking in detail.

We should probably investigate if we want to use it to avoid asking users to 
provide the keytab, since it might not be a good practice.


> Investigate if we should use ServiceCredentialProvider for Spark integration
> 
>
> Key: KUDU-2363
> URL: https://issues.apache.org/jira/browse/KUDU-2363
> Project: Kudu
>  Issue Type: Improvement
>Reporter: Hao Hao
>Priority: Major
>
> Spark 2 provides a \{{ServiceCredentialProvider}} implementation for 
> integration with other services on a secure cluster. Here is the related 
> [documentation|https://spark.apache.org/docs/2.1.0/running-on-yarn.html#running-in-a-secure-cluster],
>  although it is lacking in detail.
> We should probably investigate if we want to use it to avoid asking users to 
> provide the keytab, since it might not be a good practice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2363) Investigate if we should use ServiceCredentialProvider for Spark integration

2018-03-20 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2363:
-

 Summary: Investigate if we should use ServiceCredentialProvider 
for Spark integration
 Key: KUDU-2363
 URL: https://issues.apache.org/jira/browse/KUDU-2363
 Project: Kudu
  Issue Type: Improvement
Reporter: Hao Hao


Spark 2 provides a \{ServiceCredentialProvider} implementation for integration 
with other services on a secure cluster. Here is the related 
[documentation|https://spark.apache.org/docs/2.1.0/running-on-yarn.html#running-in-a-secure-cluster],
 although it is lacking in detail.

We should probably investigate if we want to use it to avoid asking users to 
provide the keytab, since it might not be a good practice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-1704) Add a new read mode to perform bounded staleness snapshot reads

2018-03-14 Thread Hao Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-1704:
--
Description: 
It would be useful to be able to perform snapshot reads at a timestamp that is 
higher than the client's previous writes (to achieve read-your-writes), thus improving 
recency, but lower than the server's oldest in-flight transaction, thus 
minimizing the scan's chance to block.

Such a mode would not guarantee linearizability, but would still allow for 
client-local read-your-writes, which seems to be one of the properties users 
care about the most.

The detailed design of the mode is available in the linked design doc.

This should likely be the new default read mode for scanners.

  was:
It would be useful to be able to perform snapshot reads at a timestamp that is 
higher than a client previous writes (achieve read-your-writes), thus improving 
recency, but lower that the server's oldest inflight transaction, thus 
minimizing the scan's chance to block.

Such a mode would not guarantee linearizability, but would still allow for 
client-local read-your-writes, which seems to be one of the properties users 
care about the most.

The detail design of the mode is available in the linked design doc.

This should likely be the new default read mode for scanners.


> Add a new read mode to perform bounded staleness snapshot reads
> ---
>
> Key: KUDU-1704
> URL: https://issues.apache.org/jira/browse/KUDU-1704
> Project: Kudu
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 1.1.0
>Reporter: David Alves
>Assignee: Hao Hao
>Priority: Major
>  Labels: consistency
> Fix For: 1.7.0
>
>
> It would be useful to be able to perform snapshot reads at a timestamp that 
> is higher than the client's previous writes (to achieve read-your-writes), thus 
> improving recency, but lower than the server's oldest in-flight transaction, 
> thus minimizing the scan's chance to block.
> Such a mode would not guarantee linearizability, but would still allow for 
> client-local read-your-writes, which seems to be one of the properties users 
> care about the most.
> The detailed design of the mode is available in the linked design doc.
> This should likely be the new default read mode for scanners.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2352) Add an API to allow bounded staleness scan

2018-03-14 Thread Hao Hao (JIRA)
Hao Hao created KUDU-2352:
-

 Summary: Add an API to allow bounded staleness scan
 Key: KUDU-2352
 URL: https://issues.apache.org/jira/browse/KUDU-2352
 Project: Kudu
  Issue Type: Improvement
  Components: api
Affects Versions: 1.7.0
Reporter: Hao Hao


It would be nice to have an API that allows clients to specify a timestamp so that 
in either READ_AT_SNAPSHOT or READ_YOUR_WRITES mode the chosen snapshot 
timestamp is guaranteed to be higher than the given bound.
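
As a toy illustration of the proposed semantics (everything below is hypothetical and 
illustrative only, not an existing Kudu API): whatever timestamp would otherwise be 
chosen for the scan is simply clamped so that it can never fall below the 
client-supplied bound.

{code:java}
// Hypothetical sketch of the semantics proposed in this JIRA; none of this is
// existing Kudu client API.
final class BoundedStalenessSketch {
  // serverChosenMicros: the timestamp the server would otherwise pick for a
  //                     READ_AT_SNAPSHOT / READ_YOUR_WRITES scan.
  // clientBoundMicros:  the staleness bound supplied by the client.
  static long chooseSnapshotTimestamp(long serverChosenMicros, long clientBoundMicros) {
    // The scan must observe everything up to the bound, so bump the chosen
    // timestamp up to the bound if the server's pick is older.
    return Math.max(serverChosenMicros, clientBoundMicros);
  }

  public static void main(String[] args) {
    long serverPick = 1_528_845_000_000_000L;   // example: server's clean-snapshot time
    long clientBound = 1_528_845_756_599_966L;  // example: timestamp observed from a prior write
    // Prints the bound, since it is newer than the server's pick.
    System.out.println(chooseSnapshotTimestamp(serverPick, clientBound));
  }
}
{code}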



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-1704) Add a new read mode to perform bounded staleness snapshot reads

2018-03-14 Thread Hao Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao updated KUDU-1704:
--
Description: 
It would be useful to be able to perform snapshot reads at a timestamp that is 
higher than a client's previous writes (to achieve read-your-writes), thus improving 
recency, but lower than the server's oldest in-flight transaction, thus 
minimizing the scan's chance to block.

Such a mode would not guarantee linearizability, but would still allow for 
client-local read-your-writes, which seems to be one of the properties users 
care about the most.

The detailed design of the mode is available in the linked design doc.

This should likely be the new default read mode for scanners.

  was:
It would be useful to be able to perform snapshot reads at a timestamp that is 
higher than a client provided timestamp, thus improving recency, but lower that 
the server's oldest inflight transaction, thus minimizing the scan's chance to 
block.

Such a mode would not guarantee linearizability, but would still allow for 
client-local read-your-writes, which seems to be one of the properties users 
care about the most.

This should likely be the new default read mode for scanners.



> Add a new read mode to perform bounded staleness snapshot reads
> ---
>
> Key: KUDU-1704
> URL: https://issues.apache.org/jira/browse/KUDU-1704
> Project: Kudu
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 1.1.0
>Reporter: David Alves
>Assignee: Hao Hao
>Priority: Major
>  Labels: consistency
> Fix For: 1.7.0
>
>
> It would be useful to be able to perform snapshot reads at a timestamp that 
> is higher than a client's previous writes (to achieve read-your-writes), thus 
> improving recency, but lower than the server's oldest in-flight transaction, 
> thus minimizing the scan's chance to block.
> Such a mode would not guarantee linearizability, but would still allow for 
> client-local read-your-writes, which seems to be one of the properties users 
> care about the most.
> The detailed design of the mode is available in the linked design doc.
> This should likely be the new default read mode for scanners.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KUDU-1704) Add a new read mode to perform bounded staleness snapshot reads

2018-03-14 Thread Hao Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399598#comment-16399598
 ] 

Hao Hao edited comment on KUDU-1704 at 3/15/18 3:33 AM:


Fixed in commits 723ced836, 5047f091d, 0c05e8375, 9d233f457.


was (Author: hahao):
Fixed in commits 723ced836, 5047f091d, 0c05e8375, 48cdaaa17.

> Add a new read mode to perform bounded staleness snapshot reads
> ---
>
> Key: KUDU-1704
> URL: https://issues.apache.org/jira/browse/KUDU-1704
> Project: Kudu
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 1.1.0
>Reporter: David Alves
>Assignee: Hao Hao
>Priority: Major
>  Labels: consistency
> Fix For: 1.7.0
>
>
> It would be useful to be able to perform snapshot reads at a timestamp that 
> is higher than a client-provided timestamp, thus improving recency, but lower 
> than the server's oldest in-flight transaction, thus minimizing the scan's 
> chance to block.
> Such a mode would not guarantee linearizability, but would still allow for 
> client-local read-your-writes, which seems to be one of the properties users 
> care about the most.
> This should likely be the new default read mode for scanners.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-1704) Add a new read mode to perform bounded staleness snapshot reads

2018-03-14 Thread Hao Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399865#comment-16399865
 ] 

Hao Hao commented on KUDU-1704:
---

Hmm, I guess I used my local commit hash. Just verified it should be 9d233f457 
instead.

I think the description in the JIRA does not specifically mention that it is 
snapshot consistency across all tablets, though it does mention that it is bounded 
staleness. Currently, there is no API to specify the staleness bound yet.

To make this clearer, I linked the design doc and will update the 
description and file a follow-up JIRA to add an API to specify the staleness 
bound.
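
For context, a small sketch of how the two read modes differ from the Java client's 
point of view (the scanner-builder calls below are assumed to match the Java client of 
this era; snapshotTimestampMicros in particular is an assumption here, not something 
verified in this thread):

{code:java}
import org.apache.kudu.client.*;

final class ReadModeSketch {
  // READ_AT_SNAPSHOT: the caller pins an explicit timestamp (or lets the server pick one).
  static KuduScanner readAtSnapshot(KuduClient client, KuduTable table, long tsMicros) {
    return client.newScannerBuilder(table)
        .readMode(AsyncKuduScanner.ReadMode.READ_AT_SNAPSHOT)
        .snapshotTimestampMicros(tsMicros)  // assumed builder method for an explicit snapshot
        .build();
  }

  // READ_YOUR_WRITES: no explicit timestamp; the server picks one that is at least as
  // recent as this client's prior writes/observed timestamps, while trying to stay below
  // the oldest in-flight transaction so the scan is less likely to block.
  static KuduScanner readYourWrites(KuduClient client, KuduTable table) {
    return client.newScannerBuilder(table)
        .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES)
        .build();
  }
}
{code}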

> Add a new read mode to perform bounded staleness snapshot reads
> ---
>
> Key: KUDU-1704
> URL: https://issues.apache.org/jira/browse/KUDU-1704
> Project: Kudu
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 1.1.0
>Reporter: David Alves
>Assignee: Hao Hao
>Priority: Major
>  Labels: consistency
> Fix For: 1.7.0
>
>
> It would be useful to be able to perform snapshot reads at a timestamp that 
> is higher than a client-provided timestamp, thus improving recency, but lower 
> than the server's oldest in-flight transaction, thus minimizing the scan's 
> chance to block.
> Such a mode would not guarantee linearizability, but would still allow for 
> client-local read-your-writes, which seems to be one of the properties users 
> care about the most.
> This should likely be the new default read mode for scanners.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KUDU-1704) Add a new read mode to perform bounded staleness snapshot reads

2018-03-14 Thread Hao Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Hao resolved KUDU-1704.
---
   Resolution: Fixed
Fix Version/s: 1.7.0

> Add a new read mode to perform bounded staleness snapshot reads
> ---
>
> Key: KUDU-1704
> URL: https://issues.apache.org/jira/browse/KUDU-1704
> Project: Kudu
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 1.1.0
>Reporter: David Alves
>Assignee: Hao Hao
>Priority: Major
>  Labels: consistency
> Fix For: 1.7.0
>
>
> It would be useful to be able to perform snapshot reads at a timestamp that 
> is higher than a client-provided timestamp, thus improving recency, but lower 
> than the server's oldest in-flight transaction, thus minimizing the scan's 
> chance to block.
> Such a mode would not guarantee linearizability, but would still allow for 
> client-local read-your-writes, which seems to be one of the properties users 
> care about the most.
> This should likely be the new default read mode for scanners.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

