[jira] [Created] (KUDU-1912) Tablet startup fails on transaction memory consumption

2017-03-08 Thread Jean-Daniel Cryans (JIRA)
Jean-Daniel Cryans created KUDU-1912:


 Summary: Tablet startup fails on transaction memory consumption
 Key: KUDU-1912
 URL: https://issues.apache.org/jira/browse/KUDU-1912
 Project: Kudu
  Issue Type: Bug
  Components: tablet
Affects Versions: 1.2.0
Reporter: Jean-Daniel Cryans


As reported by a user on Slack:

{code}
W0307 20:03:46.820791 25594 transaction_tracker.cc:108] Transaction failed, 
tablet 7bb5e24d7521458d91ad06736a9f7685 transaction memory consumption 
(66447925) has exceeded its limit (67108864) or the limit of an ancestral 
tracker
E0307 20:03:46.820821 25594 ts_tablet_manager.cc:776] T 
7bb5e24d7521458d91ad06736a9f7685 P d4a26cb0d6994266a68dc76d983e454a: Tablet 
failed to start: Service unavailable: Transaction failed, tablet 
7bb5e24d7521458d91ad06736a9f7685 transaction memory consumption (66447925) has 
exceeded its limit (67108864) or the limit of an ancestral tracker
{code}

Then since it's a failed state, the replica doesn't get kicked out of the 
configuration and so the tablet stays under-replicated.
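The rejection in the log can be illustrated with a minimal sketch of a hierarchical memory tracker. This is an illustration under assumptions, not Kudu's actual implementation: the `MemTracker` class, its method names, and the tracker layout are hypothetical; only the two byte counts come from the log. Note that the reported consumption (66447925) is below the limit (67108864), so one plausible reading is that the reported value excludes the transaction that was rejected.

```python
class MemTracker:
    """Hypothetical hierarchical memory tracker; not Kudu's actual class."""

    def __init__(self, limit=None, parent=None):
        self.limit = limit        # limit in bytes; None means unlimited
        self.parent = parent      # ancestral tracker, if any
        self.consumption = 0      # bytes currently accounted for

    def try_consume(self, nbytes):
        """Account for nbytes here and in every ancestor, or in none of them."""
        chain = []
        node = self
        while node is not None:
            if node.limit is not None and node.consumption + nbytes > node.limit:
                return False      # this tracker or an ancestor would exceed its limit
            chain.append(node)
            node = node.parent
        for node in chain:
            node.consumption += nbytes
        return True

    def release(self, nbytes):
        """Give back nbytes here and in every ancestor."""
        node = self
        while node is not None:
            node.consumption -= nbytes
            node = node.parent


# Byte counts are taken from the log above; the tracker layout is an assumption.
server = MemTracker()                                 # unlimited ancestor
tablet = MemTracker(limit=67108864, parent=server)    # 64 MiB transaction limit
tablet.try_consume(66447925)                          # consumption reported in the log
headroom = tablet.limit - tablet.consumption          # bytes left before rejection
```

Under this model, any transaction larger than `headroom` is refused with the tracker state left unchanged, which would match a replay at startup failing while earlier transactions still hold memory.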



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (KUDU-1887) Allow RPC handlers to discard inbound transfer

2017-03-08 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved KUDU-1887.
---
   Resolution: Fixed
Fix Version/s: 1.4.0

> Allow RPC handlers to discard inbound transfer
> --
>
> Key: KUDU-1887
> URL: https://issues.apache.org/jira/browse/KUDU-1887
> Project: Kudu
>  Issue Type: Improvement
>  Components: rpc
>Affects Versions: 1.2.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Minor
> Fix For: 1.4.0
>
>
> This is a general feature request for using KRPC in Impala, not something 
> that affects Kudu itself right now AFAIK.
> A common pattern when many flows fan in to a single server is for the server 
> to delay returning a response to a client for a while, in order to implement 
> some kind of flow control when the server is at capacity. 
> If a client sends a lot of data (perhaps by sidecar - KUDU-1866), there's 
> currently no way AFAICT to retain the {{RpcContext}} needed to delay sending 
> the response, but to drop the associated transfer buffer (that, presumably, 
> is putting the server over its capacity). 
> So we could have {{RpcContext::DiscardTransfer()}} which drops the 
> {{InboundCall}}'s {{InboundTransfer}}. Since this is likely to be called after 
> handling the request, the request protobuf should still be independently 
> attached to the {{RpcContext}}. After {{DiscardTransfer}}, it's an error to 
> look at any inbound sidecars.
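The proposed semantics can be sketched as a small model. This is a Python illustration under assumptions, not the KRPC API: the class names mirror the ones in the description, but every method and field here (`discard_transfer`, `get_inbound_sidecar`, `sidecar_ranges`, `respond_success`) is hypothetical.

```python
class InboundTransfer:
    """Raw bytes received from the client, including any sidecars."""

    def __init__(self, data):
        self.data = data


class InboundCall:
    def __init__(self, transfer, sidecar_ranges):
        self.transfer = transfer              # dropped by discard_transfer()
        self.sidecar_ranges = sidecar_ranges  # (start, end) offsets into transfer.data


class RpcContext:
    def __init__(self, request, call):
        self.request = request   # parsed request, held independently of the transfer
        self.call = call
        self.responded = False

    def discard_transfer(self):
        """Free the inbound buffer while keeping the context alive."""
        self.call.transfer = None

    def get_inbound_sidecar(self, idx):
        if self.call.transfer is None:
            raise RuntimeError("inbound sidecars are unavailable after discard_transfer()")
        start, end = self.call.sidecar_ranges[idx]
        return self.call.transfer.data[start:end]

    def respond_success(self):
        """Send the (possibly delayed) response."""
        self.responded = True
```

A handler would read its sidecars, call `discard_transfer()`, and hold on to the context until it is ready to respond; the parsed request stays readable throughout.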





[jira] [Commented] (KUDU-1890) Allow renaming of primary key column

2017-03-08 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901836#comment-15901836
 ] 

Todd Lipcon commented on KUDU-1890:
---

Should we resolve this as a duplicate of KUDU-1626, or close that one as a 
dup? They seem to be the same.

> Allow renaming of primary key column
> 
>
> Key: KUDU-1890
> URL: https://issues.apache.org/jira/browse/KUDU-1890
> Project: Kudu
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Ram Mettu
>Assignee: Ram Mettu
>Priority: Minor
>
> The current version allows renaming any non-primary-key column of a table; 
> the request is to remove the restriction on renaming primary key columns. 
> The workaround is very time-consuming: create a new table and copy the data 
> from the old table into the new table. 





[jira] [Commented] (KUDU-1890) Allow renaming of primary key column

2017-03-08 Thread Ram Mettu (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901962#comment-15901962
 ] 

Ram Mettu commented on KUDU-1890:
-

No problem, we can close this one.

> Allow renaming of primary key column
> 
>
> Key: KUDU-1890
> URL: https://issues.apache.org/jira/browse/KUDU-1890
> Project: Kudu
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Ram Mettu
>Assignee: Ram Mettu
>Priority: Minor
>
> The current version allows renaming any non-primary-key column of a table; 
> the request is to remove the restriction on renaming primary key columns. 
> The workaround is very time-consuming: create a new table and copy the data 
> from the old table into the new table. 





[jira] [Closed] (KUDU-1890) Allow renaming of primary key column

2017-03-08 Thread Ram Mettu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ram Mettu closed KUDU-1890.
---
Resolution: Duplicate

Duplicate of KUDU-1626

> Allow renaming of primary key column
> 
>
> Key: KUDU-1890
> URL: https://issues.apache.org/jira/browse/KUDU-1890
> Project: Kudu
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Ram Mettu
>Assignee: Ram Mettu
>Priority: Minor
>
> The current version allows renaming any non-primary-key column of a table; 
> the request is to remove the restriction on renaming primary key columns. 
> The workaround is very time-consuming: create a new table and copy the data 
> from the old table into the new table. 





[jira] [Commented] (KUDU-1626) Allow renaming primary key columns

2017-03-08 Thread Ram Mettu (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901968#comment-15901968
 ] 

Ram Mettu commented on KUDU-1626:
-

https://gerrit.cloudera.org/#/c/6078/

> Allow renaming primary key columns
> --
>
> Key: KUDU-1626
> URL: https://issues.apache.org/jira/browse/KUDU-1626
> Project: Kudu
>  Issue Type: Improvement
>Affects Versions: 1.0.0
>Reporter: Dan Burkert
>Assignee: Ram Mettu
>
> Kudu unnecessarily restricts primary key columns from being renamed. This is 
> of particular importance since column renaming is the only workaround for 
> Impala and Spark not being able to use columns with upper case and non-ascii 
> characters.





[jira] [Created] (KUDU-1913) Tablet server should handle failure gracefully instead of crashing

2017-03-08 Thread Juan Yu (JIRA)
Juan Yu created KUDU-1913:
-

 Summary: Tablet server should handle failure gracefully instead of 
crashing
 Key: KUDU-1913
 URL: https://issues.apache.org/jira/browse/KUDU-1913
 Project: Kudu
  Issue Type: Bug
Reporter: Juan Yu


When adding lots of range partitions, all tablet servers crashed with the 
following error:

F0308 14:51:04.109369 12952 raft_consensus.cc:1985] Check failed: _s.ok() Bad 
status: Runtime error: Could not create thread: Resource temporarily 
unavailable (error 11)

Tablet server should handle error/failure more gracefully instead of crashing.





[jira] [Created] (KUDU-1914) Add positive test cases for Web UI .htpasswd support

2017-03-08 Thread Hao Hao (JIRA)
Hao Hao created KUDU-1914:
-

 Summary: Add positive test cases for Web UI .htpasswd support
 Key: KUDU-1914
 URL: https://issues.apache.org/jira/browse/KUDU-1914
 Project: Kudu
  Issue Type: Test
  Components: security
Affects Versions: 1.3.0
Reporter: Hao Hao
Priority: Minor


We have a negative test for Web UI basic HTTP authentication. It would be nice 
to add a positive test to ensure that, when HTTP authentication is enabled, a 
client supplying the correct username and password can connect to the web 
server successfully.





[jira] [Commented] (KUDU-1913) Tablet server should handle failure gracefully instead of crashing

2017-03-08 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902278#comment-15902278
 ] 

Todd Lipcon commented on KUDU-1913:
---

I think it's running out of threads on a tablet server. We have some plans to 
reduce the number of threads created - right now we create at least two per 
partition on a tablet server, so if you try to put many thousands of partitions 
on one server, you'll hit this. The known limitations document does recommend 
keeping to hundreds of tablets per server max in current versions.
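The arithmetic behind this can be sketched as follows. The figure of two threads per partition is the lower bound quoted above; the function names, the `other_threads` parameter, and the example thread limit are assumptions for illustration only. Error 11 on Linux is EAGAIN, which thread creation returns when a resource or thread-count limit has been reached.

```python
THREADS_PER_TABLET = 2  # lower bound quoted in the comment above


def min_tablet_threads(num_tablets, threads_per_tablet=THREADS_PER_TABLET):
    """Lower bound on threads created for the tablets hosted by one server."""
    return num_tablets * threads_per_tablet


def fits_thread_limit(num_tablets, thread_limit, other_threads=0):
    """True if the estimated thread count stays within a given limit.

    thread_limit stands for whatever per-process or per-user cap applies on
    the host; other_threads covers threads unrelated to tablets.
    """
    return min_tablet_threads(num_tablets) + other_threads <= thread_limit
```

With a hypothetical cap of 4096 threads, `fits_thread_limit(500, 4096)` holds, while 5000 tablets need at least 10000 threads and do not fit.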

> Tablet server should handle failure gracefully instead of crashing
> --
>
> Key: KUDU-1913
> URL: https://issues.apache.org/jira/browse/KUDU-1913
> Project: Kudu
>  Issue Type: Bug
>Reporter: Juan Yu
>
> When adding lots of range partitions, all tablet servers crashed with the 
> following error:
> F0308 14:51:04.109369 12952 raft_consensus.cc:1985] Check failed: _s.ok() Bad 
> status: Runtime error: Could not create thread: Resource temporarily 
> unavailable (error 11)
> Tablet server should handle error/failure more gracefully instead of crashing.





[jira] [Updated] (KUDU-1913) Tablet server runs out of threads when creating lots of tablets

2017-03-08 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated KUDU-1913:
--
Target Version/s: 1.4.0
  Labels: data-scalability  (was: )
 Component/s: log
  consensus
 Summary: Tablet server runs out of threads when creating lots of 
tablets  (was: Tablet server should handle failure gracefully instead of 
crashing)

> Tablet server runs out of threads when creating lots of tablets
> ---
>
> Key: KUDU-1913
> URL: https://issues.apache.org/jira/browse/KUDU-1913
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus, log
>Reporter: Juan Yu
>  Labels: data-scalability
>
> When adding lots of range partitions, all tablet servers crashed with the 
> following error:
> F0308 14:51:04.109369 12952 raft_consensus.cc:1985] Check failed: _s.ok() Bad 
> status: Runtime error: Could not create thread: Resource temporarily 
> unavailable (error 11)
> Tablet server should handle error/failure more gracefully instead of crashing.





[jira] [Commented] (KUDU-1913) Tablet server runs out of threads when creating lots of tablets

2017-03-08 Thread Juan Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902470#comment-15902470
 ] 

Juan Yu commented on KUDU-1913:
---

I know there is a recommended limit of hundreds of tablets per server, and when 
creating a table there is also a 60-bucket limit check to avoid creating too 
many partitions.
But there is no warning when adding a range partition, so it's very easy to hit 
the limit, and it will cause many servers to crash at the same time, not just a 
single one.
Could an upper limit check (total tablets per server) be added to avoid this?
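A rough sketch of what such a check might look like is given below. Everything in it is hypothetical: the function name, its parameters, the even-spread assumption for new replicas, and the example limit are illustrative and do not describe any existing Kudu check.

```python
class TooManyTabletsError(Exception):
    """Raised when an alter would push a server past the configured limit."""


def check_add_range_partitions(current_per_server, new_range_partitions,
                               hash_buckets, replication_factor,
                               num_servers, max_per_server):
    """Return the projected replicas per server, or raise if over the limit.

    Assumes each new range partition creates one tablet per hash bucket and
    that the new replicas are spread evenly across the tablet servers.
    """
    new_replicas = new_range_partitions * hash_buckets * replication_factor
    added_per_server = -(-new_replicas // num_servers)  # ceiling division
    projected = current_per_server + added_per_server
    if projected > max_per_server:
        raise TooManyTabletsError(
            f"{projected} tablets per server would exceed the limit of {max_per_server}")
    return projected
```

Rejecting the alter up front would turn a cluster-wide crash into a single failed request, at the cost of choosing a limit that suits the deployment.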

> Tablet server runs out of threads when creating lots of tablets
> ---
>
> Key: KUDU-1913
> URL: https://issues.apache.org/jira/browse/KUDU-1913
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus, log
>Reporter: Juan Yu
>  Labels: data-scalability
>
> When adding lots of range partitions, all tablet servers crashed with the 
> following error:
> F0308 14:51:04.109369 12952 raft_consensus.cc:1985] Check failed: _s.ok() Bad 
> status: Runtime error: Could not create thread: Resource temporarily 
> unavailable (error 11)
> Tablet server should handle error/failure more gracefully instead of crashing.





[jira] [Updated] (KUDU-1554) Tombstoned replicas remain on TS even after table is deleted

2017-03-08 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated KUDU-1554:
--
Affects Version/s: (was: 0.10.0)
   1.2.0
 Target Version/s: 1.4.0

The above-mentioned "still references orphaned blocks" was spotted again in the 
wild on 1.2.

> Tombstoned replicas remain on TS even after table is deleted
> 
>
> Key: KUDU-1554
> URL: https://issues.apache.org/jira/browse/KUDU-1554
> Project: Kudu
>  Issue Type: Bug
>  Components: master, tserver
>Affects Versions: 1.2.0
>Reporter: Todd Lipcon
>Priority: Minor
>
> If a replica is deleted on a live table, a tombstone replica is left with 
> TABLET_DATA_TOMBSTONED state. If the table is then deleted, those tombstones 
> aren't cleaned up, and will remain on the tserver until the next time the 
> tserver restarts.
> Not a big deal, but it may be confusing to users to see these tombstones 
> sticking around.





[jira] [Updated] (KUDU-1038) Deleting a tablet should also delete its log recovery directory, if any

2017-03-08 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated KUDU-1038:
--
Component/s: tablet

> Deleting a tablet should also delete its log recovery directory, if any
> ---
>
> Key: KUDU-1038
> URL: https://issues.apache.org/jira/browse/KUDU-1038
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus, tablet
>Affects Versions: Feature Complete
>Reporter: Mike Percy
>Assignee: Mike Percy
>Priority: Minor
>
> Deleting a tablet should also delete its log recovery directory, if any.





[jira] [Resolved] (KUDU-693) UpdateConsensus and RequestConsensusVote RPC callbacks block reactor

2017-03-08 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved KUDU-693.
--
   Resolution: Duplicate
Fix Version/s: n/a

I haven't seen this be a problem since making glog async

> UpdateConsensus and RequestConsensusVote RPC callbacks block reactor
> 
>
> Key: KUDU-693
> URL: https://issues.apache.org/jira/browse/KUDU-693
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Fix For: n/a
>
>
> After adding logging when RPC callbacks block the reactor for too long, I see 
> a fair number of logs on the YCSB tablet servers indicating that these two 
> RPC calls have blocked the reactor for anywhere between 100ms and almost a 
> second. This implies they could probably cause deadlocks as well, and 
> definitely impact latency of all other RPCs on the server.
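The kind of logging described here can be sketched as a wrapper that times each callback and warns when it runs too long. This is a minimal illustration, not Kudu's reactor code: the function name, the logger name, the 100 ms default threshold, and the injectable clock are all assumptions.

```python
import logging
import time

log = logging.getLogger("reactor")


def run_timed(callback, warn_threshold_s=0.1, clock=time.monotonic):
    """Run callback on the calling thread and report whether it blocked too long.

    Returns (result, blocked), where blocked is True if the callback took
    longer than warn_threshold_s seconds.
    """
    start = clock()
    result = callback()
    elapsed = clock() - start
    blocked = elapsed > warn_threshold_s
    if blocked:
        log.warning("callback %s blocked the reactor for %.0f ms",
                    getattr(callback, "__name__", "?"), elapsed * 1000)
    return result, blocked
```

Because a reactor thread serves many connections, any callback flagged this way delays every other RPC scheduled on the same thread, which is the latency impact the description refers to.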


