[jira] [Commented] (ACCUMULO-3565) Encourage use of voting on JIRA
[ https://issues.apache.org/jira/browse/ACCUMULO-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305475#comment-14305475 ] Josh Elser commented on ACCUMULO-3565: -- Accumulo Interest: https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12317510 Accumulo: https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12316812 Encourage use of voting on JIRA --- Key: ACCUMULO-3565 URL: https://issues.apache.org/jira/browse/ACCUMULO-3565 Project: Accumulo Issue Type: Sub-task Reporter: Josh Elser We have lots of issues, many of which tend to fall by the wayside. If we encourage committers/contributors to use the vote feature, we could create a dashboard that highlights the most highly voted issues. This will help show where interest lies with Accumulo and help devs or new users find meaningful issues to work on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ACCUMULO-3565) Encourage use of voting on JIRA
[ https://issues.apache.org/jira/browse/ACCUMULO-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305472#comment-14305472 ] Mike Drob commented on ACCUMULO-3565: - Where can I find these dashboards?
[jira] [Commented] (ACCUMULO-3563) Create regular Accumulo blog posts
[ https://issues.apache.org/jira/browse/ACCUMULO-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305400#comment-14305400 ] Keith Turner commented on ACCUMULO-3563: I could possibly write a post about ACCUMULO-3439 for Feb. Create regular Accumulo blog posts -- Key: ACCUMULO-3563 URL: https://issues.apache.org/jira/browse/ACCUMULO-3563 Project: Accumulo Issue Type: Sub-task Reporter: Josh Elser It'd be great if we could get a regular schedule in place so that each $timeframe, someone signs up to create a new blog post for http://blogs.apache.org/accumulo. We don't need to have extremely long, detailed posts -- any regular posting would be a great addition for users of all experience levels.
[jira] [Commented] (ACCUMULO-3562) Create a powered-by page on the website
[ https://issues.apache.org/jira/browse/ACCUMULO-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305433#comment-14305433 ] Keith Turner commented on ACCUMULO-3562: There was a mailing list discussion about a logo. Below are two links to that discussion. * [markmail mailing list discussion|http://markmail.org/message/j7pndgcfvu3nxd7p] * [apache mailing list discussion|http://mail-archives.apache.org/mod_mbox/accumulo-dev/201410.mbox/%3ccagutchrai4vcuuipxtuqsg0oxenlbx4cc-m8mtkzr5vxpsh...@mail.gmail.com%3E] Create a powered-by page on the website --- Key: ACCUMULO-3562 URL: https://issues.apache.org/jira/browse/ACCUMULO-3562 Project: Accumulo Issue Type: Sub-task Reporter: Josh Elser We could provide a way for companies and projects to advertise that they use Accumulo. It would be a good way to show that Accumulo is in successful use by projects and companies, and possibly help people who want to work with Accumulo as a part of their $dayjob.
[jira] [Commented] (ACCUMULO-3542) Native tarball doesn't have a NOTICE
[ https://issues.apache.org/jira/browse/ACCUMULO-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305707#comment-14305707 ] Christopher Tubbs commented on ACCUMULO-3542: - What would this NOTICE contain, and can you provide a link to the justification to include it? Also, we package this tarball inside our source tarball, which does have a NOTICE. This is an analogous artifact to our jars, which also don't have NOTICEs. Should those jars also have NOTICE files? Native tarball doesn't have a NOTICE Key: ACCUMULO-3542 URL: https://issues.apache.org/jira/browse/ACCUMULO-3542 Project: Accumulo Issue Type: Bug Affects Versions: 1.6.1 Reporter: Billie Rinaldi Fix For: 1.6.3 Do we build any other tarballs we should check?
[jira] [Comment Edited] (ACCUMULO-3513) Ensure MapReduce functionality with Kerberos enabled
[ https://issues.apache.org/jira/browse/ACCUMULO-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305797#comment-14305797 ] Josh Elser edited comment on ACCUMULO-3513 at 2/4/15 8:00 PM: -- I haven't really read up about DIGEST-MD5. I'll have to look into that and see if there's anything better we can use with SASL. bq. The individual MapReduce nodes do not have Kerberos principals at all? How do they authenticate to the job controller? YARN processes have Kerberos principals and credentials, but the tasks they spawn do not. Delegation tokens are the solution for those tasks. The user talks to the resource manager with their krb credentials and obtains delegation tokens for YARN (the same happens with the NN and HDFS). These get stored inside your own UserGroupInformation object and get passed along through the RM, NM, app master and other containers, either over the protocol or via files on disk secured by filesystem permissions (the latter happening when the NM drops privileges to run as you instead of itself). bq. you have to talk to the TServer which issued it This would require us to have clients hold onto N delegation tokens though. That'd make the client implementation much more difficult than a single delegation token that any node in the instance can verify. bq. If you use a single shared key, you really don't need leader election (because they all have the secret and perform the same function) You need the coordination to roll new secret keys. Using the same secret key for months (assuming average uptime of a cluster) is just asking for attacks. bq. I'm very curious precisely how you are generating these delegation tokens, though. I could be on a completely separate page regarding that and your suggestion for leader elections. Code will speak better than I can: https://github.com/joshelser/accumulo/tree/delegation-tokens/server/base/src/main/java/org/apache/accumulo/server/security/delegation.
I just finished this up, I think. Each Master and Tserver has a SecretManager implementation. The Master (or more generally, whoever is creating the secret keys), also runs the KeyManager which generates a new secret key every $timelength. That process also uses the KeyDistributor to add secret keys to ZK (for all of the followers). The followers (tservers) use the KeyWatcher to see changes made by the KeyDistributor and update their SecretManager. In general, the SecretManager is a local cache off of ZooKeeper which can generate/verify the passwords in delegation tokens. No mechanisms yet exist to ensure that all followers/tservers have seen a new secret key. was (Author: elserj): I haven't really read up about DIGEST-MD5. I'll have to look into that and see if there's anything better we can use with SASL. bq The individual MapReduce nodes do not have Kerberos principals at all? How do they authenticate to the job controller? Delegation tokens. bq. you have to talk to the TServer which issued it This would require us have clients hold onto N delegation tokens though. That'd make the client implementation much more difficult than a singular delegation token that any node in the instance can verify. bq. If you use a single shared key, you really don't need leader election (because they all have the secret and perform the same function) You need the coordination to roll new secret keys. Using the same secret key for months (assuming average uptime of a cluster) is just asking for attacks. bq. I'm very curious precisely how you are generating these delegation tokens, though. I could be on a completely separate page regarding that and your suggestion for leader elections. Code will speak better than I can: https://github.com/joshelser/accumulo/tree/delegation-tokens/server/base/src/main/java/org/apache/accumulo/server/security/delegation. I just finished this up, I think. Each Master and Tserver has a SecretManager implementation. 
The Master (or more generally, whoever is creating the secret keys), also runs the KeyManager which generates a new secret key every $timelength. That process also uses the KeyDistributor to add secret keys to ZK (for all of the followers). The followers (tservers) use the KeyWatcher to see changes made by the KeyDistributor and update their SecretManager. In general, the SecretManager is a local cache off of ZooKeeper which can generate/verify the passwords in delegation tokens. No mechanisms yet exist to ensure that all followers/tservers have seen a new secret key. Ensure MapReduce functionality with Kerberos enabled Key: ACCUMULO-3513 URL: https://issues.apache.org/jira/browse/ACCUMULO-3513 Project: Accumulo Issue Type: Bug Components: client Reporter: Josh Elser
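The key-rolling scheme described in the comment above can be illustrated with a minimal sketch. It is not the code in the linked branch: the class name `RollingKeySketch`, its methods, and the choice of HmacSHA256 are assumptions made for illustration, and an in-memory map stands in for the secret keys that the comment says are distributed through ZooKeeper. The leader rolls a key and computes a token password as an HMAC over the token identifier; a follower can verify the password only once it has received the same key.

```java
import java.nio.charset.StandardCharsets;
import java.security.InvalidKeyException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch of rolling secret keys; names are not taken from Accumulo. */
class RollingKeySketch {
  // keyId -> secret key bytes; stands in for a local cache of keys read from ZooKeeper
  private final Map<Integer, byte[]> keys = new ConcurrentHashMap<>();
  private final SecureRandom random = new SecureRandom();
  private int currentKeyId = 0;

  /** Leader side: generate a new secret key and return its id. */
  synchronized int rollKey() {
    byte[] key = new byte[32];
    random.nextBytes(key);
    currentKeyId++;
    keys.put(currentKeyId, key);
    return currentKeyId;
  }

  /** Leader side: copy of a key, as it would be published to followers. */
  byte[] exportKey(int keyId) {
    byte[] key = keys.get(keyId);
    if (key == null) {
      throw new IllegalArgumentException("Unknown key id " + keyId);
    }
    return key.clone();
  }

  /** Follower side: store a key that was distributed by the leader. */
  void addKey(int keyId, byte[] key) {
    keys.put(keyId, key.clone());
  }

  /** The token password is an HMAC of the token identifier under the given key. */
  byte[] createPassword(int keyId, String identifier) {
    byte[] key = exportKey(keyId);
    try {
      Mac mac = Mac.getInstance("HmacSHA256");
      mac.init(new SecretKeySpec(key, "HmacSHA256"));
      return mac.doFinal(identifier.getBytes(StandardCharsets.UTF_8));
    } catch (NoSuchAlgorithmException | InvalidKeyException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Any node holding the key can verify; a node that has not seen the key rejects. */
  boolean verify(int keyId, String identifier, byte[] password) {
    if (!keys.containsKey(keyId)) {
      return false;
    }
    // Constant-time comparison of the expected and presented passwords
    return MessageDigest.isEqual(createPassword(keyId, identifier), password);
  }
}
```

In this sketch a follower that has not yet received a newly rolled key rejects an otherwise valid token, which is the gap the comment points out when it says no mechanism yet ensures that all tservers have seen a new secret key.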
[jira] [Commented] (ACCUMULO-3513) Ensure MapReduce functionality with Kerberos enabled
[ https://issues.apache.org/jira/browse/ACCUMULO-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14305779#comment-14305779 ] Christopher Tubbs commented on ACCUMULO-3513: - NOTE: DIGEST-MD5 is ill-advised, due to problems: http://tools.ietf.org/html/rfc6331 That's not to say that it couldn't be useful, if deployed properly. I'm just reluctant to rely on deprecated security modes, because it could give a false sense of confidence in the security being implemented. {quote}MapReduce does not have access to Kerberos tokens. This is a non-starter.{quote} The individual MapReduce nodes do not have Kerberos principals at all? How do they authenticate to the job controller? {quote}... We can easily add leader election...{quote} My point was that we don't need to do leader election. Rather, each TServer is just as good as any other to authenticate users, so rather than elect a single leader, you can simply allow any of them to issue tokens (concurrently). The only restriction is that to validate that token, you have to talk to the TServer which issued it... but that's better than always talking to a single leader or the master. {quote}... This authentication model relies on the same secrets being shared across all nodes in the cluster. ...{quote} If you use a single shared key, you *really* don't need leader election (because they all have the secret and perform the same function). However, I was actually thinking that each TServer would have a temporary key with which to generate delegation tokens. So long as that TServer hadn't crashed, it could validate any delegation tokens created from it. I'm very curious precisely how you are generating these delegation tokens, though. I could be on a completely separate page regarding that and your suggestion for leader elections. 
Ensure MapReduce functionality with Kerberos enabled Key: ACCUMULO-3513 URL: https://issues.apache.org/jira/browse/ACCUMULO-3513 Project: Accumulo Issue Type: Bug Components: client Reporter: Josh Elser Assignee: Josh Elser Priority: Blocker Fix For: 1.7.0 Attachments: ACCUMULO-3513-design.pdf I talked to [~devaraj] today about MapReduce support running on secure Hadoop to help get a picture of what extra might be needed to make this work. Generally, in Hadoop and HBase, the client must have valid credentials to submit a job, then the notion of delegation tokens is used for further communication since the servers do not have access to the client's sensitive information. A centralized service manages creation of a delegation token, which is a record that contains certain information (such as the submitting user name) necessary to securely identify the holder of the delegation token. The general idea is that we would need to build support into the master to manage delegation tokens for node managers to acquire and use to run jobs. Hadoop and HBase both contain code which implements this general idea, but we will need to apply it to Accumulo and verify that M/R jobs still work in a kerberized environment.
[jira] [Commented] (ACCUMULO-3513) Ensure MapReduce functionality with Kerberos enabled
[ https://issues.apache.org/jira/browse/ACCUMULO-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14305797#comment-14305797 ] Josh Elser commented on ACCUMULO-3513: -- I haven't really read up about DIGEST-MD5. I'll have to look into that and see if there's anything better we can use with SASL. bq The individual MapReduce nodes do not have Kerberos principals at all? How do they authenticate to the job controller? Delegation tokens. bq. you have to talk to the TServer which issued it This would require us have clients hold onto N delegation tokens though. That'd make the client implementation much more difficult than a singular delegation token that any node in the instance can verify. bq. If you use a single shared key, you really don't need leader election (because they all have the secret and perform the same function) You need the coordination to roll new secret keys. Using the same secret key for months (assuming average uptime of a cluster) is just asking for attacks. bq. I'm very curious precisely how you are generating these delegation tokens, though. I could be on a completely separate page regarding that and your suggestion for leader elections. Code will speak better than I can: https://github.com/joshelser/accumulo/tree/delegation-tokens/server/base/src/main/java/org/apache/accumulo/server/security/delegation. I just finished this up, I think. Each Master and Tserver has a SecretManager implementation. The Master (or more generally, whoever is creating the secret keys), also runs the KeyManager which generates a new secret key every $timelength. That process also uses the KeyDistributor to add secret keys to ZK (for all of the followers). The followers (tservers) use the KeyWatcher to see changes made by the KeyDistributor and update their SecretManager. In general, the SecretManager is a local cache off of ZooKeeper which can generate/verify the passwords in delegation tokens. 
No mechanisms yet exist to ensure that all followers/tservers have seen a new secret key.
[jira] [Commented] (ACCUMULO-3542) Native tarball doesn't have a NOTICE
[ https://issues.apache.org/jira/browse/ACCUMULO-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305736#comment-14305736 ] Billie Rinaldi commented on ACCUMULO-3542: -- http://www.apache.org/dev/release.html#distribute-other-artifacts A default ASF NOTICE would be fine. Our jars already include default LICENSE and NOTICE files in their META-INF directories. The jars and the native tarball are also distributed individually through Maven. I suppose we could decide that Maven doesn't count for distribution of release artifacts; but I think we might as well just add the NOTICE.
[jira] [Resolved] (ACCUMULO-3549) tablet server location cache may grow too large
[ https://issues.apache.org/jira/browse/ACCUMULO-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs resolved ACCUMULO-3549. - Resolution: Fixed Fix Version/s: 1.6.2 tablet server location cache may grow too large --- Key: ACCUMULO-3549 URL: https://issues.apache.org/jira/browse/ACCUMULO-3549 Project: Accumulo Issue Type: Bug Components: tserver Reporter: Eric Newton Assignee: Eric Newton Priority: Minor Fix For: 1.6.2, 1.7.0 Time Spent: 50m Remaining Estimate: 0h Now that we're no longer clearing the location cache in the tablet server, it could grow without bound. It should be cleared, either by time, or using an LRU. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
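The issue description above suggests bounding the location cache either by time or with an LRU. The following is a minimal sketch of the LRU option using only the JDK; it is not the fix that was committed for this issue, and the class name `LruLocationCache` and its size limit are hypothetical. It relies on `LinkedHashMap` in access order plus `removeEldestEntry` to evict the least recently used entry once the limit is exceeded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical size-bounded LRU cache; not the actual Accumulo implementation. */
class LruLocationCache<K, V> extends LinkedHashMap<K, V> {
  private final int maxEntries;

  LruLocationCache(int maxEntries) {
    // accessOrder=true: iteration order runs from least to most recently accessed
    super(16, 0.75f, true);
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Called after each insertion; returning true evicts the least recently used entry
    return size() > maxEntries;
  }
}
```

`LinkedHashMap` is not thread-safe, so a cache shared between threads would need external synchronization, for example by wrapping it with `Collections.synchronizedMap`.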
[jira] [Commented] (ACCUMULO-3513) Ensure MapReduce functionality with Kerberos enabled
[ https://issues.apache.org/jira/browse/ACCUMULO-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14305989#comment-14305989 ] Christopher Tubbs commented on ACCUMULO-3513: - {quote}YARN processes have kerberos principals and credentials, but the tasks they spawn do not.{quote} Oh. Interesting. So, the YARN process can securely authenticate itself with the job controller (NodeManager? I'm not sure terminology here) before a job is submitted, but the task doesn't have access to that. How do they prevent the tasks from getting access to the parent process' Kerberos keytab? How are these tasks sandboxed? Could our Input/OutputFormat be configured to access this keytab? I guess you might not want to do that if you don't trust the job which was submitted, but I'm not sure how we (Accumulo services) can trust that the request is coming from a trusted YARN service, and not some other party which maliciously gained access to a client's delegation token. {quote}This would require us have clients hold onto N delegation tokens though.{quote} No, there'd still only be one delegation token in play, but whoever generated it might change. I'm suggesting instead of a global, fixed leader involving coordination, a random leader is selected for each delegation token. {quote}You need the coordination to roll new secret keys. Using the same secret key for months (assuming average uptime of a cluster) is just asking for attacks.{quote} That's not what I was suggesting. I was suggesting eliminating the need to coordinate between servers by making one server responsible for each token (corresponding to a temporary key stored within that tserver). {quote}Code will speak better than I can:...{quote} Cool. Will take a look. 
[jira] [Commented] (ACCUMULO-3549) tablet server location cache may grow too large
[ https://issues.apache.org/jira/browse/ACCUMULO-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305913#comment-14305913 ] Corey Nolet commented on ACCUMULO-3549: --- So we're comfortable with this change? Should I start cutting RC4?
[jira] [Commented] (ACCUMULO-3513) Ensure MapReduce functionality with Kerberos enabled
[ https://issues.apache.org/jira/browse/ACCUMULO-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306004#comment-14306004 ] Josh Elser commented on ACCUMULO-3513: -- bq. Oh. Interesting. So, the YARN process can securely authenticate itself with the job controller (NodeManager? I'm not sure terminology here) before a job is submitted, but the task doesn't have access to that. ResourceManager, but yes, I think you have the point. bq. How do they prevent the tasks from getting access to the parent process' Kerberos keytab? It's an entirely new process, so there's no shared memory. Keytabs on disk should be protected by the filesystem. bq. How are these tasks sandboxed? A little C program is executed by the nodemanager which does your normal fork(), drops permissions on the child process, and runs the actual YARN task. bq. Could our Input/OutputFormat be configured to access this keytab? No, for the above reason -- we cannot read it. If it were generally readable, anyone could impersonate the YARN processes. bq. I guess you might not want to do that if you don't trust the job which was submitted, but I'm not sure how we (Accumulo services) can trust that the request is coming from a trusted YARN service, and not some other party which maliciously gained access to a client's delegation token. Like any password, it's expected that the delegation token is protected from prying eyes. The time limit on the validity of the delegation token helps mitigate some concern, but that's a very small mitigation. We ultimately need to rely on YARN (which it is doing) to keep the delegation token safe from the time it leaves the client's possession until it makes its way to the actual YARN task.
[jira] [Updated] (ACCUMULO-3514) Use Java ServiceLoader to identify classes that are launchable by start
[ https://issues.apache.org/jira/browse/ACCUMULO-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs updated ACCUMULO-3514: Resolution: Fixed Status: Resolved (was: Patch Available) Use Java ServiceLoader to identify classes that are launchable by start --- Key: ACCUMULO-3514 URL: https://issues.apache.org/jira/browse/ACCUMULO-3514 Project: Accumulo Issue Type: Improvement Components: start Reporter: Christopher Tubbs Assignee: Christopher Tubbs Fix For: 1.7.0 Time Spent: 10m Remaining Estimate: 0h This is similar to the goals of ACCUMULO-1496, but it should be possible to do it more efficiently with an annotation processor and Java's ServiceLoader. If successful, this would supersede ACCUMULO-1496. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
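A rough sketch of the `ServiceLoader` lookup mentioned in the description above, under stated assumptions: the interface name `Launchable`, its two methods, and the class `LaunchableSketch` are hypothetical and not taken from the Accumulo source. The annotation processor mentioned in the issue would presumably generate the `META-INF/services` provider file that `ServiceLoader` reads; that part is not shown here.

```java
import java.util.Optional;
import java.util.ServiceLoader;

/** Hypothetical sketch of keyword-based launching; names are illustrative only. */
class LaunchableSketch {

  /** Hypothetical service interface that each launchable class would implement. */
  public interface Launchable {
    String keyword();

    void run(String[] args) throws Exception;
  }

  /** Returns the first candidate whose keyword matches, if any. */
  static Optional<Launchable> find(String keyword, Iterable<Launchable> candidates) {
    for (Launchable candidate : candidates) {
      if (candidate.keyword().equals(keyword)) {
        return Optional.of(candidate);
      }
    }
    return Optional.empty();
  }

  /** Looks up providers registered under META-INF/services on the classpath. */
  static Optional<Launchable> find(String keyword) {
    return find(keyword, ServiceLoader.load(Launchable.class));
  }
}
```

The appeal of this approach is that the start module only needs to know the interface; implementations in other modules are discovered at runtime instead of being referenced by class-name strings.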
[jira] [Updated] (ACCUMULO-1844) Tests to ensure that Main's classnames exist
[ https://issues.apache.org/jira/browse/ACCUMULO-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs updated ACCUMULO-1844: Resolution: Fixed Status: Resolved (was: Patch Available) Tests to ensure that Main's classnames exist Key: ACCUMULO-1844 URL: https://issues.apache.org/jira/browse/ACCUMULO-1844 Project: Accumulo Issue Type: Improvement Components: start, test Reporter: Josh Elser Assignee: Christopher Tubbs Labels: newbie Fix For: 1.7.0 Time Spent: 10m Remaining Estimate: 0h The Main class, in accumulo-start, references a number of classes to handle the command line arguments to the accumulo shell script. However, since start can't depend on the other modules, it must use String representations of these classes. This is very brittle and, if those classes move, we have no knowledge of this breakage until runtime. We should have something later in the build that will enumerate the arguments to Main to ensure that we don't get a NoClassDefFound error at runtime due to a moved class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
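The check described above (confirming that class names held as strings still resolve) could look roughly like the sketch below. The class `ClassNameCheck` and the method `findMissing` are hypothetical names, and the list of class names to check is supplied by the caller rather than read from Main. Each name is resolved with `Class.forName` without initializing the class, and the names that fail to load are collected.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for a build-time test; not part of Accumulo. */
class ClassNameCheck {

  /** Returns the names that cannot be loaded from this class's class loader. */
  static List<String> findMissing(List<String> classNames) {
    List<String> missing = new ArrayList<>();
    ClassLoader loader = ClassNameCheck.class.getClassLoader();
    for (String name : classNames) {
      try {
        // initialize=false: only check that the class resolves, do not run static initializers
        Class.forName(name, false, loader);
      } catch (ClassNotFoundException | LinkageError e) {
        missing.add(name);
      }
    }
    return missing;
  }
}
```

A unit test placed in a module that depends on all of the referenced modules could assert that `findMissing` returns an empty list, so a moved class fails the build instead of surfacing at runtime.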
[jira] [Updated] (ACCUMULO-3420) Get Visibility Metrics from PrintInfo
[ https://issues.apache.org/jira/browse/ACCUMULO-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs updated ACCUMULO-3420: Status: Patch Available (was: Open) Get Visibility Metrics from PrintInfo - Key: ACCUMULO-3420 URL: https://issues.apache.org/jira/browse/ACCUMULO-3420 Project: Accumulo Issue Type: Improvement Reporter: Jenna Huston Assignee: Jenna Huston Fix For: 1.7.0 Add the ability to print visibility metrics such as densities of visibilities by locality group and the number of blocks that visibility is in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ACCUMULO-3513) Ensure MapReduce functionality with Kerberos enabled
[ https://issues.apache.org/jira/browse/ACCUMULO-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306073#comment-14306073 ] Josh Elser commented on ACCUMULO-3513: -- bq. What user does the task run as? If the effective UID is the same as its parent, the filesystem won't protect it. Pretty sure I covered this already: the YARN tasks run as the user who submitted the job. This requires that your user exists across your YARN node managers. Thus, it is not the same effective UID, it's an entirely different one. bq. If only the ResourceManager and the client could authenticate with Accumulo first Why does the resource manager need to authenticate with Accumulo? The user needs to trust that the YARN cluster they're talking to is real (and not some third party that is somehow masquerading as a YARN cluster). If a user is just submitting their credentials to anyone who listens, the problem is with that user and not something we can solve with Accumulo. bq. MapReduce needs to avoid granting access to its credentials from an untrusted client (which Accumulo does trust) I'm not sure I understand what you mean here: no user code is being run with YARN's credentials. YARN tasks could be run by users who don't have Accumulo accounts, but being able to run a YARN job doesn't mean they can authenticate with Accumulo (that requires a delegation token that was obtained with real credentials).
[jira] [Commented] (ACCUMULO-3513) Ensure MapReduce functionality with Kerberos enabled
[ https://issues.apache.org/jira/browse/ACCUMULO-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306061#comment-14306061 ] Christopher Tubbs commented on ACCUMULO-3513: - {quote}Keytabs on disk should be protected by the filesystem. ... A little C program ... drops permissions ...{quote} What user does the task run as? If the effective UID is the same as its parent, the filesystem won't protect it. {quote}... it's expected that the delegation token is protected from prying eyes ...{quote} There seems to be a trade-off here, with competing goals. On the one hand, we need to make sure Accumulo doesn't give up data to an untrusted middle-man. And, on the other hand, MapReduce needs to avoid granting access to its credentials from an untrusted client (which Accumulo *does* trust). If only the ResourceManager *and* the client could authenticate with Accumulo first, then we could carry information about both of these things in the token used to authenticate to Accumulo in the actual task, then we could trust the middle-man (YARN task) *and* the client to be able to receive the data from Accumulo. Ensure MapReduce functionality with Kerberos enabled Key: ACCUMULO-3513 URL: https://issues.apache.org/jira/browse/ACCUMULO-3513 Project: Accumulo Issue Type: Bug Components: client Reporter: Josh Elser Assignee: Josh Elser Priority: Blocker Fix For: 1.7.0 Attachments: ACCUMULO-3513-design.pdf I talked to [~devaraj] today about MapReduce support running on secure Hadoop to help get a picture about what extra might be needed to make this work. Generally, in Hadoop and HBase, the client must have valid credentials to submit a job, then the notion of delegation tokens is used by for further communication since the servers do not have access to the client's sensitive information. 
A centralized service manages creation of a delegation token, which is a record containing the information (such as the submitting user name) necessary to securely identify the holder of the delegation token. The general idea is that we would need to build support into the master to manage delegation tokens for node managers to acquire and use to run jobs. Hadoop and HBase both contain code which implements this general idea, but we will need to apply it to Accumulo and verify that M/R jobs still work in a kerberized environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
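To make the delegation-token idea in the description above concrete, here is a minimal sketch of a token as "an identifier plus a keyed hash of that identifier". Everything in it is an assumption made for illustration: the class name, the identifier layout, the master key, and the choice of HmacSHA256 are hypothetical and are not taken from the Accumulo design document or from the Hadoop/HBase code. It only shows why a task holding the token can be identified by the issuing service without ever seeing the client's Kerberos credentials.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch of a delegation token: identifier + HMAC "password". */
final class DelegationTokenSketch {
  // Assumed algorithm for this sketch only.
  private static final String ALGORITHM = "HmacSHA256";

  /** The issuing service signs the identifier with a key only it knows. */
  static byte[] sign(byte[] masterKey, String identifier) {
    try {
      Mac mac = Mac.getInstance(ALGORITHM);
      mac.init(new SecretKeySpec(masterKey, ALGORITHM));
      return mac.doFinal(identifier.getBytes(StandardCharsets.UTF_8));
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("HMAC unavailable", e);
    }
  }

  /** The service recomputes the HMAC and compares it in constant time. */
  static boolean verify(byte[] masterKey, String identifier, byte[] password) {
    return MessageDigest.isEqual(sign(masterKey, identifier), password);
  }

  public static void main(String[] args) {
    byte[] masterKey = "hypothetical-master-key".getBytes(StandardCharsets.UTF_8);
    // Hypothetical identifier layout: submitting user plus validity window.
    String identifier = "owner=alice;issued=1423000000;expires=1423086400";
    byte[] password = sign(masterKey, identifier);

    // The task presents (identifier, password); a tampered identifier fails.
    System.out.println(verify(masterKey, identifier, password));
    System.out.println(verify(masterKey, identifier.replace("alice", "mallory"), password));
  }
}
```

Note that anyone who obtains the identifier and password pair can act as the holder until the token expires, which is why the thread keeps returning to how the token is protected on the node.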
[jira] [Updated] (ACCUMULO-3542) Native tarball doesn't have a NOTICE
[ https://issues.apache.org/jira/browse/ACCUMULO-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs updated ACCUMULO-3542: Fix Version/s: 1.7.0 Native tarball doesn't have a NOTICE Key: ACCUMULO-3542 URL: https://issues.apache.org/jira/browse/ACCUMULO-3542 Project: Accumulo Issue Type: Bug Affects Versions: 1.6.1 Reporter: Billie Rinaldi Fix For: 1.7.0, 1.6.3 Do we build any other tarballs we should check? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ACCUMULO-3557) No write ACL set on /accumulo/instances/...
[ https://issues.apache.org/jira/browse/ACCUMULO-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306268#comment-14306268 ] Christopher Tubbs commented on ACCUMULO-3557: - Restricting write access would be difficult, since the whole point of this name-to-id mapping paradigm was so somebody could easily re-initialize an instance with the same name, without clobbering an existing instance. And, the new instance could have a different instance secret, so what ACL do we use? Personally, I find this to be quite frustrating, and I would prefer to eliminate this mapping entirely. I'd rather make the instance name the unique instance identifier ({{/accumulo/instanceName/}}) and eliminate the separate instance id. That would make this issue go away, because it'd be clear that the ACL to use would be that of the one (and only) instance uniquely identified by that name. We could still have a unique id, to distinguish between two instances of the same name, and even to find instances by id, but eliminating this mapping would mean that the id could just be a child attribute of the znode instead ({{/accumulo/instanceName/id}}). FWIW, Accumulo services themselves don't use the instanceName to look up the instance. The risks here are how it affects clients. No write ACL set on /accumulo/instances/... --- Key: ACCUMULO-3557 URL: https://issues.apache.org/jira/browse/ACCUMULO-3557 Project: Accumulo Issue Type: Improvement Components: zookeeper Reporter: Josh Elser Priority: Critical Fix For: 1.7.0 It's common for users to have four arguments to make a connection to Accumulo: zookeeper quorum string, instance name, username and password. The instance name is used to find the instanceID using {{/accumulo/instances/...}} in ZooKeeper. It appears that anyone can write in the {{/accumulo/instances}} ZNode.
This seems suspect, because any unauthenticated user can alter the state of ZooKeeper and break users connecting to Accumulo or force them to connect to a different Accumulo instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ACCUMULO-3549) tablet server location cache may grow too large
[ https://issues.apache.org/jira/browse/ACCUMULO-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306270#comment-14306270 ] Eric Newton commented on ACCUMULO-3549: --- Someone besides me should chime in, but yes. :-) tablet server location cache may grow too large --- Key: ACCUMULO-3549 URL: https://issues.apache.org/jira/browse/ACCUMULO-3549 Project: Accumulo Issue Type: Bug Components: tserver Reporter: Eric Newton Assignee: Eric Newton Priority: Minor Fix For: 1.6.2, 1.7.0 Time Spent: 50m Remaining Estimate: 0h Now that we're no longer clearing the location cache in the tablet server, it could grow without bound. It should be cleared, either by time, or using an LRU. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
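As an illustration of the LRU option mentioned in the ACCUMULO-3549 description, a bounded cache can be sketched with the JDK's `LinkedHashMap` in access order. This is only a sketch under assumptions: the class name, the key and value types, and the capacity are hypothetical, and it is not the change that was actually made in the tablet server.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded LRU cache; not Accumulo's actual location cache. */
final class LruLocationCache<K, V> extends LinkedHashMap<K, V> {
  private static final long serialVersionUID = 1L;
  private final int maxEntries;

  LruLocationCache(int maxEntries) {
    // accessOrder = true: iteration order runs from least to most recently used.
    super(16, 0.75f, true);
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Called after each insertion; evict the least recently used entry when over capacity.
    return size() > maxEntries;
  }

  public static void main(String[] args) {
    LruLocationCache<String, String> cache = new LruLocationCache<>(2);
    cache.put("tablet-a", "tserver1:9997");
    cache.put("tablet-b", "tserver2:9997");
    cache.get("tablet-a");                  // marks tablet-a as recently used
    cache.put("tablet-c", "tserver3:9997"); // evicts tablet-b, the least recently used
    System.out.println(cache.keySet());     // prints [tablet-a, tablet-c]
  }
}
```

`LinkedHashMap` is not thread-safe, so a server-side cache built this way would need external synchronization (for example `Collections.synchronizedMap`) or a concurrent cache library. Time-based expiry, the other option raised in the issue, would need a timestamp per entry and is not shown here.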
[jira] [Commented] (ACCUMULO-3557) No write ACL set on /accumulo/instances/...
[ https://issues.apache.org/jira/browse/ACCUMULO-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306285#comment-14306285 ] Josh Elser commented on ACCUMULO-3557: -- bq. Restricting write access would be difficult, since the whole point of this name-to-id mapping paradigm was so somebody could easily re-initialize an instance with the same name, without clobbering an existing instance. In practice, how often is this actually used? I don't think I've ever done this in the years I've used Accumulo. I only see re-init'ing done when I'm destroying the instance. bq. And, the new instance could have a different instance secret, so what ACL do we use? Obviously, we would want to use the current instance.secret (since that's what the rest of the instance nodes would be acl'ed with). The problem would be knowing what the old instance.secret was to remove the old znode. The first thought I have is to have a password prompt when we get an AuthFailed removing the old ZNode. The administrator running the reinitialization can enter whatever the old secret was. bq. The risks here are how it affects clients. Yes, that was my original concern: users being redirected to a different Accumulo instance, maliciously or otherwise. Anyone has the ability to change these nodes, which means that anyone with access to the system can prevent users from normally accessing it by just nuking this node. No write ACL set on /accumulo/instances/... --- Key: ACCUMULO-3557 URL: https://issues.apache.org/jira/browse/ACCUMULO-3557 Project: Accumulo Issue Type: Improvement Components: zookeeper Reporter: Josh Elser Priority: Critical Fix For: 1.7.0 It's common for users to have four arguments to make a connection to Accumulo: zookeeper quorum string, instance name, username and password. The instance name is used to find the instanceID using {{/accumulo/instances/...}} in ZooKeeper. It appears that anyone can write in the {{/accumulo/instances}} ZNode.
This seems suspect, because any unauthenticated user can alter the state of ZooKeeper and break users connecting to Accumulo or force them to connect to a different Accumulo instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
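The "what ACL do we use?" question above comes down to the fact that a ZooKeeper `digest` ACL id is derived from a user name and a secret, so a re-initialized instance with a different secret gets a different id and cannot modify a znode protected with the old one. The sketch below reproduces that derivation with the JDK only, assuming the default scheme of `user:base64(SHA-1("user:secret"))`. The user name `accumulo` and both secrets are assumptions for illustration, and the class is hypothetical rather than Accumulo or ZooKeeper code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Hypothetical helper that mimics how a ZooKeeper digest ACL id is derived. */
final class ZkDigestAclSketch {

  /** Returns user + ":" + base64(SHA-1(user + ":" + secret)). */
  static String digestId(String user, String secret) {
    String idPassword = user + ":" + secret;
    try {
      MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
      byte[] hash = sha1.digest(idPassword.getBytes(StandardCharsets.UTF_8));
      return user + ":" + Base64.getEncoder().encodeToString(hash);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-1 not available", e);
    }
  }

  public static void main(String[] args) {
    // Assumed user name and secrets, for illustration only.
    String oldId = digestId("accumulo", "old-instance-secret");
    String newId = digestId("accumulo", "new-instance-secret");

    // A znode restricted to oldId cannot be removed by a client that only
    // knows the new secret, which is the re-initialization problem discussed above.
    System.out.println(oldId);
    System.out.println(newId);
    System.out.println(oldId.equals(newId)); // prints false
  }
}
```

This is why the proposals in the thread involve either prompting the administrator for the old secret or changing the znode layout so that only one instance ever owns a given name.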
[jira] [Updated] (ACCUMULO-3550) Add checkstyle rule to prevent empty javadocs
[ https://issues.apache.org/jira/browse/ACCUMULO-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs updated ACCUMULO-3550: Issue Type: Improvement (was: Bug) Add checkstyle rule to prevent empty javadocs - Key: ACCUMULO-3550 URL: https://issues.apache.org/jira/browse/ACCUMULO-3550 Project: Accumulo Issue Type: Improvement Reporter: Mike Drob Labels: qa -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (ACCUMULO-3550) Add checkstyle rule to prevent empty javadocs
[ https://issues.apache.org/jira/browse/ACCUMULO-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs updated ACCUMULO-3550: Labels: qa (was: ) Add checkstyle rule to prevent empty javadocs - Key: ACCUMULO-3550 URL: https://issues.apache.org/jira/browse/ACCUMULO-3550 Project: Accumulo Issue Type: Improvement Reporter: Mike Drob Labels: qa -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ACCUMULO-3550) Add checkstyle rule to prevent empty javadocs
[ https://issues.apache.org/jira/browse/ACCUMULO-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306284#comment-14306284 ] Christopher Tubbs commented on ACCUMULO-3550: - [JavadocStyle|http://checkstyle.sourceforge.net/config_javadoc.html] will do this, but it will also trigger on javadocs with only tags and no description. That's probably okay, but I'm sure we violate this *a lot*. Add checkstyle rule to prevent empty javadocs - Key: ACCUMULO-3550 URL: https://issues.apache.org/jira/browse/ACCUMULO-3550 Project: Accumulo Issue Type: Bug Reporter: Mike Drob Labels: qa -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ACCUMULO-3513) Ensure MapReduce functionality with Kerberos enabled
[ https://issues.apache.org/jira/browse/ACCUMULO-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306160#comment-14306160 ] Josh Elser commented on ACCUMULO-3513: -- bq. Ugh, I missed that. Sorry. Now you need to grant YARN setuid privileges. That's... unfortunate. I suppose you also have to make assumptions about which UID you need to use, based on the content of the delegation token, too, and I guess there's no guarantee that this will even be the same on every node, or match the submitter's UID. (Though, presumably, they will all be the same if using some common login service, like AD on all the nodes.) Yes, it is a pain to get YARN set up in secure mode (notably the setuid stuff), but what you need to do is well documented. It's also a stated YARN assumption that the user must exist on every node. Ensure MapReduce functionality with Kerberos enabled Key: ACCUMULO-3513 URL: https://issues.apache.org/jira/browse/ACCUMULO-3513 Project: Accumulo Issue Type: Bug Components: client Reporter: Josh Elser Assignee: Josh Elser Priority: Blocker Fix For: 1.7.0 Attachments: ACCUMULO-3513-design.pdf I talked to [~devaraj] today about MapReduce support running on secure Hadoop to get a picture of what extra work might be needed to make this work. Generally, in Hadoop and HBase, the client must have valid credentials to submit a job; delegation tokens are then used for further communication, since the servers do not have access to the client's sensitive information. A centralized service manages creation of a delegation token, which is a record containing the information (such as the submitting user name) necessary to securely identify the holder of the delegation token. The general idea is that we would need to build support into the master to manage delegation tokens for node managers to acquire and use to run jobs.
Hadoop and HBase both contain code which implements this general idea, but we will need to apply it to Accumulo and verify that M/R jobs still work in a kerberized environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ACCUMULO-3565) Encourage use of voting on JIRA
[ https://issues.apache.org/jira/browse/ACCUMULO-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14305200#comment-14305200 ] Billie Rinaldi commented on ACCUMULO-3565: -- FYI, the Accumulo Interest dashboard shows two lists of tickets: open tickets with patches sorted by update time and open tickets sorted by number of watchers (with a number of votes column you can sort on). There's also an Accumulo dashboard that has a few useful widgets. Encourage use of voting on JIRA --- Key: ACCUMULO-3565 URL: https://issues.apache.org/jira/browse/ACCUMULO-3565 Project: Accumulo Issue Type: Sub-task Reporter: Josh Elser We have lots of issues, many of which tend to fall by the wayside. If we encourage committers/contributors to use the vote feature, we could create a dashboard that highlights the most highly-voted on issues. This will help show where interest lies with Accumulo and help devs or new users find meaningful issues to work on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (ACCUMULO-3535) Transition plan: remove instance.dfs.{uri,dir} to instance.volumes
[ https://issues.apache.org/jira/browse/ACCUMULO-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs updated ACCUMULO-3535: Attachment: Whiteboard_Volumes_Workflow.jpg Transition plan: remove instance.dfs.{uri,dir} to instance.volumes -- Key: ACCUMULO-3535 URL: https://issues.apache.org/jira/browse/ACCUMULO-3535 Project: Accumulo Issue Type: Task Reporter: Christopher Tubbs Fix For: 1.7.0 Attachments: Volumes Workflow.png, Whiteboard_Volumes_Workflow.jpg This is a parent issue for several related sub-tasks, to handle the transition from the old and deprecated {{instance.dfs.uri}} and {{instance.dfs.dir}} properties to the new {{instance.volumes}} property. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ACCUMULO-3542) Native tarball doesn't have a NOTICE
[ https://issues.apache.org/jira/browse/ACCUMULO-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306227#comment-14306227 ] Christopher Tubbs commented on ACCUMULO-3542: - Never mind, I found the answer to my own question. Apparently, it's the [maven-remote-resources-plugin|http://maven.apache.org/apache-resource-bundles/]. Native tarball doesn't have a NOTICE Key: ACCUMULO-3542 URL: https://issues.apache.org/jira/browse/ACCUMULO-3542 Project: Accumulo Issue Type: Bug Affects Versions: 1.6.1 Reporter: Billie Rinaldi Fix For: 1.7.0, 1.6.3 Do we build any other tarballs we should check? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ACCUMULO-3557) No write ACL set on /accumulo/instances/...
[ https://issues.apache.org/jira/browse/ACCUMULO-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306583#comment-14306583 ] Christopher Tubbs commented on ACCUMULO-3557: - bq. In practice, how often is this actually used? It *may* be useful for testing. However, I think it would be *more* useful to simply say something like "An instance was already created with that name. If you'd like to overwrite it, enter its name here to confirm:" in order to conveniently remove the old one. No write ACL set on /accumulo/instances/... --- Key: ACCUMULO-3557 URL: https://issues.apache.org/jira/browse/ACCUMULO-3557 Project: Accumulo Issue Type: Improvement Components: zookeeper Reporter: Josh Elser Priority: Critical Fix For: 1.7.0 It's common for users to have four arguments to make a connection to Accumulo: zookeeper quorum string, instance name, username and password. The instance name is used to find the instanceID using {{/accumulo/instances/...}} in ZooKeeper. It appears that anyone can write in the {{/accumulo/instances}} ZNode. This seems suspect, because any unauthenticated user can alter the state of ZooKeeper and break users connecting to Accumulo or force them to connect to a different Accumulo instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)