[jira] [Created] (HADOOP-16148) Cleanup LineReader Unit Test

2019-02-26 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-16148:


 Summary: Cleanup LineReader Unit Test
 Key: HADOOP-16148
 URL: https://issues.apache.org/jira/browse/HADOOP-16148
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 3.2.0
Reporter: BELUGA BEHR


I was trying to track down a bug and thought it might be coming from the 
{{LineReader}} class.  It wasn't.  However, I did clean up the unit test for 
this class a bit.  I figured I might as well at least post the diff file here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16073) Use JDK1.7 StandardCharset

2019-01-25 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-16073:


 Summary: Use JDK1.7 StandardCharset
 Key: HADOOP-16073
 URL: https://issues.apache.org/jira/browse/HADOOP-16073
 Project: Hadoop Common
  Issue Type: Improvement
  Components: streaming, tools
Affects Versions: 3.2.0
Reporter: BELUGA BEHR


Use the Java 1.7 
[StandardCharsets|https://docs.oracle.com/javase/7/docs/api/java/nio/charset/StandardCharsets.html] 
constants instead of charset names.  Every implementation of the Java platform 
is required to support these standard charsets.
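
As a rough illustration of the change being requested, the sketch below encodes and decodes a string with a {{StandardCharsets}} constant. The class and method names are made up for this example and are not taken from the Hadoop code base. The name-based overloads such as {{getBytes("UTF-8")}} declare a checked {{UnsupportedEncodingException}}; the {{Charset}}-based overloads do not.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical example class, not part of Hadoop.
class CharsetExample {
  // Encode to UTF-8 bytes and decode again using the charset constant,
  // so no charset-name lookup and no checked exception is involved.
  static String roundTrip(String text) {
    byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
    return new String(encoded, StandardCharsets.UTF_8);
  }

  // Number of bytes the text occupies when encoded as UTF-8.
  static int utf8Length(String text) {
    return text.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("h\u00e9llo"));
    System.out.println(utf8Length("h\u00e9llo"));
  }
}
```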






[jira] [Created] (HADOOP-16067) Invalid Debug Statement KMSACLs

2019-01-23 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-16067:


 Summary: Invalid Debug Statement KMSACLs
 Key: HADOOP-16067
 URL: https://issues.apache.org/jira/browse/HADOOP-16067
 Project: Hadoop Common
  Issue Type: Bug
  Components: kms
Affects Versions: 3.2.0
Reporter: BELUGA BEHR


{code:java}
  if (LOG.isDebugEnabled()) {
    LOG.debug("Checking user [{}] for: {}: {}" + ugi.getShortUserName(),
        opType.toString(), acl.getAclString());
  }
{code}
The logging message here is incorrect because the first variable is being 
concatenated to the string instead of being passed as an argument.
{code:java}
-- Notice the user name 'hdfs' at the end and the spare curly brackets
2019-01-23 13:27:45,244 DEBUG org.apache.hadoop.crypto.key.kms.server.KMSACLs: 
Checking user [GENERATE_EEK] for: hdfs supergroup: {}hdfs
{code}
[https://github.com/apache/hadoop/blob/a55d6bba71c81c1c4e9d8cd11f55c78f10a548b0/hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSACLs.java#L313-L316]
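
To make the effect easier to see, here is a small self-contained sketch. The {{format}} helper is a hypothetical stand-in for SLF4J-style "{}" substitution (it is not the real logging library), and the class and method names are invented for this example. With the values from the log line above, the concatenated form reproduces the broken output, while passing the user name as the first argument fills all three placeholders.

```java
// Hypothetical example class, not part of Hadoop or SLF4J.
class PlaceholderDemo {
  // Simplified stand-in for "{}" placeholder substitution, for illustration only.
  static String format(String pattern, Object... args) {
    StringBuilder out = new StringBuilder();
    int from = 0;
    for (Object arg : args) {
      int at = pattern.indexOf("{}", from);
      if (at < 0) {
        break; // more arguments than placeholders
      }
      out.append(pattern, from, at).append(arg);
      from = at + 2;
    }
    return out.append(pattern.substring(from)).toString();
  }

  // Mirrors the reported statement: the user name is concatenated onto the pattern.
  static String buggy(String user, String opType, String acl) {
    return format("Checking user [{}] for: {}: {}" + user, opType, acl);
  }

  // Presumed intent: the user name is passed as the first argument.
  static String fixed(String user, String opType, String acl) {
    return format("Checking user [{}] for: {}: {}", user, opType, acl);
  }

  public static void main(String[] args) {
    System.out.println(buggy("hdfs", "GENERATE_EEK", "hdfs supergroup"));
    System.out.println(fixed("hdfs", "GENERATE_EEK", "hdfs supergroup"));
  }
}
```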






[jira] [Created] (HADOOP-16022) Increase Compression Buffer Sizes - Remove Magic Numbers

2018-12-27 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-16022:


 Summary: Increase Compression Buffer Sizes - Remove Magic Numbers
 Key: HADOOP-16022
 URL: https://issues.apache.org/jira/browse/HADOOP-16022
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 2.10.0, 3.2.0
Reporter: BELUGA BEHR


{code:java|title=Compression.java}
// data input buffer size to absorb small reads from application.
private static final int DATA_IBUF_SIZE = 1 * 1024;
// data output buffer size to absorb small writes from application.
private static final int DATA_OBUF_SIZE = 4 * 1024;
{code}

The Compression code contains these hard-coded buffer sizes.  Instead, use the 
JVM default buffer sizes, which these days are usually 8K.
 






[jira] [Created] (HADOOP-15962) FileUtils Small Buffer Size

2018-11-30 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-15962:


 Summary: FileUtils Small Buffer Size
 Key: HADOOP-15962
 URL: https://issues.apache.org/jira/browse/HADOOP-15962
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 3.3.0
Reporter: BELUGA BEHR


Not sure if this code is even being used, but it implements a copy routine 
using a 2K buffer.  Modern JVMs use 8K, but 4K should be the minimum.  Also, 
there are libraries for this kind of copying.

{code:java|title=FileUtil.java}
int count;
byte data[] = new byte[2048];
try (BufferedOutputStream outputStream = new BufferedOutputStream(
    new FileOutputStream(outputFile));) {

  while ((count = tis.read(data)) != -1) {
    outputStream.write(data, 0, count);
  }

  outputStream.flush();
}
{code}
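
One possible way to avoid the hand-written loop above is to delegate to the JDK's {{java.nio.file.Files.copy(InputStream, Path, CopyOption...)}}, which has been available since Java 7. The sketch below is only an illustration of that idea, not the attached patch; the class name, method names and temp-file handling are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical example class, not part of Hadoop.
class StreamCopyExample {
  // Copy the whole stream to the target file; buffering is left to the JDK.
  static long copyToFile(InputStream in, Path target) throws IOException {
    return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
  }

  // Copies `size` zero bytes into a temp file and returns the number of bytes copied.
  static long demo(int size) {
    try {
      Path target = Files.createTempFile("copy-example", ".bin");
      try (InputStream in = new ByteArrayInputStream(new byte[size])) {
        return copyToFile(in, target);
      } finally {
        Files.deleteIfExists(target);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo(5000) + " bytes copied");
  }
}
```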

I also fixed a couple of Checkstyle warnings.






[jira] [Created] (HADOOP-15854) AuthToken Use StringBuilder instead of StringBuffer

2018-10-15 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-15854:


 Summary: AuthToken Use StringBuilder instead of StringBuffer
 Key: HADOOP-15854
 URL: https://issues.apache.org/jira/browse/HADOOP-15854
 Project: Hadoop Common
  Issue Type: Improvement
  Components: auth
Affects Versions: 3.2.0
Reporter: BELUGA BEHR
 Attachments: HADOOP-15854.1.patch

Use {{StringBuilder}} instead of {{StringBuffer}} because {{StringBuilder}} is 
not synchronized and so avoids locking overhead when the buffer is confined to 
a single thread.






[jira] [Created] (HADOOP-15852) QuotaUsage Review

2018-10-15 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-15852:


 Summary: QuotaUsage Review
 Key: HADOOP-15852
 URL: https://issues.apache.org/jira/browse/HADOOP-15852
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 3.2.0
Reporter: BELUGA BEHR
 Attachments: HADOOP-15852.1.patch

My new mission is to remove {{StringBuffer}}s in favor of {{StringBuilder}}.

* Simplify Code
* Use Eclipse to generate hashCode/equals
* Use StringBuilder instead of StringBuffer






[jira] [Created] (HADOOP-15836) Review of AccessControlList.java

2018-10-09 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-15836:


 Summary: Review of AccessControlList.java
 Key: HADOOP-15836
 URL: https://issues.apache.org/jira/browse/HADOOP-15836
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common, security
Affects Versions: 3.2.0
Reporter: BELUGA BEHR


* Improve unit tests (expected / actual were backwards)
* Fix unit tests that expected elements in order even though the class returns 
unordered Collections
* Clean up formatting
* Remove superfluous white space
* Remove use of LinkedList
* Remove superfluous code
* Use {{unmodifiable}} Collections where JavaDoc states that caller must not 
manipulate the data structure
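
The last item above can be sketched roughly as follows. This is a minimal illustration under assumed names ({{ReadOnlyViewExample}}, {{getUsers}}), not the actual {{AccessControlList}} code: the getter returns a read-only view, so a caller that tries to modify it gets an {{UnsupportedOperationException}} instead of silently changing internal state.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical example class, not the real AccessControlList.
class ReadOnlyViewExample {
  private final Set<String> users = new HashSet<>();

  void addUser(String user) {
    users.add(user);
  }

  // Callers receive a read-only view; the backing set stays private.
  Collection<String> getUsers() {
    return Collections.unmodifiableCollection(users);
  }

  // Returns true if modifying the returned collection is rejected.
  static boolean rejectsMutation() {
    ReadOnlyViewExample acl = new ReadOnlyViewExample();
    acl.addUser("alice");
    try {
      acl.getUsers().add("mallory");
      return false;
    } catch (UnsupportedOperationException expected) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("mutation rejected: " + rejectsMutation());
  }
}
```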






[jira] [Created] (HADOOP-15828) Review of MachineList class

2018-10-08 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-15828:


 Summary: Review of MachineList class
 Key: HADOOP-15828
 URL: https://issues.apache.org/jira/browse/HADOOP-15828
 Project: Hadoop Common
  Issue Type: Improvement
  Components: util
Affects Versions: 3.2.0
Reporter: BELUGA BEHR
 Attachments: HADOOP-15828.1.patch

Clean up and simplify class {{MachineList}}.  Primarily, remove the LinkedList 
implementation, use empty collections instead of 'null' values, and add logging.






[jira] [Created] (HADOOP-15760) Include Apache Commons Collections4

2018-09-17 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-15760:


 Summary: Include Apache Commons Collections4
 Key: HADOOP-15760
 URL: https://issues.apache.org/jira/browse/HADOOP-15760
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.0.3, 2.10.0
Reporter: BELUGA BEHR
 Attachments: HADOOP-15760.1.patch

Please allow for use of the Apache Commons Collections 4 library, with the end 
goal of migrating from Commons Collections 3.






[jira] [Created] (HADOOP-15377) Review of MetricsConfig.java

2018-04-10 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-15377:


 Summary: Review of MetricsConfig.java
 Key: HADOOP-15377
 URL: https://issues.apache.org/jira/browse/HADOOP-15377
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 3.0.2
Reporter: BELUGA BEHR
 Attachments: HADOOP-15377.1.patch

I recently enabled debug level logging in a MR application and was getting a 
lot of log lines from this class that were just blank, without context.  I've 
enhanced the log messages to include additional context and made a few other 
small cleanups while looking at this class.






[jira] [Created] (HADOOP-15362) Review of Configuration.java

2018-04-04 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-15362:


 Summary: Review of Configuration.java
 Key: HADOOP-15362
 URL: https://issues.apache.org/jira/browse/HADOOP-15362
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 3.0.0
Reporter: BELUGA BEHR


* Various improvements
* Fix a lot of Checkstyle errors






[jira] [Created] (HADOOP-15246) SpanReceiverInfo - Prefer ArrayList over LinkedList

2018-02-18 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-15246:


 Summary: SpanReceiverInfo - Prefer ArrayList over LinkedList
 Key: HADOOP-15246
 URL: https://issues.apache.org/jira/browse/HADOOP-15246
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 3.0.0
Reporter: BELUGA BEHR
 Attachments: HADOOP-15246.1.patch

Prefer the use of {{ArrayList}} over {{LinkedList}} for performance and memory 
usage.






[jira] [Resolved] (HADOOP-15147) SnappyCompressor Typo

2018-01-02 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR resolved HADOOP-15147.
--
Resolution: Not A Problem

It sure confused me! Sorry! :)

> SnappyCompressor Typo
> -
>
> Key: HADOOP-15147
> URL: https://issues.apache.org/jira/browse/HADOOP-15147
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Trivial
>
> {code:title=SnappyCompressor.java}
>   public boolean finished() {
> // Check if all uncompressed data has been consumed
> return (finish && finished && compressedDirectBuf.remaining() == 0);
>   }
> {code}
> Notice "finish" is checked twice






[jira] [Created] (HADOOP-15148) Improve DataOutputByteBuffer

2017-12-28 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-15148:


 Summary: Improve DataOutputByteBuffer
 Key: HADOOP-15148
 URL: https://issues.apache.org/jira/browse/HADOOP-15148
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 3.0.0
Reporter: BELUGA BEHR
Priority: Trivial


* Use ArrayDeque instead of LinkedList
* Replace an ArrayList that was being used as a queue with ArrayDeque
* Improve write single byte method to hard-code sizes and save time

{quote}
Resizable-array implementation of the Deque interface. Array deques have no 
capacity restrictions; they grow as necessary to support usage. They are not 
thread-safe; in the absence of external synchronization, they do not support 
concurrent access by multiple threads. Null elements are prohibited. This class 
is *likely to be* ... *faster than LinkedList when used as a queue.*
{quote}
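
For reference, a minimal sketch of using {{ArrayDeque}} as a FIFO queue through the {{Queue}} interface. The class and method names are invented for this example and do not come from {{DataOutputByteBuffer}}.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical example class, not part of Hadoop.
class ArrayDequeQueueExample {
  // Enqueue all items, then dequeue them; the result preserves insertion order.
  static String drainInOrder(String... items) {
    Queue<String> queue = new ArrayDeque<>();
    for (String item : items) {
      queue.offer(item); // add at the tail
    }
    StringBuilder out = new StringBuilder();
    while (!queue.isEmpty()) {
      out.append(queue.poll()); // remove from the head
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(drainInOrder("a", "b", "c"));
  }
}
```

Since null elements are prohibited, a {{LinkedList}} that stores nulls cannot be swapped for an {{ArrayDeque}} without further changes.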






[jira] [Created] (HADOOP-15147) SnappyCompressor Typo

2017-12-28 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-15147:


 Summary: SnappyCompressor Typo
 Key: HADOOP-15147
 URL: https://issues.apache.org/jira/browse/HADOOP-15147
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 3.0.0
Reporter: BELUGA BEHR
Priority: Trivial


{code:title=SnappyCompressor.java}
  public boolean finished() {
    // Check if all uncompressed data has been consumed
    return (finish && finished && compressedDirectBuf.remaining() == 0);
  }
{code}

Notice "finish" is checked twice






[jira] [Created] (HADOOP-15146) Remove DataOutputByteBuffer

2017-12-28 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-15146:


 Summary: Remove DataOutputByteBuffer
 Key: HADOOP-15146
 URL: https://issues.apache.org/jira/browse/HADOOP-15146
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 3.0.0
Reporter: BELUGA BEHR
Priority: Minor


I can't seem to find any references to {{DataOutputByteBuffer}}.  Maybe it should 
be deprecated or simply removed?






[jira] [Created] (HADOOP-14668) Remove Configurable Default Sequence File Compression Type

2017-07-18 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-14668:


 Summary: Remove Configurable Default Sequence File Compression Type
 Key: HADOOP-14668
 URL: https://issues.apache.org/jira/browse/HADOOP-14668
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 3.0.0-alpha3
Reporter: BELUGA BEHR
Priority: Trivial
 Fix For: 2.8.1


It is confusing to have two different ways to set the Sequence File compression 
type.

In a basic configuration, I can set 
_mapreduce.output.fileoutputformat.compress.type_ or 
_io.seqfile.compression.type_.  If I would like to set a cluster-wide default, I 
should do so by setting _mapreduce.output.fileoutputformat.compress.type_ in the 
cluster environment's mapred-site.xml file.

Please remove references to the magic string _io.seqfile.compression.type_, 
remove the {{setDefaultCompressionType}} method, and have 
{{getDefaultCompressionType}} return a value hard-coded to 
{{CompressionType.RECORD}}.  This will make administration easier, as I will only 
have to check one configuration setting.

{code:title=org.apache.hadoop.io.SequenceFile}
  /**
   * Get the compression type for the reduce outputs
   * @param job the job config to look in
   * @return the kind of compression to use
   */
  static public CompressionType getDefaultCompressionType(Configuration job) {
    String name = job.get("io.seqfile.compression.type");
    return name == null ? CompressionType.RECORD : 
      CompressionType.valueOf(name);
  }
  
  /**
   * Set the default compression type for sequence files.
   * @param job the configuration to modify
   * @param val the new compression type (none, block, record)
   */
  static public void setDefaultCompressionType(Configuration job, 
                                               CompressionType val) {
    job.set("io.seqfile.compression.type", val.toString());
  }
{code}
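
A rough sketch of the behaviour being asked for, under stated assumptions: the enum and the {{Map}} below are stand-ins for {{SequenceFile.CompressionType}} and {{Configuration}}, and the {{resolve}} method is hypothetical. The point is only that the default becomes a fixed value and a single key remains for overriding it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class; the enum and Map stand in for Hadoop types.
class SeqFileCompressionExample {
  enum CompressionType { NONE, RECORD, BLOCK }

  // The one remaining configuration key, as named in the description above.
  static final String COMPRESS_TYPE_KEY =
      "mapreduce.output.fileoutputformat.compress.type";

  // Proposed behaviour: the default is hard-coded and no longer configurable.
  static CompressionType getDefaultCompressionType() {
    return CompressionType.RECORD;
  }

  // Look at the single key and fall back to the fixed default.
  static CompressionType resolve(Map<String, String> conf) {
    String name = conf.get(COMPRESS_TYPE_KEY);
    return name == null ? getDefaultCompressionType()
                        : CompressionType.valueOf(name);
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    System.out.println(resolve(conf));
    conf.put(COMPRESS_TYPE_KEY, "BLOCK");
    System.out.println(resolve(conf));
  }
}
```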






[jira] [Created] (HADOOP-14525) org.apache.hadoop.io.Text Truncate

2017-06-13 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-14525:


 Summary: org.apache.hadoop.io.Text Truncate
 Key: HADOOP-14525
 URL: https://issues.apache.org/jira/browse/HADOOP-14525
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 2.8.1
Reporter: BELUGA BEHR


For Apache Hive, VARCHAR fields are much slower than STRING fields when a 
precision (string length cap) is included.  Keep in mind that this precision is 
the number of UTF-8 characters in the string, not the number of bytes.

The general procedure is:

# Load an entire byte buffer into a {{Text}} object
# Convert it to a {{String}}
# Count N character code points
# Substring the {{String}} at the correct place
# Convert the {{String}} back into a byte array and populate the {{Text}} object

It would be great if the {{Text}} object could offer a truncate/substring 
method based on character count that did not require copying data around.
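
One way such a method might work is sketched below. This is an illustration only, not an existing {{Text}} API: a plain byte array stands in for the {{Text}} backing buffer, the class and method names are invented, and the input is assumed to be valid UTF-8. It relies on the fact that every UTF-8 continuation byte has the bit pattern 10xxxxxx, so code points can be counted by scanning for non-continuation bytes without decoding to a {{String}}.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical example class, not part of org.apache.hadoop.io.Text.
class Utf8Truncate {
  // Returns the number of leading bytes that hold at most maxCodePoints
  // code points. Assumes utf8[0..length) is valid UTF-8.
  static int utf8PrefixLength(byte[] utf8, int length, int maxCodePoints) {
    int codePoints = 0;
    for (int i = 0; i < length; i++) {
      // Any byte that is not a continuation byte (10xxxxxx) starts a code point.
      boolean startsCodePoint = (utf8[i] & 0xC0) != 0x80;
      if (startsCodePoint) {
        if (codePoints == maxCodePoints) {
          return i; // the next code point would exceed the cap
        }
        codePoints++;
      }
    }
    return length; // the whole buffer fits within the cap
  }

  public static void main(String[] args) {
    byte[] bytes = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
    int keep = utf8PrefixLength(bytes, bytes.length, 2);
    System.out.println(new String(bytes, 0, keep, StandardCharsets.UTF_8));
  }
}
```

With a result like this, truncation would only need to shorten the logical length of the buffer rather than copy its contents.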






[jira] [Created] (HADOOP-14477) FileSystem Simplify / Optimize listStatus Method

2017-06-01 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-14477:


 Summary: FileSystem Simplify / Optimize listStatus Method
 Key: HADOOP-14477
 URL: https://issues.apache.org/jira/browse/HADOOP-14477
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.0.0-alpha3, 2.7.3
Reporter: BELUGA BEHR
Priority: Minor


{code:title=org.apache.hadoop.fs.FileSystem.listStatus(ArrayList<FileStatus>, Path, PathFilter)}
  /*
   * Filter files/directories in the given path using the user-supplied path
   * filter. Results are added to the given array results.
   */
  private void listStatus(ArrayList<FileStatus> results, Path f,
      PathFilter filter) throws FileNotFoundException, IOException {
    FileStatus listing[] = listStatus(f);
    if (listing == null) {
      throw new IOException("Error accessing " + f);
    }

    for (int i = 0; i < listing.length; i++) {
      if (filter.accept(listing[i].getPath())) {
        results.add(listing[i]);
      }
    }
  }
{code}

{code:title=org.apache.hadoop.fs.FileSystem.listStatus(Path, PathFilter)}
  public FileStatus[] listStatus(Path f, PathFilter filter) 
      throws FileNotFoundException, IOException {
    ArrayList<FileStatus> results = new ArrayList<FileStatus>();
    listStatus(results, f, filter);
    return results.toArray(new FileStatus[results.size()]);
  }
{code}

We can be smarter about this:

# Use enhanced for-loops
# Optimize for the case where there are zero files in a directory, save on 
object instantiation
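
The two points above might look roughly like this. It is a sketch under assumptions, not the proposed patch: {{String}} stands in for {{FileStatus}}/{{Path}}, {{java.util.function.Predicate}} stands in for {{PathFilter}}, and the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical example class; String and Predicate stand in for Hadoop types.
class ListingFilterExample {
  // A zero-length array has no mutable contents, so one instance can be shared.
  private static final String[] EMPTY = new String[0];

  static String[] filter(String[] listing, Predicate<String> filter) {
    if (listing.length == 0) {
      return EMPTY; // empty directory: skip allocating the list entirely
    }
    List<String> results = new ArrayList<>(listing.length);
    for (String entry : listing) { // enhanced for-loop instead of indexing
      if (filter.test(entry)) {
        results.add(entry);
      }
    }
    return results.toArray(new String[0]);
  }

  public static void main(String[] args) {
    String[] kept = filter(new String[] {"a.txt", "b.log"}, s -> s.endsWith(".txt"));
    System.out.println(kept.length + " entry kept: " + kept[0]);
  }
}
```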






[jira] [Created] (HADOOP-12674) BootstrapStandby - Inconsistent Logging

2015-12-23 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-12674:


 Summary: BootstrapStandby - Inconsistent Logging
 Key: HADOOP-12674
 URL: https://issues.apache.org/jira/browse/HADOOP-12674
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ha
Affects Versions: 2.7.1
Reporter: BELUGA BEHR
Priority: Minor


{code}
/* Line 379 */
  if (LOG.isDebugEnabled()) {
    LOG.debug(msg, e);
  } else {
    LOG.fatal(msg);
  }
{code}

Why would a message that is considered "fatal" under most operating circumstances be 
considered "debug" when debugging is on?  This is confusing, to say the least.  
If there is a problem and the user attempts to debug the situation, they may be 
filtering on "fatal" messages and miss the exception.

Please consider using only the fatal logging, and including the exception.
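
To illustrate the suggestion, the sketch below logs the message and the exception together at the highest severity, regardless of whether debug logging is enabled. It uses {{java.util.logging}} only so that the example is self-contained; it is a stand-in for the logger used in {{BootstrapStandby}}, where the equivalent call would presumably be {{LOG.fatal(msg, e)}}. The class name, logger name and capturing handler are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical example class; java.util.logging stands in for Hadoop's logger.
class FatalLoggingExample {
  // Logs once at the highest level with the exception attached and
  // returns the record that was published.
  static LogRecord logFailure(String msg, Exception e) {
    Logger log = Logger.getLogger("bootstrap-standby-example");
    log.setUseParentHandlers(false); // keep the example's output quiet
    List<LogRecord> captured = new ArrayList<>();
    Handler handler = new Handler() {
      @Override public void publish(LogRecord record) { captured.add(record); }
      @Override public void flush() { }
      @Override public void close() { }
    };
    log.addHandler(handler);
    try {
      // Single code path: same level and same detail in every configuration.
      log.log(Level.SEVERE, msg, e);
    } finally {
      log.removeHandler(handler);
    }
    return captured.get(0);
  }

  public static void main(String[] args) {
    LogRecord record = logFailure("Unable to fetch namespace information",
        new IllegalStateException("example failure"));
    System.out.println(record.getLevel() + ": " + record.getMessage()
        + " (" + record.getThrown() + ")");
  }
}
```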





[jira] [Created] (HADOOP-12663) Remove Hard-Coded Values From FileSystem.java

2015-12-21 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-12663:


 Summary: Remove Hard-Coded Values From FileSystem.java
 Key: HADOOP-12663
 URL: https://issues.apache.org/jira/browse/HADOOP-12663
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 2.7.1
Reporter: BELUGA BEHR
Priority: Trivial


Within FileSystem.java, there is one instance where the constants 
"CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_KEY" and 
"CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_DEFAULT" are used, 
but in all other instances, their literal values are used.

Please find attached a patch that replaces the literal values with 
references to the constants.





[jira] [Created] (HADOOP-12644) Access Control List Syntax

2015-12-15 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-12644:


 Summary: Access Control List Syntax
 Key: HADOOP-12644
 URL: https://issues.apache.org/jira/browse/HADOOP-12644
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Reporter: BELUGA BEHR
Priority: Minor


Hello,

I was recently learning about the configuration option 
"mapreduce.job.acl-view-job".  I was looking at the syntax and the code.  I 
would like to suggest some improvements.

??the format to use is "user1,user2 group1,group". If set to '*', it allows all 
users/groups to modify this job. If set to ' '(i.e. space), it allows none.??

In reality though, the code is written to split the line on the first space it 
finds.  So:

user1,user2 group1, group2 will work.
(user1,user2),(group1, group2)

user1, user2 group1, group2 does not work:
(user1,),(user2 group1, group2)
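
The behaviour described above can be reproduced with a split that is limited to two parts. The sketch below assumes that is effectively what the parsing code does; the class and method names are invented for this example.

```java
import java.util.Arrays;

// Hypothetical example class, not the real ACL parsing code.
class AclSplitExample {
  // Split on the first space only: everything after it lands in the second part.
  static String[] splitOnFirstSpace(String acl) {
    return acl.split(" ", 2);
  }

  public static void main(String[] args) {
    // Space inside the group list: users and groups are still separated correctly.
    System.out.println(Arrays.toString(splitOnFirstSpace("user1,user2 group1, group2")));
    // Space inside the user list: the split happens too early.
    System.out.println(Arrays.toString(splitOnFirstSpace("user1, user2 group1, group2")));
  }
}
```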

Also, there are many ways to specify "all":
"*"
" *"
"* "
"* *"
"user1,user2 *"
"* group1,group2"

I would like to see the code more strictly enforce what is written in the 
documentation. This will guard against configuration mistakes.  If the input 
does not match the syntax, an error should be produced and made available in 
the logs. The use of a semicolon as a delimiter is advisable so that any 
white space in the list of users or groups can simply be ignored.

||mapreduce.job.acl-view-job||Meaning||
|"*"|All access|
|" "|No access|
|"user1;"|User-only access|
|";group1"|Group-only access|
|"user1;group1"|User & Group access|





[jira] [Created] (HADOOP-12639) Update JavaDoc For getTrimmedStrings

2015-12-14 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-12639:


 Summary: Update JavaDoc For getTrimmedStrings
 Key: HADOOP-12639
 URL: https://issues.apache.org/jira/browse/HADOOP-12639
 Project: Hadoop Common
  Issue Type: Improvement
  Components: util
Reporter: BELUGA BEHR
Priority: Trivial
 Attachments: StringUtils.patch

Added JavaDoc to getTrimmedStrings() to explain what happens with null input.





[jira] [Created] (HADOOP-12640) Code Review AccessControlList

2015-12-14 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HADOOP-12640:


 Summary: Code Review AccessControlList
 Key: HADOOP-12640
 URL: https://issues.apache.org/jira/browse/HADOOP-12640
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 2.7.1
Reporter: BELUGA BEHR
Priority: Minor
 Attachments: AccessControlList.patch, TestAccessControlList.path

After some confusion of my own, in particular with 
"mapreduce.job.acl-view-job", I have looked over the AccessControlList 
implementation, cleaned it up, and clarified a few points.

1) I added tests to show that when including an asterisk in either the username 
or the group field, it overrides everything and allows all access.

"user1,user2,user3 *" = all access
"* group1,group2" = all access
"* *" = all access
"* " = all access
" *" = all access

2) General clean-up and simplification

3) NOT-BACKWARDS COMPATIBLE
The code currently handles spaces in an asymmetric way. The code splits the ACL 
string on a single space, but limits the resulting array to a size of two. So, 
as long as there are no spaces in the user names section, it works fine, and 
any spaces after the first one are not treated as separators.

"user1,user2,user3 group1, group2,group3" - works as expected
["user1,user2,user3", "group1, group2,group3"]

"user1, user2,user3 group1,group2,group3" - Did not work as expected
["user1,", "user2,user3 group1,group2,group3"]

The submitted patch will split on all spaces and log a warning if there are 
more than two elements.  This enforces no spaces within the two comma-separated 
lists.


