[jira] [Commented] (HADOOP-17758) NPE and excessive warnings after HADOOP-17728

2021-06-11 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17361799#comment-17361799
 ] 

Jim Brennan commented on HADOOP-17758:
--

I don't think the NPE will happen with HADOOP-17728 reverted.   I think we can 
close this as fixed by reverting HADOOP-17728.

> NPE and excessive warnings after HADOOP-17728
> -
>
> Key: HADOOP-17758
> URL: https://issues.apache.org/jira/browse/HADOOP-17758
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I'm noticing these warnings and NPEs when just running a simple pi test on a 
> one node cluster:
> {noformat}
> 2021-06-09 21:51:12,334 WARN  
> [org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner] 
> fs.FileSystem (FileSystem.java:run(4025)) - Exception in the cleaner thread 
> but it will continue to run
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:4020)
>   at java.lang.Thread.run(Thread.java:748){noformat}
> This appears to be due to [HADOOP-17728].
> I'm not sure I understand why that change was made.  Wasn't it by design that 
> the remove should wait until something is queued?
> [~kaifeiYi] can you please investigate?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17728) Fix issue of the StatisticsDataReferenceCleaner cleanUp

2021-06-11 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17361795#comment-17361795
 ] 

Jim Brennan commented on HADOOP-17728:
--

[~liuml07] should we close this as invalid?

> Fix issue of the StatisticsDataReferenceCleaner cleanUp
> ---
>
> Key: HADOOP-17728
> URL: https://issues.apache.org/jira/browse/HADOOP-17728
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.2.1
>Reporter: yikf
>Assignee: yikf
>Priority: Minor
>  Labels: pull-request-available, reverted
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The cleaner thread blocks when removing a reference from the ReferenceQueue 
> until `queue.enqueue` is called.
> 
>     As shown below, we currently call ReferenceQueue.remove() during cleanUp. 
> The call chain is:
>                          *StatisticsDataReferenceCleaner#queue.remove()  ->  
> ReferenceQueue.remove(0)  -> lock.wait(0)*
>     But lock.notifyAll is only called on queue.enqueue, so the cleaner thread 
> stays blocked until something is enqueued.
>  
> ThreadDump:
> {code:java}
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x7f7afc088800 
> nid=0x2119 in Object.wait() [0x7f7b0023]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at java.lang.Object.wait(Object.java:502)
> at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
> - locked <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at 
> java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153){code}
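The blocking-versus-timeout semantics under debate can be sketched in a small standalone example (the class name is illustrative, not Hadoop code): `ReferenceQueue.remove()` with no argument is `remove(0)`, i.e. `lock.wait(0)`, which parks until an enqueue notifies it, while `remove(timeout)` wakes up on its own and returns null.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class RefQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();

        // With a timeout, remove() wakes after ~50 ms and returns null
        // because nothing was enqueued -- callers must handle the null.
        Reference<?> ref = queue.remove(50);
        System.out.println(ref); // prints "null"

        // queue.remove() with no argument is remove(0), i.e. lock.wait(0):
        // it would block here indefinitely until an enqueue notifies it.
    }
}
```

The null return from the timed variant is exactly what a caller not written for it can trip over, while the untimed variant simply waits for work to arrive.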






[jira] [Commented] (HADOOP-17758) NPE and excessive warnings after HADOOP-17728

2021-06-10 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17361075#comment-17361075
 ] 

Jim Brennan commented on HADOOP-17758:
--

{quote}
We need a timeout to wake me up.
{quote}
I still don't understand why you need to wake up if the queue is still empty.  
There is nothing to clean up until something is added to the queue.



> NPE and excessive warnings after HADOOP-17728
> -
>
> Key: HADOOP-17758
> URL: https://issues.apache.org/jira/browse/HADOOP-17758
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I'm noticing these warnings and NPEs when just running a simple pi test on a 
> one node cluster:
> {noformat}
> 2021-06-09 21:51:12,334 WARN  
> [org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner] 
> fs.FileSystem (FileSystem.java:run(4025)) - Exception in the cleaner thread 
> but it will continue to run
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:4020)
>   at java.lang.Thread.run(Thread.java:748){noformat}
> This appears to be due to [HADOOP-17728].
> I'm not sure I understand why that change was made.  Wasn't it by design that 
> the remove should wait until something is queued?
> [~kaifeiYi] can you please investigate?






[jira] [Commented] (HADOOP-17758) NPE and excessive warnings after HADOOP-17728

2021-06-10 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17361053#comment-17361053
 ] 

Jim Brennan commented on HADOOP-17758:
--

I'm sorry, but am I missing something?   What is the point of using a timeout, 
if nothing has been added to the queue?  We'll just loop around again and wait 
on the same lock until something does get queued.
One thread waiting on a lock is not a deadlock.  Is this causing some other 
thread to block as well?
My inclination would be to revert  [HADOOP-17728].


> NPE and excessive warnings after HADOOP-17728
> -
>
> Key: HADOOP-17758
> URL: https://issues.apache.org/jira/browse/HADOOP-17758
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I'm noticing these warnings and NPEs when just running a simple pi test on a 
> one node cluster:
> {noformat}
> 2021-06-09 21:51:12,334 WARN  
> [org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner] 
> fs.FileSystem (FileSystem.java:run(4025)) - Exception in the cleaner thread 
> but it will continue to run
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:4020)
>   at java.lang.Thread.run(Thread.java:748){noformat}
> This appears to be due to [HADOOP-17728].
> I'm not sure I understand why that change was made.  Wasn't it by design that 
> the remove should wait until something is queued?
> [~kaifeiYi] can you please investigate?






[jira] [Created] (HADOOP-17758) NPE and excessive warnings after HADOOP-17728

2021-06-10 Thread Jim Brennan (Jira)
Jim Brennan created HADOOP-17758:


 Summary: NPE and excessive warnings after HADOOP-17728
 Key: HADOOP-17758
 URL: https://issues.apache.org/jira/browse/HADOOP-17758
 Project: Hadoop Common
  Issue Type: Bug
  Components: common
Affects Versions: 3.4.0
Reporter: Jim Brennan


I'm noticing these warnings and NPEs when just running a simple pi test on a 
one node cluster:
{noformat}
2021-06-09 21:51:12,334 WARN  
[org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner] 
fs.FileSystem (FileSystem.java:run(4025)) - Exception in the cleaner thread but 
it will continue to run
java.lang.NullPointerException
at 
org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:4020)
at java.lang.Thread.run(Thread.java:748){noformat}
This appears to be due to [HADOOP-17728].
I'm not sure I understand why that change was made.  Wasn't it by design that 
the remove should wait until something is queued?
[~kaifeiYi] can you please investigate?







[jira] [Commented] (HADOOP-17513) Checkstyle IllegalImport does not catch guava imports

2021-02-09 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281886#comment-17281886
 ] 

Jim Brennan commented on HADOOP-17513:
--

{quote}
Personally, I will prefer that the build breaks locally rather than injecting 
automatic code changes.
{quote}
I would prefer it break the build as well, but I am not sure that is possible.  
The current approach is OK with me if it is not feasible to break the build.  
In the ideal case, people making a change that causes this to happen will 
notice it and not submit a patch with the wrong imports.  In the case where 
someone fails to include the import fix in their patch, someone else will 
likely notice and alert them to fix it, as happened in this case.
I admit to being surprised when I saw this was happening, and a little alarmed 
that a build would modify a source file, but I think I understand why it was 
done this way.

> Checkstyle IllegalImport does not catch guava imports
> -
>
> Key: HADOOP-17513
> URL: https://issues.apache.org/jira/browse/HADOOP-17513
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Although YARN-10352 introduces {{guava iterator import}}, it was committed to 
> trunk without checkstyle errors.
> According to [IllegalImportCheck#setIllegalPkgs 
> |https://github.com/checkstyle/checkstyle/blob/master/src/main/java/com/puppycrawl/tools/checkstyle/checks/imports/IllegalImportCheck.java],
>  the package regex should be a prefix of the package name. The code 
> automatically appends {{\.*}} to the regex.
> CC: [~aajisaka]
>  






[jira] [Updated] (HADOOP-17079) Optimize UGI#getGroups by adding UGI#getGroupsSet

2021-01-22 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-17079:
-
Labels:   (was: pull-request-available)

> Optimize UGI#getGroups by adding UGI#getGroupsSet
> -
>
> Key: HADOOP-17079
> URL: https://issues.apache.org/jira/browse/HADOOP-17079
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HADOOP-17079.002.patch, HADOOP-17079.003.patch, 
> HADOOP-17079.004.patch, HADOOP-17079.005.patch, HADOOP-17079.006.patch, 
> HADOOP-17079.007.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> UGI#getGroups was optimized in HADOOP-13442 by avoiding the 
> List->Set->List conversion. However, the returned list is not optimized for 
> contains() lookups, especially when the user's group membership list is huge 
> (thousands of groups). This ticket adds a UGI#getGroupsSet and uses 
> Set#contains() instead of List#contains() to speed up large group lookups 
> while minimizing List->Set conversions in the Groups#getGroups() call. 
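The lookup asymmetry described above can be sketched with a minimal, self-contained example (illustrative, not Hadoop code): both collections give the same answer, but the List scans linearly while the HashSet does a constant-time hash probe.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GroupLookupDemo {
    public static void main(String[] args) {
        List<String> groupList = new ArrayList<>();
        for (int i = 0; i < 5000; i++) {
            groupList.add("group" + i);
        }
        // One List->Set conversion up front buys O(1) membership tests.
        Set<String> groupSet = new HashSet<>(groupList);

        // Same answer either way; the List scans up to all 5000 entries,
        // the Set hashes straight to the right bucket.
        System.out.println(groupList.contains("group4999")); // prints "true"
        System.out.println(groupSet.contains("group4999"));  // prints "true"
    }
}
```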






[jira] [Commented] (HADOOP-17467) netgroup-user is not added to Groups.cache

2021-01-22 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270214#comment-17270214
 ] 

Jim Brennan commented on HADOOP-17467:
--

[~ahussein], can you please split this into two Jiras?  This one deals with the 
getGroups problem in JniBasedUnixGroupsNetgroupMapping, which is a serious bug 
that should be fixed on its own.

The potential concurrency bug should be split out to a separate Jira.  It is a 
different problem and it is much less serious because the window is very small, 
but it accounts for most of the code changes in the PR.

Splitting these will allow us to get the first fix in sooner.

> netgroup-user is not added to Groups.cache
> --
>
> Key: HADOOP-17467
> URL: https://issues.apache.org/jira/browse/HADOOP-17467
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> After the optimization in HADOOP-17079, {{JniBasedUnixGroupsNetgroupMapping}} 
> does not implement {{getGroupSet}}.
>  As a result, {{Groups.load()}} loads the cache by calling {{fetchGroupSet}}, 
> which falls through to the superclass {{JniBasedUnixGroupsMapping}}.
>  In other words, the groups mapping will never fetch from {{NetgroupCache}}.
> This alters the behavior of the implementation. Is there a reason to bypass 
> loading? CC: [~xyao] 
> There is a potential concurrency bug in the {{NetgroupCache}} implementation.
> {{NetgroupCache}} is static. When ACL is built, its groups will be added to 
> the {{NetgroupCache}}.
> A {{-refreshUserToGroupsMappings}} forces the cache to reload the users for 
> each group.
>  This is done by first getting the keys, clearing the cache, then finally 
> reloading the users for each group.
>  The problem is that the three steps are not atomic.
>  Adding ACLs concurrently may take place between L80-L81 
> ([JniBasedUnixGroupsNetgroupMapping#L79|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/JniBasedUnixGroupsNetgroupMapping.java#L79]).
>  This results in the loss of the most recently added group.
>  Since group names are used in the JNI level, the users of that group won't 
> be retrieved.
> {code:java}
> 78  @Override
> 79  public void cacheGroupsRefresh() throws IOException {
> 80    List<String> groups = NetgroupCache.getNetgroupNames();
> 81    NetgroupCache.clear();
> 82    cacheGroupsAdd(groups);
> 83  }
> {code}
> +Solution:+
> Refreshing {{NetgroupCache}} should not clear the cache keys.
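The non-atomic snapshot/clear/re-add sequence can be illustrated with a simplified stand-in for the cache (names are illustrative; the real {{NetgroupCache}} API differs). A group added between the snapshot and the clear is silently dropped:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class CacheRefreshRaceDemo {
    // Simplified stand-in for the static netgroup cache.
    static final Set<String> cache = ConcurrentHashMap.newKeySet();

    public static void main(String[] args) {
        cache.add("@groupA");

        // Refresh step 1: snapshot the current keys.
        List<String> snapshot = new ArrayList<>(cache);

        // A concurrent ACL build adds a group between steps 1 and 2.
        cache.add("@groupB");

        // Refresh step 2: clear, then step 3: re-add only the snapshot.
        cache.clear();
        cache.addAll(snapshot);

        // The concurrently added group is lost.
        System.out.println(cache.contains("@groupB")); // prints "false"
    }
}
```

Here the interleaving is simulated sequentially for determinism; with real threads the same loss occurs whenever an add lands in the window between the snapshot and the clear.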






[jira] [Commented] (HADOOP-17485) port UGI#getGroupsSet optimizations into 2.10

2021-01-21 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269665#comment-17269665
 ] 

Jim Brennan commented on HADOOP-17485:
--

I would recommend minimizing the changes needed to pull back this change and 
fix any unit tests that are broken by it and convert any lambdas.  The 
LDAPGroupMapping changes are needed as part of the port, I think.

Fixing up the deprecated call sites should definitely be done in trunk first - 
I would be ok with ignoring those warnings for this and filing a new Jira to 
fix those, since they are the same as what is in trunk.  The guava replacement 
could probably be lumped with that.

I'm not sure I like changing to LinkedHashSet just to make it easier to fix 
some tests.   Why don't those tests fail in trunk?

IIUC, pulling this back without HADOOP-17467 will break things in 2.10, so 
we'll need to make sure that fix is done in trunk before doing this so both can 
be pulled back together.  Is that correct?

 

> port UGI#getGroupsSet optimizations into 2.10
> -
>
> Key: HADOOP-17485
> URL: https://issues.apache.org/jira/browse/HADOOP-17485
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HADOOP-17079 introduced an optimization adding a UGI#getGroupsSet and using 
> Set#contains() instead of List#contains() to speed up large group lookups 
> while minimizing List->Set conversions in the Groups#getGroups() call.
> This ticket is to port the changes into branch-2.10.
>  
> CC: [~Jim_Brennan], [~xyao]






[jira] [Commented] (HADOOP-17485) port UGI#getGroupsSet optimizations into 2.10

2021-01-21 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269646#comment-17269646
 ] 

Jim Brennan commented on HADOOP-17485:
--

{quote}The cherry-pick was not clean:
 * I needed to add getGroupsSet for both ShellBasedUnixGroupsNetgroupMapping 
and {{JniBasedUnixGroupsNetgroupMapping}} to avoid the bug described in 
HADOOP-17467{quote}
Why are you making this change as part of backporting HADOOP-17079? 
Wouldn't it be better to pull back HADOOP-17467 separately when it is done?
{quote} * I had to replace Java-8 lambda expressions to be compatible with JDK7
 * I replaced some usages of guava since 2.10 has older versions.
 * Yetus generated several errors regarding deprecated getGroups. I replaced 
all the calls with getGroupsSet which were mainly in unit tests. cleaning the 
deprecated calls is not done in the trunk version.
 * LDAPGroupMapping change was not compatible. So, I had to manually replace 
getGroups.
 * I replaced new HashSet with LinkedHashSet. The latter maintains the order of 
insertion. This made the unit tests pass with less changes.
 * In the unit tests, I used Assert.Equals(Set1, Set2) to compare between two 
sets. Again, this change does not exist in trunk because it never used the 
getGroupsSet.{quote}
Seems like most of these would be good to have in trunk as well. Why not make 
these changes in trunk and then pull that Jira back to 2.10?

It's confusing to combine changes like this as part of back-porting a single 
change. If you need to pull back multiple Jiras at once, that is ok, but I 
would not expect so many additional changes in a back-port.  We generally try 
to minimize the changes when back-porting.
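For context on the LinkedHashSet point quoted above: HashSet makes no iteration-order guarantee, while LinkedHashSet iterates in insertion order, which is what lets order-sensitive test assertions pass unchanged. A minimal illustrative sketch:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SetOrderingDemo {
    public static void main(String[] args) {
        // LinkedHashSet preserves insertion order on iteration;
        // a plain HashSet gives no ordering guarantee at all.
        Set<String> linked = new LinkedHashSet<>(List.of("b", "a", "c"));
        System.out.println(new ArrayList<>(linked)); // prints "[b, a, c]"
    }
}
```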

> port UGI#getGroupsSet optimizations into 2.10
> -
>
> Key: HADOOP-17485
> URL: https://issues.apache.org/jira/browse/HADOOP-17485
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HADOOP-17079 introduced an optimization adding a UGI#getGroupsSet and using 
> Set#contains() instead of List#contains() to speed up large group lookups 
> while minimizing List->Set conversions in the Groups#getGroups() call.
> This ticket is to port the changes into branch-2.10.
>  
> CC: [~Jim_Brennan], [~xyao]






[jira] [Created] (HADOOP-17486) Provide fallbacks for callqueue ipc namespace properties

2021-01-21 Thread Jim Brennan (Jira)
Jim Brennan created HADOOP-17486:


 Summary: Provide fallbacks for callqueue ipc namespace properties
 Key: HADOOP-17486
 URL: https://issues.apache.org/jira/browse/HADOOP-17486
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 3.1.4
Reporter: Jim Brennan


Filing this proposal on behalf of [~daryn], based on comments he made in one of 
our internal Jiras.

The following settings are currently specified per port:
{noformat}
  /**
   * CallQueue related settings. These are not used directly, but rather
   * combined with a namespace and port. For instance:
   * IPC_NAMESPACE + ".8020." + IPC_CALLQUEUE_IMPL_KEY
   */
  public static final String IPC_NAMESPACE = "ipc";
  public static final String IPC_CALLQUEUE_IMPL_KEY = "callqueue.impl";
  public static final String IPC_SCHEDULER_IMPL_KEY = "scheduler.impl";
  public static final String IPC_IDENTITY_PROVIDER_KEY = 
"identity-provider.impl";
  public static final String IPC_COST_PROVIDER_KEY = "cost-provider.impl";
  public static final String IPC_BACKOFF_ENABLE = "backoff.enable";
  public static final boolean IPC_BACKOFF_ENABLE_DEFAULT = false;
 {noformat}
If one of these properties is not specified for the port, the defaults are 
hard-coded.
It would be nice to provide a way to specify a fallback default property that 
would be used for all ports.  If the property for a specific port is not 
defined, the fallback would be used, and if the fallback is not defined it 
would use the hard-coded defaults.

We would likely need to make the same change for properties read by these 
classes, for example the properties used in WeightedTimeCostProvider.

The fallback properties could be specified by dropping the port from the 
property name.  For example, the fallback for {{ipc.8020.cost-provider.impl}} 
would be {{ipc.cost-provider.impl}}.
Another option would be to use something more explicit like 
{{ipc.default.cost-provider.impl}}.
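The proposed resolution order can be sketched with plain maps (the method and defaults below are illustrative of the proposal, not an existing Hadoop configuration API):

```java
import java.util.HashMap;
import java.util.Map;

public class CallQueueFallbackDemo {
    // Proposed resolution order: port-specific key, then a port-less
    // fallback key, then the hard-coded default.
    static String lookup(Map<String, String> conf, int port,
                         String suffix, String hardDefault) {
        String portKey = "ipc." + port + "." + suffix;
        String fallbackKey = "ipc." + suffix;
        return conf.getOrDefault(portKey,
               conf.getOrDefault(fallbackKey, hardDefault));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("ipc.cost-provider.impl", "WeightedTimeCostProvider");

        // No ipc.8020.cost-provider.impl is set, so the port-less
        // fallback applies instead of the hard-coded default.
        System.out.println(lookup(conf, 8020, "cost-provider.impl",
                "DefaultCostProvider")); // prints "WeightedTimeCostProvider"
    }
}
```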







[jira] [Commented] (HADOOP-17485) port UGI#getGroupsSet optimizations into 2.10

2021-01-21 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269587#comment-17269587
 ] 

Jim Brennan commented on HADOOP-17485:
--

[~ahussein], can you please summarize any changes that were needed to port this 
back?  Or was it a clean cherry-pick?

 

> port UGI#getGroupsSet optimizations into 2.10
> -
>
> Key: HADOOP-17485
> URL: https://issues.apache.org/jira/browse/HADOOP-17485
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HADOOP-17079 introduced an optimization adding a UGI#getGroupsSet and using 
> Set#contains() instead of List#contains() to speed up large group lookups 
> while minimizing List->Set conversions in the Groups#getGroups() call.
> This ticket is to port the changes into branch-2.10.
>  
> CC: [~Jim_Brennan], [~xyao]






[jira] [Commented] (HADOOP-13965) Groups should be consistent in using default group mapping class

2021-01-21 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269484#comment-17269484
 ] 

Jim Brennan commented on HADOOP-13965:
--

[~ahussein] I cherry-picked this to branch-2.10.

 

> Groups should be consistent in using default group mapping class
> 
>
> Key: HADOOP-13965
> URL: https://issues.apache.org/jira/browse/HADOOP-13965
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0-alpha2
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: HADOOP-13965.001.patch
>
>
> {code:title=Groups.java}
>   public Groups(Configuration conf, final Timer timer) {
>impl = 
>   ReflectionUtils.newInstance(
>   
> conf.getClass(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING, 
> ShellBasedUnixGroupsMapping.class, 
> GroupMappingServiceProvider.class), 
>   conf);
>   ...
> }
> {code}
> The default value of the {{hadoop.security.group.mapping}} setting differs 
> between the code and the config file. In {{core-default.xml}}, it is the class 
> {{JniBasedUnixGroupsMappingWithFallback}}, which should be the true default.
> {code}
> 
>   hadoop.security.group.mapping
>  
> org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback
>   
> Class for user to group mapping (get groups for a given user) for ACL.
> The default implementation,
> org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback,
> will determine if the Java Native Interface (JNI) is available.
>   
> 
> {code}






[jira] [Updated] (HADOOP-13965) Groups should be consistent in using default group mapping class

2021-01-21 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-13965:
-
Fix Version/s: 2.10.2

> Groups should be consistent in using default group mapping class
> 
>
> Key: HADOOP-13965
> URL: https://issues.apache.org/jira/browse/HADOOP-13965
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0-alpha2
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Fix For: 3.0.0-alpha2, 2.10.2
>
> Attachments: HADOOP-13965.001.patch
>
>
> {code:title=Groups.java}
>   public Groups(Configuration conf, final Timer timer) {
>impl = 
>   ReflectionUtils.newInstance(
>   
> conf.getClass(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING, 
> ShellBasedUnixGroupsMapping.class, 
> GroupMappingServiceProvider.class), 
>   conf);
>   ...
> }
> {code}
> The default value of the {{hadoop.security.group.mapping}} setting differs 
> between the code and the config file. In {{core-default.xml}}, it is the class 
> {{JniBasedUnixGroupsMappingWithFallback}}, which should be the true default.
> {code}
> 
>   hadoop.security.group.mapping
>  
> org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback
>   
> Class for user to group mapping (get groups for a given user) for ACL.
> The default implementation,
> org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback,
> will determine if the Java Native Interface (JNI) is available.
>   
> 
> {code}






[jira] [Commented] (HADOOP-13965) Groups should be consistent in using default group mapping class

2021-01-21 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269477#comment-17269477
 ] 

Jim Brennan commented on HADOOP-13965:
--

[~ahussein] I will cherry-pick this change to branch-2.10.

 

> Groups should be consistent in using default group mapping class
> 
>
> Key: HADOOP-13965
> URL: https://issues.apache.org/jira/browse/HADOOP-13965
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0-alpha2
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: HADOOP-13965.001.patch
>
>
> {code:title=Groups.java}
>   public Groups(Configuration conf, final Timer timer) {
>impl = 
>   ReflectionUtils.newInstance(
>   
> conf.getClass(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING, 
> ShellBasedUnixGroupsMapping.class, 
> GroupMappingServiceProvider.class), 
>   conf);
>   ...
> }
> {code}
> The default value of the {{hadoop.security.group.mapping}} setting differs 
> between the code and the config file. In {{core-default.xml}}, it is the class 
> {{JniBasedUnixGroupsMappingWithFallback}}, which should be the true default.
> {code}
> 
>   hadoop.security.group.mapping
>  
> org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback
>   
> Class for user to group mapping (get groups for a given user) for ACL.
> The default implementation,
> org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback,
> will determine if the Java Native Interface (JNI) is available.
>   
> 
> {code}






[jira] [Resolved] (HADOOP-17408) Optimize NetworkTopology while sorting of block locations

2021-01-08 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan resolved HADOOP-17408.
--
Fix Version/s: 3.4.0
   3.3.1
   Resolution: Fixed

Thanks for the contribution [~ahussein] and [~daryn]!  I have committed this to 
trunk and branch-3.3.

The patch does not apply cleanly to branch-3.2 or earlier.  Please provide a 
patch for 3.2 if desired.



> Optimize NetworkTopology while sorting of block locations
> -
>
> Key: HADOOP-17408
> URL: https://issues.apache.org/jira/browse/HADOOP-17408
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, net
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> In {{NetworkTopology}}, I noticed that there is some low-hanging fruit to 
> improve the performance.
> Inside {{sortByDistance}}, Collections.shuffle is performed on the list 
> before calling {{secondarySort}}.
> {code:java}
> Collections.shuffle(list, r);
> if (secondarySort != null) {
>   secondarySort.accept(list);
> }
> {code}
> However, at several call sites, {{Collections.shuffle}} is passed as the 
> secondarySort to {{sortByDistance}}. This means the shuffle is executed 
> twice on each list.
> Also, logic-wise, it is useless to shuffle before applying a tie breaker, 
> which can make the shuffle's work obsolete.
> In addition, [~daryn] reported that:
> * topology is unnecessarily locking/unlocking to calculate the distance for 
> every node
> * shuffling uses a seeded Random, instead of ThreadLocalRandom, which is 
> heavily synchronized
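The double-shuffle described above can be demonstrated with a simplified sketch of the {{sortByDistance}} tail (illustrative, not the actual Hadoop implementation): when the caller-supplied secondary sort is itself a shuffle, the list is shuffled twice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

public class DoubleShuffleDemo {
    static int shuffleCalls = 0;

    // Simplified sketch: sortByDistance always shuffles, then applies
    // the caller-supplied secondary sort.
    static <T> void sortByDistance(List<T> list,
                                   Consumer<List<T>> secondarySort) {
        shuffle(list);
        if (secondarySort != null) {
            secondarySort.accept(list);
        }
    }

    static <T> void shuffle(List<T> list) {
        shuffleCalls++;
        Collections.shuffle(list);
    }

    public static void main(String[] args) {
        List<String> nodes = new ArrayList<>(List.of("n1", "n2", "n3"));
        // A call site that passes a shuffle as the secondary sort:
        // the list ends up shuffled twice.
        sortByDistance(nodes, DoubleShuffleDemo::shuffle);
        System.out.println(shuffleCalls); // prints "2"
    }
}
```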






[jira] [Commented] (HADOOP-17408) Optimize NetworkTopology while sorting of block locations

2021-01-08 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261440#comment-17261440
 ] 

Jim Brennan commented on HADOOP-17408:
--

Thanks for your work on this [~ahussein].  I have approved the PR and I will 
commit later today.


> Optimize NetworkTopology while sorting of block locations
> -
>
> Key: HADOOP-17408
> URL: https://issues.apache.org/jira/browse/HADOOP-17408
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, net
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In {{NetworkTopology}}, I noticed that there is some low-hanging fruit for 
> improving performance.
> Inside {{sortByDistance}}, {{Collections.shuffle}} is performed on the list 
> before calling {{secondarySort}}.
> {code:java}
> Collections.shuffle(list, r);
> if (secondarySort != null) {
>   secondarySort.accept(list);
> }
> {code}
> However, at several call sites, {{Collections.shuffle}} is passed as the 
> secondarySort to {{sortByDistance}}. This means that the shuffle is executed 
> twice on each list.
> Also, logic-wise, it is pointless to shuffle before applying a tie-breaker, 
> which renders the shuffle work obsolete.
> In addition, [~daryn] reported that:
> * the topology is unnecessarily locking/unlocking to calculate the distance for 
> every node
> * shuffling uses a seeded {{Random}}, which is heavily synchronized, instead of 
> {{ThreadLocalRandom}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17408) Optimize NetworkTopology while sorting of block locations

2021-01-05 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17259194#comment-17259194
 ] 

Jim Brennan commented on HADOOP-17408:
--

[~ahussein] thanks for the PR!  Can you please separate this into two parts?

I would like to see a separate Jira/PR with just the changes that [~daryn] made 
internally - those changes have gotten some run-time and are a clear 
optimization.

The additional changes you have made are mostly a refactoring, and I am not 
convinced the original behavior has been retained. Optimizing away the 
shuffle could have been achieved by simply moving the shuffle into the else case:
{noformat}
if (secondarySort != null) {
  secondarySort.accept(list);
} else {
  Collections.shuffle(list, r); 
}
{noformat}
The other concern with the refactoring portion is that it changes the signature 
of the public method {{sortByDistance()}}.
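As a runnable sketch of the suggested change (a hypothetical simplification, not the actual Hadoop code): the shuffle becomes the default tie-breaker, applied only when no explicit secondarySort is supplied, so each list is randomized at most once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.Consumer;

public class SortByDistanceFix {
    // Sketch of the suggested fix: a caller-provided tie-breaker replaces the
    // shuffle rather than running on top of it.
    static <T> void sortByDistance(List<T> list, Random r, Consumer<List<T>> secondarySort) {
        if (secondarySort != null) {
            secondarySort.accept(list);   // explicit tie-breaker wins
        } else {
            Collections.shuffle(list, r); // default randomization, applied once
        }
    }

    public static void main(String[] args) {
        List<String> nodes = new ArrayList<>(List.of("a", "b", "c"));
        // With an explicit tie-breaker (here: natural-order sort) no shuffle
        // runs, so the result is deterministic.
        sortByDistance(nodes, new Random(), Collections::sort);
        System.out.println(nodes); // prints "[a, b, c]"
    }
}
```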


> Optimize NetworkTopology while sorting of block locations
> -
>
> Key: HADOOP-17408
> URL: https://issues.apache.org/jira/browse/HADOOP-17408
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, net
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In {{NetworkTopology}}, I noticed that there is some low-hanging fruit for 
> improving performance.
> Inside {{sortByDistance}}, {{Collections.shuffle}} is performed on the list 
> before calling {{secondarySort}}.
> {code:java}
> Collections.shuffle(list, r);
> if (secondarySort != null) {
>   secondarySort.accept(list);
> }
> {code}
> However, at several call sites, {{Collections.shuffle}} is passed as the 
> secondarySort to {{sortByDistance}}. This means that the shuffle is executed 
> twice on each list.
> Also, logic-wise, it is pointless to shuffle before applying a tie-breaker, 
> which renders the shuffle work obsolete.
> In addition, [~daryn] reported that:
> * the topology is unnecessarily locking/unlocking to calculate the distance for 
> every node
> * shuffling uses a seeded {{Random}}, which is heavily synchronized, instead of 
> {{ThreadLocalRandom}}






[jira] [Updated] (HADOOP-13571) ServerSocketUtil.getPort() should use loopback address, not 0.0.0.0

2020-12-11 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-13571:
-
Fix Version/s: 3.2.3
   3.1.5
   3.4.0
   3.3.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

I have committed this to trunk, branch-3.3, branch-3.2, and branch-3.1.

> ServerSocketUtil.getPort() should use loopback address, not 0.0.0.0
> ---
>
> Key: HADOOP-13571
> URL: https://issues.apache.org/jira/browse/HADOOP-13571
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: HADOOP-13571.001.patch, HADOOP-13571.002.patch
>
>
> Using 0.0.0.0 to check for a free port will succeed even if there's something 
> bound to that same port on the loopback interface. Since this function is 
> used primarily in testing, it should be checking the loopback interface for 
> free ports.
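The scenario can be sketched with plain JDK sockets (hypothetical helper name, not the actual {{ServerSocketUtil}} code): a port held on the loopback interface is only reliably detected as taken when the availability check also binds on loopback.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

public class PortCheckDemo {
    // Hypothetical helper: a port counts as free only if we can bind it on
    // the loopback interface, which is where tests typically listen.
    static boolean isFreeOnLoopback(int port) {
        try (ServerSocket s = new ServerSocket(port, 50,
                InetAddress.getLoopbackAddress())) {
            return true;
        } catch (IOException e) {
            return false;   // something already holds this port on loopback
        }
    }

    public static void main(String[] args) throws IOException {
        // Hold an ephemeral port on loopback, then verify the loopback-based
        // check correctly reports it as taken.
        try (ServerSocket held = new ServerSocket(0, 50,
                InetAddress.getLoopbackAddress())) {
            System.out.println(isFreeOnLoopback(held.getLocalPort())); // prints "false"
        }
    }
}
```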






[jira] [Resolved] (HADOOP-17417) Reduce UGI overhead in token ops

2020-12-11 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan resolved HADOOP-17417.
--
Resolution: Later

This needs to be done as part of a larger feature that we are not ready to put 
up yet.

 

> Reduce UGI overhead in token ops 
> -
>
> Key: HADOOP-17417
> URL: https://issues.apache.org/jira/browse/HADOOP-17417
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, kms, performance, rpc-server, security
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>
> {{DelegationTokenIdentifier}} has a {{ugiCache}} but  
> AbstractDelegationTokenManager calls a static method {{getRemoteUser()}} 
> which would bypass the cache.
> Performance analysis of the KMS revealed that the RPC server layer is creating 
> many redundant UGI instances. UGIs are not cheap to instantiate, require 
> synchronization, and waste memory. Reducing instantiations will improve the 
> performance of the IPC readers.
>  
>  






[jira] [Resolved] (HADOOP-17416) Reduce synchronization in the token secret manager

2020-12-11 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan resolved HADOOP-17416.
--
Resolution: Later

This change is part of a larger feature that we are not ready to put up yet.

 

> Reduce synchronization in the token secret manager
> --
>
> Key: HADOOP-17416
> URL: https://issues.apache.org/jira/browse/HADOOP-17416
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, performance, security
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>
> [~daryn] reported that reducing synchronization in the ZK secret manager is 
> complicated by excessive and unnecessary global synchronization in the 
> AbstractDelegationTokenSecretManager.  All RPC services, not just the KMS, 
> will benefit from the reduced synchronization.
>  






[jira] [Commented] (HADOOP-13571) ServerSocketUtil.getPort() should use loopback address, not 0.0.0.0

2020-12-11 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248053#comment-17248053
 ] 

Jim Brennan commented on HADOOP-13571:
--

Thanks for the patch [~ebadger] and thanks for the review [~ahussein]!

+1 this looks good to me.  We've been running with this change internally for a 
long time.

I will commit later today.

 

> ServerSocketUtil.getPort() should use loopback address, not 0.0.0.0
> ---
>
> Key: HADOOP-13571
> URL: https://issues.apache.org/jira/browse/HADOOP-13571
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: HADOOP-13571.001.patch, HADOOP-13571.002.patch
>
>
> Using 0.0.0.0 to check for a free port will succeed even if there's something 
> bound to that same port on the loopback interface. Since this function is 
> used primarily in testing, it should be checking the loopback interface for 
> free ports.






[jira] [Commented] (HADOOP-17389) KMS should log full UGI principal

2020-12-07 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245383#comment-17245383
 ] 

Jim Brennan commented on HADOOP-17389:
--

I cherry-picked this to branch-3.3, branch-3.2, and branch-3.1.

> KMS should log full UGI principal
> -
>
> Key: HADOOP-17389
> URL: https://issues.apache.org/jira/browse/HADOOP-17389
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [~daryn] reported that the kms-audit log only logs the short username:
> {{OK[op=GENERATE_EEK, key=key1, user=hdfs, accessCount=4206, 
> interval=10427ms]}}
> In this example, it's impossible to tell which NN(s) requested EDEKs when 
> they are all lumped together.






[jira] [Updated] (HADOOP-17389) KMS should log full UGI principal

2020-12-07 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-17389:
-
Fix Version/s: 3.2.3
   3.1.5
   3.3.1

> KMS should log full UGI principal
> -
>
> Key: HADOOP-17389
> URL: https://issues.apache.org/jira/browse/HADOOP-17389
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [~daryn] reported that the kms-audit log only logs the short username:
> {{OK[op=GENERATE_EEK, key=key1, user=hdfs, accessCount=4206, 
> interval=10427ms]}}
> In this example, it's impossible to tell which NN(s) requested EDEKs when 
> they are all lumped together.






[jira] [Commented] (HADOOP-17389) KMS should log full UGI principal

2020-12-04 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244273#comment-17244273
 ] 

Jim Brennan commented on HADOOP-17389:
--

Thanks [~aajisaka]!  Any objection to cherry-picking this to other branch-3 
branches?

> KMS should log full UGI principal
> -
>
> Key: HADOOP-17389
> URL: https://issues.apache.org/jira/browse/HADOOP-17389
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [~daryn] reported that the kms-audit log only logs the short username:
> {{OK[op=GENERATE_EEK, key=key1, user=hdfs, accessCount=4206, 
> interval=10427ms]}}
> In this example, it's impossible to tell which NN(s) requested EDEKs when 
> they are all lumped together.






[jira] [Resolved] (HADOOP-17392) Remote exception messages should not include the exception class

2020-12-03 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan resolved HADOOP-17392.
--
Fix Version/s: 3.2.3
   3.1.5
   3.4.0
   3.3.1
   Resolution: Fixed

Thanks [~daryn] and [~ahussein]!  I have committed this to trunk, branch-3.3, 
branch-3.2, and branch-3.1.

 

> Remote exception messages should not include the exception class
> 
>
> Key: HADOOP-17392
> URL: https://issues.apache.org/jira/browse/HADOOP-17392
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> HADOOP-9844 added a change that caused some remote SASL exceptions to 
> redundantly include the exception class causing the client to see "{{Class: 
> Class: message}}" from an unwrapped RemoteException.






[jira] [Commented] (HADOOP-17392) Remote exception messages should not include the exception class

2020-12-03 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243345#comment-17243345
 ] 

Jim Brennan commented on HADOOP-17392:
--

Thanks [~daryn] and [~ahussein]!  +1 This looks good to me.

> Remote exception messages should not include the exception class
> 
>
> Key: HADOOP-17392
> URL: https://issues.apache.org/jira/browse/HADOOP-17392
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HADOOP-9844 added a change that caused some remote SASL exceptions to 
> redundantly include the exception class causing the client to see "{{Class: 
> Class: message}}" from an unwrapped RemoteException.






[jira] [Resolved] (HADOOP-17367) Add InetAddress api to ProxyUsers.authorize

2020-11-19 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan resolved HADOOP-17367.
--
Fix Version/s: 3.2.3
   3.1.5
   3.4.0
   3.3.1
   Resolution: Fixed

Thanks [~ahussein] and [~daryn]!
I've committed this to trunk, branch-3.3, branch-3.2, and branch-3.1.

> Add InetAddress api to ProxyUsers.authorize
> ---
>
> Key: HADOOP-17367
> URL: https://issues.apache.org/jira/browse/HADOOP-17367
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance, security
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Improve the ProxyUsers implementation by passing the address of the remote 
> peer, to avoid resolving the hostname.
> Similarly, this requires adding an InetAddress API to MachineList.
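A minimal sketch of the idea (hypothetical method and parameter names, not the actual ProxyUsers API): an authorize overload that accepts the already-known {{InetAddress}} of the remote peer can compare raw addresses without ever triggering a DNS lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ProxyAuthSketch {
    // Hypothetical overload: works on the remote peer's InetAddress directly.
    // Comparing getHostAddress() strings never performs name resolution.
    static boolean authorize(InetAddress remoteAddr, String allowedIp) {
        return remoteAddr.getHostAddress().equals(allowedIp);
    }

    public static void main(String[] args) throws UnknownHostException {
        // getByName with an IP literal does not contact DNS.
        InetAddress peer = InetAddress.getByName("127.0.0.1");
        System.out.println(authorize(peer, "127.0.0.1")); // prints "true"
    }
}
```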






[jira] [Commented] (HADOOP-17367) Add InetAddress api to ProxyUsers.authorize

2020-11-18 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17235001#comment-17235001
 ] 

Jim Brennan commented on HADOOP-17367:
--

[~ahussein], thanks for putting up this PR.  It looks good to me.  I verified 
it against the internal change we made.  We have been running with this in 
production for almost a year.

Will wait a bit longer to allow for additional comments before committing this.

 

> Add InetAddress api to ProxyUsers.authorize
> ---
>
> Key: HADOOP-17367
> URL: https://issues.apache.org/jira/browse/HADOOP-17367
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance, security
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Improve the ProxyUsers implementation by passing the address of the remote 
> peer to avoid resolving the hostname.
> Similarly, this requires adding InetAddress api to MachineList.






[jira] [Commented] (HADOOP-17360) Log the remote address for authentication success

2020-11-16 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233137#comment-17233137
 ] 

Jim Brennan commented on HADOOP-17360:
--

[~kihwal] merged this PR to trunk, and I cherry-picked it to branch-3.3, 
branch-3.2, and branch-3.1.

Thanks for the contribution [~ahussein]!

> Log the remote address for authentication success
> -
>
> Key: HADOOP-17360
> URL: https://issues.apache.org/jira/browse/HADOOP-17360
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ipc
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> IPC server logs "Authentication successful for USER". 
> Unlike the hdfs audit log that includes the address with every request, the 
> kms audit log is 10s aggregate per-user op counts.  Including the remote 
> address would make debugging much easier.






[jira] [Updated] (HADOOP-17360) Log the remote address for authentication success

2020-11-16 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-17360:
-
Fix Version/s: 3.2.3
   3.1.5
   3.4.0
   3.3.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Log the remote address for authentication success
> -
>
> Key: HADOOP-17360
> URL: https://issues.apache.org/jira/browse/HADOOP-17360
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ipc
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> IPC server logs "Authentication successful for USER". 
> Unlike the hdfs audit log that includes the address with every request, the 
> kms audit log is 10s aggregate per-user op counts.  Including the remote 
> address would make debugging much easier.






[jira] [Resolved] (HADOOP-17362) Doing hadoop ls on Har file triggers too many RPC calls

2020-11-16 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan resolved HADOOP-17362.
--
Fix Version/s: 3.2.3
   3.1.5
   3.4.0
   3.3.1
   Resolution: Fixed

> Doing hadoop ls on Har file triggers too many RPC calls
> ---
>
> Key: HADOOP-17362
> URL: https://issues.apache.org/jira/browse/HADOOP-17362
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> [~daryn] has noticed that invoking hadoop ls on a HAR file takes too much 
> time.
> The har system has multiple deficiencies that significantly impacted 
> performance:
> # Parsing the master index references ranges within the archive index. Each 
> range required re-opening the hdfs input stream and seeking to the same 
> location where it previously stopped.
> # Listing a har stats the archive index for every "directory". The per-call 
> cache used a unique key for each stat, rendering the cache useless and 
> significantly increasing memory pressure.
> # Determining the children of a directory scans the entire archive contents 
> and filters out children. The cached metadata already stores the exact child 
> list.
> # Globbing a har's contents resulted in unnecessary stats for every leaf path.
>  
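Deficiency #1 above can be illustrated generically (hypothetical local-file I/O, not the actual har index code): keeping one open handle lets each consecutive range read continue exactly where the previous one stopped, avoiding the repeated open-and-seek cycle.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class RangeReadDemo {
    public static void main(String[] args) throws IOException {
        // Stand-in for an archive index: 1 KiB split into four 256-byte ranges.
        Path idx = Files.createTempFile("archive-index", ".bin");
        Files.write(idx, new byte[1024]);
        try (RandomAccessFile in = new RandomAccessFile(idx.toFile(), "r")) {
            byte[] buf = new byte[256];
            int ranges = 0;
            // One handle for all ranges: the file position advances naturally,
            // so no re-open and no seek back to the previous stopping point.
            for (long off = 0; off < 1024; off += 256) {
                in.readFully(buf);
                ranges++;
            }
            System.out.println("ranges read with one handle: " + ranges); // prints 4
        } finally {
            Files.delete(idx);
        }
    }
}
```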






[jira] [Commented] (HADOOP-17362) Doing hadoop ls on Har file triggers too many RPC calls

2020-11-13 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231848#comment-17231848
 ] 

Jim Brennan commented on HADOOP-17362:
--

I have committed this to trunk, branch-3.3, branch-3.2, and branch-3.1.

Thanks [~ahussein] and [~daryn] for the contribution!

 

> Doing hadoop ls on Har file triggers too many RPC calls
> ---
>
> Key: HADOOP-17362
> URL: https://issues.apache.org/jira/browse/HADOOP-17362
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> [~daryn] has noticed that invoking hadoop ls on a HAR file takes too much 
> time.
> The har system has multiple deficiencies that significantly impacted 
> performance:
> # Parsing the master index references ranges within the archive index. Each 
> range required re-opening the hdfs input stream and seeking to the same 
> location where it previously stopped.
> # Listing a har stats the archive index for every "directory". The per-call 
> cache used a unique key for each stat, rendering the cache useless and 
> significantly increasing memory pressure.
> # Determining the children of a directory scans the entire archive contents 
> and filters out children. The cached metadata already stores the exact child 
> list.
> # Globbing a har's contents resulted in unnecessary stats for every leaf path.
>  






[jira] [Issue Comment Deleted] (HADOOP-17362) Doing hadoop ls on Har file triggers too many RPC calls

2020-11-13 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-17362:
-
Comment: was deleted

(was: Measurements taken by [~daryn] when he made this change internally:
{quote}
Given a real world har for 1 hour of tez jobs from one of our research clusters:
- listing formerly took 12s with ~47.5k rpcs, reduced to 6s with 10 rpcs
- globbing killed after 11min with hundreds of thousands of rpcs with 16G heap, 
reduced to 11s with 10 rpcs and 768M heap
{quote})

> Doing hadoop ls on Har file triggers too many RPC calls
> ---
>
> Key: HADOOP-17362
> URL: https://issues.apache.org/jira/browse/HADOOP-17362
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> [~daryn] has noticed that invoking hadoop ls on a HAR file takes too much 
> time.
> The har system has multiple deficiencies that significantly impacted 
> performance:
> # Parsing the master index references ranges within the archive index. Each 
> range required re-opening the hdfs input stream and seeking to the same 
> location where it previously stopped.
> # Listing a har stats the archive index for every "directory". The per-call 
> cache used a unique key for each stat, rendering the cache useless and 
> significantly increasing memory pressure.
> # Determining the children of a directory scans the entire archive contents 
> and filters out children. The cached metadata already stores the exact child 
> list.
> # Globbing a har's contents resulted in unnecessary stats for every leaf path.
>  






[jira] [Commented] (HADOOP-17362) Doing hadoop ls on Har file triggers too many RPC calls

2020-11-13 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231802#comment-17231802
 ] 

Jim Brennan commented on HADOOP-17362:
--

Measurements taken by [~daryn] when he made this change internally:
{quote}
Given a real world har for 1 hour of tez jobs from one of our research clusters:
- listing formerly took 12s with ~47.5k rpcs, reduced to 6s with 10 rpcs
- globbing killed after 11min with hundreds of thousands of rpcs with 16G heap, 
reduced to 11s with 10 rpcs and 768M heap
{quote}

> Doing hadoop ls on Har file triggers too many RPC calls
> ---
>
> Key: HADOOP-17362
> URL: https://issues.apache.org/jira/browse/HADOOP-17362
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> [~daryn] has noticed that invoking hadoop ls on a HAR file takes too much 
> time.
> The har system has multiple deficiencies that significantly impacted 
> performance:
> # Parsing the master index references ranges within the archive index. Each 
> range required re-opening the hdfs input stream and seeking to the same 
> location where it previously stopped.
> # Listing a har stats the archive index for every "directory". The per-call 
> cache used a unique key for each stat, rendering the cache useless and 
> significantly increasing memory pressure.
> # Determining the children of a directory scans the entire archive contents 
> and filters out children. The cached metadata already stores the exact child 
> list.
> # Globbing a har's contents resulted in unnecessary stats for every leaf path.
>  






[jira] [Commented] (HADOOP-17306) RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09

2020-11-09 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17228751#comment-17228751
 ] 

Jim Brennan commented on HADOOP-17306:
--

From what I can tell, fixing the affected unit tests should be sufficient for 
this: make them use the same modification time in the resource as is 
returned by the RawLocalFileSystem with this change.

I haven't been able to identify any cases in the code where this will introduce 
new problems, but I can't say for certain it won't.   My main concern was 
whether this change would cause unnecessary re-downloading of localized 
resources during a rolling upgrade.   The {{FSDownload.verifyAndCopy()}} 
failures in the unit tests are cases where the source FS is a local filesystem, 
and we are comparing with a pre-set timestamp.  I think in the normal case, 
both the stat and the timestamp will have come from something like HDFS.

I agree that this change will happen in any case when we move to a JDK that 
changes the underlying behavior.



> RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09
> ---
>
> Key: HADOOP-17306
> URL: https://issues.apache.org/jira/browse/HADOOP-17306
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> RawLocalFileSystem's FileStatus uses the {{File.lastModified()}} API from the JDK.
> This API loses milliseconds due to a JDK bug:
> [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8177809]
> This bug is fixed in JDK 10 b09 onwards and still exists in JDK 8, which is 
> still used in many production environments.
> Apparently, {{Files.getLastModifiedTime()}} from Java's NIO package returns the 
> correct time.
> Use {{Files.getLastModifiedTime()}} instead of {{File.lastModified()}} as a 
> workaround.
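The two APIs can be compared side by side in a small sketch (illustrative only; whether the legacy value is actually truncated depends on the JDK build and filesystem, so the check below only asserts the NIO value is at least as precise):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MtimeDemo {
    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("mtime", ".tmp");
        try {
            // Legacy API: on affected JDK 8 builds this can truncate the
            // modification time to whole seconds.
            long legacy = new File(p.toString()).lastModified();
            // NIO API: reports the full-precision timestamp.
            long nio = Files.getLastModifiedTime(p).toMillis();
            // Truncation only ever rounds down, so the NIO value is never
            // smaller than the legacy one; on JDK >= 10 b09 they agree.
            System.out.println(nio >= legacy); // prints "true"
        } finally {
            Files.delete(p);
        }
    }
}
```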






[jira] [Commented] (HADOOP-17342) Creating a token identifier should not do kerberos name resolution

2020-11-05 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227068#comment-17227068
 ] 

Jim Brennan commented on HADOOP-17342:
--

Thanks [~ebadger]!

> Creating a token identifier should not do kerberos name resolution
> --
>
> Key: HADOOP-17342
> URL: https://issues.apache.org/jira/browse/HADOOP-17342
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 2.10.1, 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Fix For: 3.3.1, 3.4.0, 3.1.5, 2.10.2, 3.2.3
>
> Attachments: HADOOP-17342.001.patch
>
>
> This problem was found and fixed internally for us by [~daryn].
> Creating a token identifier tries to do auth_to_local short username 
> translation. The authentication process creates a blank token identifier for 
> deserializing the wire format. Attempting to resolve an empty username is 
> useless work.
> Discovered the issue during fair call queue backoff testing. The readers are 
> unnecessary slowed down by this bug.






[jira] [Commented] (HADOOP-17306) RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09

2020-11-05 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226902#comment-17226902
 ] 

Jim Brennan commented on HADOOP-17306:
--

Note: it looks like the pre-commit build only tested hadoop-common, so I can see 
why the YARN failures were missed. I'm not sure why we don't run tests for all 
projects when we make changes in common.


> RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09
> 
>
> Key: HADOOP-17306
> URL: https://issues.apache.org/jira/browse/HADOOP-17306
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.2.3
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> RawLocalFileSystem's FileStatus uses the {{File.lastModified()}} API from the JDK.
> This API loses milliseconds due to a JDK bug:
> [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8177809]
> The bug is fixed in JDK 10 b09 onwards but still exists in JDK 8, which is 
> still widely used in production.
> Apparently, {{Files.getLastModifiedTime()}} from Java's nio package returns the 
> correct time.
> Use {{Files.getLastModifiedTime()}} instead of {{File.lastModified()}} as a 
> workaround. 






[jira] [Reopened] (HADOOP-17306) RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09

2020-11-05 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan reopened HADOOP-17306:
--

I have reverted this commit from trunk, branch-3.3, branch-3.2, and 
branch-3.2.2.
[~vinayakumarb] please address all of the unit test failures when you resubmit.
I also think we need to review references to modified times in the source base 
to be sure we are not breaking things with this change.  Yarn Resource 
Localization is one area, but there may be others.  Timestamps are sometimes 
stored in state-stores, so there may be compatibility issues with this change 
as well.


> RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09
> 
>
> Key: HADOOP-17306
> URL: https://issues.apache.org/jira/browse/HADOOP-17306
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.2.3
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> RawLocalFileSystem's FileStatus uses the {{File.lastModified()}} API from the JDK.
> This API loses milliseconds due to a JDK bug:
> [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8177809]
> The bug is fixed in JDK 10 b09 onwards but still exists in JDK 8, which is 
> still widely used in production.
> Apparently, {{Files.getLastModifiedTime()}} from Java's nio package returns the 
> correct time.
> Use {{Files.getLastModifiedTime()}} instead of {{File.lastModified()}} as a 
> workaround. 






[jira] [Commented] (HADOOP-17306) RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09

2020-11-05 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226795#comment-17226795
 ] 

Jim Brennan commented on HADOOP-17306:
--

Unit test failures in [YARN-10479] were caused by this.


> RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09
> 
>
> Key: HADOOP-17306
> URL: https://issues.apache.org/jira/browse/HADOOP-17306
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.2.3
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> RawLocalFileSystem's FileStatus uses the {{File.lastModified()}} API from the JDK.
> This API loses milliseconds due to a JDK bug:
> [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8177809]
> The bug is fixed in JDK 10 b09 onwards but still exists in JDK 8, which is 
> still widely used in production.
> Apparently, {{Files.getLastModifiedTime()}} from Java's nio package returns the 
> correct time.
> Use {{Files.getLastModifiedTime()}} instead of {{File.lastModified()}} as a 
> workaround. 






[jira] [Commented] (HADOOP-17306) RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09

2020-11-05 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226752#comment-17226752
 ] 

Jim Brennan commented on HADOOP-17306:
--

Here is an example of the type of failure I am seeing in the unit tests:
{noformat}
java.io.IOException: Resource 
file:/home/jenkins/jenkins-home/workspace/hadoop-qbt-trunk-java8-linux-x86_64/sourcedir/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/TestContainerManager-tmpDir/scriptFile.sh
 changed on src filesystem - expected: "2020-10-24T09:29:28.000+", was: 
"2020-10-24T09:29:28.936+", current time: "2020-10-24T09:29:29.586+"
at 
org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:278)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:68)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:415)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:412)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:412)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:247)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:240)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:228)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}
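The mismatch above (expected "…28.000+", was "…28.936+") is the millisecond truncation in action. As an illustrative standalone sketch (not Hadoop code), the difference between the two JDK APIs can be demonstrated directly; on a JDK 8 build affected by JDK-8177809 the `java.io` value comes back truncated to whole seconds:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

public class LastModifiedPrecision {
    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("lm-demo", ".txt");
        // Stamp the file with a time that has non-zero milliseconds.
        Files.setLastModifiedTime(p, FileTime.fromMillis(1_600_000_000_123L));

        // java.io API: truncated to whole seconds on affected JDK 8 builds (JDK-8177809).
        long viaIo = p.toFile().lastModified();
        // java.nio API: reports millisecond precision.
        long viaNio = Files.getLastModifiedTime(p).toMillis();

        System.out.println("File.lastModified():         " + viaIo);
        System.out.println("Files.getLastModifiedTime(): " + viaNio);
        Files.delete(p);
    }
}
```

This also shows why switching RawLocalFileSystem from one API to the other changes observed timestamps: a value stored while the old (truncated) API was in use will not compare equal to the full-precision value read back through nio.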

> RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09
> 
>
> Key: HADOOP-17306
> URL: https://issues.apache.org/jira/browse/HADOOP-17306
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.2.3
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> RawLocalFileSystem's FileStatus uses the {{File.lastModified()}} API from the JDK.
> This API loses milliseconds due to a JDK bug:
> [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8177809]
> The bug is fixed in JDK 10 b09 onwards but still exists in JDK 8, which is 
> still widely used in production.
> Apparently, {{Files.getLastModifiedTime()}} from Java's nio package returns the 
> correct time.
> Use {{Files.getLastModifiedTime()}} instead of {{File.lastModified()}} as a 
> workaround. 






[jira] [Commented] (HADOOP-17306) RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09

2020-11-05 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226747#comment-17226747
 ] 

Jim Brennan commented on HADOOP-17306:
--

[~aajisaka], [~ayushsaxena] any comment?  My inclination is to revert this 
change.


> RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09
> 
>
> Key: HADOOP-17306
> URL: https://issues.apache.org/jira/browse/HADOOP-17306
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.2.3
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> RawLocalFileSystem's FileStatus uses the {{File.lastModified()}} API from the JDK.
> This API loses milliseconds due to a JDK bug:
> [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8177809]
> The bug is fixed in JDK 10 b09 onwards but still exists in JDK 8, which is 
> still widely used in production.
> Apparently, {{Files.getLastModifiedTime()}} from Java's nio package returns the 
> correct time.
> Use {{Files.getLastModifiedTime()}} instead of {{File.lastModified()}} as a 
> workaround. 






[jira] [Commented] (HADOOP-17306) RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09

2020-11-04 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226305#comment-17226305
 ] 

Jim Brennan commented on HADOOP-17306:
--

This change is causing a large number of YARN unit tests to fail.
 We should consider reverting it until we can address the issues.

I am concerned that this might be a problem not just for tests, but also for 
production code.

This was last build on trunk before this change went in:
 
[https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/303/#showFailuresLink]

This was the first build with this change:
 
[https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/304/#showFailuresLink]

I believe many of the new nodemanager test failures are due to this change. 
Many of them are failing because the timestamps for localized resources do not 
match what they were set to.
 Example failure:
{noformat}
java.lang.AssertionError: ProcessStartFile doesn't exist!
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManager.prepareInitialContainer(TestContainerManager.java:1040)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManager.testContainerUpgradeLocalizationFailure(TestContainerManager.java:819)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
{noformat}
 
cc: [~vinayakumarb], [~hexiaoqiao], [~epayne]


> RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09
> 
>
> Key: HADOOP-17306
> URL: https://issues.apache.org/jira/browse/HADOOP-17306
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.2.3
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> RawLocalFileSystem's FileStatus uses the {{File.lastModified()}} API from the JDK.
> This API loses milliseconds due to a JDK bug:
> [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8177809]
> The bug is fixed in JDK 10 b09 onwards but still exists in JDK 8, which is 
> still widely used in production.
> Apparently, {{Files.getLastModifiedTime()}} from Java's nio package returns the 
> correct time.
> Use {{Files.getLastModifiedTime()}} instead of {{File.lastModified()}} as a 
> workaround. 




[jira] [Commented] (HADOOP-17342) Creating a token identifier should not do kerberos name resolution

2020-11-04 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226085#comment-17226085
 ] 

Jim Brennan commented on HADOOP-17342:
--

[~kihwal], [~ebadger] can one of you please review?


> Creating a token identifier should not do kerberos name resolution
> --
>
> Key: HADOOP-17342
> URL: https://issues.apache.org/jira/browse/HADOOP-17342
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 2.10.1, 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-17342.001.patch
>
>
> This problem was found and fixed internally for us by [~daryn].
> Creating a token identifier tries to do auth_to_local short username 
> translation. The authentication process creates a blank token identifier for 
> deserializing the wire format. Attempting to resolve an empty username is 
> useless work.
> Discovered the issue during fair call queue backoff testing. The readers are 
> unnecessarily slowed down by this bug.






[jira] [Commented] (HADOOP-17342) Creating a token identifier should not do kerberos name resolution

2020-11-03 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17225686#comment-17225686
 ] 

Jim Brennan commented on HADOOP-17342:
--

I believe the failure in TestLdapGroupsMapping is unrelated - there is already 
a Jira to fix that: HADOOP-17340

I don't think we need to add a new test case for this.  Any existing tests that 
use this constructor will test it implicitly.  I think a code review in this 
case would be sufficient.




> Creating a token identifier should not do kerberos name resolution
> --
>
> Key: HADOOP-17342
> URL: https://issues.apache.org/jira/browse/HADOOP-17342
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 2.10.1, 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-17342.001.patch
>
>
> This problem was found and fixed internally for us by [~daryn].
> Creating a token identifier tries to do auth_to_local short username 
> translation. The authentication process creates a blank token identifier for 
> deserializing the wire format. Attempting to resolve an empty username is 
> useless work.
> Discovered the issue during fair call queue backoff testing. The readers are 
> unnecessarily slowed down by this bug.






[jira] [Updated] (HADOOP-17342) Creating a token identifier should not do kerberos name resolution

2020-11-03 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-17342:
-
Status: Patch Available  (was: Open)

> Creating a token identifier should not do kerberos name resolution
> --
>
> Key: HADOOP-17342
> URL: https://issues.apache.org/jira/browse/HADOOP-17342
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 2.10.1, 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-17342.001.patch
>
>
> This problem was found and fixed internally for us by [~daryn].
> Creating a token identifier tries to do auth_to_local short username 
> translation. The authentication process creates a blank token identifier for 
> deserializing the wire format. Attempting to resolve an empty username is 
> useless work.
> Discovered the issue during fair call queue backoff testing. The readers are 
> unnecessarily slowed down by this bug.






[jira] [Updated] (HADOOP-17342) Creating a token identifier should not do kerberos name resolution

2020-11-03 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-17342:
-
Attachment: HADOOP-17342.001.patch

> Creating a token identifier should not do kerberos name resolution
> --
>
> Key: HADOOP-17342
> URL: https://issues.apache.org/jira/browse/HADOOP-17342
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 2.10.1, 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-17342.001.patch
>
>
> This problem was found and fixed internally for us by [~daryn].
> Creating a token identifier tries to do auth_to_local short username 
> translation. The authentication process creates a blank token identifier for 
> deserializing the wire format. Attempting to resolve an empty username is 
> useless work.
> Discovered the issue during fair call queue backoff testing. The readers are 
> unnecessarily slowed down by this bug.






[jira] [Assigned] (HADOOP-17342) Creating a token identifier should not do kerberos name resolution

2020-11-03 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan reassigned HADOOP-17342:


Assignee: Jim Brennan

> Creating a token identifier should not do kerberos name resolution
> --
>
> Key: HADOOP-17342
> URL: https://issues.apache.org/jira/browse/HADOOP-17342
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 2.10.1, 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>
> This problem was found and fixed internally for us by [~daryn].
> Creating a token identifier tries to do auth_to_local short username 
> translation. The authentication process creates a blank token identifier for 
> deserializing the wire format. Attempting to resolve an empty username is 
> useless work.
> Discovered the issue during fair call queue backoff testing. The readers are 
> unnecessarily slowed down by this bug.






[jira] [Created] (HADOOP-17342) Creating a token identifier should not do kerberos name resolution

2020-11-03 Thread Jim Brennan (Jira)
Jim Brennan created HADOOP-17342:


 Summary: Creating a token identifier should not do kerberos name 
resolution
 Key: HADOOP-17342
 URL: https://issues.apache.org/jira/browse/HADOOP-17342
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 2.10.1, 3.4.0
Reporter: Jim Brennan


This problem was found and fixed internally for us by [~daryn].

Creating a token identifier tries to do auth_to_local short username 
translation. The authentication process creates a blank token identifier for 
deserializing the wire format. Attempting to resolve an empty username is 
useless work.

Discovered the issue during fair call queue backoff testing. The readers are 
unnecessarily slowed down by this bug.








[jira] [Commented] (HADOOP-17249) Upgrade jackson-databind to 2.10 on branch-2.10

2020-09-08 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192311#comment-17192311
 ] 

Jim Brennan commented on HADOOP-17249:
--

[~iwasakims] there were some concerns about breaking downstream projects in 
[HADOOP-17094]

https://issues.apache.org/jira/browse/HADOOP-17094?focusedCommentId=17145688&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17145688



> Upgrade jackson-databind to 2.10 on branch-2.10
> ---
>
> Key: HADOOP-17249
> URL: https://issues.apache.org/jira/browse/HADOOP-17249
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.10.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This is filed to test backporting HADOOP-16905 to branch-2.10.






[jira] [Commented] (HADOOP-17251) Upgrade netty-all to 4.1.50.Final on branch-2.10

2020-09-08 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192304#comment-17192304
 ] 

Jim Brennan commented on HADOOP-17251:
--

+1 this looks good to me.

> Upgrade netty-all to 4.1.50.Final on branch-2.10
> 
>
> Key: HADOOP-17251
> URL: https://issues.apache.org/jira/browse/HADOOP-17251
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> netty-all seems to be easily updated to fix HADOOP-16918.






[jira] [Commented] (HADOOP-17127) Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime

2020-07-15 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158485#comment-17158485
 ] 

Jim Brennan commented on HADOOP-17127:
--

Thanks [~xkrogen]!

> Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime
> --
>
> Key: HADOOP-17127
> URL: https://issues.apache.org/jira/browse/HADOOP-17127
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 3.1.4, 3.2.2, 2.10.1, 3.3.1, 3.4.0
>
> Attachments: HADOOP-17127-branch-2.10.001.patch, 
> HADOOP-17127-branch-3.1.001.patch, HADOOP-17127-branch-3.2.001.patch, 
> HADOOP-17127.001.patch, HADOOP-17127.002.patch
>
>
> While making an internal change to use {{TimeUnit.MICROSECONDS}} instead of 
> {{TimeUnit.MILLISECONDS}} for rpc details, we found that we also had to 
> modify this code in DecayRpcScheduler.addResponseTime() to initialize 
> {{queueTime}} and {{processingTime}} with the correct units.
> {noformat}
> long queueTime = details.get(Timing.QUEUE, TimeUnit.MILLISECONDS);
> long processingTime = details.get(Timing.PROCESSING, 
> TimeUnit.MILLISECONDS);
> {noformat}
> If we change these to use {{RpcMetrics.TIMEUNIT}} it is simpler.
> We also found one test case in TestRPC that was assuming the units were 
> milliseconds.
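The precision argument above can be sketched with plain {{java.util.concurrent.TimeUnit}} (this is an illustrative stand-in, not the actual Hadoop {{ProcessingDetails}}/{{RpcMetrics}} classes): if the metrics are tracked in a finer unit, asking for them back in hard-coded milliseconds truncates the value.

```java
import java.util.concurrent.TimeUnit;

public class RpcTimeUnitDemo {
    // Stand-in for RpcMetrics.TIMEUNIT; assume metrics are kept in microseconds.
    static final TimeUnit METRICS_UNIT = TimeUnit.MICROSECONDS;

    // Stand-in for details.get(Timing, TimeUnit): the raw value is stored in
    // METRICS_UNIT and converted to whatever unit the caller asks for.
    static long get(long rawMicros, TimeUnit wanted) {
        return wanted.convert(rawMicros, METRICS_UNIT);
    }

    public static void main(String[] args) {
        long queueMicros = 1_500; // 1.5 ms spent in the call queue

        // Hard-coding MILLISECONDS truncates 1.5 ms down to 1 ms:
        System.out.println(get(queueMicros, TimeUnit.MILLISECONDS)); // 1
        // Asking in the scheduler's own unit keeps the full value:
        System.out.println(get(queueMicros, METRICS_UNIT));          // 1500
    }
}
```

Reading the value back in {{RpcMetrics.TIMEUNIT}} rather than a hard-coded unit keeps the conversion lossless regardless of what unit the metrics are configured to use.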






[jira] [Commented] (HADOOP-17127) Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime

2020-07-14 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157624#comment-17157624
 ] 

Jim Brennan commented on HADOOP-17127:
--

[~xkrogen] I have verified that the trunk patch applies for branch-3.3, and I 
have added patches for branch-3.2, branch-3.1, and branch-2.10.

Can we get this pulled back to those branches?


> Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime
> --
>
> Key: HADOOP-17127
> URL: https://issues.apache.org/jira/browse/HADOOP-17127
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: HADOOP-17127-branch-2.10.001.patch, 
> HADOOP-17127-branch-3.1.001.patch, HADOOP-17127-branch-3.2.001.patch, 
> HADOOP-17127.001.patch, HADOOP-17127.002.patch
>
>
> While making an internal change to use {{TimeUnit.MICROSECONDS}} instead of 
> {{TimeUnit.MILLISECONDS}} for rpc details, we found that we also had to 
> modify this code in DecayRpcScheduler.addResponseTime() to initialize 
> {{queueTime}} and {{processingTime}} with the correct units.
> {noformat}
> long queueTime = details.get(Timing.QUEUE, TimeUnit.MILLISECONDS);
> long processingTime = details.get(Timing.PROCESSING, 
> TimeUnit.MILLISECONDS);
> {noformat}
> If we change these to use {{RpcMetrics.TIMEUNIT}} it is simpler.
> We also found one test case in TestRPC that was assuming the units were 
> milliseconds.






[jira] [Updated] (HADOOP-17127) Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime

2020-07-14 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-17127:
-
Attachment: HADOOP-17127-branch-2.10.001.patch

> Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime
> --
>
> Key: HADOOP-17127
> URL: https://issues.apache.org/jira/browse/HADOOP-17127
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: HADOOP-17127-branch-2.10.001.patch, 
> HADOOP-17127-branch-3.1.001.patch, HADOOP-17127-branch-3.2.001.patch, 
> HADOOP-17127.001.patch, HADOOP-17127.002.patch
>
>
> While making an internal change to use {{TimeUnit.MICROSECONDS}} instead of 
> {{TimeUnit.MILLISECONDS}} for rpc details, we found that we also had to 
> modify this code in DecayRpcScheduler.addResponseTime() to initialize 
> {{queueTime}} and {{processingTime}} with the correct units.
> {noformat}
> long queueTime = details.get(Timing.QUEUE, TimeUnit.MILLISECONDS);
> long processingTime = details.get(Timing.PROCESSING, 
> TimeUnit.MILLISECONDS);
> {noformat}
> If we change these to use {{RpcMetrics.TIMEUNIT}} it is simpler.
> We also found one test case in TestRPC that was assuming the units were 
> milliseconds.






[jira] [Updated] (HADOOP-17127) Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime

2020-07-14 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-17127:
-
Attachment: HADOOP-17127-branch-3.1.001.patch

> Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime
> --
>
> Key: HADOOP-17127
> URL: https://issues.apache.org/jira/browse/HADOOP-17127
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: HADOOP-17127-branch-3.1.001.patch, 
> HADOOP-17127-branch-3.2.001.patch, HADOOP-17127.001.patch, 
> HADOOP-17127.002.patch
>
>
> While making an internal change to use {{TimeUnit.MICROSECONDS}} instead of 
> {{TimeUnit.MILLISECONDS}} for rpc details, we found that we also had to 
> modify this code in DecayRpcScheduler.addResponseTime() to initialize 
> {{queueTime}} and {{processingTime}} with the correct units.
> {noformat}
> long queueTime = details.get(Timing.QUEUE, TimeUnit.MILLISECONDS);
> long processingTime = details.get(Timing.PROCESSING, 
> TimeUnit.MILLISECONDS);
> {noformat}
> If we change these to use {{RpcMetrics.TIMEUNIT}} it is simpler.
> We also found one test case in TestRPC that was assuming the units were 
> milliseconds.






[jira] [Updated] (HADOOP-17127) Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime

2020-07-14 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-17127:
-
Attachment: HADOOP-17127-branch-3.2.001.patch

> Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime
> --
>
> Key: HADOOP-17127
> URL: https://issues.apache.org/jira/browse/HADOOP-17127
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: HADOOP-17127-branch-3.1.001.patch, 
> HADOOP-17127-branch-3.2.001.patch, HADOOP-17127.001.patch, 
> HADOOP-17127.002.patch
>
>
> While making an internal change to use {{TimeUnit.MICROSECONDS}} instead of 
> {{TimeUnit.MILLISECONDS}} for rpc details, we found that we also had to 
> modify this code in DecayRpcScheduler.addResponseTime() to initialize 
> {{queueTime}} and {{processingTime}} with the correct units.
> {noformat}
> long queueTime = details.get(Timing.QUEUE, TimeUnit.MILLISECONDS);
> long processingTime = details.get(Timing.PROCESSING, 
> TimeUnit.MILLISECONDS);
> {noformat}
> If we change these to use {{RpcMetrics.TIMEUNIT}} it is simpler.
> We also found one test case in TestRPC that was assuming the units were 
> milliseconds.






[jira] [Commented] (HADOOP-17127) Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime

2020-07-14 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157572#comment-17157572
 ] 

Jim Brennan commented on HADOOP-17127:
--

Thanks [~xkrogen]!  I am testing out patches for the other branches and I will 
put them up once they are ready.


> Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime
> --
>
> Key: HADOOP-17127
> URL: https://issues.apache.org/jira/browse/HADOOP-17127
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: HADOOP-17127.001.patch, HADOOP-17127.002.patch
>
>
> While making an internal change to use {{TimeUnit.MICROSECONDS}} instead of 
> {{TimeUnit.MILLISECONDS}} for rpc details, we found that we also had to 
> modify this code in DecayRpcScheduler.addResponseTime() to initialize 
> {{queueTime}} and {{processingTime}} with the correct units.
> {noformat}
> long queueTime = details.get(Timing.QUEUE, TimeUnit.MILLISECONDS);
> long processingTime = details.get(Timing.PROCESSING, 
> TimeUnit.MILLISECONDS);
> {noformat}
> If we change these to use {{RpcMetrics.TIMEUNIT}} it is simpler.
> We also found one test case in TestRPC that was assuming the units were 
> milliseconds.






[jira] [Commented] (HADOOP-17127) Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime

2020-07-14 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157475#comment-17157475
 ] 

Jim Brennan commented on HADOOP-17127:
--

[~cgregori], [~xkrogen] can you please review?


> Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime
> --
>
> Key: HADOOP-17127
> URL: https://issues.apache.org/jira/browse/HADOOP-17127
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HADOOP-17127.001.patch, HADOOP-17127.002.patch
>
>
> While making an internal change to use {{TimeUnit.MICROSECONDS}} instead of 
> {{TimeUnit.MILLISECONDS}} for rpc details, we found that we also had to 
> modify this code in DecayRpcScheduler.addResponseTime() to initialize 
> {{queueTime}} and {{processingTime}} with the correct units.
> {noformat}
> long queueTime = details.get(Timing.QUEUE, TimeUnit.MILLISECONDS);
> long processingTime = details.get(Timing.PROCESSING, 
> TimeUnit.MILLISECONDS);
> {noformat}
> If we change these to use {{RpcMetrics.TIMEUNIT}} it is simpler.
> We also found one test case in TestRPC that was assuming the units were 
> milliseconds.






[jira] [Updated] (HADOOP-17127) Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime

2020-07-14 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-17127:
-
Attachment: HADOOP-17127.002.patch

> Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime
> --
>
> Key: HADOOP-17127
> URL: https://issues.apache.org/jira/browse/HADOOP-17127
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HADOOP-17127.001.patch, HADOOP-17127.002.patch
>
>
> While making an internal change to use {{TimeUnit.MICROSECONDS}} instead of 
> {{TimeUnit.MILLISECONDS}} for rpc details, we found that we also had to 
> modify this code in DecayRpcScheduler.addResponseTime() to initialize 
> {{queueTime}} and {{processingTime}} with the correct units.
> {noformat}
> long queueTime = details.get(Timing.QUEUE, TimeUnit.MILLISECONDS);
> long processingTime = details.get(Timing.PROCESSING, 
> TimeUnit.MILLISECONDS);
> {noformat}
> If we change these to use {{RpcMetrics.TIMEUNIT}} it is simpler.
> We also found one test case in TestRPC that was assuming the units were 
> milliseconds.






[jira] [Commented] (HADOOP-17127) Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime

2020-07-14 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157397#comment-17157397
 ] 

Jim Brennan commented on HADOOP-17127:
--

I've submitted patch 002 to fix the checkstyle issue.


> Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime
> --
>
> Key: HADOOP-17127
> URL: https://issues.apache.org/jira/browse/HADOOP-17127
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HADOOP-17127.001.patch, HADOOP-17127.002.patch
>
>
> While making an internal change to use {{TimeUnit.MICROSECONDS}} instead of 
> {{TimeUnit.MILLISECONDS}} for rpc details, we found that we also had to 
> modify this code in DecayRpcScheduler.addResponseTime() to initialize 
> {{queueTime}} and {{processingTime}} with the correct units.
> {noformat}
> long queueTime = details.get(Timing.QUEUE, TimeUnit.MILLISECONDS);
> long processingTime = details.get(Timing.PROCESSING, 
> TimeUnit.MILLISECONDS);
> {noformat}
> If we change these to use {{RpcMetrics.TIMEUNIT}} it is simpler.
> We also found one test case in TestRPC that was assuming the units were 
> milliseconds.






[jira] [Updated] (HADOOP-17127) Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime

2020-07-13 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-17127:
-
Attachment: HADOOP-17127.001.patch

> Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime
> --
>
> Key: HADOOP-17127
> URL: https://issues.apache.org/jira/browse/HADOOP-17127
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HADOOP-17127.001.patch
>
>
> While making an internal change to use {{TimeUnit.MICROSECONDS}} instead of 
> {{TimeUnit.MILLISECONDS}} for rpc details, we found that we also had to 
> modify this code in DecayRpcScheduler.addResponseTime() to initialize 
> {{queueTime}} and {{processingTime}} with the correct units.
> {noformat}
> long queueTime = details.get(Timing.QUEUE, TimeUnit.MILLISECONDS);
> long processingTime = details.get(Timing.PROCESSING, 
> TimeUnit.MILLISECONDS);
> {noformat}
> If we change these to use {{RpcMetrics.TIMEUNIT}} it is simpler.
> We also found one test case in TestRPC that was assuming the units were 
> milliseconds.






[jira] [Updated] (HADOOP-17127) Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime

2020-07-13 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-17127:
-
Status: Patch Available  (was: Open)

> Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime
> --
>
> Key: HADOOP-17127
> URL: https://issues.apache.org/jira/browse/HADOOP-17127
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HADOOP-17127.001.patch
>
>
> While making an internal change to use {{TimeUnit.MICROSECONDS}} instead of 
> {{TimeUnit.MILLISECONDS}} for rpc details, we found that we also had to 
> modify this code in DecayRpcScheduler.addResponseTime() to initialize 
> {{queueTime}} and {{processingTime}} with the correct units.
> {noformat}
> long queueTime = details.get(Timing.QUEUE, TimeUnit.MILLISECONDS);
> long processingTime = details.get(Timing.PROCESSING, 
> TimeUnit.MILLISECONDS);
> {noformat}
> If we change these to use {{RpcMetrics.TIMEUNIT}} it is simpler.
> We also found one test case in TestRPC that was assuming the units were 
> milliseconds.






[jira] [Created] (HADOOP-17127) Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime

2020-07-13 Thread Jim Brennan (Jira)
Jim Brennan created HADOOP-17127:


 Summary: Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and 
processingTime
 Key: HADOOP-17127
 URL: https://issues.apache.org/jira/browse/HADOOP-17127
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Reporter: Jim Brennan
Assignee: Jim Brennan


While making an internal change to use {{TimeUnit.MICROSECONDS}} instead of 
{{TimeUnit.MILLISECONDS}} for rpc details, we found that we also had to modify 
this code in DecayRpcScheduler.addResponseTime() to initialize {{queueTime}} 
and {{processingTime}} with the correct units.
{noformat}
long queueTime = details.get(Timing.QUEUE, TimeUnit.MILLISECONDS);
long processingTime = details.get(Timing.PROCESSING, TimeUnit.MILLISECONDS);
{noformat}
If we change these to use {{RpcMetrics.TIMEUNIT}} it is simpler.

We also found one test case in TestRPC that was assuming the units were 
milliseconds.






[jira] [Updated] (HADOOP-16361) TestSecureLogins#testValidKerberosName fails on branch-2

2020-04-23 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-16361:
-
Fix Version/s: 2.10.1

> TestSecureLogins#testValidKerberosName fails on branch-2
> 
>
> Key: HADOOP-16361
> URL: https://issues.apache.org/jira/browse/HADOOP-16361
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.10.0, 2.9.2, 2.8.5
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Fix For: 2.10.1
>
> Attachments: HADOOP-16361-branch-2.10.001.patch, 
> HADOOP-16361-branch-2.10.002.patch
>
>
> This test is failing in branch-2.
> {noformat}
> [ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 26.917 s <<< FAILURE! - in org.apache.hadoop.registry.secure.TestSecureLogins
> [ERROR] 
> testValidKerberosName(org.apache.hadoop.registry.secure.TestSecureLogins)  
> Time elapsed: 0.007 s  <<< ERROR!
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to zookeeper/localhost
>   at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:401)
>   at 
> org.apache.hadoop.registry.secure.TestSecureLogins.testValidKerberosName(TestSecureLogins.java:182)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}
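For background, the KerberosName$NoMatchingRule error in the trace above is thrown when none of the configured hadoop.security.auth_to_local rules matches a principal. A minimal rule set that would map a two-component principal such as zookeeper/localhost@REALM to a short name looks roughly like the sketch below. This is illustrative only and is not the actual fix, which is in the attached patch files:

```
RULE:[2:$1@$0](zookeeper@.*)s/.*/zookeeper/
DEFAULT
```

Here [2:$1@$0] selects two-component principals and rewrites them as "firstComponent@realm" before the regex is applied; the trailing DEFAULT rule handles ordinary principals in the local realm.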






[jira] [Updated] (HADOOP-16361) TestSecureLogins#testValidKerberosName fails on branch-2

2020-04-23 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-16361:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> TestSecureLogins#testValidKerberosName fails on branch-2
> 
>
> Key: HADOOP-16361
> URL: https://issues.apache.org/jira/browse/HADOOP-16361
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.10.0, 2.9.2, 2.8.5
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Fix For: 2.10.1
>
> Attachments: HADOOP-16361-branch-2.10.001.patch, 
> HADOOP-16361-branch-2.10.002.patch
>
>
> This test is failing in branch-2.
> {noformat}
> [ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 26.917 s <<< FAILURE! - in org.apache.hadoop.registry.secure.TestSecureLogins
> [ERROR] 
> testValidKerberosName(org.apache.hadoop.registry.secure.TestSecureLogins)  
> Time elapsed: 0.007 s  <<< ERROR!
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to zookeeper/localhost
>   at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:401)
>   at 
> org.apache.hadoop.registry.secure.TestSecureLogins.testValidKerberosName(TestSecureLogins.java:182)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}






[jira] [Commented] (HADOOP-16361) TestSecureLogins#testValidKerberosName fails on branch-2

2020-04-16 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085069#comment-17085069
 ] 

Jim Brennan commented on HADOOP-16361:
--

Thanks [~eyang]!

I thought we were not supposed to commit to branch-2 anymore?


> TestSecureLogins#testValidKerberosName fails on branch-2
> 
>
> Key: HADOOP-16361
> URL: https://issues.apache.org/jira/browse/HADOOP-16361
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.10.0, 2.9.2, 2.8.5
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-16361-branch-2.10.001.patch, 
> HADOOP-16361-branch-2.10.002.patch
>
>
> This test is failing in branch-2.
> {noformat}
> [ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 26.917 s <<< FAILURE! - in org.apache.hadoop.registry.secure.TestSecureLogins
> [ERROR] 
> testValidKerberosName(org.apache.hadoop.registry.secure.TestSecureLogins)  
> Time elapsed: 0.007 s  <<< ERROR!
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to zookeeper/localhost
>   at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:401)
>   at 
> org.apache.hadoop.registry.secure.TestSecureLogins.testValidKerberosName(TestSecureLogins.java:182)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}






[jira] [Commented] (HADOOP-16361) TestSecureLogins#testValidKerberosName fails on branch-2

2020-04-15 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084394#comment-17084394
 ] 

Jim Brennan commented on HADOOP-16361:
--

Thanks for the review [~eyang]!  Can you please commit this to branch-2.10?

> TestSecureLogins#testValidKerberosName fails on branch-2
> 
>
> Key: HADOOP-16361
> URL: https://issues.apache.org/jira/browse/HADOOP-16361
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.10.0, 2.9.2, 2.8.5
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-16361-branch-2.10.001.patch, 
> HADOOP-16361-branch-2.10.002.patch
>
>
> This test is failing in branch-2.
> {noformat}
> [ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 26.917 s <<< FAILURE! - in org.apache.hadoop.registry.secure.TestSecureLogins
> [ERROR] 
> testValidKerberosName(org.apache.hadoop.registry.secure.TestSecureLogins)  
> Time elapsed: 0.007 s  <<< ERROR!
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to zookeeper/localhost
>   at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:401)
>   at 
> org.apache.hadoop.registry.secure.TestSecureLogins.testValidKerberosName(TestSecureLogins.java:182)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}






[jira] [Updated] (HADOOP-16361) TestSecureLogins#testValidKerberosName fails on branch-2

2020-04-15 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-16361:
-
Attachment: HADOOP-16361-branch-2.10.002.patch

> TestSecureLogins#testValidKerberosName fails on branch-2
> 
>
> Key: HADOOP-16361
> URL: https://issues.apache.org/jira/browse/HADOOP-16361
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.10.0, 2.9.2, 2.8.5
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-16361-branch-2.10.001.patch, 
> HADOOP-16361-branch-2.10.002.patch
>
>
> This test is failing in branch-2.
> {noformat}
> [ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 26.917 s <<< FAILURE! - in org.apache.hadoop.registry.secure.TestSecureLogins
> [ERROR] 
> testValidKerberosName(org.apache.hadoop.registry.secure.TestSecureLogins)  
> Time elapsed: 0.007 s  <<< ERROR!
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to zookeeper/localhost
>   at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:401)
>   at 
> org.apache.hadoop.registry.secure.TestSecureLogins.testValidKerberosName(TestSecureLogins.java:182)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}






[jira] [Commented] (HADOOP-16361) TestSecureLogins#testValidKerberosName fails on branch-2

2020-04-15 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084316#comment-17084316
 ] 

Jim Brennan commented on HADOOP-16361:
--

Thanks for the review [~eyang]!  I will add the negative test case.

> TestSecureLogins#testValidKerberosName fails on branch-2
> 
>
> Key: HADOOP-16361
> URL: https://issues.apache.org/jira/browse/HADOOP-16361
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.10.0, 2.9.2, 2.8.5
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-16361-branch-2.10.001.patch
>
>
> This test is failing in branch-2.
> {noformat}
> [ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 26.917 s <<< FAILURE! - in org.apache.hadoop.registry.secure.TestSecureLogins
> [ERROR] 
> testValidKerberosName(org.apache.hadoop.registry.secure.TestSecureLogins)  
> Time elapsed: 0.007 s  <<< ERROR!
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to zookeeper/localhost
>   at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:401)
>   at 
> org.apache.hadoop.registry.secure.TestSecureLogins.testValidKerberosName(TestSecureLogins.java:182)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}






[jira] [Commented] (HADOOP-16361) TestSecureLogins#testValidKerberosName fails on branch-2

2020-04-15 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084298#comment-17084298
 ] 

Jim Brennan commented on HADOOP-16361:
--

[~daryn], [~kihwal], [~jhung] I've put up [~daryn]'s patch to fix this test. It
has been failing consistently on branch-2.10 for a very long time. It would be
good to get this pulled in.

> TestSecureLogins#testValidKerberosName fails on branch-2
> 
>
> Key: HADOOP-16361
> URL: https://issues.apache.org/jira/browse/HADOOP-16361
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.10.0, 2.9.2, 2.8.5
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-16361-branch-2.10.001.patch
>
>
> This test is failing in branch-2.
> {noformat}
> [ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 26.917 s <<< FAILURE! - in org.apache.hadoop.registry.secure.TestSecureLogins
> [ERROR] 
> testValidKerberosName(org.apache.hadoop.registry.secure.TestSecureLogins)  
> Time elapsed: 0.007 s  <<< ERROR!
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to zookeeper/localhost
>   at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:401)
>   at 
> org.apache.hadoop.registry.secure.TestSecureLogins.testValidKerberosName(TestSecureLogins.java:182)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}






[jira] [Updated] (HADOOP-16361) TestSecureLogins#testValidKerberosName fails on branch-2

2020-04-15 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-16361:
-
Attachment: HADOOP-16361-branch-2.10.001.patch
Status: Patch Available  (was: Open)

[~daryn] fixed this in our internal build.  Submitting a patch with his fix.


> TestSecureLogins#testValidKerberosName fails on branch-2
> 
>
> Key: HADOOP-16361
> URL: https://issues.apache.org/jira/browse/HADOOP-16361
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.8.5, 2.9.2, 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-16361-branch-2.10.001.patch
>
>
> This test is failing in branch-2.
> {noformat}
> [ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 26.917 s <<< FAILURE! - in org.apache.hadoop.registry.secure.TestSecureLogins
> [ERROR] 
> testValidKerberosName(org.apache.hadoop.registry.secure.TestSecureLogins)  
> Time elapsed: 0.007 s  <<< ERROR!
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to zookeeper/localhost
>   at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:401)
>   at 
> org.apache.hadoop.registry.secure.TestSecureLogins.testValidKerberosName(TestSecureLogins.java:182)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}






[jira] [Assigned] (HADOOP-16361) TestSecureLogins#testValidKerberosName fails on branch-2

2020-04-15 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan reassigned HADOOP-16361:


Assignee: Jim Brennan

> TestSecureLogins#testValidKerberosName fails on branch-2
> 
>
> Key: HADOOP-16361
> URL: https://issues.apache.org/jira/browse/HADOOP-16361
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.10.0, 2.9.2, 2.8.5
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>






[jira] [Commented] (HADOOP-14206) TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature

2020-03-06 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-14206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17053720#comment-17053720
 ] 

Jim Brennan commented on HADOOP-14206:
--

Thanks [~jzhuge]!

> TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature
> -
>
> Key: HADOOP-14206
> URL: https://issues.apache.org/jira/browse/HADOOP-14206
> Project: Hadoop Common
>  Issue Type: Test
>  Components: fs, test
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-14206-branch-2.10.001.patch, 
> HADOOP-14206.001.patch
>
>
> https://builds.apache.org/job/PreCommit-HADOOP-Build/11862/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_121.txt:
> {noformat}
> Tests run: 9, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 10.454 sec 
> <<< FAILURE! - in org.apache.hadoop.fs.sftp.TestSFTPFileSystem
> testFileExists(org.apache.hadoop.fs.sftp.TestSFTPFileSystem)  Time elapsed: 
> 0.19 sec  <<< ERROR!
> java.io.IOException: com.jcraft.jsch.JSchException: Session.connect: 
> java.security.SignatureException: Invalid encoding for signature
>   at com.jcraft.jsch.Session.connect(Session.java:565)
>   at com.jcraft.jsch.Session.connect(Session.java:183)
>   at 
> org.apache.hadoop.fs.sftp.SFTPConnectionPool.connect(SFTPConnectionPool.java:168)
>   at 
> org.apache.hadoop.fs.sftp.SFTPFileSystem.connect(SFTPFileSystem.java:149)
>   at 
> org.apache.hadoop.fs.sftp.SFTPFileSystem.getFileStatus(SFTPFileSystem.java:663)
>   at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1626)
>   at 
> org.apache.hadoop.fs.sftp.TestSFTPFileSystem.testFileExists(TestSFTPFileSystem.java:190)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
>   at 
> org.apache.hadoop.fs.sftp.SFTPConnectionPool.connect(SFTPConnectionPool.java:180)
>   at 
> org.apache.hadoop.fs.sftp.SFTPFileSystem.connect(SFTPFileSystem.java:149)
>   at 
> org.apache.hadoop.fs.sftp.SFTPFileSystem.getFileStatus(SFTPFileSystem.java:663)
>   at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1626)
>   at 
> org.apache.hadoop.fs.sftp.TestSFTPFileSystem.testFileExists(TestSFTPFileSystem.java:190)
> {noformat}




[jira] [Commented] (HADOOP-14206) TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature

2020-03-06 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-14206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17053495#comment-17053495
 ] 

Jim Brennan commented on HADOOP-14206:
--

Thanks for the review [~jzhuge]!  Would any committers listening be willing to 
commit this, assuming there are no concerns about the change?


> TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature
> -
>
> Key: HADOOP-14206
> URL: https://issues.apache.org/jira/browse/HADOOP-14206
> Project: Hadoop Common
>  Issue Type: Test
>  Components: fs, test
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-14206-branch-2.10.001.patch, 
> HADOOP-14206.001.patch
>
>




[jira] [Commented] (HADOOP-14206) TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature

2020-03-05 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-14206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052578#comment-17052578
 ] 

Jim Brennan commented on HADOOP-14206:
--

No tests are included because this is just a pom file change.


> TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature
> -
>
> Key: HADOOP-14206
> URL: https://issues.apache.org/jira/browse/HADOOP-14206
> Project: Hadoop Common
>  Issue Type: Test
>  Components: fs, test
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-14206-branch-2.10.001.patch, 
> HADOOP-14206.001.patch
>
>




[jira] [Updated] (HADOOP-14206) TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature

2020-03-05 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-14206:
-
Attachment: HADOOP-14206.001.patch

> TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature
> -
>
> Key: HADOOP-14206
> URL: https://issues.apache.org/jira/browse/HADOOP-14206
> Project: Hadoop Common
>  Issue Type: Test
>  Components: fs, test
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-14206-branch-2.10.001.patch, 
> HADOOP-14206.001.patch
>
>




[jira] [Updated] (HADOOP-14206) TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature

2020-03-05 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-14206:
-
Attachment: HADOOP-14206-branch-2.10.001.patch
Status: Patch Available  (was: Open)

Submitting patch for branch-2.10

> TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature
> -
>
> Key: HADOOP-14206
> URL: https://issues.apache.org/jira/browse/HADOOP-14206
> Project: Hadoop Common
>  Issue Type: Test
>  Components: fs, test
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-14206-branch-2.10.001.patch
>
>




[jira] [Assigned] (HADOOP-14206) TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature

2020-03-05 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan reassigned HADOOP-14206:


Assignee: Jim Brennan

> TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature
> -
>
> Key: HADOOP-14206
> URL: https://issues.apache.org/jira/browse/HADOOP-14206
> Project: Hadoop Common
>  Issue Type: Test
>  Components: fs, test
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Assignee: Jim Brennan
>Priority: Major
>






[jira] [Commented] (HADOOP-14206) TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature

2020-03-05 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-14206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052529#comment-17052529
 ] 

Jim Brennan commented on HADOOP-14206:
--

We are still seeing this occasionally on our internal branch-2.10 builds.  I am 
able to reproduce it easily on branch-2.10 by running this test in a loop.
I haven't been able to get it to occur on trunk, although I am not sure 
why.   I believe the problem is as reported here: 
https://sourceforge.net/p/jsch/bugs/111/
I have found that changing the jsch version from 1.54 to 1.55 does seem to fix 
the problem.

Even though I haven't been able to reproduce it on trunk, I will put up the 
patch for trunk.  Any concerns about updating this dependency?
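
A dependency bump like the one proposed above would be expressed in the Maven
build roughly as below. This is only a sketch, not the actual patch: the
com.jcraft:jsch coordinates are the library's usual ones, and the exact version
string ("1.55" in this thread, 0.1.55 in jsch's own numbering) should be taken
from the patch itself.

```xml
<!-- Sketch of the proposed jsch upgrade: pin the version once in
     dependencyManagement so every module inherits it. The version
     string is an assumption based on the discussion above. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.jcraft</groupId>
      <artifactId>jsch</artifactId>
      <version>0.1.55</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```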


> TestSFTPFileSystem#testFileExists failure: Invalid encoding for signature
> -
>
> Key: HADOOP-14206
> URL: https://issues.apache.org/jira/browse/HADOOP-14206
> Project: Hadoop Common
>  Issue Type: Test
>  Components: fs, test
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Priority: Major
>
> https://builds.apache.org/job/PreCommit-HADOOP-Build/11862/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_121.txt:
> {noformat}
> Tests run: 9, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 10.454 sec 
> <<< FAILURE! - in org.apache.hadoop.fs.sftp.TestSFTPFileSystem
> testFileExists(org.apache.hadoop.fs.sftp.TestSFTPFileSystem)  Time elapsed: 
> 0.19 sec  <<< ERROR!
> java.io.IOException: com.jcraft.jsch.JSchException: Session.connect: 
> java.security.SignatureException: Invalid encoding for signature
>   at com.jcraft.jsch.Session.connect(Session.java:565)
>   at com.jcraft.jsch.Session.connect(Session.java:183)
>   at 
> org.apache.hadoop.fs.sftp.SFTPConnectionPool.connect(SFTPConnectionPool.java:168)
>   at 
> org.apache.hadoop.fs.sftp.SFTPFileSystem.connect(SFTPFileSystem.java:149)
>   at 
> org.apache.hadoop.fs.sftp.SFTPFileSystem.getFileStatus(SFTPFileSystem.java:663)
>   at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1626)
>   at 
> org.apache.hadoop.fs.sftp.TestSFTPFileSystem.testFileExists(TestSFTPFileSystem.java:190)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
>   at 
> org.apache.hadoop.fs.sftp.SFTPConnectionPool.connect(SFTPConnectionPool.java:180)
>   at 
> org.apache.hadoop.fs.sftp.SFTPFileSystem.connect(SFTPFileSystem.java:149)
>   at 
> org.apache.hadoop.fs.sftp.SFTPFileSystem.getFileStatus(SFTPFileSystem.java:663)
>   at

[jira] [Commented] (HADOOP-16789) In TestZKFailoverController, restore changes from HADOOP-11149 that were dropped by HDFS-6440

2020-01-06 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008959#comment-17008959
 ] 

Jim Brennan commented on HADOOP-16789:
--

Thanks [~vagarychen]!

 

> In TestZKFailoverController, restore changes from HADOOP-11149 that were 
> dropped by HDFS-6440
> -
>
> Key: HADOOP-16789
> URL: https://issues.apache.org/jira/browse/HADOOP-16789
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 2.10.1
>
> Attachments: HADOOP-16789-branch-2.10.001.patch
>
>
> In our automated tests, we are seeing intermittent failures in 
> TestZKFailoverController.  I have been unable to reproduce the failures 
> locally, but in examining the code, I found a difference that may explain the 
> failures.
> In trunk, HDFS-6440 (Support more than 2 NameNodes. Contributed by Jesse 
> Yates.) was checked in before HADOOP-11149 (TestZKFailoverController times 
> out), which changed the test added in HDFS-6440.
> In branch-2, the order was reversed, and the test that was added in HDFS-6440 
> does not retain the fixes from HADOOP-11149.
> Note that there was also a change from HDFS-10985. 
> (o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertions.) that was missed in the HDFS-6440 backport.
> My proposal is to restore the changes from HADOOP-11149.  I made this change 
> internally and it seems to have fixed the intermittent failures.






[jira] [Commented] (HADOOP-16789) In TestZKFailoverController, restore changes from HADOOP-11149 that were dropped by HDFS-6440

2020-01-03 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007789#comment-17007789
 ] 

Jim Brennan commented on HADOOP-16789:
--

[~csun], [~vagarychen], can you please review?

 

 

> In TestZKFailoverController, restore changes from HADOOP-11149 that were 
> dropped by HDFS-6440
> -
>
> Key: HADOOP-16789
> URL: https://issues.apache.org/jira/browse/HADOOP-16789
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HADOOP-16789-branch-2.10.001.patch
>
>
> In our automated tests, we are seeing intermittent failures in 
> TestZKFailoverController.  I have been unable to reproduce the failures 
> locally, but in examining the code, I found a difference that may explain the 
> failures.
> In trunk, HDFS-6440 (Support more than 2 NameNodes. Contributed by Jesse 
> Yates.) was checked in before HADOOP-11149 (TestZKFailoverController times 
> out), which changed the test added in HDFS-6440.
> In branch-2, the order was reversed, and the test that was added in HDFS-6440 
> does not retain the fixes from HADOOP-11149.
> Note that there was also a change from HDFS-10985. 
> (o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertions.) that was missed in the HDFS-6440 backport.
> My proposal is to restore the changes from HADOOP-11149.  I made this change 
> internally and it seems to have fixed the intermittent failures.






[jira] [Updated] (HADOOP-16789) In TestZKFailoverController, restore changes from HADOOP-11149 that were dropped by HDFS-6440

2020-01-03 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-16789:
-
Attachment: HADOOP-16789-branch-2.10.001.patch
Status: Patch Available  (was: Open)

Patch 001 for branch-2.10 is basically just pulling in the trunk version of 
testGracefulFailoverMultipleZKfcs (and moves runFC() to match the trunk 
version).

 

> In TestZKFailoverController, restore changes from HADOOP-11149 that were 
> dropped by HDFS-6440
> -
>
> Key: HADOOP-16789
> URL: https://issues.apache.org/jira/browse/HADOOP-16789
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HADOOP-16789-branch-2.10.001.patch
>
>
> In our automated tests, we are seeing intermittent failures in 
> TestZKFailoverController.  I have been unable to reproduce the failures 
> locally, but in examining the code, I found a difference that may explain the 
> failures.
> In trunk, HDFS-6440 (Support more than 2 NameNodes. Contributed by Jesse 
> Yates.) was checked in before HADOOP-11149 (TestZKFailoverController times 
> out), which changed the test added in HDFS-6440.
> In branch-2, the order was reversed, and the test that was added in HDFS-6440 
> does not retain the fixes from HADOOP-11149.
> Note that there was also a change from HDFS-10985. 
> (o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertions.) that was missed in the HDFS-6440 backport.
> My proposal is to restore the changes from HADOOP-11149.  I made this change 
> internally and it seems to have fixed the intermittent failures.






[jira] [Updated] (HADOOP-16789) In TestZKFailoverController, restore changes from HADOOP-11149 that were dropped by HDFS-6440

2020-01-03 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-16789:
-
Priority: Minor  (was: Major)

> In TestZKFailoverController, restore changes from HADOOP-11149 that were 
> dropped by HDFS-6440
> -
>
> Key: HADOOP-16789
> URL: https://issues.apache.org/jira/browse/HADOOP-16789
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
>
> In our automated tests, we are seeing intermittent failures in 
> TestZKFailoverController.  I have been unable to reproduce the failures 
> locally, but in examining the code, I found a difference that may explain the 
> failures.
> In trunk, HDFS-6440 (Support more than 2 NameNodes. Contributed by Jesse 
> Yates.) was checked in before HADOOP-11149 (TestZKFailoverController times 
> out), which changed the test added in HDFS-6440.
> In branch-2, the order was reversed, and the test that was added in HDFS-6440 
> does not retain the fixes from HADOOP-11149.
> Note that there was also a change from HDFS-10985. 
> (o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertions.) that was missed in the HDFS-6440 backport.
> My proposal is to restore the changes from HADOOP-11149.  I made this change 
> internally and it seems to have fixed the intermittent failures.






[jira] [Updated] (HADOOP-16789) In TestZKFailoverController, restore changes from HADOOP-11149 that were dropped by HDFS-6440

2020-01-03 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-16789:
-
Issue Type: Bug  (was: Test)

> In TestZKFailoverController, restore changes from HADOOP-11149 that were 
> dropped by HDFS-6440
> -
>
> Key: HADOOP-16789
> URL: https://issues.apache.org/jira/browse/HADOOP-16789
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>
> In our automated tests, we are seeing intermittent failures in 
> TestZKFailoverController.  I have been unable to reproduce the failures 
> locally, but in examining the code, I found a difference that may explain the 
> failures.
> In trunk, HDFS-6440 (Support more than 2 NameNodes. Contributed by Jesse 
> Yates.) was checked in before HADOOP-11149 (TestZKFailoverController times 
> out), which changed the test added in HDFS-6440.
> In branch-2, the order was reversed, and the test that was added in HDFS-6440 
> does not retain the fixes from HADOOP-11149.
> Note that there was also a change from HDFS-10985. 
> (o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertions.) that was missed in the HDFS-6440 backport.
> My proposal is to restore the changes from HADOOP-11149.  I made this change 
> internally and it seems to have fixed the intermittent failures.






[jira] [Updated] (HADOOP-16789) In TestZKFailoverController, restore changes from HADOOP-11149 that were dropped by HDFS-6440

2020-01-03 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-16789:
-
  Component/s: common
Affects Version/s: 2.10.0
  Description: 
In our automated tests, we are seeing intermittent failures in 
TestZKFailoverController.  I have been unable to reproduce the failures 
locally, but in examining the code, I found a difference that may explain the 
failures.

In trunk, HDFS-6440 (Support more than 2 NameNodes. Contributed by Jesse 
Yates.) was checked in before HADOOP-11149 (TestZKFailoverController times 
out), which changed the test added in HDFS-6440.

In branch-2, the order was reversed, and the test that was added in HDFS-6440 
does not retain the fixes from HADOOP-11149.

Note that there was also a change from HDFS-10985. 
(o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
assertions.) that was missed in the HDFS-6440 backport.

My proposal is to restore the changes from HADOOP-11149.  I made this change 
internally and it seems to have fixed the intermittent failures.

> In TestZKFailoverController, restore changes from HADOOP-11149 that were 
> dropped by HDFS-6440
> -
>
> Key: HADOOP-16789
> URL: https://issues.apache.org/jira/browse/HADOOP-16789
> Project: Hadoop Common
>  Issue Type: Test
>  Components: common
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>
> In our automated tests, we are seeing intermittent failures in 
> TestZKFailoverController.  I have been unable to reproduce the failures 
> locally, but in examining the code, I found a difference that may explain the 
> failures.
> In trunk, HDFS-6440 (Support more than 2 NameNodes. Contributed by Jesse 
> Yates.) was checked in before HADOOP-11149 (TestZKFailoverController times 
> out), which changed the test added in HDFS-6440.
> In branch-2, the order was reversed, and the test that was added in HDFS-6440 
> does not retain the fixes from HADOOP-11149.
> Note that there was also a change from HDFS-10985. 
> (o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertions.) that was missed in the HDFS-6440 backport.
> My proposal is to restore the changes from HADOOP-11149.  I made this change 
> internally and it seems to have fixed the intermittent failures.






[jira] [Assigned] (HADOOP-16789) In TestZKFailoverController, restore changes from HADOOP-11149 that were dropped by HDFS-6440

2020-01-03 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan reassigned HADOOP-16789:


Assignee: Jim Brennan

> In TestZKFailoverController, restore changes from HADOOP-11149 that were 
> dropped by HDFS-6440
> -
>
> Key: HADOOP-16789
> URL: https://issues.apache.org/jira/browse/HADOOP-16789
> Project: Hadoop Common
>  Issue Type: Test
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>







[jira] [Created] (HADOOP-16789) In TestZKFailoverController, restore changes from HADOOP-11149 that were dropped by HDFS-6440

2020-01-03 Thread Jim Brennan (Jira)
Jim Brennan created HADOOP-16789:


 Summary: In TestZKFailoverController, restore changes from 
HADOOP-11149 that were dropped by HDFS-6440
 Key: HADOOP-16789
 URL: https://issues.apache.org/jira/browse/HADOOP-16789
 Project: Hadoop Common
  Issue Type: Test
Reporter: Jim Brennan









[jira] [Commented] (HADOOP-16544) update io.netty in branch-2

2019-10-03 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943684#comment-16943684
 ] 

Jim Brennan commented on HADOOP-16544:
--

[~weichiu], [~iwasakims], [~jhung], is this change going to cause compatibility 
issues with other tools that provide shuffle handlers like Spark and Tez?

cc: [~jeagles]

> update io.netty in branch-2
> ---
>
> Key: HADOOP-16544
> URL: https://issues.apache.org/jira/browse/HADOOP-16544
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Wei-Chiu Chuang
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: release-blocker
> Fix For: 2.10.0
>
> Attachments: HADOOP-16544-branch-2.001.patch, 
> HADOOP-16544-branch-2.002.patch, HADOOP-16544-branch-2.003.patch, 
> HADOOP-16544-branch-2.004.patch
>
>
> branch-2 pulls in io.netty 3.6.2.Final which is more than 5 years old.
> The latest is 3.10.6Final. I know updating netty is sensitive but it deserves 
> some attention.






[jira] [Commented] (HADOOP-16369) Fix zstandard shortname misspelled as zts

2019-06-13 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16863456#comment-16863456
 ] 

Jim Brennan commented on HADOOP-16369:
--

I'm +1 on this (non-binding).  Lysdexics untie!


> Fix zstandard shortname misspelled as zts
> -
>
> Key: HADOOP-16369
> URL: https://issues.apache.org/jira/browse/HADOOP-16369
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
>Priority: Major
> Attachments: HADOOP-16369.001.patch
>
>
> A few times in the code base zstd was misspelled as zts. zts is another 
> library (https://github.com/yahoo/athenz/tree/master/clients/java/zts), which 
> has caused some grief due to the zts confusion in the code base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16031) TestSecureLogins#testValidKerberosName fails

2019-06-11 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16861074#comment-16861074
 ] 

Jim Brennan commented on HADOOP-16031:
--

Thanks [~eyang]! I filed HADOOP-16361 for this test failure in branch-2.

> TestSecureLogins#testValidKerberosName fails
> 
>
> Key: HADOOP-16031
> URL: https://issues.apache.org/jira/browse/HADOOP-16031
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Fix For: 3.2.0, 3.3.0, 3.1.3
>
> Attachments: HADOOP-16031.01.patch
>
>
> {noformat}
> [INFO] Running org.apache.hadoop.registry.secure.TestSecureLogins
> [ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 2.724 s <<< FAILURE! - in org.apache.hadoop.registry.secure.TestSecureLogins
> [ERROR] 
> testValidKerberosName(org.apache.hadoop.registry.secure.TestSecureLogins)  
> Time elapsed: 0.01 s  <<< ERROR!
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to zookeeper/localhost
>   at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:429)
>   at 
> org.apache.hadoop.registry.secure.TestSecureLogins.testValidKerberosName(TestSecureLogins.java:203)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
> {noformat}






[jira] [Commented] (HADOOP-16361) TestSecureLogins#testValidKerberosName fails on branch-2

2019-06-11 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16861072#comment-16861072
 ] 

Jim Brennan commented on HADOOP-16361:
--

This was fixed in trunk in HADOOP-16031, but that fix is dependent on 
HADOOP-15996.

[~eyang] commented on this in HADOOP-16031:
{quote}Jim Brennan This patch does not apply to branch-2 because:

1. When TestSecureLogins was merged, HADOOP-12751 was in branch-2.
 2. auth_to_local acts as a firewall rule again after HADOOP-15959 reverted 
HADOOP-12751.
 3. auth_to_local pass-through is only allowed in Hadoop 3.1.2+ by HADOOP-15996 
(new feature).

The test case no longer has a way to work on the latest branch-2 because it 
lacks the ability to allow a non-matching auth_to_local rule to pass through. It 
would be best to open a separate issue to address the gap, because the branch-2 
KerberosName#getShortName() lacks the ability to handle a complex non-kerberos 
name (zookeeper/localhost).
{quote}
Has this test always failed in branch-2?   Or did something change that caused 
it to start failing?

I don't know what the appropriate solution is here.

cc: [~daryn], [~ste...@apache.org]
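
For context on the pass-through behavior referenced above: as I understand it,
HADOOP-15996 made the auth_to_local evaluation mode configurable in 3.1.2+.
The snippet below is a sketch of the relevant core-site.xml setting; the
property name and value reflect my reading of that change and are worth
verifying against the actual commit.

```xml
<!-- Sketch only: with the MIT-style mechanism added by HADOOP-15996, a
     principal that matches no auth_to_local rule is passed through instead
     of triggering NoMatchingRule, which the trunk test relies on. -->
<property>
  <name>hadoop.security.auth_to_local.mechanism</name>
  <value>MIT</value>
</property>
```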

> TestSecureLogins#testValidKerberosName fails on branch-2
> 
>
> Key: HADOOP-16361
> URL: https://issues.apache.org/jira/browse/HADOOP-16361
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.10.0, 2.9.2, 2.8.5
>Reporter: Jim Brennan
>Priority: Major
>
> This test is failing in branch-2.
> {noformat}
> [ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 26.917 s <<< FAILURE! - in org.apache.hadoop.registry.secure.TestSecureLogins
> [ERROR] 
> testValidKerberosName(org.apache.hadoop.registry.secure.TestSecureLogins)  
> Time elapsed: 0.007 s  <<< ERROR!
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to zookeeper/localhost
>   at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:401)
>   at 
> org.apache.hadoop.registry.secure.TestSecureLogins.testValidKerberosName(TestSecureLogins.java:182)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}






[jira] [Created] (HADOOP-16361) TestSecureLogins#testValidKerberosName fails on branch-2

2019-06-11 Thread Jim Brennan (JIRA)
Jim Brennan created HADOOP-16361:


 Summary: TestSecureLogins#testValidKerberosName fails on branch-2
 Key: HADOOP-16361
 URL: https://issues.apache.org/jira/browse/HADOOP-16361
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 2.8.5, 2.9.2, 2.10.0
Reporter: Jim Brennan


This test is failing in branch-2.
{noformat}
[ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 26.917 
s <<< FAILURE! - in org.apache.hadoop.registry.secure.TestSecureLogins
[ERROR] 
testValidKerberosName(org.apache.hadoop.registry.secure.TestSecureLogins)  Time 
elapsed: 0.007 s  <<< ERROR!
org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No 
rules applied to zookeeper/localhost
at 
org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:401)
at 
org.apache.hadoop.registry.secure.TestSecureLogins.testValidKerberosName(TestSecureLogins.java:182)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{noformat}






[jira] [Commented] (HADOOP-16031) TestSecureLogins#testValidKerberosName fails

2019-06-10 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860145#comment-16860145
 ] 

Jim Brennan commented on HADOOP-16031:
--

[~eyang], [~aajisaka] this test is failing on branch-2 as well.

> TestSecureLogins#testValidKerberosName fails
> 
>
> Key: HADOOP-16031
> URL: https://issues.apache.org/jira/browse/HADOOP-16031
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Fix For: 3.2.0, 3.3.0, 3.1.3
>
> Attachments: HADOOP-16031.01.patch
>
>
> {noformat}
> [INFO] Running org.apache.hadoop.registry.secure.TestSecureLogins
> [ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 2.724 s <<< FAILURE! - in org.apache.hadoop.registry.secure.TestSecureLogins
> [ERROR] 
> testValidKerberosName(org.apache.hadoop.registry.secure.TestSecureLogins)  
> Time elapsed: 0.01 s  <<< ERROR!
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to zookeeper/localhost
>   at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:429)
>   at 
> org.apache.hadoop.registry.secure.TestSecureLogins.testValidKerberosName(TestSecureLogins.java:203)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
> {noformat}






[jira] [Commented] (HADOOP-15372) Race conditions and possible leaks in the Shell class

2019-04-08 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812729#comment-16812729
 ] 

Jim Brennan commented on HADOOP-15372:
--

[~ebadger] good point.  The potential for a race is still there, although I 
think the localization code no longer exercises it.

> Race conditions and possible leaks in the Shell class
> -
>
> Key: HADOOP-15372
> URL: https://issues.apache.org/jira/browse/HADOOP-15372
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Miklos Szegedi
>Assignee: Eric Badger
>Priority: Minor
> Attachments: HADOOP-15372.001.patch
>
>
> YARN-5641 introduced some cleanup code in the Shell class. It has a race 
> condition. {{Shell.runCommand()}} can be called while/after 
> {{Shell.getAllShells()}} returned all the shells to be cleaned up. This new 
> thread can avoid the clean up, so that the process held by it can be leaked 
> causing leaked localized files/etc.
> I see another issue as well. {{Shell.runCommand()}} has a finally block with 
> a {{process.destroy();}} to clean up. However, the try catch block does not 
> cover all instructions after the process is started, so for example we can 
> exit the thread and leak the process, if 
> {{timeOutTimer.schedule(timeoutTimerTask, timeOutInterval);}} causes an 
> exception.
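
The second issue in the description, the try/finally not covering every
instruction after the process is started, can be sketched as follows. The class
and method names here are hypothetical, not the actual Shell code; the point is
that the timer scheduling (which can itself throw) sits inside the protected
region, so every exit path reaches the finally block and destroys the process.

```java
import java.io.IOException;
import java.util.Timer;
import java.util.TimerTask;

// Illustrative sketch of the fix the report implies: once the process is
// started, all follow-up work lives inside try/finally, so the process is
// destroyed on every exit path instead of being leaked.
public class SafeShell {
  public static int runWithTimeout(long timeoutMs, String... cmd)
      throws IOException, InterruptedException {
    Process p = new ProcessBuilder(cmd).start();
    Timer timer = new Timer(true); // daemon timer thread
    try {
      TimerTask killer = new TimerTask() {
        @Override public void run() { p.destroy(); }
      };
      // In the bug report this call is outside the protected region; here an
      // exception from schedule() still reaches the finally block below.
      timer.schedule(killer, timeoutMs);
      return p.waitFor();
    } finally {
      timer.cancel();
      p.destroy(); // no-op if the process already exited
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(runWithTimeout(5000L, "sh", "-c", "exit 3")); // prints 3
  }
}
```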






[jira] [Commented] (HADOOP-15372) Race conditions and possible leaks in the Shell class

2019-04-08 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812539#comment-16812539
 ] 

Jim Brennan commented on HADOOP-15372:
--

[~miklos.szeg...@cloudera.com], [~ebadger], I recently debugged a case where we 
were (still) leaking tmp dirs for localized tarballs in our 2.8 code.  The 
problem turned out to be not that we were failing to kill all the shells, but 
that we were only killing the first subshell in the tar command, which was: 
{{gzip -dc inFile | ( cd untarDir; tar -xf)}}
When I went to attempt to reproduce the problem in 3.x (trunk), I was unable to 
get it to happen.
I believe this was fixed by YARN-2185, which changed the localization code to 
use runCommandOnStream().  Because there are threads for the input/output of 
the shell command, it is killed when the threads are killed.

So I think this Jira can be closed.  Do you guys agree?
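The streaming pattern credited here to YARN-2185 can be sketched like this. StreamedShell is a hypothetical stand-in for Shell.runCommandOnStream(), shown only to illustrate the mechanism the comment describes: the parent consumes the command's output on its own thread, so tearing down those threads tears down the pipe the command is writing to, and the subshells no longer linger unnoticed.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

/** Hedged sketch of a streaming shell runner (names illustrative,
 *  not the actual Hadoop Shell API). */
class StreamedShell {
    static int run(String... cmd) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true);        // merge stderr into stdout
        Process p = pb.start();
        // Drain thread: the parent owns the read side of the pipe, so the
        // command's output is consumed for as long as this thread lives.
        Thread drain = new Thread(() -> {
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream()))) {
                while (r.readLine() != null) { /* discard */ }
            } catch (Exception ignored) { }
        });
        drain.start();
        int rc = p.waitFor();
        drain.join();
        return rc;
    }
}
```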

> Race conditions and possible leaks in the Shell class
> -
>
> Key: HADOOP-15372
> URL: https://issues.apache.org/jira/browse/HADOOP-15372
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Miklos Szegedi
>Assignee: Eric Badger
>Priority: Minor
> Attachments: HADOOP-15372.001.patch
>
>
> YARN-5641 introduced some cleanup code in the Shell class. It has a race 
> condition. {{Shell.runCommand()}} can be called while/after 
> {{Shell.getAllShells()}} returned all the shells to be cleaned up. This new 
> thread can avoid the clean up, so that the process held by it can be leaked 
> causing leaked localized files/etc.
> I see another issue as well. {{Shell.runCommand()}} has a finally block with 
> a {{process.destroy();}} to clean up. However, the try catch block does not 
> cover all instructions after the process is started, so for example we can 
> exit the thread and leak the process, if 
> {{timeOutTimer.schedule(timeoutTimerTask, timeOutInterval);}} causes an 
> exception.






[jira] [Commented] (HADOOP-15820) ZStandardDecompressor native code sets an integer field as a long

2018-10-05 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639860#comment-16639860
 ] 

Jim Brennan commented on HADOOP-15820:
--

+1 looks good to me.

 

> ZStandardDecompressor native code sets an integer field as a long
> -
>
> Key: HADOOP-15820
> URL: https://issues.apache.org/jira/browse/HADOOP-15820
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Major
> Attachments: HADOOP-15820.001.patch
>
>
> Java_org_apache_hadoop_io_compress_zstd_ZStandardDecompressor_init in 
> ZStandardDecompressor.c sets the {{remaining}} field as a long when it 
> actually is an integer.






[jira] [Commented] (HADOOP-15548) Randomize local dirs

2018-06-29 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528214#comment-16528214
 ] 

Jim Brennan commented on HADOOP-15548:
--

Thanks [~eepayne]!

> Randomize local dirs
> 
>
> Key: HADOOP-15548
> URL: https://issues.apache.org/jira/browse/HADOOP-15548
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HADOOP-15548-branch-2.001.patch, HADOOP-15548.001.patch, 
> HADOOP-15548.002.patch
>
>
> shuffle LOCAL_DIRS, LOG_DIRS and LOCAL_USER_DIRS when launching container. 
> Some applications will process these in exactly the same way in every 
> container (e.g. roundrobin) which can cause disks to get unnecessarily 
> overloaded (e.g. one output file written to first entry specified in the 
> environment variable).
> There are two paths for local dir allocation, depending on whether the size 
> is unknown or known.  The unknown path already uses a random algorithm.  The 
> known path initializes with a random starting point, and then goes 
> round-robin after that.  When selecting a dir, it increments the last used by 
> one and then checks sequentially until it finds a dir that satisfies the 
> request.  Proposal is to increment by a random value between 1 and 
> num_dirs - 1, and then check sequentially from there.  This should result in 
> a more random selection in all cases.
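The known-size selection change described above amounts to the following sketch. RandomDirStep and nextDir are illustrative names, not the actual LocalDirAllocator code: instead of advancing the last-used index by 1 before probing sequentially, advance it by a random step in [1, numDirs - 1].

```java
import java.util.Random;

/** Illustrative sketch of the proposed dir-selection change
 *  (hypothetical names, not the actual LocalDirAllocator code). */
class RandomDirStep {
    private final Random rand = new Random();
    private int lastDir = -1;   // -1 means no dir selected yet

    /** Pick the next dir with space, probing sequentially from a
     *  randomly-stepped starting point. Returns -1 if none qualifies. */
    int nextDir(int numDirs, boolean[] hasSpace) {
        // old behavior: start = (lastDir + 1) % numDirs
        int step = numDirs > 1 ? 1 + rand.nextInt(numDirs - 1) : 1;
        int base = lastDir < 0 ? rand.nextInt(numDirs) : lastDir;
        int start = (base + step) % numDirs;
        for (int i = 0; i < numDirs; i++) {
            int d = (start + i) % numDirs;
            if (hasSpace[d]) {
                lastDir = d;
                return d;
            }
        }
        return -1;
    }
}
```

Because the step is never 0 and never numDirs, every call still moves to a different starting dir, but which one is no longer predictable from the previous choice.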






[jira] [Commented] (HADOOP-15548) Randomize local dirs

2018-06-29 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528154#comment-16528154
 ] 

Jim Brennan commented on HADOOP-15548:
--

I've uploaded a patch for branch-2 that fixes the compilation error.

 

> Randomize local dirs
> 
>
> Key: HADOOP-15548
> URL: https://issues.apache.org/jira/browse/HADOOP-15548
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HADOOP-15548-branch-2.001.patch, HADOOP-15548.001.patch, 
> HADOOP-15548.002.patch
>
>
> shuffle LOCAL_DIRS, LOG_DIRS and LOCAL_USER_DIRS when launching container. 
> Some applications will process these in exactly the same way in every 
> container (e.g. roundrobin) which can cause disks to get unnecessarily 
> overloaded (e.g. one output file written to first entry specified in the 
> environment variable).
> There are two paths for local dir allocation, depending on whether the size 
> is unknown or known.  The unknown path already uses a random algorithm.  The 
> known path initializes with a random starting point, and then goes 
> round-robin after that.  When selecting a dir, it increments the last used by 
> one and then checks sequentially until it finds a dir that satisfies the 
> request.  Proposal is to increment by a random value between 1 and 
> num_dirs - 1, and then check sequentially from there.  This should result in 
> a more random selection in all cases.






[jira] [Updated] (HADOOP-15548) Randomize local dirs

2018-06-29 Thread Jim Brennan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-15548:
-
Attachment: HADOOP-15548-branch-2.001.patch

> Randomize local dirs
> 
>
> Key: HADOOP-15548
> URL: https://issues.apache.org/jira/browse/HADOOP-15548
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HADOOP-15548-branch-2.001.patch, HADOOP-15548.001.patch, 
> HADOOP-15548.002.patch
>
>
> shuffle LOCAL_DIRS, LOG_DIRS and LOCAL_USER_DIRS when launching container. 
> Some applications will process these in exactly the same way in every 
> container (e.g. roundrobin) which can cause disks to get unnecessarily 
> overloaded (e.g. one output file written to first entry specified in the 
> environment variable).
> There are two paths for local dir allocation, depending on whether the size 
> is unknown or known.  The unknown path already uses a random algorithm.  The 
> known path initializes with a random starting point, and then goes 
> round-robin after that.  When selecting a dir, it increments the last used by 
> one and then checks sequentially until it finds a dir that satisfies the 
> request.  Proposal is to increment by a random value between 1 and 
> num_dirs - 1, and then check sequentially from there.  This should result in 
> a more random selection in all cases.






[jira] [Commented] (HADOOP-15548) Randomize local dirs

2018-06-28 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526850#comment-16526850
 ] 

Jim Brennan commented on HADOOP-15548:
--

[~eepayne] thanks for the review!  I've uploaded a new patch that adds a check 
to ensure we are not always selecting the next dir, which is what it used to do.

 

> Randomize local dirs
> 
>
> Key: HADOOP-15548
> URL: https://issues.apache.org/jira/browse/HADOOP-15548
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: HADOOP-15548.001.patch, HADOOP-15548.002.patch
>
>
> shuffle LOCAL_DIRS, LOG_DIRS and LOCAL_USER_DIRS when launching container. 
> Some applications will process these in exactly the same way in every 
> container (e.g. roundrobin) which can cause disks to get unnecessarily 
> overloaded (e.g. one output file written to first entry specified in the 
> environment variable).
> There are two paths for local dir allocation, depending on whether the size 
> is unknown or known.  The unknown path already uses a random algorithm.  The 
> known path initializes with a random starting point, and then goes 
> round-robin after that.  When selecting a dir, it increments the last used by 
> one and then checks sequentially until it finds a dir that satisfies the 
> request.  Proposal is to increment by a random value between 1 and 
> num_dirs - 1, and then check sequentially from there.  This should result in 
> a more random selection in all cases.





