[jira] [Created] (HADOOP-18224) Upgrade maven compiler plugin

2022-05-05 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18224:
-

 Summary: Upgrade maven compiler plugin
 Key: HADOOP-18224
 URL: https://issues.apache.org/jira/browse/HADOOP-18224
 Project: Hadoop Common
  Issue Type: Task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


We currently use maven-compiler-plugin 3.1, which is quite old (released in 
2013) and also pulls in a vulnerable log4j dependency:
{code:java}
[INFO] org.apache.maven.plugins:maven-compiler-plugin:maven-plugin:3.1:runtime
[INFO]   org.apache.maven.plugins:maven-compiler-plugin:jar:3.1
[INFO]   org.apache.maven:maven-plugin-api:jar:2.0.9
[INFO]   org.apache.maven:maven-artifact:jar:2.0.9
[INFO]   org.codehaus.plexus:plexus-utils:jar:1.5.1
[INFO]   org.apache.maven:maven-core:jar:2.0.9
[INFO]   org.apache.maven:maven-settings:jar:2.0.9
[INFO]   org.apache.maven:maven-plugin-parameter-documenter:jar:2.0.9
...
...
...
[INFO]   log4j:log4j:jar:1.2.12
[INFO]   commons-logging:commons-logging-api:jar:1.1
[INFO]   com.google.collections:google-collections:jar:1.0
[INFO]   junit:junit:jar:3.8.2
 {code}
 

We should upgrade to maven-compiler-plugin 3.10.1 (the latest version, released in March 2022).



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18435) Remove usage of fs.s3a.executor.capacity

2022-08-31 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18435:
-

 Summary: Remove usage of fs.s3a.executor.capacity
 Key: HADOOP-18435
 URL: https://issues.apache.org/jira/browse/HADOOP-18435
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Viraj Jasani
Assignee: Viraj Jasani


When s3guard was part of s3a, DynamoDBMetadataStore was the only consumer of 
StoreContext that used the throttled executor provided by StoreContext, which 
internally uses fs.s3a.executor.capacity to determine the executor capacity for 
SemaphoredDelegatingExecutor. With the removal of s3guard from s3a, we should 
also remove fs.s3a.executor.capacity and its usages, as it is no longer used by 
any StoreContext consumer. The config's existence and its description can be 
really confusing for users.
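
For readers unfamiliar with the idea of a capacity-limited executor, here is a minimal sketch of the general technique: a semaphore that caps how many tasks may be handed to a delegate at once. The class BoundedExecutor and its methods are hypothetical and are not Hadoop's SemaphoredDelegatingExecutor; the capacity argument only plays the role that a setting such as fs.s3a.executor.capacity would play.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;

// Hypothetical illustration, not Hadoop code: at most `capacity` tasks
// may be queued or running on the delegate at any one time.
class BoundedExecutor implements Executor {
  private final Executor delegate;
  private final Semaphore permits;

  BoundedExecutor(Executor delegate, int capacity) {
    this.delegate = delegate;
    this.permits = new Semaphore(capacity);
  }

  @Override
  public void execute(Runnable task) {
    try {
      // Blocks the submitter while all permits are taken.
      permits.acquire();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RejectedExecutionException(e);
    }
    try {
      delegate.execute(() -> {
        try {
          task.run();
        } finally {
          // Return the permit whether the task succeeded or failed.
          permits.release();
        }
      });
    } catch (RuntimeException e) {
      // The delegate refused the task, so the permit was never used.
      permits.release();
      throw e;
    }
  }

  int availablePermits() {
    return permits.availablePermits();
  }
}
```

With no consumer left that submits work through such a wrapper, the capacity setting has nothing to control, which is the reason given above for removing it.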






[jira] [Created] (HADOOP-18455) s3a prefetching Executor should be closed

2022-09-16 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18455:
-

 Summary: s3a prefetching Executor should be closed
 Key: HADOOP-18455
 URL: https://issues.apache.org/jira/browse/HADOOP-18455
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Viraj Jasani
Assignee: Viraj Jasani


This is follow-up work for HADOOP-18186. The new executor service we use for 
s3a prefetching should be shut down when the file system is closed.
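
A minimal sketch of the shutdown pattern being asked for, using only java.util.concurrent. The class PrefetchingResource, its pool size and the 5 second wait are assumptions made for illustration; this is not the S3AFileSystem code.

```java
import java.io.Closeable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a file system that owns a prefetching executor.
class PrefetchingResource implements Closeable {
  private final ExecutorService prefetchPool = Executors.newFixedThreadPool(2);

  ExecutorService pool() {
    return prefetchPool;
  }

  @Override
  public void close() {
    // Stop accepting new work, then give running tasks a short grace period.
    prefetchPool.shutdown();
    try {
      if (!prefetchPool.awaitTermination(5, TimeUnit.SECONDS)) {
        prefetchPool.shutdownNow();
      }
    } catch (InterruptedException e) {
      prefetchPool.shutdownNow();
      Thread.currentThread().interrupt();
    }
  }
}
```

Without the close() step, the pool's worker threads would outlive the object that created them.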






[jira] [Resolved] (HADOOP-18186) s3a prefetching to use SemaphoredDelegatingExecutor for submitting work

2022-09-16 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HADOOP-18186.
---
Resolution: Fixed

> s3a prefetching to use SemaphoredDelegatingExecutor for submitting work
> ---
>
> Key: HADOOP-18186
> URL: https://issues.apache.org/jira/browse/HADOOP-18186
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Steve Loughran
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Use SemaphoredDelegatingExecutor for each stream to submit work, if 
> possible, for better fairness in processes with many streams.
> This also takes a DurationTrackerFactory to count how long was spent in the 
> queue, something we would want to know.






[jira] [Created] (HADOOP-18466) Limit the findbugs suppression IS2_INCONSISTENT_SYNC to S3AFileSystem field

2022-09-22 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18466:
-

 Summary: Limit the findbugs suppression IS2_INCONSISTENT_SYNC to 
S3AFileSystem field
 Key: HADOOP-18466
 URL: https://issues.apache.org/jira/browse/HADOOP-18466
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/s3
Reporter: Viraj Jasani
Assignee: Viraj Jasani


Limit the findbugs suppression IS2_INCONSISTENT_SYNC to the S3AFileSystem field 
futurePool so that findbugs can still discover other synchronization bugs.






[jira] [Reopened] (HADOOP-18186) s3a prefetching to use SemaphoredDelegatingExecutor for submitting work

2022-09-11 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reopened HADOOP-18186:
---

Re-opening for an addendum

> s3a prefetching to use SemaphoredDelegatingExecutor for submitting work
> ---
>
> Key: HADOOP-18186
> URL: https://issues.apache.org/jira/browse/HADOOP-18186
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Steve Loughran
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Use SemaphoredDelegatingExecutor for each stream to submit work, if 
> possible, for better fairness in processes with many streams.
> This also takes a DurationTrackerFactory to count how long was spent in the 
> queue, something we would want to know.






[jira] [Created] (HADOOP-18403) Fix FileSystem leak in ITestS3AAWSCredentialsProvider

2022-08-11 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18403:
-

 Summary: Fix FileSystem leak in ITestS3AAWSCredentialsProvider
 Key: HADOOP-18403
 URL: https://issues.apache.org/jira/browse/HADOOP-18403
 Project: Hadoop Common
  Issue Type: Test
Reporter: Viraj Jasani
Assignee: Viraj Jasani


ITestS3AAWSCredentialsProvider#testAnonymousProvider has a FileSystem leak that 
should be fixed.






[jira] [Created] (HADOOP-18397) Shutdown AWSSecurityTokenService when its resources are no longer in use

2022-08-08 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18397:
-

 Summary: Shutdown AWSSecurityTokenService when its resources are 
no longer in use
 Key: HADOOP-18397
 URL: https://issues.apache.org/jira/browse/HADOOP-18397
 Project: Hadoop Common
  Issue Type: Task
  Components: fs/s3
Reporter: Viraj Jasani
Assignee: Viraj Jasani


AWSSecurityTokenService resources can be released as soon as they are no longer 
in use. The documentation of AWSSecurityTokenService#shutdown says that, while a 
client is not required to shut down the token service, it can release the 
resources early once it no longer needs them. We achieve this by making 
STSClient closeable, so that it can be closed in all places where that is 
suitable.
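
The idea of making a client closeable can be sketched as a small adapter that maps close() onto shutdown(), so that try-with-resources releases the service. The TokenService interface and the ClosableTokenClient class below are hypothetical stand-ins, not the AWS SDK or the Hadoop STSClient API.

```java
// Hypothetical stand-in for a token service that only exposes shutdown().
interface TokenService {
  String getSessionToken();

  void shutdown();
}

// Adapter that makes the service usable with try-with-resources.
final class ClosableTokenClient implements AutoCloseable {
  private final TokenService service;

  ClosableTokenClient(TokenService service) {
    this.service = service;
  }

  String requestToken() {
    return service.getSessionToken();
  }

  @Override
  public void close() {
    // Early release: hand the resources back as soon as the caller is done.
    service.shutdown();
  }
}

class ClosableTokenClientDemo {
  // Returns true if shutdown() ran when the try block exited.
  static boolean run() {
    final boolean[] shutDown = {false};
    TokenService fake = new TokenService() {
      @Override
      public String getSessionToken() {
        return "token";
      }

      @Override
      public void shutdown() {
        shutDown[0] = true;
      }
    };
    try (ClosableTokenClient client = new ClosableTokenClient(fake)) {
      client.requestToken();
    }
    return shutDown[0];
  }
}
```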






[jira] [Created] (HADOOP-18592) Sasl connection failure should log remote address

2023-01-11 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18592:
-

 Summary: Sasl connection failure should log remote address
 Key: HADOOP-18592
 URL: https://issues.apache.org/jira/browse/HADOOP-18592
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.3.4
Reporter: Viraj Jasani
Assignee: Viraj Jasani


If a Sasl connection fails with a generic error, the log does not include the 
remote server that the client was trying to connect to.

Sample log:
{code:java}
2023-01-12 00:22:28,148 WARN  [20%2C1673404849949,1] ipc.Client - Exception encountered while connecting to the server 
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:197)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:141)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1950)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:367)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:623)
    at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:414)
...
... {code}
We should log the remote server address.
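
As a rough sketch of what such a log line could look like, the helper below builds a message that names the remote server alongside the failure. The class SaslFailureLogging, the method describe and the exact wording are assumptions for illustration; this is not the ipc.Client change itself.

```java
import java.io.IOException;
import java.net.InetSocketAddress;

// Hypothetical helper: formats a connection failure together with the
// address the client was trying to reach.
class SaslFailureLogging {
  static String describe(InetSocketAddress server, IOException cause) {
    // getHostString() does not trigger a reverse DNS lookup.
    return "Exception encountered while connecting to the server "
        + server.getHostString() + ":" + server.getPort() + ": " + cause;
  }
}
```

A caller would pass the address it already holds for the connection, so no extra lookup is needed at failure time.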






[jira] [Created] (HADOOP-18654) Remove unused custom appender TaskLogAppender

2023-03-06 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18654:
-

 Summary: Remove unused custom appender TaskLogAppender
 Key: HADOOP-18654
 URL: https://issues.apache.org/jira/browse/HADOOP-18654
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


TaskLogAppender is no longer used in the codebase. The only past references we 
have are from old release notes (HADOOP-7308, MAPREDUCE-3208, MAPREDUCE-2372, 
HADOOP-1355).

Before we migrate to log4j2, it would be good to remove TaskLogAppender.






[jira] [Created] (HADOOP-18668) Path capability probe for truncate is only honored by RawLocalFileSystem

2023-03-16 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18668:
-

 Summary: Path capability probe for truncate is only honored by 
RawLocalFileSystem
 Key: HADOOP-18668
 URL: https://issues.apache.org/jira/browse/HADOOP-18668
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Viraj Jasani
Assignee: Viraj Jasani


Only RawLocalFileSystem returns true from FileSystem#hasPathCapability when 
probed for "fs.capability.paths.truncate". The probe should be honored by all 
file system implementations that support truncate.
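
The shape of the fix can be sketched with two hypothetical classes: a base type that reports no capabilities and a subclass that supports truncate and therefore answers the probe. BaseFs and TruncatingFs are stand-ins invented for this example, not Hadoop's FileSystem classes; only the capability string is taken from the description above.

```java
import java.util.Locale;

// Hypothetical illustration of a capability probe; not the Hadoop API.
class TruncateCapabilityProbe {
  static final String FS_TRUNCATE = "fs.capability.paths.truncate";

  // Stand-in for a base file system that knows no optional capabilities.
  static class BaseFs {
    boolean hasPathCapability(String capability) {
      return false;
    }
  }

  // Stand-in for an implementation that supports truncate.
  static class TruncatingFs extends BaseFs {
    @Override
    boolean hasPathCapability(String capability) {
      if (FS_TRUNCATE.equals(capability.toLowerCase(Locale.ENGLISH))) {
        return true;
      }
      // Defer every other probe to the parent.
      return super.hasPathCapability(capability);
    }
  }
}
```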






[jira] [Resolved] (HADOOP-18631) Migrate Async appenders to log4j properties

2023-03-17 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HADOOP-18631.
---
Resolution: Fixed

> Migrate Async appenders to log4j properties
> ---
>
> Key: HADOOP-18631
> URL: https://issues.apache.org/jira/browse/HADOOP-18631
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Before we can upgrade to log4j2, we need to migrate async appenders that we 
> add "dynamically in the code" to the log4j.properties file. Instead of using 
> core/hdfs site configs, log4j properties or system properties should be used 
> to determine if the given logger should use async appender.






[jira] [Reopened] (HADOOP-18631) Migrate Async appenders to log4j properties

2023-03-17 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reopened HADOOP-18631:
---

> Migrate Async appenders to log4j properties
> ---
>
> Key: HADOOP-18631
> URL: https://issues.apache.org/jira/browse/HADOOP-18631
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Before we can upgrade to log4j2, we need to migrate async appenders that we 
> add "dynamically in the code" to the log4j.properties file. Instead of using 
> core/hdfs site configs, log4j properties or system properties should be used 
> to determine if the given logger should use async appender.






[jira] [Created] (HADOOP-18669) Remove Log4Json Layout

2023-03-17 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18669:
-

 Summary: Remove Log4Json Layout
 Key: HADOOP-18669
 URL: https://issues.apache.org/jira/browse/HADOOP-18669
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


Log4Json extends org.apache.log4j.Layout to provide a JSON log layout. This 
utility is not used anywhere in Hadoop. It is IA.Private (by default).

Log4j2 has introduced drastic changes to Layout and has also converted it to an 
interface. Log4j2 also has JsonLayout, which provides options such as pretty vs. 
compact JSON, UTF-8 or UTF-16 encoding, complete well-formed JSON vs. fragment 
JSON, and the addition of custom fields to the generated JSON:

[https://github.com/apache/logging-log4j2/blob/2.x/log4j-core/src/main/java/org/apache/logging/log4j/core/layout/JsonLayout.java]

 

This utility is more suitable as part of the log4j project than of hadoop, 
because the maintenance cost in hadoop would rise with every upgrade that 
changes the Layout format.






[jira] [Created] (HADOOP-18648) Avoid loading kms log4j properties dynamically by KMSWebServer

2023-02-27 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18648:
-

 Summary: Avoid loading kms log4j properties dynamically by 
KMSWebServer
 Key: HADOOP-18648
 URL: https://issues.apache.org/jira/browse/HADOOP-18648
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


Log4j2 does not support loading of log4j properties (/xml/json/yaml) 
dynamically by applications. It no longer supports overriding the loading of 
properties using "log4j.defaultInitOverride" the way log4j1 does.

For KMS, instead of loading the properties file dynamically, we should add the 
log4j properties file as part of HADOOP_OPTS.






[jira] [Created] (HADOOP-18649) CLA and CRLA appenders to be replaced with RFA

2023-03-01 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18649:
-

 Summary: CLA and CRLA appenders to be replaced with RFA
 Key: HADOOP-18649
 URL: https://issues.apache.org/jira/browse/HADOOP-18649
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


ContainerLogAppender and ContainerRollingLogAppender both provide functionality 
very similar to RollingFileAppender. Maintaining custom appenders for Log4J2 is 
costly when they differ only slightly from the built-in appender provided by 
Log4J.

The goal of this sub-task is to replace both ContainerLogAppender and 
ContainerRollingLogAppender custom appenders with RollingFileAppender without 
changing any system properties already being used to determine file name, file 
size, backup index, pattern layout properties etc.






[jira] [Created] (HADOOP-18653) LogLevel servlet to determine log impl before using setLevel

2023-03-05 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18653:
-

 Summary: LogLevel servlet to determine log impl before using 
setLevel
 Key: HADOOP-18653
 URL: https://issues.apache.org/jira/browse/HADOOP-18653
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


The LogLevel GET API is used to set the log level for a given class name 
dynamically. Now that we have cleaned up the commons-logging references, it 
would be good to determine whether the slf4j log4j adapter is on the classpath 
before allowing a client to set the log level.

Proposed changes:
 * Use slf4j logger factory to get the log reference for the given class name
 * Use generic utility to identify if the slf4j log4j adapter is in the 
classpath before using log4j API to update the log level
 * If the log4j adapter is not in the classpath, report error in the output
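
The classpath check in the proposed changes could be sketched as a generic utility like the one below. The class LogAdapterProbe and the method isClassPresent are hypothetical names, and the adapter's fully qualified class name is deliberately left as a parameter rather than assumed here.

```java
// Hypothetical utility: reports whether a class can be found on the
// classpath without initializing it.
class LogAdapterProbe {
  static boolean isClassPresent(String className) {
    try {
      // initialize=false: we only want to know whether the class exists.
      Class.forName(className, false, LogAdapterProbe.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }
}
```

A servlet could call this with the adapter's class name before touching any log4j API, and report an error in its output when the result is false.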






[jira] [Created] (HADOOP-18645) Provide keytab file key name with ServiceStateException

2023-02-24 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18645:
-

 Summary: Provide keytab file key name with ServiceStateException
 Key: HADOOP-18645
 URL: https://issues.apache.org/jira/browse/HADOOP-18645
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Viraj Jasani
Assignee: Viraj Jasani


 
{code:java}
util.ExitUtil - Exiting with status 1: org.apache.hadoop.service.ServiceStateException: java.io.IOException: Running in secure mode, but config doesn't have a keytab
1: org.apache.hadoop.service.ServiceStateException: java.io.IOException: Running in secure mode, but config doesn't have a keytab
  at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:264)
..
..
 {code}
 

 

When multiple downstreamers use different config keys to point to the same 
keytab file, and one of those keys goes missing or is overridden by a config 
generator, it becomes confusing for operators to work out which config is 
missing for a particular service, especially when the keytab file value is 
already present under a different key.

It would be nice to report the config key in the error message shown with the 
stacktrace.
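
A minimal sketch of the improvement, assuming the configuration is available as a plain map. The class KeytabCheck, the method requireKeytab and the key used in the usage note are hypothetical; the message wording only extends the error text quoted above.

```java
import java.io.IOException;
import java.util.Map;

// Hypothetical check: fails with a message that names the missing config key.
class KeytabCheck {
  static String requireKeytab(Map<String, String> conf, String keytabKey)
      throws IOException {
    String keytab = conf.get(keytabKey);
    if (keytab == null || keytab.isEmpty()) {
      throw new IOException(
          "Running in secure mode, but config doesn't have a keytab for key: "
              + keytabKey);
    }
    return keytab;
  }
}
```

Called with an illustrative key such as example.service.keytab.file, the failure message then tells the operator exactly which key to set.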






[jira] [Created] (HADOOP-18620) Avoid using grizzly-http classes

2023-02-06 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18620:
-

 Summary: Avoid using grizzly-http classes
 Key: HADOOP-18620
 URL: https://issues.apache.org/jira/browse/HADOOP-18620
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


As discussed on the parent Jira HADOOP-15984, we do not have any 
grizzly-http-servlet version available that uses Jersey 2 dependencies. 

version 2.4.4 contains Jersey 1 artifacts: 
[https://repo1.maven.org/maven2/org/glassfish/grizzly/grizzly-http-servlet/2.4.4/grizzly-http-servlet-2.4.4.pom]

The next higher version available is 3.0.0-M1 and it contains Jersey 3 
artifacts: 
[https://repo1.maven.org/maven2/org/glassfish/grizzly/grizzly-http-servlet/3.0.0-M1/grizzly-http-servlet-3.0.0-M1.pom]

 

Moreover, we do not use the grizzly-http-* modules extensively. We use them only 
in a few tests so that we don't have to implement all the methods of 
HttpServletResponse for our custom test classes.

We should get rid of the grizzly-http-servlet, grizzly-http and 
grizzly-http-server artifacts of org.glassfish.grizzly and instead implement 
HttpServletResponse directly, to avoid depending on grizzly upgrades as part of 
the overall Jersey upgrade.






[jira] [Created] (HADOOP-18628) Server connection should log host name before returning VersionMismatch error

2023-02-10 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18628:
-

 Summary: Server connection should log host name before returning 
VersionMismatch error
 Key: HADOOP-18628
 URL: https://issues.apache.org/jira/browse/HADOOP-18628
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Viraj Jasani
Assignee: Viraj Jasani


In environments with dynamically changing IP addresses, debugging an issue from 
logs that contain only the IP address can be difficult.
{code:java}
2023-02-08 23:26:50,112 WARN  [Socket Reader #1 for port 8485] ipc.Server - Incorrect RPC Header length from {IPV4}:36556 expected length: java.nio.HeapByteBuffer[pos=0 lim=4 cap=4] got length: java.nio.HeapByteBuffer[pos=0 lim=4 cap=4]
{code}
It would be better to log the full hostname for the given IP address rather 
than only the IP address.
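
One possible way to render such a log fragment is sketched below with java.net.InetAddress. The class RemoteAddressFormat and the host/ip:port layout are assumptions for illustration, not the ipc.Server change.

```java
import java.net.InetAddress;

// Hypothetical formatter: shows both the hostname and the IP address.
class RemoteAddressFormat {
  static String describe(InetAddress addr, int port) {
    // Note: getHostName() may perform a reverse DNS lookup when the
    // InetAddress was created from a bare IP address.
    return addr.getHostName() + "/" + addr.getHostAddress() + ":" + port;
  }
}
```

Because the lookup can be slow or fail, a server may prefer to resolve the name once per connection rather than on every log call.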






[jira] [Created] (HADOOP-18631) Migrate Async appenders to log4j properties

2023-02-14 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18631:
-

 Summary: Migrate Async appenders to log4j properties
 Key: HADOOP-18631
 URL: https://issues.apache.org/jira/browse/HADOOP-18631
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


Before we can upgrade to log4j2, we need to migrate async appenders that we add 
"dynamically in the code" to the log4j.properties file. Instead of using 
core/hdfs site configs, whether to use async appenders should be decided based 
on system properties that log4j properties can derive value from.






[jira] [Created] (HADOOP-18809) s3a prefetch read/write file operations should guard channel close

2023-07-17 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18809:
-

 Summary: s3a prefetch read/write file operations should guard 
channel close
 Key: HADOOP-18809
 URL: https://issues.apache.org/jira/browse/HADOOP-18809
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


As per Steve's suggestion from the s3a prefetch LRU cache work,

s3a prefetch disk-based cache file read and write operations should guard the 
close of FileChannel and WritableByteChannel, closing them even if the 
read/write operations throw IOException.
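
The guard can be sketched with try-with-resources, which closes the channel whether or not the read or write throws. The class CacheFileIo and its two methods are hypothetical and only illustrate the technique; they are not the s3a prefetch cache code.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical cache file helper: channels are always closed on exit.
class CacheFileIo {
  static void writeBlock(Path path, ByteBuffer data) throws IOException {
    try (WritableByteChannel channel = Files.newByteChannel(path,
        StandardOpenOption.CREATE,
        StandardOpenOption.WRITE,
        StandardOpenOption.TRUNCATE_EXISTING)) {
      while (data.hasRemaining()) {
        channel.write(data);
      }
    } // closed here even if write() threw an IOException
  }

  static int readBlock(Path path, ByteBuffer dst) throws IOException {
    int total = 0;
    try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
      while (dst.hasRemaining()) {
        int n = channel.read(dst);
        if (n < 0) {
          break; // end of file
        }
        total += n;
      }
    } // closed here even if read() threw an IOException
    return total;
  }
}
```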






[jira] [Created] (HADOOP-18829) s3a prefetch LRU cache eviction metric

2023-07-26 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18829:
-

 Summary: s3a prefetch LRU cache eviction metric
 Key: HADOOP-18829
 URL: https://issues.apache.org/jira/browse/HADOOP-18829
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


Follow-up from HADOOP-18291:

Add a new IO statistics metric to capture s3a prefetch LRU cache eviction.
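
To illustrate what counting evictions means, here is a small LRU cache built on LinkedHashMap that increments a counter each time an entry is evicted. The class CountingLruCache is hypothetical, is not thread-safe, and uses a plain AtomicLong instead of Hadoop's IO statistics.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical LRU cache that records how many entries it has evicted.
class CountingLruCache<K, V> extends LinkedHashMap<K, V> {
  private final int maxEntries;
  private final AtomicLong evictions = new AtomicLong();

  CountingLruCache(int maxEntries) {
    // accessOrder=true keeps the least recently used entry first.
    super(16, 0.75f, true);
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    boolean evict = size() > maxEntries;
    if (evict) {
      evictions.incrementAndGet();
    }
    return evict;
  }

  long evictionCount() {
    return evictions.get();
  }
}
```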






[jira] [Created] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+

2023-07-30 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18832:
-

 Summary: Upgrade aws-java-sdk to 1.12.499+
 Key: HADOOP-18832
 URL: https://issues.apache.org/jira/browse/HADOOP-18832
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Viraj Jasani


aws sdk versions < 1.12.499 use a vulnerable version of netty and hence show up 
in security CVE scans (CVE-2023-34462). The safe version of netty is 
4.1.94.Final, which is used by aws-java-sdk:1.12.499+.






[jira] [Reopened] (HADOOP-17612) Upgrade Zookeeper to 3.6.3 and Curator to 5.2.0

2023-05-09 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reopened HADOOP-17612:
---

Reopening to update the resolution

> Upgrade Zookeeper to 3.6.3 and Curator to 5.2.0
> ---
>
> Key: HADOOP-17612
> URL: https://issues.apache.org/jira/browse/HADOOP-17612
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 4h
> Remaining Estimate: 0h
>
> Let's upgrade Zookeeper and Curator to 3.6.3 and 5.2.0 respectively.
> Curator 5.2 also supports Zookeeper 3.5 servers.






[jira] [Resolved] (HADOOP-17612) Upgrade Zookeeper to 3.6.3 and Curator to 5.2.0

2023-05-09 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HADOOP-17612.
---
Resolution: Fixed

> Upgrade Zookeeper to 3.6.3 and Curator to 5.2.0
> ---
>
> Key: HADOOP-17612
> URL: https://issues.apache.org/jira/browse/HADOOP-17612
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 4h
> Remaining Estimate: 0h
>
> Let's upgrade Zookeeper and Curator to 3.6.3 and 5.2.0 respectively.
> Curator 5.2 also supports Zookeeper 3.5 servers.






[jira] [Created] (HADOOP-18740) s3a prefetch cache blocks should be accessed by RW locks

2023-05-11 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18740:
-

 Summary: s3a prefetch cache blocks should be accessed by RW locks
 Key: HADOOP-18740
 URL: https://issues.apache.org/jira/browse/HADOOP-18740
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


In order to implement LRU or LFU based cache removal policies for s3a 
prefetched cache blocks, all cache reader threads must acquire a read lock, and 
the cache file removal mechanism (fs close or cache eviction) must acquire a 
write lock, before accessing the files.

As we maintain the block entries in an in-memory map, we should be able to 
introduce a read-write lock per cache file entry; we don't need a 
coarse-grained lock shared by all entries.

 

This is a prerequisite to HADOOP-18291.
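
The per-entry locking described above can be sketched as follows: each entry in the map carries its own ReentrantReadWriteLock, readers take the read lock, and eviction takes the write lock. The class BlockCache and its in-memory byte arrays are hypothetical; the real cache stores blocks in files on disk.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical block cache with one read-write lock per entry.
class BlockCache {
  private static final class Entry {
    final byte[] data;
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    Entry(byte[] data) {
      this.data = data;
    }
  }

  private final Map<Integer, Entry> blocks = new ConcurrentHashMap<>();

  void put(int blockNumber, byte[] data) {
    blocks.put(blockNumber, new Entry(data.clone()));
  }

  // Readers of the same block share the read lock; other blocks are unaffected.
  byte[] read(int blockNumber) {
    Entry entry = blocks.get(blockNumber);
    if (entry == null) {
      return null;
    }
    entry.lock.readLock().lock();
    try {
      // The entry may have been evicted between the lookup and the lock.
      return blocks.get(blockNumber) == entry ? entry.data.clone() : null;
    } finally {
      entry.lock.readLock().unlock();
    }
  }

  // Eviction waits for in-flight readers of this one entry only.
  boolean evict(int blockNumber) {
    Entry entry = blocks.get(blockNumber);
    if (entry == null) {
      return false;
    }
    entry.lock.writeLock().lock();
    try {
      return blocks.remove(blockNumber, entry);
    } finally {
      entry.lock.writeLock().unlock();
    }
  }
}
```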






[jira] [Created] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint

2024-02-04 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19066:
-

 Summary: AWS SDK V2 - Enabling FIPS should be allowed with central 
endpoint
 Key: HADOOP-19066
 URL: https://issues.apache.org/jira/browse/HADOOP-19066
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.5.0, 3.4.1
Reporter: Viraj Jasani


FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK 
considers overriding endpoint and enabling fips as mutually exclusive, we fail 
fast if fs.s3a.endpoint is set with fips support (details on HADOOP-18975).

Now, we no longer override SDK endpoint for central endpoint since we enable 
cross region access (details on HADOOP-19044) but we would still fail fast if 
endpoint is central and fips is enabled.

Changes proposed:
 * S3A to fail fast only if FIPS is enabled and non-central endpoint is 
configured.
 * Tests to ensure S3 bucket is accessible with default region us-east-2 with 
cross region access (expected with central endpoint).
 * Document FIPS support with central endpoint on connecting.html.
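
The first proposed change can be sketched as a small validation rule. The class FipsEndpointCheck is hypothetical, and two points are assumptions of this sketch rather than statements from the issue: that the central endpoint is "s3.amazonaws.com", and that an unset endpoint counts as "no override".

```java
// Hypothetical validation: FIPS plus a non-central endpoint override fails fast.
class FipsEndpointCheck {
  // Assumption: the central endpoint is identified by this host name.
  static final String CENTRAL_ENDPOINT = "s3.amazonaws.com";

  static boolean isCentralOrUnset(String endpoint) {
    // Assumption: a null or empty endpoint means nothing is overridden.
    return endpoint == null
        || endpoint.isEmpty()
        || CENTRAL_ENDPOINT.equals(endpoint);
  }

  static void validate(boolean fipsEnabled, String endpoint) {
    if (fipsEnabled && !isCentralOrUnset(endpoint)) {
      throw new IllegalArgumentException(
          "FIPS cannot be enabled together with a non-central endpoint: "
              + endpoint);
    }
  }
}
```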






[jira] [Created] (HADOOP-19022) ITestS3AConfiguration#testRequestTimeout failure

2024-01-03 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19022:
-

 Summary: ITestS3AConfiguration#testRequestTimeout failure
 Key: HADOOP-19022
 URL: https://issues.apache.org/jira/browse/HADOOP-19022
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani


"fs.s3a.connection.request.timeout" should be specified in milliseconds as per
{code:java}
Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT,
    DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO);
{code}
The test fails consistently because it sets a 120 ms timeout, which is less 
than 15s (the minimum network operation duration), and hence the value gets 
reset to 15000 ms by the enforcement.

 
{code:java}
[ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration)  Time elapsed: 0.016 s  <<< FAILURE!
java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is different than what AWS sdk configuration uses internally expected:<12> but was:<15000>
    at org.junit.Assert.fail(Assert.java:89)
    at org.junit.Assert.failNotEquals(Assert.java:835)
    at org.junit.Assert.assertEquals(Assert.java:647)
    at org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444)
 {code}






[jira] [Created] (HADOOP-19023) ITestS3AConcurrentOps#testParallelRename intermittent timeout failure

2024-01-03 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19023:
-

 Summary: ITestS3AConcurrentOps#testParallelRename intermittent 
timeout failure
 Key: HADOOP-19023
 URL: https://issues.apache.org/jira/browse/HADOOP-19023
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani


The test needs to be configured with a higher timeout.

 
{code:java}
[ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 256.281 
s <<< FAILURE! - in org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps
[ERROR] 
testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps)  Time 
elapsed: 72.565 s  <<< ERROR!
org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on 
fork-0005/test/testParallelRename-source0: 
software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client execution 
did not complete before the specified timeout configuration: 15000 millis
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347)
at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214)
at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620)
at 
org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at 
org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at 
org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at 
org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
at 
org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: 
Client execution did not complete before the specified timeout configuration: 
15000 millis
at 
software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97)
at 
software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
at 
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at 
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
at 
software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncH

[jira] [Created] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-11 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19146:
-

 Summary: noaa-cors-pds bucket access with global endpoint fails
 Key: HADOOP-19146
 URL: https://issues.apache.org/jira/browse/HADOOP-19146
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Viraj Jasani


All tests accessing noaa-cors-pds use the us-east-1 region, as configured at 
bucket level. If a global endpoint is configured (e.g. us-west-2), they fail to 
access the bucket.

 

Sample error:
{code:java}
org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect 
response to region [us-east-1].  This likely indicates that the S3 region 
configured in fs.s3a.endpoint.region does not match the AWS region containing 
the bucket.: null (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, 
Extended Request ID: 
6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922)
    at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)
    at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349)
    at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949)
    at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
    at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281)
    at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445)
    at 
org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311)
    at 
org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328)
    at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674)
 {code}
{code:java}
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null (Service: 
S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended Request ID: 
6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)
    at 
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93)
    at 
software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279)
    ...
    ...
    ...
    at 
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53

[jira] [Created] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-02 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19218:
-

 Summary: Avoid DNS lookup while creating IPC Connection object
 Key: HADOOP-19218
 URL: https://issues.apache.org/jira/browse/HADOOP-19218
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Viraj Jasani


We have been running HADOOP-18628 in production for quite some time, and 
everything works fine as long as the DNS servers in HA are available. Upgrading 
a single NS server at a time is also a common case and is not problematic.

However, we recently encountered a case where 2 out of 4 NS servers went down 
(temporarily, and it is a rare case). With a short-lived DNS cache and a 2s NS 
fallback timeout configured in resolv.conf, any client performing a DNS lookup 
can encounter a 4s+ delay. This caused a namenode outage because the listener 
is single-threaded and was not able to keep up with the large number of unique 
clients (in direct proportion to the number of DNS resolutions every few 
seconds) initiating connections on the listener port.

While having 2 out of 4 DNS servers offline is a rare case and the NS fallback 
settings could also be improved, it is important to note that we don't need to 
perform DNS resolution for every new connection if the intention is only to 
improve the insights into VersionMismatch errors thrown by the server.

The proposal is to delay the DNS resolution until the server throws the error 
for an incompatible header or a version mismatch.
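
One way to picture the proposal is a holder that keeps only the raw address when the connection object is created and runs the lookup on first use, i.e. only on the error path. This is a sketch under stated assumptions and not the actual IPC Server code: the LazyHostName class and its resolver function are hypothetical, and the resolver stands in for whatever lookup the server would perform when building the error message.

```java
import java.util.Objects;
import java.util.function.Function;

/** Hypothetical holder that defers a hostname lookup until first use. */
final class LazyHostName {

  private final String address;
  // Stand-in for the real lookup (for example a reverse DNS query).
  private final Function<String, String> resolver;
  // Cached result; null until the first call to get().
  private String hostName;

  LazyHostName(String address, Function<String, String> resolver) {
    this.address = Objects.requireNonNull(address);
    this.resolver = Objects.requireNonNull(resolver);
  }

  /** Resolve on first call, then return the cached value. */
  synchronized String get() {
    if (hostName == null) {
      hostName = resolver.apply(address);
    }
    return hostName;
  }

  /** True once the lookup has actually been performed. */
  synchronized boolean isResolved() {
    return hostName != null;
  }
}
```

Constructing the holder costs no lookup, so the listener path stays free of DNS latency; only code that reports an incompatible header or version mismatch would call get() and pay for the resolution.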





