[jira] [Created] (HDFS-16933) A race in SerialNumberMap will cause wrong ownership
ZanderXu created HDFS-16933:
--------------------------------

             Summary: A race in SerialNumberMap will cause wrong ownership
                 Key: HDFS-16933
                 URL: https://issues.apache.org/jira/browse/HDFS-16933
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: ZanderXu
            Assignee: ZanderXu

If the namenode enables parallel fsimage loading, a race in SerialNumberMap will cause wrong ownership for INodes.

{code:java}
public int get(T t) {
  if (t == null) {
    return 0;
  }
  Integer sn = t2i.get(t);
  if (sn == null) {
    // Assume there are two threads with different values of t, such as:
    //   T1 with hbase
    //   T2 with hdfs
    // If T1 and T2 get the sn at the same time, they will get the same sn,
    // such as 10.
    sn = current.getAndIncrement();
    if (sn > max) {
      current.getAndDecrement();
      throw new IllegalStateException(name + ": serial number map is full");
    }
    Integer old = t2i.putIfAbsent(t, sn);
    if (old != null) {
      current.getAndDecrement();
      return old;
    }
    // If T1 puts 10 -> hbase into i2t first, T2 will overwrite it with
    // 10 -> hdfs. The INodes will then resolve to the wrong owner hdfs,
    // when actually it should be hbase.
    i2t.put(sn, t);
  }
  return sn;
}
{code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
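The overwrite described in the comments can be avoided by reordering the two map writes: publish the reverse sn -> t entry into i2t *before* the t2i.putIfAbsent, so a losing thread only ever retracts its own private entry and can never clobber the winner's reverse mapping. The following is a hypothetical sketch of that idea, not the committed Hadoop patch; the class name, the MAX bound, and the leak-on-loss behaviour (the sketch deliberately skips the racy getAndDecrement rollback) are all illustrative.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: i2t is written before t2i.putIfAbsent, so the
// loser of the race removes only its own sn entry. Field names (t2i,
// i2t, current) mirror the snippet in the report; MAX is an assumed
// bound standing in for the real map's max.
class SerialNumberMapSketch<T> {
  private static final int MAX = 1 << 24;
  private final AtomicInteger current = new AtomicInteger(1);
  private final ConcurrentHashMap<T, Integer> t2i = new ConcurrentHashMap<>();
  private final ConcurrentHashMap<Integer, T> i2t = new ConcurrentHashMap<>();

  public int get(T t) {
    if (t == null) {
      return 0;
    }
    Integer sn = t2i.get(t);
    if (sn == null) {
      sn = current.getAndIncrement();
      if (sn > MAX) {
        current.getAndDecrement();
        throw new IllegalStateException("serial number map is full");
      }
      i2t.put(sn, t);                  // reverse entry first; sn is still private
      Integer old = t2i.putIfAbsent(t, sn);
      if (old != null) {
        i2t.remove(sn);                // lost the race: retract our entry only,
        return old;                    // leaking sn rather than racily rolling
      }                                // back the counter
    }
    return sn;
  }

  public T get(int sn) {
    return sn == 0 ? null : i2t.get(sn);
  }
}
```

The trade-off in this sketch is a gap in the serial-number space on a lost race, in exchange for the reverse map never holding a stale owner.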
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1146/

[Feb 22, 2023, 8:37:35 PM] (github) YARN-11370. [Federation] Refactor MemoryFederationStateStore code. (#5126)
[Feb 22, 2023, 9:58:44 PM] (github) HDFS-16901: RBF: Propagates real user's username via the caller context, when a proxy user is being used. (#5346)
[Feb 23, 2023, 12:37:49 AM] (Owen O'Malley) HDFS-16901: Minor fix for unit test.
[Feb 23, 2023, 11:23:53 AM] (Steve Loughran) Revert "HADOOP-18590. Publish SBOM artifacts (#5281)"
[Feb 23, 2023, 1:23:35 PM] (Steve Loughran) HADOOP-18470. Remove HDFS RBF text in the 3.3.5 index.md file

-1 overall

The following subsystems voted -1:
    blanks hadolint pathlen spotbugs unit xml

The following subsystems voted -1 but
were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    XML :

       Parsing Error(s):
       hadoop-common-project/hadoop-common/src/test/resources/xml/external-dtd.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml

    spotbugs :

       module:hadoop-mapreduce-project/hadoop-mapreduce-client
       Write to static field org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:[line 120]

       module:hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core
       Write to static field org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:[line 120]

       module:hadoop-mapreduce-project
       Write to static field org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:[line 120]

       module:root
       Write to static field org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:[line 120]

    Failed junit tests :

       hadoop.mapred.TestShuffleHandler

   cc:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1146/artifact/out/results-compile-cc-root.txt [96K]

   javac:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1146/artifact/out/results-compile-javac-root.txt [528K]

   blanks:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1146/artifact/out/blanks-eol.txt [14M]
      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1146/artifact/out/blanks-tabs.txt [2.0M]

   checksty
[jira] [Created] (HDFS-16934) org.apache.hadoop.hdfs.tools.TestDFSAdmin#testAllDatanodesReconfig regression
Steve Loughran created HDFS-16934:
-------------------------------------

             Summary: org.apache.hadoop.hdfs.tools.TestDFSAdmin#testAllDatanodesReconfig regression
                 Key: HDFS-16934
                 URL: https://issues.apache.org/jira/browse/HDFS-16934
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: dfsadmin, test
    Affects Versions: 3.4.0, 3.3.5, 3.3.9
            Reporter: Steve Loughran

Jenkins test failure because the logged output is in the wrong order for the assertions. HDFS-16624 flipped the order...without that this would have worked.

{code}
java.lang.AssertionError
	at org.junit.Assert.fail(Assert.java:87)
	at org.junit.Assert.assertTrue(Assert.java:42)
	at org.junit.Assert.assertTrue(Assert.java:53)
	at org.apache.hadoop.hdfs.tools.TestDFSAdmin.testAllDatanodesReconfig(TestDFSAdmin.java:1149)
{code}

Here the code is asserting about the contents of the output:

{code}
assertTrue(outs.get(0).startsWith("Reconfiguring status for node"));
assertTrue("SUCCESS: Changed property dfs.datanode.peer.stats.enabled".equals(outs.get(2))
    || "SUCCESS: Changed property dfs.datanode.peer.stats.enabled".equals(outs.get(1))); // here
assertTrue("\tFrom: \"false\"".equals(outs.get(3))
    || "\tFrom: \"false\"".equals(outs.get(2)));
assertTrue("\tTo: \"true\"".equals(outs.get(4))
    || "\tTo: \"true\"".equals(outs.get(3)));
{code}

If you look at the log, the actual line is appearing in that list, just in a different place: a race condition.

{code}
2023-02-24 01:02:06,275 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:testAllDatanodesReconfig(1146)) - dfsadmin -status -livenodes output:
2023-02-24 01:02:06,276 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - Reconfiguring status for node [127.0.0.1:41795]: started at Fri Feb 24 01:02:03 GMT 2023 and finished at Fri Feb 24 01:02:03 GMT 2023.
2023-02-24 01:02:06,276 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - Reconfiguring status for node [127.0.0.1:34007]: started at Fri Feb 24 01:02:03 GMT 2023
SUCCESS: Changed property dfs.datanode.peer.stats.enabled
2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) -	From: "false"
2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) -	To: "true"
2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - and finished at Fri Feb 24 01:02:03 GMT 2023.
2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - SUCCESS: Changed property dfs.datanode.peer.stats.enabled
{code}

We have a race condition in output generation, and the assertions are clearly too brittle. For the 3.3.5 release I'm not going to make this a blocker. What I will do is propose that the asserts move to AssertJ with an assertion that the collection "containsExactlyInAnyOrder" all the strings. That will

1. not be brittle.
2. give nice errors on failure.
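The order-insensitive check that an AssertJ `containsExactlyInAnyOrder`-style assertion would give can be illustrated with plain collections; the helper below and the sample lines are invented stand-ins (the real test would use AssertJ's `assertThat(outs)`, which additionally reports the missing lines on failure):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Plain-Java illustration of the proposal above: assert that the
// expected lines are all present, regardless of the index each one
// lands at, instead of asserting on hard-coded positions. Class and
// method names are hypothetical, not part of TestDFSAdmin.
class InAnyOrderDemo {
  static boolean containsAll(List<String> actual, List<String> expected) {
    // Set membership: position in the output no longer matters.
    return new HashSet<>(actual).containsAll(expected);
  }
}
```

An index-based assert like `outs.get(2).equals(...)` fails whenever the logging race reshuffles the lines; the membership check above passes for every interleaving that still contains all the expected lines.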
[jira] [Created] (HDFS-16935) TestFsDatasetImpl.testReportBadBlocks brittle
Steve Loughran created HDFS-16935:
-------------------------------------

             Summary: TestFsDatasetImpl.testReportBadBlocks brittle
                 Key: HDFS-16935
                 URL: https://issues.apache.org/jira/browse/HDFS-16935
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: test
    Affects Versions: 3.4.0, 3.3.5, 3.3.9
            Reporter: Steve Loughran

Jenkins failure because the sleep() time is not long enough.

{code}
Failing for the past 1 build (Since #4 )
Took 7.4 sec.
Error Message
expected:<1> but was:<0>
Stacktrace
java.lang.AssertionError: expected:<1> but was:<0>
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.failNotEquals(Assert.java:835)
	at org.junit.Assert.assertEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:633)
{code}

The assert is after a 3s sleep waiting for reports to come in.

{code}
dataNode.reportBadBlocks(block, dataNode.getFSDataset()
    .getFsVolumeReferences().get(0));
Thread.sleep(3000); // 3s sleep
BlockManagerTestUtil.updateState(cluster.getNamesystem()
    .getBlockManager());
// Verify the bad block has been reported to namenode
Assert.assertEquals(1, cluster.getNamesystem().getCorruptReplicaBlocks()); // here
{code}

LambdaTestUtils.eventually() should be used around this assert, maybe with an even shorter initial delay so that on faster systems the test is faster.
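The eventually() proposal amounts to polling with a deadline instead of a single fixed sleep. A minimal plain-Java sketch of that pattern follows; the class name, method name, and parameters are invented for illustration and are not the LambdaTestUtils API:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Minimal sketch of the retry-until-deadline pattern behind
// LambdaTestUtils.eventually(): re-check the condition at a short
// interval and fail only when the deadline passes. On a fast machine
// the test returns as soon as the report lands instead of always
// paying a fixed 3s sleep; on a slow one it keeps waiting up to the
// timeout instead of failing. awaitEquals is a hypothetical helper.
class Eventually {
  static void awaitEquals(long expected, LongSupplier actual,
      long timeoutMs, long intervalMs) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (true) {
      long got = actual.getAsLong();
      if (got == expected) {
        return; // condition met: fast path
      }
      if (System.currentTimeMillis() >= deadline) {
        throw new AssertionError(
            "expected:<" + expected + "> but was:<" + got + ">");
      }
      try {
        Thread.sleep(intervalMs); // short probe interval, e.g. 100ms
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new AssertionError("interrupted while waiting", e);
      }
    }
  }
}
```

In the test above, the fixed `Thread.sleep(3000)` plus `assertEquals` would become one call polling `getCorruptReplicaBlocks()` with, say, a 100ms interval and a generous total timeout.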
Re: [VOTE] Release Apache Hadoop 3.3.5
need this pr in too, https://github.com/apache/hadoop/pull/5429

1. cuts back on some transitive dependencies from hadoop-aliyun
2. fixes LICENSE-bin to be correct

#2 is the blocker...and it looks like 3.2.x will also need fixup as well as
the later ones -hadoop binaries have shipped without that file being up to
date, but at least all the transitive stuff is correctly licensed. And I
think we need to change the PR template to mention transitive updates in
the license bit too.

if this goes in, I will do the rebuild on Monday UK time

On Thu, 23 Feb 2023 at 11:18, Steve Loughran wrote:

>
> And I've just hit HADOOP-18641. cyclonedx maven plugin breaks on recent
> maven releases (3.9.0)
>
> on a new local build with maven updated on homebrew (which i needed for
> spark). so a code change too. That issue doesn't surface on our
> release dockers, but will hit other people. especially over time. Going to
> revert HADOOP-18590. Publish SBOM artifacts (#5281)
>
>
>
> On Thu, 23 Feb 2023 at 10:29, Steve Loughran wrote:
>
>> ok, let me cancel, update those jiras and kick off again. that will save
>> anyone else having to do their homework
>>
>> On Thu, 23 Feb 2023 at 08:56, Takanobu Asanuma
>> wrote:
>>
>>> I'm now -1 as I found the wrong information on the top page (index.md).
>>>
>>> > 1. HDFS-13522, HDFS-16767 & Related Jiras: Allow Observer Reads in HDFS
>>> Router Based Federation.
>>>
>>> The fix version of HDFS-13522 and HDFS-16767 also included 3.3.5 before,
>>> though it is actually not in branch-3.3. I corrected the fix version and
>>> created HDFS-16889 to backport them to branch-3.3 about a month ago.
>>> Unfortunately, it won't be fixed soon. I should have let you know at that
>>> time, sorry. Supporting Observer NameNode in RBF is a prominent feature.
>>> So I think we have to delete the description from the top page not to
>>> confuse Hadoop users.
>>>
>>> - Takanobu
>>>
>>> On Thu, Feb 23, 2023 at 17:17 Takanobu Asanuma wrote:
>>>
>>> > Thanks for driving the release, Steve and Mukund.
>>> >
>>> > I found that there were some jiras with wrong fix versions.
>>> >
>>> > The fix versions included 3.3.5, but actually, it isn't in 3.3.5-RC1:
>>> > - HDFS-16845
>>> > - HADOOP-18345
>>> >
>>> > The fix versions didn't include 3.3.5, but actually, it is in 3.3.5-RC1
>>> > (and it is not in release-3.3.4):
>>> > - HADOOP-17276
>>> > - HDFS-13293
>>> > - HDFS-15630
>>> > - HDFS-16266
>>> > - HADOOP-18003
>>> > - HDFS-16310
>>> > - HADOOP-18014
>>> >
>>> > I corrected all the wrong fix versions just now. I'm not sure we should
>>> > revote it since it only affects the changelog.
>>> >
>>> > - Takanobu
>>> >
>>> > On Tue, Feb 21, 2023 at 22:43 Steve Loughran wrote:
>>> >
>>> >> Apache Hadoop 3.3.5
>>> >>
>>> >> Mukund and I have put together a release candidate (RC1) for Hadoop
>>> >> 3.3.5.
>>> >>
>>> >> What we would like is for anyone who can to verify the tarballs,
>>> >> especially anyone who can try the arm64 binaries as we want to
>>> >> include them too.
>>> >>
>>> >> The RC is available at:
>>> >> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC1/
>>> >>
>>> >> The git tag is release-3.3.5-RC1, commit 274f91a3259
>>> >>
>>> >> The maven artifacts are staged at
>>> >> https://repository.apache.org/content/repositories/orgapachehadoop-1368/
>>> >>
>>> >> You can find my public key at:
>>> >> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>>> >>
>>> >> Change log
>>> >> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC1/CHANGELOG.md
>>> >>
>>> >> Release notes
>>> >> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC1/RELEASENOTES.md
>>> >>
>>> >> This is off branch-3.3 and is the first big release since 3.3.2.
>>> >>
>>> >> Key changes include
>>> >>
>>> >> * Big update of dependencies to try and keep those reports of
>>> >>   transitive CVEs under control -both genuine and false positives.
>>> >> * HDFS RBF enhancements
>>> >> * Critical fix to ABFS input stream prefetching for correct reading.
>>> >> * Vectored IO API for all FSDataInputStream implementations, with
>>> >>   high-performance versions for file:// and s3a:// filesystems.
>>> >>   file:// through java native io
>>> >>   s3a:// parallel GET requests.
>>> >> * This release includes Arm64 binaries. Please can anyone with
>>> >>   compatible systems validate these.
>>> >>
>>> >> Note, because the arm64 binaries are built separately on a different
>>> >> platform and JVM, their jar files may not match those of the x86
>>> >> release -and therefore the maven artifacts. I don't think this is
>>> >> an issue (the ASF actually releases source tarballs, the binaries are
>>> >> there for help only, though with the maven repo that's a bit blurred).
>>> >>
>>> >> The only way to be consistent would actually untar the x86.tar.gz,
>>> >> overwrite its binaries with the arm stuff, retar, sign and push out
>>> >> for the vote. Even automating that would be risky.
>>> >>
>>> >> Please try the release
Re: [VOTE] Release Apache Hadoop 3.3.5
> > And i
> think we need to change the PR template to mention transitive updates in
> the license bit too

Not sure if that is gonna help; people might ignore that or check it with
overconfidence. No harm though..

BTW Ozone has some cool stuff to handle this, it was added here:
https://github.com/apache/ozone/pull/2199

It checks for each PR whether the changes bring in any new transitive
dependency, and if they do, it flags that, and then the licence and all can
be managed. Worth exploring.

-Ayush

On Sat, 25 Feb 2023 at 01:09, Steve Loughran wrote:

> need this pr in too, https://github.com/apache/hadoop/pull/5429
>
>    1. cuts back on some transitive dependencies from hadoop-aliyun
>    2. fixes LICENSE-bin to be correct
>
> #2 is the blocker...and it looks like 3.2.x will also need fixup as well as
> the later ones -hadoop binaries have shipped without that file being up to
> date, but at least all the transitive stuff is correctly licensed. And i
> think we need to change the PR template to mention transitive updates in
> the license bit too
>
> if this goes in, I will do the rebuild on monday UK time
>
> [remainder of the quoted thread trimmed; see the previous message]
Socket timeout settings
We have a specific environment where we need to harmonize socket connection timeouts for all Hadoop daemons and some downstream projects too. While reviewing the socket connection timeouts set in NetUtils and UrlConnection (HttpURLConnection), I compiled a list of the following configurations:

- ipc.client.connect.timeout
- dfs.client.socket-timeout
- dfs.datanode.socket.write.timeout
- dfs.client.fsck.connect.timeout
- dfs.client.fsck.read.timeout
- dfs.federation.router.connect.timeout
- dfs.qjournal.http.open.timeout.ms
- dfs.qjournal.http.read.timeout.ms
- dfs.checksum.ec.socket-timeout
- hadoop.security.kms.client.timeout
- mapreduce.reduce.shuffle.connect.timeout
- mapreduce.reduce.shuffle.read.timeout

Moreover, although "dfs.datanode.socket.reuse.keepalive" does not indicate a direct socket timeout, we set it as SocketOptions#SO_TIMEOUT if opsProcessed != 0 (to block a read on the InputStream only for this timeout, beyond which it would result in SocketTimeoutException). Similarly, "ipc.ping.interval" and "ipc.client.rpc-timeout.ms" are also used to set SocketOptions#SO_TIMEOUT on the socket.

It's possible that I may have missed some socket timeout configs in the above list. If anyone could provide feedback on this list or suggest any missing configs, it would be greatly appreciated.
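For anyone less familiar with the SO_TIMEOUT semantics mentioned above, they can be demonstrated with a bare JDK socket pair; this is a generic illustration (class and method names are invented), not tied to any of the Hadoop configs listed:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Generic JDK illustration of SocketOptions#SO_TIMEOUT: a read() on the
// socket's InputStream blocks for at most the configured timeout, then
// throws SocketTimeoutException while the connection itself stays open.
// SoTimeoutDemo/readTimesOut are hypothetical names for this sketch.
class SoTimeoutDemo {
  static boolean readTimesOut(int timeoutMs) {
    try (ServerSocket server =
             new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
         Socket client =
             new Socket(server.getInetAddress(), server.getLocalPort());
         Socket peer = server.accept()) {       // peer never writes a byte
      client.setSoTimeout(timeoutMs);           // SocketOptions#SO_TIMEOUT
      client.getInputStream().read();           // blocks up to timeoutMs
      return false;                             // peer wrote or closed first
    } catch (SocketTimeoutException expected) {
      return true;                              // the read timed out
    } catch (IOException e) {
      throw new UncheckedIOException(e);        // setup failure, not a timeout
    }
  }
}
```

This is the behaviour the read-style configs in the list (e.g. the shuffle and fsck read timeouts) ultimately map onto, as opposed to the connect-style ones, which bound connection establishment instead.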
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1147/

No changes

-1 overall

The following subsystems voted -1:
    blanks hadolint pathlen spotbugs unit xml

The following subsystems voted -1 but
were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    XML :

       Parsing Error(s):
       hadoop-common-project/hadoop-common/src/test/resources/xml/external-dtd.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml

    spotbugs :

       module:hadoop-mapreduce-project/hadoop-mapreduce-client
       Write to static field org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:[line 120]

       module:hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core
       Write to static field org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:[line 120]

       module:hadoop-mapreduce-project
       Write to static field org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:[line 120]

       module:root
       Write to static field org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:[line 120]

    Failed junit tests :

       hadoop.mapred.TestShuffleHandler

   cc:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1147/artifact/out/results-compile-cc-root.txt [96K]

   javac:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1147/artifact/out/results-compile-javac-root.txt [528K]

   blanks:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1147/artifact/out/blanks-eol.txt [14M]
      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1147/artifact/out/blanks-tabs.txt [2.0M]

   checkstyle:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1147/artifact/out/results-checkstyle-root.txt [13M]

   hadolint:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1147/artifact/out/results-hadolint.txt [8.0K]

   pathlen:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1147/artifact/out/results-pathlen.txt [16K]

   pylint:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1147/artifact/out/results-pylint.txt [20K]