[jira] [Created] (HDFS-17332) DFSInputStream: avoid logging stacktrace until when we really need to fail a read request with a MissingBlockException

2024-01-09 Thread Xing Lin (Jira)
Xing Lin created HDFS-17332:
---

 Summary: DFSInputStream: avoid logging stacktrace until when we 
really need to fail a read request with a MissingBlockException
 Key: HDFS-17332
 URL: https://issues.apache.org/jira/browse/HDFS-17332
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
 Environment: In DFSInputStream#actualGetFromOneDataNode(), the exception 
stacktrace is sent to dfsClient.LOG whenever a read from a DN fails. 
However, in most cases, the read request is then served successfully by reading 
from the next available DN. The stacktrace in the log has led multiple Hadoop 
users at LinkedIn to mistake this WARN message for the root cause/fatal error 
of their jobs. We would like to improve the log message and avoid sending the 
stacktrace to dfsClient.LOG when a read eventually succeeds. The stacktrace 
from reading each DN should be sent to the log only when we really need to 
fail a read request (when chooseDataNode()/refetchLocations() throws a 
BlockMissingException), as sketched below.
Reporter: Xing Lin
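
A self-contained sketch of the intended logging pattern (class and method names 
such as RetryingReader and Source are illustrative, not the actual 
DFSInputStream code): per-DN failures are logged at WARN without a stacktrace, 
and the full stacktrace is attached only when the read ultimately fails.
{code:java}
import java.io.IOException;
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class RetryingReader {
  private static final Logger LOG = LoggerFactory.getLogger(RetryingReader.class);

  interface Source {
    byte[] read() throws IOException;
  }

  byte[] readWithFailover(List<Source> replicas) throws IOException {
    IOException lastException = null;
    for (Source replica : replicas) {
      try {
        return replica.read();
      } catch (IOException e) {
        lastException = e;
        // No stacktrace here: in most cases the next replica serves the read.
        LOG.warn("Read failed on one replica, trying the next: {}", e.toString());
      }
    }
    if (lastException == null) {
      lastException = new IOException("no replicas to read from");
    }
    // Only now, when the request really fails, do we log the full stacktrace.
    LOG.error("Read failed on all replicas", lastException);
    throw lastException;
  }
}
{code}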









[jira] [Created] (HDFS-17286) Add UDP as a transfer protocol for HDFS

2023-12-12 Thread Xing Lin (Jira)
Xing Lin created HDFS-17286:
---

 Summary: Add UDP as a transfer protocol for HDFS
 Key: HDFS-17286
 URL: https://issues.apache.org/jira/browse/HDFS-17286
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Reporter: Xing Lin


Right now, every connection in HDFS is based on RPC/IPC, which is built on TCP. 
Connections are re-used based on ConnectionId, which includes rpcTimeout as part 
of the key identifying a connection. The consequence is that using a different 
RPC timeout between the same two hosts creates separate TCP connections.

A use case which motivated us to consider UDP is getHAServiceState() in 
ObserverReadProxyProvider. We'd like getHAServiceState() to time out with a 
much smaller timeout threshold and move on to probe the next NameNode. To 
support this, we used an ExecutorService and set a timeout for the task in 
HDFS-17030. That implementation could be improved by using UDP to query the 
HAServiceState. getHAServiceState() does not have to be very reliable, as we 
can always fall back to the active.

Another motivation is that roughly 5-10% of the RPC calls hitting our 
active/observer NameNodes are getHAServiceState(). If we can move them off to 
a UDP server, that can hopefully improve RPC latency.
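
A hypothetical sketch of such a UDP probe with a short socket timeout (the wire 
format, the probe client, and the server side are all assumptions; HDFS has no 
UDP transport today):
{code:java}
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

class HAStateUdpProbe {
  /** Returns the state string reported by the NameNode, or null on timeout. */
  static String probe(InetSocketAddress nn, int timeoutMs) throws Exception {
    try (DatagramSocket socket = new DatagramSocket()) {
      socket.setSoTimeout(timeoutMs); // fail fast instead of waiting for rpcTimeOut
      byte[] req = "getHAServiceState".getBytes(StandardCharsets.UTF_8);
      socket.send(new DatagramPacket(req, req.length, nn));
      byte[] buf = new byte[64];
      DatagramPacket resp = new DatagramPacket(buf, buf.length);
      try {
        socket.receive(resp);
        return new String(resp.getData(), 0, resp.getLength(), StandardCharsets.UTF_8);
      } catch (SocketTimeoutException e) {
        return null; // unreliable by design: the caller falls back to the active NN
      }
    }
  }
}
{code}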

 

 

 






[jira] [Created] (HDFS-17281) Added support of reporting RPC round-trip time at NN.

2023-12-08 Thread Xing Lin (Jira)
Xing Lin created HDFS-17281:
---

 Summary: Added support of reporting RPC round-trip time at NN.
 Key: HDFS-17281
 URL: https://issues.apache.org/jira/browse/HDFS-17281
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Reporter: Xing Lin
Assignee: Xing Lin









[jira] [Created] (HDFS-17118) Fix minor checkstyle warnings in TestObserverReadProxyProvider

2023-07-23 Thread Xing Lin (Jira)
Xing Lin created HDFS-17118:
---

 Summary: Fix minor checkstyle warnings in 
TestObserverReadProxyProvider
 Key: HDFS-17118
 URL: https://issues.apache.org/jira/browse/HDFS-17118
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.4.0
Reporter: Xing Lin


We noticed a few checkstyle warnings when backporting HDFS-17030 from trunk to 
branch-3.3. The Yetus build was not stable at the time, so we did not notice 
the newly introduced checkstyle warnings.

 

PR: https://github.com/apache/hadoop/pull/5700






[jira] [Created] (HDFS-17067) allowCoreThreadTimeOut should be set to true for nnProbingThreadPool in ObserverReadProxy

2023-07-03 Thread Xing Lin (Jira)
Xing Lin created HDFS-17067:
---

 Summary: allowCoreThreadTimeOut should be set to true for 
nnProbingThreadPool in ObserverReadProxy
 Key: HDFS-17067
 URL: https://issues.apache.org/jira/browse/HDFS-17067
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Affects Versions: 3.4.0
Reporter: Xing Lin
Assignee: Xing Lin


In HDFS-17030, we introduced an ExecutorService to submit getHAServiceState() 
requests. We constructed the ExecutorService directly from a basic 
ThreadPoolExecutor, without setting _allowCoreThreadTimeOut_ to true. As a 
result, the core thread is kept alive even after the main thread exits. One 
fix is to set _allowCoreThreadTimeOut_ to true; however, in this PR we decided 
to use an existing ExecutorService implementation in Hadoop 
(_BlockingThreadPoolExecutorService_) instead, since it takes care of setting 
_allowCoreThreadTimeOut_ and supports setting the thread name prefix.

A second, minor issue is that we did not shut down the ExecutorService in 
close(). It is minor because close() is only called when the garbage collector 
starts to reclaim an ObserverReadProxyProvider object, not when the last 
reference to the object is dropped. The time between when an 
ObserverReadProxyProvider becomes unreferenced and when the garbage collector 
actually reclaims it is out of our control/undefined (unless the program exits 
with an explicit System.exit(1)).

The problematic construction (no core-thread timeout):


{code:java}
  private final ExecutorService nnProbingThreadPool =
      new ThreadPoolExecutor(1, 4, 1L, TimeUnit.MINUTES,
          new ArrayBlockingQueue<Runnable>(1024));
{code}
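
A minimal sketch of the fix using plain JDK classes (the PR itself switches to 
Hadoop's _BlockingThreadPoolExecutorService_; this factory is illustrative):
{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ProbePoolFactory {
  static ExecutorService newProbePool() {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        1, 4, 1L, TimeUnit.MINUTES,
        new ArrayBlockingQueue<Runnable>(1024));
    // Let the core thread die once the keep-alive expires, so an idle pool
    // does not keep the JVM alive after the main thread exits.
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }
}
{code}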







[jira] [Created] (HDFS-17055) Export HAState as a metric from Namenode for monitoring

2023-06-21 Thread Xing Lin (Jira)
Xing Lin created HDFS-17055:
---

 Summary: Export HAState as a metric from Namenode for monitoring
 Key: HDFS-17055
 URL: https://issues.apache.org/jira/browse/HDFS-17055
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.4.0, 3.3.9
Reporter: Xing Lin


We'd like to measure the uptime of our NameNodes: the percentage of time when 
the active/standby/observer node is available (up and running). We could 
monitor the NameNodes from an external service, such as ZKFC, but that would 
require the external service itself to be 100% available, and when this 
third-party monitoring service is down, we would have no information on 
whether our NameNodes are still up.

We propose a different approach: emit the NameNode state directly from the 
NameNode itself. Whenever we miss a data point for this metric, we consider 
the corresponding NameNode to be down/unavailable. In other words, we assume 
the metric collection/monitoring infrastructure is 100% reliable.

One implementation detail: Hadoop has the _NameNodeMetrics_ class, which emits 
all metrics for _NameNode.java_. However, we don't think that is a good place 
to emit the NameNode HAState. HAState is stored in NameNode.java, and we 
should emit it directly from NameNode.java; otherwise we would duplicate this 
info in two classes and have to keep them in sync. Besides, the 
_NameNodeMetrics_ class does not hold a reference to the _NameNode_ object it 
belongs to; a _NameNodeMetrics_ is created by the _static_ function 
_initMetrics()_ in _NameNode.java_. We shouldn't emit the HA state from 
FSNamesystem.java either, as it is initialized from NameNode.java and all 
state transitions are implemented in NameNode.java.
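
A hedged sketch of emitting the HA state as a metrics2 gauge owned by the 
NameNode side (the metric name, the state-to-integer encoding, and the source 
class are assumptions, not the committed implementation):
{code:java}
import org.apache.hadoop.ha.HAServiceProtocol.HAServiceState;
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;

// Hypothetical metrics source owned by NameNode.java, so the HA state is read
// where it lives instead of being duplicated into NameNodeMetrics.
@Metrics(context = "dfs")
class NameNodeHAStateSource {
  private volatile HAServiceState state = HAServiceState.STANDBY;

  void setState(HAServiceState newState) { // called on each state transition
    this.state = newState;
  }

  // Emitted as a gauge; the ordinal encoding is an assumption. A missing data
  // point for this metric means the NameNode is down.
  @Metric({"HAState", "Current HA state of this NameNode"})
  public int getHAState() {
    return state.ordinal();
  }

  static NameNodeHAStateSource register() {
    return DefaultMetricsSystem.instance().register(
        "NameNodeHAState", "HA state of the NameNode", new NameNodeHAStateSource());
  }
}
{code}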

 






[jira] [Created] (HDFS-17042) Add rpcCallSuccesses and OverallRpcProcessingTime to RpcMetrics for Namenode

2023-06-09 Thread Xing Lin (Jira)
Xing Lin created HDFS-17042:
---

 Summary: Add rpcCallSuccesses and OverallRpcProcessingTime to 
RpcMetrics for Namenode
 Key: HDFS-17042
 URL: https://issues.apache.org/jira/browse/HDFS-17042
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.4.0, 3.3.9
Reporter: Xing Lin
Assignee: Xing Lin


We'd like to add two new metrics to the existing 
RpcMetrics/RpcDetailedMetrics:
 * _RpcCallSuccesses_: the number of RPC requests that are successfully 
processed by a NN (i.e., answered with an _RpcStatusProto.SUCCESS_ status). 
Together with _RpcQueueNumOps_ (the total number of RPC requests), it lets us 
derive the RpcErrorRate for our NN as (RpcQueueNumOps - RpcCallSuccesses) / 
RpcQueueNumOps; see the sketch after this list.
 * _OverallRpcProcessingTime_ for each RPC method: measures the overall RPC 
processing time for each RPC method at the NN, covering the time from when a 
request arrives at the NN to when the response is sent back. We already emit 
processingTime for each RPC method in RpcDetailedMetrics; we want to extend 
this to also emit overallRpcProcessingTime per RPC method, which includes 
enqueueTime, queueTime, processingTime, responseTime, and handlerTime.
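
A small sketch of the derived error rate (plain arithmetic over the two 
counters named above):
{code:java}
class RpcErrorRate {
  /** (RpcQueueNumOps - RpcCallSuccesses) / RpcQueueNumOps, guarding against division by zero. */
  static double of(long rpcQueueNumOps, long rpcCallSuccesses) {
    if (rpcQueueNumOps == 0) {
      return 0.0; // no RPC requests observed yet
    }
    return (double) (rpcQueueNumOps - rpcCallSuccesses) / rpcQueueNumOps;
  }
}
{code}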

 






[jira] [Created] (HDFS-17030) Limit wait time for getHAServiceState in ObserverReaderProxy

2023-05-29 Thread Xing Lin (Jira)
Xing Lin created HDFS-17030:
---

 Summary: Limit wait time for getHAServiceState in 
ObserverReaderProxy
 Key: HDFS-17030
 URL: https://issues.apache.org/jira/browse/HDFS-17030
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.4.0
Reporter: Xing Lin


When HA is enabled and a standby NN is not responsive (either because it is 
down or because a heap dump is being taken), we wait for either 
_socket_connection_timeout * socket_max_retries_on_connection_timeout_ or 
_rpcTimeOut_ before moving on to the next NN. This adds significant latency. 
For clusters at LinkedIn, we set rpcTimeOut to 120 seconds, so a request can 
take more than 2 minutes to complete while we take a heap dump at a standby. 
This has been causing user job failures.

The proposal is to add a timeout on getHAServiceState() calls in 
ObserverReaderProxy: we wait only up to that timeout for an NN to report its 
HA state, and once the timeout passes, we move on to the next NN.
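
A minimal sketch of the bounded wait (assuming an executor runs the 
getHAServiceState() RPC; this mirrors the approach described above, not the 
committed patch):
{code:java}
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BoundedHAStateProbe {
  private final ExecutorService executor = Executors.newSingleThreadExecutor();

  /** Returns the reported state, or null if the NameNode did not answer in time. */
  String getHAServiceState(Callable<String> rpc, long timeoutMs) {
    Future<String> future = executor.submit(rpc);
    try {
      return future.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true); // stop waiting; probe the next NameNode instead
      return null;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    } catch (ExecutionException e) {
      return null;
    }
  }
}
{code}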

 






[jira] [Created] (HDFS-16852) swallow IllegalStateException in KeyProviderCache

2022-11-22 Thread Xing Lin (Jira)
Xing Lin created HDFS-16852:
---

 Summary: swallow IllegalStateException in KeyProviderCache
 Key: HDFS-16852
 URL: https://issues.apache.org/jira/browse/HDFS-16852
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Reporter: Xing Lin


When an HDFS client is created, it registers a shutdown hook with 
ShutdownHookManager. ShutdownHookManager doesn't allow adding a new shutdown 
hook when the process is already shutting down; it throws an 
IllegalStateException instead.

This behavior is problematic when a Spark program fails during pre-launch. In 
that case, during shutdown, Spark calls cleanupStagingDir() to clean the 
staging dir. cleanupStagingDir() creates a FileSystem object to talk to HDFS. 
Since this is the first use of a FileSystem object in that process, an HDFS 
client must be created and the shutdown hook registered, and we hit the 
IllegalStateException. This IllegalStateException masks the actual exception 
that caused the Spark program to fail during pre-launch.

We propose to swallow the IllegalStateException in KeyProviderCache and log a 
warning instead, as sketched below. The TCP connection between the client and 
the NameNode is closed by the OS when the process shuts down.
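
A minimal sketch of the proposed handling (the wrapper class is illustrative, 
not the actual KeyProviderCache code):
{code:java}
import org.apache.hadoop.util.ShutdownHookManager;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class SafeHookRegistration {
  private static final Logger LOG = LoggerFactory.getLogger(SafeHookRegistration.class);

  static void registerQuietly(Runnable hook, int priority) {
    try {
      ShutdownHookManager.get().addShutdownHook(hook, priority);
    } catch (IllegalStateException e) {
      // Already shutting down: warn instead of propagating, so the original
      // failure is not masked. The OS closes the client's TCP connections to
      // the NameNode on process exit anyway.
      LOG.warn("Cannot add shutdown hook during shutdown: {}", e.toString());
    }
  }
}
{code}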

Example stacktrace:
{code:java}
13-09-2022 14:39:42 PDT INFO - 22/09/13 21:39:41 ERROR util.Utils: Uncaught exception in thread shutdown-hook-0
13-09-2022 14:39:42 PDT INFO - java.lang.IllegalStateException: Shutdown in progress, cannot add a shutdownHook
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:299)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.KeyProviderCache.<init>(KeyProviderCache.java:71)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.ClientContext.<init>(ClientContext.java:130)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.ClientContext.get(ClientContext.java:167)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:383)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:287)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:159)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3261)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3310)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3278)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:475)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.deploy.yarn.ApplicationMaster.cleanupStagingDir(ApplicationMaster.scala:675)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.deploy.yarn.ApplicationMaster.$anonfun$run$2(ApplicationMaster.scala:259)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:214)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$2(ShutdownHookManager.scala:188)
13-09-2022 14:39:42 PDT INFO - at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:2023)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$1(ShutdownHookManager.scala:188)
13-09-2022 14:39:42 PDT INFO - at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
13-09-2022 14:39:42 PDT INFO - at scala.util.Try$.apply(Try.scala:213)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
13-09-2022 14:39:42 PDT INFO - at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
13-09-2022 14:39:42 PDT INFO - at java.util.concurrent.FutureTask.run(FutureTask.java:266)
13-09-2022 14:39:42 PDT INFO - at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
13-09-2022 14:39:42 PDT INFO - at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
13-09-2022 14:39:42 PDT INFO - at java.lang.Thread.run(Thread.java:748)
{code}

[jira] [Created] (HDFS-16818) RBF TestRouterRPCMultipleDestinationMountTableResolver non-deterministic unit tests failures

2022-10-24 Thread Xing Lin (Jira)
Xing Lin created HDFS-16818:
---

 Summary: RBF TestRouterRPCMultipleDestinationMountTableResolver 
non-deterministic unit tests failures
 Key: HDFS-16818
 URL: https://issues.apache.org/jira/browse/HDFS-16818
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: rbf
Affects Versions: 3.4.0
Reporter: Xing Lin


TestRouterRPCMultipleDestinationMountTableResolver fails nondeterministically 
when run repeatedly.

I repeated the following command 10+ times against commit 
454157a3844cdd6c92ef650af6c3b323cbec88af on trunk and observed two types of 
failed runs.
{code:java}
mvn test -Dtest="TestRouterRPCMultipleDestinationMountTableResolver"
{code}

Failed run 1 output:
{code:java}
[ERROR] Failures:
[ERROR]   TestRouterRPCMultipleDestinationMountTableResolver.testInvocationHashAllOrder:177->testInvocation:221->testDirectoryAndFileLevelInvocation:296->verifyDirectoryLevelInvocations:395 expected:<[COLD]> but was:<[HOT]>
[ERROR]   TestRouterRPCMultipleDestinationMountTableResolver.testInvocationHashOrder:193->testInvocation:221->testDirectoryAndFileLevelInvocation:298->verifyDirectoryLevelInvocations:395 expected:<[COLD]> but was:<[HOT]>
[ERROR]   TestRouterRPCMultipleDestinationMountTableResolver.testInvocationLocalOrder:201->testInvocation:221->testDirectoryAndFileLevelInvocation:296->verifyDirectoryLevelInvocations:395 expected:<[COLD]> but was:<[HOT]>
[ERROR]   TestRouterRPCMultipleDestinationMountTableResolver.testInvocationRandomOrder:185->testInvocation:221->testDirectoryAndFileLevelInvocation:296->verifyDirectoryLevelInvocations:395 expected:<[COLD]> but was:<[HOT]>
[ERROR]   TestRouterRPCMultipleDestinationMountTableResolver.testInvocationSpaceOrder:169->testInvocation:221->testDirectoryAndFileLevelInvocation:296->verifyDirectoryLevelInvocations:395 expected:<[COLD]> but was:<[HOT]>
[INFO]
[ERROR] Tests run: 18, Failures: 5, Errors: 0, Skipped: 0
{code}
 

Failed run 2 output:
{code:java}
[ERROR] Failures:
[ERROR]   TestRouterRPCMultipleDestinationMountTableResolver.testECMultipleDestinations:430
[ERROR] Errors:
[ERROR]   TestRouterRPCMultipleDestinationMountTableResolver.testInvocationHashAllOrder:177->testInvocation:221->testDirectoryAndFileLevelInvocation:296->verifyDirectoryLevelInvocations:397 NullPointer
[ERROR]   TestRouterRPCMultipleDestinationMountTableResolver.testInvocationHashOrder:193->testInvocation:221->testDirectoryAndFileLevelInvocation:298->verifyDirectoryLevelInvocations:397 NullPointer
[ERROR]   TestRouterRPCMultipleDestinationMountTableResolver.testInvocationLocalOrder:201->testInvocation:221->testDirectoryAndFileLevelInvocation:296->verifyDirectoryLevelInvocations:397 NullPointer
[ERROR]   TestRouterRPCMultipleDestinationMountTableResolver.testInvocationRandomOrder:185->testInvocation:221->testDirectoryAndFileLevelInvocation:296->verifyDirectoryLevelInvocations:397 NullPointer
[ERROR]   TestRouterRPCMultipleDestinationMountTableResolver.testInvocationSpaceOrder:169->testInvocation:221->testDirectoryAndFileLevelInvocation:296->verifyDirectoryLevelInvocations:397 NullPointer
[INFO]
[ERROR] Tests run: 18, Failures: 1, Errors: 5, Skipped: 0
{code}






[jira] [Created] (HDFS-16816) RBF: auto-create user home dir for trash paths by router

2022-10-24 Thread Xing Lin (Jira)
Xing Lin created HDFS-16816:
---

 Summary: RBF: auto-create user home dir for trash paths by router
 Key: HDFS-16816
 URL: https://issues.apache.org/jira/browse/HDFS-16816
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: rbf
Reporter: Xing Lin


In RBF, trash files are moved to the trash root under the user's home dir at 
the corresponding namespace/namenode where the files reside. This was added in 
HDFS-16024. When the user's home dir has not been created beforehand at a 
namenode, we run into permission-denied exceptions when trying to create the 
parent dir for a trash file before moving the file into it. We propose to 
enhance the Router to auto-create a user's home dir at the namenode for trash 
paths, using the router's identity (which is assumed to be a super-user).
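
A hedged sketch of the proposed router-side behavior (class and method names 
are illustrative, not actual Router code; it assumes the router's identity is 
an HDFS super-user on the downstream namenode):
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

class TrashHomeDirCreator {
  /** Creates /user/<user> on the downstream namenode if a trash path needs it. */
  static void ensureHomeDirForTrash(FileSystem downstreamFs, String user,
      Path trashPath) throws IOException {
    Path home = new Path("/user/" + user);
    if (trashPath.toString().startsWith(home + "/.Trash")
        && !downstreamFs.exists(home)) {
      // Runs with the router's (super-user) credentials, then hands the dir
      // over to the user so subsequent trash moves succeed.
      downstreamFs.mkdirs(home, new FsPermission((short) 0700));
      downstreamFs.setOwner(home, user, user);
    }
  }
}
{code}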






[jira] [Created] (HDFS-16790) rbf wrong path when destination dir is not created

2022-10-02 Thread Xing Lin (Jira)
Xing Lin created HDFS-16790:
---

 Summary: rbf wrong path when destination dir is not created
 Key: HDFS-16790
 URL: https://issues.apache.org/jira/browse/HDFS-16790
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: rbf
Affects Versions: 3.4.0
Reporter: Xing Lin


Mount table at the router:
{code:java}
$HADOOP_HOME/bin/hdfs dfsrouteradmin -ls
/data1                    ns1->/data
/data2                    ns2->/data
/data3                    ns3->/data
{code}
At a client node, when /data has not been created in ns2, the error message 
shows a wrong path (/data2/data2).
{code:java}
utos@c01:/usr/local/bin/hadoop-3.4.0-SNAPSHOT$ bin/hadoop dfs -ls hdfs://ns-fed/data2
ls: File hdfs://ns-fed/data2/data2 does not exist.

utos@c01:/usr/local/bin/hadoop-3.4.0-SNAPSHOT$ bin/hadoop dfs -ls hdfs://ns-fed/data3
-rw-r--r--   3 utos supergroup  0 2022-10-02 17:35 hdfs://ns-fed/data3/file3
{code}






[jira] [Created] (HDFS-16128) Add support for saving/loading an FS Image

2021-07-13 Thread Xing Lin (Jira)
Xing Lin created HDFS-16128:
---

 Summary: Add support for saving/loading an FS Image
 Key: HDFS-16128
 URL: https://issues.apache.org/jira/browse/HDFS-16128
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs, namenode
Reporter: Xing Lin


We aim to enable fine-grained locking by splitting the in-memory namespace 
into multiple partitions, each with a separate lock, to improve the 
performance of NameNode write operations. This issue adds support for 
saving/loading an FS image for such a partitioned namespace.






[jira] [Created] (HDFS-16125) iterator for PartitionedGSet would visit the first partition twice

2021-07-12 Thread Xing Lin (Jira)
Xing Lin created HDFS-16125:
---

 Summary: iterator for PartitionedGSet would visit the first 
partition twice
 Key: HDFS-16125
 URL: https://issues.apache.org/jira/browse/HDFS-16125
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs, namenode
Reporter: Xing Lin


The iterator in PartitionedGSet visits the first partition twice, because the 
keyIterator is not advanced past the first key during initialization, as 
sketched below.
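
A self-contained sketch of the fix pattern (illustrative names, not the actual 
PartitionedGSet code): the constructor must consume the first partition key 
from keyIterator; otherwise the advance loop fetches the first partition again 
after its entries have already been returned.
{code:java}
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NavigableMap;
import java.util.NoSuchElementException;

class PartitionedIterator<E> implements Iterator<E> {
  private final NavigableMap<Integer, List<E>> partitions; // keyed by partition start key
  private final Iterator<Integer> keyIterator;             // iterates partition keys
  private Iterator<E> entryIterator;                       // iterates current partition

  PartitionedIterator(NavigableMap<Integer, List<E>> partitions) {
    this.partitions = partitions;
    this.keyIterator = partitions.keySet().iterator();
    // The fix: consume the first key here. The buggy version started from the
    // first partition without advancing keyIterator, so hasNext() re-fetched
    // the first partition once it was exhausted, visiting it twice.
    entryIterator = keyIterator.hasNext()
        ? partitions.get(keyIterator.next()).iterator()
        : Collections.<E>emptyIterator();
  }

  @Override
  public boolean hasNext() {
    while (!entryIterator.hasNext() && keyIterator.hasNext()) {
      entryIterator = partitions.get(keyIterator.next()).iterator();
    }
    return entryIterator.hasNext();
  }

  @Override
  public E next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return entryIterator.next();
  }
}
{code}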


