[jira] [Resolved] (HDFS-17538) Add transfer priority queue for decommissioning datanode

2024-06-03 Thread Yuanbo Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuanbo Liu resolved HDFS-17538.
---
Resolution: Duplicate

> Add transfer priority queue for decommissioning datanode
> ---
>
> Key: HDFS-17538
> URL: https://issues.apache.org/jira/browse/HDFS-17538
> Project: Hadoop HDFS
>  Issue Type: Improvement
>    Reporter: Yuanbo Liu
>Priority: Major
> Attachments: image-2024-05-29-16-24-45-601.png, 
> image-2024-05-29-16-26-58-359.png, image-2024-05-29-16-27-35-886.png
>
>
> When decommissioning a datanode, blocks are checked disk by disk and then sent 
> to the DN to trigger transfer work. This makes one disk of the decommissioning 
> DN very busy, leaves CPUs stuck in io-wait under high load, and sometimes even 
> leads to OOM, as shown below:
> !image-2024-05-29-16-24-45-601.png|width=909,height=170!
> !image-2024-05-29-16-26-58-359.png|width=909,height=228!
> !image-2024-05-29-16-27-35-886.png|width=930,height=218!
> Proposal: add a priority queue for transferring blocks when decommissioning a 
> datanode.






[jira] [Created] (HDFS-17538) Add transfer priority queue for decommissioning datanode

2024-05-29 Thread Yuanbo Liu (Jira)
Yuanbo Liu created HDFS-17538:
-

 Summary: Add transfer priority queue for decommissioning datanode
 Key: HDFS-17538
 URL: https://issues.apache.org/jira/browse/HDFS-17538
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yuanbo Liu
 Attachments: image-2024-05-29-16-24-45-601.png, 
image-2024-05-29-16-26-58-359.png, image-2024-05-29-16-27-35-886.png

When decommissioning a datanode, blocks are checked disk by disk and then sent 
to the DN to trigger transfer work. This makes one disk of the decommissioning 
DN very busy, leaves CPUs stuck in io-wait under high load, and sometimes even 
leads to OOM, as shown below:

!image-2024-05-29-16-24-45-601.png!

!image-2024-05-29-16-26-58-359.png!

!image-2024-05-29-16-27-35-886.png!

Proposal: add a priority queue for transferring blocks when decommissioning a 
datanode.
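
For illustration only, here is a minimal sketch of one way transfer scheduling could be spread across disks instead of draining one volume at a time. The class and method names are hypothetical and this is not the actual DataNode code; the issue itself only proposes a priority queue, so this is just one interpretation of the idea.

{code:java}
import java.util.*;

// Hypothetical sketch: keep a queue of pending decommission transfers per volume
// and hand them out in round-robin order, so one disk is not drained alone.
public class TransferSchedulerSketch {
  private final Map<String, Deque<Long>> perVolume = new LinkedHashMap<>();

  public synchronized void add(String volume, long blockId) {
    perVolume.computeIfAbsent(volume, v -> new ArrayDeque<>()).addLast(blockId);
  }

  // Returns block ids in an order that alternates between volumes.
  public synchronized List<Long> drainRoundRobin() {
    List<Long> order = new ArrayList<>();
    boolean progress = true;
    while (progress) {
      progress = false;
      for (Deque<Long> queue : perVolume.values()) {
        Long next = queue.pollFirst();
        if (next != null) {
          order.add(next);
          progress = true;
        }
      }
    }
    return order;
  }
}
{code}

Interleaving transfers this way keeps any single disk from being saturated with io-wait while the other disks sit idle.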






Re: [Discuss] RBF: Asynchronous router RPC.

2024-05-24 Thread Yuanbo Liu
> >>>> queue; When a connection thread of a downstream ns receives a
> response,
> >> it
> >>>> will hand it over to the async response for processing. The async
> >> response
> >>>> thread will determine whether it has received all responses from the
> >>>> downstream ns. If it does, it will continue to process the response.
> >>>> Otherwise, the async response thread will process the next response.
> The
> >>>> asynchronous router uses CompletableFuture.allOf() to implement
> >>>> asynchronous invokeConcurrent, and the handler, async handler, async
> >>>> response, and connection threads still do not need to wait
> >> synchronously.
> >>>> In addition, synchronous routers not only have drawbacks in multi ns
> >>>> environments, but also in single downstream ns situations, it is often
> >>>> difficult to decide how many handlers to set for the router, setting
> it
> >> too
> >>>> much will waste thread resources, and setting it too small will not be
> >> able
> >>>> to give pressure to downstream ns; Asynchronous routers can push
> >> requests
> >>>> to downstream ns without considering how to set handlers. Asynchronous
> >>>> routers can also better connect to more downstream storage services
> that
> >>>> support the HDFS protocol, with better scalability.
> >>>>
> >>>> 3. Since I have not yet deployed asynchronous routers to our own
> cluster,
> >>>> there is no performance comparison. However, theoretically, I believe
> >> that
> >>>> asynchronous routers will occupy more memory than synchronous routers.
> >>>> However, I do not believe that it will occupy a lot, especially since
> we
> >>>> can control the maximum number of requests entering the router, as
> >>>> CompletableFuture is stable and widely used; In other aspects, it
> >> should be
> >>>> far superior to synchronous routers, especially in downstream
> scenarios
> >>>> with more ns. If anyone is interested, you can also help to make a
> >>>> performance comparison
> >>>>
> >>>>> On May 21, 2024 at 11:39, Xiaoqiao He wrote:
> >>>>>
> >>>>> Thanks for this great proposal!
> >>>>>
> >>>>> Some questions after reviewing the design doc (sorry didn't review PR
> >>>>> carefully which is too large.)
> >>>>> 1. This solution will involve RPC framework update, will it affect
> >> other
> >>>>> modules and how to
> >>>>> keep other modules off these changes.
> >>>>> 2. Some RPC requests should be forward concurrently to all downstream
> >> NS,
> >>>>> will it cover
> >>>>> this case in this solution.
> >>>>> 3. Considering there is one init-version implementation, did you
> >> collect
> >>>>> some benchmark vs
> >>>>> the current synchronous model of DFSRouter?
> >>>>> Thanks again.
> >>>>>
> >>>>> Best Regards,
> >>>>> - He Xiaoqiao
> >>>>>
> >>>>> On Tue, May 21, 2024 at 11:21 AM zhangjian <1361320...@qq.com.invalid
> >
> >>>>> wrote:
> >>>>>
> >>>>>> Thank you for your positive attitude towards this feature. You can
> >> debug
> >>>>>> the UTs provided in PR to better understand the current asynchronous
> >>>>>> calling function.
> >>>>>>
> >>>>>>> On May 21, 2024 at 02:04, Simbarashe Dzinamarira wrote:
> >>>>>>>
> >>>>>>> Excited to see this feature as well. I'll spend more time
> >> understanding
> >>>>>> the
> >>>>>>> proposal and implementation.
> >>>>>>>
> >>>>>>> On Mon, May 20, 2024 at 7:55 AM zhangjian
> <1361320...@qq.com.invalid
> >>>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi, Yuanbo liu,  thank you for your interest in this feature, I
> >> think
> >>>>>> the
> >>>>>>>> difficulty of an asynchronous router is not only to implement
> >>>>>> asynchronous
> >>>>>>>> fun

Re: [Discuss] RBF: Asynchronous router RPC.

2024-05-19 Thread Yuanbo Liu
Nice to see this feature brought up. I tried to implement this feature in
our internal clusters, so I know it's a very complicated feature; CC'ing
hdfs-dev to bring more discussion.
By the way, I'm not sure whether the virtual threads of newer JDKs would help
in this case.

On Mon, May 20, 2024 at 10:10 AM zhangjian <1361320...@qq.com.invalid>
wrote:

> Hello everyone, currently there are some shortcomings in the RPC of HDFS
> router:
>
> Currently the router's handler thread is synchronous: when the *handler* 
> thread adds a call to connection.calls, it must wait until the *connection* 
> thread notifies that the call is complete, and only after the response is put
> into the response queue can a new call be obtained from the call queue and
> processed. Therefore, the concurrency performance of the router is limited
> by the number of handlers; a simple example is as follows: If the number of
> handlers is 1 and the maximum number of calls in the connection thread is
> 10, then even if the connection thread can send 10 requests to the
> downstream ns, since the number of handlers is 1, the router can only
> process one request after another.
>
> Since the performance of router rpc is mainly limited by the number of
> handlers, the most effective way to improve rpc performance currently is to
> increase the number of handlers. However, letting the router create a large
> number of handler threads will also increase the number of thread switches
> and cannot maximize the use of machine performance.
>
> There are usually multiple ns downstream of the router. If the handler
> forwards the request to an ns with poor performance, it will cause the
> handler to wait for a long time. Due to the reduction of available
> handlers, the router's ability to handle ns requests with normal
> performance will be reduced. From the perspective of the client, the
> performance of the downstream ns of the router has deteriorated at this
> time. We often find that the call queue of the downstream ns is not high,
> but the call queue of the router is very high.
>
> Therefore, although the main function of the router is to federate and
> handle requests from multiple NSs, the current synchronous RPC performance
> cannot satisfy the scenario where there are many NSs downstream of the
> router. Even if the concurrent performance of the router can be improved by
> increasing the number of handlers, it is still relatively slow. More
> threads will increase the CPU context switching time, and in fact many of
> the handler threads are in a blocked state, which is undoubtedly a waste of
> thread resources. When a request enters the router, there is no guarantee
> that there will be a running handler at this time.
>
>
> Therefore, I consider asynchronous router rpc. Please view the issues:
> https://issues.apache.org/jira/browse/HDFS-17531  for the complete
> solution.
>
> And you can also view this PR: https://github.com/apache/hadoop/pull/6838,
> which is just a demo, but it completes the core asynchronous RPC function.
> If you think asynchronous routing is feasible, we can consider splitting
> this PR for easy review in the future.
>
> The PDF is attached and can also be viewed through issues.
>
> Welcome everyone to exchange and discuss!
>
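
As an editor's illustration of the CompletableFuture.allOf() approach described earlier in this thread, here is a minimal, self-contained sketch; AsyncRouterSketch, callNamespace, and the thread-pool size are hypothetical, and this is not the actual DFSRouter code.

{code:java}
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

// Hypothetical sketch of an asynchronous invokeConcurrent-style call.
public class AsyncRouterSketch {
  // A small pool standing in for the per-namespace connection threads.
  private final ExecutorService connectionPool = Executors.newFixedThreadPool(8);

  // Pretend RPC to one downstream namespace (placeholder for a real non-blocking call).
  private CompletableFuture<String> callNamespace(String nsId, String method) {
    return CompletableFuture.supplyAsync(() -> nsId + ":" + method + ":ok", connectionPool);
  }

  // Fan the same request out to all namespaces; the caller gets a future
  // immediately, so no handler thread blocks on a slow namespace.
  public CompletableFuture<Map<String, String>> invokeConcurrent(
      List<String> namespaces, String method) {
    Map<String, CompletableFuture<String>> pending = namespaces.stream()
        .collect(Collectors.toMap(ns -> ns, ns -> callNamespace(ns, method)));
    return CompletableFuture
        .allOf(pending.values().toArray(new CompletableFuture[0]))
        .thenApply(ignored -> pending.entrySet().stream()
            .collect(Collectors.toMap(Map.Entry::getKey, e -> e.getValue().join())));
  }
}
{code}

The combined future completes only after every namespace has answered, which mirrors the "async response thread waits for all downstream responses" behavior described above without tying up a handler thread.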


Re: [ANNOUNCE] New Hadoop Committer - Haiyang Hu

2024-04-22 Thread Yuanbo Liu
Congratulations

On Mon, Apr 22, 2024 at 12:14 PM Ayush Saxena  wrote:

> Congratulations Haiyang!!!
>
> -Ayush
>
> > On 22 Apr 2024, at 9:41 AM, Xiaoqiao He  wrote:
> >
> > I am pleased to announce that Haiyang Hu has been elected as
> > a committer on the Apache Hadoop project. We appreciate all of
> > Haiyang's work, and look forward to her/his continued contributions.
> >
> > Congratulations and Welcome, Haiyang!
> >
> > Best Regards,
> > - He Xiaoqiao
> > (On behalf of the Apache Hadoop PMC)
> >
>
>


Re: Discussion about NameNode Fine-grained locking

2024-03-05 Thread Yuanbo Liu
I've heard ZGC is better on JDK 17 or above, so I think the major problem is
that we have to upgrade the Hadoop code to work with JDK 17.
We were using JDK 11 with ZGC to test the NN, and didn't see an impressive
improvement.

On Wed, Mar 6, 2024 at 11:53 AM Takanobu Asanuma 
wrote:

> > We're trying tuning gc options and even new gc engine like zgc, but they
> are not very helpful.
>
> I'm afraid this is a digression, but could you elaborate on using ZGC for
> NameNode? Did you encounter any problems?
> I've never heard of using ZGC for NameNode in practice, so I'm curious
> about it.
>
> Regards,
> - Takanobu
>
>
> On Wed, Mar 6, 2024 at 12:35, Yuanbo Liu wrote:
>
> > > a. Snapshot, Symbolic link and reserved feature are not mentioned at
> the
> > design doc, should it be considered
> > Yes, I agree. Since snapshots and symbolic links are not popular in Hadoop,
> > we can try to use a global lock (the write lock of the root inode?). In our
> > production env, we just ignore those features, but in the open-source
> > community, these should be considered carefully.
> >
> > > b. For the benchmark result, what Read/Write request ratio? And do you
> > meet any GC issues when reaching
> > FGL will need more memory as its QPS becomes very high. In practice, if
> > the percentage of used memory is greater than 90%, GC time will become a
> > major problem. We've tried tuning GC options and even a new GC engine like
> > ZGC, but they are not very helpful.
> >
> >
> >
> > On Wed, Mar 6, 2024 at 10:51 AM Hui Fei  wrote:
> >
> > > Thanks for suggestions.
> > >
> > > Actually Started working on this improvement. And cut the development
> > > branch :)
> > > From the proposal doc and the current reviewing work, seems that it
> > > doesn't touch the existing logic codes too much. It keeps the original
> > > logic there.
> > >
> > > @Yuanbo @Zengqiang XU   Could you share any
> > internal
> > > improvement info Xiaoqiao mentioned above?
> > >
> > >> Xiaoqiao He wrote on Mon, Feb 26, 2024 at 19:50:
> > >
> > >> Thanks for this meaningful proposal. Some nit comments:
> > >> a. Snapshot, Symbolic link and reserved feature are not mentioned at
> the
> > >> design doc, should it be considered
> > >> or different to this core design?
> > >> b. For the benchmark result, what Read/Write request ratio? And do you
> > >> meet
> > >> any GC issues when reaching
> > >> `108K QPS`? If true, would you mind sharing STW time cost?
> > >> c. Is this deployed in your internal cluster now? If true,  any
> > >> performance
> > >> benefit differences compare to the
> > >> benchmark?
> > >> d. This is one huge feature IMO, If discussion passes, suggest
> creating
> > a
> > >> single branch to develop and follow-up
> > >> works.
> > >>
> > >> Thanks again for this meaningful proposal.
> > >>
> > >> Best Regards,
> > >> - He Xiaoqiao
> > >>
> > >>
> > >> On Tue, Feb 20, 2024 at 5:38 PM Yuanbo Liu 
> > wrote:
> > >>
> > >> > Nice to see this feature brought up. We've implemented this feature
> > >> > internally and gained significant performance improvement. I'll be
> > glad
> > >> to
> > >> > work on some jiras if necessary.
> > >> >
> > >> >
> > >> > On Tue, Feb 20, 2024 at 4:41 PM ZanderXu 
> wrote:
> > >> >
> > >> > > Thank you everyone for reviewing this ticket.
> > >> > >
> > >> > > I think if there are no problems with the goal and the overall
> > >> solution,
> > >> > we
> > >> > > are ready to push this ticket forward and I will create some
> > detailed
> > >> > > sub-tasks for this ticket.
> > >> > >
> > >> > > I will split this project into three milestones to make this
> project
> > >> > > cleaner for review and merge.
> > >> > > Milestone 1: Replacing the current global lock with two locks,
> > global
> > >> FS
> > >> > > lock and global BM lock. End-user can choose which locking mode to
> > use
> > >> > > through configuration.
> > >> > > Milestone 2: Replacing the global FS write lock with directory
> > >> tree-based
> > >> > > fine-grained lock.
> > >> >

Re: Discussion about NameNode Fine-grained locking

2024-03-05 Thread Yuanbo Liu
> a. Snapshot, Symbolic link and reserved feature are not mentioned at the
design doc, should it be considered
Yes, I agree. Since snapshots and symbolic links are not popular in Hadoop, we
can try to use a global lock (the write lock of the root inode?). In our
production env, we just ignore those features, but in the open-source
community, these should be considered carefully.

> b. For the benchmark result, what Read/Write request ratio? And do you
meet any GC issues when reaching
FGL will need more memory as its QPS becomes very high. In practice, if the
percentage of used memory is greater than 90%, GC time will become a major
problem. We've tried tuning GC options and even a new GC engine like ZGC,
but they are not very helpful.



On Wed, Mar 6, 2024 at 10:51 AM Hui Fei  wrote:

> Thanks for suggestions.
>
> Actually Started working on this improvement. And cut the development
> branch :)
> From the proposal doc and the current reviewing work, seems that it
> doesn't touch the existing logic codes too much. It keeps the original
> logic there.
>
> @Yuanbo @Zengqiang XU   Could you share any internal
> improvement info Xiaoqiao mentioned above?
>
> Xiaoqiao He wrote on Mon, Feb 26, 2024 at 19:50:
>
>> Thanks for this meaningful proposal. Some nit comments:
>> a. Snapshot, Symbolic link and reserved feature are not mentioned at the
>> design doc, should it be considered
>> or different to this core design?
>> b. For the benchmark result, what Read/Write request ratio? And do you
>> meet
>> any GC issues when reaching
>> `108K QPS`? If true, would you mind sharing STW time cost?
>> c. Is this deployed in your internal cluster now? If true,  any
>> performance
>> benefit differences compare to the
>> benchmark?
>> d. This is one huge feature IMO, If discussion passes, suggest creating a
>> single branch to develop and follow-up
>> works.
>>
>> Thanks again for this meaningful proposal.
>>
>> Best Regards,
>> - He Xiaoqiao
>>
>>
>> On Tue, Feb 20, 2024 at 5:38 PM Yuanbo Liu  wrote:
>>
>> > Nice to see this feature brought up. We've implemented this feature
>> > internally and gained significant performance improvement. I'll be glad
>> to
>> > work on some jiras if necessary.
>> >
>> >
>> > On Tue, Feb 20, 2024 at 4:41 PM ZanderXu  wrote:
>> >
>> > > Thank you everyone for reviewing this ticket.
>> > >
>> > > I think if there are no problems with the goal and the overall
>> solution,
>> > we
>> > > are ready to push this ticket forward and I will create some detailed
>> > > sub-tasks for this ticket.
>> > >
>> > > I will split this project into three milestones to make this project
>> > > cleaner for review and merge.
>> > > Milestone 1: Replacing the current global lock with two locks, global
>> FS
>> > > lock and global BM lock. End-user can choose which locking mode to use
>> > > through configuration.
>> > > Milestone 2: Replacing the global FS write lock with directory
>> tree-based
>> > > fine-grained lock.
>> > > Milestone 3: Replacing the global BM lock with directory tree-based
>> > > fine-grained lock.
>> > >
>> > > Each milestone can be merged into the trunk branch in time, which has
>> > > multiple benefits:
>> > > 1. Phased performance improvements can be quickly used by everyone
>> > > 2. All developers can better understand the implementation ideas of
>> the
>> > > fine-grained locking mechanism as soon as possible
>> > > 3. Each milestone is developed based on the latest trunk branch to
>> reduce
>> > > conflicts
>> > >
>> > > If you have any concerns, please feel free to discuss them together.
>> > > I hope you can join us to push this project forward together, thanks.
>> > >
>> > >
>> > > On Mon, 5 Feb 2024 at 11:33, haiyang hu 
>> wrote:
>> > >
>> > > > Thank you for raising the issue of this long-standing bottleneck,
>> this
>> > > > will be a very important improvement!
>> > > >
>> > > > Hopefully can participate and push forward together.
>> > > >
>> > > > Best Regards~
>> > > >
> >> > > > Brahma Reddy Battula wrote on Sat, Feb 3, 2024 at 00:40:
>> > > >
>> > > >> Thanks for bringing this and considering all the history around
>> this.
>> > > >> One of the outstanding bottleneck(global lock) f

Re: Discussion about NameNode Fine-grained locking

2024-02-20 Thread Yuanbo Liu
Nice to see this feature brought up. We've implemented this feature
internally and gained significant performance improvement. I'll be glad to
work on some jiras if necessary.


On Tue, Feb 20, 2024 at 4:41 PM ZanderXu  wrote:

> Thank you everyone for reviewing this ticket.
>
> I think if there are no problems with the goal and the overall solution, we
> are ready to push this ticket forward and I will create some detailed
> sub-tasks for this ticket.
>
> I will split this project into three milestones to make this project
> cleaner for review and merge.
> Milestone 1: Replacing the current global lock with two locks, global FS
> lock and global BM lock. End-user can choose which locking mode to use
> through configuration.
> Milestone 2: Replacing the global FS write lock with directory tree-based
> fine-grained lock.
> Milestone 3: Replacing the global BM lock with directory tree-based
> fine-grained lock.
>
> Each milestone can be merged into the trunk branch in time, which has
> multiple benefits:
> 1. Phased performance improvements can be quickly used by everyone
> 2. All developers can better understand the implementation ideas of the
> fine-grained locking mechanism as soon as possible
> 3. Each milestone is developed based on the latest trunk branch to reduce
> conflicts
>
> If you have any concerns, please feel free to discuss them together.
> I hope you can join us to push this project forward together, thanks.
>
>
> On Mon, 5 Feb 2024 at 11:33, haiyang hu  wrote:
>
> > Thank you for raising the issue of this long-standing bottleneck, this
> > will be a very important improvement!
> >
> > Hopefully can participate and push forward together.
> >
> > Best Regards~
> >
> > Brahma Reddy Battula wrote on Sat, Feb 3, 2024 at 00:40:
> >
> >> Thanks for bringing this and considering all the history around this.
> >> One of the outstanding bottleneck(global lock) from a long time.
> >>
> >> Hopefully we can push forward this time.
> >>
> >>
> >> On Fri, Feb 2, 2024 at 12:23 PM Hui Fei  wrote:
> >>
> >> > Thanks for driving this. It's very meaningful. The performance
> >> improvement
> >> > looks very good.
> >> >
> >> > Many users are facing the write performance issue. As far as I know,
> >> some
> >> > companies already implemented the similar idea on their internal
> >> branches.
> >> > But the internal branch is very different from the community one. So
> >> it's
> >> > very hard to be in sync with the community. If this improvement can be
> >> > involved in the community, that would be great to both end-user and
> the
> >> > community.
> >> >
> >> > It is very worth doing.
> >> >
> >> > Zengqiang XU wrote on Fri, Feb 2, 2024 at 11:07:
> >> >
> >> > > Hi everyone
> >> > >
> >> > > I have started a discussion about NameNode Fine-grained Locking to
> >> > improve
> >> > > performance of write operations in NameNode.
> >> > >
> >> > > I started this discussion again for serval main reasons:
> >> > > 1. We have implemented it and gained nearly 7x performance
> >> improvement in
> >> > > our prod environment
> >> > > 2. Many other companies made similar improvements based on their
> >> internal
> >> > > branch.
> >> > > 3. This topic has been discussed for a long time, but still without
> >> any
> >> > > results.
> >> > >
> >> > > I hope we can push this important improvement in the community so
> that
> >> > all
> >> > > end-users can enjoy this significant improvement.
> >> > >
> >> > > I'd really appreciate you can join in and work with me to push this
> >> > feature
> >> > > forward.
> >> > >
> >> > > Thanks very much.
> >> > >
> >> > > Ticket: HDFS-17366 <
> https://issues.apache.org/jira/browse/HDFS-17366>
> >> > > Design: NameNode Fine-grained locking based on directory tree
> >> > > <
> >> > >
> >> >
> >>
> https://docs.google.com/document/d/1bVBQcI4jfzS0UrczB7UhsrQTXmrERGvBV-a9W3HCCjk/edit?usp=sharing
> >> > > >
> >> > >
> >> >
> >>
> >
>
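
Purely as an illustration of Milestone 1 quoted above (splitting the single global lock into a global FS lock and a global BM lock, selectable by configuration), here is a hypothetical sketch; it is not the actual NameNode code and all names are made up.

{code:java}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration of Milestone 1: two locks instead of one global lock.
public class NamespaceLockManagerSketch {
  public enum Mode { GLOBAL, SPLIT }  // the proposal makes the locking mode configurable

  private final Mode mode;
  private final ReentrantReadWriteLock globalLock = new ReentrantReadWriteLock(true);
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);
  private final ReentrantReadWriteLock bmLock = new ReentrantReadWriteLock(true);

  public NamespaceLockManagerSketch(Mode mode) { this.mode = mode; }

  // Lock guarding directory-tree (FS) operations.
  public ReentrantReadWriteLock fsLock() {
    return mode == Mode.GLOBAL ? globalLock : fsLock;
  }

  // Lock guarding block-manager (BM) operations.
  public ReentrantReadWriteLock bmLock() {
    return mode == Mode.GLOBAL ? globalLock : bmLock;
  }
}
{code}

With Mode.SPLIT, FS writes no longer block BM reads, which is the first step before the directory-tree-based fine-grained locks of Milestones 2 and 3.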


[jira] [Created] (HDFS-17147) RBF: RouterRpcServer getListing becomes extremely slow when the children of the dir are mounted in the same ns.

2023-08-08 Thread Yuanbo Liu (Jira)
Yuanbo Liu created HDFS-17147:
-

 Summary: RBF: RouterRpcServer getListing becomes extremely slow 
when the children of the dir are mounted in the same ns.
 Key: HDFS-17147
 URL: https://issues.apache.org/jira/browse/HDFS-17147
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yuanbo Liu


Suppose the mount table is as below:

/dir -> ns0 ->  /target/dir

/dir/child1 -> ns0 -> /target/dir/child1

/dir/child2 -> ns0 -> /target/dir/child2

..

/dir/child200 -> ns0 -> /target/dir/child200

 

When listing /dir with RBF, it gets extremely slow because getListing has two 
parts:
1. list all children of /target/dir

2. append the remaining 200 mount points to the result.

The second part invokes getFileInfo concurrently to make sure the mount points 
are accessed with the right permissions. But in this case, the first part 
already includes the result of the second part, so there is no need to append 
the second part again.
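
A minimal sketch of the proposed optimization (hypothetical names, not the actual router code): skip the per-mount-point getFileInfo calls whenever the listing from the target namespace already contains an entry with the same name.

{code:java}
import java.util.*;

// Hypothetical sketch: only resolve mount points not already covered by the ns listing.
public class ListingMergeSketch {
  public static List<String> mergeListing(List<String> nsChildren, List<String> mountPoints) {
    Set<String> seen = new HashSet<>(nsChildren);      // names already returned by the target ns
    List<String> result = new ArrayList<>(nsChildren);
    for (String mountPoint : mountPoints) {
      if (!seen.contains(mountPoint)) {
        // Only these remaining entries would need the expensive concurrent
        // getFileInfo permission check before being appended.
        result.add(mountPoint);
      }
    }
    return result;
  }
}
{code}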

 

 

 

 

 

 






[jira] [Created] (HDFS-16657) Changing pool-level lock to volume-level lock for invalidation of blocks

2022-07-12 Thread Yuanbo Liu (Jira)
Yuanbo Liu created HDFS-16657:
-

 Summary: Changing pool-level lock to volume-level lock for 
invalidation of blocks
 Key: HDFS-16657
 URL: https://issues.apache.org/jira/browse/HDFS-16657
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Yuanbo Liu
 Attachments: image-2022-07-13-10-25-37-383.png, 
image-2022-07-13-10-27-01-386.png, image-2022-07-13-10-27-44-258.png

Recently we saw that the heartbeating of the DN became slow in a very busy 
cluster; here is the chart:

!image-2022-07-13-10-25-37-383.png!

 

After getting a jstack of the DN, we found that the DN heartbeat was stuck in 
the invalidation of blocks:

!image-2022-07-13-10-27-01-386.png!

!image-2022-07-13-10-27-44-258.png!

The key code is:
{code:java}
try {
  File blockFile = new File(info.getBlockURI());
  if (blockFile != null && blockFile.getParentFile() == null) {
errors.add("Failed to delete replica " + invalidBlks[i]
+  ". Parent not found for block file: " + blockFile);
continue;
  }
} catch(IllegalArgumentException e) {
  LOG.warn("Parent directory check failed; replica " + info
  + " is not backed by a local file");
} {code}
The DN is trying to locate the parent path of the block file, so there is disk 
I/O inside the pool-level lock. When the disk becomes very busy with high 
io-wait, all the pending threads will be blocked by the pool-level lock, and 
the heartbeat time becomes high. We propose to change the pool-level lock to a 
volume-level lock for block invalidation.

cc: [~hexiaoqiao] [~Aiphag0] 
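
For illustration, a tiny sketch of the volume-level locking idea; the names are hypothetical and this is not the actual FsDatasetImpl code.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: one lock per volume, so slow disk I/O on one volume
// does not block invalidation work (or the heartbeat) touching other volumes.
public class VolumeLockSketch {
  private final Map<String, ReentrantReadWriteLock> volumeLocks = new ConcurrentHashMap<>();

  private ReentrantReadWriteLock lockFor(String volume) {
    return volumeLocks.computeIfAbsent(volume, v -> new ReentrantReadWriteLock(true));
  }

  public void invalidateBlock(String volume, Runnable deleteAction) {
    ReentrantReadWriteLock lock = lockFor(volume);
    lock.writeLock().lock();          // volume-level, not pool-level
    try {
      deleteAction.run();             // any disk I/O here only blocks this volume
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}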






[jira] [Created] (HDFS-16569) Consider attaching block location info from client when closing a completed file

2022-05-05 Thread Yuanbo Liu (Jira)
Yuanbo Liu created HDFS-16569:
-

 Summary: Consider attaching block location info from client when 
closing a completed file
 Key: HDFS-16569
 URL: https://issues.apache.org/jira/browse/HDFS-16569
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yuanbo Liu


When a file is finished, the client will not close it until the DNs send 
RECEIVED_BLOCK via IBR or the client times out. We often see this kind of log 
in the NameNode:
{code:java}
is COMMITTED but not COMPLETE(numNodes= 0 <  minimum = 1) in file{code}
Since the client already has the last block's locations, it is not necessary 
to rely on the IBR from the DN when closing the file.






[jira] [Created] (HDFS-16559) Adding new configuration for IBR thread

2022-04-24 Thread Yuanbo Liu (Jira)
Yuanbo Liu created HDFS-16559:
-

 Summary: Adding new configuration for IBR thread
 Key: HDFS-16559
 URL: https://issues.apache.org/jira/browse/HDFS-16559
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yuanbo Liu


The IBR operation has been separated from the heartbeat thread since 
HDFS-16016, which is great.

But the IBR thread uses the heartbeat's expireTime to decide whether to wait 
or not. On a highly loaded DataNode, the IBR thread becomes an infinite loop 
that never sleeps and consumes 100% CPU because of the latency of heartbeat 
reporting.
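
As a sketch of the idea only (the configuration key and class are hypothetical, not an existing Hadoop option): give the IBR loop its own minimum wait instead of reusing the heartbeat deadline, so it cannot spin.

{code:java}
// Hypothetical sketch: a dedicated, configurable minimum wait for the IBR loop.
public class IbrLoopSketch implements Runnable {
  private final long ibrIntervalMs;   // would come from a new, hypothetical config key
  private volatile boolean running = true;

  public IbrLoopSketch(long ibrIntervalMs) { this.ibrIntervalMs = ibrIntervalMs; }

  @Override
  public void run() {
    while (running) {
      sendIncrementalBlockReportIfNeeded();
      try {
        Thread.sleep(Math.max(1L, ibrIntervalMs));  // always sleep at least a little
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
  }

  public void stop() { running = false; }

  private void sendIncrementalBlockReportIfNeeded() {
    // Placeholder for the real IBR send logic.
  }
}
{code}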






[jira] [Created] (HDFS-16558) Consider changing the lock of delegation token from write lock to read lock

2022-04-24 Thread Yuanbo Liu (Jira)
Yuanbo Liu created HDFS-16558:
-

 Summary: Consider changing the lock of delegation token from write 
lock to read lock
 Key: HDFS-16558
 URL: https://issues.apache.org/jira/browse/HDFS-16558
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yuanbo Liu
 Attachments: image-2022-04-24-14-13-04-695.png, 
image-2022-04-24-14-13-52-867.png

In a very busy secured cluster, renewing/canceling/getting a delegation token 
gets slow, and it slows down the handling of RPCs from clients. Since 
AbstractDelegationTokenSecretManager is a thread-safe manager, we propose to 
change the FS lock from the write lock to the read lock (still protecting 
edit-log rolling).




!image-2022-04-24-14-13-04-695.png|width=334,height=212!

!image-2022-04-24-14-13-52-867.png|width=324,height=173!
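
A conceptual sketch only (generic locks, not the actual FSNamesystem API): because the token bookkeeping is internally synchronized, the surrounding namesystem lock could be taken in read mode, letting many token RPCs proceed in parallel while edit-log rolling still takes the write lock.

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the proposed locking change for delegation-token operations.
public class TokenLockSketch {
  private final ReentrantReadWriteLock nsLock = new ReentrantReadWriteLock(true);
  private final Object tokenManager = new Object();  // stands in for the secret manager

  public byte[] getDelegationToken(String renewer) {
    nsLock.readLock().lock();          // read lock: token RPCs no longer serialize on the write lock
    try {
      synchronized (tokenManager) {    // the secret manager's own synchronization keeps it safe
        return ("token-for-" + renewer).getBytes(StandardCharsets.UTF_8);
      }
    } finally {
      nsLock.readLock().unlock();
    }
  }

  public void rollEditLog(Runnable roll) {
    nsLock.writeLock().lock();         // rolling still needs exclusive access
    try {
      roll.run();
    } finally {
      nsLock.writeLock().unlock();
    }
  }
}
{code}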






[jira] [Created] (HDFS-16549) Consider using volume level lock for deleting blocks

2022-04-20 Thread Yuanbo Liu (Jira)
Yuanbo Liu created HDFS-16549:
-

 Summary: Consider using volume level lock for deleting blocks
 Key: HDFS-16549
 URL: https://issues.apache.org/jira/browse/HDFS-16549
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yuanbo Liu


It's great to see that the implementation of the fine-grained lock for the DN 
has been committed to trunk.

FsDatasetImpl.invalidate is a frequently used method for responding to the 
delete command from the NN. How about using a volume-level write lock instead 
of the pool-level write lock to reduce the cost of the write lock?

cc: [~hexiaoqiao]  [~Aiphag0] . 
Thanks for your great work!
[Mingxiang Li|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=Aiphag0]






[jira] [Created] (HDFS-16492) Add metrics and operation names for dataset lock

2022-03-03 Thread Yuanbo Liu (Jira)
Yuanbo Liu created HDFS-16492:
-

 Summary: Add metrics and operation names for dataset lock
 Key: HDFS-16492
 URL: https://issues.apache.org/jira/browse/HDFS-16492
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Yuanbo Liu









[jira] [Created] (HDFS-16479) EC: When reconstructing an EC block, liveBusyBlockIndicies is not included, so reconstruction will fail

2022-02-22 Thread Yuanbo Liu (Jira)
Yuanbo Liu created HDFS-16479:
-

 Summary: EC: When reconstructing an EC block, 
liveBusyBlockIndicies is not included, so reconstruction will fail
 Key: HDFS-16479
 URL: https://issues.apache.org/jira/browse/HDFS-16479
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ec, erasure-coding
Reporter: Yuanbo Liu


We got this exception from DataNodes

{noformat}
java.lang.IllegalArgumentException: No enough live striped blocks.
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:141)
        at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.<init>(StripedReader.java:128)
        at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReconstructor.<init>(StripedReconstructor.java:135)
        at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.<init>(StripedBlockReconstructor.java:41)
        at org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker.processErasureCodingTasks(ErasureCodingWorker.java:133)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:796)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:680)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processCommand(BPServiceActor.java:1314)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.lambda$enqueue$2(BPServiceActor.java:1360)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processQueue(BPServiceActor.java:1287)
{noformat}
After going through the code of ErasureCodingWork.java, we found

{code:java}
else {
  targets[0].getDatanodeDescriptor().addBlockToBeErasureCoded(
      new ExtendedBlock(blockPoolId, stripedBlk), getSrcNodes(), targets,
      getLiveBlockIndicies(), stripedBlk.getErasureCodingPolicy());
}
{code}

The liveBusyBlockIndicies are not counted among the live block indices, hence 
erasure-coding reconstruction sometimes fails with 'No enough live striped 
blocks'.






[jira] [Resolved] (HDFS-16433) The synchronized lock of IncrementalBlockReportManager will slow down the speed of creating/deleting/finalizing/opening block

2022-01-28 Thread Yuanbo Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuanbo Liu resolved HDFS-16433.
---
Resolution: Not A Problem

> The synchronized lock of IncrementalBlockReportManager will slow down the 
> speed of creating/deleting/finalizing/opening block
> -
>
> Key: HDFS-16433
> URL: https://issues.apache.org/jira/browse/HDFS-16433
> Project: Hadoop HDFS
>  Issue Type: Improvement
>        Reporter: Yuanbo Liu
>Priority: Critical
> Attachments: image-2022-01-21-22-52-52-912.png
>
>
> The code in IncrementalBlockReportManager.java
> {code:java}
> synchronized void waitTillNextIBR(long waitTime) {
>   if (waitTime > 0 && !sendImmediately()) {
> try {
>   wait(ibrInterval > 0 && ibrInterval < waitTime? ibrInterval: waitTime);
> } catch (InterruptedException ie) {
>   LOG.warn(getClass().getSimpleName() + " interrupted");
> }
>   }
> } {code}
> We can see that wait(waitTime) will hold synchronized if the IBR interval is 
> enabled or the heartbeat time has not been reached. The lock will block the 
> notifyNamenodeBlock function, which is widely used in 
> deleting/creating/finalizing/opening blocks, so the speed of DataNode IO 
> will slow down. Here is the graph of the blocking relationship:
> !image-2022-01-21-22-52-52-912.png|width=976,height=299!
>  
>  






[jira] [Created] (HDFS-16433) The synchronized lock of IncrementalBlockReportManager will slow down the speed of creating/deleting/finalizing block

2022-01-21 Thread Yuanbo Liu (Jira)
Yuanbo Liu created HDFS-16433:
-

 Summary: The synchronized lock of IncrementalBlockReportManager 
will slow down the speed of creating/deleting/finalizing block
 Key: HDFS-16433
 URL: https://issues.apache.org/jira/browse/HDFS-16433
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yuanbo Liu
 Attachments: image-2022-01-21-22-52-52-912.png

The code in IncrementalBlockReportManager.java
{code:java}

synchronized void waitTillNextIBR(long waitTime) {
  if (waitTime > 0 && !sendImmediately()) {
try {
  wait(ibrInterval > 0 && ibrInterval < waitTime? ibrInterval: waitTime);
} catch (InterruptedException ie) {
  LOG.warn(getClass().getSimpleName() + " interrupted");
}
  }
} {code}
We can see that wait(waitTime) will hold synchronized if the IBR interval is 
enabled or the heartbeat time has not been reached. Then the lock will block the 
notifyNamenodeBlock function, which is widely used in 
deleting/creating/finalizing/opening blocks, so the speed of DataNode IO will 
slow down. Here is the graph of the blocking relationship:
!image-2022-01-21-22-52-52-912.png!

 

 






[jira] [Created] (HDFS-16201) Select datanode based on storage type

2021-08-31 Thread Yuanbo Liu (Jira)
Yuanbo Liu created HDFS-16201:
-

 Summary: Select datanode based on storage type
 Key: HDFS-16201
 URL: https://issues.apache.org/jira/browse/HDFS-16201
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu


Since storage policies were introduced into HDFS, it would be useful if the 
HDFS client could choose a replica's datanode based on storage-type priority 
when reading. The priority should be RAM_DISK > SSD > DISK > ARCHIVE.

Here is the process graph

!https://iwiki.woa.com/download/attachments/979566104/image2021-8-31_20-23-24.png?version=1=1630412605000=v2!
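
A minimal sketch of the selection idea (the enum here is a local stand-in for org.apache.hadoop.fs.StorageType and the classes are hypothetical, not the real DFSClient code): order replica locations by storage-type priority before picking a read source.

{code:java}
import java.util.*;

// Hypothetical sketch: sort replicas so RAM_DISK is preferred, then SSD, DISK, ARCHIVE.
public class StorageTypePrioritySketch {
  // Local stand-in for org.apache.hadoop.fs.StorageType; ordinal encodes the priority.
  enum StorageType { RAM_DISK, SSD, DISK, ARCHIVE }

  static class Replica {
    final String datanode;
    final StorageType type;
    Replica(String datanode, StorageType type) { this.datanode = datanode; this.type = type; }
    @Override public String toString() { return datanode + "(" + type + ")"; }
  }

  static List<Replica> sortByStoragePriority(List<Replica> replicas) {
    List<Replica> sorted = new ArrayList<>(replicas);
    sorted.sort(Comparator.comparingInt(r -> r.type.ordinal()));
    return sorted;
  }

  public static void main(String[] args) {
    List<Replica> replicas = Arrays.asList(
        new Replica("dn1", StorageType.ARCHIVE),
        new Replica("dn2", StorageType.SSD),
        new Replica("dn3", StorageType.DISK));
    System.out.println(sortByStoragePriority(replicas));  // [dn2(SSD), dn3(DISK), dn1(ARCHIVE)]
  }
}
{code}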






[jira] [Created] (HDFS-14421) HDFS block two replicas exist in one DataNode

2019-04-10 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-14421:
-

 Summary: HDFS block two replicas exist in one DataNode
 Key: HDFS-14421
 URL: https://issues.apache.org/jira/browse/HDFS-14421
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yuanbo Liu


We're using Hadoop-2.7.0.

There is a file whose replication factor is 2. Those two replicas exist in one 
DataNode. The fsck info is here:

{color:#707070}BP-499819267-xx.xxx.131.201-1452072365222:blk_1400651575_326942161
 len=484045 repl=2 
[DatanodeInfoWithStorage[xx.xxx.80.205:50010,DS-d321be27-cbd4-4edd-81ad-29b3d021ee82,DISK],
 
DatanodeInfoWithStorage[xx.xx.80.205:50010,DS-d321be27-cbd4-4edd-81ad-29b3d021ee82,DISK]].{color}

and this is the exception from xx.xx.80.205

{color:#707070}org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: 
Replica not found for 
BP-499819267-xx.xxx.131.201-1452072365222:blk_1400651575_326942161{color}

It's confusing why the NameNode doesn't update the block map after the 
exception. What's the reason that two replicas exist in one DataNode?

Hope to get anyone's comments. Thanks in advance.

 

 

 






[jira] [Resolved] (HDFS-9477) namenode starts failed:FSEditLogLoader: Encountered exception on operation TimesOp

2017-11-28 Thread Yuanbo Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuanbo Liu resolved HDFS-9477.
--
Resolution: Fixed

I believe that HDFS-8269 has solved this issue, please check it.

> namenode starts failed:FSEditLogLoader: Encountered exception on operation 
> TimesOp
> --
>
> Key: HDFS-9477
> URL: https://issues.apache.org/jira/browse/HDFS-9477
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
> Environment: Ubuntu 12.04.1 LTS, java version "1.7.0_79"
>Reporter: aplee
>Assignee: aplee
>
> backup namenode start failed, log below:
> 2015-11-28 14:09:13,462 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
> on operation TimesOp [length=0, path=/.reserved/.inodes/2346114, mtime=-1, 
> atime=1448692924700, opCode=OP_TIMES, txid=14774180]
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetTimes(FSDirAttrOp.java:473)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetTimes(FSDirAttrOp.java:299)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:629)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:234)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:832)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:813)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:232)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:331)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:284)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:301)
>   at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:297)
> 2015-11-28 14:09:13,572 FATAL 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unknown error 
> encountered while tailing edits. Shutting down standby NN.
> java.io.IOException: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:832)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:813)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:232)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:331)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:284)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:301)
>   at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:297)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetTimes(FSDirAttrOp.java:473)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetTimes(FSDirAttrOp.java:299)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:629)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:234)
>   ... 9 more
> 2015-11-28 14:09:13,574 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1
> 2015-11-28 14:09:13,575 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> SHUTDOWN_MSG: 
> I found the record in the edits file, but I don't know how this record was generated:
> <RECORD>
>   <OPCODE>OP_TIMES</OPCODE>
>   <DATA>
>     <TXID>14774180</TXID>
>     <LENGTH>0</LENGTH>
>     <PATH>/.reserved/.inodes/2346114</PATH>
>     <MTIME>-1</MTIME>
>     <ATIME>1448692924700</ATIME>
>   </DATA>
> </RECORD>






[jira] [Created] (HDFS-12328) Ozone: Purge metadata of deleted blocks after max retry times

2017-08-20 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-12328:
-

 Summary: Ozone: Purge metadata of deleted blocks after max retry 
times
 Key: HDFS-12328
 URL: https://issues.apache.org/jira/browse/HDFS-12328
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu


In HDFS-12283, we set the value of count to -1 if blocks cannot be deleted 
after max retry times. We need to provide APIs for admins to purge the "-1" 
metadata manually.






[jira] [Created] (HDFS-11568) Ozone: Create metadata path automatically after null checking

2017-03-23 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-11568:
-

 Summary: Ozone: Create metadata path automatically after null 
checking
 Key: HDFS-11568
 URL: https://issues.apache.org/jira/browse/HDFS-11568
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu


The metadata path of the Ozone container should be checked properly and 
created accordingly when it is initialized. Some code snippets that we need to 
change are here:
{{ContainerMapping.java}}
{code}

// TODO: Fix this checking.
String scmMetaDataDir = conf.get(OzoneConfigKeys
.OZONE_CONTAINER_METADATA_DIRS);
if ((scmMetaDataDir == null) || (scmMetaDataDir.isEmpty())) {
  throw
  new IllegalArgumentException("SCM metadata directory is not valid.");
}
{code}

{{OzoneContainer.java}}
{code}
List<StorageLocation> locations = new LinkedList<>();
String[] paths = ozoneConfig.getStrings(
OzoneConfigKeys.OZONE_CONTAINER_METADATA_DIRS);
if (paths != null && paths.length > 0) {
  for (String p : paths) {
locations.add(StorageLocation.parse(p));
  }
} else {
  getDataDir(locations);
}
{code}






[jira] [Created] (HDFS-11561) Mistake in httpfs doc file

2017-03-21 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-11561:
-

 Summary: Mistake in httpfs doc file
 Key: HDFS-11561
 URL: https://issues.apache.org/jira/browse/HDFS-11561
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu
Priority: Trivial


In the httpfs index.md, there is a sentence
{quote}
`curl http://httpfs-host:14000/webhdfs/v1/user/foo?op=list` returns the 
contents of ...
{quote}
which is wrong, the right operation is LISTSTATUS, not LIST.






Discussion about ssh fencing in hadoop-common

2017-03-09 Thread Yuanbo Liu
Hi, developers
Sorry to interrupt and hope you all have a good day.
Recently we had a little discussion about ssh fencing in HDFS-11336, and
thanks to Steve and Andrew's comments, I got the point that ssh fencing in
Hadoop is a relic from the NFS HA days and is not reliable. So I'm wondering
whether it would be good to remove this code from hadoop-common. I'd like to
get more thoughts from you before I raise a JIRA. Any comment will be
appreciated. Thanks in advance.


Bring a general discussion about impersonation in secure environment

2017-01-16 Thread Yuanbo Liu
Hi, developers
Sorry to interrupt and wish you have a good day.

As we know, impersonation is quite a common operation in the gateway access
model. I've found that the SPNEGO filter doesn't support impersonation, aka
the "doAs" operation, and it limits access to pages in both HDFS and YARN
when users set the auth filter to the SPNEGO filter. Here is the JIRA
(HADOOP-13119) to track all the discussion.

I'd like to bring it to the community and get your thoughts since this JIRA
has been suspended for a while. If you have any suggestion about this JIRA,
please let me know, thanks in advance.


[jira] [Created] (HDFS-11336) [SPS]: Clean up SPS xAttr in a proper situation

2017-01-11 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-11336:
-

 Summary: [SPS]: Clean up SPS xAttr in a proper situation
 Key: HDFS-11336
 URL: https://issues.apache.org/jira/browse/HDFS-11336
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu


1. When we finish the movement successfully, we should clean Xattrs.
2. When we disable SPS dynamically, we should clean Xattrs






[jira] [Created] (HDFS-11293) FsDatasetImpl throws ReplicaAlreadyExistsException in a wrong situation

2017-01-04 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-11293:
-

 Summary: FsDatasetImpl throws ReplicaAlreadyExistsException in a 
wrong situation
 Key: HDFS-11293
 URL: https://issues.apache.org/jira/browse/HDFS-11293
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu
Priority: Critical


In {{FsDatasetImpl#createTemporary}}, we use {{volumeMap}} to get replica info 
by block pool id. But in this situation:
{code}
datanode A => {DISK, SSD}, datanode B => {DISK, ARCHIVE}.
1. the same block replica exists in A[DISK] and B[DISK].
2. the block pool id of datanode A and datanode B are the same.
{code}
Then we start to change the file's storage policy and move the block replica in 
the cluster. Very likely we have to move the block from B[DISK] to A[SSD]; at 
this point, datanode A throws ReplicaAlreadyExistsException, which is not 
correct behavior.






[jira] [Created] (HDFS-11284) [SPS]: Improve the stability of Storage Policy Satisfier.

2016-12-30 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-11284:
-

 Summary: [SPS]: Improve the stability of Storage Policy Satisfier.
 Key: HDFS-11284
 URL: https://issues.apache.org/jira/browse/HDFS-11284
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu


Recently I've found that under some conditions, SPS is not stable:
* SPS runs under safe mode.
* There are some overlapping nodes among the chosen target nodes.
* The real replication count of a block doesn't match the replication factor. 
For example, the real replication is 2 while the replication factor is 3.






[jira] [Resolved] (HDFS-11236) Erasure Coding can't support appendToFile

2016-12-13 Thread Yuanbo Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuanbo Liu resolved HDFS-11236.
---
Resolution: Duplicate

> Erasure Coding can't support appendToFile
> --
>
> Key: HDFS-11236
> URL: https://issues.apache.org/jira/browse/HDFS-11236
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: gehaijiang
>
> hadoop 3.0.0-alpha1
> $  hdfs erasurecode -getPolicy /ectest/workers
> ErasureCodingPolicy=[Name=RS-DEFAULT-6-3-64k, 
> Schema=[ECSchema=[Codec=rs-default, numDataUnits=6, numParityUnits=3]], 
> CellSize=65536 ]
> $  hadoop fs  -appendToFile  hadoop/etc/hadoop/httpfs-env.sh  /ectest/workers
> appendToFile: Cannot append to files with striped block /ectest/workers






[jira] [Created] (HDFS-11195) When appending files by webhdfs rest api fails, it returns 200

2016-11-30 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-11195:
-

 Summary: When appending files by webhdfs rest api fails, it 
returns 200
 Key: HDFS-11195
 URL: https://issues.apache.org/jira/browse/HDFS-11195
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu


Suppose that there is a Hadoop cluster that contains only one datanode, and 
dfs.replication=3. Run:
{code}
curl -i -X POST -T  
"http://:/webhdfs/v1/?op=APPEND"
{code}
It returns 200, even though the append operation fails.






[jira] [Created] (HDFS-11175) TransparentEncryption.md should be up-to-date since uppercase key names are unsupported.

2016-11-24 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-11175:
-

 Summary: TransparentEncryption.md should be up-to-date since 
uppercase key names are unsupported.
 Key: HDFS-11175
 URL: https://issues.apache.org/jira/browse/HDFS-11175
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Reporter: Yuanbo Liu
Priority: Trivial


After HADOOP-11311, key names have been restricted and uppercase key names are 
not allowed. This section of {{TransparentEncryption.md}} should be modified.
{quote}
# As the normal user, create a new encryption key
hadoop key create myKey

# As the super user, create a new empty directory and make it an encryption zone
hadoop fs -mkdir /zone
hdfs crypto -createZone -keyName myKey -path /zone

# chown it to the normal user
hadoop fs -chown myuser:myuser /zone

# As the normal user, put a file in, read it out
hadoop fs -put helloWorld /zone
hadoop fs -cat /zone/helloWorld
{quote}
"myKey" is not allowed here.






[jira] [Created] (HDFS-11150) [SPS]: Provide persistence when satisfying storage policy.

2016-11-16 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-11150:
-

 Summary: [SPS]: Provide persistence when satisfying storage policy.
 Key: HDFS-11150
 URL: https://issues.apache.org/jira/browse/HDFS-11150
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu


Provide persistence for SPS in case the Hadoop cluster crashes by accident. 
Basically we need to change the EditLog and FsImage here.






Re: How to setup local environment to run kerberos test cases.

2016-09-29 Thread Yuanbo Liu
Hi, Rakesh
Thanks for your response. Those docs are helpful but not what I'm asking. I
was running test cases on my local machine, and some test cases threw
exceptions.
For example:
mvn clean package -Dtest=TestSecureNameNode
it threw:

Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 50.024 sec
<<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestSecureNameNode
testName(org.apache.hadoop.hdfs.server.namenode.TestSecureNameNode)  Time
elapsed: 48.552 sec  <<< ERROR!
java.io.IOException: Failed on local exception: java.io.IOException:
Couldn't setup connection for hdfs/localh...@example.com to
localhost.localdomain/127.0.0.1:43815; Host Details : local host is:
"localhost.localdomain/127.0.0.1"; destination host is:
"localhost.localdomain":43815;
at sun.security.krb5.KdcComm.send(KdcComm.java:242)
at sun.security.krb5.KdcComm.send(KdcComm.java:200)
at sun.security.krb5.KrbTgsReq.send(KrbTgsReq.java:254)
at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:269)
at
sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:302)
at
sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:120)
at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:458)
at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:693)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)

This test case is related to Kerberos. I guess I need to set up something
before I run it, but I don't know how. Any thoughts?


How to setup local environment to run kerberos test cases.

2016-09-29 Thread Yuanbo Liu
Hi, developers
I'd like to run Kerberos test cases on my local machine, such as
"TestSecureNameNode", but I can't make it work. Can anybody tell me how to set
up the local environment so that those test cases can run successfully? Any
help will be appreciated; thanks in advance.


[jira] [Created] (HDFS-10883) `getTrashRoot`'s behavior is not consistent in DFS after enabling EZ.

2016-09-20 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-10883:
-

 Summary: `getTrashRoot`'s behavior is not consistent in DFS after 
enabling EZ.
 Key: HDFS-10883
 URL: https://issues.apache.org/jira/browse/HDFS-10883
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu


Let's say root path ("/") is the encryption zone, and there is a file called 
"/test" in root path.
{code}
dfs.getTrashRoot(new Path("/"))
{code}
returns "/user/$USER/.Trash",
while
{code}
dfs.getTrashRoot(new Path("/test"))
{code} 
returns "/.Trash/$USER".
The second behavior is not correct. Since the root path is the encryption 
zone, which means all files/directories in DFS are encrypted, it's more 
reasonable to return "/user/$USER/.Trash" no matter what the path is.






[jira] [Created] (HDFS-10840) Mover should be in the path of "org.apache.hadoop.hdfs.tools"

2016-09-05 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-10840:
-

 Summary: Mover should be in the path of 
"org.apache.hadoop.hdfs.tools"
 Key: HDFS-10840
 URL: https://issues.apache.org/jira/browse/HDFS-10840
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu


The Mover tool's class currently lives in "org.apache.hadoop.hdfs.server", which
is a little confusing. I propose moving the class into
"org.apache.hadoop.hdfs.tools".






[jira] [Resolved] (HDFS-10736) Format disk balance command's output info

2016-08-15 Thread Yuanbo Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuanbo Liu resolved HDFS-10736.
---
Resolution: Duplicate

> Format disk balance command's output info
> -
>
> Key: HDFS-10736
> URL: https://issues.apache.org/jira/browse/HDFS-10736
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: diskbalancer, hdfs
>        Reporter: Yuanbo Liu
>    Assignee: Yuanbo Liu
>
> When users run the disk balancer command as below
> {quote}
> hdfs diskbalancer
> {quote}
> it doesn't print detailed information about its options.
> Also, when users run the disk balancer command in a wrong way, the output is
> not consistent with other commands.






[jira] [Created] (HDFS-10737) disk balance reporter print null for the volume's path

2016-08-09 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-10737:
-

 Summary: disk balance reporter print null for the volume's path
 Key: HDFS-10737
 URL: https://issues.apache.org/jira/browse/HDFS-10737
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: diskbalancer, hdfs
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu


Reproduction steps:
1. hdfs diskbalancer -plan xxx.xx (hostname of the datanode)
2. If the plan JSON is created successfully, run
hdfs diskbalancer -report xxx.xx
The output is:
{noformat}
[DISK: volume-null] - 0.00 used: 45997/101122146304, 1.00 free: 
101122100307/101122146304, isFailed: False, isReadOnly: False, isSkip: False, 
isTransient: False.
{noformat}
{{vol.getPath()}} returns null in {{ReportCommand#handleTopReport}}






[jira] [Created] (HDFS-10736) Format disk balance command's output info

2016-08-09 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-10736:
-

 Summary: Format disk balance command's output info
 Key: HDFS-10736
 URL: https://issues.apache.org/jira/browse/HDFS-10736
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu


When users run the disk balancer command as below
{quote}
hdfs diskbalancer
{quote}
it doesn't print detailed information about its options.
Also, when users run the disk balancer command in a wrong way, the output is not
consistent with other commands.






[jira] [Created] (HDFS-10645) Make block report size as a metric and add this metric to datanode web ui

2016-07-18 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-10645:
-

 Summary: Make block report size as a metric and add this metric to 
datanode web ui
 Key: HDFS-10645
 URL: https://issues.apache.org/jira/browse/HDFS-10645
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, ui
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu









[jira] [Created] (HDFS-10612) Optimize mechanism when block report size exceed the limit of PB message

2016-07-12 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-10612:
-

 Summary: Optimize mechanism when block report size exceed the 
limit of PB message
 Key: HDFS-10612
 URL: https://issues.apache.org/jira/browse/HDFS-10612
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yuanbo Liu


The community has made the block report size configurable in HDFS-10312, but
there is still a risk for Hadoop: if the block report size exceeds the protobuf
message size limit, the cluster may end up in a dangerous state.
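As a point of reference, here is a minimal sketch of two existing knobs that are
commonly tuned around this limit; the concrete values below are illustrative
assumptions, not recommendations:

{code}
import org.apache.hadoop.conf.Configuration;

public class BlockReportTuningSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();

    // Raise the maximum RPC payload the NameNode's RPC server will decode
    // (the "ipc.maximum.data.length" knob from HADOOP-9676).
    conf.setInt("ipc.maximum.data.length", 128 * 1024 * 1024);

    // Ask DataNodes to split a full block report into one RPC per storage once
    // the total number of blocks crosses this threshold.
    conf.setLong("dfs.blockreport.split.threshold", 100000L);

    System.out.println("ipc.maximum.data.length = "
        + conf.get("ipc.maximum.data.length"));
  }
}
{code}

Neither knob removes the underlying risk; they only move the point at which an
oversized report becomes a problem, which is why a better mechanism is still
worth discussing.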






Apply for "assign to me" permission

2016-07-11 Thread Yuanbo Liu
Hi, all
I'm Yuanbo from China, and I'm glad to join the Hadoop community. I'd like to
contribute to Hadoop, but I cannot assign JIRA issues to myself. Is there
anything I need to do to get that permission? Any help will be appreciated.


[jira] [Resolved] (HDFS-10593) MAX_DIR_ITEMS should not be hard coded since RPC buff size is configurable

2016-07-05 Thread Yuanbo Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuanbo Liu resolved HDFS-10593.
---
Resolution: Not A Problem

> MAX_DIR_ITEMS should not be hard coded since RPC buff size is configurable 
> ---
>
> Key: HDFS-10593
> URL: https://issues.apache.org/jira/browse/HDFS-10593
> Project: Hadoop HDFS
>  Issue Type: Improvement
>    Reporter: Yuanbo Liu
>
> In HDFS, "dfs.namenode.fs-limits.max-directory-items" was introduced in
> HDFS-6102 to restrict the maximum number of items in a single directory, and
> its value cannot be larger than MAX_DIR_ITEMS. Since "ipc.maximum.data.length"
> was added in HADOOP-9676 and documented in HADOOP-13039 to make the maximum
> RPC buffer size configurable, it's not proper to hard-code the value of
> MAX_DIR_ITEMS in {{FSDirectory}}.






[jira] [Created] (HDFS-10593) MAX_DIR_ITEMS should not be hard coded since RPC buff size is configurable

2016-07-05 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-10593:
-

 Summary: MAX_DIR_ITEMS should not be hard coded since RPC buff 
size is configurable 
 Key: HDFS-10593
 URL: https://issues.apache.org/jira/browse/HDFS-10593
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yuanbo Liu


In HDFS, "dfs.namenode.fs-limits.max-directory-items" was introduced in
HDFS-6102 to restrict the maximum number of items in a single directory, and its
value cannot be larger than MAX_DIR_ITEMS. Since "ipc.maximum.data.length" was
added in HADOOP-9676 and documented in HADOOP-13039 to make the maximum RPC
buffer size configurable, it's not proper to hard-code the value of
MAX_DIR_ITEMS in {{FSDirectory}}.
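For illustration, a simplified sketch of the kind of sanity check being
described; the ceiling constant and the default below are stand-ins, not the
exact FSDirectory code:

{code}
import org.apache.hadoop.conf.Configuration;

public class MaxDirItemsCheckSketch {
  // Hypothetical hard-coded ceiling, standing in for FSDirectory's MAX_DIR_ITEMS.
  static final int MAX_DIR_ITEMS = 64 * 100 * 1000;

  static int resolveMaxDirItems(Configuration conf) {
    int configured =
        conf.getInt("dfs.namenode.fs-limits.max-directory-items", 1024 * 1024);
    if (configured <= 0 || configured > MAX_DIR_ITEMS) {
      throw new IllegalArgumentException(
          "dfs.namenode.fs-limits.max-directory-items must be in (0, "
              + MAX_DIR_ITEMS + "], but was " + configured);
    }
    return configured;
  }

  public static void main(String[] args) {
    System.out.println(resolveMaxDirItems(new Configuration()));
  }
}
{code}

The point being made above is that, once the RPC buffer size is configurable, a
fixed ceiling like this no longer has an obvious justification.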


