[jira] [Commented] (HDFS-12890) Ozone: XceiverClient should have upper bound on async requests
[ https://issues.apache.org/jira/browse/HDFS-12890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283176#comment-16283176 ]

Mukul Kumar Singh commented on HDFS-12890:
------------------------------------------

Thanks for the updated patch [~shashikant].

1) In case of an exception, the ctx is being closed. I feel that on an exception, all the futures enqueued in the HashMap should be completed exceptionally, and the semaphore count should be decremented appropriately.

> Ozone: XceiverClient should have upper bound on async requests
> --------------------------------------------------------------
>
>                 Key: HDFS-12890
>                 URL: https://issues.apache.org/jira/browse/HDFS-12890
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: HDFS-7240
>    Affects Versions: HDFS-7240
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>             Fix For: HDFS-7240
>
>         Attachments: HDFS-12890-HDFS-7240.001.patch, HDFS-12890-HDFS-7240.002.patch, HDFS-12890-HDFS-7240.003.patch
>
>
> XceiverClient-ratis maintains an upper bound on the number of outstanding async requests. XceiverClient should also impose an upper bound on the number of outstanding async write requests received from clients.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
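The fix the comment asks for — fail every queued future and return its semaphore permit when the channel dies — can be sketched as follows. This is an illustrative sketch only, not the actual XceiverClient code; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical sketch, NOT the actual XceiverClient code: a client that
// bounds outstanding async requests with a semaphore and, on a fatal
// channel error, completes every queued future exceptionally and returns
// its permit so waiting writers are not blocked forever.
class BoundedAsyncClient {
    private final Semaphore permits;
    private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    private long nextId = 0;

    BoundedAsyncClient(int maxOutstanding) {
        permits = new Semaphore(maxOutstanding);
    }

    // Blocks once maxOutstanding requests are already in flight.
    synchronized CompletableFuture<String> submit() throws InterruptedException {
        permits.acquire();
        CompletableFuture<String> f = new CompletableFuture<>();
        pending.put(nextId++, f);
        return f;
    }

    // Normal completion path: hand the reply to the caller, free a permit.
    void complete(long id, String reply) {
        CompletableFuture<String> f = pending.remove(id);
        if (f != null) {
            f.complete(reply);
            permits.release();
        }
    }

    // Exception path (the point of the comment above): fail everything
    // still enqueued in the map and release one permit per failed future.
    void failAll(Throwable cause) {
        for (Long id : pending.keySet()) {
            CompletableFuture<String> f = pending.remove(id);
            if (f != null) {
                f.completeExceptionally(cause);
                permits.release();
            }
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

Without the `failAll` path, a channel error would leak permits and eventually deadlock every writer at `submit()`.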
[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283173#comment-16283173 ]

Xiao Chen commented on HDFS-12907:
----------------------------------

Thanks for the explanation, Daryn. The rage block is what's painful for us as well when supporting different downstream components and partners. I wish the kms clients had been designed / documented / implemented / reviewed perfectly too. Interestingly, if you look at the history of the KMSCP class you'll see quite a few attempts to make it 'work for the case of xxx'. Maybe what you described can be fixed in a new jira, considering some of the past behaviors simply wrong so we don't worry about compatibility.

Agree, letting the DN pass through raw bytes would be great. I hope this can be rolled into a simple design doc for HDFS-12355 so other people can follow it easily.

> Allow read-only access to reserved raw for non-superusers
> ---------------------------------------------------------
>
>                 Key: HDFS-12907
>                 URL: https://issues.apache.org/jira/browse/HDFS-12907
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Daryn Sharp
>            Assignee: Rushabh S Shah
>         Attachments: HDFS-12907.001.patch, HDFS-12907.patch
>
>
> HDFS-6509 added a special /.reserved/raw path prefix to access the raw file contents of EZ files. In the simplest sense it doesn't return the FE info in the {{LocatedBlocks}}, so the dfs client doesn't try to decrypt the data. This facilitates allowing tools like distcp to copy raw bytes.
>
> Access to the raw hierarchy is restricted to superusers. This seems like an overly broad restriction designed to prevent non-admins from munging the EZ-related xattrs. I believe we should relax the restriction to allow non-admins to perform read-only operations. Allowing non-superusers to easily read the raw bytes will be extremely useful for regular users, esp. for enabling webhdfs client-side encryption.
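As a concrete illustration of the /.reserved/raw prefix described in the issue above, here is a tiny standalone helper showing the path mapping a raw-copy tool such as distcp relies on. `RawPaths` is purely illustrative and not a Hadoop class.

```java
// Illustrative only (not part of Hadoop): a copy tool that wants the raw,
// undecrypted bytes of an encryption-zone file reads it through the
// /.reserved/raw prefix instead of the normal path.
class RawPaths {
    static final String RAW_PREFIX = "/.reserved/raw";

    // /user/alice/file -> /.reserved/raw/user/alice/file
    static String toRaw(String path) {
        if (path.startsWith(RAW_PREFIX)) {
            return path; // already a raw path
        }
        return RAW_PREFIX + path;
    }
}
```

Reading through the mapped path returns ciphertext plus xattrs rather than decrypted data, which is exactly why the issue argues read-only access to it is safe for non-superusers.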
[jira] [Comment Edited] (HDFS-10285) Storage Policy Satisfier in Namenode
[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283102#comment-16283102 ]

Uma Maheswara Rao G edited comment on HDFS-10285 at 12/8/17 7:35 AM:
---------------------------------------------------------------------

{quote}
Here's a rhetorical question: If managing multiple services is hard, why not bundle oozie, spark, storm, sqoop, kafka, ranger, knox, hive server, etc in the same process? Or ZK so HA is easier to deploy/manage?
{quote}

A few of my thoughts on this question: each of those projects is built for its own purpose, with its own spec, not just to help HDFS or any other single project. And none of those projects needs to access another project's internal data structures, whereas SPS functions only for HDFS and accesses its internal data structures. Even if forcibly separated out, we would need to expose 'for SPS only' RPC APIs. This makes me ask the question the other way around as well: does it make sense to separate ReplicationMonitor into its own process? Is it fine to start the EDEK work as a separate one? Is it OK to start other threads (like the decommissioning task) as separate processes coordinating via RPC, so that the NameSystem class becomes very lightweight? I think value vs. cost will decide whether to separate or to merge into a single process. Coming to the ZK part: as ZK is not built only for HDFS, I don't think any such thoughts apply there; it is a general-purpose coordination system. Technically we can't keep monitoring services inside the NN, because the worry itself is that the NN may die and need failover, so an external process is needed to monitor it. Anyway, I think the whole discussion is about services inside a project, not across projects, IMHO. Here SPS provides only the missing functionality of HSM, that is, end-to-end policy satisfaction. So, IMV, for users it may not be worth managing an additional process to get that missing functionality of a particular feature.

{quote}
Today, I looked at the code more closely. It can hold the lock (read lock, but still) way too long. Notably, but not limited to, you can't hold the lock while doing block placement.
{quote}

Appreciate your review, Daryn. I think it should be easy to address. We will make sure to address the comment before the merge; does that make sense?

{quote}
I'm curious why it isn't just part of the standard replication monitoring. If the DN is told to replicate to itself, it just does the storage movement.
{quote}

That's a good question. The overall approach is exactly the same as RM's. RM has its own queue built up for redundant blocks, and the under-replication scan/check happens at the block level, which makes sense there. Whereas in SPS, the policy changes on a file, so all blocks in that file need movement, and the policy check has to happen in coordination with where the file's replicas are currently stored. So we track the queues at the file level here and scan/check all blocks of a file together at once. Also, we wanted to provide an on-the-fly reconfigure feature, and we carefully decided not to interfere with the replication logic; it should be given more priority than SPS work. While scheduling blocks, we respect xmits counts, which are shared between RM and SPS for controlling DN load. When sending tasks to a DN, assignment priority is given to replication/EC blocks first, then SPS blocks. So, as part of the impact analysis, we decided that keeping SPS in its own thread would be cleaner and safer than running in the same loop as RM.
[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode
[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283102#comment-16283102 ]

Uma Maheswara Rao G commented on HDFS-10285:
--------------------------------------------

{quote}
Here's a rhetorical question: If managing multiple services is hard, why not bundle oozie, spark, storm, sqoop, kafka, ranger, knox, hive server, etc in the same process? Or ZK so HA is easier to deploy/manage?
{quote}

A few of my thoughts on this question: each of those projects is built for its own purpose, with its own spec, not just to help HDFS or any other single project. And none of those projects needs to access another project's internal data structures, whereas SPS functions only for HDFS and accesses its internal data structures. Even if forcibly separated out, we would need to expose 'for SPS only' RPC APIs. This makes me ask the question the other way around as well: does it make sense to separate ReplicationMonitor into its own process? Is it fine to start the EDEK work as a separate one? Is it OK to start other threads (like the decommissioning task) as separate processes coordinating via RPC, so that the NameSystem class becomes very lightweight? I think value vs. cost will decide whether to separate or to merge into a single process. Coming to the ZK part: as ZK is not built only for HDFS, I don't think any such thoughts apply there; it is a general-purpose coordination system. Technically we can't keep monitoring services inside the NN, because the worry itself is that the NN may die and need failover, so an external process is needed to monitor it. Anyway, I think the whole discussion is about services inside a project, not across projects, IMHO. Here SPS provides only the missing functionality of HSM, that is, end-to-end policy satisfaction. So, IMV, for users it may not be worth managing an additional process to get that missing functionality of a particular feature.

{quote}
Today, I looked at the code more closely. It can hold the lock (read lock, but still) way too long. Notably, but not limited to, you can't hold the lock while doing block placement.
{quote}

Appreciate your review, Daryn. I think it should be easy to address. We will make sure to address the comment before the merge; does that make sense?

{quote}
I'm curious why it isn't just part of the standard replication monitoring. If the DN is told to replicate to itself, it just does the storage movement.
{quote}

That's a good question. The overall approach is exactly the same as RM's. RM has its own queue built up for redundant blocks, and under-replication handling happens at the block level. Whereas in SPS, the policy changes on a file, so all blocks in that file need movement; so we track the queues at the file level. We wanted to provide an on-the-fly reconfigure feature, and we carefully avoided interfering with the replication logic, as that is more critical than SPS work. So, as part of the impact analysis, we thought keeping SPS separate would be safer than running in the same loop as RM.

> Storage Policy Satisfier in Namenode
> ------------------------------------
>
>                 Key: HDFS-10285
>                 URL: https://issues.apache.org/jira/browse/HDFS-10285
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>    Affects Versions: HDFS-10285
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>         Attachments: HDFS-10285-consolidated-merge-patch-00.patch, HDFS-10285-consolidated-merge-patch-01.patch, HDFS-10285-consolidated-merge-patch-02.patch, HDFS-10285-consolidated-merge-patch-03.patch, HDFS-SPS-TestReport-20170708.pdf, Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, Storage-Policy-Satisfier-in-HDFS-May10.pdf, Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policy. These policies can be set on a directory/file to specify the user's preference for where the physical blocks should be stored. When the user sets the storage policy before writing data, the blocks can take advantage of the storage policy preference and be stored accordingly.
>
> If the user sets the storage policy after writing and completing the file, the blocks would have been written with the default storage policy (nothing but DISK). The user has to run the 'Mover tool' explicitly, specifying all such file names as a list. In some distributed system scenarios (e.g. HBase) it would be difficult to collect all the files and run the tool, as different nodes can write files separately and files can have different paths.
> Another scenario is when a user renames a file from a directory with one effective storage policy (inherited from its parent directory) to a directory with a different storage policy; the inherited storage policy is not copied from the source, so the policy of the destination file/dir's parent takes effect. This rename operation is
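The scheduling priority described in the comment above — replication/EC tasks are assigned first, and SPS block moves only consume whatever per-DataNode xmits budget is left — can be sketched as follows. This is a hypothetical illustration, not the actual NameNode scheduler; all names are made up.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch of the priority scheme from the comment above:
// replication work and SPS moves share one per-DataNode transfer budget
// (xmits), and replication tasks are always drained first.
class XmitsScheduler {
    private final Queue<String> replicationQ = new ArrayDeque<>();
    private final Queue<String> spsQ = new ArrayDeque<>();

    void addReplication(String blk) { replicationQ.add(blk); }
    void addSpsMove(String blk)     { spsQ.add(blk); }

    // Pick up to xmitsBudget tasks for one heartbeat: replication first,
    // SPS moves only with whatever budget is left over.
    List<String> nextBatch(int xmitsBudget) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < xmitsBudget && !replicationQ.isEmpty()) {
            batch.add("REPLICATE:" + replicationQ.poll());
        }
        while (batch.size() < xmitsBudget && !spsQ.isEmpty()) {
            batch.add("MOVE:" + spsQ.poll());
        }
        return batch;
    }
}
```

Because both queues draw on the same budget, a backlog of replication work automatically throttles SPS movement, which is the safety property the comment argues for.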
[jira] [Commented] (HDFS-12886) Ignore minReplication for block recovery
[ https://issues.apache.org/jira/browse/HDFS-12886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283101#comment-16283101 ]

Daryn Sharp commented on HDFS-12886:
------------------------------------

Thanks for the ping, will make time to review in the next few days. Just based on a cursory read, I agree recovery should be able to succeed regardless of min replication. If the number of recovered/available replicas is less than desired, then it's just under-replicated. You can't wait for min replicas when there physically aren't that many - think rack loss. Waiting only increases the odds of data loss. That said, rarely is anything as easy to fix as it seems with block management...

> Ignore minReplication for block recovery
> ----------------------------------------
>
>                 Key: HDFS-12886
>                 URL: https://issues.apache.org/jira/browse/HDFS-12886
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>            Reporter: Lukas Majercak
>            Assignee: Lukas Majercak
>         Attachments: HDFS-12886.001.patch, HDFS-12886.002.patch
>
>
> Ignore minReplication for blocks that went through recovery, and allow the NN to complete them and replicate.
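The decision the comment argues for can be made concrete with a toy sketch: commit recovery with however many replicas survived, and hand anything below the desired replication count to normal re-replication instead of stalling on minReplication. This is a hypothetical illustration; the names and return values are invented and this is not the actual BlockManager logic.

```java
// Illustrative sketch (NOT actual NameNode code) of the commit decision
// proposed above: let recovery finish with whatever replicas survived,
// then queue the block for re-replication if it is under-replicated,
// rather than waiting on minReplication replicas that may not exist
// (e.g. after a rack loss).
class RecoveryCommit {
    static String onRecoveryDone(int recoveredReplicas, int desiredReplication) {
        if (recoveredReplicas == 0) {
            return "RECOVERY_FAILED";      // nothing survived at all
        }
        // The old behaviour would stall here when recoveredReplicas was
        // below minReplication, even if fewer replicas physically exist.
        if (recoveredReplicas < desiredReplication) {
            return "COMMIT_AND_REPLICATE"; // complete, then re-replicate
        }
        return "COMMIT";
    }
}
```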
[jira] [Commented] (HDFS-12875) RBF: Complete logic for -readonly option of dfsrouteradmin add command
[ https://issues.apache.org/jira/browse/HDFS-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283086#comment-16283086 ]

genericqa commented on HDFS-12875:
----------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec | 0m 11s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| trunk Compile Tests ||
| +1 | mvninstall | 18m 58s | trunk passed |
| +1 | compile | 1m 0s | trunk passed |
| +1 | checkstyle | 0m 43s | trunk passed |
| +1 | mvnsite | 1m 8s | trunk passed |
| +1 | shadedclient | 12m 15s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 2m 16s | hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 0m 52s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 1m 5s | the patch passed |
| +1 | compile | 0m 56s | the patch passed |
| +1 | javac | 0m 56s | the patch passed |
| -0 | checkstyle | 0m 37s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | mvnsite | 1m 0s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 28s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 11s | the patch passed |
| +1 | javadoc | 0m 53s | the patch passed |
|| Other Tests ||
| -1 | unit | 97m 9s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 22s | The patch does not generate ASF License warnings. |
|    |            | 152m 48s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|                    | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
|                    | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
|                    | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
|                    | hadoop.hdfs.TestDFSStripedInputStream |
|                    | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
|                    | hadoop.hdfs.server.federation.router.TestRouterMountTable |
|                    | hadoop.hdfs.TestLeaseRecovery2 |
|                    | hadoop.hdfs.TestUnsetAndChangeDirectoryEcPolicy |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12875 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901187/HDFS-12875.005.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 5a9155d45c7f 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d6c31a3 |
| maven | version:
[jira] [Commented] (HDFS-12882) Support full open(PathHandle) contract in HDFS
[ https://issues.apache.org/jira/browse/HDFS-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283079#comment-16283079 ]

genericqa commented on HDFS-12882:
----------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec | 0m 22s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 13 new or modified test files. |
|| trunk Compile Tests ||
|  0 | mvndep | 1m 30s | Maven dependency ordering for branch |
| +1 | mvninstall | 15m 14s | trunk passed |
| +1 | compile | 14m 15s | trunk passed |
| +1 | checkstyle | 1m 56s | trunk passed |
| +1 | mvnsite | 2m 38s | trunk passed |
| +1 | shadedclient | 13m 45s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 53s | hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 2m 13s | trunk passed |
|| Patch Compile Tests ||
|  0 | mvndep | 0m 18s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 17s | the patch passed |
| +1 | compile | 13m 19s | the patch passed |
| +1 | cc | 13m 19s | the patch passed |
| +1 | javac | 13m 19s | the patch passed |
| -0 | checkstyle | 2m 3s | root: The patch generated 25 new + 715 unchanged - 10 fixed = 740 total (was 725) |
| +1 | mvnsite | 2m 39s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 8m 53s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 5m 14s | the patch passed |
| +1 | javadoc | 2m 13s | the patch passed |
|| Other Tests ||
| +1 | unit | 8m 43s | hadoop-common in the patch passed. |
| +1 | unit | 1m 29s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 118m 21s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 37s | The patch does not generate ASF License warnings. |
|    |            | 220m 42s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|                    | hadoop.hdfs.web.TestWebHdfsTimeouts |
|                    | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12882 |
| JIRA
[jira] [Commented] (HDFS-12893) [READ] Support replication of Provided blocks with non-default topologies.
[ https://issues.apache.org/jira/browse/HDFS-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283065#comment-16283065 ]

genericqa commented on HDFS-12893:
----------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec | 9m 55s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| HDFS-9806 Compile Tests ||
|  0 | mvndep | 1m 50s | Maven dependency ordering for branch |
| +1 | mvninstall | 18m 44s | HDFS-9806 passed |
| +1 | compile | 13m 58s | HDFS-9806 passed |
| +1 | checkstyle | 2m 13s | HDFS-9806 passed |
| +1 | mvnsite | 1m 45s | HDFS-9806 passed |
| +1 | shadedclient | 14m 29s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 0m 33s | hadoop-tools/hadoop-fs2img in HDFS-9806 has 1 extant Findbugs warnings. |
| +1 | javadoc | 1m 22s | HDFS-9806 passed |
|| Patch Compile Tests ||
|  0 | mvndep | 0m 16s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 33s | the patch passed |
| +1 | compile | 15m 12s | the patch passed |
| +1 | javac | 15m 12s | the patch passed |
| -0 | checkstyle | 2m 33s | root: The patch generated 1 new + 140 unchanged - 0 fixed = 141 total (was 140) |
| +1 | mvnsite | 1m 48s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 49s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 28s | the patch passed |
| +1 | javadoc | 1m 28s | the patch passed |
|| Other Tests ||
| -1 | unit | 98m 9s | hadoop-hdfs in the patch failed. |
| +1 | unit | 3m 30s | hadoop-fs2img in the patch passed. |
| +1 | asflicense | 0m 37s | The patch does not generate ASF License warnings. |
|    |            | 205m 46s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestFsck |
|                    | hadoop.fs.TestUnbuffer |
|                    | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|                    | hadoop.hdfs.TestLeaseRecovery2 |
|                    | hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate |
|                    | hadoop.hdfs.server.namenode.TestAclConfigFlag |
|                    | hadoop.hdfs.server.namenode.TestBackupNode |
|                    | hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots |
|                    | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|                    | hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion |
|                    | hadoop.hdfs.server.namenode.TestFSImageWithSnapshot |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce
[jira] [Commented] (HDFS-12905) [READ] Handle decommissioning and under-maintenance Datanodes with Provided storage.
[ https://issues.apache.org/jira/browse/HDFS-12905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283045#comment-16283045 ]

genericqa commented on HDFS-12905:
----------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec | 9m 41s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| HDFS-9806 Compile Tests ||
|  0 | mvndep | 1m 28s | Maven dependency ordering for branch |
| +1 | mvninstall | 16m 23s | HDFS-9806 passed |
| +1 | compile | 12m 43s | HDFS-9806 passed |
| +1 | checkstyle | 2m 2s | HDFS-9806 passed |
| +1 | mvnsite | 1m 34s | HDFS-9806 passed |
| +1 | shadedclient | 14m 5s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 0m 32s | hadoop-tools/hadoop-fs2img in HDFS-9806 has 1 extant Findbugs warnings. |
| +1 | javadoc | 1m 19s | HDFS-9806 passed |
|| Patch Compile Tests ||
|  0 | mvndep | 0m 15s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 17s | the patch passed |
| +1 | compile | 11m 36s | the patch passed |
| +1 | javac | 11m 36s | the patch passed |
| -0 | checkstyle | 2m 5s | root: The patch generated 3 new + 16 unchanged - 0 fixed = 19 total (was 16) |
| +1 | mvnsite | 1m 32s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 9m 56s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 45s | the patch passed |
| +1 | javadoc | 1m 19s | the patch passed |
|| Other Tests ||
| -1 | unit | 88m 42s | hadoop-hdfs in the patch failed. |
| +1 | unit | 2m 44s | hadoop-fs2img in the patch passed. |
| +1 | asflicense | 0m 35s | The patch does not generate ASF License warnings. |
|    |            | 183m 45s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.fs.TestUnbuffer |
|                    | hadoop.hdfs.qjournal.server.TestJournalNodeSync |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12905 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901165/HDFS-12905-HDFS-9806.002.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux cdb25ae44bda 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-9806 / 2143767
[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283037#comment-16283037 ] genericqa commented on HDFS-12907:
--
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 6s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 14s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 46s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}102m 7s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 17s{color} | {color:black} {color} |
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestErasureCodingPolicies |
| | hadoop.hdfs.TestFileChecksum |
| | hadoop.fs.TestUnbuffer |
| | hadoop.hdfs.TestSafeModeWithStripedFileWithRandomECPolicy |
| | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure050 |
| | hadoop.hdfs.server.namenode.TestSecurityTokenEditLog |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
| | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 |
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12907 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901178/HDFS-12907.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 3d02135f23b4 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d6c31a3 |
| maven |
[jira] [Commented] (HDFS-12875) RBF: Complete logic for -readonly option of dfsrouteradmin add command
[ https://issues.apache.org/jira/browse/HDFS-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283034#comment-16283034 ] genericqa commented on HDFS-12875:
--
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 52s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 33s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 40s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m 17s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}156m 18s{color} | {color:black} {color} |
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestFSEditLogLoader |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
| | hadoop.fs.TestUnbuffer |
| | hadoop.hdfs.server.namenode.TestReencryption |
| | hadoop.hdfs.server.namenode.TestListCorruptFileBlocks |
| | hadoop.hdfs.TestPersistBlocks |
| | hadoop.hdfs.server.balancer.TestBalancerRPCDelay |
| | hadoop.hdfs.server.namenode.TestCacheDirectives |
| | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.server.federation.router.TestRouterMountTable |
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12875 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901181/HDFS-12875.004.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 895c67f38cdd 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282860#comment-16282860 ] Rushabh S Shah commented on HDFS-12907:
---
Following are the tests that failed.
{noformat}
TestUnbuffer
TestJournalNodeSync
TestDataNodeVolumeFailureReporting
TestNameNodeXAttr
TestWebHDFSXAttr
TestFileContextXAttr
{noformat}
The last 3 tests {{TestNameNodeXAttr, TestWebHDFSXAttr, TestFileContextXAttr}} failed because of my patch. The remaining ones are just flaky tests.
{{TestUnbuffer}} -- Tracked by HADOOP-15056. That jira is close to being resolved.
The remaining 2 tests passed locally.
{noformat}
[INFO] Running org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeSync
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 52.376 s - in org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeSync
[INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 86.746 s - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
{noformat}
Will attach a new patch soon which fixes the 3 test failures.
> Allow read-only access to reserved raw for non-superusers
> -
>
> Key: HDFS-12907
> URL: https://issues.apache.org/jira/browse/HDFS-12907
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Rushabh S Shah
> Attachments: HDFS-12907.patch
>
>
> HDFS-6509 added a special /.reserved/raw path prefix to access the raw file
> contents of EZ files. In the simplest sense it doesn't return the FE info in
> the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data.
> This facilitates allowing tools like distcp to copy raw bytes.
> Access to the raw hierarchy is restricted to superusers. This seems like an
> overly broad restriction designed to prevent non-admins from munging the EZ
> related xattrs.
> I believe we should relax the restriction to allow
> non-admins to perform read-only operations. Allowing non-superusers to
> easily read the raw bytes will be extremely useful for regular users, esp.
> for enabling webhdfs client-side encryption.
--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
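The raw access described above boils down to a path-prefix convention: the raw-bytes view of a file is its normal path placed under /.reserved/raw. A minimal sketch of that mapping follows; the class and method names (RawPathMapper, toRawPath) are invented for illustration and are not part of the HDFS client API.

```java
// Hypothetical helper illustrating the /.reserved/raw path-prefix convention
// introduced by HDFS-6509. Illustration only; not a real HDFS class.
public class RawPathMapper {
    static final String RAW_PREFIX = "/.reserved/raw";

    // Map a regular HDFS path to its raw-bytes view; raw paths pass through.
    public static String toRawPath(String path) {
        if (path.equals(RAW_PREFIX) || path.startsWith(RAW_PREFIX + "/")) {
            return path; // already a raw path
        }
        return RAW_PREFIX + path;
    }

    public static void main(String[] args) {
        System.out.println(toRawPath("/ez/file"));
        System.out.println(toRawPath("/.reserved/raw/ez/file"));
    }
}
```

A distcp-style raw copy would apply such a mapping to both source and destination so neither side attempts decryption.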
[jira] [Updated] (HDFS-12874) [READ] Documentation for provided storage
[ https://issues.apache.org/jira/browse/HDFS-12874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-12874: - Hadoop Flags: Reviewed Status: Patch Available (was: Open) > [READ] Documentation for provided storage > - > > Key: HDFS-12874 > URL: https://issues.apache.org/jira/browse/HDFS-12874 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chris Douglas >Assignee: Virajith Jalaparti > Attachments: HDFS-12874-HDFS-9806.00.patch, > HDFS-12874-HDFS-9806.01.patch > > > The configuration and deployment of provided storage should be documented for > end-users. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-12907: -- Attachment: HDFS-12907.001.patch [~daryn], [~andrew.wang]: can you please review. > Allow read-only access to reserved raw for non-superusers > - > > Key: HDFS-12907 > URL: https://issues.apache.org/jira/browse/HDFS-12907 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Assignee: Rushabh S Shah > Attachments: HDFS-12907.001.patch, HDFS-12907.patch > > > HDFS-6509 added a special /.reserved/raw path prefix to access the raw file > contents of EZ files. In the simplest sense it doesn't return the FE info in > the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data. > This facilitates allowing tools like distcp to copy raw bytes. > Access to the raw hierarchy is restricted to superusers. This seems like an > overly broad restriction designed to prevent non-admins from munging the EZ > related xattrs. I believe we should relax the restriction to allow > non-admins to perform read-only operations. Allowing non-superusers to > easily read the raw bytes will be extremely useful for regular users, esp. > for enabling webhdfs client-side encryption. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282860#comment-16282860 ] Rushabh S Shah edited comment on HDFS-12907 at 12/8/17 1:10 AM: Following are the tests that failed. {noformat} TestUnbuffer TestJournalNodeSync TestDataNodeVolumeFailureReporting TestNameNodeXAttr TestWebHDFSXAttr TestFileContextXAttr {noformat} Last 3 tests {{TestNameNodeXAttr,TestWebHDFSXAttr,TestFileContextXAttr}} are failed by my patch. Remaining are just flaky tests. {{TestUnbuffer}} -- Tracked by HADOOP-15056. This jira is almost closed to being resolved. Remaining 2 tests passed locally. {noformat} [INFO] Running org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeSync [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 52.376 s - in org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeSync [INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 86.746 s - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting {noformat} Will attach a new patch soon which fixes the 3 test failures. Regarding the findbugs warning, it is not related to my patch. was (Author: shahrs87): Following are the tests that failed. {noformat} TestUnbuffer TestJournalNodeSync TestDataNodeVolumeFailureReporting TestNameNodeXAttr TestWebHDFSXAttr TestFileContextXAttr {noformat} Last 3 tests {{TestNameNodeXAttr,TestWebHDFSXAttr,TestFileContextXAttr}} are failed by my patch. Remaining are just flaky tests. {{TestUnbuffer}} -- Tracked by HADOOP-15056. This jira is almost closed to being resolved. Remaining 2 tests passed locally. 
{noformat} [INFO] Running org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeSync [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 52.376 s - in org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeSync [INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 86.746 s - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting {noformat} Will attach a new patch soon which fixes the 3 test failures. > Allow read-only access to reserved raw for non-superusers > - > > Key: HDFS-12907 > URL: https://issues.apache.org/jira/browse/HDFS-12907 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Assignee: Rushabh S Shah > Attachments: HDFS-12907.001.patch, HDFS-12907.patch > > > HDFS-6509 added a special /.reserved/raw path prefix to access the raw file > contents of EZ files. In the simplest sense it doesn't return the FE info in > the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data. > This facilitates allowing tools like distcp to copy raw bytes. > Access to the raw hierarchy is restricted to superusers. This seems like an > overly broad restriction designed to prevent non-admins from munging the EZ > related xattrs. I believe we should relax the restriction to allow > non-admins to perform read-only operations. Allowing non-superusers to > easily read the raw bytes will be extremely useful for regular users, esp. > for enabling webhdfs client-side encryption. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12875) RBF: Complete logic for -readonly option of dfsrouteradmin add command
[ https://issues.apache.org/jira/browse/HDFS-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-12875:
---
Attachment: HDFS-12875.005.patch
Cleaned up unit tests.
> RBF: Complete logic for -readonly option of dfsrouteradmin add command
> --
>
> Key: HDFS-12875
> URL: https://issues.apache.org/jira/browse/HDFS-12875
> Project: Hadoop HDFS
> Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha3
>Reporter: Yiqun Lin
>Assignee: Íñigo Goiri
> Labels: RBF
> Attachments: HDFS-12875.001.patch, HDFS-12875.002.patch,
> HDFS-12875.003.patch, HDFS-12875.004.patch, HDFS-12875.005.patch
>
>
> The dfsrouteradmin has an option for readonly mount points but this is not
> implemented. We should add a special mount point which allows reading but
> not writing.
--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
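The read-only mount point described in this issue can be modeled as a flag on the mount-table entry that the Router's write path consults. The sketch below uses stand-in names (MountEntry, allowWrite); the real Router mount table classes are far richer, so treat this as an assumption-laden illustration of the semantics only.

```java
// Hypothetical model of a read-only RBF mount entry; not the actual
// MountTable/RouterRpcServer classes.
public class ReadOnlyMountSketch {
    static final class MountEntry {
        final String srcPath;
        final boolean readOnly; // would be set by `dfsrouteradmin -add ... -readonly`

        MountEntry(String srcPath, boolean readOnly) {
            this.srcPath = srcPath;
            this.readOnly = readOnly;
        }
    }

    // A write routed through a read-only mount point should be rejected.
    static boolean allowWrite(MountEntry entry) {
        return !entry.readOnly;
    }

    public static void main(String[] args) {
        MountEntry ro = new MountEntry("/readonly", true);
        MountEntry rw = new MountEntry("/data", false);
        System.out.println(allowWrite(ro) + " " + allowWrite(rw));
    }
}
```

Reads would bypass this check entirely, which is what makes the mount point readable but not writable.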
[jira] [Commented] (HDFS-12882) Support full open(PathHandle) contract in HDFS
[ https://issues.apache.org/jira/browse/HDFS-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282831#comment-16282831 ] genericqa commented on HDFS-12882:
--
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 41 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 47s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 3m 2s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 38s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 31s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 1m 12s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 53s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 27s{color} | {color:red} hadoop-hdfs-nfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 4m 31s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 4m 31s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 4m 31s{color} | {color:red} root in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 4m 0s{color} | {color:orange} root: The patch generated 34 new + 2069 unchanged - 13 fixed = 2103 total (was 2082) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 21s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 19s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 19s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 26s{color} | {color:red} hadoop-hdfs-nfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 0m 33s{color} | {color:red} patch has errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 22s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 18s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 19s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 17s{color} | {color:red} hadoop-hdfs-nfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}
[jira] [Commented] (HDFS-12874) [READ] Documentation for provided storage
[ https://issues.apache.org/jira/browse/HDFS-12874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282935#comment-16282935 ] Íñigo Goiri commented on HDFS-12874:
For reference, the outcome is [here|https://github.com/apache/hadoop/blob/HDFS-9806/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsProvidedStorage.md].
> [READ] Documentation for provided storage
> -
>
> Key: HDFS-12874
> URL: https://issues.apache.org/jira/browse/HDFS-12874
> Project: Hadoop HDFS
> Issue Type: Sub-task
>Reporter: Chris Douglas
>Assignee: Virajith Jalaparti
> Attachments: HDFS-12874-HDFS-9806.00.patch,
> HDFS-12874-HDFS-9806.01.patch
>
>
> The configuration and deployment of provided storage should be documented for
> end-users.
--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12893) [READ] Support replication of Provided blocks with non-default topologies.
[ https://issues.apache.org/jira/browse/HDFS-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282924#comment-16282924 ] Íñigo Goiri commented on HDFS-12893:
I like this better. For this:
{code}
if (storage.getStorageType() == StorageType.PROVIDED) {
  storage = new DatanodeStorageInfo(node, storage.getStorageID(),
      storage.getStorageType(), storage.getState());
}
{code}
You are pretty much just copying the storage. Should we have a DatanodeProvidedInfo which just does:
{code}
public class DatanodeProvidedInfo extends DatanodeStorageInfo {
  public DatanodeProvidedInfo(DatanodeDescriptor node, DatanodeStorageInfo storage) {
    super(node, storage.getStorageID(), StorageType.PROVIDED, storage.getState());
  }
}
{code}
This may allow adding a couple more overrides.
> [READ] Support replication of Provided blocks with non-default topologies.
> --
>
> Key: HDFS-12893
> URL: https://issues.apache.org/jira/browse/HDFS-12893
> Project: Hadoop HDFS
> Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-12893-HDFS-9806.001.patch,
> HDFS-12893-HDFS-9806.002.patch, HDFS-12893-HDFS-9806.003.patch
>
>
> {{chooseSourceDatanodes}} returns the {{ProvidedDatanodeDescriptor}} as the
> source of Provided blocks. As this isn't a physical datanode and doesn't
> exist in the topology, {{ReplicationWork.chooseTargets}} might fail depending on
> the chosen {{BlockPlacementPolicy}} implementation. This JIRA aims to fix
> this issue.
--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
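The copy pattern discussed in the review above (rebuilding a storage descriptor so a PROVIDED replica is attributed to a concrete datanode) can be sketched with stand-in types. The names below (StorageInfo, reHome) are invented; the real DatanodeStorageInfo has a much richer API, so this only illustrates the idea.

```java
// Self-contained illustration of re-homing a PROVIDED storage onto a
// specific datanode, with stand-in types instead of the Hadoop classes.
public class ProvidedCopySketch {
    enum StorageType { DISK, PROVIDED }

    static class StorageInfo {
        final String node;      // owning datanode (stand-in for DatanodeDescriptor)
        final String storageId;
        final StorageType type;

        StorageInfo(String node, String storageId, StorageType type) {
            this.node = node;
            this.storageId = storageId;
            this.type = type;
        }
    }

    // Copy a PROVIDED storage so it is attributed to a concrete datanode,
    // as the proposed DatanodeProvidedInfo subclass would do in its constructor.
    static StorageInfo reHome(String node, StorageInfo s) {
        if (s.type == StorageType.PROVIDED) {
            return new StorageInfo(node, s.storageId, s.type);
        }
        return s; // non-PROVIDED storages already belong to a physical node
    }

    public static void main(String[] args) {
        StorageInfo provided = new StorageInfo(null, "ds-1", StorageType.PROVIDED);
        StorageInfo homed = reHome("dn-42", provided);
        System.out.println(homed.node + " " + homed.storageId);
    }
}
```

Putting the copy in a subclass constructor, as suggested, keeps this logic in one place and leaves room for PROVIDED-specific overrides.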
[jira] [Commented] (HDFS-12882) Support full open(PathHandle) contract in HDFS
[ https://issues.apache.org/jira/browse/HDFS-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282916#comment-16282916 ] Íñigo Goiri commented on HDFS-12882: The approach in [^HDFS-12882.04.patch] is cleaner. Can we document a little more what the token means? One has to go all the way to {{FSNamesystem}} to track it. > Support full open(PathHandle) contract in HDFS > -- > > Key: HDFS-12882 > URL: https://issues.apache.org/jira/browse/HDFS-12882 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Chris Douglas >Assignee: Chris Douglas > Attachments: HDFS-12882.00.patch, HDFS-12882.00.salient.txt, > HDFS-12882.01.patch, HDFS-12882.02.patch, HDFS-12882.03.patch, > HDFS-12882.04.patch > > > HDFS-7878 added support for {{open(PathHandle)}}, but it only partially > implemented the semantics specified in the contract (i.e., open-by-inodeID). > HDFS should implement all permutations of the default options for > {{PathHandle}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
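The open(PathHandle) contract being completed here is essentially open-by-inode-id: the handle keeps resolving to the same file even after a rename. A toy model of that behavior follows; the names (create, rename, openByHandle) are invented and this is not the HDFS implementation, which encodes the options and inode id in the PathHandle token itself.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of open-by-handle semantics: the handle pins a file by inode id,
// so resolution ignores the current path. Illustration only.
public class PathHandleSketch {
    static final Map<Long, String> inodeToData = new HashMap<>();
    static final Map<String, Long> pathToInode = new HashMap<>();

    static long create(String path, String data) {
        long inodeId = pathToInode.size() + 1;
        pathToInode.put(path, inodeId);
        inodeToData.put(inodeId, data);
        return inodeId; // the "handle" here is just the inode id
    }

    static void rename(String from, String to) {
        pathToInode.put(to, pathToInode.remove(from));
    }

    // Open by handle: resolution goes through the inode id, not the path.
    static String openByHandle(long inodeId) {
        return inodeToData.get(inodeId);
    }

    public static void main(String[] args) {
        long h = create("/a", "hello");
        rename("/a", "/b");
        System.out.println(openByHandle(h)); // still resolves after the rename
    }
}
```

The real contract also has permutations (fail on content change, fail on move, and so on), which is why the review asks for the token's meaning to be documented rather than inferred from FSNamesystem.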
[jira] [Updated] (HDFS-12875) RBF: Complete logic for -readonly option of dfsrouteradmin add command
[ https://issues.apache.org/jira/browse/HDFS-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-12875: --- Attachment: HDFS-12875.004.patch > RBF: Complete logic for -readonly option of dfsrouteradmin add command > -- > > Key: HDFS-12875 > URL: https://issues.apache.org/jira/browse/HDFS-12875 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0-alpha3 >Reporter: Yiqun Lin >Assignee: Íñigo Goiri > Labels: RBF > Attachments: HDFS-12875.001.patch, HDFS-12875.002.patch, > HDFS-12875.003.patch, HDFS-12875.004.patch > > > The dfsrouteradmin has an option for readonly mount points but this is not > implemented. We should add an special mount point which allows reading but > not writing. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12893) [READ] Support replication of Provided blocks with non-default topologies.
[ https://issues.apache.org/jira/browse/HDFS-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12893: -- Attachment: HDFS-12893-HDFS-9806.003.patch > [READ] Support replication of Provided blocks with non-default topologies. > -- > > Key: HDFS-12893 > URL: https://issues.apache.org/jira/browse/HDFS-12893 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-12893-HDFS-9806.001.patch, > HDFS-12893-HDFS-9806.002.patch, HDFS-12893-HDFS-9806.003.patch > > > {{chooseSourceDatanodes}} returns the {{ProvidedDatanodeDescriptor}} as the > source of Provided blocks. As this isn't a physical datanode and doesn't > exist the topology, {{ReplicationWork.chooseTargets}} might fail depending on > the chosen {{BlockPlacementPolicy}} implementation. This JIRA aims to fix > this issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12893) [READ] Support replication of Provided blocks with non-default topologies.
[ https://issues.apache.org/jira/browse/HDFS-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12893: -- Status: Open (was: Patch Available) > [READ] Support replication of Provided blocks with non-default topologies. > -- > > Key: HDFS-12893 > URL: https://issues.apache.org/jira/browse/HDFS-12893 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-12893-HDFS-9806.001.patch, > HDFS-12893-HDFS-9806.002.patch, HDFS-12893-HDFS-9806.003.patch > > > {{chooseSourceDatanodes}} returns the {{ProvidedDatanodeDescriptor}} as the > source of Provided blocks. As this isn't a physical datanode and doesn't > exist the topology, {{ReplicationWork.chooseTargets}} might fail depending on > the chosen {{BlockPlacementPolicy}} implementation. This JIRA aims to fix > this issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12874) [READ] Documentation for provided storage
[ https://issues.apache.org/jira/browse/HDFS-12874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-12874: - Resolution: Fixed Status: Resolved (was: Patch Available) I committed this. Thanks Virajith > [READ] Documentation for provided storage > - > > Key: HDFS-12874 > URL: https://issues.apache.org/jira/browse/HDFS-12874 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chris Douglas >Assignee: Virajith Jalaparti > Attachments: HDFS-12874-HDFS-9806.00.patch, > HDFS-12874-HDFS-9806.01.patch > > > The configuration and deployment of provided storage should be documented for > end-users. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12874) [READ] Documentation for provided storage
[ https://issues.apache.org/jira/browse/HDFS-12874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282904#comment-16282904 ] genericqa commented on HDFS-12874: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} HDFS-12874 does not apply to HDFS-9806. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HDFS-12874 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901143/HDFS-12874-HDFS-9806.01.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/22325/console | | Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > [READ] Documentation for provided storage > - > > Key: HDFS-12874 > URL: https://issues.apache.org/jira/browse/HDFS-12874 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chris Douglas >Assignee: Virajith Jalaparti > Attachments: HDFS-12874-HDFS-9806.00.patch, > HDFS-12874-HDFS-9806.01.patch > > > The configuration and deployment of provided storage should be documented for > end-users. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12905) [READ] Handle decommissioning and under-maintenance Datanodes with Provided storage.
[ https://issues.apache.org/jira/browse/HDFS-12905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-12905: - Status: Patch Available (was: Open) > [READ] Handle decommissioning and under-maintenance Datanodes with Provided > storage. > > > Key: HDFS-12905 > URL: https://issues.apache.org/jira/browse/HDFS-12905 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-12905-HDFS-9806.001.patch, > HDFS-12905-HDFS-9806.002.patch > > > {{ProvidedStorageMap}} doesn't keep track of the state of the datanodes with > Provided storage. As a result, it can return nodes that are being > decommissioned or under-maintenance even when live datanodes exist. This JIRA > is to prefer live datanodes to datanodes in other states. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11915) Sync rbw dir on the first hsync() to avoid file lost on power failure
[ https://issues.apache.org/jira/browse/HDFS-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282867#comment-16282867 ] genericqa commented on HDFS-11915: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 24m 32s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 43s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile 
{color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 91m 58s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 1m 6s{color} | {color:red} The patch generated 154 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}140m 29s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Unreaped Processes | hadoop-hdfs:22 | | Failed junit tests | hadoop.hdfs.TestEncryptedTransfer | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.TestDFSClientSocketSize | | | hadoop.metrics2.sink.TestRollingFileSystemSinkWithSecureHdfs | | | hadoop.hdfs.TestListFilesInDFS | | | hadoop.hdfs.TestSetrepDecreasing | | | hadoop.hdfs.TestParallelUnixDomainRead | | | hadoop.hdfs.TestHDFSServerPorts | | | hadoop.hdfs.TestMaintenanceState | | | hadoop.hdfs.TestFileConcurrentReader | | Timed out junit tests | org.apache.hadoop.hdfs.TestFileAppend | | | org.apache.hadoop.hdfs.TestSafeMode | | | org.apache.hadoop.hdfs.TestRollingUpgradeDowngrade | | | org.apache.hadoop.hdfs.web.TestWebHdfsWithRestCsrfPreventionFilter | | | org.apache.hadoop.hdfs.TestDFSUpgrade | | | org.apache.hadoop.hdfs.web.TestWebHDFS | | | org.apache.hadoop.hdfs.web.TestWebHDFSXAttr | | | org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes | | | org.apache.hadoop.hdfs.TestRenameWhileOpen | | | org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs | | | org.apache.hadoop.hdfs.TestFSOutputSummer | | | org.apache.hadoop.hdfs.TestExternalBlockReader | | | org.apache.hadoop.hdfs.TestHFlush | | | org.apache.hadoop.hdfs.TestTrashWithEncryptionZones | | | org.apache.hadoop.hdfs.TestDFSShell | | | org.apache.hadoop.hdfs.TestReplaceDatanodeFailureReplication | | | org.apache.hadoop.hdfs.TestDFSRename | | | org.apache.hadoop.hdfs.web.TestWebHDFSAcl | \\ \\ || Subsystem || Report/Notes || | Docker |
[jira] [Commented] (HDFS-12893) [READ] Support replication of Provided blocks with non-default topologies.
[ https://issues.apache.org/jira/browse/HDFS-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282889#comment-16282889 ] Virajith Jalaparti commented on HDFS-12893: --- [~elgoiri] - patch v3 wraps the selection of the DatanodeDescriptor from the storage. > [READ] Support replication of Provided blocks with non-default topologies. > -- > > Key: HDFS-12893 > URL: https://issues.apache.org/jira/browse/HDFS-12893 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-12893-HDFS-9806.001.patch, > HDFS-12893-HDFS-9806.002.patch, HDFS-12893-HDFS-9806.003.patch > > > {{chooseSourceDatanodes}} returns the {{ProvidedDatanodeDescriptor}} as the > source of Provided blocks. As this isn't a physical datanode and doesn't > exist in the topology, {{ReplicationWork.chooseTargets}} might fail depending on > the chosen {{BlockPlacementPolicy}} implementation. This JIRA aims to fix > this issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12893) [READ] Support replication of Provided blocks with non-default topologies.
[ https://issues.apache.org/jira/browse/HDFS-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12893: -- Status: Patch Available (was: Open) > [READ] Support replication of Provided blocks with non-default topologies. > -- > > Key: HDFS-12893 > URL: https://issues.apache.org/jira/browse/HDFS-12893 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-12893-HDFS-9806.001.patch, > HDFS-12893-HDFS-9806.002.patch, > HDFS-12893-HDFS-9806.003.patch > > > {{chooseSourceDatanodes}} returns the {{ProvidedDatanodeDescriptor}} as the > source of Provided blocks. As this isn't a physical datanode and doesn't > exist in the topology, {{ReplicationWork.chooseTargets}} might fail depending on > the chosen {{BlockPlacementPolicy}} implementation. This JIRA aims to fix > this issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12825) Fsck report shows config key name for min replication issues
[ https://issues.apache.org/jira/browse/HDFS-12825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HDFS-12825: -- Summary: Fsck report shows config key name for min replication issues (was: After Block Corrupted, FSCK Report printing the Direct configuration. ) > Fsck report shows config key name for min replication issues > > > Key: HDFS-12825 > URL: https://issues.apache.org/jira/browse/HDFS-12825 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.0.0-alpha1 >Reporter: Harshakiran Reddy >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Attachments: HDFS-12825.001.patch, error.JPG > > > Scenario: > Corrupt the Block in any datanode > Take the *FSCK *Report for that file. > Actual Output: > == > printing the direct configuration in fsck report > {{dfs.namenode.replication.min}} > Expected Output: > > it should be {{MINIMAL BLOCK REPLICATION}} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282868#comment-16282868 ] Daryn Sharp commented on HDFS-12907: bq. My understanding is after HDFS-12355 \[...\] , Is this remotely correct? No, the goal is that all encryption/decryption is done on the client. The DN will not be given KMS tokens. Ever. It will not talk to the KMS. Ever. The DN will never encrypt/decrypt. The KMS client completely breaks all the ugi semantics to enable DNs to do the encrypt/decrypt. How? Why? First off, the KMS client morphs based on the caller's ugi context. Clients are expected to always be who they were when created. Imagine if an IPC client was user1, disconnected, and magically became user2 just because it was in another context. That's the KMS client. It gets worse. If the current ugi is a proxy user, the KMS client will try to authenticate as the real user. That's fine when the ugi's real user is the login user of a service (ex. oozie). But if there are _no credentials_, ex. a proxy ugi from a token, the client willfully decides to use the login user's credentials! And for good measure, let's proxy as the effective ugi! That's super bizarre. So let's put it together with the DN. I use webhdfs as "daryn (token)", but the DN connects to the KMS as "daryn via dn (kerberos)". Or I submit a job with oozie, so I'm "daryn via oozie (token)" to the DN but it connects to the KMS as "daryn via dn (kerberos)". Wow, that would never work, right? It would if you told users to make all their DNs be proxy users on the KMS! And since most people map their DNs to the hdfs superuser, which is a really bad idea, you have now given admins the ability to decrypt any file. Both Cloudera and Hortonworks actually documented this security insanity. Cloudera's docs appear to be gone now, but used to acknowledge with a yellow box like "this is a bad idea, but if you really want to...". 
HortonWorks docs still exist with a footnote like "oh yeah, if you are still paying attention after clicking all the ui buttons, all your nodes now have access to all your keys, might want to consider changing your superuser". If you allow every node in your cluster the ability to decrypt everything on your cluster, why did you even enable security, let alone EZ? It's a rotten idea that should have never been implemented or passed a review. It's what happens when a feature is rushed. -- Phew. I value my security and data. I'm sure as hell not making my DNs be proxy users, but we're stuck not breaking all the people that go to sleep with a false sense of cluster security. So in the new design in progress, the DN is used as a dumb passthrough of encrypted bytes. It never encrypts/decrypts or even talks to the KMS. This can be done in a compatible way by the client sending a header to the NN indicating that it knows how to handle EZ. A new NN gives back the feinfo and prefaces the redirect path with /.reserved/raw. That works across both old and new nodes and clusters. Should be beautiful. Stay tuned. > Allow read-only access to reserved raw for non-superusers > - > > Key: HDFS-12907 > URL: https://issues.apache.org/jira/browse/HDFS-12907 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Assignee: Rushabh S Shah > Attachments: HDFS-12907.001.patch, HDFS-12907.patch > > > HDFS-6509 added a special /.reserved/raw path prefix to access the raw file > contents of EZ files. In the simplest sense it doesn't return the FE info in > the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data. > This facilitates allowing tools like distcp to copy raw bytes. > Access to the raw hierarchy is restricted to superusers. This seems like an > overly broad restriction designed to prevent non-admins from munging the EZ > related xattrs. 
I believe we should relax the restriction to allow > non-admins to perform read-only operations. Allowing non-superusers to > easily read the raw bytes will be extremely useful for regular users, esp. > for enabling webhdfs client-side encryption. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
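The credential-selection sequence Daryn describes (proxy ugi -> real user -> login-user fallback) can be sketched with a minimal, self-contained model. All names here ({{Ugi}}, {{KmsAuthModel}}, {{chooseAuthUgi}}) are hypothetical stand-ins for illustration, not the actual {{KMSClientProvider}} code; the sketch only shows why a token-based proxy ugi ends up authenticating as the service's login user (e.g. the DN).

```java
import java.util.List;

/** Hypothetical, minimal stand-in for a UGI: a name, an optional real user, and credentials. */
class Ugi {
    final String name;
    final Ugi realUser;            // non-null when this ugi is a proxy user
    final List<String> credentials;

    Ugi(String name, Ugi realUser, List<String> credentials) {
        this.name = name;
        this.realUser = realUser;
        this.credentials = credentials;
    }
}

/** Models the credential-selection behavior described in the comment above. */
class KmsAuthModel {
    /**
     * Pick the identity the client would authenticate with: a proxy ugi is
     * replaced by its real user, and a real user without credentials falls
     * back to the service's login user (e.g. the DN's kerberos identity).
     */
    static Ugi chooseAuthUgi(Ugi current, Ugi loginUser) {
        Ugi actual = current;
        if (current.realUser != null) {
            actual = current.realUser;   // proxy ugi: try to authenticate as the real user
        }
        if (actual.credentials.isEmpty()) {
            actual = loginUser;          // no credentials: use the login user's credentials
        }
        return actual;
    }
}
```

In this model, a proxy ugi minted from a token carries no real-user credentials, so the fallback silently switches to the service's own identity, which is the "daryn via dn (kerberos)" connection the comment complains about.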
[jira] [Updated] (HDFS-12825) Fsck report shows config key name for min replication issues
[ https://issues.apache.org/jira/browse/HDFS-12825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HDFS-12825: -- Labels: incompatibleChange newbie (was: newbie) > Fsck report shows config key name for min replication issues > > > Key: HDFS-12825 > URL: https://issues.apache.org/jira/browse/HDFS-12825 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.0.0-alpha1 >Reporter: Harshakiran Reddy >Assignee: Gabor Bota >Priority: Minor > Labels: incompatibleChange, newbie > Attachments: HDFS-12825.001.patch, error.JPG > > > Scenario: > Corrupt the Block in any datanode > Take the *FSCK *Report for that file. > Actual Output: > == > printing the direct configuration in fsck report > {{dfs.namenode.replication.min}} > Expected Output: > > it should be {{MINIMAL BLOCK REPLICATION}} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12874) [READ] Documentation for provided storage
[ https://issues.apache.org/jira/browse/HDFS-12874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282854#comment-16282854 ] Chris Douglas commented on HDFS-12874: -- +1 I prefer this version. > [READ] Documentation for provided storage > - > > Key: HDFS-12874 > URL: https://issues.apache.org/jira/browse/HDFS-12874 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chris Douglas > Attachments: HDFS-12874-HDFS-9806.00.patch, > HDFS-12874-HDFS-9806.01.patch > > > The configuration and deployment of provided storage should be documented for > end-users. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-12874) [READ] Documentation for provided storage
[ https://issues.apache.org/jira/browse/HDFS-12874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas reassigned HDFS-12874: Assignee: Virajith Jalaparti > [READ] Documentation for provided storage > - > > Key: HDFS-12874 > URL: https://issues.apache.org/jira/browse/HDFS-12874 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chris Douglas >Assignee: Virajith Jalaparti > Attachments: HDFS-12874-HDFS-9806.00.patch, > HDFS-12874-HDFS-9806.01.patch > > > The configuration and deployment of provided storage should be documented for > end-users. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12882) Support full open(PathHandle) contract in HDFS
[ https://issues.apache.org/jira/browse/HDFS-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-12882: - Attachment: HDFS-12882.04.patch Alternative approach, adding {{ClientProtocol::getLocatedFileInfo}} > Support full open(PathHandle) contract in HDFS > -- > > Key: HDFS-12882 > URL: https://issues.apache.org/jira/browse/HDFS-12882 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Chris Douglas >Assignee: Chris Douglas > Attachments: HDFS-12882.00.patch, HDFS-12882.00.salient.txt, > HDFS-12882.01.patch, HDFS-12882.02.patch, HDFS-12882.03.patch, > HDFS-12882.04.patch > > > HDFS-7878 added support for {{open(PathHandle)}}, but it only partially > implemented the semantics specified in the contract (i.e., open-by-inodeID). > HDFS should implement all permutations of the default options for > {{PathHandle}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12893) [READ] Support replication of Provided blocks with non-default topologies.
[ https://issues.apache.org/jira/browse/HDFS-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282815#comment-16282815 ] Íñigo Goiri commented on HDFS-12893: Can we wrap the functions where we return the regular descriptor or the random one? The creation of the new PROVIDED DN could also be wrapped. > [READ] Support replication of Provided blocks with non-default topologies. > -- > > Key: HDFS-12893 > URL: https://issues.apache.org/jira/browse/HDFS-12893 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-12893-HDFS-9806.001.patch, > HDFS-12893-HDFS-9806.002.patch > > > {{chooseSourceDatanodes}} returns the {{ProvidedDatanodeDescriptor}} as the > source of Provided blocks. As this isn't a physical datanode and doesn't > exist in the topology, {{ReplicationWork.chooseTargets}} might fail depending on > the chosen {{BlockPlacementPolicy}} implementation. This JIRA aims to fix > this issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282559#comment-16282559 ] Rushabh S Shah commented on HDFS-12907: --- bq. On the topic of webhdfs client-side encryption, could you talk a little more about your usecase? The use case is read/write to an encrypted directory via WebHdfsFileSystem. If we follow the current community implementation, we have to create a new user to run the datanode as (let's say {{dn}}), separate from the user with which the namenode runs ({{hdfs}}). Also, we need to whitelist the {{dn}} user to be able to decrypt {{edek}}s. If someone gets access to the {{dn}} user, they can easily decrypt all clients' edeks. > Allow read-only access to reserved raw for non-superusers > - > > Key: HDFS-12907 > URL: https://issues.apache.org/jira/browse/HDFS-12907 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Assignee: Rushabh S Shah > Attachments: HDFS-12907.patch > > > HDFS-6509 added a special /.reserved/raw path prefix to access the raw file > contents of EZ files. In the simplest sense it doesn't return the FE info in > the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data. > This facilitates allowing tools like distcp to copy raw bytes. > Access to the raw hierarchy is restricted to superusers. This seems like an > overly broad restriction designed to prevent non-admins from munging the EZ > related xattrs. I believe we should relax the restriction to allow > non-admins to perform read-only operations. Allowing non-superusers to > easily read the raw bytes will be extremely useful for regular users, esp. > for enabling webhdfs client-side encryption. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12905) [READ] Handle decommissioning and under-maintenance Datanodes with Provided storage.
[ https://issues.apache.org/jira/browse/HDFS-12905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282801#comment-16282801 ] Íñigo Goiri commented on HDFS-12905: Thanks [~virajith], I personally prefer the approach in [^HDFS-12905-HDFS-9806.002.patch]. +1 on it pending Jenkins. > [READ] Handle decommissioning and under-maintenance Datanodes with Provided > storage. > > > Key: HDFS-12905 > URL: https://issues.apache.org/jira/browse/HDFS-12905 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-12905-HDFS-9806.001.patch, > HDFS-12905-HDFS-9806.002.patch > > > {{ProvidedStorageMap}} doesn't keep track of the state of the datanodes with > Provided storage. As a result, it can return nodes that are being > decommissioned or under-maintenance even when live datanodes exist. This JIRA > is to prefer live datanodes to datanodes in other states. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-12905) [READ] Handle decommissioning and under-maintenance Datanodes with Provided storage.
[ https://issues.apache.org/jira/browse/HDFS-12905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282794#comment-16282794 ] Virajith Jalaparti edited comment on HDFS-12905 at 12/8/17 12:23 AM: - A 2nd attempt at this following offline discussions with [~elgoiri] - the concern with patch v1 was the changes involved in the {{HeartBeatManager}}. This creates an implicit dependency between the {{ProvidedStorageMap}} and the {{HeartBeatManager}}, and any future changes to the {{HeartBeatManager}} have to reason about how they affect the {{ProvidedStorageMap}}. Patch v2 limits the changes to the {{ProvidedStorageMap}}, which is now fully responsible for reasoning about the state of the Datanodes. This increases the complexity in the implementation of the {{ProvidedStorageMap}}. In particular, patch v2 has a potentially higher compute complexity -- to choose a random node, it first looks for nodes that are in the {{AdminStates.NORMAL}} state. If none are found, it then tries to find nodes whose state is not {{AdminStates.NORMAL}}. An alternate approach could be to maintain 2 lists, one for nodes whose state is {{AdminStates.NORMAL}} and one for nodes whose state is not {{AdminStates.NORMAL}}. While this would be more efficient at choosing nodes (it avoids looking at a node twice), it comes with higher bookkeeping costs (e.g., using a background thread to move Datanodes between these lists). For now, I think this approach seems reasonable assuming most Datanodes are in the {{AdminStates.NORMAL}} state and few are not. Once HDFS-12848 is completed, this could be made pluggable and other efficient implementations are possible. [~elgoiri], [~chris.douglas] - can you take a look? was (Author: virajith): A 2nd attempt at this following offline discussions with [~elgoiri] - the concern with patch v1 was the changes involved in the {{HeartBeatManager}}. 
This creates an implicit dependency between the {{ProvidedStorageMap}} and the {{HeartBeatManager}}, and any future changes to the {{HeartBeatManager}} have to reason about how they affect the {{ProvidedStorageMap}}. Patch v2 limits the changes to the {{ProvidedStorageMap}}, which is now responsible for reasoning about Datanodes whose state is not {{AdminStates.NORMAL}}. This increases the complexity in the implementation of the {{ProvidedStorageMap}}. In particular, patch v2 has a potentially higher compute complexity -- to choose a random node, it first looks for nodes that are in the {{AdminStates.NORMAL}} state. If none are found, it then tries to find nodes whose state is not {{AdminStates.NORMAL}}. An alternate approach could be to maintain 2 lists, one for nodes whose state is {{AdminStates.NORMAL}} and one for nodes whose state is not {{AdminStates.NORMAL}}. While this would be more efficient at choosing nodes (it avoids looking at a node twice), it leads to more complex bookkeeping (e.g., using a background thread to move Datanodes between these lists). For now, I think this approach seems reasonable assuming most Datanodes are in the {{AdminStates.NORMAL}} state and few are not. Once HDFS-12848 is completed, this could be made pluggable and other efficient implementations are possible. [~elgoiri], [~chris.douglas] - can you take a look? > [READ] Handle decommissioning and under-maintenance Datanodes with Provided > storage. > > > Key: HDFS-12905 > URL: https://issues.apache.org/jira/browse/HDFS-12905 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-12905-HDFS-9806.001.patch, > HDFS-12905-HDFS-9806.002.patch > > > {{ProvidedStorageMap}} doesn't keep track of the state of the datanodes with > Provided storage. As a result, it can return nodes that are being > decommissioned or under-maintenance even when live datanodes exist. This JIRA > is to prefer live datanodes to datanodes in other states. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12905) [READ] Handle decommissioning and under-maintenance Datanodes with Provided storage.
[ https://issues.apache.org/jira/browse/HDFS-12905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282794#comment-16282794 ] Virajith Jalaparti commented on HDFS-12905: --- A 2nd attempt at this following offline discussions with [~elgoiri] - the concern with patch v1 was the changes involved in the {{HeartBeatManager}}. This creates an implicit dependency between the {{ProvidedStorageMap}} and the {{HeartBeatManager}}, and any future changes to the {{HeartBeatManager}} have to reason about how they affect the {{ProvidedStorageMap}}. Patch v2 limits the changes to the {{ProvidedStorageMap}}, which is now responsible for reasoning about Datanodes whose state is not {{AdminStates.NORMAL}}. This increases the complexity in the implementation of the {{ProvidedStorageMap}}. In particular, patch v2 has a potentially higher compute complexity -- to choose a random node, it first looks for nodes that are in the {{AdminStates.NORMAL}} state. If none are found, it then tries to find nodes whose state is not {{AdminStates.NORMAL}}. An alternate approach could be to maintain 2 lists, one for nodes whose state is {{AdminStates.NORMAL}} and one for nodes whose state is not {{AdminStates.NORMAL}}. While this would be more efficient at choosing nodes (it avoids looking at a node twice), it leads to more complex bookkeeping (e.g., using a background thread to move Datanodes between these lists). For now, I think this approach seems reasonable assuming most Datanodes are in the {{AdminStates.NORMAL}} state and few are not. Once HDFS-12848 is completed, this could be made pluggable and other efficient implementations are possible. [~elgoiri], [~chris.douglas] - can you take a look? > [READ] Handle decommissioning and under-maintenance Datanodes with Provided > storage. 
> > > Key: HDFS-12905 > URL: https://issues.apache.org/jira/browse/HDFS-12905 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-12905-HDFS-9806.001.patch, > HDFS-12905-HDFS-9806.002.patch > > > {{ProvidedStorageMap}} doesn't keep track of the state of the datanodes with > Provided storage. As a result, it can return nodes that are being > decommissioned or under-maintenance even when live datanodes exist. This JIRA > is to prefer live datanodes to datanodes in other states. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
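The two-pass selection described in the comment above can be sketched as follows. This is an illustrative, self-contained model, not the actual {{ProvidedStorageMap}} code; the {{Node}}, {{AdminState}}, and {{chooseRandom}} names are hypothetical. It first tries a random datanode in the NORMAL admin state and only falls back to decommissioning or under-maintenance nodes when no NORMAL node exists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical admin states, mirroring the AdminStates discussed above.
enum AdminState { NORMAL, DECOMMISSIONING, IN_MAINTENANCE }

/** Hypothetical, minimal stand-in for a datanode descriptor. */
class Node {
    final String name;
    final AdminState state;
    Node(String name, AdminState state) { this.name = name; this.state = state; }
}

class ProvidedNodeChooser {
    private final Random rand = new Random();

    /**
     * Prefer a random node in the NORMAL state; only when none exists,
     * fall back to nodes in other states. Returns null for an empty list.
     */
    Node chooseRandom(List<Node> nodes) {
        List<Node> normal = new ArrayList<>();
        List<Node> other = new ArrayList<>();
        for (Node n : nodes) {
            // Single pass that partitions the nodes by admin state.
            (n.state == AdminState.NORMAL ? normal : other).add(n);
        }
        List<Node> pool = normal.isEmpty() ? other : normal;
        return pool.isEmpty() ? null : pool.get(rand.nextInt(pool.size()));
    }
}
```

The sketch partitions in one pass rather than scanning twice; the trade-off the comment raises (two maintained lists versus scanning on each call) is about when that partitioning work happens, not whether it happens.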
[jira] [Updated] (HDFS-12905) [READ] Handle decommissioning and under-maintenance Datanodes with Provided storage.
[ https://issues.apache.org/jira/browse/HDFS-12905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12905: -- Attachment: HDFS-12905-HDFS-9806.002.patch > [READ] Handle decommissioning and under-maintenance Datanodes with Provided > storage. > > > Key: HDFS-12905 > URL: https://issues.apache.org/jira/browse/HDFS-12905 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-12905-HDFS-9806.001.patch, > HDFS-12905-HDFS-9806.002.patch > > > {{ProvidedStorageMap}} doesn't keep track of the state of the datanodes with > Provided storage. As a result, it can return nodes that are being > decommissioned or under-maintenance even when live datanodes exist. This JIRA > is to prefer live datanodes to datanodes in other states. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12905) [READ] Handle decommissioning and under-maintenance Datanodes with Provided storage.
[ https://issues.apache.org/jira/browse/HDFS-12905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12905: -- Status: Open (was: Patch Available) > [READ] Handle decommissioning and under-maintenance Datanodes with Provided > storage. > > > Key: HDFS-12905 > URL: https://issues.apache.org/jira/browse/HDFS-12905 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-12905-HDFS-9806.001.patch > > > {{ProvidedStorageMap}} doesn't keep track of the state of the datanodes with > Provided storage. As a result, it can return nodes that are being > decommissioned or under-maintenance even when live datanodes exist. This JIRA > is to prefer live datanodes to datanodes in other states. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11733) TestGetBlocks.getBlocksWithException() ignores datanode and size parameters.
[ https://issues.apache.org/jira/browse/HDFS-11733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282763#comment-16282763 ] Konstantin Shvachko commented on HDFS-11733: Hi [~yuanbo]. Could you please check if the test failures are related to the changes here? > TestGetBlocks.getBlocksWithException() ignores datanode and size parameters. > > > Key: HDFS-11733 > URL: https://issues.apache.org/jira/browse/HDFS-11733 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover, test >Affects Versions: 2.6.1 >Reporter: Konstantin Shvachko >Assignee: Yuanbo Liu > Labels: newbie++ > Attachments: HDFS-11733.001.patch > > > {{TestGetBlocks.getBlocksWithException()}} has 3 parameters, but uses only > one. So whatever callers think they pass in, it is ignored. > Looks like we should change it to use the parameters, but I am not sure how > this will affect the test. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11915) Sync rbw dir on the first hsync() to avoid file lost on power failure
[ https://issues.apache.org/jira/browse/HDFS-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282669#comment-16282669 ] Wei-Chiu Chuang commented on HDFS-11915: Pushed the patch to trunk. Thanks [~vinayrpet] > Sync rbw dir on the first hsync() to avoid file lost on power failure > - > > Key: HDFS-11915 > URL: https://issues.apache.org/jira/browse/HDFS-11915 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kanaka Kumar Avvaru >Assignee: Vinayakumar B >Priority: Critical > Fix For: 3.1.0 > > Attachments: HDFS-11915-01.patch, HDFS-11915-branch-2-01.patch > > > As discussed in HDFS-5042, there is a chance to lose blocks on power failure > if the rbw file creation entry is not yet synced to the device. Then the block created > exists nowhere on disk, neither in rbw nor in finalized. > As suggested by [~kihwal], will discuss and track it in this JIRA. > As suggested by [~vinayrpet], maybe the first hsync() request on a block file can > call fsync on its parent (rbw) directory. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
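The point of the fix is that fsyncing a new file's contents is not enough to survive a power failure: the directory entry for the file must also be made durable by fsyncing the parent (rbw) directory. A stand-alone sketch of fsyncing a directory with NIO, mirroring the pattern of the {{fsyncDirectory()}} helper mentioned below (this is an illustration, not the actual patch):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirSync {
  // Force the directory's metadata (including newly created directory
  // entries) to the storage device. Without this, a freshly created block
  // file can be lost on power failure even if its contents were synced.
  static void fsyncDirectory(Path dir) throws IOException {
    try (FileChannel channel = FileChannel.open(dir, StandardOpenOption.READ)) {
      channel.force(true);
    }
  }
}
```

Opening a directory read-only and calling force() works on Linux; on some platforms it may throw, so production code typically tolerates the failure rather than aborting the write.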
[jira] [Updated] (HDFS-11915) Sync rbw dir on the first hsync() to avoid file lost on power failure
[ https://issues.apache.org/jira/browse/HDFS-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-11915: --- Fix Version/s: 3.1.0 > Sync rbw dir on the first hsync() to avoid file lost on power failure > - > > Key: HDFS-11915 > URL: https://issues.apache.org/jira/browse/HDFS-11915 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kanaka Kumar Avvaru >Assignee: Vinayakumar B >Priority: Critical > Fix For: 3.1.0 > > Attachments: HDFS-11915-01.patch, HDFS-11915-branch-2-01.patch > > > As discussed in HDFS-5042, there is a chance to lose blocks on power failure > if rbw file creation entry is not yet sync to device. Then the block created > is nowhere exists on disk. Neither in rbw nor in finalized. > As suggested by [~kihwal], will discuss and track it in this JIRA. > As suggested by [~vinayrpet], May be first hsync() request on block file can > call fsync on its parent directory (rbw) directory. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12848) [READ] Add a pluggable policy for selecting locations for Provided files.
[ https://issues.apache.org/jira/browse/HDFS-12848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-12848: --- Description: Add a pluggable policy for selecting locations for Provided files. > [READ] Add a pluggable policy for selecting locations for Provided files. > - > > Key: HDFS-12848 > URL: https://issues.apache.org/jira/browse/HDFS-12848 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > > Add a pluggable policy for selecting locations for Provided files. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282731#comment-16282731 ] Xiao Chen commented on HDFS-12907: -- Thanks Andrew for the ping and Rushabh / Daryn for the discussions. Sorry, I did not fully understand the intent here, and probably misunderstood some part of HDFS-12355. Could you help elaborate? My understanding is that after HDFS-12355, webhdfs eventually works with encryption by: - user gets the DTs from hdfs and kms. - user reads / writes a file, auths with HDFS using the DT, gets the file status, then gets redirected to a DN - user passes the DTs along to the DN, where reading/writing the file with the crypto streams happens. - CryptoStreams auth with KMS using the kms DT. The data is then read, decrypted and returned. - user cancels the DT. Is this remotely correct? Why do we need to run the datanode as a separate user? (I think I understood Daryn's comment, and agree it would be another jira. Not 100% sure I see the relation here; are we trying to write raw bytes to the DN and decrypt at the client-side instead of on the DN in HDFS-12355?) > Allow read-only access to reserved raw for non-superusers > - > > Key: HDFS-12907 > URL: https://issues.apache.org/jira/browse/HDFS-12907 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Assignee: Rushabh S Shah > Attachments: HDFS-12907.patch > > > HDFS-6509 added a special /.reserved/raw path prefix to access the raw file > contents of EZ files. In the simplest sense it doesn't return the FE info in > the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data. > This facilitates allowing tools like distcp to copy raw bytes. > Access to the raw hierarchy is restricted to superusers. This seems like an > overly broad restriction designed to prevent non-admins from munging the EZ > related xattrs. I believe we should relax the restriction to allow > non-admins to perform read-only operations. 
Allowing non-superusers to > easily read the raw bytes will be extremely useful for regular users, esp. > for enabling webhdfs client-side encryption. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12875) RBF: Complete logic for -readonly option of dfsrouteradmin add command
[ https://issues.apache.org/jira/browse/HDFS-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282730#comment-16282730 ] Íñigo Goiri commented on HDFS-12875: I need to fix: * TestMountTableResolver * TestRouterMountTable The FindBug is unrelated (already in trunk). > RBF: Complete logic for -readonly option of dfsrouteradmin add command > -- > > Key: HDFS-12875 > URL: https://issues.apache.org/jira/browse/HDFS-12875 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0-alpha3 >Reporter: Yiqun Lin >Assignee: Íñigo Goiri > Labels: RBF > Attachments: HDFS-12875.001.patch, HDFS-12875.002.patch, > HDFS-12875.003.patch > > > The dfsrouteradmin has an option for readonly mount points but this is not > implemented. We should add a special mount point which allows reading but > not writing. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11915) Sync rbw dir on the first hsync() to avoid file lost on power failure
[ https://issues.apache.org/jira/browse/HDFS-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282695#comment-16282695 ] Hudson commented on HDFS-11915: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13343 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13343/]) HDFS-11915. Sync rbw dir on the first hsync() to avoid file lost on (weichiu: rev d6c31a3e6b60c4b8af9ae4661f16614805654e59) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java > Sync rbw dir on the first hsync() to avoid file lost on power failure > - > > Key: HDFS-11915 > URL: https://issues.apache.org/jira/browse/HDFS-11915 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kanaka Kumar Avvaru >Assignee: Vinayakumar B >Priority: Critical > Fix For: 3.1.0 > > Attachments: HDFS-11915-01.patch, HDFS-11915-branch-2-01.patch > > > As discussed in HDFS-5042, there is a chance to lose blocks on power failure > if rbw file creation entry is not yet sync to device. Then the block created > is nowhere exists on disk. Neither in rbw nor in finalized. > As suggested by [~kihwal], will discuss and track it in this JIRA. > As suggested by [~vinayrpet], May be first hsync() request on block file can > call fsync on its parent directory (rbw) directory. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12882) Support full open(PathHandle) contract in HDFS
[ https://issues.apache.org/jira/browse/HDFS-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-12882: - Attachment: HDFS-12882.03.patch This adds a separate flag to {{getFileInfo}} that (eventually) controls whether {{FSDirStatAndListingOp::createFileStatus}} also generates block tokens for its {{LocatedBlocks}}. While a single flag would be sufficient for this issue, if one implemented {{getLocatedFileStatus}} then we'd need a similar flag. This also allows for correct metrics/audit, since this is an {{open}} call only if the client requested block tokens. If one does not request locations, then {{needBlockToken}} has no effect. Alternatively, we could follow the pattern for {{getFileLinkInfo}}, and add {{ClientProtocol::getLocatedFileInfo}} that requests a {{HdfsLocatedFileStatus}} and only includes the block token flag. Internally (as with {{getFileLinkInfo}}) the changes are basically the same. Fewer, existing calls to {{getFileInfo}} (particularly in tests) would require updates. v03 also fixes related unit test failures, checkstyle, and findbugs serialization warnings. > Support full open(PathHandle) contract in HDFS > -- > > Key: HDFS-12882 > URL: https://issues.apache.org/jira/browse/HDFS-12882 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Chris Douglas >Assignee: Chris Douglas > Attachments: HDFS-12882.00.patch, HDFS-12882.00.salient.txt, > HDFS-12882.01.patch, HDFS-12882.02.patch, HDFS-12882.03.patch > > > HDFS-7878 added support for {{open(PathHandle)}}, but it only partially > implemented the semantics specified in the contract (i.e., open-by-inodeID). > HDFS should implement all permutations of the default options for > {{PathHandle}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282676#comment-16282676 ] genericqa commented on HDFS-12907: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 56s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 20s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 50s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 51s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 42s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}154m 38s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.fs.TestUnbuffer | | | hadoop.hdfs.qjournal.server.TestJournalNodeSync | | | hadoop.hdfs.web.TestWebHDFSXAttr | | | hadoop.hdfs.server.namenode.TestFileContextXAttr | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.server.namenode.TestNameNodeXAttr | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HDFS-12907 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901126/HDFS-12907.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 67f1b5e640eb 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / acb9290 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/22316/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html | | unit |
[jira] [Updated] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-12907: -- Status: Patch Available (was: Open) > Allow read-only access to reserved raw for non-superusers > - > > Key: HDFS-12907 > URL: https://issues.apache.org/jira/browse/HDFS-12907 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Assignee: Rushabh S Shah > Attachments: HDFS-12907.patch > > > HDFS-6509 added a special /.reserved/raw path prefix to access the raw file > contents of EZ files. In the simplest sense it doesn't return the FE info in > the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data. > This facilitates allowing tools like distcp to copy raw bytes. > Access to the raw hierarchy is restricted to superusers. This seems like an > overly broad restriction designed to prevent non-admins from munging the EZ > related xattrs. I believe we should relax the restriction to allow > non-admins to perform read-only operations. Allowing non-superusers to > easily read the raw bytes will be extremely useful for regular users, esp. > for enabling webhdfs client-side encryption. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11915) Sync rbw dir on the first hsync() to avoid file lost on power failure
[ https://issues.apache.org/jira/browse/HDFS-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282648#comment-16282648 ] Wei-Chiu Chuang commented on HDFS-11915: The patch for trunk is good +1. Will commit the patch later. The branch-2 patch needs some work. It duplicates the fsyncDirectory() method added in HDFS-5042. > Sync rbw dir on the first hsync() to avoid file lost on power failure > - > > Key: HDFS-11915 > URL: https://issues.apache.org/jira/browse/HDFS-11915 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kanaka Kumar Avvaru >Assignee: Vinayakumar B >Priority: Critical > Attachments: HDFS-11915-01.patch, HDFS-11915-branch-2-01.patch > > > As discussed in HDFS-5042, there is a chance to lose blocks on power failure > if rbw file creation entry is not yet sync to device. Then the block created > is nowhere exists on disk. Neither in rbw nor in finalized. > As suggested by [~kihwal], will discuss and track it in this JIRA. > As suggested by [~vinayrpet], May be first hsync() request on block file can > call fsync on its parent directory (rbw) directory. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12875) RBF: Complete logic for -readonly option of dfsrouteradmin add command
[ https://issues.apache.org/jira/browse/HDFS-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282641#comment-16282641 ] genericqa commented on HDFS-12875: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 44s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 51s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 43s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 35s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}127m 0s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}172m 28s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 | | | hadoop.hdfs.TestDFSPermission | | | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA | | | hadoop.hdfs.TestDFSUpgradeFromImage | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.TestErasureCodingPolicies | | | hadoop.hdfs.client.impl.TestBlockReaderLocal | | | hadoop.hdfs.TestSafeModeWithStripedFile | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | | | hadoop.hdfs.TestReadStripedFileWithMissingBlocks | | | hadoop.hdfs.TestErasureCodingMultipleRacks | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.TestDFSStorageStateRecovery | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030 | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100 | | | hadoop.hdfs.server.federation.resolver.TestMountTableResolver | | | hadoop.hdfs.server.federation.router.TestRouterMountTable | | | hadoop.hdfs.TestErasureCodingPoliciesWithRandomECPolicy | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure050 | | | hadoop.hdfs.TestFileCreationDelete | | |
[jira] [Commented] (HDFS-12818) Support multiple storages in DataNodeCluster / SimulatedFSDataset
[ https://issues.apache.org/jira/browse/HDFS-12818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282607#comment-16282607 ] genericqa commented on HDFS-12818: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 53s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 35s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 206 unchanged - 8 fixed = 207 total (was 214) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 12s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}142m 13s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}189m 29s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.TestFileCreation | | | hadoop.hdfs.server.namenode.ha.TestDNFencing | | | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | | | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerWithStripedBlocks | | | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM | | | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination | | | hadoop.hdfs.server.namenode.TestNameNodeMXBean | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS | | | hadoop.hdfs.tools.TestDFSAdminWithHA | | | hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer | | | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 | | | hadoop.hdfs.server.datanode.TestDataNodeInitStorage | | |
[jira] [Updated] (HDFS-12848) [READ] Add a pluggable policy for selecting locations for Provided files.
[ https://issues.apache.org/jira/browse/HDFS-12848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12848: -- Parent Issue: HDFS-12090 (was: HDFS-9806) > [READ] Add a pluggable policy for selecting locations for Provided files. > - > > Key: HDFS-12848 > URL: https://issues.apache.org/jira/browse/HDFS-12848 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12903) [READ] Fix closing streams in ImageWriter
[ https://issues.apache.org/jira/browse/HDFS-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12903: -- Resolution: Fixed Status: Resolved (was: Patch Available) > [READ] Fix closing streams in ImageWriter > - > > Key: HDFS-12903 > URL: https://issues.apache.org/jira/browse/HDFS-12903 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri > Attachments: HDFS-12903-HDFS-9806.001.patch > > > HDFS-12894 showed a FindBug in HDFS-9806. This seems related to HDFS-12881 > when using {{IOUtils.cleanupWithLogger()}}.
[jira] [Assigned] (HDFS-12903) [READ] Fix closing streams in ImageWriter
[ https://issues.apache.org/jira/browse/HDFS-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti reassigned HDFS-12903: - Assignee: Virajith Jalaparti
[jira] [Updated] (HDFS-12874) [READ] Documentation for provided storage
[ https://issues.apache.org/jira/browse/HDFS-12874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12874: -- Attachment: HDFS-12874-HDFS-9806.01.patch Modified documentation attached (configuration options specified more in xml). > [READ] Documentation for provided storage > - > > Key: HDFS-12874 > URL: https://issues.apache.org/jira/browse/HDFS-12874 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chris Douglas > Attachments: HDFS-12874-HDFS-9806.00.patch, > HDFS-12874-HDFS-9806.01.patch > > > The configuration and deployment of provided storage should be documented for > end-users.
[jira] [Commented] (HDFS-12903) [READ] Fix closing streams in ImageWriter
[ https://issues.apache.org/jira/browse/HDFS-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282620#comment-16282620 ] Virajith Jalaparti commented on HDFS-12903: --- Thanks [~elgoiri] and [~chris.douglas]. Committed this to feature branch.
[jira] [Commented] (HDFS-12890) Ozone: XceiverClient should have upper bound on async requests
[ https://issues.apache.org/jira/browse/HDFS-12890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282586#comment-16282586 ] genericqa commented on HDFS-12890: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} HDFS-7240 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 37s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 19s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 53s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 3s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 15s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 11s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 3s{color} | {color:green} HDFS-7240 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 45s{color} | {color:orange} hadoop-hdfs-project: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 44s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 3s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 49s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}155m 56s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}242m 25s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.ozone.web.client.TestKeysRatis | | | hadoop.hdfs.server.namenode.TestFSEditLogLoader | | | hadoop.hdfs.server.namenode.TestQuotaByStorageType | | | hadoop.hdfs.server.namenode.TestFsck | | | hadoop.ozone.client.rpc.TestOzoneRpcClient | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.server.namenode.TestLeaseManager | | | hadoop.hdfs.server.namenode.TestReencryptionWithKMS | | | hadoop.hdfs.server.namenode.TestEditLog |
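The review discussion on this JIRA is about capping outstanding async writes with a semaphore and, on an exception, completing every enqueued future exceptionally while returning the permits. Below is a JDK-only sketch of that pattern; the class and method names are hypothetical and do not match the actual XceiverClient code.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Sketch of a client that bounds outstanding async requests with a Semaphore,
// tracks pending replies in a map, and on a channel error fails every pending
// future and releases each permit. Illustrative only.
public class BoundedAsyncClient {
  private final Semaphore permits;
  private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

  public BoundedAsyncClient(int maxOutstanding) {
    this.permits = new Semaphore(maxOutstanding);
  }

  /** Blocks until a permit is free, then registers a pending reply. */
  public CompletableFuture<String> send(long requestId) {
    permits.acquireUninterruptibly();
    CompletableFuture<String> f = new CompletableFuture<>();
    pending.put(requestId, f);
    return f;
  }

  /** Normal completion: finish the future and return its permit. */
  public void onReply(long requestId, String reply) {
    CompletableFuture<String> f = pending.remove(requestId);
    if (f != null) {
      f.complete(reply);
      permits.release();
    }
  }

  /** Channel error: fail all enqueued futures and release each permit. */
  public void onError(Throwable cause) {
    for (Long id : pending.keySet()) {
      CompletableFuture<String> f = pending.remove(id);
      if (f != null) {
        f.completeExceptionally(cause);
        permits.release();
      }
    }
  }

  public int availablePermits() {
    return permits.availablePermits();
  }

  public static void main(String[] args) {
    BoundedAsyncClient client = new BoundedAsyncClient(2);
    CompletableFuture<String> a = client.send(1L);
    CompletableFuture<String> b = client.send(2L);
    client.onError(new IOException("pipeline closed"));
    System.out.println(a.isCompletedExceptionally() && b.isCompletedExceptionally()); // true
    System.out.println(client.availablePermits()); // 2
  }
}
```

Without the cleanup in `onError`, permits leak and callers eventually block forever on `send`, which is the failure mode the review comment is guarding against.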
[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282583#comment-16282583 ] Rushabh S Shah commented on HDFS-12907: --- HDFS-12355 is tracking all the efforts for adding EZ support to WebhdfsFileSystem.
[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282579#comment-16282579 ] Andrew Wang commented on HDFS-12907: Solving this for WebHdfsFileSystem is a lot more tractable than Hue, so this makes sense to me. FYI [~xiaochen] also, since he's a KMS expert and was also involved in the internal discussions.
[jira] [Commented] (HDFS-12903) [READ] Fix closing streams in ImageWriter
[ https://issues.apache.org/jira/browse/HDFS-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282501#comment-16282501 ] Íñigo Goiri commented on HDFS-12903: Yetus doesn't show the FindBug even in HDFS-9806. I think this fix should be it. +1
[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282487#comment-16282487 ] Andrew Wang commented on HDFS-12907: I think that would work, though of course I'd prefer not to open up internal state representation if it can be avoided. On the topic of webhdfs client-side encryption, could you talk a little more about your usecase? We discussed this internally before in the context of Hue, and there didn't seem to be a great solution. They have a very simple Python WebHDFS client built around effectively curl, and they'd need to add their own KMS client and encryption routines. Really though, we'd want to move this all the way to the browser, and write the KMS client and encryption routines in Javascript. Ouch. A way of scoping the KMS delegation token to limit what keys could be accessed would also be an improvement, e.g. a "key token" similar to the HDFS block token. It addresses some of the issues with webhdfs and encryption.
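The proposal in this JIRA hinges on the `/.reserved/raw` prefix: the same file reached through the raw twin of its path returns raw bytes, and non-superusers would be limited to read-only operations on it. The following is an illustrative JDK-only sketch of that rule; the real enforcement lives in the NameNode's permission checking, not in a helper class like this.

```java
// Illustrative sketch of the access rule proposed here for the raw hierarchy:
// superusers keep full access, non-superusers get read-only operations.
// Names are hypothetical; this is not NameNode code.
public class RawPathDemo {
  static final String RAW_PREFIX = "/.reserved/raw";

  static boolean isRawPath(String path) {
    return path.equals(RAW_PREFIX) || path.startsWith(RAW_PREFIX + "/");
  }

  /** Maps a normal path to its raw twin, e.g. /ez/f -> /.reserved/raw/ez/f. */
  static String toRawPath(String path) {
    return isRawPath(path) ? path : RAW_PREFIX + path;
  }

  /** Relaxed restriction sketched in this JIRA. */
  static boolean isRawOpAllowed(String path, boolean superuser, boolean writeOp) {
    if (!isRawPath(path)) {
      return true; // not in the raw hierarchy, normal permission checks apply
    }
    return superuser || !writeOp;
  }

  public static void main(String[] args) {
    System.out.println(toRawPath("/ez/file"));                                  // /.reserved/raw/ez/file
    System.out.println(isRawOpAllowed("/.reserved/raw/ez/file", false, false)); // true
    System.out.println(isRawOpAllowed("/.reserved/raw/ez/file", false, true));  // false
  }
}
```

Reading through the raw twin would return ciphertext without FE info, which is exactly what a curl-style WebHDFS client (as in the Hue discussion above) needs before adding its own KMS client and decryption.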
[jira] [Commented] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282441#comment-16282441 ] Andrew Wang commented on HDFS-7240: --- Hi Sanjay, Thanks for writing up that summary. It's clear there's still disagreement on the merge. How should we proceed on reaching consensus? On the last call you suggested making a document, or we could do another call. > Object store in HDFS > > > Key: HDFS-7240 > URL: https://issues.apache.org/jira/browse/HDFS-7240 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jitendra Nath Pandey >Assignee: Jitendra Nath Pandey > Attachments: HDFS Scalability and Ozone.pdf, HDFS-7240.001.patch, > HDFS-7240.002.patch, HDFS-7240.003.patch, HDFS-7240.003.patch, > HDFS-7240.004.patch, HDFS-7240.005.patch, HDFS-7240.006.patch, > MeetingMinutes.pdf, Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, > ozone_user_v0.pdf > > > This jira proposes to add object store capabilities into HDFS. > As part of the federation work (HDFS-1052) we separated block storage as a > generic storage layer. Using the Block Pool abstraction, new kinds of > namespaces can be built on top of the storage layer i.e. datanodes. > In this jira I will explore building an object store using the datanode > storage, but independent of namespace metadata. > I will soon update with a detailed design document.
[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282451#comment-16282451 ] Daryn Sharp commented on HDFS-12907: bq. It's so they don't accidentally write data without xattrs or with the wrong xattrs, which would be essentially corrupt. This morning, Rushabh and I discussed/debated non-superuser write access, and my position too is that non-superusers should not have access to raw attrs like feinfo. I see no use case except allowing users to (un)intentionally lose access to their data. Except... I am inclined to believe that create should have been extended to allow an optional feinfo. NN verifies the key name is correct, output stream doesn't encrypt. Or creating a raw path requires a mandatory feinfo. Lots of messy backwards-compat issues to consider, which is why it's a topic for another jira.
[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode
[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282419#comment-16282419 ] Daryn Sharp commented on HDFS-10285: bq. Im coming at this from the standpoint of supporting Cloudera's Hadoop customers. \[…\] the average Hadoop admin who wants this to be turnkey If a cluster is a magical turnkey, it’s just an implementation detail whether it’s a monolithic service or collection of managed services. I understand the support burden of telling customers “run this, run that”, but isn’t that a deficiency of Ambari, Cloudera Manager, etc? Here's a rhetorical question: If managing multiple services is hard, why not bundle oozie, spark, storm, sqoop, kafka, ranger, knox, hive server, etc in the same process? Or ZK so HA is easier to deploy/manage? bq. For a large, sophisticated Hadoop user like Yahoo, it may not be a big cost to deploy a new service, but in relative terms a much bigger cost for a small user. Cluster upgrades are formal. It can take weeks or months to reach critical production clusters. If a scale level bug is found near the end of the runway, it’s hard to short-circuit the restarting and rescheduling the entire runway. On the other hand, an adjunct service like the kms has an extremely short runway. The balancer, being generally non-critical, has a lot of leeway and when necessary can be tinkered on w/o a deployment. Tech support asking a user to start a process costs less? Anyway, fast review. *Locking* Yesterday, I was going to say I'm not overly worried with locking other than correctness and doesn't impact whether it should be in or out of the NN. Today, I looked at the code more closely. It can hold the lock (read lock, but still) way too long. Notably, but not limited to, _you can’t hold the lock while doing block placement_. Being in the NN makes it too easy to abuse the lock in subtle ways. *Memory* Bounded queues are not a panacea for memory concerns. 
I’m more concerned with GC issues. Throttling via queues is going to result in promotion to oldgen where collection is much more expensive. The memory estimate is narrowly focused and assumes a 32-bit jvm. It omits all the ancillary heavyweight data structures, futures, etc. *CPU* Yesterday, not too worried based on misconception that very little locking is occurring. Today, I see there’s an incredible amount of computation occurring which often appears to be within the fsn lock. There’s a lot of garbage generation which invisibly saps cpu too. *Other* bq. This feature is switched OFF by default and no impact to HDFS. I should start sending bills to everyone who makes this fraudulent claim. :). {{FSDirectory#addToInodeMap}} imposes a nontrivial performance penalty even when SPS is not enabled. We had to hack out the similar EZ check because it had a noticeable performance impact esp. on startup. However now that we support EZ, I need to revisit optimizing it. There’s likely more performance hits if I looked harder at where it’s spliced in. bq. NN has existing feature EDEK which also does scanning and we reuses the same code in SPS. Yes, and I’m not very happy about that feature’s implementation but it was jammed in. –– I’m torn on this issue. I think the HSM experience is lackluster and needs to be improved. I haven’t looked at the Mover so no idea how well it works or doesn’t work. If it works ok, then perhaps it should have an rpc service to poke something at the front of the queue for those that don’t want to wait like hbase. If it’s an internal service, I’d rather it work in a dumbed-down background fashion. Otherwise it’s going to be a real problem as it becomes too smart and bloated. I’m curious why it isn’t just part of the standard replication monitoring. If the DN is told to replicate to itself, it just does the storage movement. 
> Storage Policy Satisfier in Namenode > > > Key: HDFS-10285 > URL: https://issues.apache.org/jira/browse/HDFS-10285 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: HDFS-10285 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G > Attachments: HDFS-10285-consolidated-merge-patch-00.patch, > HDFS-10285-consolidated-merge-patch-01.patch, > HDFS-10285-consolidated-merge-patch-02.patch, > HDFS-10285-consolidated-merge-patch-03.patch, > HDFS-SPS-TestReport-20170708.pdf, > Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, > Storage-Policy-Satisfier-in-HDFS-May10.pdf, > Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf > > > Heterogeneous storage in HDFS introduced the concept of storage policy. These > policies can be set on directory/file to specify the user preference, where > to store the physical block. When
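The locking concern raised above ("you can't hold the lock while doing block placement") comes down to scoping: copy the minimal state under the namesystem lock, then run the expensive computation after releasing it. A generic JDK-only illustration follows; it is a hypothetical pattern sketch, not SPS or NameNode code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Generic illustration of the lock-scoping concern: snapshot shared state
// under the lock, then do expensive work (e.g. placement computation)
// outside it. Names are hypothetical.
public class LockScopeDemo {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final List<String> blocks = new ArrayList<>();

  void addBlock(String b) {
    lock.writeLock().lock();
    try {
      blocks.add(b);
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Bad shape: the lock is held for the whole expensive computation. */
  int placeWhileLocked() {
    lock.readLock().lock();
    try {
      return expensivePlacement(blocks);
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Better shape: snapshot under the lock, compute after releasing it. */
  int placeOutsideLock() {
    List<String> snapshot;
    lock.readLock().lock();
    try {
      snapshot = new ArrayList<>(blocks);
    } finally {
      lock.readLock().unlock();
    }
    return expensivePlacement(snapshot); // lock is free while this runs
  }

  private static int expensivePlacement(List<String> bs) {
    // stand-in for choosing target storages for each block
    return bs.size();
  }

  public static void main(String[] args) {
    LockScopeDemo d = new LockScopeDemo();
    d.addBlock("blk_1");
    d.addBlock("blk_2");
    System.out.println(d.placeOutsideLock()); // 2
  }
}
```

Even a read lock held through placement delays writers; the snapshot variant trades a small copy for not stalling the namesystem, which is the trade-off the review is asking for.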
[jira] [Updated] (HDFS-12875) RBF: Complete logic for -readonly option of dfsrouteradmin add command
[ https://issues.apache.org/jira/browse/HDFS-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-12875: --- Description: The dfsrouteradmin has an option for readonly mount points but this is not implemented. We should add a special mount point which allows reading but not writing. (was: Currently the option -readonly of command {{dfsrouteradmin -add}} doesn't make any sense.The desired behavior is that read-only mount table that be set in add command cannot be removed.) > RBF: Complete logic for -readonly option of dfsrouteradmin add command > -- > > Key: HDFS-12875 > URL: https://issues.apache.org/jira/browse/HDFS-12875 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0-alpha3 >Reporter: Yiqun Lin >Assignee: Íñigo Goiri > Labels: RBF > Attachments: HDFS-12875.001.patch, HDFS-12875.002.patch, > HDFS-12875.003.patch > > > The dfsrouteradmin has an option for readonly mount points but this is not > implemented. We should add a special mount point which allows reading but > not writing.
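The updated description asks for a mount point that serves reads but rejects writes. A minimal JDK-only sketch of that check follows; the class and method names are hypothetical and do not match the Router's actual mount-table code.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the behavior the -readonly option should enable: operations on a
// read-only mount point succeed for reads and are rejected for writes.
// Illustrative only; not the RBF mount-table implementation.
public class ReadOnlyMountDemo {
  private final Map<String, Boolean> mountReadOnly = new HashMap<>();

  void addMount(String src, boolean readOnly) {
    mountReadOnly.put(src, readOnly);
  }

  /** Returns true if the operation may proceed through this mount point. */
  boolean checkOperation(String path, boolean isWrite) {
    for (Map.Entry<String, Boolean> e : mountReadOnly.entrySet()) {
      if (path.startsWith(e.getKey()) && e.getValue() && isWrite) {
        return false; // read-only mount: reject mutations
      }
    }
    return true;
  }

  public static void main(String[] args) {
    ReadOnlyMountDemo table = new ReadOnlyMountDemo();
    table.addMount("/readonly", true);
    System.out.println(table.checkOperation("/readonly/data", false)); // true
    System.out.println(table.checkOperation("/readonly/data", true));  // false
  }
}
```

In the real Router the rejection would surface as an `AccessControlException`-style failure on create/delete/rename rather than a boolean, but the gating decision is the same.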
[jira] [Commented] (HDFS-12840) Creating a file with non-default EC policy in a EC zone is not correctly serialized in the editlog
[ https://issues.apache.org/jira/browse/HDFS-12840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282421#comment-16282421 ] Hudson commented on HDFS-12840: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13341 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13341/]) HDFS-12840. Creating a file with non-default EC policy in a EC zone is (lei: rev 67662e2ac9e68f32b725c8118cf2be79a662fca5) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored.xml * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystemWithECFile.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeRetryCache.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRetryCacheWithHA.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/OfflineEditsViewerHelper.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/ErasureCodeConstants.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirWriteFileOp.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored > Creating a file with non-default EC policy in a EC zone is not correctly > serialized in the editlog > -- > > Key: HDFS-12840 > URL: 
https://issues.apache.org/jira/browse/HDFS-12840 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding >Affects Versions: 3.0.0-beta1 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu >Priority: Blocker > Labels: hdfs-ec-3.0-must-do > Fix For: 3.0.0 > > Attachments: HDFS-12840.00.patch, HDFS-12840.01.patch, > HDFS-12840.02.patch, HDFS-12840.03.patch, HDFS-12840.04.patch, > HDFS-12840.05.patch, HDFS-12840.reprod.patch, editsStored, editsStored, > editsStored.03, editsStored.05 > > > When create a replicated file in an existing EC zone, the edit logs does not > differentiate it from an EC file. When {{FSEditLogLoader}} to replay edits, > this file is treated as EC file, as a results, it crashes the NN because the > blocks of this file are replicated, which does not match with {{INode}}. > {noformat} > ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered > exception on operation AddBlockOp [path=/system/balancer.id, > penultimateBlock=NULL, lastBlock=blk_1073743259_2455, RpcClientId=, > RpcCallId=-2] > java.lang.IllegalArgumentException: reportedBlock is not striped > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.addStorage(BlockInfoStriped.java:118) > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.addBlock(DatanodeStorageInfo.java:256) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:3141) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlockUnderConstruction(BlockManager.java:3068) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:3864) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processQueuedMessages(BlockManager.java:2916) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processQueuedMessagesForBlock(BlockManager.java:2903) > 
at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.addNewBlock(FSEditLogLoader.java:1069) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:532) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863) > at >
[jira] [Updated] (HDFS-12875) RBF: Complete logic for -readonly option of dfsrouteradmin add command
[ https://issues.apache.org/jira/browse/HDFS-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-12875: --- Attachment: HDFS-12875.003.patch
[jira] [Updated] (HDFS-12840) Creating a file with non-default EC policy in a EC zone is not correctly serialized in the editlog
[ https://issues.apache.org/jira/browse/HDFS-12840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-12840: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0 Release Note: Add ErasureCodingPolicyId to each OP_ADD edit log op. Status: Resolved (was: Patch Available) Thanks [~Sammi] and [~rakesh_r] for the reviews! Some of the checkstyle and findbugs warnings are false positives; fixed the remaining warnings in {{DFSTestUtil.java}}. The test failures listed above will pass after applying the new {{editsStored}}. Committed to trunk and branch-3.0 / 3.0.0.
[jira] [Updated] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-12907: -- Attachment: HDFS-12907.patch Attaching a simple patch. > Allow read-only access to reserved raw for non-superusers > - > > Key: HDFS-12907 > URL: https://issues.apache.org/jira/browse/HDFS-12907 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Assignee: Rushabh S Shah > Attachments: HDFS-12907.patch > > > HDFS-6509 added a special /.reserved/raw path prefix to access the raw file > contents of EZ files. In the simplest sense it doesn't return the FE info in > the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data. > This facilitates allowing tools like distcp to copy raw bytes. > Access to the raw hierarchy is restricted to superusers. This seems like an > overly broad restriction designed to prevent non-admins from munging the EZ > related xattrs. I believe we should relax the restriction to allow > non-admins to perform read-only operations. Allowing non-superusers to > easily read the raw bytes will be extremely useful for regular users, esp. > for enabling webhdfs client-side encryption. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12903) [READ] Fix closing streams in ImageWriter
[ https://issues.apache.org/jira/browse/HDFS-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282355#comment-16282355 ] Chris Douglas commented on HDFS-12903: -- Duh, misread the Yetus output. This did fix the warning. +1 > [READ] Fix closing streams in ImageWriter > - > > Key: HDFS-12903 > URL: https://issues.apache.org/jira/browse/HDFS-12903 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri > Attachments: HDFS-12903-HDFS-9806.001.patch > > > HDFS-12894 showed a FindBug in HDFS-9806. This seems related to HDFS-12881 > when using {{IOUtils.cleanupWithLogger()}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12751) Ozone: SCM: update container allocated size to container db for all the open containers in ContainerStateManager#close
[ https://issues.apache.org/jira/browse/HDFS-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282358#comment-16282358 ] Chen Liang commented on HDFS-12751: --- Thanks [~nandakumar131] for the follow-up! bq. We do not update allocatedBytes here But it looks to me like it does update allocated bytes, as in the following code from {{ContainerStateManager#updateContainerState}}. The {{info}} variable is the new container info passed in from {{updateContainerState}}. {code} ContainerInfo containerInfo = new ContainerInfo.Builder() .setContainerName(info.getContainerName()) .setState(newState) .setPipeline(info.getPipeline()) .setAllocatedBytes(info.getAllocatedBytes()) .setUsedBytes(info.getUsedBytes()) .setNumberOfKeys(info.getNumberOfKeys()) .setStateEnterTime(Time.monotonicNow()) .setOwner(info.getOwner()) .build(); {code} bq. Not exactly. Whenever we allocate block... I did miss this part earlier, thanks for pointing it out! But it appears to me that this is what BlockManager perceives as the bytes that might get allocated, not necessarily the actual allocated bytes on the container; e.g. the client may terminate before talking to the container. The allocatedBytes we get from the block report, on the other hand, seems to be the more accurate number: precisely the number of bytes the container sees when sending the report. And the block report is the trigger of the {{updateContainerState}} code path. Namely, I feel that persisting the number we get from {{updateContainerState}} is already the better approach. Additionally, I'm under the impression that {{ContainerMapping}} is the class that interacts with the container store, while {{ContainerStateManager}} is purely an in-memory state representation that (currently) does not read/write to the container meta store at all. It seems to me that we should keep this abstraction, keep {{ContainerStateManager}} away from container.db, and only let {{ContainerMapping}} do the container metadata management. 
So although we are indeed missing the {{allocatedSize}} from the allocateBlock code path, I would prefer to leave this part as-is. What do you think? Nonetheless, {{containerStateManager.close()}} not being called does look like a bug; I will upload a patch later. > Ozone: SCM: update container allocated size to container db for all the open > containers in ContainerStateManager#close > -- > > Key: HDFS-12751 > URL: https://issues.apache.org/jira/browse/HDFS-12751 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Nanda kumar >Assignee: Chen Liang > > Container allocated size is maintained in memory by > {{ContainerStateManager}}; this has to be updated in the container db when we > shut down SCM. {{ContainerStateManager#close}} will be called during SCM > shutdown, so updating the allocated size for all the open containers should be > done there.
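The separation argued for above, an in-memory {{ContainerStateManager}} that tracks allocated bytes plus a {{ContainerMapping}} layer that alone writes to container.db on shutdown, can be sketched roughly as follows. All class and method names here are hypothetical stand-ins for illustration, not the actual SCM code:

```java
import java.util.HashMap;
import java.util.Map;

// In-memory state only: no knowledge of the container store.
class InMemoryContainerState {
    private final Map<String, Long> allocatedBytes = new HashMap<>();

    void allocate(String container, long bytes) {
        allocatedBytes.merge(container, bytes, Long::sum);
    }

    Map<String, Long> snapshotOpenContainers() {
        return new HashMap<>(allocatedBytes);
    }
}

// Only this layer talks to the (stand-in) container.db.
class ContainerMappingSketch implements AutoCloseable {
    private final InMemoryContainerState state = new InMemoryContainerState();
    private final Map<String, Long> containerDb = new HashMap<>(); // stand-in for container.db

    void allocateBlock(String container, long bytes) {
        state.allocate(container, bytes);
    }

    @Override
    public void close() {
        // On SCM shutdown, persist the in-memory allocated sizes for all
        // open containers so they survive a restart.
        containerDb.putAll(state.snapshotOpenContainers());
    }

    Long persisted(String container) {
        return containerDb.get(container);
    }
}
```

The point of the sketch is the ownership boundary: the state manager never sees the store, and the mapping layer flushes the snapshot exactly once, at close.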
[jira] [Updated] (HDFS-12818) Support multiple storages in DataNodeCluster / SimulatedFSDataset
[ https://issues.apache.org/jira/browse/HDFS-12818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-12818: --- Attachment: HDFS-12818.002.patch > Support multiple storages in DataNodeCluster / SimulatedFSDataset > - > > Key: HDFS-12818 > URL: https://issues.apache.org/jira/browse/HDFS-12818 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, test >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Minor > Attachments: HDFS-12818.000.patch, HDFS-12818.001.patch, > HDFS-12818.002.patch > > > Currently {{SimulatedFSDataset}} (and thus, {{DataNodeCluster}} with > {{-simulated}}) only supports a single storage per {{DataNode}}. Given that > the number of storages can have important implications on the performance of > block report processing, it would be useful for these classes to support a > multiple storage configuration. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12818) Support multiple storages in DataNodeCluster / SimulatedFSDataset
[ https://issues.apache.org/jira/browse/HDFS-12818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282322#comment-16282322 ] Erik Krogen commented on HDFS-12818: Thanks for looking [~shv]! {{getStorage()}} can never return null: {code} private SimulatedStorage getStorage(Block b) { return storages.get(LongMath.mod(b.getBlockId(), storages.size())); } {code} This will always return one of the values contained within {{storages}}, which does not contain any null values. {{getBlockMap(b, bpid)}} also cannot return null; it simply passes through to {{SimulatedStorage#getBlockMap(bpid)}}: {code} Map<Block, BInfo> getBlockMap(String bpid) throws IOException { SimulatedBPStorage bpStorage = map.get(bpid); if (bpStorage == null) { throw new IOException("Nonexistent block pool: " + bpid); } return bpStorage.getBlockMap(); } {code} {{SimulatedBPStorage#getBlockMap()}} returns a {{final}} non-null field, so we are good to go throughout. I updated the Javadoc comments to make this clearer; attaching v002 patch. I believe the one new checkstyle issue is an over-long line from a static import in the test; there is nothing I can do about it, which is why I did not fix it between v000 and v001. > Support multiple storages in DataNodeCluster / SimulatedFSDataset > - > > Key: HDFS-12818 > URL: https://issues.apache.org/jira/browse/HDFS-12818 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, test >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Minor > Attachments: HDFS-12818.000.patch, HDFS-12818.001.patch, > HDFS-12818.002.patch > > > Currently {{SimulatedFSDataset}} (and thus, {{DataNodeCluster}} with > {{-simulated}}) only supports a single storage per {{DataNode}}. Given that > the number of storages can have important implications on the performance of > block report processing, it would be useful for these classes to support a > multiple storage configuration. 
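The null-safety argument above comes down to mapping a block ID onto a valid storage index with a nonnegative modulus, so the lookup always lands inside the list. A minimal stand-alone sketch of that idea (using the stdlib {{Math.floorMod}} in place of Guava's {{LongMath.mod}}; the class here is illustrative, not the actual {{SimulatedFSDataset}}):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a nonnegative modulus of the block ID always yields a valid
// index into the storage list, so getStorage can never return null.
class StorageMapper {
    private final List<String> storages = new ArrayList<>();

    StorageMapper(int numStorages) {
        for (int i = 0; i < numStorages; i++) {
            storages.add("storage-" + i);
        }
    }

    String getStorage(long blockId) {
        // floorMod is nonnegative even for negative block IDs,
        // unlike the % operator.
        int idx = (int) Math.floorMod(blockId, (long) storages.size());
        return storages.get(idx);
    }
}
```

Note that a plain {{%}} would break here: block IDs can be negative, and a negative remainder would throw {{IndexOutOfBoundsException}}.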
[jira] [Commented] (HDFS-12510) RBF: Add security to UI
[ https://issues.apache.org/jira/browse/HDFS-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282220#comment-16282220 ] Ajay Kumar commented on HDFS-12510: --- [~elgoiri],[~ywskycn] Thanks for the information. > RBF: Add security to UI > --- > > Key: HDFS-12510 > URL: https://issues.apache.org/jira/browse/HDFS-12510 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri > Labels: RBF > > HDFS-12273 implemented the UI for Router Based Federation without security. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282250#comment-16282250 ] Andrew Wang commented on HDFS-12907: It's so they don't accidentally write data without xattrs or with the wrong xattrs, which would be essentially corrupt. We also don't want plaintext getting written accidentally. The NameNode does a fair amount of work at create time to provision an EDEK for the file. This logic is beyond the ability of most clients. > Allow read-only access to reserved raw for non-superusers > - > > Key: HDFS-12907 > URL: https://issues.apache.org/jira/browse/HDFS-12907 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Assignee: Rushabh S Shah > > HDFS-6509 added a special /.reserved/raw path prefix to access the raw file > contents of EZ files. In the simplest sense it doesn't return the FE info in > the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data. > This facilitates allowing tools like distcp to copy raw bytes. > Access to the raw hierarchy is restricted to superusers. This seems like an > overly broad restriction designed to prevent non-admins from munging the EZ > related xattrs. I believe we should relax the restriction to allow > non-admins to perform read-only operations. Allowing non-superusers to > easily read the raw bytes will be extremely useful for regular users, esp. > for enabling webhdfs client-side encryption. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12825) Fsck report shows config key name for min replication issues
[ https://issues.apache.org/jira/browse/HDFS-12825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HDFS-12825: -- Labels: newbie (was: incompatibleChange newbie) Hadoop Flags: Incompatible change > Fsck report shows config key name for min replication issues > > > Key: HDFS-12825 > URL: https://issues.apache.org/jira/browse/HDFS-12825 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.0.0-alpha1 >Reporter: Harshakiran Reddy >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Attachments: HDFS-12825.001.patch, error.JPG > > > Scenario: > Corrupt the Block in any datanode > Take the *FSCK *Report for that file. > Actual Output: > == > printing the direct configuration in fsck report > {{dfs.namenode.replication.min}} > Expected Output: > > it should be {{MINIMAL BLOCK REPLICATION}} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12825) After Block Corrupted, FSCK Report printing the Direct configuration.
[ https://issues.apache.org/jira/browse/HDFS-12825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282239#comment-16282239 ] Manoj Govindassamy commented on HDFS-12825: --- Patch looks good to me. +1. Thanks for working on this [~gabor.bota] and thanks for reporting [~Harsha1206], [~usharani]. [~gabor.bota], I would prefer labelling this jira as Incompatible change since it changes the fsck output format. > After Block Corrupted, FSCK Report printing the Direct configuration. > --- > > Key: HDFS-12825 > URL: https://issues.apache.org/jira/browse/HDFS-12825 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.0.0-alpha1 >Reporter: Harshakiran Reddy >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Attachments: HDFS-12825.001.patch, error.JPG > > > Scenario: > Corrupt the Block in any datanode > Take the *FSCK *Report for that file. > Actual Output: > == > printing the direct configuration in fsck report > {{dfs.namenode.replication.min}} > Expected Output: > > it should be {{MINIMAL BLOCK REPLICATION}} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12890) Ozone: XceiverClient should have upper bound on async requests
[ https://issues.apache.org/jira/browse/HDFS-12890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee updated HDFS-12890: --- Attachment: HDFS-12890-HDFS-7240.003.patch Thanks [~msingh], [~anu] for the review comments. [~anu], thanks for pointing out the deadlock scenario. As per our discussion, in patch v3, the semaphore ref count is dropped in the error path as well. The Raft client has a default upper bound of 100 on the number of async requests. Hence, I am keeping it at 100 for Standalone as well. Please have a look. > Ozone: XceiverClient should have upper bound on async requests > -- > > Key: HDFS-12890 > URL: https://issues.apache.org/jira/browse/HDFS-12890 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: HDFS-7240 >Affects Versions: HDFS-7240 >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee > Fix For: HDFS-7240 > > Attachments: HDFS-12890-HDFS-7240.001.patch, > HDFS-12890-HDFS-7240.002.patch, HDFS-12890-HDFS-7240.003.patch > > > XceiverClient-ratis maintains an upper bound on the number of outstanding async > requests. XceiverClient > should also impose an upper bound on the number of outstanding async requests > received from the client > for write.
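The scheme discussed in this thread, one semaphore permit per outstanding async request, released on the success and error paths alike so the deadlock scenario cannot occur, can be sketched as follows. The class and method names are illustrative, not the real XceiverClient API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;

// Sketch of bounding outstanding async requests with a semaphore.
class BoundedAsyncClient {
    // e.g. 100, matching the Raft client default mentioned above.
    private final Semaphore outstanding;

    BoundedAsyncClient(int maxOutstanding) {
        this.outstanding = new Semaphore(maxOutstanding);
    }

    CompletableFuture<String> sendAsync(String request) {
        outstanding.acquireUninterruptibly(); // blocks once the bound is hit
        // Release on BOTH completion paths; forgetting the error path
        // leaks permits and eventually stalls the client forever.
        return dispatch(request).whenComplete((reply, err) -> outstanding.release());
    }

    // Stand-in for the real network dispatch.
    private CompletableFuture<String> dispatch(String request) {
        return CompletableFuture.supplyAsync(() -> "ack:" + request);
    }

    int availablePermits() {
        return outstanding.availablePermits();
    }
}
```

Returning the {{whenComplete}}-derived future guarantees the permit is back before any caller observes completion, which makes the accounting easy to reason about.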
[jira] [Commented] (HDFS-12510) RBF: Add security to UI
[ https://issues.apache.org/jira/browse/HDFS-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282162#comment-16282162 ] Wei Yan commented on HDFS-12510: [~ajayydv] I'll post a patch for HDFS-12512 soon. We'll have more details about the security after that. > RBF: Add security to UI > --- > > Key: HDFS-12510 > URL: https://issues.apache.org/jira/browse/HDFS-12510 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri > Labels: RBF > > HDFS-12273 implemented the UI for Router Based Federation without security. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9668) Optimize the locking in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282112#comment-16282112 ] genericqa commented on HDFS-9668: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} HDFS-9668 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-9668 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12841233/HDFS-9668-26.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/22312/console | | Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Optimize the locking in FsDatasetImpl > - > > Key: HDFS-9668 > URL: https://issues.apache.org/jira/browse/HDFS-9668 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Jingcheng Du >Assignee: Jingcheng Du > Attachments: HDFS-9668-1.patch, HDFS-9668-10.patch, > HDFS-9668-11.patch, HDFS-9668-12.patch, HDFS-9668-13.patch, > HDFS-9668-14.patch, HDFS-9668-14.patch, HDFS-9668-15.patch, > HDFS-9668-16.patch, HDFS-9668-17.patch, HDFS-9668-18.patch, > HDFS-9668-19.patch, HDFS-9668-19.patch, HDFS-9668-2.patch, > HDFS-9668-20.patch, HDFS-9668-21.patch, HDFS-9668-22.patch, > HDFS-9668-23.patch, HDFS-9668-23.patch, HDFS-9668-24.patch, > HDFS-9668-25.patch, HDFS-9668-26.patch, HDFS-9668-3.patch, HDFS-9668-4.patch, > HDFS-9668-5.patch, HDFS-9668-6.patch, HDFS-9668-7.patch, HDFS-9668-8.patch, > HDFS-9668-9.patch, execution_time.png > > > During the HBase test on a tiered storage of HDFS (WAL is stored in > SSD/RAMDISK, and all other files are stored in HDD), we observe many 
> long-time BLOCKED threads on FsDatasetImpl in DataNode. The following is part > of the jstack result: > {noformat} > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48521 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread > t@93336 >java.lang.Thread.State: BLOCKED > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:) > - waiting to lock <18324c9> (a > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335 > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:183) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - None > > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread > t@93335 >java.lang.Thread.State: RUNNABLE > at java.io.UnixFileSystem.createFileExclusively(Native Method) > at java.io.File.createNewFile(File.java:1012) > at > org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271) > at > 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140) > - locked <18324c9> (a > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:183) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) >
[jira] [Commented] (HDFS-6804) race condition between transferring block and appending block causes "Unexpected checksum mismatch exception"
[ https://issues.apache.org/jira/browse/HDFS-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282096#comment-16282096 ] Wei-Chiu Chuang commented on HDFS-6804: --- Sorry for coming to it late. I wasn't aware of the last comment. There's a minor conflict in the patch. You could also use try () {} syntax to create FSDataOutputStream and that takes care of closing the output stream. But that's more of a personal taste. Other than that the last patch looks good to me. > race condition between transferring block and appending block causes > "Unexpected checksum mismatch exception" > -- > > Key: HDFS-6804 > URL: https://issues.apache.org/jira/browse/HDFS-6804 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.2.0 >Reporter: Gordon Wang >Assignee: Brahma Reddy Battula > Attachments: HDFS-6804-branch-2.8.patch, > Testcase_append_transfer_block.patch > > > We found some error log in the datanode. like this > {noformat} > 2014-07-22 01:49:51,338 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Ex > ception for BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 > java.io.IOException: Terminating due to a checksum error.java.io.IOException: > Unexpected checksum mismatch while writing > BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 from > /192.168.2.101:39495 > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:536) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:703) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:575) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:115) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221) > at java.lang.Thread.run(Thread.java:744) > {noformat} > While on the source 
datanode, the log says the block is transmitted. > {noformat} > 2014-07-22 01:49:50,805 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Da > taTransfer: Transmitted > BP-2072804351-192.168.2.104-1406008383435:blk_1073741997 > _9248 (numBytes=16188152) to /192.168.2.103:50010 > {noformat} > When the destination datanode gets the checksum mismatch, it reports bad > block to NameNode and NameNode marks the replica on the source datanode as > corrupt. But actually, the replica on the source datanode is valid. Because > the replica can pass the checksum verification. > In all, the replica on the source data is wrongly marked as corrupted. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
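The try-with-resources syntax suggested in the review above closes the stream automatically on every path, including exceptions. A minimal stand-alone illustration, using a plain in-memory {{OutputStream}} rather than Hadoop's {{FSDataOutputStream}} so it runs without a cluster:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class TryWithResourcesDemo {
    // The try () {} block closes the stream automatically, even if
    // write() throws, which is what the review suggestion buys.
    static byte[] writeAndClose(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = sink) {
            out.write(data);
        } catch (IOException e) {
            // Unreachable for an in-memory sink; kept to satisfy the
            // OutputStream contract.
            throw new RuntimeException(e);
        }
        return sink.toByteArray();
    }
}
```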
[jira] [Commented] (HDFS-9668) Optimize the locking in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282090#comment-16282090 ] Wei-Chiu Chuang commented on HDFS-9668: --- I am reviewing the patch. It looks almost ready. Would you like to rebase the latest patch? There are a number of conflicts against the trunk. > Optimize the locking in FsDatasetImpl > - > > Key: HDFS-9668 > URL: https://issues.apache.org/jira/browse/HDFS-9668 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Jingcheng Du >Assignee: Jingcheng Du > Attachments: HDFS-9668-1.patch, HDFS-9668-10.patch, > HDFS-9668-11.patch, HDFS-9668-12.patch, HDFS-9668-13.patch, > HDFS-9668-14.patch, HDFS-9668-14.patch, HDFS-9668-15.patch, > HDFS-9668-16.patch, HDFS-9668-17.patch, HDFS-9668-18.patch, > HDFS-9668-19.patch, HDFS-9668-19.patch, HDFS-9668-2.patch, > HDFS-9668-20.patch, HDFS-9668-21.patch, HDFS-9668-22.patch, > HDFS-9668-23.patch, HDFS-9668-23.patch, HDFS-9668-24.patch, > HDFS-9668-25.patch, HDFS-9668-26.patch, HDFS-9668-3.patch, HDFS-9668-4.patch, > HDFS-9668-5.patch, HDFS-9668-6.patch, HDFS-9668-7.patch, HDFS-9668-8.patch, > HDFS-9668-9.patch, execution_time.png > > > During the HBase test on a tiered storage of HDFS (WAL is stored in > SSD/RAMDISK, and all other files are stored in HDD), we observe many > long-time BLOCKED threads on FsDatasetImpl in DataNode. 
The following is part > of the jstack result: > {noformat} > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48521 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread > t@93336 >java.lang.Thread.State: BLOCKED > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:) > - waiting to lock <18324c9> (a > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335 > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:183) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - None > > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread > t@93335 >java.lang.Thread.State: RUNNABLE > at java.io.UnixFileSystem.createFileExclusively(Native Method) > at java.io.File.createNewFile(File.java:1012) > at > org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286) > at > 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140) > - locked <18324c9> (a > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:183) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - None > {noformat} > We measured the execution of some operations in FsDatasetImpl during the > test. Here following is the result. > !execution_time.png! > The operations of finalizeBlock, addBlock and createRbw on HDD in a heavy > load take a really long time. > It means one slow operation of finalizeBlock, addBlock and createRbw in a > slow storage can block all the other same
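One common remedy for the coarse monitor visible in these stack traces is to perform the slow on-disk file creation outside the shared lock and hold the lock only for the short in-memory map update. A hedged sketch of that pattern under simplified assumptions (hypothetical names; the actual FsDatasetImpl patch is considerably more involved):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: slow per-volume work (file creation on HDD) happens outside
// the shared dataset lock, so a slow disk no longer blocks writers on
// other volumes the way the jstack above shows.
class DatasetSketch {
    private final ReentrantReadWriteLock datasetLock = new ReentrantReadWriteLock();
    private final Map<Long, String> volumeMap = new ConcurrentHashMap<>();

    String createRbw(long blockId, String volume) {
        // Slow I/O (simulated here) done WITHOUT holding the dataset lock.
        String replicaFile = simulateCreateFileOnDisk(blockId, volume);

        // Short critical section: just the in-memory map update.
        datasetLock.writeLock().lock();
        try {
            volumeMap.put(blockId, replicaFile);
        } finally {
            datasetLock.writeLock().unlock();
        }
        return replicaFile;
    }

    private String simulateCreateFileOnDisk(long blockId, String volume) {
        return volume + "/rbw/blk_" + blockId;
    }

    String lookup(long blockId) {
        datasetLock.readLock().lock();
        try {
            return volumeMap.get(blockId);
        } finally {
            datasetLock.readLock().unlock();
        }
    }
}
```

The read-write lock additionally lets concurrent readers proceed without contending with each other, which a plain synchronized monitor does not.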
[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers
[ https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281948#comment-16281948 ] Rushabh S Shah commented on HDFS-12907: --- bq. Allowing non-superusers to easily read the raw bytes will be extremely useful for regular users, esp. for enabling webhdfs client-side encryption. I am wondering why only read access? If the user has access to write in the encrypted directory, we should not block them from accessing the /.reserved/raw/ directory structure. Any thoughts? > Allow read-only access to reserved raw for non-superusers > - > > Key: HDFS-12907 > URL: https://issues.apache.org/jira/browse/HDFS-12907 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Assignee: Rushabh S Shah > > HDFS-6509 added a special /.reserved/raw path prefix to access the raw file > contents of EZ files. In the simplest sense it doesn't return the FE info in > the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data. > This facilitates allowing tools like distcp to copy raw bytes. > Access to the raw hierarchy is restricted to superusers. This seems like an > overly broad restriction designed to prevent non-admins from munging the EZ > related xattrs. I believe we should relax the restriction to allow > non-admins to perform read-only operations. Allowing non-superusers to > easily read the raw bytes will be extremely useful for regular users, esp. > for enabling webhdfs client-side encryption.
[jira] [Comment Edited] (HDFS-10285) Storage Policy Satisfier in Namenode
[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281503#comment-16281503 ] Rakesh R edited comment on HDFS-10285 at 12/7/17 1:17 PM: -- Thanks a lot [~anu] for your time and comments. bq. This is the most critical concern that I have. In one of the discussions with SPS developers, they pointed out to me that they want to make sure an SPS move happens within a reasonable time. Apparently, I was told that this is a requirement from HBase. If you have such a need, then the first thing an admin will do is to increase this queue size. Slowly, but steadily SPS will eat into more and more memory of Namenode Increasing the Namenode Q will not help to speed up the block movements. It is the Datanode that does the actual block movements, and one needs to tune Datanode bandwidth to speed up the block movements. Hence there is no sense in increasing the Namenode Q. In fact, that will simply add to the pending tasks on the Namenode side. Let me try to put down the memory usage of the Namenode Q: Assume there are 1 million directories and users invoked the {{dfs#satisfyStoragePolicy(path)}} API on these directories, which is a huge data movement and may not be a regular case. Again, assume that without knowing the advantage of increasing the Q size, some unpleasant user sets the Q size to a higher value of 1,000,000. Each API call will add an {{Xattr}} to represent the pending movement, and the NN maintains a list of pending dir InodeIds to satisfy the policy, each of which is a {{Long}} value. Each Xattr takes 15 chars {{"system.hdfs.sps"}} for the marking (Note: the branch code uses {{system.hdfs.satisfy.storage.policy}}; we will shorten it to {{system.hdfs.sps}}). With that, the total space occupied is (xattr + inodeId) size. *(1) Xattr entry* Xattr: 12bytes(Object overhead) + 4bytes(String reference) + 4bytes(byte array) = aligned 24bytes. String "system.hdfs.sps": 40bytes(String Object) + 15bytes(chars) = 56bytes.
It's not creating a new String("system.hdfs.sps") object every time, so ideally the 56-byte count need not be counted every time. Still, I'm including it. byte[]: 4bytes 84 bytes = (aligned 88bytes * 1,000,000) = 83.923MB Whether we keep SPS outside or inside the Namenode, this much memory space will be occupied, as the xattribute is used to mark the pending item. *(2) Namenode Q* LinkedList entry = 24bytes Long object = 12bytes(Object overhead) + 8bytes = aligned 24bytes -- 48bytes * 1,000,000 = 45.78MB -- 46MB approx, which I feel is a small percentage, and this may occur only in the misconfigured scenario where many {{InodeIds}} are queued up. The default Q size will be recommended as 1000 or even 10,000 = 48bytes * 10,000 = 468.75KB = 469KB, to keep the memory usage very low. Again, open to changing the default value (increase/decrease) based on feedback. Please feel free to correct me if I missed anything. Thanks! bq. We have an existing pattern Balancer, Mover, DiskBalancer where we have the "scan and move tools" as an external feature to namenode. I am not able to see any convincing reason for breaking this pattern. - {{Scanning}} - For scanning, CPU is the most consumed resource. IIUC, from your previous comments, I'm glad that you agreed that CPU is not an issue. Hence scanning is not a concern. If we run SPS outside, it has to make additional RPC calls for the SPS work, and on a switchover the SPS-HA service has to blindly scan the entire namespace to figure out the xattrs. Now, for handling the switching scenarios, we would have to come up with some kind of awkward tweaking logic, like writing the xattr somewhere in a file so that the new active SPS service can read it from there and continue. With this, I would prefer to keep the scanning logic in the NN. FYI, the NN has an existing feature, EDEK, which also does scanning, and we reuse the same code in SPS. Also, I'm re-iterating the point that SPS does not scan files on its own; the user has to call the API to satisfy a particular file.
- {{Moving blocks}} - This is about assigning the responsibility to the Datanode. Presently, the Namenode has several pieces of logic that do block movement - ReplicationMonitor, EC-Reconstruction, Decommissioning etc. We have added a throttling mechanism for the SPS block movements as well, so as not to affect the existing data movements. - AFAIK, DiskBalancer runs completely on the Datanode and looks like a Datanode utility, so I don't think it compares with SPS. Coming to the Balancer, it doesn't need any input file paths and balances the HDFS cluster based on utilization. The Balancer can run independently as it doesn't
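The back-of-envelope numbers in the comment above can be reproduced with a short sketch. This follows the comment's own per-field size estimates and assumes standard 8-byte JVM object alignment; it is an estimate, not a measurement:

```python
def align8(n):
    # JVM rounds object sizes up to a multiple of 8 bytes
    return (n + 7) // 8 * 8

# (1) Per-directory Xattr footprint, per the comment's accounting
xattr_obj = align8(12 + 4 + 4)                  # header + String ref + byte[] ref -> 24
name_str = align8(40 + len("system.hdfs.sps"))  # String object + 15 chars -> 56
per_xattr = align8(xattr_obj + name_str + 4)    # + 4 bytes for the value, 84 -> 88

# (2) Per-entry cost of the Namenode queue: LinkedList node + boxed Long inode id
per_queue_entry = 24 + align8(12 + 8)           # 24 + 24 = 48 bytes

print(per_xattr * 1_000_000 / 2**20)        # ~83.92 MB of xattrs for 1M pending dirs
print(per_queue_entry * 1_000_000 / 2**20)  # ~45.78 MB for a misconfigured 1M-entry Q
print(per_queue_entry * 10_000 / 1024)      # 468.75 KB at the recommended 10,000 default
```

This makes the argument concrete: the xattr cost (~84 MB) is paid wherever SPS lives, while the queue itself is small (~469 KB) at the recommended default and only grows into tens of MB under a misconfigured 1,000,000-entry limit.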
[jira] [Comment Edited] (HDFS-12618) fsck -includeSnapshots reports wrong amount of total blocks
[ https://issues.apache.org/jira/browse/HDFS-12618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281811#comment-16281811 ] Wellington Chevreuil edited comment on HDFS-12618 at 12/7/17 1:01 PM: -- bq. How is FSDirectory.getINodesInPath(filePath, FSDirectory.DirOp.READ) broken? I wouldn't say it's broken, as the "filePath" being passed to it in this case does not actually exist. When dealing with appended or truncated files, the original file path may still exist (if the file has never been deleted), but the file version in the snapshot folders may have different blocks. That adds the need to check if the original file path still exists outside of snapshots. That's the reason behind the *dir.getINodesInPath(inodeFile.getName(),FSDirectory.DirOp.READ).validate()*, as *inodeFile.getName()* returns the original file path. So if the file has been deleted, *inodeFile.getName()* actually refers to an invalid path; that's why *dir.getINodesInPath(inodeFile.getName(),FSDirectory.DirOp.READ).validate()* throws the assertion error. An alternative that I thought of then was to go through the INodes array from this IIP, comparing it with *inodeFile.getName()*, since usage of *validate()* is discouraged. was (Author: wchevreuil): bq. How is FSDirectory.getINodesInPath(filePath, FSDirectory.DirOp.READ) broken? I wouldn't say it's broken, as the "filePath" being passed to it in this case does not actually exist. When dealing with appended or truncated files, the original file path may still exist (if the file has never been deleted), but the file version in the snapshot folders may have different blocks. That adds the need to check if the original file path still exists outside of snapshots. That's the reason behind the *dir.getINodesInPath(inodeFile.getName(),FSDirectory.DirOp.READ).validate();*, as *inodeFile.getName()* returns the original file path.
So if the file has been deleted, *inodeFile.getName()* actually refers to an invalid path; that's why *dir.getINodesInPath(inodeFile.getName(),FSDirectory.DirOp.READ).validate();* throws the assertion error. An alternative that I thought of then was to go through the INodes array from this IIP, comparing it with *inodeFile.getName()*, since usage of *validate()* is discouraged. > fsck -includeSnapshots reports wrong amount of total blocks > --- > > Key: HDFS-12618 > URL: https://issues.apache.org/jira/browse/HDFS-12618 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 3.0.0-alpha3 >Reporter: Wellington Chevreuil >Assignee: Wellington Chevreuil >Priority: Minor > Attachments: HDFS-121618.initial, HDFS-12618.001.patch, > HDFS-12618.002.patch, HDFS-12618.003.patch, HDFS-12618.004.patch > > > When snapshot is enabled, if a file is deleted but is contained by a > snapshot, *fsck* will not report blocks for such a file, showing a different > number of *total blocks* than what is exposed in the Web UI. > This should be fine, as *fsck* provides the *-includeSnapshots* option. The > problem is that the *-includeSnapshots* option causes *fsck* to count blocks for > every occurrence of a file on snapshots, which is wrong because these blocks > should be counted only once (for instance, if a 100MB file is present on 3 > snapshots, it would still map to one block only in hdfs). This causes fsck to > report many more blocks than actually exist in hdfs and are reported in > the Web UI.
> Here's an example: > 1) HDFS has two files of 2 blocks each: > {noformat} > $ hdfs dfs -ls -R / > drwxr-xr-x - root supergroup 0 2017-10-07 21:21 /snap-test > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:16 /snap-test/file1 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:17 /snap-test/file2 > drwxr-xr-x - root supergroup 0 2017-05-13 13:03 /test > {noformat} > 2) There are two snapshots, with the two files present on each of the > snapshots: > {noformat} > $ hdfs dfs -ls -R /snap-test/.snapshot > drwxr-xr-x - root supergroup 0 2017-10-07 21:21 > /snap-test/.snapshot/snap1 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:16 > /snap-test/.snapshot/snap1/file1 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:17 > /snap-test/.snapshot/snap1/file2 > drwxr-xr-x - root supergroup 0 2017-10-07 21:21 > /snap-test/.snapshot/snap2 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:16 > /snap-test/.snapshot/snap2/file1 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:17 > /snap-test/.snapshot/snap2/file2 > {noformat} > 3) *fsck -includeSnapshots* reports 12 blocks in total (4 blocks for the > normal file path, plus 4 blocks for each snapshot path): > {noformat} > $ hdfs fsck /
[jira] [Comment Edited] (HDFS-12618) fsck -includeSnapshots reports wrong amount of total blocks
[ https://issues.apache.org/jira/browse/HDFS-12618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281811#comment-16281811 ] Wellington Chevreuil edited comment on HDFS-12618 at 12/7/17 1:01 PM: -- bq. How is FSDirectory.getINodesInPath(filePath, FSDirectory.DirOp.READ) broken? I wouldn't say it's broken, as the "filePath" being passed to it in this case does not actually exist. When dealing with appended or truncated files, the original file path may still exist (if the file has never been deleted), but the file version on the snapshot folders may have different blocks. That adds the need to check if the original file path still exists out of snapshots. Thats the reason behind the *dir.getINodesInPath(inodeFile.getName(),FSDirectory.DirOp.READ).validate();*, as *inodeFile.getName()* returns the original file path. So if the file has been deleted, *inodeFile.getName()* actually refers to an invalid path, that's why *dir.getINodesInPath(inodeFile.getName(),FSDirectory.DirOp.READ).validate();* throws the assertion error. An alternative that I thought then was to go through the INodes array from this IIP, comparing it with the *inodeFile.getName()*, since usage of *validate()* is discouraged. was (Author: wchevreuil): bq. How is FSDirectory.getINodesInPath(filePath, FSDirectory.DirOp.READ) broken? I wouldn't say it's broken, as the "filePath" being passed to it in this case does not actually exist. When dealing with appended or truncated files, the original file path may still exist (if the file has never been deleted), but the file version on the snapshot folders may have different blocks. That adds the need to check if the original file path still exists out of snapshots. Thats the reason behind the *dir.getINodesInPath(inodeFile.getName(),FSDirectory.DirOp.READ).validate();*, as *inodeFile.getName()* returns the original file path. 
So if the file has been deleted, *inodeFile.getName()* actually refers to an invalid path, that's why *dir.getINodesInPath(inodeFile.getName(),FSDirectory.DirOp.READ).validate();*. An alternative that I thought then was to go through the INodes array from this IIP, comparing it with the *inodeFile.getName()*, since usage of *validate()* is discouraged. > fsck -includeSnapshots reports wrong amount of total blocks > --- > > Key: HDFS-12618 > URL: https://issues.apache.org/jira/browse/HDFS-12618 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 3.0.0-alpha3 >Reporter: Wellington Chevreuil >Assignee: Wellington Chevreuil >Priority: Minor > Attachments: HDFS-121618.initial, HDFS-12618.001.patch, > HDFS-12618.002.patch, HDFS-12618.003.patch, HDFS-12618.004.patch > > > When snapshot is enabled, if a file is deleted but is contained by a > snapshot, *fsck* will not reported blocks for such file, showing different > number of *total blocks* than what is exposed in the Web UI. > This should be fine, as *fsck* provides *-includeSnapshots* option. The > problem is that *-includeSnapshots* option causes *fsck* to count blocks for > every occurrence of a file on snapshots, which is wrong because these blocks > should be counted only once (for instance, if a 100MB file is present on 3 > snapshots, it would still map to one block only in hdfs). This causes fsck to > report much more blocks than what actually exist in hdfs and is reported in > the Web UI. 
> Here's an example: > 1) HDFS has two files of 2 blocks each: > {noformat} > $ hdfs dfs -ls -R / > drwxr-xr-x - root supergroup 0 2017-10-07 21:21 /snap-test > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:16 /snap-test/file1 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:17 /snap-test/file2 > drwxr-xr-x - root supergroup 0 2017-05-13 13:03 /test > {noformat} > 2) There are two snapshots, with the two files present on each of the > snapshots: > {noformat} > $ hdfs dfs -ls -R /snap-test/.snapshot > drwxr-xr-x - root supergroup 0 2017-10-07 21:21 > /snap-test/.snapshot/snap1 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:16 > /snap-test/.snapshot/snap1/file1 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:17 > /snap-test/.snapshot/snap1/file2 > drwxr-xr-x - root supergroup 0 2017-10-07 21:21 > /snap-test/.snapshot/snap2 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:16 > /snap-test/.snapshot/snap2/file1 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:17 > /snap-test/.snapshot/snap2/file2 > {noformat} > 3) *fsck -includeSnapshots* reports 12 blocks in total (4 blocks for the > normal file path, plus 4 blocks for each snapshot path): > {noformat} > $ hdfs fsck / -includeSnapshots > FSCK started by
[jira] [Commented] (HDFS-12618) fsck -includeSnapshots reports wrong amount of total blocks
[ https://issues.apache.org/jira/browse/HDFS-12618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281811#comment-16281811 ] Wellington Chevreuil commented on HDFS-12618: - bq. How is FSDirectory.getINodesInPath(filePath, FSDirectory.DirOp.READ) broken? I wouldn't say it's broken, as the "filePath" being passed to it in this case does not actually exist. When dealing with appended or truncated files, the original file path may still exist (if the file has never been deleted), but the file version on the snapshot folders may have different blocks. That adds the need to check if the original file path still exists out of snapshots. Thats the reason behind the *dir.getINodesInPath(inodeFile.getName(),FSDirectory.DirOp.READ).validate();*, as *inodeFile.getName()* returns the original file path. So if the file has been deleted, *inodeFile.getName()* actually refers to an invalid path, that's why *dir.getINodesInPath(inodeFile.getName(),FSDirectory.DirOp.READ).validate();*. An alternative that I thought then was to go through the INodes array from this IIP, comparing it with the *inodeFile.getName()*, since usage of *validate()* is discouraged. > fsck -includeSnapshots reports wrong amount of total blocks > --- > > Key: HDFS-12618 > URL: https://issues.apache.org/jira/browse/HDFS-12618 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 3.0.0-alpha3 >Reporter: Wellington Chevreuil >Assignee: Wellington Chevreuil >Priority: Minor > Attachments: HDFS-121618.initial, HDFS-12618.001.patch, > HDFS-12618.002.patch, HDFS-12618.003.patch, HDFS-12618.004.patch > > > When snapshot is enabled, if a file is deleted but is contained by a > snapshot, *fsck* will not reported blocks for such file, showing different > number of *total blocks* than what is exposed in the Web UI. > This should be fine, as *fsck* provides *-includeSnapshots* option. 
The > problem is that *-includeSnapshots* option causes *fsck* to count blocks for > every occurrence of a file on snapshots, which is wrong because these blocks > should be counted only once (for instance, if a 100MB file is present on 3 > snapshots, it would still map to one block only in hdfs). This causes fsck to > report much more blocks than what actually exist in hdfs and is reported in > the Web UI. > Here's an example: > 1) HDFS has two files of 2 blocks each: > {noformat} > $ hdfs dfs -ls -R / > drwxr-xr-x - root supergroup 0 2017-10-07 21:21 /snap-test > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:16 /snap-test/file1 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:17 /snap-test/file2 > drwxr-xr-x - root supergroup 0 2017-05-13 13:03 /test > {noformat} > 2) There are two snapshots, with the two files present on each of the > snapshots: > {noformat} > $ hdfs dfs -ls -R /snap-test/.snapshot > drwxr-xr-x - root supergroup 0 2017-10-07 21:21 > /snap-test/.snapshot/snap1 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:16 > /snap-test/.snapshot/snap1/file1 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:17 > /snap-test/.snapshot/snap1/file2 > drwxr-xr-x - root supergroup 0 2017-10-07 21:21 > /snap-test/.snapshot/snap2 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:16 > /snap-test/.snapshot/snap2/file1 > -rw-r--r-- 1 root supergroup 209715200 2017-10-07 20:17 > /snap-test/.snapshot/snap2/file2 > {noformat} > 3) *fsck -includeSnapshots* reports 12 blocks in total (4 blocks for the > normal file path, plus 4 blocks for each snapshot path): > {noformat} > $ hdfs fsck / -includeSnapshots > FSCK started by root (auth:SIMPLE) from /127.0.0.1 for path / at Mon Oct 09 > 15:15:36 BST 2017 > Status: HEALTHY > Number of data-nodes:1 > Number of racks: 1 > Total dirs: 6 > Total symlinks: 0 > Replicated Blocks: > Total size: 1258291200 B > Total files: 6 > Total blocks (validated):12 (avg. 
block size 104857600 B) > Minimally replicated blocks: 12 (100.0 %) > Over-replicated blocks: 0 (0.0 %) > Under-replicated blocks: 0 (0.0 %) > Mis-replicated blocks: 0 (0.0 %) > Default replication factor: 1 > Average block replication: 1.0 > Missing blocks: 0 > Corrupt blocks: 0 > Missing replicas:0 (0.0 %) > {noformat} > 4) Web UI shows the correct number (4 blocks only): > {noformat} > Security is off. > Safemode is off. > 5 files and directories, 4 blocks = 9 total filesystem object(s). > {noformat} > I would like to work on this solution, will propose an initial solution > shortly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
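The discrepancy in the example above (fsck's 12 total blocks vs the Web UI's 4) comes down to counting blocks per path instead of per unique block. A toy sketch of the deduplication, using made-up block ids that mirror the two-file, two-snapshot layout (not the actual NamenodeFsck code):

```python
# Each path maps to the block ids backing it. Snapshot paths share the same
# underlying blocks as the live files, which is what inflates the naive count.
files = {
    "/snap-test/file1": ["blk_1", "blk_2"],
    "/snap-test/file2": ["blk_3", "blk_4"],
    "/snap-test/.snapshot/snap1/file1": ["blk_1", "blk_2"],
    "/snap-test/.snapshot/snap1/file2": ["blk_3", "blk_4"],
    "/snap-test/.snapshot/snap2/file1": ["blk_1", "blk_2"],
    "/snap-test/.snapshot/snap2/file2": ["blk_3", "blk_4"],
}

# Naive per-path count: what fsck -includeSnapshots reports (12)
naive_total = sum(len(blocks) for blocks in files.values())

# Deduplicated by block id: what the Web UI reports (4)
deduped_total = len({blk for blocks in files.values() for blk in blocks})

print(naive_total, deduped_total)  # 12 4
```

Tracking a set of already-seen block ids while walking snapshot paths is one straightforward way for fsck to report the deduplicated number.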
[jira] [Comment Edited] (HDFS-10285) Storage Policy Satisfier in Namenode
[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281503#comment-16281503 ] Rakesh R edited comment on HDFS-10285 at 12/7/17 11:26 AM: --- Thanks a lot [~anu] for your time and comments. bq. This is the most critical concern that I have. In one of the discussions with SPS developers, they pointed out to me that they want to make sure an SPS move happens within a reasonable time. Apparently, I was told that this is a requirement from HBase. If you have such a need, then the first thing an admin will do is to increase this queue size. Slowly, but steadily SPS will eat into more and more memory of Namenode Increasing Namenode Q will not help to speedup the block movements. It is the Datanode who does actual block movements and need to tune Datanode bandwidth to speedup the block movements. Hence there is no sense in increasing Namenode Q. Infact, that will simply add up the pending tasks at the Namenode side. Let me try putting the memory usage of Namenode Q: Assume there are 1 million directories and users invoked {{dfs#satisfyStoragePolicy(path)}} API on these directories, which is a huge data movement and it may not be a regular case. Again, assume without knowing the advantage of increasing the Q size if some unpleasant user set the Q size to a higher value 1,000,000. Each API call, will add an {{Xattr}} to represent the pending movement and NN maintains list of pending dir InodeId to satisfy the policy, which is {{Long}} value. Each Xattr takes 15 chars {{"system.hdfs.sps"}} for the marking(Note: in the branch code it uses {{system.hdfs.satisfy.storage.policy}}, we will shorten the no. of chars to {{system.hdfs.sps}}). With that, the total space occupy is (xattr + inodeId) size. *(1) Xattr entry* Xattr: 12bytes(Object overhead) + 4bytes(String reference) + 4bytes(byte array) = 24 String "system.hdfs.sps": 40bytes(String Object) + 15bytes(chars) = 56bytes. 
It's not creating a new String object every time, so ideally the 56-byte count need not be counted every time. Still, I'm considering this. byte[]: 4bytes 84 bytes = (aligned 88bytes * 1,000,000) = 83.923MB If we keep SPS outside or inside Namenode, this much memory space will be occupied as the xattribute is used to mark the pending item. *(2) Namenode Q* LinkedList entry = 24bytes Long object = 12bytes(Object overhead) + 8bytes = aligned 24bytes -- 48bytes * 1,000,000 = 45.78MB -- 46MB approx, which I feel is a smaller percentage and this may occur in the misconfigured scenario where many {{InodeIds}} are queued up. Default Q size value will be recommended as 10,000 = 48bytes * 10,000 = 468.75KB. = 469KB. Please feel free to correct me if I missed anything. Thanks! bq. We have an existing pattern Balancer, Mover, DiskBalancer where we have the "scan and move tools" as an external feature to namenode. I am not able to see any convincing reason for breaking this pattern. - {{Scanning}} - For scanning, CPU is the most consumed resource. IIUC, from your previous comments, I'm glad that you agreed that CPU is not an issue. Hence scanning is not a concern. If we run SPS outside, it has to make additional RPC calls for the SPS work and again switching of the SPS-HA service has to blindly scan the entire namespace to figure out the xattrs. Now, for handling the switching scenarios, we have to come up with some kind of awkward tweaking logic like writing the xattr somewhere in a file so that the new active SPS service can read it from there and continue. With this, I would prefer to keep the scanning logic in the NN. FYI, the NN has an existing feature, EDEK, which also does scanning, and we reuse the same code in SPS. Also, I'm re-iterating the point that SPS does not scan files on its own; the user has to call the API to satisfy a particular file. - {{Moving blocks}} - This is about assigning the responsibility to the Datanode.
Presently, Namenode has several logic which does block movement - ReplicationMonitor, EC-Reconstruction, Decommissioning etc. We have added throttling mechanism for the sps block movements also, not to affect the existing data movements. - AFAIK, DiskBalancer is completely run at the Datanode and it looks like Datanode utility. I don't think to compare it with SPS. Coming to the Balancer, which doesn't need any input file paths and it does balancing HDFS cluster based on the utilization. Balancer can run independently as it doesn't take any input file path argument and user may not be waiting to finish the balancing work, whereas SPS is exposed to the user via HSM feature. HSM is completely
[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.
[ https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281696#comment-16281696 ] He Xiaoqiao commented on HDFS-10453: [~Octivian] The new patch is ready and updated based on what you mentioned above, FYI. > ReplicationMonitor thread could stuck for long time due to the race between > replication and delete of same file in a large cluster. > --- > > Key: HDFS-10453 > URL: https://issues.apache.org/jira/browse/HDFS-10453 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4 >Reporter: He Xiaoqiao > Attachments: HDFS-10453-branch-2.001.patch, > HDFS-10453-branch-2.003.patch, HDFS-10453-branch-2.7.004.patch, > HDFS-10453.001.patch > > > The ReplicationMonitor thread could get stuck for a long time and lose data with low > probability. Consider the typical scenario: > (1) create and close a file with the default replicas(3); > (2) increase replication (to 10) of the file. > (3) delete the file while ReplicationMonitor is scheduling blocks belonging to > that file for replication. > if the ReplicationMonitor stall reappears, NameNode will print logs such as: > {code:xml} > 2016-04-19 10:20:48,083 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to > place enough replicas, still in need of 7 to reach 10 > (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, > newBlock=false) For more information, please enable DEBUG log level on > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy > ..
> 2016-04-19 10:21:17,184 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to > place enough replicas, still in need of 7 to reach 10 > (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, > newBlock=false) For more information, please enable DEBUG log level on > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy > 2016-04-19 10:21:17,184 WARN > org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough > replicas: expected size is 7 but only 0 storage types can be selected > (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, > DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}) > 2016-04-19 10:21:17,184 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to > place enough replicas, still in need of 7 to reach 10 > (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, > newBlock=false) All required storage types are unavailable: > unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]} > {code} > This is because 2 threads (#NameNodeRpcServer and #ReplicationMonitor) > process same block at the same moment. > (1) ReplicationMonitor#computeReplicationWorkForBlocks get blocks to > replicate and leave the global lock. > (2) FSNamesystem#delete invoked to delete blocks then clear the reference in > blocksmap, needReplications, etc. the block's NumBytes will set > NO_ACK(Long.MAX_VALUE) which is used to indicate that the block deletion does > not need explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to > chooseTargets for the same blocks, and no node will be selected after traversing the > whole cluster because no node satisfies the goodness criteria > (remaining space reaching the required size Long.MAX_VALUE). > During stage#3 the ReplicationMonitor is stuck for a long time, especially in a large > cluster. invalidateBlocks & neededReplications continue to grow and are not > consumed; it will lose data at the worst. > This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK blocks > and removing them from neededReplications.
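The proposed fix in the last paragraph can be sketched as a toy scheduler loop. The names (`compute_replication_work`, the dict-based block records) are hypothetical stand-ins, not the actual BlockManager code; only the NO_ACK sentinel check mirrors the proposal:

```python
NO_ACK = 2**63 - 1  # Long.MAX_VALUE sentinel written into numBytes on delete

def compute_replication_work(needed_replications):
    """Toy replication pass: drop blocks already marked deleted instead of
    trying (and always failing) to choose targets sized Long.MAX_VALUE."""
    scheduled, dropped = [], []
    for block in list(needed_replications):
        if block["num_bytes"] == NO_ACK:
            # Deleted under us: no datanode can ever satisfy this size,
            # so remove it rather than scanning the whole cluster.
            needed_replications.remove(block)
            dropped.append(block["id"])
            continue
        scheduled.append(block["id"])  # real code would call chooseTargets here
    return scheduled, dropped

queue = [
    {"id": "blk_live", "num_bytes": 128 * 1024 * 1024},
    {"id": "blk_deleted", "num_bytes": NO_ACK},
]
print(compute_replication_work(queue))  # (['blk_live'], ['blk_deleted'])
```

The key property is that the deleted block is removed from the queue in the same pass, so the monitor never spins on it in later iterations.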
[jira] [Updated] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.
[ https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-10453: --- Attachment: HDFS-10453-branch-2.7.004.patch attach new patch for branch-2.7 > ReplicationMonitor thread could stuck for long time due to the race between > replication and delete of same file in a large cluster. > --- > > Key: HDFS-10453 > URL: https://issues.apache.org/jira/browse/HDFS-10453 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4 >Reporter: He Xiaoqiao > Attachments: HDFS-10453-branch-2.001.patch, > HDFS-10453-branch-2.003.patch, HDFS-10453-branch-2.7.004.patch, > HDFS-10453.001.patch > > > ReplicationMonitor thread could stuck for long time and loss data with little > probability. Consider the typical scenario: > (1) create and close a file with the default replicas(3); > (2) increase replication (to 10) of the file. > (3) delete the file while ReplicationMonitor is scheduling blocks belong to > that file for replications. > if ReplicationMonitor stuck reappeared, NameNode will print log as: > {code:xml} > 2016-04-19 10:20:48,083 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to > place enough replicas, still in need of 7 to reach 10 > (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, > newBlock=false) For more information, please enable DEBUG log level on > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy > .. 
> 2016-04-19 10:21:17,184 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to > place enough replicas, still in need of 7 to reach 10 > (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, > newBlock=false) For more information, please enable DEBUG log level on > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy > 2016-04-19 10:21:17,184 WARN > org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough > replicas: expected size is 7 but only 0 storage types can be selected > (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, > DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}) > 2016-04-19 10:21:17,184 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to > place enough replicas, still in need of 7 to reach 10 > (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, > newBlock=false) All required storage types are unavailable: > unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]} > {code} > This is because 2 threads (#NameNodeRpcServer and #ReplicationMonitor) > process same block at the same moment. > (1) ReplicationMonitor#computeReplicationWorkForBlocks get blocks to > replicate and leave the global lock. > (2) FSNamesystem#delete invoked to delete blocks then clear the reference in > blocksmap, needReplications, etc. the block's NumBytes will set > NO_ACK(Long.MAX_VALUE) which is used to indicate that the block deletion does > not need explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to > chooseTargets for the same blocks, and no node will be selected after traversing the > whole cluster because no node satisfies the goodness criteria > (remaining space reaching the required size Long.MAX_VALUE). > During stage#3 the ReplicationMonitor is stuck for a long time, especially in a large > cluster. invalidateBlocks & neededReplications continue to grow and are not > consumed; it will lose data at the worst. > This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK blocks > and removing them from neededReplications.
[jira] [Work started] (HDFS-12892) TestClusterTopology#testChooseRandom fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-12892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-12892 started by Zsolt Venczel. > TestClusterTopology#testChooseRandom fails intermittently > - > > Key: HDFS-12892 > URL: https://issues.apache.org/jira/browse/HDFS-12892 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel > Labels: flaky-test > > Flaky test failure: > {code:java} > java.lang.AssertionError > Error > Not choosing nodes randomly > Stack Trace > java.lang.AssertionError: Not choosing nodes randomly > at > org.apache.hadoop.net.TestClusterTopology.testChooseRandom(TestClusterTopology.java:170) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
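For context on why an assertion like "Not choosing nodes randomly" can fire spuriously: a genuinely uniform random chooser still repeats the same node with probability 1/n on any pair of draws, so equality-based randomness checks are flaky by construction unless the sample is large. A toy sketch (illustrative only, not the actual TestClusterTopology code):

```java
import java.util.Random;

// Toy illustration of an inherently flaky randomness check: even a perfectly
// uniform chooser over n nodes picks the same node twice in a row with
// probability 1/n, so a test asserting "the two choices differ" will fail
// intermittently through no fault of the code under test.
public class FlakyRandomnessSketch {

    /** Probability that two independent uniform draws over n nodes collide. */
    static double repeatProbability(int n) {
        return 1.0 / n;
    }

    /** Counts seeds in [0, trials) whose two uniform draws over n nodes collide. */
    static int collisions(int n, int trials) {
        int count = 0;
        for (int seed = 0; seed < trials; seed++) {
            Random r = new Random(seed);
            if (r.nextInt(n) == r.nextInt(n)) {
                count++;
            }
        }
        return count;
    }
}
```

The usual remedies are retrying the comparison a bounded number of times or asserting on a distribution over many draws rather than on a single pair.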
[jira] [Commented] (HDFS-12891) TestDNFencingWithReplication.testFencingStress: java.lang.AssertionError
[ https://issues.apache.org/jira/browse/HDFS-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281575#comment-16281575 ] Zsolt Venczel commented on HDFS-12891: -- * I have double-checked the failing unit tests and they appear to be unrelated: they fail with or without my patch. * No test modification was needed, as the issue was identified by a correctly executing existing test case. > TestDNFencingWithReplication.testFencingStress: java.lang.AssertionError > > > Key: HDFS-12891 > URL: https://issues.apache.org/jira/browse/HDFS-12891 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel > Labels: flaky-test > Attachments: HDFS-12891.01.patch > > > {code:java} > java.lang.AssertionError: Test resulted in an unexpected exit > at > org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress(TestDNFencingWithReplication.java:147) > : > : > 2017-10-19 21:39:40,068 [main] INFO hdfs.MiniDFSCluster > (MiniDFSCluster.java:shutdown(1965)) - Shutting down the Mini HDFS Cluster > 2017-10-19 21:39:40,068 [main] FATAL hdfs.MiniDFSCluster > (MiniDFSCluster.java:shutdown(1968)) - Test resulted in an unexpected exit > 1: java.lang.AssertionError > at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:265) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4437) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.AssertionError > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor.addBlocksToBeInvalidated(DatanodeDescriptor.java:641) > at > org.apache.hadoop.hdfs.server.blockmanagement.InvalidateBlocks.invalidateWork(InvalidateBlocks.java:299) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.invalidateWorkForOneNode(BlockManager.java:4246) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeInvalidateWork(BlockManager.java:1736) > at > 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4561) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4418) > ... 1 more > {code}
[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could get stuck for a long time due to the race between replication and deletion of the same file in a large cluster.
[ https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281571#comment-16281571 ] He Xiaoqiao commented on HDFS-10453: [~Octivian] [~genericqa] Thanks for your comments and tests. You are right: access to {{neededReplications}} does need to be locked. In our production environment, this patch has been updated to synchronize on {{neededReplications}}. I will upload the updated patch shortly. Thanks again. > ReplicationMonitor thread could get stuck for a long time due to the race between > replication and deletion of the same file in a large cluster. > --- > > Key: HDFS-10453 > URL: https://issues.apache.org/jira/browse/HDFS-10453 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4 >Reporter: He Xiaoqiao > Attachments: HDFS-10453-branch-2.001.patch, > HDFS-10453-branch-2.003.patch, HDFS-10453.001.patch > > > ReplicationMonitor thread can get stuck for a long time and, with low > probability, lose data. Consider the typical scenario: > (1) create and close a file with the default replication (3); > (2) increase the replication of the file to 10; > (3) delete the file while ReplicationMonitor is scheduling blocks belonging to > that file for replication. > When ReplicationMonitor gets stuck, the NameNode prints logs such as: > {code:xml} > 2016-04-19 10:20:48,083 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to > place enough replicas, still in need of 7 to reach 10 > (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, > newBlock=false) For more information, please enable DEBUG log level on > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy > .. 
> 2016-04-19 10:21:17,184 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to > place enough replicas, still in need of 7 to reach 10 > (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, > newBlock=false) For more information, please enable DEBUG log level on > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy > 2016-04-19 10:21:17,184 WARN > org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough > replicas: expected size is 7 but only 0 storage types can be selected > (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, > DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}) > 2016-04-19 10:21:17,184 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to > place enough replicas, still in need of 7 to reach 10 > (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, > newBlock=false) All required storage types are unavailable: > unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, > storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]} > {code} > This happens because two threads (NameNodeRpcServer and ReplicationMonitor) > process the same block at the same moment: > (1) ReplicationMonitor#computeReplicationWorkForBlocks gets the blocks to > replicate and leaves the global lock. > (2) FSNamesystem#delete is invoked to delete the blocks and clears the references in > the blocksmap, neededReplications, etc.; the block's NumBytes is set to > NO_ACK (Long.MAX_VALUE), which indicates that the block deletion does > not need an explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to > chooseTargets for the same blocks, and no node is selected even after traversing the > whole cluster, because no candidate can satisfy the goodness criteria > (remaining space at least the required size, Long.MAX_VALUE). > During stage (3), ReplicationMonitor is stuck for a long time, especially in a large > cluster. invalidateBlocks and neededReplications keep growing with nothing consuming > them; in the worst case this leads to data loss. > This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK blocks > and removing them from neededReplications.
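The proposed fix can be sketched as a small guard in the scheduling path. This is a simplified illustration, not the actual patch: it models queue entries by their NumBytes values only, and reuses Long.MAX_VALUE as the NO_ACK sentinel (the value BlockCommand.NO_ACK has in Hadoop).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Minimal sketch of the proposed guard: before calling chooseTarget, skip
// blocks whose NumBytes equals the NO_ACK sentinel and drop them from the
// replication queue — they belong to files deleted after the replication
// work was scheduled, and no datanode could ever offer Long.MAX_VALUE bytes.
public class ReplicationSkipSketch {
    // Mirrors BlockCommand.NO_ACK, which Hadoop defines as Long.MAX_VALUE.
    static final long NO_ACK = Long.MAX_VALUE;

    /**
     * Prunes NO_ACK blocks from the queue in place and returns the number
     * of blocks that remain eligible for chooseTarget.
     */
    static int pruneDeletedBlocks(List<Long> neededReplications) {
        Iterator<Long> it = neededReplications.iterator();
        while (it.hasNext()) {
            if (it.next() == NO_ACK) {
                it.remove();  // deleted block: no target can ever satisfy it
            }
        }
        return neededReplications.size();
    }

    /** Convenience wrapper: builds the queue from block sizes and prunes it. */
    static int pruneAll(long... blockSizes) {
        List<Long> queue = new ArrayList<>();
        for (long size : blockSizes) {
            queue.add(size);
        }
        return pruneDeletedBlocks(queue);
    }
}
```

The check is cheap (one comparison per queued block) compared to a full-cluster chooseTarget traversal that is guaranteed to fail.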
[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could get stuck for a long time due to the race between replication and deletion of the same file in a large cluster.
[ https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281548#comment-16281548 ] genericqa commented on HDFS-10453: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 19m 0s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 51s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac 
{color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}107m 16s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 1m 16s{color} | {color:red} The patch generated 207 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}151m 56s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Unreaped Processes | hadoop-hdfs:24 | | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeUUID | | | hadoop.hdfs.server.datanode.checker.TestThrottledAsyncChecker | | | hadoop.hdfs.TestEncryptedTransfer | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotMetrics | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshottableDirListing | | | hadoop.hdfs.TestDFSPermission | | | hadoop.hdfs.server.datanode.TestBatchIbr | | | hadoop.hdfs.server.namenode.TestNestedEncryptionZones | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy | | | hadoop.hdfs.server.datanode.TestBpServiceActorScheduler | | | hadoop.hdfs.server.federation.router.TestRouter | | | hadoop.hdfs.server.datanode.metrics.TestDataNodeOutlierDetectionViaMetrics | | | hadoop.hdfs.server.federation.metrics.TestFederationMetrics | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestInterDatanodeProtocol | | | hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots | | | hadoop.hdfs.server.blockmanagement.TestNameNodePrunesMissingStorages | | | hadoop.hdfs.TestSetTimes | | | hadoop.hdfs.server.blockmanagement.TestHeartbeatHandling | | | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistLockedMemory | | | hadoop.hdfs.TestDatanodeDeath | | | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyConsiderLoad | | | hadoop.hdfs.server.datanode.TestDataNodeTransferSocketSize | | | hadoop.hdfs.server.datanode.TestBlockCountersInPendingIBR | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeMetrics
[jira] [Assigned] (HDFS-12308) Erasure Coding: Provide DistributedFileSystem & DFSClient API to return the effective EC policy on a directory or file, including the replication policy
[ https://issues.apache.org/jira/browse/HDFS-12308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chencan reassigned HDFS-12308: -- Assignee: chencan > Erasure Coding: Provide DistributedFileSystem & DFSClient API to return the > effective EC policy on a directory or file, including the replication policy > - > > Key: HDFS-12308 > URL: https://issues.apache.org/jira/browse/HDFS-12308 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding > Environment: Provide DistributedFileSystem & DFSClient API to return > the effective EC policy on a directory or file, including the replication > policy. The API names would be something like {{getNominalErasureCodingPolicy(PATH)}} and > {{getAllNominalErasureCodingPolicies}}. >Reporter: SammiChen >Assignee: chencan > Labels: hdfs-ec-3.0-nice-to-have > 
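To make the proposal concrete, here is a hypothetical sketch of how such an API could resolve the effective policy: EC policies in HDFS are set on directories and inherited, so the nearest ancestor with an explicit policy wins, and a path with no policy anywhere above it is effectively replicated. The method name follows the issue's suggested {{getNominalErasureCodingPolicy(PATH)}}; the class and policy names are illustrative, not an existing DFSClient API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "effective (nominal) EC policy" resolution: a path
// inherits the nearest ancestor's explicitly set policy, falling back to
// plain replication when none is set. Not an existing DistributedFileSystem
// API — the method name mirrors the one proposed in HDFS-12308.
public class NominalEcPolicySketch {
    static final String REPLICATION = "REPLICATION";
    private final Map<String, String> explicitPolicies = new HashMap<>();

    void setErasureCodingPolicy(String dir, String policyName) {
        explicitPolicies.put(dir, policyName);
    }

    /** Nearest ancestor with an explicit policy wins; default is replication. */
    String getNominalErasureCodingPolicy(String path) {
        for (String p = path; p != null; p = parent(p)) {
            String policy = explicitPolicies.get(p);
            if (policy != null) {
                return policy;
            }
        }
        return REPLICATION;  // no EC policy on the path: effectively replicated
    }

    private static String parent(String path) {
        if (path.equals("/")) {
            return null;  // root has no parent: resolution stops here
        }
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }
}
```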
[jira] [Updated] (HDFS-11576) Block recovery will fail indefinitely if recovery time > heartbeat interval
[ https://issues.apache.org/jira/browse/HDFS-11576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11576: --- Fix Version/s: 3.0.0 > Block recovery will fail indefinitely if recovery time > heartbeat interval > --- > > Key: HDFS-11576 > URL: https://issues.apache.org/jira/browse/HDFS-11576 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, hdfs, namenode >Affects Versions: 2.7.1, 2.7.2, 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2 >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Critical > Fix For: 3.0.0, 2.9.1 > > Attachments: HDFS-11576-branch-2.00.patch, > HDFS-11576-branch-2.01.patch, HDFS-11576.001.patch, HDFS-11576.002.patch, > HDFS-11576.003.patch, HDFS-11576.004.patch, HDFS-11576.005.patch, > HDFS-11576.006.patch, HDFS-11576.007.patch, HDFS-11576.008.patch, > HDFS-11576.009.patch, HDFS-11576.010.patch, HDFS-11576.011.patch, > HDFS-11576.012.patch, HDFS-11576.013.patch, HDFS-11576.014.patch, > HDFS-11576.015.patch, HDFS-11576.repro.patch > > > Block recovery will fail indefinitely if the time to recover a block is > always longer than the heartbeat interval. Scenario: > 1. DN sends a heartbeat > 2. NN sends a recovery command to the DN, recoveryID=X > 3. DN starts recovery > 4. DN sends another heartbeat > 5. NN sends a recovery command to the DN, recoveryID=X+1 > 6. DN calls commitBlockSynchronization on the NN after the first recovery > succeeds, which fails because X < X+1 > ...
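The scenario above reduces to a stale-ID check: each heartbeat-triggered recovery command carries a strictly larger recoveryID, and a commit carrying a superseded ID is rejected. A simplified model of that interaction, under the assumption that only the most recently issued ID is accepted (this is a sketch, not the actual NameNode code):

```java
// Sketch of the failure mode: every heartbeat-triggered recovery command
// hands out a fresh, larger recoveryID, and commitBlockSynchronization with
// an older ID is rejected as stale. If recovery always takes longer than a
// heartbeat interval, the DN's commit always arrives with a superseded ID,
// so recovery never completes.
public class RecoveryIdSketch {
    private long latestIssuedRecoveryId = 0;

    /** NameNode side: each recovery command gets a fresh, strictly larger ID. */
    long issueRecoveryCommand() {
        return ++latestIssuedRecoveryId;
    }

    /** The commit succeeds only when it carries the latest issued ID. */
    boolean commitBlockSynchronization(long recoveryId) {
        return recoveryId == latestIssuedRecoveryId;
    }
}
```

Under this model the fix has to break the cycle on one side or the other, e.g. by not re-issuing a recovery command for a block whose recovery is still in flight.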