[jira] [Commented] (HDFS-10419) Building HDFS on top of new storage layer (HDSL)
[ https://issues.apache.org/jira/browse/HDFS-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424559#comment-16424559 ] Sanjay Radia commented on HDFS-10419: - While I prefer HDSS, I would gladly let Anu, who has done the bulk of the heavy lifting in this project, have the final say on the name (unless his choice were truly horrible, which it isn't). The most passionate debates in any project are always about the name :) +1 for HDDS - Hadoop Distributed Data Store.
> Building HDFS on top of new storage layer (HDSL)
> Key: HDFS-10419
> URL: https://issues.apache.org/jira/browse/HDFS-10419
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Priority: Major
> Attachments: Evolving NN using new block-container layer.pdf
>
> In HDFS-7240, Ozone defines storage containers to store both the data and the metadata. The storage container layer provides an object storage interface and aims to manage data/metadata in a distributed manner. More details about storage containers can be found in the design doc in HDFS-7240.
> HDFS can adopt the storage containers to store and manage blocks. The general idea is:
> # Each block can be treated as an object, and the block ID is the object's key.
> # Blocks will still be stored in DataNodes, but as objects in storage containers.
> # The block management work can be separated out of the NameNode and handled by the storage container layer in a more distributed way. The NameNode will only manage the namespace (i.e., files and directories).
> # For each file, the NameNode only needs to record a list of block IDs, which are used as keys to obtain the real data from storage containers.
> # A new DFSClient implementation talks to both the NameNode and the storage container layer to read/write.
> HDFS, especially the NameNode, can get much better scalability from this design. Currently the NameNode's heaviest workload comes from block management, which includes maintaining the block-DataNode mapping, receiving full/incremental block reports, tracking block states (under/over/mis-replicated), and joining every write pipeline protocol to guarantee data consistency. This work brings a high memory footprint and makes the NameNode suffer from GC pressure. HDFS-5477 already proposes to convert the BlockManager into a service. If we can build HDFS on top of the storage container layer, we not only separate the BlockManager out of the NameNode, but also replace it with a new distributed management scheme.
> The storage container work is currently in progress in HDFS-7240, and the work proposed here is still in an experimental/exploratory stage. We can do this experiment in a feature branch so that people with interest can be involved.
> A design doc will be uploaded later explaining more details.
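To make the layering in the description above concrete, here is a minimal sketch of the block-as-object interaction it describes. This is purely illustrative: the interface name, method signatures, and types are assumptions, not the actual HDSL/Ozone API.

{code:java}
// Hypothetical sketch of the block-container layer as seen from the
// NameNode and DFSClient. All names here are illustrative assumptions;
// the real HDSL/Ozone interfaces differ.
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

/** A block is addressed purely by its ID; the container layer resolves locations. */
interface StorageContainerClient {
    /** Open a stream to read the object stored under this block ID. */
    InputStream readBlock(long blockId) throws IOException;

    /** Allocate a new block object and return a stream for writing it. */
    OutputStream writeBlock(long blockId) throws IOException;
}

/** The NameNode's per-file metadata shrinks to a list of block IDs. */
class FileMetadata {
    final String path;
    final List<Long> blockIds; // keys into the container layer

    FileMetadata(String path, List<Long> blockIds) {
        this.path = path;
        this.blockIds = blockIds;
    }
}
{code}

In this split, the new DFSClient would resolve a path to a FileMetadata via the NameNode, then read each block ID directly through the StorageContainerClient, so block locations never pass through the NameNode.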
[jira] [Commented] (HDFS-10419) Building HDFS on top of new storage layer (HDSL)
[ https://issues.apache.org/jira/browse/HDFS-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420318#comment-16420318 ] Sanjay Radia commented on HDFS-10419: - I prefer HDSS: Hadoop Distributed Storage System. sanjay
[jira] [Commented] (HDFS-10419) Building HDFS on top of new storage layer (HDSL)
[ https://issues.apache.org/jira/browse/HDFS-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403116#comment-16403116 ] Sanjay Radia commented on HDFS-10419: - In the "[VOTE] Merging branch HDFS-7240 to trunk" thread [~andrew.wang] asked:
{quote}*Sanjay says*:
> - NN on top of HDSL, where the NN uses the new block layer (both Daryn and Owen acknowledge the benefit of the new block layer). We have two choices here:
> ** a) Evolve the NN so that it can interact with both the old and the new block layer.
> ** b) Fork and create a new NN that works only with the new block layer; the old NN will continue to work with the old block layer.
> There are trade-offs, but clearly the 2nd option has the least impact on the old HDFS code.
*Andrew asks*: Are you proposing that we pursue the 2nd option to integrate HDSL with HDFS?
{quote}
Originally I would have preferred (a), but Owen made a strong case for (b) in my discussions with him last week. I believe the choice between (a) and (b) will depend strongly on what we want to do. For example, if we do milestone-1, get the 2x scalability, and decide to stop there, then clearly go with option (a) - it will require little refactoring, and one can run old and new HDFS side by side. If we plan to follow milestone-1 with, say, caching the working set of the namespace, then forking the NN code (i.e., option (b)) might be better, though the new NN will have to keep pulling over features and bug fixes from the old NN. Konstantine has proposed other alternatives, and we would evaluate (a) or (b) for those as well. I am not locked into any particular path or how we would do it.
[jira] [Updated] (HDFS-7240) Scaling HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Summary: Scaling HDFS (was: Object store in HDFS)
> Scaling HDFS
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Jitendra Nath Pandey
> Assignee: Jitendra Nath Pandey
> Priority: Major
> Attachments: HDFS Scalability and Ozone.pdf, HDFS Scalability-v2.pdf, HDFS-7240.001.patch, HDFS-7240.002.patch, HDFS-7240.003.patch, HDFS-7240.003.patch, HDFS-7240.004.patch, HDFS-7240.005.patch, HDFS-7240.006.patch, HadoopStorageLayerSecurity.pdf, MeetingMinutes.pdf, Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf
>
> [^HDFS Scalability-v2.pdf] describes areas where HDFS does well, its scaling challenges, and how to address those challenges. Scaling HDFS requires scaling both the namespace layer and the block layer. _This jira provides a new block layer, Hadoop Distributed Storage Layer (HDSL), that scales the block layer by grouping blocks into containers, thereby shrinking the block-to-location map and reducing the number of block reports and the cost of processing them._
> _A scalable namespace can be put on top of this scalable block layer:_
> * _HDFS-10419 describes how the existing NN can be modified to use the new block layer._
> * _HDFS-13074 also provides, as an *interim* step, a scalable flat Key-Value namespace on top of the new block layer; while it does not provide the HDFS API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._
>
> Old Description
> This jira proposes to add object store capabilities into HDFS.
> As part of the federation work (HDFS-1052) we separated block storage as a generic storage layer. Using the Block Pool abstraction, new kinds of namespaces can be built on top of the storage layer, i.e. the datanodes.
> In this jira I will explore building an object store using the datanode storage, but independent of namespace metadata.
> I will soon update with a detailed design document.
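The scaling argument above rests on what map the block layer must keep in memory. A rough sketch of the grouping idea follows; the types, the ID layout, and the assumption that a block ID embeds its container ID are illustrative, not the actual HDSL data structures.

{code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: instead of one entry per block (blockId -> locations), the block
// layer keeps one entry per container (containerId -> locations) and resolves
// a block via the container ID carried in its block ID. Illustrative only.
class ContainerMapSketch {
    /** A block ID that carries the ID of the container holding the block. */
    static class BlockId {
        final long containerId; // which container the block lives in
        final long localId;     // the block's ID within that container
        BlockId(long containerId, long localId) {
            this.containerId = containerId;
            this.localId = localId;
        }
    }

    // One entry per container. A container holds many blocks, so this map
    // is a small fraction of the size of a per-block location map, and
    // DataNodes need only report containers, not individual blocks.
    private final Map<Long, List<String>> containerLocations = new HashMap<>();

    /** Locating a block only needs the container-level map. */
    List<String> locate(BlockId id) {
        return containerLocations.get(id.containerId);
    }
}
{code}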
[jira] [Updated] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Description: restored the attachment link [^HDFS Scalability-v2.pdf] in place of the broken link [Scaling HDFS| x] and expanded "KV" to "Key-Value"; the description text is otherwise unchanged from the version quoted in full under the "Scaling HDFS" update above.
[jira] [Updated] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Description: an earlier edit of the same description in which the attachment link [^HDFS Scalability-v2.pdf] was replaced by the link [Scaling HDFS| x]; the text is otherwise unchanged from the version quoted in full under the "Scaling HDFS" update above.
[jira] [Updated] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Description: set to the "Scaling HDFS" description quoted in full above.
was:
This jira proposes to add object store capabilities into HDFS.
As part of the federation work (HDFS-1052) we separated block storage as a generic storage layer. Using the Block Pool abstraction, new kinds of namespaces can be built on top of the storage layer, i.e. the datanodes.
In this jira I will explore building an object store using the datanode storage, but independent of namespace metadata.
I will soon update with a detailed design document.
[jira] [Updated] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Attachment: HDFS Scalability-v2.pdf
[jira] [Commented] (HDFS-10419) Building HDFS on top of Ozone's storage containers
[ https://issues.apache.org/jira/browse/HDFS-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336903#comment-16336903 ] Sanjay Radia commented on HDFS-10419: -
{quote}As a side note, I didn't understand why you used 50MB blocks in your math. The default is 128MB and many people run HDFS with 512MB blocks.
{quote}
While the default block size is 128MB, in many clusters, including the ones at Yahoo (at least in 2011 when I left), the actual average block size was 50MB, because most files had only one block and even that first block was not full.
[jira] [Commented] (HDFS-12990) Change default NameNode RPC port back to 8020
[ https://issues.apache.org/jira/browse/HDFS-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334743#comment-16334743 ] Sanjay Radia commented on HDFS-12990: - For this Jira, I can go either way. In this specific case there is a very simple workaround for the 2.x customer if he reads the release notes and adds a config to use his old port number. Anyone going from 2.x to 3.0 should read the release notes. However, I can see that other, harder cases may come up over the next couple of months where there isn't such a convenient workaround. For example, we are likely to find new issues when we do upgrade testing, and there will be similar debates, including arguments over the importance of supporting rolling upgrade from 2.x to 3.0. To avoid such debates I suggest we amend our guideline along the following lines: On each new major release, until a tag such as "stable" is assigned by the PMC, we allow changes that make the software *compatible with the previous major release, even if that change breaks compatibility within the major release.* These changes may be for cases where we accidentally broke compatibility, or even where we consciously made a change without fully understanding or appreciating the impact of the incompatible change. The PMC will determine the testing criteria for assigning the stable tag. The above guideline will reduce such debates during the GA of a major release. Further, users will realize that they need to wait for the stable tag before going to production. One might argue that a GA release without a "stable" tag is merely a glorified beta or a GA candidate. Yes, I agree. Perhaps in this case we may have rushed from beta to GA a little too early. I think the above forces the PMC to seriously determine GA testing criteria and validate whether or not they have been met.
> Change default NameNode RPC port back to 8020
> Key: HDFS-12990
> URL: https://issues.apache.org/jira/browse/HDFS-12990
> Project: Hadoop HDFS
> Issue Type: Task
> Components: namenode
> Affects Versions: 3.0.0
> Reporter: Xiao Chen
> Assignee: Xiao Chen
> Priority: Critical
> Attachments: HDFS-12990.01.patch
>
> In HDFS-9427 (HDFS should not default to ephemeral ports), we changed all default ports out of the ephemeral port range, which is very appreciated by admins. As part of that change, we also modified the NN RPC port from the famous 8020 to 9820, to be closer to the other ports changed there.
> With more integration going on, it appears that all the other port changes are fine, but the NN RPC port change is painful for downstream projects migrating to Hadoop 3. Some examples include:
> # Hive table locations pointing to hdfs://nn:port/dir
> # Downstream minicluster unit tests that assumed 8020
> # Oozie workflows / downstream scripts that used 8020
> This isn't a problem for HA URLs, since those do not include the port number. But considering the downstream impact, instead of requiring all of them to change their stuff, it would be a far better experience to leave the NN port unchanged. This will benefit Hadoop 3 adoption and ease unnecessary upgrade burdens.
> It is of course incompatible, but given 3.0.0 is just out, IMO it is worth switching the port back.
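The workaround referred to above is a one-line configuration change. A minimal sketch using the standard Hadoop Configuration API; the host name nn.example.com is a placeholder, and in practice dfs.namenode.rpc-address would be set in hdfs-site.xml on the server side rather than in code:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LegacyPortExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Pin the NameNode RPC endpoint to the classic 8020 port instead of
        // relying on the release default (9820 in early 3.0, per this jira).
        // The host name is a placeholder for illustration.
        conf.set("fs.defaultFS", "hdfs://nn.example.com:8020");
        conf.set("dfs.namenode.rpc-address", "nn.example.com:8020");

        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.exists(new Path("/")));
    }
}
{code}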
[jira] [Commented] (HDFS-10419) Building HDFS on top of Ozone's storage containers
[ https://issues.apache.org/jira/browse/HDFS-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302631#comment-16302631 ] Sanjay Radia commented on HDFS-10419: - Correct, the NN is not part of the consensus with the new block-container layer -- that is actually goodness: the two layers are decoupled, which helps reduce or eliminate global locks in the NN's state management. In current HDFS the NN drives the consensus on failures. Also, upon client death the NN closes an open file after establishing its length; in the new world it will still have to close an open file, but, as you state, by an explicit request asking the container replicas for the block length.
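An illustrative sketch of the close-on-client-death step described above: the NN asks the replicas of the open block's container for the committed block length and closes the file at that length. The interface, method names, and the max-of-committed-lengths rule are assumptions for illustration, not the actual HDSL recovery protocol.

{code:java}
import java.util.List;

class LeaseRecoverySketch {
    interface ContainerReplica {
        /** Length of the block as committed by the container's Raft log. */
        long committedLength(long blockId);
    }

    long recoverBlockLength(long blockId, List<ContainerReplica> replicas) {
        // Because the container layer keeps replicas consistent via Raft,
        // the committed length should agree across live replicas; we query
        // all of them and take the maximum committed value defensively.
        long length = 0;
        for (ContainerReplica replica : replicas) {
            length = Math.max(length, replica.committedLength(blockId));
        }
        return length;
    }
}
{code}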
[jira] [Created] (HDFS-12952) Change OzoneFS's semantics to allow readers to see file content while being written
Sanjay Radia created HDFS-12952: --- Summary: Change OzoneFS's semantics to allow readers to see file content while being written Key: HDFS-12952 URL: https://issues.apache.org/jira/browse/HDFS-12952 Project: Hadoop HDFS Issue Type: Bug Reporter: Sanjay Radia Assignee: Anu Engineer
Currently the Ozone KSM gives visibility to a file only when the file is closed, which is similar to the S3 FS. OzoneFS should allow partial file visibility while a file is being written, to match HDFS. Note this should be fairly straightforward, because the block-container layer maintains block-length consistency as a block is being written, even under failures, due to its use of Raft.
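The HDFS semantics being matched here can be demonstrated with the stock FileSystem API: after a writer calls hflush(), a reader opening the still-open file already sees the flushed bytes. A minimal sketch; the path and buffer size are illustrative:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.nio.charset.StandardCharsets;

public class PartialVisibilityExample {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path p = new Path("/tmp/visibility-demo");

        FSDataOutputStream out = fs.create(p);
        out.write("partial content".getBytes(StandardCharsets.UTF_8));
        out.hflush(); // make the written bytes visible to new readers

        // A second client opening the still-open file sees the flushed bytes.
        try (FSDataInputStream in = fs.open(p)) {
            byte[] buf = new byte[15];
            in.readFully(0, buf);
            System.out.println(new String(buf, StandardCharsets.UTF_8));
        }

        out.close();
    }
}
{code}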
[jira] [Commented] (HDFS-10419) Building HDFS on top of Ozone's storage containers
[ https://issues.apache.org/jira/browse/HDFS-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16298728#comment-16298728 ] Sanjay Radia commented on HDFS-10419: - The new block-container layer allows partial visibility of a block even before the block has been finalized and closed. It maintains block-length consistency, without the help of the namespace layer (KSM or NN), as a block is being written, even under failures, due to its use of Raft. Hence the NN, when plugged into the new block-container layer, will be fine - we can get the HDFS semantics. You are right that the Ozone KSM gives visibility to a file only when the file is closed, which is similar to the S3 FS. I will create a jira to fix that so that OzoneFS matches HDFS on partial file visibility while a file is being written.
[jira] [Commented] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297454#comment-16297454 ] Sanjay Radia commented on HDFS-7240: - One of the issues raised is that connecting the NN to the new block-container layer will be very difficult because removing the FSN/BM lock is challenging. I have attached a doc, [Evolving NN using new block container layer|https://issues.apache.org/jira/secure/attachment/12902931/Evolving%20NN%20using%20new%20block-container%20layer.pdf], to HDFS-10419 that describes 2 milestones for connecting the NN to the new block-container layer. The first one does *not* require removing the FSN/BM lock and still gives close to 2x scalability, because the block map (which becomes the container map) is reduced significantly. I would also like to point out (as stated above, and in the doc) that the new block-container layer keeps consistent state using Raft, and hence eliminates the coupling between the namespace layer and the block layer; the 2nd milestone of removing the FSN/BM lock is much easier with the new block layer. If you disagree with my lock argument, the first milestone still gets good scalability without removing the lock.
[jira] [Updated] (HDFS-10419) Building HDFS on top of Ozone's storage containers
[ https://issues.apache.org/jira/browse/HDFS-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-10419: Attachment: Evolving NN using new block-container layer.pdf
I have attached a doc that describes how the existing NN can be modified to plug in the new block-container layer provided by HDFS-7240. Two key milestones are described. The first milestone keeps the Container Map in the NN (getting us almost 2x scalability, since the container map is roughly 1/40th the size of the original block map, assuming an average actual block size of 50MB); this milestone does NOT require removing the FSN/BM lock. The second milestone removes the container map and block management completely, which gets us to the full 2x scalability. After the second milestone, the NN can be evolved in several directions for further scalability.
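The 1/40 ratio follows from simple arithmetic once a container size is assumed. A back-of-the-envelope check; the ~2GB container size is an assumption for illustration, while the 50MB average actual block size comes from the comments above:

{code:java}
public class ContainerMapMath {
    public static void main(String[] args) {
        long avgBlockBytes = 50L * 1024 * 1024;        // ~50MB average actual block size
        long containerBytes = 2L * 1024 * 1024 * 1024; // assumed ~2GB container

        long blocksPerContainer = containerBytes / avgBlockBytes; // ~40
        System.out.println("blocks per container: " + blocksPerContainer);
        // With ~40 blocks per container, a containerId -> locations map has
        // ~1/40th the entries of a per-block blockId -> locations map, which
        // is the basis of the "almost 2x NN scalability" estimate above.
    }
}
{code}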
[jira] [Updated] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Attachment: (was: Evolving NN using new block-container layer.pdf)
[jira] [Issue Comment Deleted] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Comment: was deleted (was: the two-milestone attachment comment, identical to the one quoted in the update below)
[jira] [Updated] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Attachment: Evolving NN using new block-container layer.pdf
I have attached a doc that describes how the existing NN can be modified to plug in the new block-container layer provided by HDFS-7240. Two key milestones are described. The first milestone keeps the Container Map in the NN (getting us almost 2x scalability, since the container map is roughly 1/40th the size of the original block map, assuming an *average actual* block size of 50MB); this milestone does NOT require removing the FSN/BM lock. The second milestone removes the container map and block management completely, which gets us to the full 2x scalability. After the second milestone, the NN can be evolved in several directions for further scalability.
[jira] [Commented] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261791#comment-16261791 ] Sanjay Radia commented on HDFS-7240: Ozone Cloudera Meeting Date: Thursday, November 16th 2017 Location: online conferencing Attendees: ATM, Andrew, Anu, Aaron Fabbri, Jitendra, Sanjay, Sean Mackrory, other listeners on the phone Main discussion centered around: * Wouldn't Ozone be better off as a separate project? * Why should it be merged now? Discussion: (This incorporate Andrew’s minutes and adds to it.) * Anu: Don't want to have this separate since it confuses people about the long-term vision of Ozone. It's intended as block management for HDFS. * Andrew: In its current state, Ozone cannot be plugged into the NN as the BM layer, so it seems premature to merge. Can't benefit existing users, and they can't test it. * Response: The Ozone block layer is at a good integration point, and we want to move on with the NameNode integration as new block layer. Benefits via KV namespace/FileSystemAPI is there and completely usable for Hive and Spark apps. * Andrew: We can do the FSN/BM lock split without merging Ozone. Separate efforts. This lock split is also a major effort by itself, and is a dangerous change. It's something that should be baked in production. * Sanjay: Agree that the lock split should be done in branch. But disagree on how hard it will be. The split was hard in past but will be easier with new block layer: one of the key reasons for the coupling of Block-layer to Namespace layer is that the block length of the each replica at block close time, esp under failures, has to be consistent. This is done in the central NN today (due to lack of raft/paxos like protocol in the original block layer). The block-container layer uses raft for consistency and no longer needs a central agent like the NN. Then new block-layers built-in consistent state management simplifies the separation. * Sanjay: Ozone developers "willing to take the hit" of the slow Hadoop release cadence. Want to make this part of HDFS since it's easier for users to test and consume without installing a new cluster. * ATM: Can still share the same hardware, and run the Ozone daemons alongside. * Sanjay countered this * Sanjay: Want to keep Ozone block management inside the Datanode process to enable various synergies such as sharing the new netty based protocol engy or fast-copy between HDFS and Ozone. Not all data needs all the HDFS features like encryption, erasure coding, etc, and this data could be stored in Ozone. * Andrew: This fast-copy hasn't been implemented or discussed yet. Unclear if it'll work at all with existing HDFS block management. Won't work with encryption or erasure coding. Not clear whether it requires being in the same DN process even. * It does have to work with encryption and EC to give value. It can work with non-encrypted and non EC which are majority of blocks in most Hadoop clusters. We will provide a design of the shallow-copy. Sanjay/Anu: Ozone is also useful to test with just the key-value interface. It's a Hadoop-compatible FileSystem, so apps many apps such as Hive and Spark can work also on Ozone since they have or ensured that they work well on KV flat namespace. * Andrew: If it provides a new API and doesn't support the HDFS feature-set, doesn't this support it being its own project? * Sanjay - It provides the EXISTING Hadoop FileSystem interface now. 
Note customers are used to having different parts of the namespace(s) with different features: customers have asked for Zones with different features enabled [see summary - to avoid duplication]. * AaronF: Ozone is a lot of new code and Hadoop already has so much code. It is better to have separate projects and not add to Hadoop/HDFS. Sanjay: Agree it is a lot of code. Sometimes we have to add significant new code to move a project forward. We have tried to incrementally work around HDFS scaling, the NN’s manageability, and slow startup issues. This new code base fundamentally moves us forward in addressing these long-standing issues. Besides, this “lots of new code” argument can be used later to prevent the merge of the projects. Summary: There is agreement that the new block-container layer is a good way to solve the block scaling issue of HDFS. There is no consensus on merging the branch in vs. forking Ozone into a new project. The main objection to merging into HDFS is that integrating the new block-container layer with the existing NN will be a very hard project, since the lock split in the NN is very challenging. Cloudera’s team perspective: (taken from Andrew’s minutes) * Ozone could be its own project and integrated later, or remain on an HDFS branch. There are benefits to Ozone being a separate project. Can release faster, iterate more quickly on feedback, and
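The consistency argument above (replica lengths are agreed at block close time via raft, rather than by the central NN) can be sketched as follows. This is purely illustrative: ReplicatedLog, ContainerStateMachine, closeBlock and apply are invented names, not Ozone code.
{code:java}
// In today's HDFS the NN is the central agent that decides the agreed final
// length of a block's replicas at close time. With a raft-replicated block
// layer, the close is a log entry that the replica quorum applies identically,
// so no central NameNode is needed for this decision.
interface ReplicatedLog {
  // Submit an entry; returns only once a quorum has committed it.
  void commit(byte[] entry) throws java.io.IOException;
}

class ContainerStateMachine {
  private final ReplicatedLog raftLog;
  private final java.util.Map<Long, Long> blockLengths = new java.util.HashMap<>();

  ContainerStateMachine(ReplicatedLog raftLog) { this.raftLog = raftLog; }

  // Close is agreed via raft, so every replica records the same final
  // length, even under failures.
  void closeBlock(long blockId, long finalLength) throws java.io.IOException {
    raftLog.commit(encode(blockId, finalLength));
  }

  // Applied on every replica after the quorum commits the entry.
  void apply(long blockId, long finalLength) {
    blockLengths.put(blockId, finalLength);
  }

  private static byte[] encode(long blockId, long len) {
    return java.nio.ByteBuffer.allocate(16).putLong(blockId).putLong(len).array();
  }
}
{code}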
[jira] [Commented] (HDFS-10419) Building HDFS on top of Ozone's storage containers
[ https://issues.apache.org/jira/browse/HDFS-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238379#comment-16238379 ] Sanjay Radia commented on HDFS-10419: - HDFS-5389 describes one approach to building a NN that scales its namespace better than the current NN. It proposes caching only the working-set namespace in memory; also see [HUG - Removing Namenode's Limitation|https://www.slideshare.net/ydn/hadoop-meetup-hug-august-2013-removing-the-namenodes-memory-limitation]. Independent studies have also analysed LRU caching of HDFS metadata: [Metadata Traces and Workload Models for Evaluating Big Storage Systems|https://www.slideshare.net/ydn/hadoop-meetup-hug-august-2013-removing-the-namenodes-memory-limitation]. This approach works because, in spite of having large amounts of data (say, data for the last five years), most of the data that is accessed is recent (say, the last 3-9 months); hence the working set can fit in memory. > Building HDFS on top of Ozone's storage containers > -- > > Key: HDFS-10419 > URL: https://issues.apache.org/jira/browse/HDFS-10419 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jing Zhao >Assignee: Jing Zhao >Priority: Major > > In HDFS-7240, Ozone defines storage containers to store both the data and the > metadata. The storage container layer provides an object storage interface > and aims to manage data/metadata in a distributed manner. More details about > storage containers can be found in the design doc in HDFS-7240. > HDFS can adopt the storage containers to store and manage blocks. The general > idea is: > # Each block can be treated as an object and the block ID is the object's key. > # Blocks will still be stored in DataNodes but as objects in storage > containers. > # The block management work can be separated out of the NameNode and will be > handled by the storage container layer in a more distributed way. The > NameNode will only manage the namespace (i.e., files and directories). > # For each file, the NameNode only needs to record a list of block IDs which > are used as keys to obtain real data from storage containers. > # A new DFSClient implementation talks to both NameNode and the storage > container layer to read/write. > HDFS, especially the NameNode, can get much better scalability from this > design. Currently the NameNode's heaviest workload comes from the block > management, which includes maintaining the block-DataNode mapping, receiving > full/incremental block reports, tracking block states (under/over/miss > replicated), and joining every writing pipeline protocol to guarantee the > data consistency. These work bring high memory footprint and make NameNode > suffer from GC. HDFS-5477 already proposes to convert BlockManager as a > service. If we can build HDFS on top of the storage container layer, we not > only separate out the BlockManager from the NameNode, but also replace it > with a new distributed management scheme. > The storage container work is currently in progress in HDFS-7240, and the > work proposed here is still in an experimental/exploring stage. We can do > this experiment in a feature branch so that people with interests can be > involved. > A design doc will be uploaded later explaining more details. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
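A minimal sketch of the working-set idea, using an access-ordered LinkedHashMap as the LRU policy; the capacity and the eviction behavior are illustrative, not NN internals:
{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

// Keep only recently used namespace entries in memory and evict the cold
// ones (which, in a real NN, would remain readable from a persistent store).
class WorkingSetCache<K, V> extends LinkedHashMap<K, V> {
  private final int capacity;

  WorkingSetCache(int capacity) {
    super(16, 0.75f, true /* access order => LRU */);
    this.capacity = capacity;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // A real NN would write the evicted entry back / re-read it on demand;
    // this sketch simply drops it.
    return size() > capacity;
  }
}
{code}
If the last 3-9 months of data really is the working set, a cache sized for that fraction of the namespace serves nearly all requests as if the full namespace were in memory.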
[jira] [Updated] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Attachment: HDFS Scalability and Ozone.pdf I have added a document that explains a design for scaling HDFS and how Ozone paves the way towards the full solution. > Object store in HDFS > > > Key: HDFS-7240 > URL: https://issues.apache.org/jira/browse/HDFS-7240 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jitendra Nath Pandey >Assignee: Jitendra Nath Pandey >Priority: Major > Attachments: HDFS Scalability and Ozone.pdf, HDFS-7240.001.patch, > HDFS-7240.002.patch, HDFS-7240.003.patch, HDFS-7240.003.patch, > HDFS-7240.004.patch, Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, > ozone_user_v0.pdf > > > This jira proposes to add object store capabilities into HDFS. > As part of the federation work (HDFS-1052) we separated block storage as a > generic storage layer. Using the Block Pool abstraction, new kinds of > namespaces can be built on top of the storage layer i.e. datanodes. > In this jira I will explore building an object store using the datanode > storage, but independent of namespace metadata. > I will soon update with a detailed design document. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9244) Support nested encryption zones
[ https://issues.apache.org/jira/browse/HDFS-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107238#comment-15107238 ] Sanjay Radia commented on HDFS-9244: The main motivation for nested EZ is root + subdirs as per Andrew's comment. Is it such a big deal for an admin to set up an EZ as he creates the directories? I think nested encryption will complicate things like volumes down the road, and I don't think this extra complexity is necessary. I will comment on the volumes jira to drive that discussion to a conclusion. > Support nested encryption zones > --- > > Key: HDFS-9244 > URL: https://issues.apache.org/jira/browse/HDFS-9244 > Project: Hadoop HDFS > Issue Type: New Feature > Components: encryption >Reporter: Xiaoyu Yao >Assignee: Zhe Zhang > Attachments: HDFS-9244.00.patch, HDFS-9244.01.patch > > > This JIRA is opened to track adding support of nested encryption zone based > on [~andrew.wang]'s [comment |https://issues.apache.org/jira/browse/HDFS-8747?focusedCommentId=14654141=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14654141] > for certain use cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-8747) Provide Better "Scratch Space" and "Soft Delete" Support for HDFS Encryption Zones
[ https://issues.apache.org/jira/browse/HDFS-8747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106889#comment-15106889 ] Sanjay Radia edited comment on HDFS-8747 at 1/19/16 7:23 PM: - Posted on wrong jira by mistake (moving comment to HDFS-9244) -The main motivation for nested EZ is root + subdirs as per Andrew's comment. Is it such a big deal for an admin to set up EZ as he creates the directories in dirs? I think nested encryption will complicate things like volumes down the road and I don't think this extra complexity is necessary.- was (Author: sanjay.radia): The main motivation for nested EZ is root + subdirs as per Andrew's comment. Is it such a big deal for an admin to set up EZ as he creates the directories in dirs? I think nested encryption will complicate things like volumes down the road and I don't think this extra complexity is necessary. > Provide Better "Scratch Space" and "Soft Delete" Support for HDFS Encryption > Zones > -- > > Key: HDFS-8747 > URL: https://issues.apache.org/jira/browse/HDFS-8747 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption >Affects Versions: 2.6.0 >Reporter: Xiaoyu Yao >Assignee: Xiaoyu Yao > Attachments: HDFS-8747-07092015.pdf, HDFS-8747-07152015.pdf, > HDFS-8747-07292015.pdf > > > HDFS Transparent Data Encryption At-Rest was introduced in Hadoop 2.6 to > allow create encryption zone on top of a single HDFS directory. Files under > the root directory of the encryption zone will be encrypted/decrypted > transparently upon HDFS client write or read operations. > Generally, it does not support rename(without data copying) across encryption > zones or between encryption zone and non-encryption zone because different > security settings of encryption zones. However, there are certain use cases > where efficient rename support is desired. This JIRA is to propose better > support of two such use cases “Scratch Space” (a.k.a. staging area) and “Soft > Delete” (a.k.a. trash) with HDFS encryption zones. > “Scratch Space” is widely used in Hadoop jobs, which requires efficient > rename support. Temporary files from MR jobs are usually stored in staging > area outside encryption zone such as “/tmp” directory and then rename to > targeted directories as specified once the data is ready to be further > processed. > Below is a summary of supported/unsupported cases from latest Hadoop: > * Rename within the encryption zone is supported > * Rename the entire encryption zone by moving the root directory of the zone > is allowed. > * Rename sub-directory/file from encryption zone to non-encryption zone is > not allowed. > * Rename sub-directory/file from encryption zone A to encryption zone B is > not allowed. > * Rename from non-encryption zone to encryption zone is not allowed. > “Soft delete” (a.k.a. trash) is a client-side “soft delete” feature that > helps prevent accidental deletion of files and directories. If trash is > enabled and a file or directory is deleted using the Hadoop shell, the file > is moved to the .Trash directory of the user's home directory instead of > being deleted. Deleted files are initially moved (renamed) to the Current > sub-directory of the .Trash directory with original path being preserved. > Files and directories in the trash can be restored simply by moving them to a > location outside the .Trash directory. > Due to the limited rename support, delete sub-directory/file within > encryption zone with trash feature is not allowed. Client has to use > -skipTrash option to work around this. 
HADOOP-10902 and HDFS-6767 improved > the error message but without a complete solution to the problem. > We propose to solve the problem by generalizing the mapping between > encryption zone and its underlying HDFS directories from 1:1 today to 1:N. > The encryption zone should allow non-overlapped directories such as scratch > space or soft delete "trash" locations to be added/removed dynamically after > creation. This way, rename for "scratch space" and "soft delete" can be > better supported without breaking the assumption that rename is only > supported "within the zone". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8747) Provide Better "Scratch Space" and "Soft Delete" Support for HDFS Encryption Zones
[ https://issues.apache.org/jira/browse/HDFS-8747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106889#comment-15106889 ] Sanjay Radia commented on HDFS-8747: The main motivation for nested EZ is root + subdirs as per Andrew's comment. Is it such a big deal for an admin to set up EZ as he creates the directories in dirs? I think nested encryption will complicate things like volumes down the road and I don't think this extra complexity is necessary. > Provide Better "Scratch Space" and "Soft Delete" Support for HDFS Encryption > Zones > -- > > Key: HDFS-8747 > URL: https://issues.apache.org/jira/browse/HDFS-8747 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption >Affects Versions: 2.6.0 >Reporter: Xiaoyu Yao >Assignee: Xiaoyu Yao > Attachments: HDFS-8747-07092015.pdf, HDFS-8747-07152015.pdf, > HDFS-8747-07292015.pdf > > > HDFS Transparent Data Encryption At-Rest was introduced in Hadoop 2.6 to > allow create encryption zone on top of a single HDFS directory. Files under > the root directory of the encryption zone will be encrypted/decrypted > transparently upon HDFS client write or read operations. > Generally, it does not support rename(without data copying) across encryption > zones or between encryption zone and non-encryption zone because different > security settings of encryption zones. However, there are certain use cases > where efficient rename support is desired. This JIRA is to propose better > support of two such use cases “Scratch Space” (a.k.a. staging area) and “Soft > Delete” (a.k.a. trash) with HDFS encryption zones. > “Scratch Space” is widely used in Hadoop jobs, which requires efficient > rename support. Temporary files from MR jobs are usually stored in staging > area outside encryption zone such as “/tmp” directory and then rename to > targeted directories as specified once the data is ready to be further > processed. > Below is a summary of supported/unsupported cases from latest Hadoop: > * Rename within the encryption zone is supported > * Rename the entire encryption zone by moving the root directory of the zone > is allowed. > * Rename sub-directory/file from encryption zone to non-encryption zone is > not allowed. > * Rename sub-directory/file from encryption zone A to encryption zone B is > not allowed. > * Rename from non-encryption zone to encryption zone is not allowed. > “Soft delete” (a.k.a. trash) is a client-side “soft delete” feature that > helps prevent accidental deletion of files and directories. If trash is > enabled and a file or directory is deleted using the Hadoop shell, the file > is moved to the .Trash directory of the user's home directory instead of > being deleted. Deleted files are initially moved (renamed) to the Current > sub-directory of the .Trash directory with original path being preserved. > Files and directories in the trash can be restored simply by moving them to a > location outside the .Trash directory. > Due to the limited rename support, delete sub-directory/file within > encryption zone with trash feature is not allowed. Client has to use > -skipTrash option to work around this. HADOOP-10902 and HDFS-6767 improved > the error message but without a complete solution to the problem. > We propose to solve the problem by generalizing the mapping between > encryption zone and its underlying HDFS directories from 1:1 today to 1:N. 
> The encryption zone should allow non-overlapped directories such as scratch > space or soft delete "trash" locations to be added/removed dynamically after > creation. This way, rename for "scratch space" and "soft delete" can be > better supported without breaking the assumption that rename is only > supported "within the zone". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8888) Support volumes in HDFS
[ https://issues.apache.org/jira/browse/HDFS-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702393#comment-14702393 ] Sanjay Radia commented on HDFS-8888: There are several motivations for introducing Volumes to HDFS. Simplify management and implementation * Volumes make the management of some HDFS features simpler: Quotas, Encryption, Snapshots can become volume properties rather than properties of individual directories. As a unit of management, Volumes also offer strong isolation in security settings. * They can simplify the implementation of some of them. For example, if we don’t allow renaming across a volume boundary then Snapshots’ implementation becomes easier. Will customers accept this restriction? Won’t some apps like Hive have to change since they rename from temp to final destination? Recall we disallow renames across encryption zones and customers have found that acceptable. Further, we changed Hive to deal with this restriction. * Volumes can also simplify the management of datasets. For example, one can associate different policies with volumes. For example, one can set up backup policies across DR zones based on volumes. Isn’t it more flexible to have features like encryption, snapshots on arbitrary directories? Having a car with independent steering for each wheel is more flexible, but steering 2 wheels together makes a car easier to control. Volumes, while restricting the granularity, will simplify management and also the implementation. (A rough sketch of this model follows this comment.) *Relation to Federation* How are volumes related to Federation? Currently in federation, each NN has a single volume. This Jira will allow each NN to have multiple volumes. Volumes add to the Federation model. One can distribute/load balance volumes across NNs. Further, it allows N+K failover especially when we add partial namespace caching (HDFS-). (More on this later.) Other things to explore with Volumes (outside the scope of this Jira) * Each volume could become its own RW lock within the NN. This would improve parallelism within the NN without much additional effort. * Each volume could have its own image/journal to allow relocation of a volume to another NN (see federation). * Associate storage policies with a volume such that the volume is backed by the same storage. This semantic allows new features like co-located data. Support volumes in HDFS --- Key: HDFS-8888 URL: https://issues.apache.org/jira/browse/HDFS-8888 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai There are multiple types of zones (e.g., snapshottable directories, encryption zones, directories with quotas) which are conceptually close to namespace volumes in traditional file systems. This jira proposes to introduce the concept of volume to simplify the implementation of snapshots and encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
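A rough illustrative sketch of the model argued for above: a volume as the unit that owns quota/encryption/snapshot settings, with renames confined to a single volume. All class and method names here are invented for illustration; nothing like this exists in HDFS today.
{code:java}
// A Volume owns the management settings; directories inside it inherit them.
class Volume {
  final String name;
  long quotaBytes;          // quota is a volume property, not per-directory
  boolean encrypted;        // likewise encryption...
  boolean snapshotsEnabled; // ...and snapshots

  Volume(String name) { this.name = name; }
}

abstract class VolumeNamespace {
  /** Resolve a path to its owning volume (longest-prefix lookup, elided). */
  abstract Volume volumeOf(String path);

  final void rename(String src, String dst) throws java.io.IOException {
    if (volumeOf(src) != volumeOf(dst)) {
      // Same style of restriction as encryption zones today; this is what
      // keeps the snapshot implementation simple under this model.
      throw new java.io.IOException("rename across a volume boundary is not allowed");
    }
    // actual namespace update elided
  }
}
{code}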
[jira] [Comment Edited] (HDFS-8888) Support volumes in HDFS
[ https://issues.apache.org/jira/browse/HDFS-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702393#comment-14702393 ] Sanjay Radia edited comment on HDFS- at 8/19/15 3:58 AM: - There are several motivations for introducing Volumes to HDFS. Simplify management and implementation * Volumes make the management of some HDFS features simpler: Quotas, Encryption, Snapshots can become volume properties rather than properties of individual directories. As a unit of management, Volumes also offers strong isolations in the security settings. * It can simplify the implementation of some them. For example if we don’t allow renaming across a volume boundary then Snapshots’ implementation become easier. Will customers accept this restriction? Won’t some apps like Hive have to change since they rename from temp to final destination? Recall we disallow renames across encryption zones and customers have found that acceptable. Further, we changed Hive to deal with this restriction. * Volumes can also simplify the management of datasets. For example one can associate different other policies for volumes. For example one can setup backup policies across DR zones based on volumes. Isn’t it more flexible to have features like encryption, snapshots on arbitrary directories? Having a car with independent steering for each wheel is more flexible, but steering 2 wheels together makes a car easier to control. Volumes, while restricting the granularity, will simplify management and also the implementation. *Relation to Federation* How are volumes related to Federation? Currently in federation, each NN has a single volume. This Jira will allow each NN to have multiple volumes. Volumes adds to the Federation model. One can distribute/load balance volumes across NNs. Further it allows N+K failover especially when we add partial namespace caching (HDFS-). (More on this later.) *Other things to explore with Volumes* (outside the scope of this Jira) * Each volume could become its own RW lock with in the NN. This would improve parallelism within NN without much additional effort. * Each volume could have its own image/journal to allow relocation of a volume to another NN (see federation). * Associate storage policies with a volume such as the volume is backed by the same storage. The semantic allows new features like co-located data. was (Author: sanjay.radia): There are several motivations for introducing Volumes to HDFS. Simplify management and implementation * Volumes make the management of some HDFS features simpler: Quotas, Encryption, Snapshots can become volume properties rather than properties of individual directories. As a unit of management, Volumes also offers strong isolations in the security settings. * It can simplify the implementation of some them. For example if we don’t allow renaming across a volume boundary then Snapshots’ implementation become easier. Will customers accept this restriction? Won’t some apps like Hive have to change since they rename from temp to final destination? Recall we disallow renames across encryption zones and customers have found that acceptable. Further, we changed Hive to deal with this restriction. * Volumes can also simplify the management of datasets. For example one can associate different other policies for volumes. For example one can setup backup policies across DR zones based on volumes. Isn’t it more flexible to have features like encryption, snapshots on arbitrary directories? 
Having a car with independent steering for each wheel is more flexible, but steering 2 wheels together makes a car easier to control. Volumes, while restricting the granularity, will simplify management and also the implementation. *Relation to Federation* How are volumes related to Federation? Currently in federation, each NN has a single volume. This Jira will allow each NN to have multiple volumes. Volumes adds to the Federation model. One can distribute/load balance volumes across NNs. Further it allows N+K failover especially when we add partial namespace caching (HDFS-). (More on this later.) Other things to explore with Volumes (outside the scope of this Jira) * Each volume could become its own RW lock with in the NN. This would improve parallelism within NN without much additional effort. * Each volume could have its own image/journal to allow relocation of a volume to another NN (see federation). * Associate storage policies with a volume such as the volume is backed by the same storage. The semantic allows new features like co-located data. Support volumes in HDFS --- Key: HDFS- URL: https://issues.apache.org/jira/browse/HDFS- Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai There
[jira] [Commented] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
[ https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14588681#comment-14588681 ] Sanjay Radia commented on HDFS-7923: bq. This change is really helpful during startup on big clusters. In the past we have seen restarting all the DNs at once on a several hundred node cluster bring the NN to its knees. There is already a random backoff for the initial block report. You can configure the initial BR backoff time. When that jira was done there was a proposal to give each DN a different backoff time depending on the number of outstanding BRs; this enhancement was not done at that time because this backoff worked very well. For a several hundred node cluster the initial BR backoff time should be approx 60sec. The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages --- Key: HDFS-7923 URL: https://issues.apache.org/jira/browse/HDFS-7923 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.8.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.8.0 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, HDFS-7923.002.patch, HDFS-7923.003.patch, HDFS-7923.004.patch, HDFS-7923.006.patch, HDFS-7923.007.patch The DataNodes should rate-limit their full block reports. They can do this by first sending a heartbeat message to the NN with an optional boolean set which requests permission to send a full block report. If the NN responds with another optional boolean set, the DN will send an FBR... if not, it will wait until later. This can be done compatibly with optional fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
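For reference, the knob in question is the standard initial block report delay setting; on a several-hundred-node cluster it might be set along these lines in hdfs-site.xml (the 60-second value echoes the rough figure above, not a universal recommendation):
{code:xml}
<property>
  <name>dfs.blockreport.initialDelay</name>
  <!-- Each DN delays its first full block report by a random amount
       between 0 and this many seconds, spreading startup load on the NN. -->
  <value>60</value>
</property>
{code}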
[jira] [Commented] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
[ https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14588931#comment-14588931 ] Sanjay Radia commented on HDFS-7923: Starvation: I didn't literally mean starvation, but was more concerned about fairness and safety. Can a DN's block report be delayed for a significant period of time, or, due to a subtle bug, even longer? Our current implementation is very resilient - DNs just send the BRs at a specific period irrespective of the NN. Does your design have a safety net - say, a DN will wait a max of 2 periods to get permission (or something like that)? The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages --- Key: HDFS-7923 URL: https://issues.apache.org/jira/browse/HDFS-7923 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.8.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.8.0 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, HDFS-7923.002.patch, HDFS-7923.003.patch, HDFS-7923.004.patch, HDFS-7923.006.patch, HDFS-7923.007.patch The DataNodes should rate-limit their full block reports. They can do this by first sending a heartbeat message to the NN with an optional boolean set which requests permission to send a full block report. If the NN responds with another optional boolean set, the DN will send an FBR... if not, it will wait until later. This can be done compatibly with optional fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
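One possible shape for such a safety net, as hypothetical DN-side logic rather than anything in the HDFS-7923 patch (BlockReportScheduler and the 2-period bound are invented for illustration):
{code:java}
// The DN asks for permission on heartbeats, but after deferring for a
// bounded number of block report periods it sends the FBR unconditionally,
// so a NN-side bug cannot delay full block reports forever.
class BlockReportScheduler {
  static final int MAX_DEFERRED_PERIODS = 2; // illustrative bound

  private int deferredPeriods = 0;

  boolean shouldSendFullBlockReport(boolean nnGrantedPermission) {
    if (nnGrantedPermission || deferredPeriods >= MAX_DEFERRED_PERIODS) {
      deferredPeriods = 0;
      return true; // send the FBR now
    }
    deferredPeriods++; // defer; try again on a later heartbeat
    return false;
  }
}
{code}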
[jira] [Commented] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
[ https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587054#comment-14587054 ] Sanjay Radia commented on HDFS-7923: How will you ensure that a particular DN does not get starved? I.e., how do you guarantee that BRs will get through? HDFS depends on periodic BRs for correctness. I recall discussions with Facebook where they changed their HDFS to use incremental BRs but still kept full BRs at a lower frequency just for safety. The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages --- Key: HDFS-7923 URL: https://issues.apache.org/jira/browse/HDFS-7923 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.8.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.8.0 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, HDFS-7923.002.patch, HDFS-7923.003.patch, HDFS-7923.004.patch, HDFS-7923.006.patch, HDFS-7923.007.patch The DataNodes should rate-limit their full block reports. They can do this by first sending a heartbeat message to the NN with an optional boolean set which requests permission to send a full block report. If the NN responds with another optional boolean set, the DN will send an FBR... if not, it will wait until later. This can be done compatibly with optional fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
[ https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587163#comment-14587163 ] Sanjay Radia commented on HDFS-7923: Have you considered a pull model (NN pulls) which does not risk starvation? The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages --- Key: HDFS-7923 URL: https://issues.apache.org/jira/browse/HDFS-7923 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.8.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.8.0 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, HDFS-7923.002.patch, HDFS-7923.003.patch, HDFS-7923.004.patch, HDFS-7923.006.patch, HDFS-7923.007.patch The DataNodes should rate-limit their full block reports. They can do this by first sending a heartbeat message to the NN with an optional boolean set which requests permission to send a full block report. If the NN responds with another optional boolean set, the DN will send an FBR... if not, it will wait until later. This can be done compatibly with optional fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8401) Memfs - a layered file system for in-memory storage in HDFS
[ https://issues.apache.org/jira/browse/HDFS-8401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564043#comment-14564043 ] Sanjay Radia commented on HDFS-8401: Consider the following use case: one wants to run a few jobs and cache the input and the intermediate output just for the duration of these jobs. Today the user has to pin such data by changing the dir-file attributes, and when the jobs are finished he has to reset the attributes. It is easier to say jobxxx input = memfs://.../input tmp=memfs://.../tmpdir output=. Here setting the scheme is not inconvenient since it is part of the parameters to a program. Further, this works with any existing application - Hive, Pig, etc. - since the hint to cache is in the scheme of the pathname. Our existing policies and dir-level settings work when things are semi-permanent (i.e., this dir has dimension tables, please cache them - all jobs will benefit). In addition, we could add (or already have) programmatic APIs to indicate that a file being read or written needs to be cached. But this requires changes to the application code. Once we get fully automated memory caching working we will not need our existing storage policies nor layers like memfs, since the system will just take care of it all - but it will take us some time to get there. I think both approaches have their own strengths and are complementary. Note spark-tachyon uses a layered file system, and the approach is viewed as a simple way to control which files get cached on a per-job basis. Further, one can also cache specific Hive tables in the Hive metastore by giving a pathname that has the memfs scheme. Here the memfs pathname and setting the dir's attributes are roughly equal from an ease-of-use perspective. An additional point about memfs for non-hdfs systems: the Memfs *abstraction* allows caching S3 data in a very similar fashion. Of course one will have to build a full caching implementation of memfs for S3, because the memfs proposed in this Jira is a very thin layer over HDFS - ALL the caching mechanism is already in HDFS. So I expect several implementations of the memfs interface for HCFS file systems. Memfs - a layered file system for in-memory storage in HDFS --- Key: HDFS-8401 URL: https://issues.apache.org/jira/browse/HDFS-8401 Project: Hadoop HDFS Issue Type: Bug Reporter: Arpit Agarwal Assignee: Arpit Agarwal We propose creating a layered filesystem that can provide in-memory storage using existing features within HDFS. memfs will use lazy persist writes introduced by HDFS-6581. For reads, memfs can use the Centralized Cache Management feature introduced in HDFS-4949 to load hot data to memory. Paths in memfs and hdfs will correspond 1:1 so memfs will require no additional metadata and it can be implemented entirely as a client-side library. The advantage of a layered file system is that it requires little or no changes to existing applications. e.g. Applications can use something like {{memfs://}} instead of {{hdfs://}} for files targeted to memory storage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
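The 1:1 path mapping that makes memfs metadata-free can be sketched in a few lines. This is an illustrative helper, not part of the HDFS-8401 patch; the class name is invented:
{code:java}
import java.net.URI;
import org.apache.hadoop.fs.Path;

// A memfs path is the same HDFS path with only the scheme changed, so the
// layer needs no path table of its own.
public class MemfsPaths {
  /** memfs://nn:8020/warehouse/dim -> hdfs://nn:8020/warehouse/dim */
  public static Path toHdfs(Path memfsPath) {
    URI u = memfsPath.toUri();
    return new Path("hdfs", u.getAuthority(), u.getPath());
  }
}
{code}
With this mapping, a job parameter like {{input=memfs://nn:8020/input}} needs no translation table; the layer simply swaps the scheme, delegates the operation to HDFS, and adds the in-memory hints (lazy-persist writes, cache directives for reads) on the way through.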
[jira] [Commented] (HDFS-8241) Remove unused Namenode startup option FINALIZE
[ https://issues.apache.org/jira/browse/HDFS-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514487#comment-14514487 ] Sanjay Radia commented on HDFS-8241: Yes, it was unfortunate that this incompatibility was missed. Q. Do folks feel that a startup -finalize option to the NN is a good interface in ADDITION to the admin command to finalize? Clearly we need the admin command to finalize, since one does not want to restart the NN just to finalize. Remove unused Namenode startup option FINALIZE - Key: HDFS-8241 URL: https://issues.apache.org/jira/browse/HDFS-8241 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Attachments: HDFS-8241.patch Command : hdfs namenode -finalize 15/04/24 22:26:23 INFO namenode.NameNode: createNameNode [-finalize] *Use of the argument 'FINALIZE' is no longer supported.* To finalize an upgrade, start the NN and then run `hdfs dfsadmin -finalizeUpgrade' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8075) Revist layout version
Sanjay Radia created HDFS-8075: -- Summary: Revist layout version Key: HDFS-8075 URL: https://issues.apache.org/jira/browse/HDFS-8075 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: Sanjay Radia Background * HDFS image layout was changed to use Protobufs to allow easier forward and backward compatibility. * HDFS has a layout version which is changed on each change (even if only an optional protobuf field was added). * Hadoop supports two ways of going back during an upgrade: ** downgrade: go back to the old binary version but use the existing image/edits so that newly created files are not lost ** rollback: go back to the checkpoint created before the upgrade was started - hence newly created files are lost. Layout needs to be revisited if we want to support downgrade in some circumstances, which we don't today. Here are use cases: * Some changes can support downgrade even though there was a change in layout, since there is no real data loss but only loss of new functionality. E.g. when we added ACLs one could have downgraded - there is no data loss but you will lose the newly created ACLs. That is acceptable for a user since one does not expect to retain the newly added ACLs in an old version. * Some changes may lead to data loss if the functionality was used. For example, the recent truncate will cause data loss if the functionality was actually used. Now one can tell admins NOT to use such new features till the upgrade is finalized, in which case one could potentially support downgrade. * A fairly fundamental change to layout where a downgrade is not possible but a rollback is. Say we change the layout completely from protobuf to something else. Another example is when HDFS moves to support partial namespace in memory - there is likely to be a fairly fundamental change in layout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8075) Revist layout version
[ https://issues.apache.org/jira/browse/HDFS-8075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia resolved HDFS-8075. Resolution: Duplicate Closed: duplicate of HDFS-5223 Revist layout version - Key: HDFS-8075 URL: https://issues.apache.org/jira/browse/HDFS-8075 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: Sanjay Radia Background * HDFS image layout was changed to use Protobufs to allow easier forward and backward compatibility. * Hdfs has a layout version which is changed on each change (even if it an optional protobuf field was added). * Hadoop supports two ways of going back during an upgrade: ** downgrade: go back to old binary version but use existing image/edits so that newly created files are not lost ** rollback: go back to checkpoint created before upgrade was started - hence newly created files are lost. Layout needs to be revisited if we want to support downgrade is some circumstances which we dont today. Here are use cases: * Some changes can support downgrade even though they was a change in layout since there is not real data loss but only loss of new functionality. E.g. when we added ACLs one could have downgraded - there is no data loss but you will lose the newly created ACLs. That is acceptable for a user since one does not expect to retain the newly added ACLs in an old version. * Some changes may lead to data-loss if the functionality was used. For example, the recent truncate will cause data loss if the functionality was actually used. Now one can tell admins NOT use such new such new features till the upgrade is finalized in which case one could potentially support downgrade. * A fairly fundamental change to layout where a downgrade is not possible but a rollback is. Say we change the layout completely from protobuf to something else. Another example is when HDFS moves to support partial namespace in memory - they is likely to be a fairly fundamental change in layout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5223) Allow edit log/fsimage format changes without changing layout version
[ https://issues.apache.org/jira/browse/HDFS-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483654#comment-14483654 ] Sanjay Radia commented on HDFS-5223: For the edits one could require that in order to downgrade you must do a save-image and then delete the (now null) edit log. We would then limit our solution to the image. For the image we could do the following: * Add a *second* layout version field (call it compatible-layout-version) that indicates which version can safely read the image without data-loss. A NN that starts up will compare this field with its current layout version and then proceed as long as the edit log is null (a sketch of this check follows this message). ** The ACL example (see Jira description) will state that the previous version can safely read the image without data loss. Of course newly created ACLs would be lost. ** The truncate example is tricky: one can safely downgrade if the truncate operation was not used. We could add code to not allow such new features till finalize is done. This is somewhat analogous to what ext3 was trying to do with its superblock feature flags (see Todd's comment above); what I am proposing is slightly different since it limits such features till the upgrade is finalized, while ext3's approach is more general in that you can downgrade at any time as long as you have not used the feature. Alternatively, we could simply not support downgrade for such a feature and simply mark the compatible-layout-version accordingly. Allow edit log/fsimage format changes without changing layout version - Key: HDFS-5223 URL: https://issues.apache.org/jira/browse/HDFS-5223 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.1.1-beta Reporter: Aaron T. Myers Assignee: Colin Patrick McCabe Attachments: HDFS-5223.004.patch Currently all HDFS on-disk formats are version by the single layout version. This means that even for changes which might be backward compatible, like the addition of a new edit log op code, we must go through the full `namenode -upgrade' process which requires coordination with DNs, etc. HDFS should support a lighter weight alternative. Copied description from HDFS-8075 which is a duplicate and now closed. Background * HDFS image layout was changed to use Protobufs to allow easier forward and backward compatibility. * Hdfs has a layout version which is changed on each change (even if it an optional protobuf field was added). * Hadoop supports two ways of going back during an upgrade: ** downgrade: go back to old binary version but use existing image/edits so that newly created files are not lost ** rollback: go back to checkpoint created before upgrade was started - hence newly created files are lost. Layout needs to be revisited if we want to support downgrade is some circumstances which we dont today. Here are use cases: * Some changes can support downgrade even though they was a change in layout since there is not real data loss but only loss of new functionality. E.g. when we added ACLs one could have downgraded - there is no data loss but you will lose the newly created ACLs. That is acceptable for a user since one does not expect to retain the newly added ACLs in an old version. * Some changes may lead to data-loss if the functionality was used. For example, the recent truncate will cause data loss if the functionality was actually used. Now one can tell admins NOT use such new such new features till the upgrade is finalized in which case one could potentially support downgrade. 
* A fairly fundamental change to layout where a downgrade is not possible but a rollback is. Say we change the layout completely from protobuf to something else. Another example is when HDFS moves to support partial namespace in memory - they is likely to be a fairly fundamental change in layout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
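A minimal sketch of the startup check proposed in the comment above. This is hypothetical code, not NN internals; real HDFS layout versions are negative and decrease over time, but plain integers keep the illustration simple:
{code:java}
// The image carries two fields: the layoutVersion that wrote it, and a
// compatibleLayoutVersion naming the oldest software that can read it
// without data loss.
class LayoutCheck {
  static void checkCanLoad(int softwareVersion,
                           int imageCompatibleVersion,
                           boolean editLogIsEmpty) throws java.io.IOException {
    if (!editLogIsEmpty) {
      throw new java.io.IOException(
          "downgrade requires a saved image and an empty edit log");
    }
    if (softwareVersion < imageCompatibleVersion) {
      throw new java.io.IOException("image requires software version >= "
          + imageCompatibleVersion + ", running " + softwareVersion);
    }
    // Safe to proceed: older software may lose new-feature metadata
    // (e.g. newly created ACLs) but no file data.
  }
}
{code}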
[jira] [Commented] (HDFS-8075) Revist layout version
[ https://issues.apache.org/jira/browse/HDFS-8075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483602#comment-14483602 ] Sanjay Radia commented on HDFS-8075: Here is a proposal: * Add a *second* layout version field (call it compatible-layout-version) that states which version can safely read the image without data-loss. ** The ACL example will state that the previous version can safely read the image without data loss. ** The truncate example is tricky: one can safely downgrade if the truncate operation was not used. We could add code to not allow such new features till finalize is done. Or we could say don't support downgrade for such a feature and simply mark the compatible-layout-version accordingly. Revist layout version - Key: HDFS-8075 URL: https://issues.apache.org/jira/browse/HDFS-8075 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: Sanjay Radia Background * HDFS image layout was changed to use Protobufs to allow easier forward and backward compatibility. * Hdfs has a layout version which is changed on each change (even if it an optional protobuf field was added). * Hadoop supports two ways of going back during an upgrade: ** downgrade: go back to old binary version but use existing image/edits so that newly created files are not lost ** rollback: go back to checkpoint created before upgrade was started - hence newly created files are lost. Layout needs to be revisited if we want to support downgrade is some circumstances which we dont today. Here are use cases: * Some changes can support downgrade even though they was a change in layout since there is not real data loss but only loss of new functionality. E.g. when we added ACLs one could have downgraded - there is no data loss but you will lose the newly created ACLs. That is acceptable for a user since one does not expect to retain the newly added ACLs in an old version. * Some changes may lead to data-loss if the functionality was used. For example, the recent truncate will cause data loss if the functionality was actually used. Now one can tell admins NOT use such new such new features till the upgrade is finalized in which case one could potentially support downgrade. * A fairly fundamental change to layout where a downgrade is not possible but a rollback is. Say we change the layout completely from protobuf to something else. Another example is when HDFS moves to support partial namespace in memory - they is likely to be a fairly fundamental change in layout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-5223) Allow edit log/fsimage format changes without changing layout version
[ https://issues.apache.org/jira/browse/HDFS-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483654#comment-14483654 ] Sanjay Radia edited comment on HDFS-5223 at 4/7/15 6:29 PM: For the edits one could require that in order to downgrade you must do a save-image and then delete the (now null) edit log. We would then limit our solution to the image. For the image we could do the following: Add a *second* layout version field (call it compatible-layout-version) that indicates which version can safely read the image without data-loss. A NN that starts up will compare this field with its current layout version and then proceed as long as the edit log is null. * The ACL example (see Jira description) will state that the previous version can safely read the image without data loss. Of course newly created ACLs would be lost. * The truncate example is tricky: one can safely downgrade if the truncate operation was not used. We could add code to not allow such new features till finalize is done. This is somewhat analogous to what ext3 was trying to do with its superblock feature flags (see Todd's comment above); what I am proposing is slightly different since it limits such features till the upgrade is finalized, while ext3's approach is more general in that you can downgrade at any time as long as you have not used the feature. We could also do the following slight variation if finalize seems too arbitrary: once you use the new feature (e.g., truncate), simply change the compatible-layout-version to be the current one; this will safely prevent older binaries from reading it. Of course alternatively, we could simply not support downgrade for such a feature and simply mark the compatible-layout-version accordingly. was (Author: sanjay.radia): For the edits one could require that in order to downgrade you must do a save-image and then delete the null edits-log. We would then limit our solution to the image. For the image we could do the following * Add a *second* layout version field (call it compatible-layout-version) that indicates which version can safely read the image without data-loss. A NN that starts up will compare this field with its current layout version and then proceed as long as the edits is null. ** The ACL example (see Jira description) will state that the previous version can safely read the image without data loss. Of course newly created ACLs would be lost. ** Truncate example is tricky: one can safely downgrade if the truncate operation was not used. We could add code to not allow such new features till finalize is done. This is somewhat analogous to what ext3 was trying to do with its superblock feature flags (see Todd's comment above); what I am proposing is slightly different since it limits such features till upgrade is finalized while ext3's approach is more general in that you can downgrade at anytime as long as you have used the feature. Alternatively, we could simply not support downgrade for such a feature and simply mark the compatible-layout-version accordingly. Allow edit log/fsimage format changes without changing layout version - Key: HDFS-5223 URL: https://issues.apache.org/jira/browse/HDFS-5223 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.1.1-beta Reporter: Aaron T. Myers Assignee: Colin Patrick McCabe Attachments: HDFS-5223.004.patch Currently all HDFS on-disk formats are version by the single layout version. 
This means that even for changes which might be backward compatible, like the addition of a new edit log op code, we must go through the full `namenode -upgrade' process which requires coordination with DNs, etc. HDFS should support a lighter weight alternative. Copied description from HDFS-8075 which is a duplicate and now closed. Background * HDFS image layout was changed to use Protobufs to allow easier forward and backward compatibility. * Hdfs has a layout version which is changed on each change (even if it an optional protobuf field was added). * Hadoop supports two ways of going back during an upgrade: ** downgrade: go back to old binary version but use existing image/edits so that newly created files are not lost ** rollback: go back to checkpoint created before upgrade was started - hence newly created files are lost. Layout needs to be revisited if we want to support downgrade is some circumstances which we dont today. Here are use cases: * Some changes can support downgrade even though they was a change in layout since there is not real data loss but only loss of new functionality. E.g. when we added ACLs one could have downgraded - there is no data loss but you will lose the
[jira] [Updated] (HDFS-5223) Allow edit log/fsimage format changes without changing layout version
[ https://issues.apache.org/jira/browse/HDFS-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-5223: --- Description: Currently all HDFS on-disk formats are version by the single layout version. This means that even for changes which might be backward compatible, like the addition of a new edit log op code, we must go through the full `namenode -upgrade' process which requires coordination with DNs, etc. HDFS should support a lighter weight alternative. Copied description from HDFS-8075 which is a duplicate and now closed. Background * HDFS image layout was changed to use Protobufs to allow easier forward and backward compatibility. * Hdfs has a layout version which is changed on each change (even if it an optional protobuf field was added). * Hadoop supports two ways of going back during an upgrade: ** downgrade: go back to old binary version but use existing image/edits so that newly created files are not lost ** rollback: go back to checkpoint created before upgrade was started - hence newly created files are lost. Layout needs to be revisited if we want to support downgrade is some circumstances which we dont today. Here are use cases: * Some changes can support downgrade even though they was a change in layout since there is not real data loss but only loss of new functionality. E.g. when we added ACLs one could have downgraded - there is no data loss but you will lose the newly created ACLs. That is acceptable for a user since one does not expect to retain the newly added ACLs in an old version. * Some changes may lead to data-loss if the functionality was used. For example, the recent truncate will cause data loss if the functionality was actually used. Now one can tell admins NOT use such new such new features till the upgrade is finalized in which case one could potentially support downgrade. * A fairly fundamental change to layout where a downgrade is not possible but a rollback is. Say we change the layout completely from protobuf to something else. Another example is when HDFS moves to support partial namespace in memory - they is likely to be a fairly fundamental change in layout. was:Currently all HDFS on-disk formats are version by the single layout version. This means that even for changes which might be backward compatible, like the addition of a new edit log op code, we must go through the full `namenode -upgrade' process which requires coordination with DNs, etc. HDFS should support a lighter weight alternative. Allow edit log/fsimage format changes without changing layout version - Key: HDFS-5223 URL: https://issues.apache.org/jira/browse/HDFS-5223 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.1.1-beta Reporter: Aaron T. Myers Assignee: Colin Patrick McCabe Attachments: HDFS-5223.004.patch Currently all HDFS on-disk formats are version by the single layout version. This means that even for changes which might be backward compatible, like the addition of a new edit log op code, we must go through the full `namenode -upgrade' process which requires coordination with DNs, etc. HDFS should support a lighter weight alternative. Copied description from HDFS-8075 which is a duplicate and now closed. Background * HDFS image layout was changed to use Protobufs to allow easier forward and backward compatibility. * Hdfs has a layout version which is changed on each change (even if it an optional protobuf field was added). 
* Hadoop supports two ways of going back during an upgrade: ** downgrade: go back to old binary version but use existing image/edits so that newly created files are not lost ** rollback: go back to checkpoint created before upgrade was started - hence newly created files are lost. Layout needs to be revisited if we want to support downgrade is some circumstances which we dont today. Here are use cases: * Some changes can support downgrade even though they was a change in layout since there is not real data loss but only loss of new functionality. E.g. when we added ACLs one could have downgraded - there is no data loss but you will lose the newly created ACLs. That is acceptable for a user since one does not expect to retain the newly added ACLs in an old version. * Some changes may lead to data-loss if the functionality was used. For example, the recent truncate will cause data loss if the functionality was actually used. Now one can tell admins NOT use such new such new features till the upgrade is finalized in which case one could potentially support downgrade. * A fairly fundamental change to layout where a downgrade is not possible but a rollback is. Say we change the layout completely from
[jira] [Commented] (HDFS-8075) Revist layout version
[ https://issues.apache.org/jira/browse/HDFS-8075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483606#comment-14483606 ] Sanjay Radia commented on HDFS-8075: Oops. I will close this as duplicate. I will copy the description of this Jira to the HDFS-5223 since it has some good examples. Revist layout version - Key: HDFS-8075 URL: https://issues.apache.org/jira/browse/HDFS-8075 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: Sanjay Radia Background * HDFS image layout was changed to use Protobufs to allow easier forward and backward compatibility. * Hdfs has a layout version which is changed on each change (even if it an optional protobuf field was added). * Hadoop supports two ways of going back during an upgrade: ** downgrade: go back to old binary version but use existing image/edits so that newly created files are not lost ** rollback: go back to checkpoint created before upgrade was started - hence newly created files are lost. Layout needs to be revisited if we want to support downgrade is some circumstances which we dont today. Here are use cases: * Some changes can support downgrade even though they was a change in layout since there is not real data loss but only loss of new functionality. E.g. when we added ACLs one could have downgraded - there is no data loss but you will lose the newly created ACLs. That is acceptable for a user since one does not expect to retain the newly added ACLs in an old version. * Some changes may lead to data-loss if the functionality was used. For example, the recent truncate will cause data loss if the functionality was actually used. Now one can tell admins NOT use such new such new features till the upgrade is finalized in which case one could potentially support downgrade. * A fairly fundamental change to layout where a downgrade is not possible but a rollback is. Say we change the layout completely from protobuf to something else. Another example is when HDFS moves to support partial namespace in memory - they is likely to be a fairly fundamental change in layout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5223) Allow edit log/fsimage format changes without changing layout version
[ https://issues.apache.org/jira/browse/HDFS-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483704#comment-14483704 ] Sanjay Radia commented on HDFS-5223: The above solution was inspired by Hive's ORC. They have two complementary mechanisms for dealing with old and new binaries. They specify the oldest version that can safely read the new data (which inspired the solution I gave above), and new binaries can also write in the older format. This second mechanism is too burdensome for HDFS. Instead I would prefer to disable the new features till the upgrade is finalized, after which one cannot downgrade. Allow edit log/fsimage format changes without changing layout version - Key: HDFS-5223 URL: https://issues.apache.org/jira/browse/HDFS-5223 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.1.1-beta Reporter: Aaron T. Myers Assignee: Colin Patrick McCabe Attachments: HDFS-5223.004.patch Currently all HDFS on-disk formats are version by the single layout version. This means that even for changes which might be backward compatible, like the addition of a new edit log op code, we must go through the full `namenode -upgrade' process which requires coordination with DNs, etc. HDFS should support a lighter weight alternative. Copied description from HDFS-8075 which is a duplicate and now closed. Background * HDFS image layout was changed to use Protobufs to allow easier forward and backward compatibility. * Hdfs has a layout version which is changed on each change (even if it an optional protobuf field was added). * Hadoop supports two ways of going back during an upgrade: ** downgrade: go back to old binary version but use existing image/edits so that newly created files are not lost ** rollback: go back to checkpoint created before upgrade was started - hence newly created files are lost. Layout needs to be revisited if we want to support downgrade is some circumstances which we dont today. Here are use cases: * Some changes can support downgrade even though they was a change in layout since there is not real data loss but only loss of new functionality. E.g. when we added ACLs one could have downgraded - there is no data loss but you will lose the newly created ACLs. That is acceptable for a user since one does not expect to retain the newly added ACLs in an old version. * Some changes may lead to data-loss if the functionality was used. For example, the recent truncate will cause data loss if the functionality was actually used. Now one can tell admins NOT use such new such new features till the upgrade is finalized in which case one could potentially support downgrade. * A fairly fundamental change to layout where a downgrade is not possible but a rollback is. Say we change the layout completely from protobuf to something else. Another example is when HDFS moves to support partial namespace in memory - they is likely to be a fairly fundamental change in layout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14345657#comment-14345657 ] Sanjay Radia commented on HDFS-6200: +++1 for this proposal.
Create a separate jar for hdfs-client - Key: HDFS-6200 URL: https://issues.apache.org/jira/browse/HDFS-6200 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch
Currently the hadoop-hdfs jar contains both the hdfs server and the hdfs client. As discussed in the hdfs-dev mailing list (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), downstream projects are forced to bring in additional dependencies in order to access hdfs. The additional dependencies can sometimes be difficult to manage for projects like Apache Falcon and Apache Oozie.
This jira proposes to create a new project, hadoop-hdfs-client, which contains the client side of the hdfs code. Downstream projects can use this jar instead of hadoop-hdfs to avoid the unnecessary dependency. Note that it does not break the compatibility of downstream projects. This is because old downstream projects implicitly depend on hadoop-hdfs-client through the hadoop-hdfs jar.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7745) HDFS should have its own daemon command and not rely on the one in common
Sanjay Radia created HDFS-7745: -- Summary: HDFS should have its own daemon command and not rely on the one in common Key: HDFS-7745 URL: https://issues.apache.org/jira/browse/HDFS-7745 Project: Hadoop HDFS Issue Type: Improvement Reporter: Sanjay Radia
HDFS should have its own daemon command and not rely on the one in common. BTW Yarn split out its own daemon command during the project split. Note the hdfs command does have a --daemon flag, and hence the daemon script is merely a wrapper.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6469) Coordinated replication of the namespace using ConsensusNode
[ https://issues.apache.org/jira/browse/HDFS-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111655#comment-14111655 ] Sanjay Radia commented on HDFS-6469: My thoughts:
* I do believe that a Paxos-based NN would give faster failover than what NN HA offers today (30 seconds to a few minutes, but typically no more than a minute or two). So this is clearly a benefit of CNode, though I have not heard a single customer complain about the failover time so far.
* The proposed solution does not increase the write throughput.
* The parallel-reads advantage of CNode can be achieved in the current HA setup with some work (this is discussed above). If this is the main benefit, then I would rather pursue enhancing the NN standby to support reads. Further, there is ongoing work to improve the locking in the NN.
* I share Todd's view that ZK is not a usable reference implementation for Paxos. One really needs a Paxos library that can be plugged in rather than an external server-based solution like ZK.
So at this stage I am having a hard time seeing the benefits to justify the costs of adding this complexity. I do however understand the overhead that Wandisco faces in integrating their solution with HDFS each time HDFS is modified. Would a few plugin interfaces make it easier? I would be more than happy to support adding such plugins if they would help.
Coordinated replication of the namespace using ConsensusNode Key: HDFS-6469 URL: https://issues.apache.org/jira/browse/HDFS-6469 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Attachments: CNodeDesign.pdf
This is a proposal to introduce ConsensusNode - an evolution of the NameNode, which enables replication of the namespace on multiple nodes of an HDFS cluster by means of a Coordination Engine.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099243#comment-14099243 ] Sanjay Radia commented on HDFS-6134: We have made very good progress over the last few days. Thanks for taking the time for the offline technical discussions. Below is a summary of the concerns I have raised previously in this Jira.
# Fix distcp and cp to *automatically* deal with EZs using /r/r internally (see the sketch after this comment). Initially we need to support only row 1 and row 4 in the table I attached in HADOOP-10919.
# Fix webhdfs to use KMS delegation tokens so that webhdfs can be used with transparent encryption without giving user hdfs KMS proxy permission (and, as a result, giving it to admins). REST is a key protocol for HDFS and for many Hadoop use cases; an admin should not have access to the keys of encrypted files.
# Further work on specifying what HAR should do (I have listed some use cases and proposed solutions), and then follow it up with a fix to har.
# Some work on understanding availability and scalability of the KMS for medium to large clusters. Perhaps we need to explore getting the keys ahead of time when a job is submitted.
Let's complete items 1 and 2 promptly. Before we publish transparent encryption in a 2.x release for public consumption, let us at least complete item 1 (i.e. distcp and cp) and the flag to turn this feature on/off.
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
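The sketch referenced in item 1 above: what copying through a raw namespace might look like, assuming the /.reserved/raw convention from the HDFS-6509 distcp design. The paths are illustrative, and a real tool must also preserve the raw.* xattrs that carry each file's encryption info (as distcp -px would):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

/**
 * Illustrative raw copy: reading under /.reserved/raw returns the
 * ciphertext as stored, so the file is copied without ever being
 * decrypted and without the copying user needing any key access.
 */
public class RawCopySketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    Path src = new Path("/.reserved/raw/zone1/file");        // hypothetical EZ path
    Path dst = new Path("/.reserved/raw/backup/zone1/file"); // hypothetical target

    // Bytes move as-is; note FileUtil.copy alone does not carry the
    // raw.* xattrs holding the EDEK, which the real fix must also copy.
    FileUtil.copy(fs, src, fs, dst, false /* keep source */, conf);
  }
}
{code}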
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099311#comment-14099311 ] Sanjay Radia commented on HDFS-6134: Alejandro, wrt the subtle difference between webhdfs and httpfs: can an admin grab the EDEKs and raw files, then log into the httpfs machine, become user httpfs, and trick the KMS into decrypting the keys because httpfs has the proxy setting?
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096614#comment-14096614 ] Sanjay Radia commented on HDFS-6134: I get your point about client-side code for webhdfs. I do agree that httpfs is a proxy, but do you want it to have blanket access to all keys? My main concern is that this jira completely breaks webhdfs. Do you find that acceptable? There are so many users of this protocol. BTW did you see my earlier attempt at a solution (13.06 today) - does that work?
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096617#comment-14096617 ] Sanjay Radia commented on HDFS-6134: Alejandro, can you please summarize your explanation for why, during file creation, the NN requests the KMS to create a new EDEK rather than having the client do it. Suresh raised the same concern that I did at our meeting yesterday. Thanks
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14097330#comment-14097330 ] Sanjay Radia commented on HDFS-6134: Had a chat with Owen over the webhdfs issue and the solution I had proposed in [comment | https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14096027page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14096027]. He said that restricting client connections from user hdfs is not necessary: the DN does a doAs(user). The KMS is configured with hdfs as a proxy, but it also blacklists hdfs (and other superusers). That is, the DN as a proxy cannot get a key for hdfs, but it can get the keys for other users. This brings the httpfs and webhdfs solutions to the same place. Owen proposed another solution where the httpfs or DN daemons do *not* need to be trusted proxies for the KMS: the user simply passes a KMS delegation token in the REST request (we already pass HDFS delegation tokens).
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
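A sketch of the doAs pattern Owen described, using the real UserGroupInformation proxy-user API; the kmsDecrypt stub and the blacklist behaviour noted in the comments are assumptions about the KMS configuration, not working code:
{code:java}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

/**
 * Sketch: the DN runs as the hdfs service user but performs the KMS
 * decrypt call as the end user via doAs, so the KMS authorizes against
 * the end user.  On the KMS side, hdfs is configured as a proxyuser but
 * is itself blacklisted from decrypting EDEKs, so logging in as hdfs
 * and asking directly gains nothing.
 */
public class ProxyDecryptSketch {
  static byte[] decryptAsUser(String endUser, byte[] edek) throws Exception {
    UserGroupInformation proxyUgi = UserGroupInformation.createProxyUser(
        endUser, UserGroupInformation.getLoginUser());
    return proxyUgi.doAs(
        (PrivilegedExceptionAction<byte[]>) () -> kmsDecrypt(edek));
  }

  // Placeholder for the KMS decryptEncryptedKey round trip.
  private static byte[] kmsDecrypt(byte[] edek) {
    throw new UnsupportedOperationException("illustrative stub");
  }
}
{code}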
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14097610#comment-14097610 ] Sanjay Radia commented on HDFS-6134: Context: making things work for cp, distcp, har, etc. Is the following true: the EZ master key (EZKey) is only needed for file creation in an EZ subtree; after that, for reading or appending to a file, one simply needs the file's individual key. If that is true then one can copy raw encrypted files and their keys from an EZ to tape, har, tar, etc. and then restore them later, and things would just work. Also, can one copy raw encrypted files and their keys from an EZ to another EZ which has a different EZKey, and again things would work?
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
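If the answers are yes, the read path would only ever involve the per-file key, roughly as below (a sketch against the KeyProviderCryptoExtension API; stream construction and error handling are omitted):
{code:java}
import org.apache.hadoop.crypto.key.KeyProvider.KeyVersion;
import org.apache.hadoop.crypto.key.KeyProviderCryptoExtension;
import org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.EncryptedKeyVersion;

/**
 * Sketch of decrypt-on-read: the NN hands the client the file's EDEK,
 * the KMS unwraps it to the per-file DEK, and the EZ master key never
 * leaves the KMS - the property the copy/restore questions above
 * depend on.
 */
public class ReadKeySketch {
  static byte[] dekFor(KeyProviderCryptoExtension kms,
                       EncryptedKeyVersion edek) throws Exception {
    KeyVersion dek = kms.decryptEncryptedKey(edek); // one KMS round trip
    return dek.getMaterial();  // plaintext DEK, feeds a CryptoInputStream
  }
}
{code}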
[jira] [Comment Edited] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14097687#comment-14097687 ] Sanjay Radia edited comment on HDFS-6134 at 8/14/14 9:23 PM: - Some thoughts on the Har use cases and possible outcomes:
1) Har a subtree and the subtree contains an EZ.
2) Har a subtree rooted at the EZ.
3) Har a subtree within an EZ.
Typically the subtree is replaced by the har itself, though that is not required. The har is read-only. The operation can be performed by an admin or by a user.
Use case 1 - copy the raw files and the keys into the HAR (i.e. the files inside the HAR remain encrypted). When files are accessed from the Har filesystem, the same machinery as for HDFS EZs should come into play to allow transparent decryption of the files. A user with no KMS permission will not be able to decrypt. Someone with read access to the HAR will be able to get to the raw files and their keys (how does this compare to normal HDFS EZs?).
Use case 2 - same as 1.
Use case 3 - if the har is copied elsewhere (i.e. it does not replace the subtree), then same as 1. If it does replace the subtree, the HAR will be encrypted once again (i.e. double encryption).
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14097875#comment-14097875 ] Sanjay Radia commented on HDFS-6134: bq. Had a chat with Owen over the webhdfs issue and the solution I had proposed in comment. He said that restricting client connections from user hdfs is not necessary: the DN does a doAs(user). KMS is configured for hdfs to be a proxy but it also blacklists hdfs (and other superusers). That is, the DN as a proxy cannot get a key for hdfs but it can get the keys for other users. So this brings the httpfs and webhdfs solutions to the same place.
The above does not work: an admin can log in as hdfs, pretend to be the NN/DN, and use the proxy privilege to get DEKs from EDEKs (an admin can read EDEKs easily). (Alejandro - thanks for the explanation - I finally get the distinction between webhdfs and httpfs.)
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14095204#comment-14095204 ] Sanjay Radia commented on HDFS-6134: Larry, I don't completely get the difference between webhdfs and httpfs, but I think the cause of the difference is that user hdfs is a superuser (note the DN runs as hdfs, and webhdfs code is executed on behalf of the end-user inside the DN after checking the permissions). Hence I think this would potentially open up access to all encrypted files that are readable. However, that should NOT happen if doAs is used (correct?). I agree it would be unacceptable to say that if one enables transparent encryption then one should disable webhdfs because it would become insecure. Andrew says "Regarding webhdfs, it's not a recommended deployment", but Alejandro says "Both httpfs and webhdfs will work just fine" and then in the same paragraph says this could fail some security audits.
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096027#comment-14096027 ] Sanjay Radia edited comment on HDFS-6134 at 8/13/14 8:19 PM: - Alejandro, a potential solution: treat user hdfs as a special user such that the HDFS system will NOT accept any client connections from hdfs. An admin will not be able to connect as user hdfs but can connect as, say, user ClarkKent, where ClarkKent is in the superuser group of hdfs so that the admin can do his job as superuser. It does mean that we are trusting the HDFS code to be correct in not abusing its access to keys, since it has proxy authority with the KMS (this was not required so far).
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096570#comment-14096570 ] Sanjay Radia commented on HDFS-6134: bq. If you set up httpfs, it runs using the 'httpfs' user, a HDFS regular user configured as proxyuser to interact with HDFS and KMS doing doAs calls
Alejandro, we modified the original design in this Jira so that the NN is not a proxy for the keys; instead the client gets the keys directly from the KMS, because the best practice in encryption is to eliminate proxies (see Owen's comment of June 11). With your proposal for httpfs, the httpfs server is a proxy to get the keys. Perhaps we are approaching the problem wrong. Consider the following alternative: let webhdfs and httpfs simply send the encrypted raw data to the client. For the hdfs-native filesystem, the encryption and decryption happen on the client side; we should consider the same for the REST protocol (see the sketch after this comment). Clearly it requires more code on the REST client side. BTW the webhdfs FileSystem (as opposed to the REST protocol being discussed) has a client-side library that can mimic the HDFS filesystem's client side.
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
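A sketch of the alternative proposed above - the server returns raw ciphertext and a smarter REST client decrypts locally - using the CryptoCodec/CryptoInputStream classes from the Hadoop crypto work; how the client obtains the DEK and IV (e.g. from the KMS with a delegation token) is assumed, not shown:
{code:java}
import java.io.InputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.crypto.CryptoCodec;
import org.apache.hadoop.crypto.CryptoInputStream;

/**
 * Sketch: wrap the raw HTTP response body in a CryptoInputStream so
 * decryption happens on the client side, mirroring the native DFSClient
 * instead of decrypting inside the DN or the httpfs proxy.
 */
public class RestDecryptSketch {
  static InputStream decrypting(InputStream rawHttpBody,
                                byte[] dek, byte[] iv) throws Exception {
    CryptoCodec codec = CryptoCodec.getInstance(new Configuration());
    return new CryptoInputStream(rawHttpBody, codec, dek, iv);
  }
}
{code}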
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14094485#comment-14094485 ] Sanjay Radia commented on HDFS-6134: bq. Regarding webhdfs, it's not a recommended deployment.
The design document in this jira already states that webhdfs just works:
* This Jira provides encryption for HDFS data at rest and allows any application to access it via the Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API.
* For WebHDFS, the DataNodes act as the HDFS client reading/writing files since that is where encryption/decryption will happen. For HttpFS, the HttpFS server acts as the HDFS client reading/writing files, since that is where encryption/decryption will happen.
webhdfs not working is worrying because REST is used by many users who do not want to deploy Hadoop binaries or want to use a non-Java client. Also, I do not understand why httpfs works and webhdfs breaks. Neither will be running as the end-user and hence neither will allow transparent encryption. Am I missing something?
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14094527#comment-14094527 ] Sanjay Radia commented on HDFS-6134: bq. Regarding HAR, could you lay out the usecase ...
Alejandro summarized the problem, and also the solution of modifying har, in his comment of June 24th: https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14042797page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14042797
Andrew, you are missing one of the usage models of HAR: the user creating the har is not the only user accessing the har - har is a general tool used by an admin to compact files and replace the original. I can think of at least the following use cases so far:
* A subtree being har'ed has a subtree that is an EZ - some files in the har will be encrypted and some will not. The reader should be able to transparently read each of the two kinds.
* A subtree being har'ed is part of a subtree that is an EZ - the whole har should be encrypted and transparently decrypted when its contents are read.
* A user har's a non-EZ subtree and copies it into an EZ - this should just work; as you suggest, the whole thing is encrypted, and reading the har requires that the user has access to the keys.
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14094613#comment-14094613 ] Sanjay Radia edited comment on HDFS-6134 at 8/13/14 4:58 AM: - Alejandro - for both webhdfs and httpfs to work, your proposal is that users hdfs and httpfs have access to any key (you mention only webhdfs in your comment but I suspect you meant both). However, with this approach webhdfs and httpfs will give access to ALL EZ files to users that have read access. Correct? This would be unacceptable. I believe the better solution is for webhdfs and httpfs to access the file by doing a doAs(endUser).
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14093538#comment-14093538 ] Sanjay Radia edited comment on HDFS-6134 at 8/12/14 12:19 AM: -- bq. Charles posted a design doc for how distcp will work with encryption at HDFS-6509.
I did a quick glance over it. We also need to do the same for har. I think the same .raw should work ...
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14093579#comment-14093579 ] Sanjay Radia commented on HDFS-6134: Wrt webhdfs, the document says that the decryption/encryption will happen in the DataNode.
* Will the DN be able to access the key necessary to do this?
* The data will be transmitted in the clear - is that what we want? For the normal HDFS API the decryption/encryption happens at the client side.
* There are two aspects to webhdfs: the REST client and the webhdfs FileSystem. Have you considered both use cases?
* Will distcp work via webhdfs? Customers often use webhdfs instead of hdfs for cross-cluster copies.
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14093583#comment-14093583 ] Sanjay Radia commented on HDFS-6134: One of the items raised at the meeting, and summarized by Owen in his meeting minutes comment (June 26), is the scalability concern. How is that being addressed? Can a job client get the keys prior to job submission?
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 3.0.0, 2.3.0 Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6469) Coordinated replication of the namespace using ConsensusNode
[ https://issues.apache.org/jira/browse/HDFS-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062502#comment-14062502 ] Sanjay Radia commented on HDFS-6469: bq. I meant that if you use QJM then every update on the NameNode results in writing into two journals: first into edits log and then into QJM journal.
Konstantine, HDFS has supported parallel journals (i.e. multiple edit logs) for a long time; they are written in parallel. A customer can use just QJM (which gives at least 3 replicas) and can optionally have a local parallel editlog if they want additional redundancy. What you are proposing is dual *serial* journals.
Coordinated replication of the namespace using ConsensusNode Key: HDFS-6469 URL: https://issues.apache.org/jira/browse/HDFS-6469 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Attachments: CNodeDesign.pdf
This is a proposal to introduce ConsensusNode - an evolution of the NameNode, which enables replication of the namespace on multiple nodes of an HDFS cluster by means of a Coordination Engine.
-- This message was sent by Atlassian JIRA (v6.2#6252)
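A sketch of the distinction being drawn: with parallel journals, one logical logEdit() writes the same transaction to every configured journal (QJM plus an optional local edits dir) in a single logging step, whereas dual serial journals would commit to one journal and then separately to a second. The classes are illustrative, not HDFS's actual JournalSet:
{code:java}
import java.io.IOException;
import java.util.List;

// Illustrative stand-in for an edits journal (QJM, local directory, ...).
interface Journal {
  void write(long txId, byte[] op) throws IOException;
}

/** Parallel fan-out: one transaction, every journal, one logging step. */
class ParallelJournalsSketch {
  private final List<Journal> journals;

  ParallelJournalsSketch(List<Journal> journals) {
    this.journals = journals;
  }

  void logEdit(long txId, byte[] op) throws IOException {
    for (Journal j : journals) {
      j.write(txId, op); // same txn to each journal; no second, serial commit
    }
  }
}
{code}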
[jira] [Commented] (HDFS-6469) Coordinated replication of the namespace using ConsensusNode
[ https://issues.apache.org/jira/browse/HDFS-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062559#comment-14062559 ] Sanjay Radia commented on HDFS-6469: Todd said:
bq. a fully usable solution would be available to the community at large, whereas the design you're proposing seems like it will only be usably implemented by a proprietary extension (I don't consider the ZK reference implementation likely to actually work in a usable fashion).
Konstantine, I had mentioned exactly this point to you at Hadoop Summit Europe. ZK is a coordination service, and for this to be practical it needs to be an inline Paxos protocol. We had also discussed 2 potential Paxos libraries that could come into open source: I believe Facebook has one that they may contribute, and CMU has one called E-Paxos; if either of these becomes available then it addresses this particular issue. I have no objections to a customer going to Wandisco for the enterprise-supported version, but if the community is going to maintain such an extension then there needs to be a practical, in-production-usable free solution; sending offline messages to a coordinator service for each transaction is not usable. Let's discuss the performance part in a separate comment.
Let me comment on your comparisons to the topology and Windows examples that the community supported in the past:
* Topology - these changes allowed Hadoop to be used on containers such as VMs.
** Both KVM and VirtualBox offer free VM solutions - the customer does not need to buy ESX.
** The topology solution would also help with a Docker container deployment, which is freely available and offers even better performance than VMs.
** Hadoop is commonly used in cloud environments (e.g. AWS, Azure, or Altiscale) which all use VMs or containers.
** Further, it was recognized that while in the past we had considered racks to be a failure zone, there could be other failure zones: nodes (for the case of VMs or containers on a host) and also groups of machines.
* Windows - this was done for platform support, which is very different from what we are talking about here; many open source solutions support multiple platforms to enable the widest adoption. BTW Hadoop supported Windows via Cygwin, but we made it first class since the initial support via Cygwin was messy.
Coordinated replication of the namespace using ConsensusNode Key: HDFS-6469 URL: https://issues.apache.org/jira/browse/HDFS-6469 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Attachments: CNodeDesign.pdf
This is a proposal to introduce ConsensusNode - an evolution of the NameNode, which enables replication of the namespace on multiple nodes of an HDFS cluster by means of a Coordination Engine.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6469) Coordinated replication of the namespace using ConsensusNode
[ https://issues.apache.org/jira/browse/HDFS-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049227#comment-14049227 ] Sanjay Radia commented on HDFS-6469: Wrt double journaling:
bq. If I follow your logic correctly QJM being Paxos-based uses a journal by itself, so we are not increasing journaling here. When you look at the bigger picture we see more journals around. HBase uses WAL along with NN edits, which by itself persisted in ext4 a journaling file system.
Konstantine, Todd's point is not that there are multiple journals in the system, but that every update operation of the NN will result in an entry in two journals: HDFS's edit log and the journal used by the ConsensusNode Paxos protocol. Your example of HBase's log and the NN log is not a good comparison: every write to the HBase WAL does NOT result in an HDFS editlog entry - an entry is made in the HDFS editlog ONLY when the WAL crosses a block boundary.
Coordinated replication of the namespace using ConsensusNode Key: HDFS-6469 URL: https://issues.apache.org/jira/browse/HDFS-6469 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Attachments: CNodeDesign.pdf
This is a proposal to introduce ConsensusNode - an evolution of the NameNode, which enables replication of the namespace on multiple nodes of an HDFS cluster by means of a Coordination Engine.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045415#comment-14045415 ] Sanjay Radia commented on HDFS-6134: Noticed the rename restriction for encryption zones. In the past, rename was one of the main objections to volumes (i.e. volumes should not restrict renames). I think we should bite the bullet and introduce the notion of volumes, and use encryption as the first use case for volumes (i.e. an encryption zone becomes an encrypted volume). Snapshots can also benefit from a volume rename restriction, because rename across snapshots is very hard to support.
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14042527#comment-14042527 ] Sanjay Radia commented on HDFS-6134: bq. Can you be a bit more specific on HAR breaking?
Har copies subtree data into a tar-like structure. Har lets you access the individual files transparently - all the work is done on the client side -- the NN is not involved and hence will not be able to hand out the encrypted keys or key versions. It is possible that Har can be changed to work, but I am merely pointing out that I don't think har will work as-is with the changes proposed in this Jira.
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14042636#comment-14042636 ] Sanjay Radia commented on HDFS-6134: Alejandro - sorry, I should have explained the HAR example better: consider a subtree which has a file called E that is encrypted while the rest of the files are normal. Now the user decides to har the subtree. The file E needs to remain encrypted inside the har; also, when E is accessed from the har, it needs to be transparently decrypted. BTW this might be fixable by changing Har. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14042789#comment-14042789 ] Sanjay Radia commented on HDFS-6134: bq. The NN gives the HDFS client the encrypted DEK \[unique data encryption key of the file\] and the keyVersion ID Alejandro - isn't it sufficient to hand out a keyname rather than the encrypted DEK? Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
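For readers following the DEK discussion, below is a minimal pure-JDK sketch of envelope encryption - an illustration of the general technique only, not the proposal's KMS API. The KMS holds the zone key (KEK); the NN stores and hands out only the DEK wrapped under the KEK; an authorized client asks the KMS to unwrap it. This also shows why handing out the encrypted DEK does not let the NN read the data: the NN never holds the KEK.

{code:java}
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

/** Minimal envelope-encryption sketch (illustration only, not the proposal's KMS API). */
public class EnvelopeDemo {
    public static void main(String[] args) throws Exception {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);

        SecretKey kek = gen.generateKey(); // zone key, held only by the KMS
        SecretKey dek = gen.generateKey(); // per-file data encryption key

        // KMS side: wrap the DEK under the KEK; the NN stores only this blob.
        Cipher wrap = Cipher.getInstance("AESWrap");
        wrap.init(Cipher.WRAP_MODE, kek);
        byte[] encryptedDek = wrap.wrap(dek);

        // Client side: present the blob to the KMS, which unwraps it if authorized.
        Cipher unwrap = Cipher.getInstance("AESWrap");
        unwrap.init(Cipher.UNWRAP_MODE, kek);
        SecretKey recovered = (SecretKey) unwrap.unwrap(encryptedDek, "AES", Cipher.SECRET_KEY);

        System.out.println("DEK recovered: "
                + java.util.Arrays.equals(dek.getEncoded(), recovered.getEncoded()));
    }
}
{code}

Handing out only a key name, as the comment above asks, would instead require the key store to resolve per-file key material itself; the encrypted-DEK design keeps the per-file blob in HDFS metadata.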
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14040873#comment-14040873 ] Sanjay Radia commented on HDFS-6134: Aaron said: bq. distcp... I disagree - this is exactly what one wants .. So you are saying that distcp should decrypt and re-encrypt data as it copies it ... most backup tools do not do this as they copy data - it costs extra CPU resources and adds a further, unneeded vulnerability. There are customer use cases where distcp does not run over an encrypted channel; hence if one of the files being copied is encrypted, one may not want the file to be transparently decrypted before it is sent. Further, a sensitive file in a subtree may have been encrypted because the subtree is readable by a larger group, and hence the distcp user may not have access to the keys. bq. delegation tokens - KMS ... Owen and Tucu have already discussed this quite a bit above Turns out this issue came up in discussion with Owen, and he shares the concern and suggested that I post it. Besides, even if Alejandro and Owen are in agreement, my question is relevant and has not been raised so far above: Encryption is used to overcome limitations of authorization and authentication in the system. It is relevant to ask if the use of delegation tokens to obtain keys adds a weakness. bq. meeting ... Aaron .. you are misunderstanding my point. I am not saying that the discussions on this jira have not been open. * See Alejandro's comments: Todd Lipcon and I had an offline discussion with Andrew Purtell, Yi Liu and Avik Dey and After some offline discussions with Yi, Tianyou, ATM, Todd, Andrew and Charles ... ** there have been such meetings and I have *no objections* to such private meetings because I know that the bandwidth helps. I am merely asking for one more meeting where I can quickly come up to speed on the context that Alejandro, Todd, Yi, Tianyou, Andrew, ATM share. It will help me and others better understand the viewpoint that some of you share due to previous high-bandwidth meetings. ** There is precedent for HDFS meetings in spite of open jira discussion - higher bandwidth helps progress faster. ** Perhaps I should have worded 'private meetings' differently ... sorry if it came across the wrong way. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14041429#comment-14041429 ] Sanjay Radia commented on HDFS-6134: I believe the transparent encryption will break the HAR file system. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14041449#comment-14041449 ] Sanjay Radia commented on HDFS-6134: bq. Vanilla distcp will just work with transparent encryption. Data will be decrypted on read and encrypted on write, assuming both source and target are in encrypted zones. ...The proposal on changing distcp is to enable a second use case. Alejandro, Aaron: the general practice is not to give the admins running distcp access to keys. Hence, as you suggest, we could change distcp so that it does not use transparent decryption by default; however, there may be other such backup tools and applications that customers and other vendors may have written, and we would be breaking them. This may also break the HAR filesystem. Aaron, you took a very strong position that transparent decryption/re-encryption is exactly what one wants. I am missing this - what are the use cases for distcp where one wants transparent decryption/re-encryption? Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14037737#comment-14037737 ] Sanjay Radia commented on HDFS-6134: bq. On the distcp not accessing the keys (not decrypting/encrypting), yes, that is the idea. Alejandro, not sure if I understand what you mean by the above. Are you saying that distcp and other tools/applications that copy and back up data will have to be changed to do something different when the file is encrypted? In a sense, this Jira's attempt to provide transparent encryption is breaking existing transparency. Two other questions: * Are you relying on the Kerberos credentials OR delegation tokens to obtain the keys? Isn't using the delegation token to obtain keys reducing security? * Looks like the proposal relies on file ACLs to hand out keys - part of the motivation for using encryption is that ACLs are often not correctly set. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataAtRestEncryption.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038099#comment-14038099 ] Sanjay Radia commented on HDFS-6134: * distcp and such tools and applications bq. Vanilla distcp will just work with transparent encryption. This is not what one wants - distcp will not necessarily have permission to decrypt. * delegation tokens - KMS will accept delegation tokens - again, I don't think this is what one wants - can the keys be obtained at job submission time? * File ACLs bq. The NN gives the HDFS client the encrypted DEK and the keyVersion ID. I assume the NN will hand this out based on the file ACL. Does the above reduce security? Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataAtRestEncryption.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038106#comment-14038106 ] Sanjay Radia commented on HDFS-6134: There is a complex set of issues to be addressed. I know that a bunch of you have had some private meetings discussing the various options and tradeoffs. Can we please have a short, more public meeting next week? I can organize and host this at Hortonworks, along with Google Plus for those who are remote. How about next Thursday at 1:30pm? Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataAtRestEncryption.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5851) Support memory as a storage medium
[ https://issues.apache.org/jira/browse/HDFS-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-5851: --- Attachment: SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf Slightly updated doc on DDM: * Clarified the separation of mechanism from DDM policy, as per Colin's and my comments on being able to create memory-cached files anywhere in the HDFS namespace. * Explained how DDMs fit with materialized queries (Julian's DIMMQ). * Updated references marked TBD. * Minor improvements to the RDD/Tachyon comparison. Note this text was written prior to Ali's and Li's comments and hence does not address their concern. I am rereading the Tachyon paper, will be meeting Li over the next couple of days, and will update further as needed. Support memory as a storage medium -- Key: HDFS-5851 URL: https://issues.apache.org/jira/browse/HDFS-5851 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf, SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf, SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf Memory can be used as a storage medium for smaller/transient files for fast write throughput. More information/design will be added later. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5851) Support memory as a storage medium
[ https://issues.apache.org/jira/browse/HDFS-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034632#comment-14034632 ] Sanjay Radia commented on HDFS-5851: Colin wrote: bq. why a separate namespace under hdfs://namespace/.reserved/ddm ? We have xattrs now, so files ... I did not explain it well. It is a separation of policy and mechanism. HDFS has to support such files for ANY name. Hence we can use an xattr to mark files as write-cached. The policy of managing the memory space and the underlying swap space (e.g. hdfs://namespace/.reserved/ddm) is separate from the write-cache mechanism that HDFS needs to support in ANY part of its namespace; so I believe we are in agreement here. I will explain the policy I am proposing in a separate comment. Support memory as a storage medium -- Key: HDFS-5851 URL: https://issues.apache.org/jira/browse/HDFS-5851 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf, SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf Memory can be used as a storage medium for smaller/transient files for fast write throughput. More information/design will be added later. -- This message was sent by Atlassian JIRA (v6.2#6252)
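As a concrete illustration of the mechanism/policy split described above: the mechanism can be a per-file xattr that any path may carry, with policy code elsewhere deciding which files get it. A minimal sketch follows; the attribute name and path are invented for illustration, while FileSystem.setXAttr/getXAttr are the stock HDFS xattr calls.

{code:java}
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Hypothetical sketch of the mechanism/policy split: any file, anywhere in
 * the namespace, can be tagged as a DDM write-cache candidate via an xattr
 * (mechanism); a separate policy component decides which files to tag.
 * The attribute name "user.ddm.writeCache" is invented for illustration.
 */
public class DdmTagDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path("/user/alice/tmp/intermediate-0001"); // assumed to exist

        // Mechanism: mark the file as memory-cached on write.
        fs.setXAttr(file, "user.ddm.writeCache",
                "true".getBytes(StandardCharsets.UTF_8));

        // Policy (living elsewhere) can read the tag back when placing replicas.
        byte[] v = fs.getXAttr(file, "user.ddm.writeCache");
        System.out.println("ddm tag = " + new String(v, StandardCharsets.UTF_8));
    }
}
{code}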
[jira] [Commented] (HDFS-5851) Support memory as a storage medium
[ https://issues.apache.org/jira/browse/HDFS-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13984544#comment-13984544 ] Sanjay Radia commented on HDFS-5851: BTW we will host the meeting at Hortonworks for those that are local and want to attend in person: Hortonworks 3460 W. Bayshore Rd Palo Alto CA 94303 Support memory as a storage medium -- Key: HDFS-5851 URL: https://issues.apache.org/jira/browse/HDFS-5851 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf, SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf Memory can be used as a storage medium for smaller/transient files for fast write throughput. More information/design will be added later. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HDFS-5851) Support memory as a storage medium
[ https://issues.apache.org/jira/browse/HDFS-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981608#comment-13981608 ] Sanjay Radia edited comment on HDFS-5851 at 4/28/14 10:23 PM: -- Added comparison to Tachyon in the doc. There is also an implementation difference that I don't cover (Tachyon I believe uses RamFs rather than memory that is mapped to an HDFS file -- but need to verify that). I have reproduced the text from the updated doc here for convenience: Recently, Spark has added an RDD implementation called Tachyon [4]. Tachyon is outside the address space of an application and allows sharing RDDs across applications. Both Tachyon and DDMs use memory mapped files and lazy writing to reduce the need to recompute. Tachyon, since it is an RDD implementation, records the computation in order to regenerate the data in case of loss, whereas DDMs rely on the application to regenerate. Tachyon and RDDs do not have a notion of discardability, which is fundamental to DDMs where data can be discarded when it is under memory and/or backing store pressure. DDMs are closest to virtual memory/anti-caching in that they virtualize memory, with the twist that data can be discarded. was (Author: sanjay.radia): Added comparison to Tachyon in the doc. There is also an implementation difference that I don't cover (Tachyon I believe uses RamFs rather than memory that is mapped to an HDFS file -- but need to verify that). I have reproduced the text from the updated doc here for convenience: Recently, Spark has added an RDD implementation called Tachyon [4]. Tachyon is outside the address space of an application and allows sharing RDDs across applications. Both Tachyon and DDMs use memory mapped files and lazy writing to reduce the need to recompute. Tachyon, since it is an RDD implementation, records the computation in order to regenerate the data in case of loss, whereas DDMs rely on the application to regenerate. Tachyon and RDDs do not have a notion of discardability, which is fundamental to DDMs where data can be discarded when it is under memory and/or backing store pressure. Support memory as a storage medium -- Key: HDFS-5851 URL: https://issues.apache.org/jira/browse/HDFS-5851 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf, SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf Memory can be used as a storage medium for smaller/transient files for fast write throughput. More information/design will be added later. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5851) Support memory as a storage medium
[ https://issues.apache.org/jira/browse/HDFS-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-5851: --- Attachment: SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf Added comparison to Tachyon in the doc. There is also an implementation difference that I don't cover (Tachyon I believe uses RamFs rather than memory that is mapped to an HDFS file -- but need to verify that). Support memory as a storage medium -- Key: HDFS-5851 URL: https://issues.apache.org/jira/browse/HDFS-5851 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf, SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf Memory can be used as a storage medium for smaller/transient files for fast write throughput. More information/design will be added later. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HDFS-5851) Support memory as a storage medium
[ https://issues.apache.org/jira/browse/HDFS-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981608#comment-13981608 ] Sanjay Radia edited comment on HDFS-5851 at 4/25/14 9:10 PM: - Added comparison to Tachyon in the doc. There is also an implementation difference that I don't cover (Tachyon I believe uses RamFs rather than memory that is mapped to an HDFS file -- but need to verify that). I have reproduced the text from the updated doc here for convenience: Recently, Spark has added an RDD implementation called Tachyon [4]. Tachyon is outside the address space of an application and allows sharing RDDs across applications. Both Tachyon and DDMs use memory mapped files and lazy writing to reduce the need to recompute. Tachyon, since it is an RDD implementation, records the computation in order to regenerate the data in case of loss, whereas DDMs rely on the application to regenerate. Tachyon and RDDs do not have a notion of discardability, which is fundamental to DDMs where data can be discarded when it is under memory and/or backing store pressure. was (Author: sanjay.radia): Added comparison to Tachyon in the doc. There is also an implementation difference that I don't cover (Tachyon I believe uses RamFs rather than memory that is mapped to an HDFS file -- but need to verify that). Support memory as a storage medium -- Key: HDFS-5851 URL: https://issues.apache.org/jira/browse/HDFS-5851 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf, SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf Memory can be used as a storage medium for smaller/transient files for fast write throughput. More information/design will be added later. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5851) Support memory as a storage medium
[ https://issues.apache.org/jira/browse/HDFS-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-5851: --- Attachment: SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf Please see the attached document, which identifies some use cases and a proposal for using memory for intermediate data. We introduce the notion of Discardable Distributed Memory (DDM), which exploits the property that the data can be reconstructed. Further, by using HDFS files as a backing store to which DDM data is lazily written, we give the impression of a much larger memory size and also give the system an additional degree of freedom to manage the scarce memory resource. The main implementation mechanism is memory-mapped files that are lazily replicated; this mechanism provides weak persistence, which may have other direct use cases beyond DDMs. Support memory as a storage medium -- Key: HDFS-5851 URL: https://issues.apache.org/jira/browse/HDFS-5851 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf Memory can be used as a storage medium for smaller/transient files for fast write throughput. More information/design will be added later. -- This message was sent by Atlassian JIRA (v6.2#6252)
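The memory-mapped-file mechanism the document leans on can be sketched with stock JDK APIs. The following is a toy single-machine analogue, not the DDM design itself: writes land in mapped pages at memory speed, and flushing to the backing file is deferred (the file name is invented for illustration).

{code:java}
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

/**
 * Toy single-machine analogue of the DDM mechanism (not the actual design):
 * a memory-mapped file gives memory-speed writes, while writing the pages
 * back to the backing file is deferred until the system chooses to persist.
 */
public class LazyMmapDemo {
    public static void main(String[] args) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile("ddm-segment.bin", "rw");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);

            // Writes go to mapped pages at memory speed.
            buf.put("intermediate result".getBytes(StandardCharsets.UTF_8));

            // Lazy persistence: the OS may write pages back whenever it likes;
            // force() is the explicit "persist now" point. A discardable-memory
            // manager could instead drop the pages under memory pressure and
            // rely on the application to regenerate the data.
            buf.force();
        }
    }
}
{code}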
[jira] [Commented] (HDFS-6160) TestSafeMode occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13964250#comment-13964250 ] Sanjay Radia commented on HDFS-6160: +1 TestSafeMode occasionally fails --- Key: HDFS-6160 URL: https://issues.apache.org/jira/browse/HDFS-6160 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Ted Yu Assignee: Arpit Agarwal Attachments: HDFS-6160.01.patch From https://builds.apache.org/job/PreCommit-HDFS-Build/6511//testReport/org.apache.hadoop.hdfs/TestSafeMode/testInitializeReplQueuesEarly/ :
{code}
java.lang.AssertionError: expected:<13> but was:<0>
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.failNotEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:128)
	at org.junit.Assert.assertEquals(Assert.java:472)
	at org.junit.Assert.assertEquals(Assert.java:456)
	at org.apache.hadoop.hdfs.TestSafeMode.testInitializeReplQueuesEarly(TestSafeMode.java:212)
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5477) Block manager as a service
[ https://issues.apache.org/jira/browse/HDFS-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920076#comment-13920076 ] Sanjay Radia commented on HDFS-5477: I read both documents; some key details are missing (perhaps in the patches, but they need to be in the document or jira). * How does the BM know the replication factor of a block? The document tends to suggest that the BM syncs with the NN on its start. Is this a sort of reverse block report from NN to BM, where the NN tells the BM its list of blocks and their replication factors? ** In particular, one would like the BM's state to exist independently of the NN so that if one or more NNs are shut down for long periods of time (such as when unmounting a namespace), blocks and their replicas are still managed. Would this be possible under your design? If not, what would it take to support it? * How would one support file affinity? For example, there are several use cases (esp. for HBase) where one co-locates the block replicas of multiple files together. How would you support such a feature? * You briefly reference fine-grained locking in the NN - does your design require that the NN hold certain fine-grained locks while threads in the NN make a remote call to the BM? Can you please detail these in the context of specific operations in the document and/or jira. * How do you plan to handle files and blocks under construction? There is delicate code that finalizes the sizes of blocks under construction, especially in the face of NN and DN restarts. * Please describe how you would support the Balancer in your new design. * Your design does not change client APIs and hence you must be forwarding block location requests via the NN to the BM. Can you describe how one could direct clients to go directly to the BM for block locations when appropriate? (I believe this can be done in an API-compatible way and also, with protocol changes, in a backward-compatible way.) Block manager as a service -- Key: HDFS-5477 URL: https://issues.apache.org/jira/browse/HDFS-5477 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: Proposal.pdf, Proposal.pdf, Standalone BM.pdf, Standalone BM.pdf, patches.tar.gz The block manager needs to evolve towards having the ability to run as a standalone service to improve NN vertical and horizontal scalability. The goal is reducing the memory footprint of the NN proper to support larger namespaces, and improve overall performance by decoupling the block manager from the namespace and its lock. Ideally, a distinct BM will be transparent to clients and DNs. -- This message was sent by Atlassian JIRA (v6.2#6252)
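To anchor these questions, here is one purely hypothetical shape such a block-manager protocol could take; every name below is invented and nothing here is taken from the attached proposals or patches. The "reverse block report" in the first question would correspond to something like registerNamespace.

{code:java}
import java.util.List;
import java.util.Map;

/**
 * Purely hypothetical sketch of a standalone block-manager protocol, written
 * only to make the review questions concrete; all names are invented.
 */
interface BlockManagerService {
    /**
     * "Reverse block report": on (re)start the NN tells the BM which blocks
     * belong to its namespace and the expected replication of each, so the
     * BM's state can outlive any single NN.
     */
    void registerNamespace(String blockPoolId, Map<Long, Short> expectedReplication);

    /** Clients (or the NN on their behalf) resolve blocks to DataNodes. */
    List<String> getLocations(long blockId);

    /** Pipeline support for blocks under construction. */
    long allocateBlock(String blockPoolId, short replication);
    void commitBlock(long blockId, long finalizedLength);
}
{code}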
[jira] [Commented] (HDFS-4685) Implementation of ACLs in HDFS
[ https://issues.apache.org/jira/browse/HDFS-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13880440#comment-13880440 ] Sanjay Radia commented on HDFS-4685: Comment on the two alternatives for the default ACL proposals in the doc. Reproducing the text for convenience. * *Umask-Default-ACL*: The default ACL of the parent is cloned to the ACL of the child at the time of child creation. For new child directories, the default ACL itself is also cloned, so that the same policy is applied to sub-directories of sub-directories. Subsequent changes to the parent's default ACL will set a different ACL for new children, but will not alter existing children. This matches POSIX behavior. If the administrator wants to change policy on the sub-tree later, then this is performed by inserting a new, more restrictive ACL entry at the appropriate sub-tree root (see UC6) and may also need a recursive ACL modification (analogous to chmod -R), since existing children are not affected by the new ACL. * *Inherited-Default-ACL*: A child that does not have an ACL of its own inherits its ACL from the nearest ancestor that has defined a default ACL. A child node that requires a different ACL can override the default (like the Umask-Default-ACL). Subsequent changes to the ancestor's default ACL will cause all children that do not have an ACL to inherit the new ACL regardless of child creation time (unlike Umask-Default-ACL). This model, like the ABAC ACLs (use case UC8), encourages the user to create fewer ACLs (typically on the roots of specific subtrees), while the POSIX-compliant Umask-Default-ACL is expected to result in a larger number of ACLs in the system. It would also make a memory-efficient implementation trivial. Note that this model is a deviation from POSIX behavior. Consider the following three sub use cases here: 4a) Open up a child for wider access than the default. 4b) Restrict a child to narrower access than the default. 4c) Change the default ACL because you made a mistake originally. Both models support use cases 4a and 4b with equal ease. However, with the Inherited-Default-ACL, it is easy to identify children that have overridden the default ACL - the existence of an ACL means that the user intended to override the default. Also, 4c is a natural fit for Inherited-Default-ACL. For the Umask-Default-ACL, every child has an ACL and hence you have to walk down the subtree and compare the ACL with the default to see if the user had intended to override it. I think the Inherited-Default-ACL is a much better design, but POSIX compliance may triumph, and hence I am willing to go with Umask-Default-ACL. Implementation of ACLs in HDFS -- Key: HDFS-4685 URL: https://issues.apache.org/jira/browse/HDFS-4685 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, security Affects Versions: 1.1.2 Reporter: Sachin Jose Assignee: Chris Nauroth Attachments: HDFS-ACLs-Design-1.pdf, HDFS-ACLs-Design-2.pdf Currently hdfs doesn't support extended file ACLs. In unix, extended ACLs can be achieved using the getfacl and setfacl utilities. Is there anybody working on this feature? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
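For readers following along, the Umask-Default-ACL model can be illustrated with the FileSystem ACL calls that grew out of this jira. This is a sketch only; the group name and paths are invented, and it shows behavior rather than the design doc's exact syntax.

{code:java}
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.AclEntry;
import org.apache.hadoop.fs.permission.AclEntryScope;
import org.apache.hadoop.fs.permission.AclEntryType;
import org.apache.hadoop.fs.permission.FsAction;

/**
 * Sketch of the Umask-Default-ACL model (group name and paths invented).
 * A DEFAULT-scope entry on the parent is cloned onto children at creation
 * time; changing the parent's default later does not touch existing children.
 */
public class DefaultAclDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path dir = new Path("/projects/reports");

        AclEntry defaultSales = new AclEntry.Builder()
                .setScope(AclEntryScope.DEFAULT)     // applies to future children
                .setType(AclEntryType.GROUP)
                .setName("sales")
                .setPermission(FsAction.READ_EXECUTE)
                .build();
        fs.modifyAclEntries(dir, Arrays.asList(defaultSales));

        // A child created now clones the default; children created before the
        // default existed are untouched -- hence the recursive, chmod -R style
        // fixup mentioned above when policy changes on an existing subtree.
        fs.mkdirs(new Path(dir, "q3"));
        System.out.println(fs.getAclStatus(new Path(dir, "q3")));
    }
}
{code}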
[jira] [Updated] (HDFS-5389) A Namenode that keeps only a part of the namespace in memory
[ https://issues.apache.org/jira/browse/HDFS-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-5389: --- Issue Type: Sub-task (was: Improvement) Parent: HDFS-2362 A Namenode that keeps only a part of the namespace in memory Key: HDFS-5389 URL: https://issues.apache.org/jira/browse/HDFS-5389 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 0.23.1 Reporter: Lin Xiao Priority: Minor *Background:* Currently, the NN keeps all of its namespace in memory. This has had the benefit that the NN code is very simple and, more importantly, helps the NN scale to over 4.5K machines with 60K to 100K concurrent tasks. The HDFS namespace can currently be scaled using more RAM on the NN and/or using Federation, which scales both namespace and performance. The current federation implementation does not allow renames across volumes without data copying, but there are proposals to remove that limitation. *Motivation:* Hadoop lets customers store huge amounts of data at very economical prices and hence allows customers to store their data for several years. While most customers perform analytics on recent data (last hour, day, week, month, quarter, year), the ability to have five-year-old data online for analytics is very attractive for many businesses. Although one can use larger RAM in a NN and/or use Federation, it is not really necessary to store the entire namespace in memory since only the recent data is typically heavily accessed. *Proposed Solution:* Store a portion of the NN's namespace in memory - the working set of the applications that are currently operating. LSM data structures are quite appropriate for maintaining the full namespace on disk. One choice is Google's LevelDB open-source implementation. *Benefits:* * Store larger namespaces without resorting to Federated namespace volumes. * Complementary to NN Federated namespace volumes; indeed, it will allow a single NN to easily store multiple larger volumes. * Faster cold startup - the NN does not have to read its full namespace before responding to clients. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5389) A Namenode that keeps only a part of the namespace in memory
[ https://issues.apache.org/jira/browse/HDFS-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861940#comment-13861940 ] Sanjay Radia commented on HDFS-5389: Lin will shortly post a link to her prototype code on GitHub. Her prototype is based on HDFS 0.23. A Namenode that keeps only a part of the namespace in memory Key: HDFS-5389 URL: https://issues.apache.org/jira/browse/HDFS-5389 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 0.23.1 Reporter: Lin Xiao Priority: Minor *Background:* Currently, the NN keeps all of its namespace in memory. This has had the benefit that the NN code is very simple and, more importantly, helps the NN scale to over 4.5K machines with 60K to 100K concurrent tasks. The HDFS namespace can currently be scaled using more RAM on the NN and/or using Federation, which scales both namespace and performance. The current federation implementation does not allow renames across volumes without data copying, but there are proposals to remove that limitation. *Motivation:* Hadoop lets customers store huge amounts of data at very economical prices and hence allows customers to store their data for several years. While most customers perform analytics on recent data (last hour, day, week, month, quarter, year), the ability to have five-year-old data online for analytics is very attractive for many businesses. Although one can use larger RAM in a NN and/or use Federation, it is not really necessary to store the entire namespace in memory since only the recent data is typically heavily accessed. *Proposed Solution:* Store a portion of the NN's namespace in memory - the working set of the applications that are currently operating. LSM data structures are quite appropriate for maintaining the full namespace on disk. One choice is Google's LevelDB open-source implementation. *Benefits:* * Store larger namespaces without resorting to Federated namespace volumes. * Complementary to NN Federated namespace volumes; indeed, it will allow a single NN to easily store multiple larger volumes. * Faster cold startup - the NN does not have to read its full namespace before responding to clients. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5711) Removing memory limitation of the Namenode by persisting Block - Block location mappings to disk.
[ https://issues.apache.org/jira/browse/HDFS-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13860846#comment-13860846 ] Sanjay Radia commented on HDFS-5711: bq. We also intend to use LevelDB to persist metadata, and plan to provide a complete solution, by not just persisting the Namespace information but also the Blocks Map onto secondary storage. The namespace layer and the block layer have been separated reasonably well as part of the federation work. Hence it makes sense to keep the persistence of the namespace and the persistence of the block map as two separate jiras. Since there is already a jira, HDFS-5389, focusing on storing a portion (the working set) of the namespace in memory, this jira should focus on storing the block mapping on disk (as stated in the title). Removing memory limitation of the Namenode by persisting Block - Block location mappings to disk. - Key: HDFS-5711 URL: https://issues.apache.org/jira/browse/HDFS-5711 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Rohan Pasalkar This jira is to track changes to be made to remove the HDFS name-node memory limitation to hold block - block location mappings. It is a known fact that the single Name-node architecture of HDFS has scalability limits. The HDFS federation project alleviates this problem by using horizontal scaling. This helps increase the throughput of metadata operations and also the amount of data that can be stored in a Hadoop cluster. The Name-node stores all the filesystem metadata in memory (even in the federated architecture); the Name-node design can be enhanced by persisting part of the metadata onto secondary storage and retaining the popular or recently accessed metadata information in main memory. This design can benefit an HDFS deployment which doesn't use federation but needs to store a large number of files or a large number of blocks. Lin Xiao from Hortonworks attempted a similar project [1] in the Summer of 2013. They used LevelDB to persist the Namespace information (i.e., file and directory inode information). A patch with this change is yet to be submitted to the code base. We also intend to use LevelDB to persist metadata, and plan to provide a complete solution, by not just persisting the Namespace information but also the Blocks Map onto secondary storage. We did implement a basic prototype which stores the block - block location mapping metadata in a persistent key-value store, i.e., LevelDB. The prototype also maintains an in-memory cache of the recently used block - block location mapping metadata. References: [1] Lin Xiao, Hortonworks, Removing Name-node’s memory limitation, http://www.slideshare.net/ydn/hadoop-meetup-hug-august-2013-removing-the-namenodes-memory-limitation -- This message was sent by Atlassian JIRA (v6.1.5#6160)
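To give a flavor of what persisting the block map to LevelDB might look like, here is a minimal sketch against the pure-Java org.iq80.leveldb port; the key and value encodings are invented for illustration and are not taken from the prototype described above.

{code:java}
import java.io.File;
import java.nio.charset.StandardCharsets;
import org.iq80.leveldb.DB;
import org.iq80.leveldb.Options;
import static org.iq80.leveldb.impl.Iq80DBFactory.factory;

/**
 * Minimal sketch of a LevelDB-backed block map (encodings invented):
 * blockId -> comma-separated DataNode list. LevelDB's LSM structure keeps
 * recent writes in memory and spills to sorted files on disk, which is what
 * makes a working-set-style Namenode plausible.
 */
public class BlockMapStore {
    public static void main(String[] args) throws Exception {
        Options options = new Options();
        options.createIfMissing(true);
        DB db = factory.open(new File("blockmap-db"), options);
        try {
            byte[] key = "blk_1073741825".getBytes(StandardCharsets.UTF_8);
            byte[] locations = "dn1:50010,dn5:50010,dn9:50010"
                    .getBytes(StandardCharsets.UTF_8);

            db.put(key, locations);      // persist the mapping
            byte[] found = db.get(key);  // point lookup on the read path
            System.out.println(new String(found, StandardCharsets.UTF_8));
        } finally {
            db.close();
        }
    }
}
{code}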
[jira] [Updated] (HDFS-5389) A Namenode that keeps only a part of the namespace in memory
[ https://issues.apache.org/jira/browse/HDFS-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-5389: --- Summary: A Namenode that keeps only a part of the namespace in memory (was: Remove INode limitations in Namenode) A Namenode that keeps only a part of the namespace in memory Key: HDFS-5389 URL: https://issues.apache.org/jira/browse/HDFS-5389 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 0.23.1 Reporter: Lin Xiao Priority: Minor Current HDFS Namenode stores all of its metadata in RAM. This has allowed Hadoop clusters to scale to 100K concurrent tasks. However, the memory limits the total number of files that a single NN can store. While Federation allows one to create multiple volumes with additional Namenodes, there is a need to scale a single namespace and also to store multiple namespaces in a single Namenode. When inodes are also stored on persistent storage, the system's boot time can be significantly reduced because there is no need to replay edit logs. It also provides the potential to support extended attributes once the memory size is not the bottleneck. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5389) A Namenode that keeps only a part of the namespace in memory
[ https://issues.apache.org/jira/browse/HDFS-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-5389: --- Description: *Background:* Currently, the NN keeps all of its namespace in memory. This has had the benefit that the NN code is very simple and, more importantly, helps the NN scale to over 4.5K machines with 60K to 100K concurrent tasks. The HDFS namespace can currently be scaled using more RAM on the NN and/or using Federation, which scales both namespace and performance. The current federation implementation does not allow renames across volumes without data copying, but there are proposals to remove that limitation. *Motivation:* Hadoop lets customers store huge amounts of data at very economical prices and hence allows customers to store their data for several years. While most customers perform analytics on recent data (last hour, day, week, month, quarter, year), the ability to have five-year-old data online for analytics is very attractive for many businesses. Although one can use larger RAM in a NN and/or use Federation, it is not really necessary to store the entire namespace in memory since only the recent data is typically heavily accessed. *Proposed Solution:* Store a portion of the NN's namespace in memory - the working set of the applications that are currently operating. LSM data structures are quite appropriate for maintaining the full namespace on disk. One choice is Google's LevelDB open-source implementation. *Benefits:* * Store larger namespaces without resorting to Federated namespace volumes. * Complementary to NN Federated namespace volumes; indeed, it will allow a single NN to easily store multiple larger volumes. * Faster cold startup - the NN does not have to read its full namespace before responding to clients. was: Current HDFS Namenode stores all of its metadata in RAM. This has allowed Hadoop clusters to scale to 100K concurrent tasks. However, the memory limits the total number of files that a single NN can store. While Federation allows one to create multiple volumes with additional Namenodes, there is a need to scale a single namespace and also to store multiple namespaces in a single Namenode. When inodes are also stored on persistent storage, the system's boot time can be significantly reduced because there is no need to replay edit logs. It also provides the potential to support extended attributes once the memory size is not the bottleneck. A Namenode that keeps only a part of the namespace in memory Key: HDFS-5389 URL: https://issues.apache.org/jira/browse/HDFS-5389 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 0.23.1 Reporter: Lin Xiao Priority: Minor *Background:* Currently, the NN keeps all of its namespace in memory. This has had the benefit that the NN code is very simple and, more importantly, helps the NN scale to over 4.5K machines with 60K to 100K concurrent tasks. The HDFS namespace can currently be scaled using more RAM on the NN and/or using Federation, which scales both namespace and performance. The current federation implementation does not allow renames across volumes without data copying, but there are proposals to remove that limitation. *Motivation:* Hadoop lets customers store huge amounts of data at very economical prices and hence allows customers to store their data for several years. While most customers perform analytics on recent data (last hour, day, week, month, quarter, year), the ability to have five-year-old data online for analytics is very attractive for many businesses. Although one can use larger RAM in a NN and/or use Federation, it is not really necessary to store the entire namespace in memory since only the recent data is typically heavily accessed. *Proposed Solution:* Store a portion of the NN's namespace in memory - the working set of the applications that are currently operating. LSM data structures are quite appropriate for maintaining the full namespace on disk. One choice is Google's LevelDB open-source implementation. *Benefits:* * Store larger namespaces without resorting to Federated namespace volumes. * Complementary to NN Federated namespace volumes; indeed, it will allow a single NN to easily store multiple larger volumes. * Faster cold startup - the NN does not have to read its full namespace before responding to clients. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5389) A Namenode that keeps only a part of the namespace in memory
[ https://issues.apache.org/jira/browse/HDFS-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13800015#comment-13800015 ] Sanjay Radia commented on HDFS-5389: Lin built a prototype as an intern at Hortonworks in 2013. Her [slides|http://www.slideshare.net/ydn/hadoop-meetup-hug-august-2013-removing-the-namenodes-memory-limitation] presented at Hadoop User Group (HUG) in August 2013 describes the approach and some early results. A Namenode that keeps only a part of the namespace in memory Key: HDFS-5389 URL: https://issues.apache.org/jira/browse/HDFS-5389 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 0.23.1 Reporter: Lin Xiao Priority: Minor *Background:* Currently, the NN Keeps all its namespace in memory. This has had the benefit that the NN code is very simple and, more importantly, helps the NN scale to over 4.5K machines with 60K to 100K concurrently tasks. HDFS namespace can be scaled currently using more Ram on the NN and/or using Federation which scales both namespace and performance. The current federation implementation does not allow renames across volumes without data copying but there are proposals to remove that limitation. *Motivation:* Hadoop lets customers store huge amounts of data at very economical prices and hence allows customers to store their data for several years. While most customers perform analytics on recent data (last hour, day, week, months, quarter, year), the ability to have five year old data online for analytics is very attractive for many businesses. Although one can use larger RAM in a NN and/or use Federation, it not really necessary to store the entire namespace in memory since only the recent data is typically heavily accessed. *Proposed Solution:* Store a portion of the NN's namespace in memory- the working set of the applications that are currently operating. LSM data structures are quite appropriate for maintaining the full namespace in memory. One choice is Google's LevelDB open-source implementation. *Benefits:* * Store larger namespaces without resorting to Federated namespace volumes. * Complementary to NN Federated namespace volumes, indeed will allow a single NN to easily store multiple larger volumes. * Faster cold startup - the NN does not have read its full namespace before responding to clients. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5324) Make Namespace implementation pluggable in the namenode
[ https://issues.apache.org/jira/browse/HDFS-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13800022#comment-13800022 ] Sanjay Radia commented on HDFS-5324: Lin, a summer intern at Hortonworks, prototyped a NN that stores only the working set in memory (HDFS-5389). She made changes directly to NN code. Make Namespace implementation pluggable in the namenode --- Key: HDFS-5324 URL: https://issues.apache.org/jira/browse/HDFS-5324 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.1.1-beta Environment: All Reporter: Milind Bhandarkar Assignee: Milind Bhandarkar Fix For: 3.0.0 Attachments: AbstractNamesystem.java For the last couple of months, we have been working on making Namespace implementation in the namenode pluggable. We have demonstrated that it can be done without major surgery on the namenode, and does not have noticeable performance impact. We would like to contribute it back to Apache HDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5324) Make Namespace implementation pluggable in the namenode
[ https://issues.apache.org/jira/browse/HDFS-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790652#comment-13790652 ] Sanjay Radia commented on HDFS-5324: I am copying my comment from hdfs-dev: HDFS pluggability (and relation to pluggability added as part of Federation) * Pluggabilty and federation are orthogonal, although we did improved the pluggabily of HDFS as part of federation implementation. The *block layer* was separated out as part of the federation work and hence makes the general development of new of HDFS namespace implementations easier. Federation's pluggablity was targeted towards someone writing a new NN and reusing the block storage layer via a library and *optionally* living side-by-side with different implementations of the NN *within the same cluster*. Hence we added notion of block pools and separated out the block management layer. * So your proposed work is clearly not in conflict with Federation or even with the pluggability that Federation added, but philosophically, your proposal is complementary. Considerations: A Public API? The FileSystem/AbstractFileSystem APIs and the newly proposed AbstractFSNamesystem are targeting very different kinds of plugability into Hadoop. The former takes a thin application API (FileSystem and FileContext) and makes it easy for users to plug in different filesytems (S3, LocalFS, etc) as Hadoop compatible filesystems. In contrast the later (the proposed AbstractFSNamesystem) is a fatter interface inside the depths of HDFS implementation and makes parts of the impl pluggable. I would not make your proposed AbstractFSNamesystem a public stable Hadoop API but instead direct it towards to HDFS developers who want to extend the implementation of HDFS more easily. Were you envisioning the AbstractFSNamesystem to be a stable public Hadoop API? If someone has their own private implementation for this new abstract class, would the HDFS community have the freedom to modify the abstract class in incompatible ways? Make Namespace implementation pluggable in the namenode --- Key: HDFS-5324 URL: https://issues.apache.org/jira/browse/HDFS-5324 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.1.1-beta Environment: All Reporter: Milind Bhandarkar Assignee: Milind Bhandarkar Fix For: 3.0.0 For the last couple of months, we have been working on making Namespace implementation in the namenode pluggable. We have demonstrated that it can be done without major surgery on the namenode, and does not have noticeable performance impact. We would like to contribute it back to Apache HDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4685) Implementation of extended file acl in hdfs
[ https://issues.apache.org/jira/browse/HDFS-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747891#comment-13747891 ] Sanjay Radia commented on HDFS-4685: When we added permissions to HDFS in Hadoop 0.16 we had originally considered having only ACLs for directories and not doing Unix like permissions. Then, due to lack time, we decide to go with unix style permissions. At that time we had proposed the following: * Any directory can have an ACL * An ACL specifies the list of users and groups that can access that directory's subtree. As you resolve paths you take the most stringent permission - i.e. the most restrictive permission. * An ACL also specified who can change the ACL. ie. changes to ACLs can be done by others besides the owner. Implementation of extended file acl in hdfs --- Key: HDFS-4685 URL: https://issues.apache.org/jira/browse/HDFS-4685 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode, security Affects Versions: 1.1.2 Reporter: Sachin Jose Assignee: Chris Nauroth Priority: Minor Currenly hdfs doesn't support Extended file ACL. In unix extended ACL can be achieved using getfacl and setfacl utilities. Is there anybody working on this feature ? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4949) Centralized cache management in HDFS
[ https://issues.apache.org/jira/browse/HDFS-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13712529#comment-13712529 ] Sanjay Radia commented on HDFS-4949: Caching partial blocks: There is no problem with a DN caching only the hot parts of a block and still declaring to the NN that the block is cached in ram. This would fit in with the proposal of abstracting ram copies as replicas. The use case that does not fit in is where DN1 has cached the first 100 bytes and and Datanode, DN2 has cached the last 100 bytes and you want the client to go to the right data node based on what portion of the file it is reading. If and when we finally get to caching portions and we want to support the use case mentioned, we, at that time, could considering the block-info sent for RAM replicas to indicate what portion are cached -- this would mean that certain replicas have additional in the block map. Given that we are not caching portions of block for this Jira and that for tiered storage for SSDs we want to add the device info to block location, I suggest that we proceed with abstracting RAM copies as replicas and later revisit this decision for partial block caching at a later point. Centralized cache management in HDFS Key: HDFS-4949 URL: https://issues.apache.org/jira/browse/HDFS-4949 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Affects Versions: 3.0.0, 2.2.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: caching-design-doc-2013-07-02.pdf HDFS currently has no support for managing or exposing in-memory caches at datanodes. This makes it harder for higher level application frameworks like Hive, Pig, and Impala to effectively use cluster memory, because they cannot explicitly cache important datasets or place their tasks for memory locality. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4949) Centralized cache management in HDFS
[ https://issues.apache.org/jira/browse/HDFS-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13712856#comment-13712856 ] Sanjay Radia commented on HDFS-4949: -
bq. we have to use mmap of a file on disk.
Please look at my comments: I have not objected to mmap and mlock. I am fine with having RAM replicas backed by disk replicas; indeed I see this as an important advantage over ramfs, where the data is copied. The replica abstraction allows for a more general view in which they are not, but our implementation restricts memory replicas to be backed by disk replicas.
bq. In general, tiered storage management happens over a longer period of time than cache management.
The term tiered storage is unfortunate (I misused it in my original comment). In HDFS-2832 we consciously used the term heterogeneous storage, not tiered storage. Tiering, as in moving things based on their hotness, is policy. (BTW I envision using SSDs initially not for moving hot blocks but as storage for *one* of 3 replicas. I have discussed this use case with a few of the HBase folks.) Caching is a use case that applies well to RAM vs. disks. Both use cases apply well to the abstraction of replicas stored on different kinds of storage devices.
bq. Memory is not a storage tier. It doesn't store anything; rather, it caches. Does it make sense to fsck memory?
That is silly. Memory and disks both store data; one is simply far more durable. Fsck is a bad example: you run fsck on a file system, not on the disk. Here we are talking about entities that store HDFS block data. But this debate over the similarities and differences between RAM and disk is a longer one that we should have over beer. I am not blind to the differences between disks and RAM. Further, using the same abstraction to model RAM copies and disk copies does not mean I will always treat them exactly the same and ignore the differences.

> Centralized cache management in HDFS
> ------------------------------------
>
> Key: HDFS-4949
> URL: https://issues.apache.org/jira/browse/HDFS-4949
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: 3.0.0, 2.2.0
> Reporter: Andrew Wang
> Assignee: Andrew Wang
> Attachments: caching-design-doc-2013-07-02.pdf
>
> HDFS currently has no support for managing or exposing in-memory caches at datanodes. This makes it harder for higher-level application frameworks like Hive, Pig, and Impala to effectively use cluster memory, because they cannot explicitly cache important datasets or place their tasks for memory locality.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
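On the mmap-and-mlock point: in Java the mmap half is available through FileChannel.map, while mlock-style pinning requires native code (the HDFS caching work ultimately did this through JNI, omitted here). The block file path below is hypothetical; this is a minimal sketch of a RAM copy backed by a disk replica, not HDFS code.

{code:java}
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MmapBackedReplica {
  public static void main(String[] args) throws IOException {
    // Hypothetical block file; the disk replica remains the durable copy.
    Path blockFile = Paths.get("/data/dn1/current/blk_1073741825");
    try (FileChannel ch = FileChannel.open(blockFile, StandardOpenOption.READ)) {
      MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
      buf.load(); // ask the OS to page the content into physical memory
      // Reads are now served from the in-memory mapping; nothing is
      // duplicated into a separate ramfs file, unlike the copy-based
      // alternative criticized above.
      System.out.println("first byte: " + buf.get(0));
    }
  }
}
{code}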
[jira] [Commented] (HDFS-4949) Centralized cache management in HDFS
[ https://issues.apache.org/jira/browse/HDFS-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13713117#comment-13713117 ] Sanjay Radia commented on HDFS-4949: -
To converge on this, could we do a meetup?

> Centralized cache management in HDFS
> ------------------------------------
>
> Key: HDFS-4949
> URL: https://issues.apache.org/jira/browse/HDFS-4949
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: 3.0.0, 2.2.0
> Reporter: Andrew Wang
> Assignee: Andrew Wang
> Attachments: caching-design-doc-2013-07-02.pdf
>
> HDFS currently has no support for managing or exposing in-memory caches at datanodes. This makes it harder for higher-level application frameworks like Hive, Pig, and Impala to effectively use cluster memory, because they cannot explicitly cache important datasets or place their tasks for memory locality.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5005) Move SnapshotException and SnapshotAccessControlException to o.a.h.hdfs.protocol
[ https://issues.apache.org/jira/browse/HDFS-5005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13711729#comment-13711729 ] Sanjay Radia commented on HDFS-5005: -
+1

> Move SnapshotException and SnapshotAccessControlException to o.a.h.hdfs.protocol
> --------------------------------------------------------------------------------
>
> Key: HDFS-5005
> URL: https://issues.apache.org/jira/browse/HDFS-5005
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.0.0, 2.1.0-beta
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Attachments: HDFS-5005.001.patch
>
> We should move the definition of these two exceptions to the protocol package since they can be directly passed to clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira