[jira] [Commented] (HDFS-7240) Object store in HDFS

2016-05-25 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301324#comment-15301324
 ] 

Bikas Saha commented on HDFS-7240:
--

In case there is a conference call, please send an email to hdfs-dev with the 
proposed meeting details for wider dispersal and participation since that is 
the right forum to organize community activities.

> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, 
> ozone_user_v0.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer, i.e. the datanodes.
> In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (HDFS-8401) Memfs - a layered file system for in-memory storage in HDFS

2015-05-14 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544138#comment-14544138
 ] 

Bikas Saha commented on HDFS-8401:
--

I am guessing that anything that is written to in-memory storage via memfs 
would also be read back from memory via memfs. The centralized cache management 
is not the only in-memory read path, right?

> Memfs - a layered file system for in-memory storage in HDFS
> ---
>
> Key: HDFS-8401
> URL: https://issues.apache.org/jira/browse/HDFS-8401
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>
> We propose creating a layered filesystem that can provide in-memory storage 
> using existing features within HDFS. memfs will use lazy persist writes 
> introduced by HDFS-6581. For reads, memfs can use the Centralized Cache 
> Management feature introduced in HDFS-4949 to load hot data to memory.
> Paths in memfs and hdfs will correspond 1:1 so memfs will require no 
> additional metadata and it can be implemented entirely as a client-side 
> library.
> The advantage of a layered file system is that it requires little or no 
> change to existing applications, e.g. applications can use something like 
> {{memfs://}} instead of {{hdfs://}} for files targeted to memory storage.
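
For illustration, a minimal sketch of what the layered usage might look like 
from a client. The fs.memfs.impl key and the MemFileSystem class name are 
assumptions for this sketch, not shipped code:

{code}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MemfsSketch {
  public static void main(String[] args) throws Exception {
    // Hypothetical: bind the memfs:// scheme to the client-side layered FS.
    Configuration conf = new Configuration();
    conf.set("fs.memfs.impl", "org.apache.hadoop.fs.memfs.MemFileSystem"); // assumed class

    // Same path under two schemes: memfs targets memory storage, hdfs the
    // normal path. Paths correspond 1:1, so no extra metadata is needed.
    FileSystem memfs = FileSystem.get(URI.create("memfs://nn:8020/"), conf);
    FileSystem hdfs = FileSystem.get(URI.create("hdfs://nn:8020/"), conf);
    Path hot = new Path("/data/hot/part-0");
    System.out.println(memfs.exists(hot) + " " + hdfs.exists(hot));
  }
}
{code}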



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client using Zookeeper

2015-02-28 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341849#comment-14341849
 ] 

Bikas Saha commented on HDFS-7858:
--

What are some examples of long-lived clients? How many such clients would there 
be in a large, busy cluster? Will they be setting watches on ZK?

bq. Adding a cached entry to user's home dir to pick last active NN.  If entry 
is not present, the client picks the Standby from the configuration. 
This seems like a reasonable improvement to the current scheme, since it will 
allow a client to connect to the current active directly (even though it may 
be listed later in the NN names list).

Please do keep in mind that ZK is just a notifier in the leader election 
scheme. The real control lies in the FailoverController, which is pluggable. A 
different FailoverController may not use ZK. The status of the master flag may 
not be valid / may be empty while the FailoverController is fencing the old 
master and bringing up the new master.

Getting configuration from ZK is related but probably orthogonal. The entire 
config for HDFS could be downloaded from ZK based on a well-known HDFS service 
name.
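
As a concrete illustration of the client-side lookup being discussed, a 
minimal sketch, assuming a znode layout similar to what the ZK-based elector 
publishes (the znode path and payload parsing here are assumptions):

{code}
import org.apache.zookeeper.ZooKeeper;

public class ActiveNnLookup {
  // Read the active NN identity from ZK once, without setting a watch.
  public static String readActiveNn(String zkQuorum) throws Exception {
    ZooKeeper zk = new ZooKeeper(zkQuorum, 5000, null);
    try {
      // Hypothetical znode written by the elector; real path/payload may differ.
      byte[] data =
          zk.getData("/hadoop-ha/mycluster/ActiveBreadCrumb", false, null);
      return new String(data, "UTF-8"); // parse host:port of the active NN
    } finally {
      zk.close();
    }
  }
}
{code}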


> Improve HA Namenode Failover detection on the client using Zookeeper
> 
>
> Key: HDFS-7858
> URL: https://issues.apache.org/jira/browse/HDFS-7858
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>
> In an HA deployment, clients are configured with the hostnames of both the 
> Active and Standby Namenodes. Clients will first try one of the NNs 
> (non-deterministically), and if it is a Standby NN, it will respond to the 
> client to retry the request on the other Namenode.
> If the client happens to talk to the Standby first, and the Standby is 
> undergoing a GC pause / is busy, then those clients might not get a response 
> soon enough to try the other NN.
> Proposed approach to solve this:
> 1) Since Zookeeper is already used by the failover controller, the clients 
> could talk to ZK and find out which is the active Namenode before contacting 
> it.
> 2) Long-lived DFSClients would have a ZK watch configured which fires when 
> there is a failover, so they do not have to query ZK every time to find out 
> the active NN.
> 3) Clients can also cache the last active NN in the user's home directory 
> (~/.lastNN) so that short-lived clients can try that Namenode first before 
> querying ZK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client using Zookeeper

2015-02-27 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341399#comment-14341399
 ] 

Bikas Saha commented on HDFS-7858:
--

bq. The client will proceed to connect to that NN first (thereby removing 
non-determinism from the current scheme).. and will most probably succeed. It 
will contact ZK only if the connection was unsuccessful..
Yes. It will most probably succeed. But when will it not succeed? When that NN 
has failed over or has crashed, right? Which means that every time a known 
primary NN becomes unavailable there will be a surge of failed connections to 
it (from cached entries that point to it), and then these connections will be 
redirected to ZK. For a proxy of the number of connections, consider MR jobs, 
where every Map task running on every machine has a DFS client to read from 
HDFS and every Reduce task on every machine has a DFS client to write to HDFS. 
MR tasks are typically short-lived clients.
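
To make the trade-off concrete, here is a rough sketch of the cached-entry 
scheme under discussion (the ~/.lastNN file name comes from the proposal 
above; everything else is illustrative):

{code}
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

public class LastNnCache {
  // Try the cached last-active NN first, then the configured NNs; only fall
  // back to ZK if connecting to all of these fails. On mass failover, every
  // cached entry points at the dead NN, hence the surge described above.
  public static List<String> candidateNamenodes(List<String> configured)
      throws Exception {
    List<String> order = new ArrayList<String>();
    File cache = new File(System.getProperty("user.home"), ".lastNN");
    if (cache.exists()) {
      order.add(new String(Files.readAllBytes(cache.toPath()),
          StandardCharsets.UTF_8).trim());
    }
    for (String nn : configured) {
      if (!order.contains(nn)) {
        order.add(nn);
      }
    }
    return order;
  }
}
{code}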

> Improve HA Namenode Failover detection on the client using Zookeeper
> 
>
> Key: HDFS-7858
> URL: https://issues.apache.org/jira/browse/HDFS-7858
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>
> In an HA deployment, clients are configured with the hostnames of both the 
> Active and Standby Namenodes. Clients will first try one of the NNs 
> (non-deterministically), and if it is a Standby NN, it will respond to the 
> client to retry the request on the other Namenode.
> If the client happens to talk to the Standby first, and the Standby is 
> undergoing a GC pause / is busy, then those clients might not get a response 
> soon enough to try the other NN.
> Proposed approach to solve this:
> 1) Since Zookeeper is already used by the failover controller, the clients 
> could talk to ZK and find out which is the active Namenode before contacting 
> it.
> 2) Long-lived DFSClients would have a ZK watch configured which fires when 
> there is a failover, so they do not have to query ZK every time to find out 
> the active NN.
> 3) Clients can also cache the last active NN in the user's home directory 
> (~/.lastNN) so that short-lived clients can try that Namenode first before 
> querying ZK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client using Zookeeper

2015-02-27 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341336#comment-14341336
 ] 

Bikas Saha commented on HDFS-7858:
--

We need to be careful about how many clients can be supported by ZK (whether 
polling for info or setting watches). ZK is typically a service shared with 
YARN/HBase etc.

> Improve HA Namenode Failover detection on the client using Zookeeper
> 
>
> Key: HDFS-7858
> URL: https://issues.apache.org/jira/browse/HDFS-7858
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>
> In an HA deployment, clients are configured with the hostnames of both the 
> Active and Standby Namenodes. Clients will first try one of the NNs 
> (non-deterministically), and if it is a Standby NN, it will respond to the 
> client to retry the request on the other Namenode.
> If the client happens to talk to the Standby first, and the Standby is 
> undergoing a GC pause / is busy, then those clients might not get a response 
> soon enough to try the other NN.
> Proposed approach to solve this:
> 1) Since Zookeeper is already used by the failover controller, the clients 
> could talk to ZK and find out which is the active Namenode before contacting 
> it.
> 2) Long-lived DFSClients would have a ZK watch configured which fires when 
> there is a failover, so they do not have to query ZK every time to find out 
> the active NN.
> 3) Clients can also cache the last active NN in the user's home directory 
> (~/.lastNN) so that short-lived clients can try that Namenode first before 
> querying ZK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-5098) Enhance FileSystem.Statistics to have locality information

2014-11-26 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha resolved HDFS-5098.
--
Resolution: Duplicate

> Enhance FileSystem.Statistics to have locality information
> --
>
> Key: HDFS-5098
> URL: https://issues.apache.org/jira/browse/HDFS-5098
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Bikas Saha
>Assignee: Suresh Srinivas
> Fix For: 2.6.0
>
>
> Currently in MR/Tez we don't have a good and accurate means to detect how 
> much of the IO was actually done locally. Getting this information from the 
> source of truth would be much better.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5098) Enhance FileSystem.Statistics to have locality information

2014-02-25 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911871#comment-13911871
 ] 

Bikas Saha commented on HDFS-5098:
--

Folks, should we close this as a dup of HDFS-4698?

> Enhance FileSystem.Statistics to have locality information
> --
>
> Key: HDFS-5098
> URL: https://issues.apache.org/jira/browse/HDFS-5098
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Bikas Saha
>Assignee: Suresh Srinivas
> Fix For: 2.4.0
>
>
> Currently in MR/Tez we don't have a good and accurate means to detect how 
> much of the IO was actually done locally. Getting this information from the 
> source of truth would be much better.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5152) Avoiding redundant Kerberos login for Zookeeper client in ActiveStandbyElector

2013-09-04 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13758294#comment-13758294
 ] 

Bikas Saha commented on HDFS-5152:
--

Can you please confirm that things work correctly in non-secure mode, where the 
"login" will essentially be null?
{code}
+ZooKeeper zk = new ZooKeeper(zkHostPort, zkSessionTimeout, watcher, login);
{code}
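
For instance, a minimal sketch of the guard being asked about; the 
login-taking constructor above comes from this patch, while the three-argument 
constructor is stock ZooKeeper:

{code}
// Sketch only: in non-secure mode "login" is expected to be null, so the
// elector must tolerate that, e.g. by falling back to the stock constructor.
ZooKeeper zk;
if (login == null) {
  zk = new ZooKeeper(zkHostPort, zkSessionTimeout, watcher);
} else {
  zk = new ZooKeeper(zkHostPort, zkSessionTimeout, watcher, login);
}
{code}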

How will this be tested? Jenkins will probably not like that the patch has no 
tests.

Other than that, the change looks fine to me. The caveat is that I am not a 
security expert :P

This jira needs to be moved to HADOOP (instead of HDFS) since the Elector is in 
hadoop-common.

> Avoiding redundant Kerberos login for Zookeeper client in ActiveStandbyElector
> --
>
> Key: HDFS-5152
> URL: https://issues.apache.org/jira/browse/HDFS-5152
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Reporter: Kai Zheng
> Attachments: HDFS-5152.patch
>
>
> Based on the fix in HADOOP-8315, it's possible to deploy a secured HA cluster 
> with SASL support for the connection with Zookeeper. However, it requires 
> extra JAAS configuration to initialize the Zookeeper client, because the 
> client performs another login even though the ZKFC service has already 
> completed its Kerberos login during startup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-5152) Avoiding redundant Kerberos login for Zookeeper client in ActiveStandbyElector

2013-09-03 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13757032#comment-13757032
 ] 

Bikas Saha commented on HDFS-5152:
--

To be clear, the intent here is to allow ActiveStandbyElector to reuse the 
login context that has already been created when ZKFC authenticates with ZK. Is 
that correct?

Will this change allow ActiveStandbyElector to be used outside of ZKFC (as an 
embedded library) and authenticate using UgiZkLogin?


> Avoiding redundant Kerberos login for Zookeeper client in ActiveStandbyElector
> --
>
> Key: HDFS-5152
> URL: https://issues.apache.org/jira/browse/HDFS-5152
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Reporter: Kai Zheng
> Attachments: HDFS-5152.patch
>
>
> Based on the fix in HADOOP-8315, it's possible to deploy a secured HA cluster 
> with SASL support for the connection with Zookeeper. However, it requires 
> extra JAAS configuration to initialize the Zookeeper client, because the 
> client performs another login even though the ZKFC service has already 
> completed its Kerberos login during startup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-5098) Enhance FileSystem.Statistics to have locality information

2013-08-20 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745373#comment-13745373
 ] 

Bikas Saha commented on HDFS-5098:
--

It's good to see that the information is already available for HDFS. However, 
in MR we work at the FileSystem abstraction layer, and we use 
FileSystem.getAllStatistics() to get the statistics. Hence, we cannot depend on 
calling HdfsDataInputStream. It would be great if that information were passed 
up to the FileSystem.Statistics layer. I'm guessing that's why the Statistics 
object exists.
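
For reference, this is roughly how MR consumes the statistics today; the last 
accessor is the hypothetical addition this jira asks for, not an existing 
method:

{code}
import org.apache.hadoop.fs.FileSystem;

public class LocalityStats {
  public static void dump() {
    // MR aggregates per-scheme counters from the FileSystem layer.
    for (FileSystem.Statistics stats : FileSystem.getAllStatistics()) {
      if ("hdfs".equals(stats.getScheme())) {
        long read = stats.getBytesRead();        // exists today
        long local = stats.getBytesReadLocal();  // hypothetical: what this jira requests
        System.out.println(local + "/" + read + " bytes read locally");
      }
    }
  }
}
{code}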

> Enhance FileSystem.Statistics to have locality information
> --
>
> Key: HDFS-5098
> URL: https://issues.apache.org/jira/browse/HDFS-5098
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Bikas Saha
> Fix For: 2.1.1-beta
>
>
> Currently in MR/Tez we don't have a good and accurate means to detect how 
> much of the IO was actually done locally. Getting this information from the 
> source of truth would be much better.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-08-15 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741557#comment-13741557
 ] 

Bikas Saha commented on HDFS-2832:
--

Sorry, I am still not clear on this. Let me re-phrase. If there are 10 disks on 
a data node, will it be legal to create 2 volumes of type HDD on that Datanode 
with 5 disks each? I am guessing that the volume+type list for a datanode will 
come from config, i.e. the datanode will not figure this out automatically by 
inspecting the hardware on the machine.
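
For example, per-volume typing could be expressed in the datanode config along 
these lines. The [DISK]/[SSD] prefix syntax is an assumption about where the 
design could land, not a committed format:

{code}
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- Hypothetical: tag each configured directory with a storage type, so two
       HDD volumes of 5 disks each would simply be two [DISK] entries. -->
  <value>[DISK]/grid/0/data,[DISK]/grid/1/data,[SSD]/grid/ssd0/data</value>
</property>
{code}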

> Enable support for heterogeneous storages in HDFS
> -
>
> Key: HDFS-2832
> URL: https://issues.apache.org/jira/browse/HDFS-2832
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: 20130813-HeterogeneousStorage.pdf
>
>
> HDFS currently supports a configuration where storages are a list of 
> directories. Typically each of these directories corresponds to a volume with 
> its own file system. All these directories are homogeneous and therefore 
> identified as a single storage at the namenode. I propose changing the 
> current model, where a Datanode *is a* storage, to one where a Datanode *is a 
> collection of* storages.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-08-15 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741504#comment-13741504
 ] 

Bikas Saha commented on HDFS-2832:
--

Nice doc!
If there are multiple hard drives on a data node, will they be represented as a 
single storage volume of type HDD? Will it be legal to have multiple storage 
volumes of the same type on the same data node? Say, 1 volume per HDD.

> Enable support for heterogeneous storages in HDFS
> -
>
> Key: HDFS-2832
> URL: https://issues.apache.org/jira/browse/HDFS-2832
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: 20130813-HeterogeneousStorage.pdf
>
>
> HDFS currently supports a configuration where storages are a list of 
> directories. Typically each of these directories corresponds to a volume with 
> its own file system. All these directories are homogeneous and therefore 
> identified as a single storage at the namenode. I propose changing the 
> current model, where a Datanode *is a* storage, to one where a Datanode *is a 
> collection of* storages.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-5096) Automatically cache new data added to a cached path

2013-08-15 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741192#comment-13741192
 ] 

Bikas Saha commented on HDFS-5096:
--

Will this automatic caching kick in when the file is completely written, or 
while writes are still in progress? For MR etc. it will be most beneficial if 
the file cache has an all-or-none property, that is, complete files are present 
in the cache.
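
For context, a sketch of how a path-level caching request might be issued 
programmatically; the directive API names here follow my reading of the 
HDFS-4949 work and should be treated as assumptions:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo;

public class CacheDirectiveSketch {
  public static void main(String[] args) throws Exception {
    // Ask HDFS to cache a directory; this jira is about HDFS keeping the
    // directive current (whole files only) as new data lands under the path.
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    dfs.addCacheDirective(new CacheDirectiveInfo.Builder()
        .setPath(new Path("/warehouse/hive/dt=2013-08-15"))
        .setPool("hot-data")
        .build());
  }
}
{code}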

> Automatically cache new data added to a cached path
> ---
>
> Key: HDFS-5096
> URL: https://issues.apache.org/jira/browse/HDFS-5096
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Andrew Wang
>
> For some applications, it's convenient to specify a path to cache, and have 
> HDFS automatically cache new data added to the path without sending a new 
> caching request or a manual refresh command.
> One example is new data appended to a cached file. It would be nice to 
> re-cache a block at the new appended length, and cache new blocks added to 
> the file.
> Another example is a cached Hive partition directory, where a user can drop 
> new files directly into the partition. It would be nice if these new files 
> were cached.
> In both cases, this automatic caching would happen after the file is closed, 
> i.e. block replica is finalized.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-5096) Automatically cache new data added to a cached path

2013-08-15 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741186#comment-13741186
 ] 

Bikas Saha commented on HDFS-5096:
--

Would automatic cache eviction be a pre-requisite for automatic cache addition?

> Automatically cache new data added to a cached path
> ---
>
> Key: HDFS-5096
> URL: https://issues.apache.org/jira/browse/HDFS-5096
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Andrew Wang
>
> For some applications, it's convenient to specify a path to cache, and have 
> HDFS automatically cache new data added to the path without sending a new 
> caching request or a manual refresh command.
> One example is new data appended to a cached file. It would be nice to 
> re-cache a block at the new appended length, and cache new blocks added to 
> the file.
> Another example is a cached Hive partition directory, where a user can drop 
> new files directly into the partition. It would be nice if these new files 
> were cached.
> In both cases, this automatic caching would happen after the file is closed, 
> i.e. block replica is finalized.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-5098) Enhance FileSystem.Statistics to have locality information

2013-08-14 Thread Bikas Saha (JIRA)
Bikas Saha created HDFS-5098:


 Summary: Enhance FileSystem.Statistics to have locality information
 Key: HDFS-5098
 URL: https://issues.apache.org/jira/browse/HDFS-5098
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Bikas Saha
 Fix For: 2.1.1-beta


Currently in MR/Tez we don't have a good and accurate means to detect how much 
of the IO was actually done locally. Getting this information from the source 
of truth would be much better.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-07-17 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13711356#comment-13711356
 ] 

Bikas Saha commented on HDFS-2832:
--

Is there an overall design document that one can follow? HDFS-2802 was great in 
this regard with initial docs followed by revisions.

> Enable support for heterogeneous storages in HDFS
> -
>
> Key: HDFS-2832
> URL: https://issues.apache.org/jira/browse/HDFS-2832
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>
> HDFS currently supports a configuration where storages are a list of 
> directories. Typically each of these directories corresponds to a volume with 
> its own file system. All these directories are homogeneous and therefore 
> identified as a single storage at the namenode. I propose changing the 
> current model, where a Datanode *is a* storage, to one where a Datanode *is a 
> collection of* storages.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4942) Add retry cache support in Namenode

2013-07-02 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698674#comment-13698674
 ] 

Bikas Saha commented on HDFS-4942:
--

Some clarifications would be good:
1) User runs a create command using the CLI. Internally, the command is tried 
many times over RPC. Will these retries have the same UUID and result in 
responses coming from the retry cache? I think yes.
2) The client in 1) above tries many times and fails. The user retries the CLI 
command again. This will generate a new set of retries. Will these have the 
same UUID as the retries in 1) above? Will they get responses from the retry 
cache?
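
To make the question concrete, a toy model of the retry cache keying being 
assumed here; the class and key shape are illustrative, not from the design 
doc:

{code}
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;

public class RetryCacheSketch {
  private final Map<String, Object> cache =
      new ConcurrentHashMap<String, Object>();

  // If RPC-level retries reuse the same (clientUuid, callId) key, they get the
  // cached response. A fresh CLI invocation presumably mints a new UUID, so
  // its retries would miss the cache - which is what question 2) is probing.
  public synchronized Object getOrExecute(String clientUuid, int callId,
      Callable<Object> op) throws Exception {
    String key = clientUuid + ":" + callId;
    if (!cache.containsKey(key)) {
      cache.put(key, op.call());
    }
    return cache.get(key);
  }
}
{code}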

> Add retry cache support in Namenode
> ---
>
> Key: HDFS-4942
> URL: https://issues.apache.org/jira/browse/HDFS-4942
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, namenode
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFSRetryCache.pdf
>
>
> In the current HA mechanism with FailoverProxyProvider, and in non-HA setups 
> with RetryProxy, a request is retried from the RPC layer. If the retried 
> request has already been processed at the namenode, the subsequent attempts 
> fail for non-idempotent operations such as create, append, delete, rename 
> etc. This will cause application failures during HA failover, network issues 
> etc.
> This jira proposes adding a retry cache at the namenode to handle these 
> failures. More details in the comments.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4942) Add retry cache support in Namenode

2013-06-30 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696376#comment-13696376
 ] 

Bikas Saha commented on HDFS-4942:
--

Suresh, you mean non-idempotent requests like create etc. in the description 
and in point 1 of your comment?

> Add retry cache support in Namenode
> ---
>
> Key: HDFS-4942
> URL: https://issues.apache.org/jira/browse/HDFS-4942
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, namenode
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>
> In the current HA mechanism with FailoverProxyProvider, and in non-HA setups 
> with RetryProxy, a request is retried from the RPC layer. If the retried 
> request has already been processed at the namenode, the subsequent attempts 
> fail for idempotent operations such as create, append, delete, rename etc. 
> This will cause application failures during HA failover, network issues etc.
> This jira proposes adding a retry cache at the namenode to handle these 
> failures. More details in the comments.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4606) HDFS API to move file replicas to caller's location

2013-03-17 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604740#comment-13604740
 ] 

Bikas Saha commented on HDFS-4606:
--

This makes sense to me wrt the placement policy API, even when all machines 
have SSDs. Until now it wasn't needed because all disks in the datanode were 
assumed to be identical for practical purposes.

> HDFS API to move file replicas to caller's location
> ---
>
> Key: HDFS-4606
> URL: https://issues.apache.org/jira/browse/HDFS-4606
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Sanjay Radia
>Assignee: Sanjay Radia
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4606) HDFS API to move file replicas to caller's location

2013-03-16 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604405#comment-13604405
 ] 

Bikas Saha commented on HDFS-4606:
--

There are several such application hint proposals floating around, which makes 
it clear that there is a need for HDFS to support new APIs wrt block placement. 
The number of proposals also seems to suggest that the HDFS community needs to 
abstract the problem space and figure out the correct way forward (while 
balancing HDFS core principles and application complexity). It would help if we 
try to identify the scenarios/use-cases being desired instead of solutions for 
those scenarios/use-cases.
e.g.
HDFS-2121 - By creating replicas on the fly (when read off-switch), it looks 
like we are trying to solve the problem of hot data locality for data 
processing applications. When today's logs come in, almost every daily job 
wants to read them but gets stuck on 3 replicas. Being able to allow hot data 
to be highly replicated on demand would help latency and locality. HDFS needs 
to understand that such over-replication needs to be tuned down by deleting 
excess replicas as demand falls. Persistence is not necessary here. This 
feature needs to be automatic to be useful.
HDFS-2576 - This seems to solve the use case where the application knows a 
priori that certain files need to be co-located with each other. It's not clear 
whether all replicas of blocks of those files need co-location or not. By 
specifying the locations, the proposal solves the problem of getting a good 
starting point without persisting any co-location state. And thus, for stable 
clusters, it's a good solution.
On the same jira, there is an alternate proposal to have co-location of files 
be a first-class feature that is persisted, so that HDFS can continue to 
co-locate them across machine failures and re-balancing. The application could 
potentially query the co-located machines and use that to assign its own 
failover services. This feature is also useful for data-processing applications 
that want to colocate frequently joined pre-partitioned data to avoid 
unnecessary re-partitioning. I have seen this scenario work at very large scale 
in a different Hadoop-like system. So it's useful.
However, this does not prevent HDFS from co-locating the data of 2 different 
HBase region servers on the same machine. HDFS-4606 addresses that problem by 
letting clients specify that they want to copy the data locally. But as someone 
suggested somewhere else, moving data to code kind of goes against the basic 
grain of Hadoop. So designing such an API needs to be careful about doing the 
right thing and preventing abuse.

IMO, letting applications start managing their own blocks needs to be carefully 
thought out. It might be enticing at first, but we might soon end up with 
issues around who owns the blocks and who takes the actions that HDFS currently 
takes on blocks for snapshotting, replication, fault tolerance and every future 
HDFS feature. Also, does this mean applications might have to develop a host of 
namenode features themselves as they need to fix more issues down the line? Or 
ask HDFS for APIs to control all such HDFS actions? And how does HDFS manage 
all these APIs while doing the right thing for the data in the cluster? 
Mistakes in data management are too critical to make.

> HDFS API to move file replicas to caller's location
> ---
>
> Key: HDFS-4606
> URL: https://issues.apache.org/jira/browse/HDFS-4606
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Sanjay Radia
>Assignee: Sanjay Radia
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4198) the shell script error for Cygwin on windows7

2012-11-16 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498728#comment-13498728
 ] 

Bikas Saha commented on HDFS-4198:
--

If your project allows running on the trunk version of Hadoop and does not 
depend on other Unix things, then you could try out the code in the 
hadoop-trunk-win branch. You can check it out from the hadoop-common git 
repository and build it. That branch can build and run Hadoop without any Unix 
shell scripts or the Cygwin environment.

> the shell script error for Cygwin on windows7
> -
>
> Key: HDFS-4198
> URL: https://issues.apache.org/jira/browse/HDFS-4198
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.0.2-alpha
> Environment: windows7 ,cygwin.
>Reporter: Han Hui Wen 
> Fix For: 2.0.3-alpha
>
>
> run  /usr/local/hadoop-2.0.2-alpha/sbin/start-all.sh or  
> /usr/local/hadoop-2.0.2-alpha/sbin/start-dfs.sh .
> 1. $HADOOP_PREFIX/bin/hdfs getconf -namenodes
> --in hadoop-config
> HADOOP_SLAVE_NAMES:
> --in hadoop-config-
> --in hadoop-condig.sh before cygpath
> HADOOP_PREFIX:/usr/local/hadoop-2.0.2-alpha
> HADOOP_LOG_DIR:/usr/local/hadoop-2.0.2-alpha/logs
> JAVA_LIBRARY_PATH:
> --in hadoop-condig.sh before cygpath--
> cygpath: can't convert empty path
> --in hadoop-condig.sh after cygpath
> HADOOP_PREFIX:/usr/local/hadoop-2.0.2-alpha
> HADOOP_LOG_DIR:/usr/local/hadoop-2.0.2-alpha/logs
> JAVA_LIBRARY_PATH:
> --in hadoop-condig.sh after cygpath--
> JAVA_LIBRARY_PATH:/usr/local/hadoop-2.0.2-alpha/lib/native
> --start to run it  in hdfs
> localhost
> henry@IBM-RR0A746AMG4 ~
> $ $HADOOP_PREFIX/bin/hdfs getconf -namenodes
> --in hadoop-config
> HADOOP_SLAVE_NAMES:
> --in hadoop-config-
> cygpath: can't convert empty path
> JAVA_LIBRARY_PATH:/usr/local/hadoop-2.0.2-alpha/lib/native
> --start to run it  in hdfs
> localhost
> ---> if we add logging in hadoop-config.sh, NAMENODES=$($HADOOP_PREFIX/bin/hdfs 
> getconf -namenodes) in start-dfs.sh returns a strange value like the above.
> The returned value is not reliable, so it could instead be stored in a file 
> or passed through command parameters.
> 2. Some directories need not be translated for Cygwin. Here is some related 
> error output:
> $ ./start-dfs.sh
> HADOOP_LIBEXEC_DIR:/usr/local/hadoop-2.0.2-alpha/sbin/../libexec
> after hdfs-config.sh
> which: no hdfs in (./C:\cygwin\usr\local\hadoop-2.0.2-alpha/bin)
> dirname: missing operand
> Try `dirname --help' for more information.
> which: no hdfs in (./C:\cygwin\usr\local\hadoop-2.0.2-alpha/bin)
> NAMENODES:localhost
> ]tarting namenodes on [localhost
> -will run /usr/local/hadoop-2.0.2-alpha/sbin/slaves.sh
> HADOOP_CONF_DIR:/usr/local/hadoop-2.0.2-alpha/etc/hadoop
> HADOOP_PREFIX:C:\cygwin\usr\local\hadoop-2.0.2-alpha
> NAMENODES:
> para:--script /usr/local/hadoop-2.0.2-alpha/sbin/hdfs start namenode
> -will run 
> /usr/local/hadoop-2.0.2-alpha/sbin/slaves.sh-
> in slaves.sh
> HADOOP_SLAVE_NAMES:localhost
> in slaves.sh-
> SLAVE_NAMES:localhost
> The slave:localhost
> HADOOP_SSH_OPTS:
> : hostname nor servname provided, or not known
> -will run /usr/local/hadoop-2.0.2-alpha/sbin/slaves.sh
> HADOOP_CONF_DIR:/usr/local/hadoop-2.0.2-alpha/etc/hadoop
> HADOOP_PREFIX:C:\cygwin\usr\local\hadoop-2.0.2-alpha
> NAMENODES:
> para:--script /usr/local/hadoop-2.0.2-alpha/sbin/hdfs start datanode
> -will run 
> /usr/local/hadoop-2.0.2-alpha/sbin/slaves.sh-
> in slaves.sh
> HADOOP_SLAVE_NAMES:
> in slaves.sh-
> SLAVE_FILE:/usr/local/hadoop-2.0.2-alpha/etc/hadoop/slaves
> SLAVE_NAMES:localhost
> The slave:localhost
> HADOOP_SSH_OPTS:
> localhost: bash: line 0: cd: C:cygwinusrlocalhadoop-2.0.2-alpha: No such file 
> or directory
> localhost: in hadoop-daemon.sh
> localhost: hadoopScript:/usr/local/hadoop-2.0.2-alpha/sbin/hdfs
> localhost: in hadoop-daemon.sh--
> localhost: datanode running as process 6432. Stop it first.
> ]tarting secondary namenodes [0.0.0.0
> -will run /usr/local/hadoop-2.0.2-alpha/sbin/slaves.sh
> HADOOP_CONF_DIR:/usr/local/hadoop-2.0.2-alpha/etc/hadoop
> HADOOP_PREFIX:C:\cygwin\usr\local\hadoop-2.0.2-alpha
> NAMENODES:
> para:--script /usr/local/hadoop-2.0.2-alpha/sbin/hdfs start secondarynamenode
> -will run 
> /usr/local/hadoop-2.0.2-alpha/sbin/slaves.sh-
> in slaves.sh
> HADOOP_SLAVE_NAMES:0.0.0.0
> in slaves.sh-
> SLAVE_NAMES:0.0.0.0
> The slave:0.0.0.0
> HA

[jira] [Created] (HDFS-3602) Enhancements to HDFS for Windows Server and Windows Azure development and runtime environments

2012-07-05 Thread Bikas Saha (JIRA)
Bikas Saha created HDFS-3602:


 Summary: Enhancements to HDFS for Windows Server and Windows Azure 
development and runtime environments
 Key: HDFS-3602
 URL: https://issues.apache.org/jira/browse/HDFS-3602
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Bikas Saha
Assignee: Bikas Saha


This JIRA tracks the work that needs to be done on trunk to enable Hadoop to 
run on Windows Server and Azure environments. This incorporates porting 
relevant work from the similar effort on branch 1 tracked via HADOOP-8079.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3565) Fix streaming job failures with WindowsResourceCalculatorPlugin

2012-06-25 Thread Bikas Saha (JIRA)
Bikas Saha created HDFS-3565:


 Summary: Fix streaming job failures with 
WindowsResourceCalculatorPlugin
 Key: HDFS-3565
 URL: https://issues.apache.org/jira/browse/HDFS-3565
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha


Some streaming jobs use local mode job runs that do not start task trackers. 
In these cases, the JVM context is not set up, and hence local mode execution 
causes the code to crash.
The fix is to either not use the ResourceCalculatorPlugin in such cases, or to 
have the local job run create dummy JVM contexts. Choosing the first option 
because that's the current implicit behavior on Linux: the 
ProcfsBasedProcessTree (used inside the LinuxResourceCalculatorPlugin) does no 
real work when the process pid is not set up correctly, which is what happens 
in local job mode runs.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3518) Provide API to check HDFS operational state

2012-06-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292594#comment-13292594
 ] 

Bikas Saha commented on HDFS-3518:
--

Seems like the most useful information we can get would be whether the NN is in 
safe mode or not, because the JT can then go into a conservative mode wrt task 
failures etc., since these would very likely be related to NN safe mode. 
HDFS-2413 seems to fit the bill, but it's committed to the 2.0 line.
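
For what it's worth, a sketch of the kind of check the JT could make on the 
2.0 line (class and enum names as I recall them there; treat them as 
assumptions):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.HdfsConstants.SafeModeAction;

public class SafeModeCheck {
  // Query (not change) the NN safe mode state so the JT can turn conservative
  // about task failures while the NN is still in safe mode.
  public static boolean nnInSafeMode(Configuration conf) throws Exception {
    FileSystem fs = FileSystem.get(conf);
    return fs instanceof DistributedFileSystem
        && ((DistributedFileSystem) fs).setSafeMode(SafeModeAction.SAFEMODE_GET);
  }
}
{code}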

> Provide API to check HDFS operational state
> ---
>
> Key: HDFS-3518
> URL: https://issues.apache.org/jira/browse/HDFS-3518
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Reporter: Bikas Saha
>Assignee: Tsz Wo (Nicholas), SZE
>
> This will improve the usability of JobTracker safe mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3518) Provide API to check HDFS operational state

2012-06-08 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292137#comment-13292137
 ] 

Bikas Saha commented on HDFS-3518:
--

@suresh - Yes
@atm - Sorry. I thought my title would be sufficient :) Suresh's comment 
clarifies what I missed.

> Provide API to check HDFS operational state
> ---
>
> Key: HDFS-3518
> URL: https://issues.apache.org/jira/browse/HDFS-3518
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Reporter: Bikas Saha
>Assignee: Tsz Wo (Nicholas), SZE
>
> This will improve the usability of JobTracker safe mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3518) Provide API to check HDFS operational state

2012-06-08 Thread Bikas Saha (JIRA)
Bikas Saha created HDFS-3518:


 Summary: Provide API to check HDFS operational state
 Key: HDFS-3518
 URL: https://issues.apache.org/jira/browse/HDFS-3518
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Bikas Saha


This will improve the usability of JobTracker safe mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3424) TestDatanodeBlockScanner and TestReplication fail intermittently on Windows

2012-05-15 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated HDFS-3424:
-

Attachment: HDFS-3424.branch-1-win.patch

Attaching fix.

> TestDatanodeBlockScanner and TestReplication fail intermittently on Windows
> ---
>
> Key: HDFS-3424
> URL: https://issues.apache.org/jira/browse/HDFS-3424
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: HDFS-3424.branch-1-win.patch
>
>
> The tests change the block length to corrupt the data block. If the block 
> file is opened by the datanode, then the test can concurrently modify it on 
> Linux, but such concurrent modification is not allowed by the default 
> permissions on Windows. Since this is more of a test issue, the fix would be 
> to have the tests make sure that the block is not open concurrently.
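
A sketch of the test-side corruption under that constraint, truncating the 
finalized block file only once nothing else holds it open (paths and lengths 
are illustrative):

{code}
import java.io.File;
import java.io.RandomAccessFile;

public class BlockCorruptor {
  // Corrupt a finalized block by changing its length. On Windows this only
  // works if the datanode does not have the file open concurrently, so tests
  // must ensure the block is finalized and closed first.
  public static void corruptBlockLength(File blockFile, long newLen)
      throws Exception {
    RandomAccessFile raf = new RandomAccessFile(blockFile, "rw");
    try {
      raf.setLength(newLen); // shrink (or grow) the block file
    } finally {
      raf.close();
    }
  }
}
{code}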

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Moved] (HDFS-3424) TestDatanodeBlockScanner and TestReplication fail intermittently on Windows

2012-05-15 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha moved MAPREDUCE-4259 to HDFS-3424:
-

Affects Version/s: (was: 1.0.0)
   1.0.0
  Key: HDFS-3424  (was: MAPREDUCE-4259)
  Project: Hadoop HDFS  (was: Hadoop Map/Reduce)

> TestDatanodeBlockScanner and TestReplication fail intermittently on Windows
> ---
>
> Key: HDFS-3424
> URL: https://issues.apache.org/jira/browse/HDFS-3424
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>
> The tests change the block length to corrupt the data block. If the block 
> file is opened by the datanode, then the test can concurrently modify it on 
> Linux, but such concurrent modification is not allowed by the default 
> permissions on Windows. Since this is more of a test issue, the fix would be 
> to have the tests make sure that the block is not open concurrently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3092) Enable journal protocol based editlog streaming for standby namenode

2012-04-23 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259763#comment-13259763
 ] 

Bikas Saha commented on HDFS-3092:
--

From what I understand, the approach is to dedicate a disk per journal daemon. 
That would be easy when running JDs on NN machines. For the 3rd JD one could 
use a disk on the JobTracker/ResourceManager machine.

> Enable journal protocol based editlog streaming for standby namenode
> 
>
> Key: HDFS-3092
> URL: https://issues.apache.org/jira/browse/HDFS-3092
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, name-node
>Affects Versions: 0.24.0, 0.23.3
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: ComparisonofApproachesforHAJournals.pdf, 
> MultipleSharedJournals.pdf, MultipleSharedJournals.pdf, 
> MultipleSharedJournals.pdf
>
>
> Currently the standby namenode relies on reading shared editlogs to stay 
> current with the active namenode for namespace changes. The BackupNode used 
> streaming edits from the active namenode for the same purpose. This jira is 
> to explore using journal protocol based editlog streams for the standby 
> namenode. A daemon in the standby will get the editlogs from the active and 
> write them to local edits. To begin with, the existing standby mechanism of 
> reading from a file will continue to be used, but reading from the local 
> edits instead of from the shared edits.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3092) Enable journal protocol based editlog streaming for standby namenode

2012-04-20 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258633#comment-13258633
 ] 

Bikas Saha commented on HDFS-3092:
--

@Todd
The definition in the doc for ParallelWritesWithBarrier is deliberately 
shallow. The point was just to differentiate between waiting and not waiting. 
The doc does not go into the specifics of algorithms, so your feedback for 
different issues should be directed at the proposal you are commenting on. On 
future improvements - again, the doc is meant to be a comparison of the 
proposals as we saw them in the design docs submitted to the jiras and the 
BookKeeper online references.
Basically, going by the existing documentation of the proposals, the doc tries 
to outline the high-level salient points to consider.

@Flavio
Thanks for the roadmap pointer.

> Enable journal protocol based editlog streaming for standby namenode
> 
>
> Key: HDFS-3092
> URL: https://issues.apache.org/jira/browse/HDFS-3092
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, name-node
>Affects Versions: 0.24.0, 0.23.3
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: ComparisonofApproachesforHAJournals.pdf, 
> MultipleSharedJournals.pdf, MultipleSharedJournals.pdf, 
> MultipleSharedJournals.pdf
>
>
> Currently the standby namenode relies on reading shared editlogs to stay 
> current with the active namenode for namespace changes. The BackupNode used 
> streaming edits from the active namenode for the same purpose. This jira is 
> to explore using journal protocol based editlog streams for the standby 
> namenode. A daemon in the standby will get the editlogs from the active and 
> write them to local edits. To begin with, the existing standby mechanism of 
> reading from a file will continue to be used, but reading from the local 
> edits instead of from the shared edits.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira