[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480657#comment-13480657 ]

Suresh Srinivas commented on HDFS-2802:
---

Not sure if you read the discussion in the section on snapshots of files being written. I will add more details to the design. My earlier comment was related to strict consistency requirements.

Support for RW/RO snapshots in HDFS
---

Key: HDFS-2802
URL: https://issues.apache.org/jira/browse/HDFS-2802
Project: Hadoop HDFS
Issue Type: New Feature
Components: data-node, name-node
Reporter: Hari Mankude
Assignee: Hari Mankude
Attachments: snap.patch, snapshot-one-pager.pdf, Snapshots20121018.pdf

Snapshots are point-in-time images of parts of the filesystem or the entire filesystem. Snapshots can be a read-only or a read-write point-in-time copy of the filesystem. There are several use cases for snapshots in HDFS. I will post a detailed write-up soon with more information.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4022) Replication not happening for appended block
[ https://issues.apache.org/jira/browse/HDFS-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480661#comment-13480661 ]

Uma Maheswara Rao G commented on HDFS-4022:
---

Thanks Vinay for the patch. Thanks a lot, Nicholas, for your reviews. I will commit it some time today.

Replication not happening for appended block
---

Key: HDFS-4022
URL: https://issues.apache.org/jira/browse/HDFS-4022
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: suja s
Assignee: Vinay
Priority: Blocker
Attachments: HDFS-4022.patch, HDFS-4022.patch, HDFS-4022.patch, HDFS-4022.patch

A block was written and finalized. Append was called later, and the block's generation timestamp (genTS) changed.

DN side log, logged continuously:
Can't send invalid block BP-407900822-192.xx.xx.xx-1348830837061:blk_-9185630731157263852_108738

NN side log, also logged continuously:
INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Error report from DatanodeRegistration(192.xx.xx.xx, storageID=DS-2040532042-192.xx.xx.xx-50010-1348830863443, infoPort=50075, ipcPort=50020, storageInfo=lv=-40;cid=123456;nsid=116596173;c=0): Can't send invalid block BP-407900822-192.xx.xx.xx-1348830837061:blk_-9185630731157263852_108738

The block checked for transfer is the one with the old genTS, whereas the new block with the updated genTS exists in the data dir.
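The failure mode is easier to see as a generation-stamp check sketched in code. This is an illustrative stand-in, not the actual DataNode/NameNode classes, and the bumped genTS value is hypothetical:

```java
// Illustrative stand-in for a block identity: id plus generation stamp (genTS).
class Blk {
    final long id;
    final long genStamp;
    Blk(long id, long genStamp) { this.id = id; this.genStamp = genStamp; }
}

public class StaleGenStampDemo {
    // A block is valid for transfer only if the requested genTS matches the
    // genTS of the replica actually stored on disk.
    static boolean isValidForTransfer(Blk requested, Blk stored) {
        return requested.id == stored.id && requested.genStamp == stored.genStamp;
    }

    public static void main(String[] args) {
        // After append, the stored replica's genTS was bumped (108739 is a
        // hypothetical bumped value; the log above shows the old genTS 108738).
        Blk stored = new Blk(-9185630731157263852L, 108739L);
        // The replication request still names the pre-append genTS...
        Blk requested = new Blk(-9185630731157263852L, 108738L);
        // ...so the DN keeps rejecting it: "Can't send invalid block".
        System.out.println(isValidForTransfer(requested, stored)); // false
    }
}
```

As long as the replication work queue keeps naming the stale genTS, the same rejection repeats, which matches the continuously logged error.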
[jira] [Commented] (HDFS-4088) Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor
[ https://issues.apache.org/jira/browse/HDFS-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480688#comment-13480688 ]

Hudson commented on HDFS-4088:
---

Integrated in Hadoop-Yarn-trunk #9 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/9/])
HDFS-4088. Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor. (Revision 1400345)

Result = FAILURE
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1400345
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectoryWithQuota.java

Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor
---

Key: HDFS-4088
URL: https://issues.apache.org/jira/browse/HDFS-4088
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Minor
Fix For: 2.0.3-alpha
Attachments: h4088_20121019.patch

The constructor body does not throw QuotaExceededException. We should remove it from the declaration.
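The change is mechanical; a minimal sketch with a simplified stand-in class (not the real INodeDirectoryWithQuota source) shows why a spurious throws clause is worth removing:

```java
// Checked exception mirroring the one in the issue title.
class QuotaExceededException extends Exception {}

// Simplified stand-in for INodeDirectoryWithQuota. The constructor used to
// declare "throws QuotaExceededException" even though the body only stores a
// value and can never raise it, forcing callers into pointless try/catch.
class DirWithQuota {
    private final long nsQuota;

    // The fix: drop the throws clause from the declaration.
    DirWithQuota(long nsQuota) {
        this.nsQuota = nsQuota;
    }

    long getNsQuota() { return nsQuota; }
}

public class ThrowsCleanupDemo {
    public static void main(String[] args) {
        // With the spurious throws clause gone, construction needs no handler.
        DirWithQuota d = new DirWithQuota(100L);
        System.out.println(d.getNsQuota()); // 100
    }
}
```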
[jira] [Commented] (HDFS-3483) Better error message when hdfs fsck is run against a ViewFS config
[ https://issues.apache.org/jira/browse/HDFS-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480715#comment-13480715 ]

Hudson commented on HDFS-3483:
---

Integrated in Hadoop-Hdfs-0.23-Build #410 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/410/])
svn merge -c 1394864 FIXES: HDFS-3483. Better error message when hdfs fsck is run against a ViewFS config. Contributed by Stephen Fritz. (Revision 1400218)

Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1400218
Files :
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java

Better error message when hdfs fsck is run against a ViewFS config
---

Key: HDFS-3483
URL: https://issues.apache.org/jira/browse/HDFS-3483
Project: Hadoop HDFS
Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Stephen Chu
Assignee: Stephen Fritz
Labels: newbie
Fix For: 2.0.3-alpha, 0.23.5
Attachments: core-site.xml, HDFS-3483.patch, hdfs-site.xml

I'm running an HA + secure + federated cluster. When I run "hdfs fsck /nameservices/ha-nn-uri/", I see the following:

bash-3.2$ hdfs fsck /nameservices/ha-nn-uri/
FileSystem is viewfs://oracle/
DFSck exiting.

Any path I enter returns the same message. Attached are my core-site.xml and hdfs-site.xml.
[jira] [Commented] (HDFS-3873) Hftp assumes security is disabled if token fetch fails
[ https://issues.apache.org/jira/browse/HDFS-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480718#comment-13480718 ]

Hudson commented on HDFS-3873:
---

Integrated in Hadoop-Hdfs-0.23-Build #410 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/410/])
svn merge -c 1393777 FIXES: HDFS-3996. Add debug log removed in HDFS-3873 back. Contributed by Eli Collins (Revision 1400216)

Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1400216
Files :
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java

Hftp assumes security is disabled if token fetch fails
---

Key: HDFS-3873
URL: https://issues.apache.org/jira/browse/HDFS-3873
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs client
Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Fix For: 0.23.3, 2.0.2-alpha
Attachments: HDFS-3873.branch-23.patch, HDFS-3873.patch

Hftp ignores all exceptions generated while trying to get a token, on the assumption that a failure means security is disabled. Debugging problems is excruciatingly difficult when security is enabled but something goes wrong: job submissions succeed, but tasks fail because the NN rejects the user as unauthenticated.
[jira] [Commented] (HDFS-3996) Add debug log removed in HDFS-3873 back
[ https://issues.apache.org/jira/browse/HDFS-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480719#comment-13480719 ]

Hudson commented on HDFS-3996:
---

Integrated in Hadoop-Hdfs-0.23-Build #410 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/410/])
svn merge -c 1393777 FIXES: HDFS-3996. Add debug log removed in HDFS-3873 back. Contributed by Eli Collins (Revision 1400216)

Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1400216
Files :
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java

Add debug log removed in HDFS-3873 back
---

Key: HDFS-3996
URL: https://issues.apache.org/jira/browse/HDFS-3996
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.0.2-alpha
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Minor
Fix For: 2.0.3-alpha, 0.23.5
Attachments: hdfs-3996.txt

Per HDFS-3873, let's add the debug log back.
[jira] [Commented] (HDFS-4088) Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor
[ https://issues.apache.org/jira/browse/HDFS-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480723#comment-13480723 ]

Hudson commented on HDFS-4088:
---

Integrated in Hadoop-Hdfs-trunk #1201 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1201/])
HDFS-4088. Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor. (Revision 1400345)

Result = SUCCESS
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1400345
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectoryWithQuota.java
[jira] [Commented] (HDFS-4088) Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor
[ https://issues.apache.org/jira/browse/HDFS-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480734#comment-13480734 ]

Hudson commented on HDFS-4088:
---

Integrated in Hadoop-Mapreduce-trunk #1231 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1231/])
HDFS-4088. Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor. (Revision 1400345)

Result = SUCCESS
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1400345
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectoryWithQuota.java
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480745#comment-13480745 ]

Hari Mankude commented on HDFS-2802:
---

Todd, another option is to look at the inodesUnderConstruction in the NN and query the DNs for the exact file size at the time of taking the snapshot. Even with this, the file size obtained is valid only at that instant. Applications like HBase will have to deal with hlogs that could have incomplete log entries when an uncoordinated snapshot is taken at the HDFS level. A better approach is to have the application reach a quiesce point and then take a snap. This is normally done for Oracle (hot backup mode) and SQL Server so that an application-consistent snapshot can be taken. Also, createSnap()/removeSnap() holds the writeLock() on the FSNamesystem, which ensures that there are no other metadata updates while the snap is being taken.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480757#comment-13480757 ]

Todd Lipcon commented on HDFS-2802:
---

Hi Suresh. Yes, I read the design there. In fact I think the design is based on my comment on HDFS-3960 from a few weeks ago. But after further thinking, I think that design is too weak. Here's why:

If the first DN in the pipeline had to RPC on every hflush, that would be way too many RPCs. HBase, for example, flushes several hundred times per second per server, so a 1000-node HBase cluster under heavy load would quickly take down a NameNode. So instead of the DN immediately RPCing on every hflush, it has to wait until the next heartbeat and report lengths with the heartbeat. Given this, it may be 5-10 seconds between the hflush and the report of the length to the NameNode. This means that the snapshot will get a length which is either 5-10 seconds too old or 5-10 seconds too new (depending on whether it uses the last reported length or waits until the next heartbeat to finalize the snapshot). A 5-10 second inconsistency window is plenty to break the situation described above: it's quite likely to get data-layer modifications 1 and 3 without getting namespace modifications 2 and 4, or vice versa.

On the other hand, the design I proposed above _does_ handle this, because the DN isn't reporting the length at the time of the heartbeat. Instead it's reporting a length which is causally consistent with the namespace from the perspective of the writer of that file.

bq. Todd, another option is to look at the inodesUnderConstruction in the NN and query the DNs for the exact filesize at the time of taking snapshot

We can't query the DNs while holding the NN lock. It could take several seconds or longer to contact all the DNs in a loaded 1000+ node cluster, and potentially tens of seconds if one of the nodes is actually down. So you'd have to drop the lock, at which point we're back to the above issue with consistency against concurrent NS modifications.

bq. A better approach is to have the application reach a quiesce point and then take a snap. This is normally done for Oracle (hot backup mode) and SQL Server so that an application-consistent snapshot can be taken.

The difference is that quiescing a single-node or small-cluster database like SQL Server or RAC is relatively easy. On the other hand, quiescing a 1000-node HBase cluster would take a while, and I don't think users will really tolerate a global stop-the-world to make a snapshot. This is especially true for use cases like DR/backup, where you expect to take snapshots as often as once every few minutes.
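The scale argument above can be made concrete with back-of-the-envelope arithmetic. The per-server flush rate ("several hundred") and the 3-second heartbeat interval are illustrative assumptions, not measured values:

```java
public class SnapshotRpcLoad {
    public static void main(String[] args) {
        int nodes = 1000;               // HBase cluster size from the comment
        int hflushPerSecPerNode = 300;  // "several hundred" flushes/sec/server (assumed)

        // Design A: every hflush triggers an immediate NameNode RPC.
        long perHflushRpcPerSec = (long) nodes * hflushPerSecPerNode;

        // Design B: each DN batches lengths into its periodic heartbeat
        // (3 seconds is the default DN heartbeat interval).
        double heartbeatIntervalSec = 3.0;
        double batchedRpcPerSec = nodes / heartbeatIntervalSec;

        System.out.println("per-hflush design: " + perHflushRpcPerSec + " RPC/s"); // 300000
        System.out.println("heartbeat design:  " + batchedRpcPerSec + " RPC/s");   // ~333
    }
}
```

Batching drops the NameNode load by roughly three orders of magnitude, which is exactly why the design trades per-flush reporting for the 5-10 second staleness window discussed above.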
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480772#comment-13480772 ]

Suresh Srinivas commented on HDFS-2802:
---

bq. In fact I think the design is based on my comment on HDFS-3960

Actually, we have been mulling over some of these ideas for a long time. HDFS-3960 was just started to get the discussion going. The design we are proposing is to let the DNs send the length; the last known length is what goes into the snapshot, instead of either recording zero length for a block under construction or having to initiate communication with the datanodes / implicitly getting it from the DN. From what I have heard from some HBase folks, a 5-10 second lag should be workable for them. That is why I want to talk to a few HBase folks in the design review.
[jira] [Comment Edited] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480772#comment-13480772 ]

Suresh Srinivas edited comment on HDFS-2802 at 10/20/12 4:34 PM:
---

bq. In fact I think the design is based on my comment on HDFS-3960

Actually, that is not true. We have been mulling over many of these ideas for a long time. HDFS-3960 was just created to get the discussion going. The design we are proposing is to let the DNs send the length; the last known length is what goes into the snapshot, instead of either recording zero length for a block under construction or having to initiate communication with the datanodes / implicitly getting it from the DN. From what I have heard from some HBase folks, a 5-10 second lag should be workable for them. That is why I want to talk to a few HBase folks in the design review.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480784#comment-13480784 ]

Hari Mankude commented on HDFS-2802:
---

Todd, I do not agree that your solution will be any more beneficial to HBase than what is being proposed. Any type of txid information in the DNs will be from the beginning of the transaction. If the client is writing in the middle of a block, there is no way to know the exact size when the snap was taken. Querying inodesUnderConstruction will give the block length at the time of the query. It is not possible to take an application-consistent snapshot (one which does not require recovery) without coordination with the application. In fact, communication with the DNs when snapshots are being taken will make the process of taking snapshots very slow while giving very little additional benefit.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480785#comment-13480785 ]

Hari Mankude commented on HDFS-2802:
---

Sorry, hit the comment early. Additionally, including the sizes of non-finalized blocks in snapshots has the implication that if the client dies and the non-finalized section is discarded, the snapshot might have pointers to non-existent blocks.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480803#comment-13480803 ]

Todd Lipcon commented on HDFS-2802:
---

bq. The design we are proposing is to let DNs send the length. The length known is what goes into the snapshot instead of recording either zero length for block under construction or having to initiate communication with datanodes/implicitly getting it from DN. From what I have heard from some HBase folks 5-10 seconds lagging should be workable for them. That is why I want to talk to a few HBase folks in the design review.

I hope I qualify as an HBase folk? 5-10 seconds lagging on the *data* is probably fine. But inconsistency between data-layer and namespace modifications is a lot tougher. Consider for example an application which uses a write-ahead log on HDFS to make a group of namespace modifications consistent. See HBASE-2231 for an example of a place where we currently have a dataloss bug for which the proposed fix is exactly this:
1. Write new files (compaction result)
2. Write to WAL that compaction is finished
3. Delete old files (compaction sources)

On recovery, if we see the "compaction finished" entry in the WAL, then we roll forward the transaction and delete the sources. But if the snapshot doesn't preserve the ordering of the above operations, we risk seeing "compaction finished" when the namespace doesn't have the new files, which would result in an accidental deletion of a bunch of data.

So I think we need a way to provide barriers between namespace and data-layer modifications. The proposal I made above should achieve this. Another option is something that we've called "super flush". This would be a flag on hflush() or hsync() indicating that the new length of the file needs to be persisted to the NameNode, not just the datanodes. It would be used by applications like HBase to determine consistency points for file lengths.

bq. In fact, communication with DNs when snapshots are being taken will make the process of taking snapshots very slow while giving very little additional benefit.

We should distinguish between two types of slowness for snapshots:
1) Slowness while holding a lock. This is unacceptable IMO - we must hold the lock for a bounded amount of time and never make an RPC while holding the lock.
2) Slowness before a snapshot is available for restore. This is acceptable.

For example, if the user operation "create snapshot" holds the lock for 10ms, but the snapshot is initially in a COLLECTING_LENGTHS state while it waits for block lengths, that seems acceptable. So long as the lengths are filled in by the next heartbeat (or two heartbeats from now), it should be complete (and thus ready for recovery) within the minute. Note that we don't need to wait for a heartbeat from every datanode. Instead, we just need to wait until, for each under-construction block in the snapshotted area, _one_ of its replicas has reported. When snapshotting a subtree without any open files, it would still be instant.

bq. Additionally, including the sizes of non-finalized blocks in snapshots has implication that if the client dies and the non-finalized section is discarded, then snapshot might have pointers to non-existent blocks.

I don't understand what you mean here... can you be more specific about the scenario?
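The ordering hazard in the compaction steps can be simulated with plain sets; the "snapshot" here is just a copied set of file names, a stand-in rather than an HDFS API:

```java
import java.util.HashSet;
import java.util.Set;

public class CompactionOrderingDemo {
    public static void main(String[] args) {
        // The snapshot's view of the namespace as a set of file names.
        // It captured the state before step 1 completed, so only the
        // compaction sources exist.
        Set<String> snapshotNamespace = new HashSet<>();
        snapshotNamespace.add("old-1"); // compaction source
        snapshotNamespace.add("old-2"); // compaction source

        // The inconsistent interleaving: step 2's WAL entry WAS captured,
        // but step 1's output file ("compacted") was NOT.
        boolean walSaysCompactionFinished = true;

        // HBase-style recovery against the snapshot: the WAL says the
        // compaction finished, so roll forward and delete the sources (step 3).
        if (walSaysCompactionFinished) {
            snapshotNamespace.remove("old-1");
            snapshotNamespace.remove("old-2");
        }

        // Sources deleted, compaction output never existed: data loss.
        System.out.println(snapshotNamespace.contains("compacted")); // false
        System.out.println(snapshotNamespace.isEmpty());             // true
    }
}
```

A snapshot that preserved the writer-observed ordering (or a "super flush" barrier) would guarantee that whenever the WAL entry is visible, the step-1 file is too, and the rolled-forward recovery would be safe.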
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480804#comment-13480804 ]

Suresh Srinivas commented on HDFS-2802:
---

bq. I hope I qualify as an HBase folk?

Sure. But I want to seek others' feedback before even considering adding any more complexity to the design.
[jira] [Commented] (HDFS-4079) Add SnapshotManager
[ https://issues.apache.org/jira/browse/HDFS-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480805#comment-13480805 ]

Suresh Srinivas commented on HDFS-4079:
---

Nicholas, I have two comments:
# SnapshotManager could be an interface. That way, if we need to, we can make it pluggable in the future. This we could do in a separate jira.
# For existing classes such as INode, where we are changing the access/visibility of methods, we should make the changes in trunk first. I am okay with making them in trunk and then merging into this branch later.

+1 for the patch.

Add SnapshotManager
---

Key: HDFS-4079
URL: https://issues.apache.org/jira/browse/HDFS-4079
Project: Hadoop HDFS
Issue Type: Sub-task
Components: name-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Attachments: h4079_20121019.patch

SnapshotManager maintains a list of all the snapshottable directories in the namespace. It also supports snapshot-related methods such as setting a directory to snapshottable, creating a snapshot, etc.
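A hypothetical sketch of what the suggested interface seam might look like; the method names and the in-memory implementation are illustrative, not the patch's actual API:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical interface: the operations a pluggable snapshot manager exposes.
interface SnapshotManager {
    void setSnapshottable(String path);
    void createSnapshot(String path, String snapshotName);
    int getNumSnapshottableDirs();
}

// A trivial in-memory implementation, just to show the pluggability seam.
class SimpleSnapshotManager implements SnapshotManager {
    private final Set<String> snapshottable = new HashSet<>();
    private final List<String> snapshots = new ArrayList<>();

    @Override
    public void setSnapshottable(String path) {
        snapshottable.add(path);
    }

    @Override
    public void createSnapshot(String path, String snapshotName) {
        if (!snapshottable.contains(path)) {
            throw new IllegalArgumentException(path + " is not snapshottable");
        }
        // Snapshot names are unique per snapshot root, per the design.
        snapshots.add(path + "@" + snapshotName);
    }

    @Override
    public int getNumSnapshottableDirs() {
        return snapshottable.size();
    }
}

public class SnapshotManagerDemo {
    public static void main(String[] args) {
        SnapshotManager m = new SimpleSnapshotManager();
        m.setSnapshottable("/user/hbase");
        m.createSnapshot("/user/hbase", "s1");
        System.out.println(m.getNumSnapshottableDirs()); // 1
    }
}
```

Coding FSNamesystem against the interface rather than the concrete class is what would let an alternate implementation be swapped in later via a separate jira.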
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480808#comment-13480808 ]

Suresh Srinivas commented on HDFS-2802:
---

Should we consider moving the consistency part of the discussion to HDFS-3960?
[jira] [Comment Edited] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480547#comment-13480547 ] Suresh Srinivas edited comment on HDFS-2802 at 10/20/12 7:20 PM: - Thanks for the comments, guys. bq. In some of the most commercially popular systems which implement snapshots, snapshots do not count against the disk quotas How do they handle disk quota use when the original file is deleted and only snapshots exist? That is the reason why counting the disk quota makes sense. bq. First, I'm concerned with the O(# of files + # of directories) nature of this design, both in terms of time taken to create a snapshot and the NN memory resources consumed. I agree with you on this. We wanted to begin with this approach and then optimize memory use further. The initial patch uploaded here attempted premature optimization of both memory and snapshot creation time, and thus made the code really complicated. But this is a definite goal, and we will update that part of the design as we continue to work. This is covered in the open issues/future work section. comment 1: Agree with this part. As we continue the work, we can make a decision on this. For supporting RW, let's not make the design/implementation more complicated. comment 2: Will address this as we continue to add more details to the design in the next update. Comment 3, 6: I want to make sure you understand this is an early design and we will continue to add more details. I think some of the questions will be answered by how this works: - Admin can mark directories as snapshottable using the CLI - Users can then create snapshots for these directories using the CLI/API. A snapshot has a snapshot name, and it is unique for a given snapshot root. comment 4: If you look at the snapshot implementation in other systems, it is done at the volume level. That is the parallel we are talking about. 
Comment 5, Comment 7, comment 10: As regards consistency (comment 7), a system where the snapshot is taken at the namespace level without involving the data layer cannot provide a strict consistency guarantee. I also think it may not be relevant where the writers are different from the client that is taking the snapshot. Not sure what guarantee such a client can expect/depend on given that the writers are separate. We could discuss this during the design review. Based on discussion with a few HBase folks, I also think they should be okay with it. Something to discuss with them. I am also not clear on their dependency on HDFS with hbase-6055. comment 8: This could change during implementation if we think access time may not be that important to maintain. comment 9: Agreed. I am leaning towards allowing it. comment 11: Will add use cases. comment 12: See the volume comment; the document sort of covers this. We could discuss this further if the document is not clear.
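The admin/user workflow described in the comment above can be illustrated with the snapshot commands that eventually shipped in HDFS (shown purely as a sketch; the exact command syntax postdates this discussion, and the paths and snapshot name are made up):

```shell
# Admin marks a directory as snapshottable (admin-only operation).
hdfs dfsadmin -allowSnapshot /user/data

# A user then creates a named snapshot; the name must be unique
# under the given snapshot root.
hdfs dfs -createSnapshot /user/data backup-20121020

# The snapshot is exposed under a read-only .snapshot subdirectory.
hdfs dfs -ls /user/data/.snapshot/backup-20121020
```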
[jira] [Created] (HDFS-4094) Specific file type bulk Transfer into HDFS to a specified HDFS location with a track of the transfer number
Anurag G Vyas created HDFS-4094: --- Summary: Specific file type bulk Transfer into HDFS to a specified HDFS location with a track of the transfer number Key: HDFS-4094 URL: https://issues.apache.org/jira/browse/HDFS-4094 Project: Hadoop HDFS Issue Type: Wish Components: scripts Affects Versions: 3.0.0 Environment: Unix Reporter: Anurag G Vyas Fix For: 3.0.0 Need a script for a bulk transfer into HDFS such that the script can identify a specific file type and move only that file type into the specified HDFS location. For example: say I have a local directory called user/dir containing txt files such as anurag123, anurag234, vyas678, ganesh345, anurag277, vyas345, ganesh789, etc. The script must take 2 inputs: one is the file type, say anurag, and the other an HDFS location. It must then move all the files in the directory whose names contain anurag to the specified HDFS location. It would also be good if the script could keep track of the number of files it moved to HDFS by saving the details in a new file, either in HDFS or in the local directory.
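Such a convenience script could be sketched as follows; the function name `bulk_put` and the `transfer_count.txt` file name are illustrative choices, and the sketch assumes the standard `hdfs dfs -put` command is on the PATH:

```shell
#!/usr/bin/env bash
# bulk_put PATTERN SRC_DIR HDFS_DEST
# Copies every regular file in SRC_DIR whose name contains PATTERN
# into HDFS_DEST, and records how many files were transferred.
bulk_put() {
  local pattern="$1" src_dir="$2" dest="$3" count=0
  for f in "$src_dir"/*"$pattern"*; do
    [ -f "$f" ] || continue          # skip directories and non-matches
    hdfs dfs -put "$f" "$dest" && count=$((count + 1))
  done
  # Save the transfer count locally; it could also be -put into HDFS.
  echo "$count" > "$src_dir/transfer_count.txt"
  echo "moved $count file(s) matching '$pattern' to $dest"
}
```

For the example directory above, `bulk_put anurag /home/user/dir /user/hdfs/dest` would copy anurag123, anurag234, and anurag277 and record the count 3 in transfer_count.txt.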
[jira] [Resolved] (HDFS-4094) Specific file type bulk Transfer into HDFS to a specified HDFS location with a track of the transfer number
[ https://issues.apache.org/jira/browse/HDFS-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-4094. --- Resolution: Invalid This sounds like a request for a higher-order convenience script, not a feature we'd put in Hadoop proper.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480836#comment-13480836 ] Colin Patrick McCabe commented on HDFS-2802: Suresh said: bq. How do [other filesystems] handle disk quota use when the original file is deleted and only snapshots exist? That is the reason why counting the disk quota makes sense. ZFS has quotas and refquotas. The former includes snapshot overhead; the latter does not. Based on some Googling, I think that on NetApp devices quotas do not include snapshot overhead (at least by default). I think it makes sense to offer both kinds of quota, although we don't have to implement them both right away, of course.
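The ZFS distinction mentioned in the comment above maps onto two dataset properties; a brief illustration (the property names are real ZFS, but the dataset name `tank/home` and the 10G limit are made up):

```shell
# 'quota' caps all space consumed under the dataset,
# including space held by snapshots.
zfs set quota=10G tank/home

# 'refquota' caps only the space the dataset itself references,
# so snapshot overhead does not count against it.
zfs set refquota=10G tank/home

# Inspect both limits.
zfs get quota,refquota tank/home
```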
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480840#comment-13480840 ] Suresh Srinivas commented on HDFS-2802: --- bq. ZFS has quotas and refquotas. The former includes snapshot overhead; the latter does not. Good to know. Something we should consider as well.
[jira] [Resolved] (HDFS-4057) NameNode.namesystem should be private. Use getNamesystem() instead.
[ https://issues.apache.org/jira/browse/HDFS-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas resolved HDFS-4057. --- Resolution: Fixed Fix Version/s: 1.2.0 I committed the patch to branch-1. Thank you Brandon. NameNode.namesystem should be private. Use getNamesystem() instead. --- Key: HDFS-4057 URL: https://issues.apache.org/jira/browse/HDFS-4057 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 1.2.0 Reporter: Brandon Li Assignee: Brandon Li Priority: Minor Fix For: 1.2.0 Attachments: HDFS-4057.branch-1.patch, HDFS-4057.branch-1.patch NameNode.namesystem should be private. One should use NameNode.getNamesystem() to get it instead.
[jira] [Updated] (HDFS-4072) On file deletion remove corresponding blocks pending replication
[ https://issues.apache.org/jira/browse/HDFS-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-4072: -- Fix Version/s: 1.2.0 I committed the patch to branch-1. On file deletion remove corresponding blocks pending replication Key: HDFS-4072 URL: https://issues.apache.org/jira/browse/HDFS-4072 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Fix For: 1.2.0, 3.0.0, 2.0.3-alpha Attachments: HDFS-4072.b1.001.patch, HDFS-4072.patch, HDFS-4072.trunk.001.patch, HDFS-4072.trunk.002.patch, HDFS-4072.trunk.003.patch, HDFS-4072.trunk.004.patch, TestPendingAndDelete.java Currently, when deleting a file, blockManager does not remove the records corresponding to the file's blocks from pendingReplications. These records can only be removed after a timeout (5~10 min).
[jira] [Created] (HDFS-4095) Add snapshot related metrics
Jing Zhao created HDFS-4095: --- Summary: Add snapshot related metrics Key: HDFS-4095 URL: https://issues.apache.org/jira/browse/HDFS-4095 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Add metrics for the number of snapshots in the system, including 1) the number of snapshot files, and 2) the number of snapshot-only files (snapshot files that are not deleted even though the original file has already been deleted).
[jira] [Created] (HDFS-4096) Add snapshot information to namenode WebUI
Jing Zhao created HDFS-4096: --- Summary: Add snapshot information to namenode WebUI Key: HDFS-4096 URL: https://issues.apache.org/jira/browse/HDFS-4096 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Add snapshot information to namenode WebUI.
[jira] [Created] (HDFS-4097) provide CLI support for create/delete/list snapshots
Brandon Li created HDFS-4097: Summary: provide CLI support for create/delete/list snapshots Key: HDFS-4097 URL: https://issues.apache.org/jira/browse/HDFS-4097 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs client, name-node Affects Versions: Snapshot (HDFS-2802) Reporter: Brandon Li Assignee: Brandon Li provide CLI support for create/delete/list snapshots