[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-26 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948814#comment-13948814
 ] 

Konstantin Shvachko commented on HDFS-6087:
---

> Currently, block recovery is very complicated when the pipeline breaks

Indeed, the complexity starts when something breaks, and I still don't see how 
you propose to simplify the process.

But truncate, as discussed in HDFS-3107, is performed on a closed file and is 
therefore much less related to pipeline recovery, since there is no pipeline 
after the file is closed. That, to me, makes truncate a simpler task than a 
rewrite of the pipeline.
Nicholas in [his 
comment|https://issues.apache.org/jira/browse/HDFS-3107?focusedCommentId=13235941&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13235941]
 mentioned three ideas to implement truncate. I was recently thinking about 
another one. 

What if we implemented truncate similarly to lease recovery? That is, when a 
client asks to truncate a file, the NN changes the list of blocks by deleting 
some of the tail ones and decrementing the size of the last. Then the NN issues 
a DataNodeCommand to recover the last block. The DNs, as a result of the 
recovery, will truncate their replica files and then call 
commitBlockSynchronization() to report the new length to the NN.
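The NN-side bookkeeping described above can be modeled in a few lines. This is an illustrative Python sketch, not HDFS code; the `Block` class and `truncate` function are hypothetical, and the returned block stands in for the one the DataNodeCommand would tell the DNs to recover:

```python
class Block:
    """Minimal stand-in for a block entry in the NN's block list."""
    def __init__(self, block_id, num_bytes):
        self.block_id = block_id
        self.num_bytes = num_bytes

def truncate(blocks, new_length):
    """Trim the block list to new_length bytes: drop whole tail blocks,
    shrink the last kept block if the cut falls inside it, and return
    that block (its replicas would need recovery on the DNs), or None."""
    kept, consumed, to_recover = [], 0, None
    for b in blocks:
        if consumed + b.num_bytes <= new_length:
            kept.append(b)
            consumed += b.num_bytes
        else:
            cut = new_length - consumed
            if cut > 0:
                b.num_bytes = cut      # decrement the size of the last block
                kept.append(b)
                to_recover = b         # DNs must truncate replicas of this one
            break                      # remaining tail blocks are dropped
    blocks[:] = kept
    return to_recover
```

For example, truncating a 128M+128M file to 200M keeps the first block whole, shrinks the second to 72M, and flags the second for replica recovery; truncating to an exact block boundary needs no recovery at all.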

Sorry, I didn't want to hijack your jira, so if you intend to proceed with a 
more general design here, I'll re-post my idea under HDFS-3107.

> Unify HDFS write/append/truncate
> 
>
> Key: HDFS-6087
> URL: https://issues.apache.org/jira/browse/HDFS-6087
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Guo Ruijing
> Attachments: HDFS Design Proposal.pdf, HDFS Design Proposal_3_14.pdf
>
>
> In existing implementation, HDFS file can be appended and HDFS block can be 
> reopened for append. This design will introduce complexity including lease 
> recovery. If we design HDFS block as immutable, it will be very simple for 
> append & truncate. The idea is that HDFS block is immutable if the block is 
> committed to namenode. If the block is not committed to namenode, it is HDFS 
> client’s responsibility to re-added with new block ID.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-26 Thread Guo Ruijing (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947754#comment-13947754
 ] 

Guo Ruijing commented on HDFS-6087:
---

Hi, Konstantin,

The truncate semantics in my proposal are the same as in HDFS-3107. We can 
implement them after resolving the design concerns:

1) implement truncate as in the design proposal
2) implement write/append as in the design proposal after truncate is stable



[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-26 Thread Guo Ruijing (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947748#comment-13947748
 ] 

Guo Ruijing commented on HDFS-6087:
---

Hi, Konstantin,

In fact, I am proposing a new design for HDFS write/append/truncate, writable 
snapshots, and snapshot-on-snapshot. This JIRA was created for HDFS 
write/append/truncate.
Currently, block recovery is very complicated when the pipeline breaks; see 
HDFS-5728 for an example.





[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-25 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946968#comment-13946968
 ] 

Konstantin Shvachko commented on HDFS-6087:
---

Guo, could you please clarify:
Are you proposing a new design for pipeline handling, or do you just want to 
add truncate?
New pipeline handling is probably going to be hard, while adding truncate could 
be simpler.
If it is truncate you want, do you have any requirements for APIs or semantics 
that differ from those laid out under HDFS-3107?



[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-17 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938070#comment-13938070
 ] 

Todd Lipcon commented on HDFS-6087:
---

Even creating a new hard link on every hflush is a no-go performance-wise, I'd 
think. Involving the NN in a round trip on every hflush would also kill the 
scalability of HBase and other applications that hflush hundreds of times per 
second per node.



[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-15 Thread Guo Ruijing (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936395#comment-13936395
 ] 

Guo Ruijing commented on HDFS-6087:
---

I think I have already addressed Konstantin Shvachko's and Tsz Wo Nicholas 
Sze's comments/concerns. I will wait for your new comments/concerns and update 
the document accordingly.

The design motivation is:

1) unify HDFS write/append/truncate

2) the design is the basis for writable snapshots / snapshot restore (this JIRA 
is not created to track the snapshot items)



[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-15 Thread Guo Ruijing (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936391#comment-13936391
 ] 

Guo Ruijing commented on HDFS-6087:
---

Issue: the last block is not available for reading.


Solution 1: if a block is referenced by a client, the block is moved to the 
NN's remove list only after the client unreferences it.

1) GetBlockLocations with a Reference option
2) the client copies the block to a local buffer
3) a new RPC message, UnreferenceBlocks, is sent to the NN

Solution 2: the block is moved to trash and its deletion in the DN is delayed.

In the existing implementation, blocks are deleted in the DN after the 
heartbeat response (lazy block deletion).

If a block is being read by a client and a delete is requested for it, the DN 
should delete the block only after the read completes.

In most cases, the client can read the last block:

1) the client requests the block location information

2) the HDFS client copies the blocks to a local buffer

3) the heartbeat response requests the block deletion (lazy block deletion)

4) the HDFS application slowly reads the data from the local buffer

For the race condition between 2) and 3), we can delay the block deletion.

Even if the block is deleted, the client can request the new block information.

I like solution 2.
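Solution 2 can be sketched as a small DN-side trash that defers physical deletion until both a grace period has passed and no reader still holds the block. This is a hypothetical Python model for illustration; none of these names exist in HDFS:

```python
class BlockTrash:
    """Toy model of delayed block deletion on a DN: a deleted block is
    only purged after `delay` time units AND once no reader holds it."""
    def __init__(self, delay=60):
        self.delay = delay
        self.trash = {}     # block_id -> time the delete was requested
        self.readers = {}   # block_id -> count of in-flight reads

    def open_read(self, block_id):
        self.readers[block_id] = self.readers.get(block_id, 0) + 1

    def close_read(self, block_id):
        self.readers[block_id] -= 1

    def request_delete(self, block_id, now):
        self.trash[block_id] = now   # move to trash, don't unlink yet

    def purgeable(self, now):
        # blocks safe to unlink: grace period elapsed and no active readers
        return [b for b, t in self.trash.items()
                if now - t >= self.delay and self.readers.get(b, 0) == 0]
```

A slow reader that opened the block before the delete request keeps it alive even past the grace period, which covers the race between steps 2) and 3) above.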




[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-15 Thread Guo Ruijing (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936383#comment-13936383
 ] 

Guo Ruijing commented on HDFS-6087:
---

Writing that does not end on a block boundary will trigger block copying in the 
DN:

1) it won't lead to a lot of small blocks
2) as in most file systems, hflush/hsync/truncate may cause some performance 
degradation

If we design zero-copy for the block copy, there is little performance 
degradation:

1) a block is defined as (block data file, block length)
2) the source block is already committed to the NN and immutable
3) a block data file can be created/appended but cannot be overwritten or 
truncated
4) the block length may not be equal to the block data file length
5) create a hard link for the block data file if the copy length equals the 
file length
6) copy the block data file if the copy length is less than the file length

Example:

1) Block 1: (blockfile1, 32M); blockfile1 (length: 32M)

2) copy Block 1 to Block 2 with 32M

a) hard-link blockfile1 to blockfile2
b) Block 2: (blockfile2, 32M); blockfile2 (length: 32M)

3) write a 16M buffer to Block 2

a) Block 1: (blockfile1, 32M); blockfile1 (length: 48M)
b) Block 2: (blockfile2, 48M); blockfile2 (length: 48M)

4) copy Block 2 to Block 3 with 16M

a) copy blockfile2 to blockfile3 with 16M
b) Block 1: (blockfile1, 32M); blockfile1 (length: 48M)
c) Block 2: (blockfile2, 48M); blockfile2 (length: 48M)
d) Block 3: (blockfile3, 16M); blockfile3 (length: 16M)
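Rules 5) and 6) can be sketched directly on top of a POSIX file system. This is an illustrative Python sketch (the function name is made up, and `os.link` assumes a POSIX host), showing the hard-link fast path when the copy covers the whole data file, and the prefix copy otherwise:

```python
import os

def copy_block_file(src_file, dst_file, copy_length):
    """Zero-copy when possible: hard-link the data file if the requested
    length equals the file length (rule 5), otherwise copy only the
    first copy_length bytes to a new file (rule 6)."""
    file_length = os.path.getsize(src_file)
    if copy_length == file_length:
        os.link(src_file, dst_file)   # same inode: appends grow both names
    else:
        with open(src_file, "rb") as src, open(dst_file, "wb") as dst:
            dst.write(src.read(copy_length))
```

Note that after the hard link, appending through blockfile2 grows blockfile1 as well (they share an inode), which is exactly why the example above shows blockfile1's file length at 48M while Block 1's logical length stays 32M.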



[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-15 Thread Guo Ruijing (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936379#comment-13936379
 ] 

Guo Ruijing commented on HDFS-6087:
---

If the client needs to read the data early, the application should be:

1. open (for create/append) 2. write 3. hflush/hsync 4. write 5. close

Note: writing that does not end on a block boundary will trigger a block copy 
in the DN (we may design zero-copy for the block copy).

If the client doesn't need to read early, the application can be:

1. open (for create/append) 2. write 3. write 4. close



[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-15 Thread Guo Ruijing (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936378#comment-13936378
 ] 

Guo Ruijing commented on HDFS-6087:
---

It supports hflush/hsync:

1) sync all buffers

2) commit the block to the NN if the flush is on a block boundary

3) copy a new block, append the buffer to the new block, and commit it to the 
NN if the flush is not on a block boundary
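After the buffers are synced, the remaining two steps reduce to a small client-side decision. A minimal sketch, assuming a fixed block size; the helper name and the returned action strings are hypothetical:

```python
def hflush_action(bytes_in_block, block_size):
    """After syncing all buffers, decide how to publish the data to the
    NN: commit the block as-is on a boundary, otherwise copy-on-write
    the partial block and commit the copy (since blocks are immutable)."""
    if bytes_in_block == block_size:
        return "commit"            # step 2: flush lands on a block boundary
    return "copy-and-commit"       # step 3: copy new block, append, commit
```

This is where the per-hflush block copy (or hard link) cost discussed elsewhere in this thread comes from: every mid-block flush takes the copy-and-commit path.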



[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-14 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935727#comment-13935727
 ] 

Konstantin Shvachko commented on HDFS-6087:
---

If it does copy-on-write, then the block is not immutable, at least in the 
sense I understand the term.



[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935707#comment-13935707
 ] 

Tsz Wo Nicholas Sze commented on HDFS-6087:
---

> 1. A block cannot be read by others while under construction, until it is 
> fully written and committed. ...

It also does not support hflush.

> 2. Your proposal (if I understand it correctly) will potentially lead to a 
> lot of small blocks if appends, fsyncs (and truncates) are used intensively. 
> ...

I guess it won't lead to a lot of small blocks, since it does copy-on-write. 
However, there is going to be a lot of block copying if there are many 
appends, hsyncs, etc.


In addition, I think reading the last block would be a problem: if a reader 
opens a file and reads the last block slowly, and a writer then reopens the 
file for append and commits a new last block, the old last block may be 
deleted and become unavailable to the reader.



[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-14 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935686#comment-13935686
 ] 

Konstantin Shvachko commented on HDFS-6087:
---

Based on what you write, I see two main problems with your approach.
# A block cannot be read by others while under construction, until it is fully 
written and committed.
That would be a step back. Making under-construction blocks readable was one 
of the append design requirements (see HDFS-265 and the preceding work). If a 
slow client writes to a block at 1KB/min, others will have to wait for hours 
until they can see the progress on the file.
# Your proposal (if I understand it correctly) will potentially lead to a lot 
of small blocks if appends, fsyncs (and truncates) are used intensively.
Say, in order to overcome problem (1), I write my application so that it 
closes the file after each 1KB written and reopens it for append one minute 
later. You get lots of 1KB blocks. And small blocks are bad for the NameNode, 
as we know.



[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-14 Thread Guo Ruijing (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935017#comment-13935017
 ] 

Guo Ruijing commented on HDFS-6087:
---

Updated the document according to Konstantin's comments.



[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-13 Thread Guo Ruijing (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934485#comment-13934485
 ] 

Guo Ruijing commented on HDFS-6087:
---

I plan to remove the snapshot part, and to add one work-flow for 
write/append/truncate and more work-flows for exception handling to the design 
proposal.

The basic idea:

1) a block is immutable. If the block is committed to the NN, we copy the 
block instead of appending to it, and commit the copy to the NN.

2) before a block is committed to the NN, it is the client's responsibility to 
re-add it on failure, and other clients cannot read that block, so we don't 
need a generationStamp to recover the block.

3) after a block is committed to the NN, the file length is updated in the NN, 
so a client cannot see an uncommitted block.

4) write/append/truncate share the same logic.

1. Update the blockID on commit failure, including pipeline failure. The 
design proposal tries to remove the generationStamp.

2. An extra copyBlock(oldBlockID, newBlockID, length) is used for append and 
truncate.

3. commitBlock: a) the block becomes immutable; b) remove all blocks after the 
offset to implement truncate & append; c) update the file length.

4. If a block is not committed to the namenode, the file length is not updated 
and clients cannot read the block.

5. I will add more failure scenarios.
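The commitBlock rule sketched in points 3 and 4 can be modeled in a few lines: visibility only advances when commitBlock succeeds, and truncate and append both reduce to dropping committed blocks past an offset and adding one new immutable block. A toy Python model; the class and method names are illustrative, not HDFS APIs:

```python
class FileState:
    """Toy NN-side model of the commit rule: readers see only `length`,
    which moves exclusively inside commit_block()."""
    def __init__(self):
        self.blocks = []   # committed, immutable blocks: (block_id, length)
        self.length = 0    # visible file length

    def commit_block(self, block_id, block_len, offset):
        # drop committed blocks past `offset` (this is what implements
        # truncate; a plain append passes offset == current length)...
        kept, consumed = [], 0
        for bid, blen in self.blocks:
            if consumed + blen <= offset:
                kept.append((bid, blen))
                consumed += blen
        # ...then append the new immutable block and publish the new length
        kept.append((block_id, block_len))
        self.blocks = kept
        self.length = consumed + block_len
```

In this model an uncommitted block simply never appears in `blocks`, so a failed pipeline leaves nothing to recover: the client re-adds the data under a new block ID, which is the point of dropping the generationStamp.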



[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-13 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934316#comment-13934316
 ] 

Konstantin Shvachko commented on HDFS-6087:
---

Not sure I fully understood what you propose, so please feel free to correct 
me if I am wrong.
# Sounds like you propose to update the blockID every time the pipeline fails, 
and that this will guarantee block immutability.
Isn't that similar to how current HDFS uses the generationStamp? When the 
pipeline fails, HDFS increments the genStamp, making the previously created 
replicas outdated.
# You seem to propose introducing an extra commitBlock() call to the NN.
Current HDFS has similar logic. The block commit is incorporated into the 
addBlock() and complete() calls. E.g., addBlock() changes the state of the 
file's previous block to committed and then allocates the new one.
# I don't see how you get rid of lease recovery, the purpose of which is to 
reconcile the different replicas of the incomplete last block, as they can 
have different lengths or genStamps on different DNs as a result of a client 
or DN failure in the middle of a data transfer.
If you propose to discard uncommitted blocks entirely, that will break the 
current semantics, which state that once a byte has been read by one client it 
should be readable by other clients as well.
# I guess it boils down to this: your diagrams show the regular work-flow but 
don't consider failure scenarios.
