[jira] Commented: (HDFS-867) Add a PowerTopology class to aid replica placement and enhance availability of blocks

2010-01-05 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796665#action_12796665
 ] 

Steve Loughran commented on HDFS-867:
-

How do you plan to use this? In data placement? And balancing?

 Add a PowerTopology class to aid replica placement and enhance availability 
 of blocks 
 --

 Key: HDFS-867
 URL: https://issues.apache.org/jira/browse/HDFS-867
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Jeff Hammerbacher
Priority: Minor

 Power outages are a common reason for a DataNode to become unavailable. 
 A data structure representing the power topology of your data center could be 
 used to implement a power-aware replica placement policy.
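 A purely illustrative sketch (no such class exists in the issue yet; the names and 
 structure below are assumptions) of a minimal power-topology mapping, in the spirit 
 of NetworkTopology's rack strings:
{noformat}
// Illustrative sketch only: map each datanode to the power unit (e.g. PDU or
// circuit) feeding it, so a placement policy can avoid putting every replica
// of a block behind a single power unit.
import java.util.HashMap;
import java.util.Map;

public class PowerTopology {
  private final Map<String, String> nodeToPowerGroup = new HashMap<String, String>();

  public void add(String datanode, String powerGroup) {
    nodeToPowerGroup.put(datanode, powerGroup);
  }

  /** True if both datanodes sit behind the same power unit and would fail together. */
  public boolean sharePower(String nodeA, String nodeB) {
    String a = nodeToPowerGroup.get(nodeA);
    return a != null && a.equals(nodeToPowerGroup.get(nodeB));
  }
}
{noformat}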

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-853) The HDFS webUI should show a metric that summarizes whether the cluster is balanced regarding disk space usage

2010-01-05 Thread Andrew Ryan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796772#action_12796772
 ] 

Andrew Ryan commented on HDFS-853:
--

We're currently graphing both the mean across datanodes and the standard 
deviation of datanodes from that mean, using a script that parses the output of 
'dfsadmin -report'. Our DFS cluster nodes all have the same amount of disk space, 
so you'd expect the mean of the individual datanodes to be the same as % DFS full, 
but it's not quite the same. We haven't yet looked into why this is so.

To directly answer Konstantin's question, the one line we're using is standard 
deviation.

 The HDFS webUI should show a metric that summarizes whether the cluster is 
 balanced regarding disk space usage
 --

 Key: HDFS-853
 URL: https://issues.apache.org/jira/browse/HDFS-853
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Reporter: dhruba borthakur

 It is desirable to know how much the datanodes vary from one another in terms 
 of space utilization to get a sense of how well an HDFS cluster is balanced.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-717) Proposal for exceptions thrown by FileContext and Abstract File System

2010-01-05 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796790#action_12796790
 ] 

Suresh Srinivas commented on HDFS-717:
--

The following are the layers between the application and the Service 
Implementation (such as NameNode).
Application <=> Client library <=> RPC client <=> Network <=> RPC server <=> Service Impl

Key goals:
# InterruptedExceptions in the client library should not be ignored. This will 
help in clean application shutdown. InterruptedException on the server side 
should not be ignored; see below.
# Applications must be able to differentiate RPC-layer exceptions from the 
exceptions in the Service Impl. Applications can choose to retry a request 
based on the different categories of exceptions received.
# Exceptions declared in the API should be propagated end to end over RPC from 
the server to the application. All undeclared exceptions from the Service Impl 
including InterruptedException should be handled by the RPC layer.
# Changes needed in applications to move to FileContext from FileSystem should 
be minimal.

Proposal: 
Exceptions will be organized as shown below; a sketch of how these classes might 
be declared follows the list.
# IOException
#* exceptions as declared in the RPC API - note that the detailed method exceptions 
will be declared even though they are subclasses of IOException
#* RPCException - exceptions in the RPC layer
#** RPCClientException - exception encountered in the RPC client
#** RPCServerException - exception encountered in the RPC server
#** UnexpectedServerException - unexpected exception from the Service Impl to 
RPC handlers.
# RuntimeException
#* HadoopIllegalArgumentException - subclass of IllegalArgumentException; 
indicates an illegal or inappropriate argument. 
#* HadoopInterruptedException - subclass of RuntimeException thrown on 
encountering InterruptedException.
#* UnsupportedOperationException - thrown to indicate the requested operation 
is not supported.
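
A rough sketch of how the classes above might be declared (their placement in the 
hierarchy follows the proposal; the constructors and comments are illustrative 
assumptions, with each class in its own source file):
{noformat}
// Sketch only, not a patch. Imports such as java.io.IOException omitted.
public class RPCException extends IOException {                  // any rpc-layer failure
  public RPCException(String msg) { super(msg); }
}
public class RPCClientException extends RPCException {           // failure in the RPC client
  public RPCClientException(String msg) { super(msg); }
}
public class RPCServerException extends RPCException {           // failure in the RPC server
  public RPCServerException(String msg) { super(msg); }
}
public class UnexpectedServerException extends RPCException {    // undeclared exception from the Service Impl
  public UnexpectedServerException(Throwable cause) { super(cause.toString()); }
}
public class HadoopIllegalArgumentException extends IllegalArgumentException {
  public HadoopIllegalArgumentException(String msg) { super(msg); }
}
public class HadoopInterruptedException extends RuntimeException {
  public HadoopInterruptedException(InterruptedException cause) { super(cause); }
}
{noformat}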

Rationale:
# declared exceptions remain subclasses of IOException as before - no changes 
here.
# group the RPC exceptions, categorized by client side and server side (see the 
example after this list).
# use a runtime exception for InterruptedException - simplifies migration to 
FileContext. A subclass of IOException is not used as applications might have 
catch-and-ignore code.
# HadoopIllegalArgumentException instead of the java IllegalArgumentException - 
helps differentiate exceptions in the Hadoop implementation from exceptions 
thrown by the java libraries. Applications can choose to catch 
IllegalArgumentException.
# unsupported operations are indicated by the unchecked UnsupportedOperationException 
- a subclass of IOException is not used as applications might have catch-and-ignore 
code. A RuntimeException is used since applications cannot recover from this 
condition.
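
To illustrate how an application might use these categories for retry decisions 
(exception names as proposed above; the fileContext variable, path, and retry 
helper are illustrative assumptions only):
{noformat}
// Illustration only: branch on the proposed exception categories.
try {
  FSDataInputStream in = fileContext.open(path);
} catch (RPCClientException e) {
  // the request likely never reached the server - safe to retry
  scheduleRetry(path);                       // hypothetical helper
} catch (RPCServerException e) {
  // the RPC server side failed handling the call - retry or fail over
  scheduleRetry(path);
} catch (UnexpectedServerException e) {
  // undeclared failure inside the Service Impl - surface it, do not retry blindly
  throw e;
} catch (FileNotFoundException e) {
  // a declared Service Impl exception propagated end to end - retrying will not help
  throw e;
}
{noformat}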

Implementation details:
InterruptedException handling:
# Client side changes
#* The client library (both the API interface and the RPC client) and the 
InputStream and OutputStream returned by FileContext throw the unchecked 
HadoopInterruptedException on InterruptedException (see the sketch after this 
list).
# Server changes:
#* InterruptedException is currently ignored in the Service Impl layer. With 
this change the Service Impl will throw the exception. Methods in protocol 
classes such as ClientProtocol will specify InterruptedException in the throws 
clause.
#* On InterruptedException, RPC handlers close the socket connection to the 
client. The client handles this failure the same as a loss of connection.
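
A small sketch of the client-side wrapping described above (the rpcClient field 
and the method body are illustrative assumptions, not from a patch):
{noformat}
// Client library sketch: turn the checked InterruptedException into the
// proposed unchecked HadoopInterruptedException.
public FSDataInputStream open(Path path) throws IOException {
  try {
    return rpcClient.open(path);            // may block in the RPC layer
  } catch (InterruptedException ie) {
    Thread.currentThread().interrupt();     // preserve the thread's interrupt status
    throw new HadoopInterruptedException(ie);
  }
}
{noformat}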

RPC layer changes
# The RPC layer marshals HadoopInterruptedException, 
HadoopIllegalArgumentException, and UnsupportedOperationException from the 
Service Impl all the way to the client.
# The RPC layer throws RPCClientException, RPCServerException and 
UnexpectedServerException.

FileContext, AbstractFileSystem, and protocol changes:
# Methods in FileContext declare IOException and the relevant subclasses of 
IOException. This helps document the specific exceptions thrown and helps in 
marshalling the exception from the server to the application over RPC. 
RPCExceptions are not declared as thrown in FileContext and AbstractFileSystem, 
as some implementations might not use an RPC layer (e.g. the local file system).
example:
{noformat}
public FSDataInputStream open(Path path) throws IOException, 
FileNotFoundException, AccessDeniedException;
{noformat}
# Protocol methods (such as those in ClientProtocol) will throw exceptions similar 
to FileContext, along with InterruptedException.
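For example, an illustrative protocol signature (not an actual declaration from 
the code) might look like:
{noformat}
public LocatedBlocks getBlockLocations(String src, long offset, long length)
    throws IOException, FileNotFoundException, AccessControlException,
           InterruptedException;
{noformat}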

Finally, FileContext will throw the following exceptions. The exception 
hierarchy is flattened; the semantics remain as defined in the earlier 
comments.
# IOException
#* ServerNotReadyException (NameNode safemode etc)
#* OutOfSpaceException for write operations
#* AccessControlException
#* InvalidPathNameException
#* FileNotFoundException
#* FileAlreadyExistsException
#* DirectoryNotEmptyException
#* NotDirectoryException
#* DirectoryNotAllowedException


 Proposal for exceptions thrown by FileContext and Abstract File System
 

[jira] Commented: (HDFS-853) The HDFS webUI should show a metric that summarizes whether the cluster is balanced regarding disk space usage

2010-01-05 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796795#action_12796795
 ] 

Konstantin Shvachko commented on HDFS-853:
--

Maybe we should use the mean and standard deviation of _utilization_ rather 
than raw disk space. This would work for heterogeneous clusters as well. 
By utilization I mean the percentage of disk space used for blocks on a 
data-node. We should also make sure this is consistent with the Balancer: 
balancing should improve the metrics.
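
A minimal sketch (illustrative only; not code from the issue) of computing such a 
summary metric as the mean and standard deviation of per-datanode utilization:
{noformat}
// Illustrative sketch: summarize cluster balance as the mean and standard
// deviation of per-datanode utilization (percent of capacity used for blocks).
public class BalanceMetric {
  /** used[i] = bytes used on datanode i, capacity[i] = total bytes on datanode i */
  public static double[] meanAndStdDev(long[] used, long[] capacity) {
    int n = used.length;
    double[] util = new double[n];
    double sum = 0;
    for (int i = 0; i < n; i++) {
      util[i] = 100.0 * used[i] / capacity[i];   // utilization in percent
      sum += util[i];
    }
    double mean = sum / n;
    double sqDiff = 0;
    for (double u : util) {
      sqDiff += (u - mean) * (u - mean);
    }
    return new double[] { mean, Math.sqrt(sqDiff / n) };
  }
}
{noformat}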

 The HDFS webUI should show a metric that summarizes whether the cluster is 
 balanced regarding disk space usage
 --

 Key: HDFS-853
 URL: https://issues.apache.org/jira/browse/HDFS-853
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Reporter: dhruba borthakur

 It is desirable to know how much the datanodes vary from one another in terms 
 of space utilization to get a sense of how well an HDFS cluster is balanced.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline

2010-01-05 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796810#action_12796810
 ] 

Konstantin Shvachko commented on HDFS-826:
--

I remember we intended to implement two things related to this issue but haven't 
done so yet.
# The client throws an exception to the application when the write pipeline 
falls _below_ the minimal replication factor.
# A client should be able to close a file even if its last block is not 
complete, with the following semantics: if the last block has at least one 
valid replica it will be fully replicated; otherwise the last block is treated 
as a corrupt block.

It seems the patch proposes a new API to work around the problems rather than 
addressing them directly.

 Allow a mechanism for an application to detect that datanode(s)  have died in 
 the write pipeline
 

 Key: HDFS-826
 URL: https://issues.apache.org/jira/browse/HDFS-826
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: ReplicableHdfs.txt


 HDFS does not replicate the last block of the file that is currently being 
 written to by an application. Every datanode death in the write pipeline 
 decreases the reliability of the last block of the currently-being-written 
 file. This situation can be improved if the application can be notified of a 
 datanode death in the write pipeline. Then, the application can decide what 
 the right course of action is to take on this event.
 In our use-case, the application can close the file on the first datanode 
 death, and start writing to a newly created file. This ensures that the 
 reliability guarantee of a block is close to 3 at all times.
 One idea is to make DFSOutputStream.write() throw an exception if the number 
 of datanodes in the write pipeline falls below the minimum.replication.factor that 
 is set on the client (this is backward compatible).
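 A rough sketch of the client-side pattern described in this use-case (purely 
 illustrative; the exception behaviour shown is the proposal above, not an existing 
 API, and the helper names are made up):
{noformat}
// Hypothetical sketch: roll over to a new file when the write pipeline degrades.
FSDataOutputStream out = fs.create(currentPath);
try {
  out.write(record);
} catch (IOException pipelineDegraded) {
  // the pipeline fell below the client's minimum replication (proposed behaviour):
  // close the current file and keep writing into a freshly created one
  out.close();
  currentPath = nextPath();                 // hypothetical helper
  out = fs.create(currentPath);
  out.write(record);
}
{noformat}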

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-755) Read multiple checksum chunks at once in DFSInputStream

2010-01-05 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-755:
-

Status: Open  (was: Patch Available)

 Read multiple checksum chunks at once in DFSInputStream
 ---

 Key: HDFS-755
 URL: https://issues.apache.org/jira/browse/HDFS-755
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: alldata-hdfs.tsv, benchmark-8-256.png, benchmark.png, 
 hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt


 HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple 
 checksum chunks in a single call to readChunk. This is the HDFS-side use of 
 that new feature.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-755) Read multiple checksum chunks at once in DFSInputStream

2010-01-05 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-755:
-

Status: Patch Available  (was: Open)

 Read multiple checksum chunks at once in DFSInputStream
 ---

 Key: HDFS-755
 URL: https://issues.apache.org/jira/browse/HDFS-755
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: alldata-hdfs.tsv, benchmark-8-256.png, benchmark.png, 
 hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt


 HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple 
 checksum chunks in a single call to readChunk. This is the HDFS-side use of 
 that new feature.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-755) Read multiple checksum chunks at once in DFSInputStream

2010-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796895#action_12796895
 ] 

Hadoop QA commented on HDFS-755:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12427406/alldata-hdfs.tsv
  against trunk revision 895877.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/169/console

This message is automatically generated.

 Read multiple checksum chunks at once in DFSInputStream
 ---

 Key: HDFS-755
 URL: https://issues.apache.org/jira/browse/HDFS-755
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: alldata-hdfs.tsv, benchmark-8-256.png, benchmark.png, 
 hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt


 HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple 
 checksum chunks in a single call to readChunk. This is the HDFS-side use of 
 that new feature.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-755) Read multiple checksum chunks at once in DFSInputStream

2010-01-05 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-755:
-

Status: Patch Available  (was: Open)

 Read multiple checksum chunks at once in DFSInputStream
 ---

 Key: HDFS-755
 URL: https://issues.apache.org/jira/browse/HDFS-755
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: alldata-hdfs.tsv, benchmark-8-256.png, benchmark.png, 
 hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, 
 hdfs-755.txt


 HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple 
 checksum chunks in a single call to readChunk. This is the HDFS-side use of 
 that new feature.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-755) Read multiple checksum chunks at once in DFSInputStream

2010-01-05 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-755:
-

Status: Open  (was: Patch Available)

 Read multiple checksum chunks at once in DFSInputStream
 ---

 Key: HDFS-755
 URL: https://issues.apache.org/jira/browse/HDFS-755
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: alldata-hdfs.tsv, benchmark-8-256.png, benchmark.png, 
 hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, 
 hdfs-755.txt


 HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple 
 checksum chunks in a single call to readChunk. This is the HDFS-side use of 
 that new feature.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-755) Read multiple checksum chunks at once in DFSInputStream

2010-01-05 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-755:
-

Attachment: hdfs-755.txt

Reattaching same patch so Hudson doesn't try to apply benchmark results as a 
patch.

 Read multiple checksum chunks at once in DFSInputStream
 ---

 Key: HDFS-755
 URL: https://issues.apache.org/jira/browse/HDFS-755
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: alldata-hdfs.tsv, benchmark-8-256.png, benchmark.png, 
 hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, 
 hdfs-755.txt


 HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple 
 checksum chunks in a single call to readChunk. This is the HDFS-side use of 
 that new feature.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-870) Topology is permanently cached

2010-01-05 Thread Allen Wittenauer (JIRA)
Topology is permanently cached
--

 Key: HDFS-870
 URL: https://issues.apache.org/jira/browse/HDFS-870
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Allen Wittenauer


Replacing the topology script requires a namenode bounce because the NN caches 
the information permanently. It should really either expire the cached entries 
periodically or expire them on -refreshNodes.
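
A minimal sketch of the periodic-expiry idea (illustrative only; not tied to the 
actual namenode code paths, and the class and method names are assumptions):
{noformat}
// Illustrative sketch: cached node->rack entries expire after a TTL, so a
// replaced topology script is picked up without a namenode restart;
// clear() would be hooked to -refreshNodes.
import java.util.HashMap;
import java.util.Map;

public class ExpiringTopologyCache {
  private static class Entry {
    final String rack;
    final long resolvedAt;
    Entry(String rack, long resolvedAt) { this.rack = rack; this.resolvedAt = resolvedAt; }
  }

  private final Map<String, Entry> cache = new HashMap<String, Entry>();
  private final long ttlMillis;

  public ExpiringTopologyCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

  /** Returns the cached rack, or null if missing/expired (caller re-runs the script). */
  public synchronized String get(String node) {
    Entry e = cache.get(node);
    if (e == null || System.currentTimeMillis() - e.resolvedAt > ttlMillis) {
      cache.remove(node);
      return null;
    }
    return e.rack;
  }

  public synchronized void put(String node, String rack) {
    cache.put(node, new Entry(rack, System.currentTimeMillis()));
  }

  /** Drops everything, e.g. when an admin runs -refreshNodes. */
  public synchronized void clear() { cache.clear(); }
}
{noformat}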

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-755) Read multiple checksum chunks at once in DFSInputStream

2010-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796947#action_12796947
 ] 

Hadoop QA commented on HDFS-755:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12429481/hdfs-755.txt
  against trunk revision 895877.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/170/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/170/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/170/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/170/console

This message is automatically generated.

 Read multiple checksum chunks at once in DFSInputStream
 ---

 Key: HDFS-755
 URL: https://issues.apache.org/jira/browse/HDFS-755
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: alldata-hdfs.tsv, benchmark-8-256.png, benchmark.png, 
 hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, 
 hdfs-755.txt


 HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple 
 checksum chunks in a single call to readChunk. This is the HDFS-side use of 
 that new feature.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-755) Read multiple checksum chunks at once in DFSInputStream

2010-01-05 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796952#action_12796952
 ] 

Todd Lipcon commented on HDFS-755:
--

I think these failures are spurious - the same test passes locally:
{noformat}
[junit] Running org.apache.hadoop.hdfs.TestDataTransferProtocol
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 5.678 sec
{noformat}

 Read multiple checksum chunks at once in DFSInputStream
 ---

 Key: HDFS-755
 URL: https://issues.apache.org/jira/browse/HDFS-755
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: alldata-hdfs.tsv, benchmark-8-256.png, benchmark.png, 
 hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, 
 hdfs-755.txt


 HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple 
 checksum chunks in a single call to readChunk. This is the HDFS-side use of 
 that new feature.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.