[ https://issues.apache.org/jira/browse/HDFS-10268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229396#comment-15229396 ]

Chris Nauroth edited comment on HDFS-10268 at 4/7/16 12:09 AM:
---------------------------------------------------------------

I am attaching patch v001.  [~anu], would you please review?  In addition to 
the integration work, I'm including some small bug fixes that I discovered 
along the way.

Here is a summary of the changes.

* {{DataNode}}: Trigger a shutdown workflow for the {{ObjectStoreHandler}}.
* {{ObjectStoreHandler}}: Initialize a {{StorageContainerManager}} client for use 
by {{DistributedStorageHandler}}.
* {{ChunkUtils}}: At one point in my testing, I wasn't sending checksums.  This 
code was hitting a {{NullPointerException}}, so I added a guard.  At this 
point, it isn't strictly required, but I'd like to keep the null check anyway 
to be defensive.
* {{Dispatcher}}: Provide logging of full stack traces.
* {{XceiverClient}}: Implement the {{Closeable}} interface using the 
pre-existing {{close}} implementation, and provide a getter to access the 
underlying {{Pipeline}}.
* {{XceiverClientHandler}}: Stifle a nitpicky Checkstyle warning.
* {{XceiverClientManager}}: This is a new class intended to manage the 
lifecycle of {{XceiverClient}} instances.  As per JavaDocs, it's just a simple 
passthrough to initialization and close right now, but we can evolve it later 
to provide more efficient connection pooling.
* {{StorageContainerManager}}: Call registered DataNodes to create a container 
and establish a single {{Pipeline}} for use in all subsequent operations for 
the lifetime of the process.  This is a stub implementation intended to enable 
testing.
* {{OzoneBucket}}: Don't explicitly set a Content-Length header.  The HTTP 
client expects to get this from the {{InputStreamEntity}}, which already 
includes the length.  Explicitly setting a Content-Length header causes it to 
throw an exception.  Also, mark and reset the {{InputStream}} for the digest 
calculation.  Otherwise, the digest calculation fully consumes the stream, and 
it appears there is no content left to send in the request body.
* {{OzoneException}}: Fix an assignment of the exception message to the wrong 
field, which later resulted in incorrect logging.
* {{OzoneQuota}}: Provide a helper method for formatting the quota as a string, 
which can then be consumed later by the existing {{parseQuota}} method.  This 
helps for round-trip serialization/deserialization of the quota in container 
key metadata.
* {{ChunkInputStream}}: This is a new class used during key GET requests that 
manages reading the chunks from the container protocol, buffering internally, 
and ultimately returning the data to the reader (which is the REST service).
* {{ChunkOutputStream}}: This is a new class used during key PUT requests that 
manages receiving the bytes, buffering internally and flushing them to 
container chunks.  To preserve atomic semantics for key replacements, this 
class takes the approach of using a unique "stream ID" for each PUT operation.  
Chunks are named using a composite key consisting of key, stream ID and chunk 
index.  We replace the entire list of chunks in a single operation, so a 
concurrent reader won't accidentally observe interleaved chunks 
from different versions of the key data.  This approach does leave an open 
question of what to do about the orphaned chunks from the prior version of the 
key data.  We'll need to explore ways to clean out orphaned chunks.
* {{ContainerProtocolCalls}}: This class has all of the container protocol 
calls.  We may want to refactor this later into a richer set of client classes.
* {{DistributedStorageHandler}}: All relevant operations for creating/getting 
volumes, buckets and keys are now implemented.  Remaining unimplemented 
operations throw {{UnsupportedOperationException}} to make it clearly visible 
when we hit something that isn't done yet.
* {{OzoneContainerTranslation}}: This class defines the translation back and 
forth between the Ozone domain model and the container protocol model.  Much 
like {{ContainerProtocolCalls}}, this could be a candidate for future 
refactoring.
* {{MiniOzoneCluster}}: A few clean-ups to facilitate the test suite I wrote 
for this patch.
* {{TestOzoneRestWithMiniCluster}}: Tests covering the operations implemented 
in this patch, exercised through a {{MiniOzoneCluster}}.
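
The mark/reset fix described for {{OzoneBucket}} amounts to the pattern below.  This is a minimal sketch, not the patch's actual code; the class name, helper name, buffer size, and choice of MD5 are all illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestSketch {

  // Compute an MD5 digest without consuming the stream: mark before
  // reading, then reset so the caller can still send the full body.
  static byte[] digestWithoutConsuming(InputStream in, int readLimit) {
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      in.mark(readLimit);               // remember the current position
      byte[] buf = new byte[4096];
      int n;
      while ((n = in.read(buf)) != -1) {
        md5.update(buf, 0, n);
      }
      in.reset();                       // rewind for the request body
      return md5.digest();
    } catch (IOException | NoSuchAlgorithmException e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    byte[] data = "key data".getBytes();
    ByteArrayInputStream in = new ByteArrayInputStream(data);
    digestWithoutConsuming(in, data.length);
    // After reset, all 8 bytes are still available to send.
    System.out.println(in.available());  // prints 8
  }
}
```

Note that {{ByteArrayInputStream}} supports mark/reset natively; a stream that doesn't would first need wrapping, e.g. in a {{BufferedInputStream}}.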
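
The round-trip idea behind the {{OzoneQuota}} helper can be sketched as follows.  The method names and the GB-only unit handling here are assumptions for illustration, not the actual API.

```java
public class QuotaSketch {

  // Hypothetical round-trip: format a quota to a string that the
  // parser can consume again, e.g. for container key metadata.
  static String formatQuota(long sizeInGB) {
    return sizeInGB + "GB";
  }

  static long parseQuota(String s) {
    if (!s.endsWith("GB")) {
      throw new IllegalArgumentException("Unsupported quota format: " + s);
    }
    return Long.parseLong(s.substring(0, s.length() - 2));
  }

  public static void main(String[] args) {
    // Formatting then parsing recovers the original value.
    System.out.println(parseQuota(formatQuota(100)));  // prints 100
  }
}
```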
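
The composite chunk naming used by {{ChunkOutputStream}} can be illustrated with a small sketch; the separator format and the use of a UUID as the stream ID are hypothetical, not the patch's actual encoding.

```java
import java.util.UUID;

public class ChunkNaming {

  // Hypothetical composite chunk name: key + stream ID + chunk index.
  // Because each PUT gets a fresh stream ID, and the chunk list is
  // swapped in a single operation, a concurrent reader never observes
  // interleaved chunks from two versions of the key data.
  static String chunkName(String key, String streamId, int chunkIndex) {
    return key + "_stream_" + streamId + "_chunk_" + chunkIndex;
  }

  public static void main(String[] args) {
    String streamId = UUID.randomUUID().toString();  // unique per PUT
    System.out.println(chunkName("myKey", streamId, 0));
  }
}
```

Chunks named under an old stream ID become the orphans mentioned above; they are unreachable once the chunk list is replaced, which is why a cleanup mechanism is still needed.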
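
A side benefit of {{XceiverClient}} implementing {{Closeable}} is that callers can manage it with try-with-resources.  The stand-in class below is illustrative only; the real client's fields and cleanup logic differ.

```java
import java.io.Closeable;

public class CloseableSketch {

  // Illustrative stand-in for XceiverClient: the pre-existing close()
  // method now also satisfies the Closeable contract.
  static class Client implements Closeable {
    boolean connected = true;

    @Override
    public void close() {
      connected = false;  // pre-existing cleanup logic runs here
    }
  }

  public static void main(String[] args) {
    Client c = new Client();
    try (Client client = c) {
      // use the client...
    }
    // close() was invoked automatically when the try block exited.
    System.out.println(c.connected);  // prints false
  }
}
```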

In addition to the test suite, I also completed manual testing against a 
single-node cluster from a full distro build.

I expect pre-commit will show 1 Checkstyle warning about a long method in 
{{DataNode}}.  This is already a long method, and I don't intend to address it 
within the scope of this patch.

I expect pre-commit will show 1 license check failure on dfs.hosts.json.  This 
is unrelated.  There is a fix for it in trunk, but we haven't merged it into 
the feature branch yet.




> Ozone: end-to-end integration for create/get volumes, buckets and keys.
> -----------------------------------------------------------------------
>
>                 Key: HDFS-10268
>                 URL: https://issues.apache.org/jira/browse/HDFS-10268
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HDFS-10268-HDFS-7240.001.patch
>
>
> The HDFS-7240 feature branch now has the building blocks required to enable 
> end-to-end functionality and testing for create/get volumes, buckets and 
> keys.  The scope of this patch is to complete the necessary integration in 
> {{DistributedStorageHandler}} and related classes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
