[ https://issues.apache.org/jira/browse/HDFS-10268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229396#comment-15229396 ]
Chris Nauroth edited comment on HDFS-10268 at 4/7/16 12:09 AM:
---------------------------------------------------------------

I am attaching patch v001. [~anu], would you please review?

In addition to the integration work, I'm including some small bug fixes that I discovered along the way. Here is a summary of the changes.

* {{DataNode}}: Trigger a shutdown workflow for the {{ObjectStoreHandler}}.
* {{ObjectStoreHandler}}: Initialize a StorageContainerManager client for use by {{DistributedStorageHandler}}.
* {{ChunkUtils}}: At one point in my testing, I wasn't sending checksums, and this code hit a {{NullPointerException}}, so I added a guard. It isn't strictly required at this point, but I'd like to keep the null check anyway to be defensive.
* {{Dispatcher}}: Log full stack traces.
* {{XceiverClient}}: Implement the {{Closeable}} interface using the pre-existing {{close}} implementation, and provide a getter for the underlying {{Pipeline}}.
* {{XceiverClientHandler}}: Stifle a nitpicky Checkstyle warning.
* {{XceiverClientManager}}: This is a new class that manages the lifecycle of {{XceiverClient}} instances. As described in its JavaDocs, it's just a simple passthrough to initialization and close right now, but we can evolve it later to provide more efficient connection pooling (see the first sketch below).
* {{StorageContainerManager}}: Call registered DataNodes to create a container and establish a single {{Pipeline}} for use in all subsequent operations for the lifetime of the process. This is a stub implementation intended to enable testing.
* {{OzoneBucket}}: Don't explicitly set a Content-Length header. The HTTP client expects to get this from the {{InputStreamEntity}}, which already includes the length, and explicitly setting the header causes it to throw an exception. Also, mark and reset the {{InputStream}} around the digest calculation; otherwise, the digest calculation fully consumes the stream, and there is no content left to send in the request body (see the sketch below).
* {{OzoneException}}: Fix an assignment of the exception message to the wrong field, which later resulted in incorrect logging.
* {{OzoneQuota}}: Provide a helper method for formatting the quota as a string, which the existing {{parseQuota}} method can then consume. This helps with round-trip serialization/deserialization of the quota in container key metadata (see the sketch below).
* {{ChunkInputStream}}: This is a new class, used during key GET requests, that manages reading the chunks from the container protocol, buffering internally, and ultimately returning the data to the reader (the REST service). See the sketch below.
* {{ChunkOutputStream}}: This is a new class, used during key PUT requests, that manages receiving the bytes, buffering internally, and flushing them to container chunks. To preserve atomic semantics for key replacement, this class uses a unique "stream ID" for each PUT operation. Chunks are named using a composite key consisting of key, stream ID, and chunk index, and we replace the entire list of chunks in a single operation, so a concurrent reader won't accidentally observe interleaved chunks from different versions of the key data (see the sketch below). This approach does leave an open question of what to do about the orphaned chunks from the prior version of the key data. We'll need to explore ways to clean them out.
* {{ContainerProtocolCalls}}: This class has all of the container protocol calls. We may want to refactor it later into a richer set of client classes.
* {{DistributedStorageHandler}}: All relevant operations for creating/getting volumes, buckets, and keys are now implemented. Remaining unimplemented operations throw {{UnsupportedOperationException}} to make it clearly visible when we hit something that isn't done yet.
* {{OzoneContainerTranslation}}: This class defines the translation back and forth between the Ozone domain model and the container protocol model. Much like {{ContainerProtocolCalls}}, it could be a candidate for future refactoring.
* {{MiniOzoneCluster}}: A few clean-ups to facilitate the test suite I wrote for this patch.
* {{TestOzoneRestWithMiniCluster}}: Tests covering the operations implemented in this patch, exercised through a {{MiniOzoneCluster}}.

In addition to the test suite, I also completed manual testing against a single-node cluster from a full distro build.

I expect pre-commit will show 1 Checkstyle warning about a long method in {{DataNode}}. This is already a long method, and I don't intend to address it within the scope of this patch.

I expect pre-commit will show 1 license check failure on dfs.hosts.json. This is unrelated; there is a fix for it in trunk, but we haven't merged it into the feature branch yet.
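A few rough sketches of the points above follow. These are simplified illustrations I put together for reviewers, not code lifted from the patch, and the names and signatures in them are assumptions unless the summary above states otherwise.

First, the passthrough lifecycle in {{XceiverClientManager}}. The {{acquireClient}}/{{releaseClient}} names and the factory/connect details here are hypothetical stand-ins:

{code:java}
import java.io.Closeable;
import java.io.IOException;

// Hypothetical sketch of a passthrough client manager with no pooling yet.
public class XceiverClientManagerSketch {

  /** Minimal stand-in for the real XceiverClient; only close() is taken from the summary above. */
  public interface Client extends Closeable {
    void connect() throws IOException;
  }

  /** Stand-in for constructing a client against a pipeline. */
  public interface ClientFactory {
    Client newClient(String pipelineId);
  }

  private final ClientFactory factory;

  public XceiverClientManagerSketch(ClientFactory factory) {
    this.factory = factory;
  }

  // Passthrough: create and connect a fresh client on every acquire.
  // A later revision could swap in a real connection pool here.
  public Client acquireClient(String pipelineId) throws IOException {
    Client client = factory.newClient(pipelineId);
    client.connect();
    return client;
  }

  // Passthrough: just close the client; with pooling this would return it to the pool.
  public void releaseClient(Client client) throws IOException {
    client.close();
  }
}
{code}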
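Next, the mark/reset pattern that keeps the digest calculation in {{OzoneBucket}} from consuming the request body. This is a minimal, JDK-only sketch; it assumes an MD5 digest and a mark-capable stream, and the real patch may choose a different digest or handle non-markable streams differently:

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public final class DigestThenSendSketch {

  // Computes a digest over the stream, then rewinds it so the same bytes can
  // still be sent as the request body. Assumes the stream supports mark/reset
  // and that the payload is small enough to buffer for the rewind.
  public static String digestAndRewind(InputStream in)
      throws IOException, NoSuchAlgorithmException {
    if (!in.markSupported()) {
      throw new IllegalArgumentException("stream must support mark/reset");
    }
    in.mark(Integer.MAX_VALUE);

    MessageDigest md5 = MessageDigest.getInstance("MD5");
    byte[] buffer = new byte[8192];
    int read;
    while ((read = in.read(buffer)) != -1) {
      md5.update(buffer, 0, read);
    }

    // Rewind before handing the stream to the HTTP entity. The entity carries
    // the content length itself, so no explicit Content-Length header is set.
    in.reset();
    return Base64.getEncoder().encodeToString(md5.digest());
  }
}
{code}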
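For the {{OzoneQuota}} round trip, the idea is simply that the new formatting helper emits a string in exactly the form the existing {{parseQuota}} method accepts, so the value survives a write/read through container key metadata. The unit names and byte multipliers below are illustrative assumptions:

{code:java}
public final class QuotaFormatSketch {

  /** Illustrative units; the real OzoneQuota units may differ. */
  public enum Unit {
    MB(1024L * 1024), GB(1024L * 1024 * 1024), TB(1024L * 1024 * 1024 * 1024);

    final long bytes;

    Unit(long bytes) {
      this.bytes = bytes;
    }
  }

  // Format as e.g. "10GB": a form the parser below can consume again.
  public static String formatQuota(long size, Unit unit) {
    return size + unit.name();
  }

  // Parse the formatted string back into a byte count.
  public static long parseQuotaBytes(String quota) {
    for (Unit unit : Unit.values()) {
      if (quota.endsWith(unit.name())) {
        String number = quota.substring(0, quota.length() - unit.name().length());
        return Long.parseLong(number) * unit.bytes;
      }
    }
    throw new IllegalArgumentException("Unrecognized quota: " + quota);
  }

  public static void main(String[] args) {
    String stored = formatQuota(10, Unit.GB);      // written to key metadata
    System.out.println(parseQuotaBytes(stored));   // read back: 10737418240
  }
}
{code}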
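For {{ChunkInputStream}}, the essential behavior is chunk-at-a-time reads with internal buffering. The {{fetchChunk}} hook below is a hypothetical stand-in for the actual container protocol read call:

{code:java}
import java.io.IOException;
import java.io.InputStream;

// Sketch of chunk-at-a-time reading with internal buffering.
public abstract class ChunkInputStreamSketch extends InputStream {

  private byte[] buffer = new byte[0]; // data of the current chunk
  private int position;                // next unread offset in buffer
  private int chunkIndex;              // next chunk to fetch

  /** Returns the bytes of the given chunk, or null when no chunks remain. */
  protected abstract byte[] fetchChunk(int index) throws IOException;

  @Override
  public int read() throws IOException {
    // Refill the internal buffer from the next chunk once it is exhausted.
    while (position >= buffer.length) {
      byte[] next = fetchChunk(chunkIndex++);
      if (next == null) {
        return -1; // no more chunks: end of the key data
      }
      buffer = next;
      position = 0;
    }
    return buffer[position++] & 0xFF;
  }
}
{code}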
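Finally, the chunk naming scheme behind the atomic key replacement in {{ChunkOutputStream}}. The separator characters and name format are assumptions for illustration; the important property is that every PUT gets a fresh stream ID, so its chunks can never be confused with chunks from an earlier version of the same key:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Sketch of composite chunk names (key + stream ID + chunk index) and a
// chunk list that is committed as a whole.
public final class ChunkNamingSketch {

  private final String key;
  private final String streamId = UUID.randomUUID().toString();
  private final List<String> chunkNames = new ArrayList<>();

  public ChunkNamingSketch(String key) {
    this.key = key;
  }

  // Each flushed buffer becomes a chunk whose name embeds key, stream ID and
  // index, so chunks from different PUTs of the same key cannot collide.
  public String nextChunkName() {
    String name = key + "_stream_" + streamId + "_chunk_" + chunkNames.size();
    chunkNames.add(name);
    return name;
  }

  // On close, the whole list is committed in a single metadata update, so a
  // concurrent reader sees either the old chunk list or the new one, never a
  // mix of the two.
  public List<String> chunkListToCommit() {
    return new ArrayList<>(chunkNames);
  }
}
{code}

Whatever the exact format ends up being, the chunks left behind under the previous stream ID are the orphan-cleanup problem mentioned above.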
> Ozone: end-to-end integration for create/get volumes, buckets and keys.
> -----------------------------------------------------------------------
>
>                 Key: HDFS-10268
>                 URL: https://issues.apache.org/jira/browse/HDFS-10268
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HDFS-10268-HDFS-7240.001.patch
>
>
> The HDFS-7240 feature branch now has the building blocks required to enable
> end-to-end functionality and testing for create/get volumes, buckets and
> keys. The scope of this patch is to complete the necessary integration in
> {{DistributedStorageHandler}} and related classes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)