[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-02-28 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Status: Open  (was: Patch Available)

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kai Zheng
> Assignee: Kai Zheng
>Priority: Minor
> Attachments: HDFS-9705-v1.patch, HDFS-9705-v2.patch, 
> HDFS-9705-v3.patch, HDFS-9705-v4.patch
>
>
> {{FileSystem#getFileChecksum}} can take a {{length}} parameter, and 0 is a 
> valid value. Currently it returns {{null}} when {{length}} is 0, from the 
> following code block:
> {code}
> //compute file MD5
> final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
> switch (crcType) {
> case CRC32:
>   return new MD5MD5CRC32GzipFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> case CRC32C:
>   return new MD5MD5CRC32CastagnoliFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> default:
>   // If there is no block allocated for the file,
>   // return one with the magic entry that matches what previous
>   // hdfs versions return.
>   if (locatedblocks.size() == 0) {
>     return new MD5MD5CRC32GzipFileChecksum(0, 0, fileMD5);
>   }
>   // we should never get here since the validity was checked
>   // when getCrcType() was called above.
>   return null;
> }
> {code}
> The comment says "we should never get here since the validity was checked", 
> but execution does get there. Since we're using the MD5-MD5-X approach, and 
> {{EMPTY--CONTENT}} is actually a valid case whose MD5 value is 
> {{d41d8cd98f00b204e9800998ecf8427e}}, I suggest we return a reasonable value 
> rather than null. That way the returned value at least carries some useful 
> information, such as values from the block checksum header.
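
As a quick check of the claim that empty content has a well-defined MD5, here is a minimal sketch using only the JDK's {{java.security.MessageDigest}}. The class and method names ({{EmptyMd5Sketch}}, {{md5Hex}}) are made up for illustration and are not part of Hadoop or of any patch on this issue.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class EmptyMd5Sketch {
  /** Returns the lowercase hex MD5 digest of the given bytes. */
  static String md5Hex(byte[] data) {
    try {
      MessageDigest md = MessageDigest.getInstance("MD5");
      byte[] digest = md.digest(data);
      StringBuilder sb = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        // Mask to an unsigned value so every byte prints as two hex digits.
        sb.append(String.format("%02x", b & 0xff));
      }
      return sb.toString();
    } catch (NoSuchAlgorithmException e) {
      // MD5 is a required algorithm on every Java platform.
      throw new IllegalStateException("MD5 not available", e);
    }
  }

  public static void main(String[] args) {
    // Zero bytes of input still produce a valid digest:
    // prints d41d8cd98f00b204e9800998ecf8427e
    System.out.println(md5Hex(new byte[0]));
  }
}
```

The digest of zero bytes is an ordinary value, which is the basis for the suggestion above to return a checksum object instead of {{null}}.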



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-02-28 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Status: Patch Available  (was: Open)

Triggering another build.



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-03-31 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Attachment: HDFS-9705-v5.patch

Updated the patch:
* Rebased onto the latest changes;
* Fixed some Javadoc comments inherited from HDFS-9694.

[~szetszwo], [~umamaheswararao], could you help take a look? Thanks.



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Attachment: HDFS-9705-v1.patch

Uploaded a patch, ready for review. Note that the new test file will be updated 
further in related work such as HDFS-9694.
[~szetszwo], would you mind taking a look and giving your comments? Thanks.



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Attachment: (was: HDFS-9705-v1.patch)



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Attachment: HDFS-9705-v1.patch



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Status: Patch Available  (was: Open)



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Attachment: HDFS-9705-v2.patch

Made the change safer and added a comment at a confusing place.



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Status: Open  (was: Patch Available)



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Status: Patch Available  (was: Open)



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-27 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Attachment: HDFS-9705-v3.patch

Updated the patch according to the above discussion.
Note that the magic value {{MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51}} 
is returned when the requested length is 0, the same as for an empty file. The 
two cases are consolidated into a single place at the beginning. Also cleaned 
up a Javadoc along the way.
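
A rough sketch of that consolidation, under stated assumptions: the class {{ZeroLengthChecksumSketch}}, the simplified {{getFileChecksum}} signature, the {{List<Long>}} of block ids, and the placeholder string for the non-empty path are all hypothetical and stand in for the real client code, which is not shown in this thread. Only the magic value is taken from the comment above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ZeroLengthChecksumSketch {
  /** Magic value quoted in the comment above, returned for an empty file. */
  static final String EMPTY_FILE_CHECKSUM =
      "MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51";

  /**
   * Hypothetical stand-in for the real method. blockIds models the located
   * blocks of the file; the result is a plain string instead of a checksum object.
   */
  static String getFileChecksum(List<Long> blockIds, long length) {
    // Both "requested length is 0" and "file has no blocks" are handled once,
    // up front, so the later switch on the CRC type never sees them.
    if (length == 0 || blockIds.isEmpty()) {
      return EMPTY_FILE_CHECKSUM;
    }
    // Placeholder for the real per-block checksum collection.
    return "checksum-over-" + blockIds.size() + "-block(s)";
  }

  public static void main(String[] args) {
    System.out.println(getFileChecksum(Arrays.asList(1L, 2L), 0));
    System.out.println(getFileChecksum(Collections.<Long>emptyList(), 100));
    System.out.println(getFileChecksum(Arrays.asList(1L), 10));
  }
}
```

Handling both cases in one early return keeps a zero-length request and an empty file from diverging in what they report.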



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-02-01 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Attachment: HDFS-9705-v4.patch

Updated the patch to address a checkstyle warning:
bq. public MD5MD5CRC32FileChecksum getFileChecksum(String src, long length):3: Method length is 193 lines (max allowed is 150).
This is an old issue and will be resolved in the related HDFS-8430, where the 
large method will be refactored.

Nicholas, how do you like the latest patch? Thanks.



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2017-03-05 Thread SammiChen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SammiChen updated HDFS-9705:

Attachment: HDFS-9705-v6.patch




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2017-03-06 Thread SammiChen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SammiChen updated HDFS-9705:

Attachment: HDFS-9705-v7.patch

The previous comment is somewhat misleading. When the crcType is "NULL", which 
means files have no checksum, execution goes through the "default" branch 
of the {{makeFinalResult}} function.
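
To illustrate the control flow only, here is a small sketch with a hypothetical local enum {{CrcType}} and a made-up method {{branchFor}}. It is not the actual {{makeFinalResult}} code and says nothing about what the patch returns from that branch. It only shows that a switch shaped like the one quoted in the issue reaches "default" for a NULL type.

```java
final class CrcTypeBranchSketch {
  /** Hypothetical stand-in for the checksum type; NULL means files carry no checksum. */
  enum CrcType { NULL, CRC32, CRC32C }

  /** Names the branch that a switch shaped like the quoted code would take. */
  static String branchFor(CrcType crcType) {
    switch (crcType) {
      case CRC32:
        return "CRC32 branch";
      case CRC32C:
        return "CRC32C branch";
      default:
        // Reached for CrcType.NULL, so this branch is not dead code.
        return "default branch";
    }
  }

  public static void main(String[] args) {
    for (CrcType type : CrcType.values()) {
      System.out.println(type + " -> " + branchFor(type));
    }
  }
}
```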




[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2017-03-08 Thread SammiChen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SammiChen updated HDFS-9705:

Attachment: HDFS-9705-v7.patch

Re-uploaded the patch to trigger the Jenkins build.




[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2017-03-08 Thread SammiChen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SammiChen updated HDFS-9705:

Attachment: (was: HDFS-9705-v7.patch)

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kai Zheng
> Assignee: Kai Zheng
> Priority: Minor
> Attachments: HDFS-9705-v1.patch, HDFS-9705-v2.patch, 
> HDFS-9705-v3.patch, HDFS-9705-v4.patch, HDFS-9705-v5.patch, 
> HDFS-9705-v6.patch, HDFS-9705-v7.patch



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2017-03-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9705:
--
   Resolution: Fixed
Fix Version/s: 3.0.0-alpha3
   Status: Resolved  (was: Patch Available)

Great, thanks Sammi! Credited both Kai and yourself in the commit message. 
Thanks also to Nicholas for doing earlier reviews.

This doesn't apply to branch-2, so only committed to trunk. Do we care about 
getting this in for 2.x?

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kai Zheng
> Assignee: Kai Zheng
> Priority: Minor
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-9705-v1.patch, HDFS-9705-v2.patch, 
> HDFS-9705-v3.patch, HDFS-9705-v4.patch, HDFS-9705-v5.patch, 
> HDFS-9705-v6.patch, HDFS-9705-v7.patch



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2017-03-27 Thread SammiChen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SammiChen updated HDFS-9705:

Attachment: HDFS-9705-branch-2.001.patch

Patch for backporting to branch-2.

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kai Zheng
> Assignee: SammiChen
> Priority: Minor
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-9705-branch-2.001.patch, HDFS-9705-v1.patch, 
> HDFS-9705-v2.patch, HDFS-9705-v3.patch, HDFS-9705-v4.patch, 
> HDFS-9705-v5.patch, HDFS-9705-v6.patch, HDFS-9705-v7.patch



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2017-03-27 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9705:
--
Status: Patch Available  (was: Reopened)

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kai Zheng
> Assignee: SammiChen
> Priority: Minor
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-9705-branch-2.001.patch, HDFS-9705-v1.patch, 
> HDFS-9705-v2.patch, HDFS-9705-v3.patch, HDFS-9705-v4.patch, 
> HDFS-9705-v5.patch, HDFS-9705-v6.patch, HDFS-9705-v7.patch



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2017-03-27 Thread SammiChen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SammiChen updated HDFS-9705:

Attachment: HDFS-9705-branch-2.002.patch

Fixed one style issue.

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kai Zheng
> Assignee: SammiChen
> Priority: Minor
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-9705-branch-2.001.patch, 
> HDFS-9705-branch-2.002.patch, HDFS-9705-v1.patch, HDFS-9705-v2.patch, 
> HDFS-9705-v3.patch, HDFS-9705-v4.patch, HDFS-9705-v5.patch, 
> HDFS-9705-v6.patch, HDFS-9705-v7.patch



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2017-03-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9705:
--
   Resolution: Fixed
Fix Version/s: 2.8.1
   Status: Resolved  (was: Patch Available)

Thanks Sammi and Kai, committed to branch-2 and branch-2.8 :)

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kai Zheng
> Assignee: SammiChen
> Priority: Minor
> Fix For: 3.0.0-alpha3, 2.8.1
>
> Attachments: HDFS-9705-branch-2.001.patch, 
> HDFS-9705-branch-2.002.patch, HDFS-9705-v1.patch, HDFS-9705-v2.patch, 
> HDFS-9705-v3.patch, HDFS-9705-v4.patch, HDFS-9705-v5.patch, 
> HDFS-9705-v6.patch, HDFS-9705-v7.patch



[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2017-05-29 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-9705:

Fix Version/s: 2.9.0

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kai Zheng
> Assignee: SammiChen
> Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha4, 2.8.1
>
> Attachments: HDFS-9705-branch-2.001.patch, 
> HDFS-9705-branch-2.002.patch, HDFS-9705-v1.patch, HDFS-9705-v2.patch, 
> HDFS-9705-v3.patch, HDFS-9705-v4.patch, HDFS-9705-v5.patch, 
> HDFS-9705-v6.patch, HDFS-9705-v7.patch


