[jira] [Resolved] (HDFS-7337) Configurable and pluggable erasure codec and policy

2017-09-21 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-7337.
-
   Resolution: Fixed
Fix Version/s: 3.0.0-beta1
 Release Note: 
This allows users to:
* develop and plug in their own erasure codecs and coders. The plugins will be 
loaded automatically from Hadoop jars, and the corresponding codecs and coders 
will be registered for runtime use.
* define their own erasure coding policies through an XML file and a CLI 
command (see the sketch below). The added policies will be persisted into the 
fsimage.
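For reference, a minimal sketch of such a policy file; the element names follow 
the sample template this work adds, and the XOR schema shown is illustrative. A 
file like this would be loaded with a CLI command along the lines of {{hdfs ec 
-addPolicies -policyFile <file>}}:
{code}
<?xml version="1.0"?>
<!-- A hedged sketch of a user-defined policy file: one XOR(2,1) schema and
     one policy referencing it with a 128 KB cell size. -->
<configuration>
  <layoutversion>1</layoutversion>
  <schemas>
    <schema id="XORk2m1">
      <codec>xor</codec>
      <k>2</k>
      <m>1</m>
      <options> </options>
    </schema>
  </schemas>
  <policies>
    <policy>
      <schema>XORk2m1</schema>
      <cellsize>131072</cellsize>
    </policy>
  </policies>
</configuration>
{code}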

> Configurable and pluggable erasure codec and policy
> ---
>
> Key: HDFS-7337
> URL: https://issues.apache.org/jira/browse/HDFS-7337
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: erasure-coding
>Reporter: Zhe Zhang
>Priority: Critical
>  Labels: hdfs-ec-3.0-nice-to-have
> Fix For: 3.0.0-beta1
>
> Attachments: HDFS-7337-prototype-v1.patch, 
> HDFS-7337-prototype-v2.zip, HDFS-7337-prototype-v3.zip, 
> PluggableErasureCodec.pdf, PluggableErasureCodec-v2.pdf, 
> PluggableErasureCodec-v3.pdf, PluggableErasureCodec v4.pdf
>
>
> According to HDFS-7285 and the design, this aims to support multiple erasure 
> codecs via a pluggable approach. It allows users to define and configure 
> multiple codec schemas with different coding algorithms and parameters. The 
> resultant codec schemas can be utilized and specified via a command tool for 
> different file folders. While designing and implementing such a pluggable 
> framework, a concrete default codec (Reed-Solomon) should also be implemented 
> to prove the framework is useful and workable. A separate JIRA could be 
> opened for the RS codec implementation.
> Note HDFS-7353 will focus on the very low-level codec API and implementation 
> to make concrete vendor libraries transparent to the upper layer. This JIRA 
> focuses on the high-level parts that interact with configuration, schemas, etc.






[jira] [Resolved] (HDFS-7346) Erasure Coding: perform striped erasure encoding work given block reader and writer

2017-09-20 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-7346.
-
Resolution: Later

> Erasure Coding: perform striped erasure encoding work given block reader 
> and writer
> -
>
> Key: HDFS-7346
> URL: https://issues.apache.org/jira/browse/HDFS-7346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Kai Zheng
>Assignee: Li Bo
>
> This assumes facilities like the block reader and writer are ready, and 
> implements and performs erasure encoding work in the *striping* case, 
> utilizing the erasure codec and coder provided by the codec framework.






[jira] [Created] (HDFS-12413) Inotify should support erasure coding policy op as replica meta change

2017-09-10 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-12413:


 Summary: Inotify should support erasure coding policy op as 
replica meta change
 Key: HDFS-12413
 URL: https://issues.apache.org/jira/browse/HDFS-12413
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: erasure-coding
Reporter: Kai Zheng


Currently HDFS inotify already supports metadata changes, such as replication, 
for a file. We should similarly support erasure coding policy 
setting/unsetting for a file, as sketched below.
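A hedged sketch of the consumer side using the existing inotify API; the EC 
policy event type proposed here doesn't exist yet, so the sketch only shows 
where such events would surface, alongside metadata updates like REPLICATION:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSInotifyEventInputStream;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.hdfs.inotify.Event;
import org.apache.hadoop.hdfs.inotify.EventBatch;

import java.net.URI;

public class EcPolicyEventTailer {
  public static void main(String[] args) throws Exception {
    HdfsAdmin admin = new HdfsAdmin(URI.create("hdfs://localhost:9000"),
        new Configuration());
    DFSInotifyEventInputStream events = admin.getInotifyEventStream();
    while (true) {
      EventBatch batch = events.take(); // blocks until edits arrive
      for (Event event : batch.getEvents()) {
        if (event.getEventType() == Event.EventType.METADATA) {
          Event.MetadataUpdateEvent m = (Event.MetadataUpdateEvent) event;
          // Replication changes are reported today; an EC policy set/unset
          // on a file would surface here as a similar metadata update.
          System.out.println(m.getPath() + ": " + m.getMetadataType());
        }
      }
    }
  }
}
{code}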






[jira] [Created] (HDFS-12388) A bad error message in DFSStripedOutputStream

2017-09-03 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-12388:


 Summary: A bad error message in DFSStripedOutputStream
 Key: HDFS-12388
 URL: https://issues.apache.org/jira/browse/HDFS-12388
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kai Zheng


Noticed a failure reported by Jenkins in HDFS-11882. The reported error message 
wasn't correct; it should change from {{the number of failed blocks = 4 > the 
number of data blocks = 3}} to {{the number of failed blocks = 4 > the number 
of parity blocks = 3}}. A sketch of the intended check follows the log below.
{noformat}
Regression

org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030.testBlockTokenExpired

Failing for the past 1 build (Since Failed#20973 )
Took 6.4 sec.
Error Message

Failed at i=6294527
Stacktrace

java.io.IOException: Failed at i=6294527
at 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.write(TestDFSStripedOutputStreamWithFailure.java:559)
at 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:534)
at 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testBlockTokenExpired(TestDFSStripedOutputStreamWithFailure.java:273)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
Caused by: java.io.IOException: Failed: the number of failed blocks = 4 > the 
number of data blocks = 3
at 
org.apache.hadoop.hdfs.DFSStripedOutputStream.checkStreamers(DFSStripedOutputStream.java:392)
at 
org.apache.hadoop.hdfs.DFSStripedOutputStream.handleStreamerFailure(DFSStripedOutputStream.java:410)
at 
org.apache.hadoop.hdfs.DFSStripedOutputStream.flushAllInternals(DFSStripedOutputStream.java:1262)
at 
org.apache.hadoop.hdfs.DFSStripedOutputStream.checkStreamerFailures(DFSStripedOutputStream.java:627)
at 
org.apache.hadoop.hdfs.DFSStripedOutputStream.writeChunk(DFSStripedOutputStream.java:563)
at 
org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:217)
at 
org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:164)
at 
org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:145)
at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:79)
at 
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
at java.io.DataOutputStream.write(DataOutputStream.java:88)
at 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.write(TestDFSStripedOutputStreamWithFailure.java:557)
at 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:534)
at 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testBlockTokenExpired(TestDFSStripedOutputStreamWithFailure.java:273)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{noformat}
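A hedged sketch of the intended check and message; the method and variable 
names here are illustrative, not the exact ones in {{DFSStripedOutputStream}}:
{code}
import java.io.IOException;

final class StreamerFailureCheck {
  // Hypothetical counts standing in for DFSStripedOutputStream's fields.
  static void check(int failCount, int numAllBlocks, int numDataBlocks)
      throws IOException {
    int numParityBlocks = numAllBlocks - numDataBlocks;
    // The condition compares against the parity count, so the message
    // should name parity blocks, not data blocks.
    if (failCount > numParityBlocks) {
      throw new IOException("Failed: the number of failed blocks = "
          + failCount + " > the number of parity blocks = " + numParityBlocks);
    }
  }
}
{code}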






[jira] [Created] (HDFS-11606) Add CLI cmd to remove an erasure code policy

2017-03-31 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-11606:


 Summary: Add CLI cmd to remove an erasure code policy
 Key: HDFS-11606
 URL: https://issues.apache.org/jira/browse/HDFS-11606
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Tim Yao
 Fix For: 3.0.0-alpha3


This is to develop a CLI command allowing a user to remove a user-defined 
erasure coding policy by specifying its name. Note that if the policy is 
referenced and used by existing HDFS files, the removal should fail with a 
clear message.
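A hedged sketch of what such a command would invoke on the client side, 
assuming a {{DistributedFileSystem#removeErasureCodingPolicy(String)}} API; the 
policy name is hypothetical:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class RemoveEcPolicy {
  public static void main(String[] args) throws Exception {
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    // Expected to fail with a clear message if any existing file
    // still references the policy.
    dfs.removeErasureCodingPolicy("XOR-2-1-128k");
  }
}
{code}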






[jira] [Created] (HDFS-11605) Allow user to customize and add new erasure code codecs and policies

2017-03-31 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-11605:


 Summary: Allow user to customize and add new erasure code codecs 
and policies
 Key: HDFS-11605
 URL: https://issues.apache.org/jira/browse/HDFS-11605
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Huafeng Wang
 Fix For: 3.0.0-alpha3


Based on the facility developed in HDFS-11604, this will develop the necessary 
CLI command to load an XML file; the resulting policies will be maintained in 
the NameNode {{ErasureCodingPolicyManager}} as {{USER_POLICIES}}, in line with 
{{SYS_POLICIES}}. A sketch of the underlying client call follows.
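A hedged sketch of the client-side call the CLI would make, assuming a 
{{DistributedFileSystem#addErasureCodingPolicies}} API taking policies parsed 
from the XML file; the RS(6,3) policy shown is illustrative:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.AddErasureCodingPolicyResponse;
import org.apache.hadoop.hdfs.protocol.ErasureCodingPolicy;
import org.apache.hadoop.io.erasurecode.ECSchema;

public class AddEcPolicies {
  public static void main(String[] args) throws Exception {
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    // An RS(6,3) policy with a 1 MB cell size, as if parsed from the XML file.
    ErasureCodingPolicy policy =
        new ErasureCodingPolicy(new ECSchema("rs", 6, 3), 1024 * 1024);
    AddErasureCodingPolicyResponse[] responses =
        dfs.addErasureCodingPolicies(new ErasureCodingPolicy[]{policy});
    for (AddErasureCodingPolicyResponse r : responses) {
      System.out.println(r); // reports per-policy success or failure
    }
  }
}
{code}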






[jira] [Created] (HDFS-11604) Define and parse erasure code codecs, schemas and policies

2017-03-31 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-11604:


 Summary: Define and parse erasure code codecs, schemas and policies
 Key: HDFS-11604
 URL: https://issues.apache.org/jira/browse/HDFS-11604
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Lin Zeng
 Fix For: 3.0.0-alpha3


According to recent discussions with [~andrew.wang], it would be good to allow 
users to define their own erasure coding codecs, schemas and policies via an 
XML file. The XML file can be passed to a CLI command, parsed, and sent to the 
NameNode to persist and maintain.

Open this task to define the XML format, provide a default sample file in the 
configuration folder for users' reference, and implement the necessary parser 
utility.






[jira] [Resolved] (HDFS-8201) Refactor the end to end test for striped file writing and reading

2017-03-20 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-8201.
-
Resolution: Duplicate

Marking this as duplicate as the desired work was already done elsewhere.

> Refactor the end to end test for striped file writing and reading
> ---
>
> Key: HDFS-8201
> URL: https://issues.apache.org/jira/browse/HDFS-8201
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-8201.001.patch, HDFS-8201.002.patch, 
> HDFS-8201-HDFS-7285.003.patch, HDFS-8201-HDFS-7285.004.patch, 
> HDFS-8201-HDFS-7285.005.patch
>
>
> According to off-line discussion with [~zhz] and [~xinwei], we need to 
> implement an end to end test for striped file support:
> * Create an EC zone;
> * Create a file in the zone;
> * Write various typical sizes of content to the file, each size maybe a test 
> method;
> * Read the written content back;
> * Compare the written content and the read content to ensure they match;
> This jira aims to refactor the end to end test class 
> ({{TestWriteReadStripedFile}}) so it can be conveniently reused in the next 
> test step for erasure encoding and recovery. Will open a separate issue 
> for it.






[jira] [Created] (HDFS-10651) Clean up some configuration related codes about legacy block reader

2016-07-18 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-10651:


 Summary: Clean up some configuration related codes about legacy 
block reader
 Key: HDFS-10651
 URL: https://issues.apache.org/jira/browse/HDFS-10651
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: Kai Zheng
Assignee: Kai Zheng
Priority: Minor


HDFS-10548 removed the legacy block reader. This is to clean up the related 
configuration code accordingly, as [~andrew.wang] suggested.






[jira] [Created] (HDFS-10548) Remove the long deprecated BlockReaderRemote

2016-06-19 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-10548:


 Summary: Remove the long deprecated BlockReaderRemote
 Key: HDFS-10548
 URL: https://issues.apache.org/jira/browse/HDFS-10548
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Reporter: Kai Zheng
Assignee: Kai Zheng


To lessen the maintenance burden, as raised in HDFS-8901, I suggest we remove 
the {{BlockReaderRemote}} class, which was deprecated a very long time ago.

From the {{BlockReaderRemote}} header:
{quote}
 * @deprecated this is an old implementation that is being left around
 * in case any issues spring up with the new {@link BlockReaderRemote2}
 * implementation.
 * It will be removed in the next release.
{quote}

From the {{BlockReaderRemote2}} class header:
{quote}
 * This is a new implementation introduced in Hadoop 0.23 which
 * is more efficient and simpler than the older BlockReader
 * implementation. It should be renamed to BlockReaderRemote
 * once we are confident in it.
{quote}

Going even further, after getting rid of the old class, we could rename as the 
comment suggests: BlockReaderRemote2 => BlockReaderRemote.






[jira] [Created] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing missed/corrupt block

2016-02-18 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-9833:
---

 Summary: Erasure coding: recomputing block checksum on the fly by 
reconstructing missed/corrupt block
 Key: HDFS-9833
 URL: https://issues.apache.org/jira/browse/HDFS-9833
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
even when some of the striped blocks are missing, we need to consider 
recomputing block checksums on the fly for the missing/corrupt blocks. To 
recompute a block checksum, the block data needs to be reconstructed by 
erasure decoding, and the main code needed for the block reconstruction could 
be borrowed from HDFS-9719, the refactoring of the existing 
{{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be 
written out to target datanodes, but in this case the remote write isn't 
necessary, as the reconstructed block data is only used to recompute the 
checksum.





[jira] [Created] (HDFS-9733) Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum

2016-02-01 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-9733:
---

 Summary: Refactor DFSClient#getFileChecksum and 
DataXceiver#blockChecksum
 Key: HDFS-9733
 URL: https://issues.apache.org/jira/browse/HDFS-9733
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Kai Zheng
Assignee: Kai Zheng


To prepare for file checksum computing for striped files, this refactors the 
existing code in {{DFSClient#getFileChecksum}} and 
{{DataXceiver#blockChecksum}} to make HDFS-8430 and HDFS-9694 easier.





[jira] [Created] (HDFS-9719) Refactoring ErasureCodingWorker into smaller reusable constructs

2016-01-27 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-9719:
---

 Summary: Refactoring ErasureCodingWorker into smaller reusable 
constructs
 Key: HDFS-9719
 URL: https://issues.apache.org/jira/browse/HDFS-9719
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


This suggests refactoring {{ErasureCodingWorker}} into smaller constructs that 
can be reused in other places, like block group checksum computation on the 
datanode side. As discussed in HDFS-8430 and implemented in the HDFS-9694 
patch, checksum computation for striped block groups would be distributed to 
the datanodes in the group, where block data should be reconstructible when 
missing/corrupted in order to recompute the block checksum. Most of the needed 
code is in the current ErasureCodingWorker and could be reused to avoid 
duplication. Fortunately, we have very good and complete tests, which will 
make the refactoring much easier. The refactoring will also help a lot with 
subsequent phase II tasks for non-striped erasure-coded files and blocks. 





[jira] [Created] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-9705:
---

 Summary: Refine the behaviour of getFileChecksum when length = 0
 Key: HDFS-9705
 URL: https://issues.apache.org/jira/browse/HDFS-9705
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Kai Zheng
Assignee: Kai Zheng
Priority: Minor


{{FileSystem#getFileChecksum}} may accept a {{length}} parameter, and 0 is a 
valid value. Currently it will return {{null}} when the length is 0, in the 
following code block:
{code}
// compute file MD5
final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
switch (crcType) {
case CRC32:
  return new MD5MD5CRC32GzipFileChecksum(bytesPerCRC,
      crcPerBlock, fileMD5);
case CRC32C:
  return new MD5MD5CRC32CastagnoliFileChecksum(bytesPerCRC,
      crcPerBlock, fileMD5);
default:
  // If there is no block allocated for the file,
  // return one with the magic entry that matches what previous
  // hdfs versions return.
  if (locatedblocks.size() == 0) {
    return new MD5MD5CRC32GzipFileChecksum(0, 0, fileMD5);
  }

  // we should never get here since the validity was checked
  // when getCrcType() was called above.
  return null;
}
{code}
The comment says "we should never get here since the validity was checked", 
but we do get here. As we're using the MD5-MD5-X approach, empty content is 
actually a valid case, in which the MD5 value is 
{{d41d8cd98f00b204e9800998ecf8427e}}, so I suggest we return a reasonable 
value other than null. At least some useful information, such as values from 
the block checksum header, could then be seen in the returned value.
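A quick check that empty content has a well-defined MD5, so returning a real 
checksum for length 0 is sound:
{code}
import java.security.MessageDigest;

public class EmptyMd5 {
  public static void main(String[] args) throws Exception {
    byte[] digest = MessageDigest.getInstance("MD5").digest(new byte[0]);
    StringBuilder hex = new StringBuilder();
    for (byte b : digest) {
      hex.append(String.format("%02x", b));
    }
    // Prints d41d8cd98f00b204e9800998ecf8427e, the MD5 of empty input.
    System.out.println(hex);
  }
}
{code}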





[jira] [Created] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks

2016-01-25 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-9694:
---

 Summary: Make existing DFSClient#getFileChecksum() work for 
striped blocks
 Key: HDFS-9694
 URL: https://issues.apache.org/jira/browse/HDFS-9694
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng
 Fix For: 3.0.0


This is a sub-task of HDFS-8430 and will make the existing API 
{{FileSystem#getFileChecksum(path)}} work for striped files. It will also 
refactor existing code and lay out basic work for subsequent tasks, like 
support for the new API proposed there.





[jira] [Created] (HDFS-9663) Optimize some RPC call using lighter weight construct than DatanodeInfo

2016-01-19 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-9663:
---

 Summary: Optimize some RPC call using lighter weight construct 
than DatanodeInfo
 Key: HDFS-9663
 URL: https://issues.apache.org/jira/browse/HDFS-9663
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Kai Zheng
Assignee: Kai Zheng


While working on HDFS-8430, adding an RPC to DataTransferProtocol, it was 
noticed that a very heavy construct, either {{DatanodeInfo}} or 
{{DatanodeInfoWithStorage}}, is used to represent a datanode, most of the time 
just for making a connection. It's very fat and contains much more information 
than is needed. See how it's defined:
{code}
public class DatanodeInfo extends DatanodeID implements Node {
  private long capacity;
  private long dfsUsed;
  private long remaining;
  private long blockPoolUsed;
  private long cacheCapacity;
  private long cacheUsed;
  private long lastUpdate;
  private long lastUpdateMonotonic;
  private int xceiverCount;
  private String location = NetworkTopology.DEFAULT_RACK;
  private String softwareVersion;
  private List<String> dependentHostNames = new LinkedList<>();
  private String upgradeDomain;
...
{code}
On the client and datanode sides, for RPC calls like 
{{DataTransferProtocol#writeBlock}}, it looks like the information contained 
in {{DatanodeID}} is almost enough.

I did a quick hack using a lightweight construct, {{SimpleDatanodeInfo}}, that 
simply extends DatanodeID (no other fields added, though any needed field 
could simply be added), and changed the DataTransferProtocol#writeBlock call. 
Manually checked many relevant tests and it worked fine. To see how much 
network traffic is saved, I did a simple test with this code in {{Sender}}:
{code}
  private static void send(final DataOutputStream out, final Op opcode,
  final Message proto) throws IOException {
LOG.trace("Sending DataTransferOp {}: {}",
proto.getClass().getSimpleName(), proto);
int before = out.size();
op(out, opcode);
proto.writeDelimitedTo(out);
int after = out.size();
System.out.println("X sent=" + (after - before));
out.flush();
  }
{code}
Running the test {{TestWriteRead#testWriteAndRead}}, the change saves about 
100 bytes per call most of the time. The saving may not look big because only 
3 datanodes are sent, but in situations like {{BlockECRecoveryCommand}}, where 
there can be 6 + 3 datanodes to send as targets and sources, the saving will 
be significant.

Hence, I suggest using a more lightweight construct to represent a datanode in 
RPC calls when possible, or other ideas to avoid unnecessary wire data size. 
This may make sense; as noted, there were already some discussions in 
HDFS-8999 about saving datanode bandwidth. A sketch of the construct follows.
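A hedged sketch of the lightweight construct described above; 
{{SimpleDatanodeInfo}} is the name from the quick hack, not an existing class:
{code}
import org.apache.hadoop.hdfs.protocol.DatanodeID;

// DatanodeID (host, ports, UUID, etc.) already carries everything needed
// to open a connection, without the usage/statistics fields of DatanodeInfo.
public class SimpleDatanodeInfo extends DatanodeID {
  public SimpleDatanodeInfo(DatanodeID from) {
    super(from); // copy only the identity fields
  }
}
{code}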








[jira] [Created] (HDFS-9642) Create reader threads pool on demand according to erasure coding policy

2016-01-11 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-9642:
---

 Summary: Create reader threads pool on demand according to erasure 
coding policy
 Key: HDFS-9642
 URL: https://issues.apache.org/jira/browse/HDFS-9642
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Kai Zheng
Assignee: Kai Zheng


While investigating an issue, it was noticed that in {{DFSClient}}, 
{{STRIPED_READ_THREAD_POOL}} is always created during initialization and, by 
default, uses the value *18* regardless of the erasure coding policy in use.

This suggests:
* Create the thread pool on demand, only in the striping case.
* When creating the pool, use a good value respecting the erasure coding 
policy in use (see the sketch below).
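A hedged sketch of the second point; the method placement and names are 
illustrative:
{code}
import org.apache.hadoop.hdfs.protocol.ErasureCodingPolicy;

final class StripedReadPoolSizing {
  // One reader thread per block in the group is a reasonable upper bound,
  // instead of the fixed default of 18.
  static int poolSize(ErasureCodingPolicy policy) {
    return policy.getNumDataUnits() + policy.getNumParityUnits();
  }
}
{code}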





[jira] [Created] (HDFS-9630) DistCp minor refactoring and clean up

2016-01-07 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-9630:
---

 Summary: DistCp minor refactoring and clean up
 Key: HDFS-9630
 URL: https://issues.apache.org/jira/browse/HDFS-9630
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: distcp
Reporter: Kai Zheng
Assignee: Kai Zheng
Priority: Minor
 Fix For: 3.0.0


While working on HDFS-9613, it was found that there are various checkstyle 
issues and minor things to clean up in {{DistCp}}. Better to handle them 
separately so the fix can go in earlier.





[jira] [Created] (HDFS-9613) Minor improvement and clean up in distcp

2016-01-04 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-9613:
---

 Summary: Minor improvement and clean up in distcp
 Key: HDFS-9613
 URL: https://issues.apache.org/jira/browse/HDFS-9613
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Kai Zheng
Assignee: Kai Zheng
Priority: Minor


While working on a related issue, it was noticed there are some places in 
{{distcp}} that would be better improved and cleaned up. In particular, after 
a file is copied to the target cluster, DistCp checks whether the copied file 
is fine. When checking, it's better to check the block size first, then the 
checksum, because the latter is a little expensive, as sketched below.
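A hedged sketch of the cheap-check-first ordering; the method name is 
illustrative, not DistCp's actual one:
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;

final class CopyVerifier {
  static boolean verifyCopy(FileSystem srcFs, FileStatus src,
      FileSystem dstFs, FileStatus dst) throws IOException {
    // Length and block size are metadata-only checks: effectively free.
    if (src.getLen() != dst.getLen()
        || src.getBlockSize() != dst.getBlockSize()) {
      return false;
    }
    // Only then pay for the checksum comparison, which is more expensive.
    FileChecksum srcSum = srcFs.getFileChecksum(src.getPath());
    FileChecksum dstSum = dstFs.getFileChecksum(dst.getPath());
    return srcSum == null || srcSum.equals(dstSum);
  }
}
{code}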





[jira] [Resolved] (HDFS-8902) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful)

2015-12-04 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-8902.
-
Resolution: Duplicate

> Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder 
> in striping read (position and stateful)
> -
>
> Key: HDFS-8902
> URL: https://issues.apache.org/jira/browse/HDFS-8902
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>
> We would choose an on-heap ByteBuffer or a direct ByteBuffer according to 
> the erasure coder used in striped read (positional and stateful), for 
> performance reasons. A pure-Java coder favors the on-heap buffer, while a 
> native coder prefers the direct one, avoiding a data copy.





[jira] [Resolved] (HDFS-8903) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping write

2015-12-04 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-8903.
-
Resolution: Duplicate

> Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder 
> in striping write
> --
>
> Key: HDFS-8903
> URL: https://issues.apache.org/jira/browse/HDFS-8903
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>
> We would choose an on-heap ByteBuffer or a direct ByteBuffer according to 
> the erasure coder used in striped write, for performance reasons. A 
> pure-Java coder favors the on-heap buffer, while a native coder prefers the 
> direct one, avoiding a data copy.





[jira] [Resolved] (HDFS-8904) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping recovery on DataNode side

2015-12-04 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-8904.
-
Resolution: Duplicate

> Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder 
> in striping recovery on DataNode side
> --
>
> Key: HDFS-8904
> URL: https://issues.apache.org/jira/browse/HDFS-8904
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>
> We would choose an on-heap ByteBuffer or a direct ByteBuffer according to 
> the erasure coder used in striped recovery on the DataNode side, like the 
> work to be done on the client side, for performance reasons.





[jira] [Created] (HDFS-9333) TestBlockTokenWithDFSStriped errored complaining port in use

2015-10-28 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-9333:
---

 Summary: TestBlockTokenWithDFSStriped errored complaining port in 
use
 Key: HDFS-9333
 URL: https://issues.apache.org/jira/browse/HDFS-9333
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Kai Zheng
Priority: Minor


Ref. the following:
{noformat}
Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 30.483 sec <<< 
FAILURE! - in 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped
testRead(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped)
  Time elapsed: 11.021 sec  <<< ERROR!
java.net.BindException: Port in use: localhost:49333
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at 
org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
at 
org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:884)
at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:826)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:142)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:821)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:675)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:883)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:862)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1555)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2015)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1996)
at 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS.doTestRead(TestBlockTokenWithDFS.java:539)
at 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped.testRead(TestBlockTokenWithDFSStriped.java:62)
{noformat}





[jira] [Created] (HDFS-9153) DFSIO could output better throughput

2015-09-28 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-9153:
---

 Summary: DFSIO could output better throughput 
 Key: HDFS-9153
 URL: https://issues.apache.org/jira/browse/HDFS-9153
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Kai Zheng
Assignee: Kai Zheng


See the following DFSIO output. I was surprised the test throughput was only 
{{17}} MB/s, which doesn't make sense for a real cluster. Maybe it's meant for 
another purpose? For users, it may make more sense to report the throughput as 
1610 MB/s (1228800/763), calculated as *Total MBytes processed / Test exec 
time*.
{noformat}
15/09/28 11:42:23 INFO fs.TestDFSIO: - TestDFSIO - : write
15/09/28 11:42:23 INFO fs.TestDFSIO:Date & time: Mon Sep 28 
11:42:23 CST 2015
15/09/28 11:42:23 INFO fs.TestDFSIO:Number of files: 100
15/09/28 11:42:23 INFO fs.TestDFSIO: Total MBytes processed: 1228800.0
15/09/28 11:42:23 INFO fs.TestDFSIO:  Throughput mb/sec: 17.457387239456878
15/09/28 11:42:23 INFO fs.TestDFSIO: Average IO rate mb/sec: 17.57563018798828
15/09/28 11:42:23 INFO fs.TestDFSIO:  IO rate std deviation: 1.7076328985378455
15/09/28 11:42:23 INFO fs.TestDFSIO: Test exec time sec: 762.697
15/09/28 11:42:23 INFO fs.TestDFSIO: 
{noformat}





[jira] [Created] (HDFS-8968) New benchmark throughput tool for striping erasure coding

2015-08-26 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8968:
---

 Summary: New benchmark throughput tool for striping erasure coding
 Key: HDFS-8968
 URL: https://issues.apache.org/jira/browse/HDFS-8968
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Rui Li


We need a new benchmark tool to measure the throughput of client writing and 
reading, considering the following cases and factors:
* 3-replica or striping;
* write or read, stateful read or positional read;
* which erasure coder;
* striping cell size;
* concurrent readers/writers using processes or threads.

The tool should be easy to use and should avoid unnecessary local environment 
impact, like local disk.






[jira] [Created] (HDFS-8957) Consolidate client striping input stream codes for stateful read and positional read

2015-08-26 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8957:
---

 Summary: Consolidate client striping input stream codes for 
stateful read and positional read
 Key: HDFS-8957
 URL: https://issues.apache.org/jira/browse/HDFS-8957
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng
 Fix For: HDFS-7285


Currently we have different implementations for client striped read: both 
*StatefulStripeReader* and *PositionStripeReader*. I attempted to consolidate 
the two implementations into one, which results in much simpler code and also 
better performance. Now, in both read paths, it will:
* Use ByteBuffer instead of byte arrays, as stateful read currently does;
* Read directly into the application's buffer, as positional read currently 
does;
* Try to align and merge multiple stripes, as positional read currently does;
* Use the *ECChunk* version of the decode API.

The resultant *StripeReader* is now very near the ideal state desired for the 
next step: employing the *ErasureCoder* API instead of the *RawErasureCoder* 
API.

Will upload an initial patch to illustrate the rough change, even though it 
depends on other issues.





[jira] [Created] (HDFS-8907) Configurable striping read buffer threshold

2015-08-17 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8907:
---

 Summary: Configurable striping read buffer threshold
 Key: HDFS-8907
 URL: https://issues.apache.org/jira/browse/HDFS-8907
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


In the striped input stream, positional read merges all the possible stripes 
together, while stateful read reads one stripe at a time. The former is 
efficient but may incur chunk buffers too large for a client to afford; the 
latter is simple and safe but can be improved for better throughput. This 
would consolidate the two and use a configurable (new or existing) buffer 
threshold to control how it goes, as sketched below. Fixed chunk buffers for 
the read will be allocated accordingly and reused again and again. The aligned 
stripes to read at a time may be computed against the threshold. 
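A hedged sketch of the threshold idea; the config key and default are 
hypothetical:
{code}
import org.apache.hadoop.conf.Configuration;

final class StripeReadPlanner {
  // Cap how many aligned stripes are merged into one read, so buffers
  // stay bounded and can be reused across reads.
  static int stripesPerRead(Configuration conf, int stripeWidthBytes) {
    int threshold = conf.getInt(
        "dfs.client.striped.read.buffer.threshold", 4 * 1024 * 1024);
    return Math.max(1, threshold / stripeWidthBytes);
  }
}
{code}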





[jira] [Created] (HDFS-8905) Refactor DFSInputStream#ReaderStrategy

2015-08-17 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8905:
---

 Summary: Refactor DFSInputStream#ReaderStrategy
 Key: HDFS-8905
 URL: https://issues.apache.org/jira/browse/HDFS-8905
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


The DFSInputStream#ReaderStrategy family doesn't look very good. This 
refactors it a little to make it make more sense.





[jira] [Created] (HDFS-8904) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping recovery on DataNode side

2015-08-17 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8904:
---

 Summary: Uses ByteBuffer on heap or direct ByteBuffer according to 
used erasure coder in striping recovery on DataNode side
 Key: HDFS-8904
 URL: https://issues.apache.org/jira/browse/HDFS-8904
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


We would choose an on-heap ByteBuffer or a direct ByteBuffer according to the 
erasure coder used in striped recovery on the DataNode side, like the work to 
be done on the client side, for performance reasons.





[jira] [Created] (HDFS-8903) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping write

2015-08-17 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8903:
---

 Summary: Uses ByteBuffer on heap or direct ByteBuffer according to 
used erasure coder in striping write
 Key: HDFS-8903
 URL: https://issues.apache.org/jira/browse/HDFS-8903
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


We would choose an on-heap ByteBuffer or a direct ByteBuffer according to the 
erasure coder used in striped write, for performance reasons. A pure-Java 
coder favors the on-heap buffer, while a native coder prefers the direct one, 
avoiding a data copy.





[jira] [Created] (HDFS-8902) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful)

2015-08-17 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8902:
---

 Summary: Uses ByteBuffer on heap or direct ByteBuffer according to 
used erasure coder in striping read (position and stateful)
 Key: HDFS-8902
 URL: https://issues.apache.org/jira/browse/HDFS-8902
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


We would choose an on-heap ByteBuffer or a direct ByteBuffer according to the 
erasure coder used in striped read (positional and stateful), for performance 
reasons. A pure-Java coder favors the on-heap buffer, while a native coder 
prefers the direct one, avoiding a data copy; see the sketch below.
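A hedged sketch of the selection rule; the {{nativeCoder}} flag is 
illustrative:
{code}
import java.nio.ByteBuffer;

final class CoderBuffers {
  // A pure-Java coder works on byte arrays, so heap buffers avoid a copy;
  // a native coder works off-heap, so direct buffers avoid a copy instead.
  static ByteBuffer allocate(boolean nativeCoder, int len) {
    return nativeCoder ? ByteBuffer.allocateDirect(len)
        : ByteBuffer.allocate(len);
  }
}
{code}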





[jira] [Created] (HDFS-8901) Use ByteBuffer/DirectByteBuffer in striping positional read

2015-08-17 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8901:
---

 Summary: Use ByteBuffer/DirectByteBuffer in striping positional 
read
 Key: HDFS-8901
 URL: https://issues.apache.org/jira/browse/HDFS-8901
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


A native erasure coder prefers direct ByteBuffers for performance reasons. To 
prepare for that, this change uses ByteBuffer throughout the code implementing 
striped positional read. 





[jira] [Created] (HDFS-8558) Remove the use of hard-coded cell size value in balancer Dispatcher

2015-06-08 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8558:
---

 Summary: Remove the use of hard-coded cell size value in balancer 
Dispatcher
 Key: HDFS-8558
 URL: https://issues.apache.org/jira/browse/HDFS-8558
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
 Fix For: HDFS-7285


This is part of the work in HDFS-8494. To make it easier to discuss this case, 
opening this issue to remove the use of the hard-coded cell size value in the 
balancer Dispatcher separately. Also copied some good comments here:

From [~walter.k.su]:
{quote}
It's easy to pass cellSize from BlockInfoStriped to Dispatcher. But the 
question is BlockInfoStriped doesn't have it.
You have to get it by: BlockInfoStriped --> getBlockCollection --> getZone --> 
getCellSize. A bit complicated, isn't it?
I think BlockInfoStriped needs to keep cellSize.
{quote}

From [~vinayrpet]:
{quote}
I too was thinking the same when the FSImageLoader problem came up. This will 
increase the memory usage by ~4 bytes for each block though.
{quote}





[jira] [Created] (HDFS-8517) Fix a decoding issue in striped block recovery on the client side

2015-06-02 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8517:
---

 Summary: Fix a decoding issue in striped block recovery on the client side
 Key: HDFS-8517
 URL: https://issues.apache.org/jira/browse/HDFS-8517
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


[~jingzhao] reported a decoding issue in HDFS-8481 in the comment copied below:
bq. While debugging HDFS-8319, I just found that in 
TestWriteReadStripedFile#testWritePreadWithDNFailure, if we change the 
startOffsetInFile from cellSize * 5 to 0, the test fails with the following 
error msg:
{noformat}
java.lang.AssertionError: Byte at 524288 should be the same expected:<27> but 
was:<-9>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at 
org.apache.hadoop.hdfs.TestWriteReadStripedFile.testWritePreadWithDNFailure(TestWriteReadStripedFile.java:390)
{noformat}







[jira] [Created] (HDFS-8382) Remove chunkSize parameter from initialize method of raw erasure coder

2015-05-12 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8382:
---

 Summary: Remove chunkSize parameter from initialize method of raw 
erasure coder
 Key: HDFS-8382
 URL: https://issues.apache.org/jira/browse/HDFS-8382
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


Per discussion in HDFS-8347, we need to support encoding/decoding 
variable-width units of data instead of a predefined fixed width like 
{{chunkSize}}. This issue removes chunkSize from the general raw erasure coder 
API. A specific coder can still support a fixed chunkSize via hard-coding or 
schema customization if necessary, like the HitchHiker coder.
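A hedged sketch of the API after the change; the interface shape is 
illustrative, not the exact branch code:
{code}
import java.nio.ByteBuffer;

// No chunkSize at initialization: each call encodes whatever width the
// buffers carry; a coder needing a fixed width enforces it internally.
interface RawErasureEncoder {
  void initialize(int numDataUnits, int numParityUnits);
  void encode(ByteBuffer[] inputs, ByteBuffer[] outputs);
}
{code}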





[jira] [Created] (HDFS-8379) Fix issues like NPE in TestRecoverStripedFile

2015-05-12 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8379:
---

 Summary: Fix issues like NPE in TestRecoverStripedFile
 Key: HDFS-8379
 URL: https://issues.apache.org/jira/browse/HDFS-8379
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


In the latest branch, found some issues like NPEs. This is to apply some quick 
fixes for them.
{noformat}
org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.namenode.QuotaCounts.add(QuotaCounts.java:82)
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.computeQuotaUsage(INodeFile.java:667)
at 
org.apache.hadoop.hdfs.server.namenode.INode.computeQuotaUsage(INode.java:497)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCountForDelete(FSDirectory.java:676)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp.unprotectedDelete(FSDirDeleteOp.java:247)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp.delete(FSDirDeleteOp.java:57)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp.deleteInternal(FSDirDeleteOp.java:172)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp.delete(FSDirDeleteOp.java:101)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3754)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:957)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:619)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2174)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2168)
{noformat}





[jira] [Resolved] (HDFS-8370) Erasure Coding: TestRecoverStripedFile#testRecoverOneParityBlock is failing

2015-05-11 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-8370.
-
Resolution: Duplicate

> Erasure Coding: TestRecoverStripedFile#testRecoverOneParityBlock is failing
> ---
>
> Key: HDFS-8370
> URL: https://issues.apache.org/jira/browse/HDFS-8370
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>
> This jira is to analyse more on the failure of this unit test. 
> {code}
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:333)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:234)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverOneParityBlock(TestRecoverStripedFile.java:98)
> {code}
> Exception occurred during recovery packet transferring:
> {code}
> 2015-05-09 15:08:08,910 INFO  datanode.DataNode 
> (BlockReceiver.java:receiveBlock(826)) - Exception for 
> BP-1332677436-67.195.81.147-1431184082022:blk_-9223372036854775792_1001
> java.io.IOException: Premature EOF from inputStream
>   at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:203)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:472)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:787)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:803)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:250)
>   at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Created] (HDFS-8360) Fix FindBugs issues introduced by erasure coding

2015-05-08 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8360:
---

 Summary: Fix FindBugs issues introduced by erasure coding
 Key: HDFS-8360
 URL: https://issues.apache.org/jira/browse/HDFS-8360
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng


As reported in 
https://issues.apache.org/jira/browse/HADOOP-11938?focusedCommentId=14534949&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14534949,
there are quite a few FindBugs issues related to erasure coding. It would be 
good to get them resolved before the merge. Note the issues are not relevant 
to HADOOP-11938; I'm just quoting it for easy reference.





[jira] [Created] (HDFS-8347) Using chunkSize to perform erasure decoding in striped block recovery

2015-05-07 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8347:
---

 Summary: Using chunkSize to perform erasure decoding in striped block recovery
 Key: HDFS-8347
 URL: https://issues.apache.org/jira/browse/HDFS-8347
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


While investigating a test failure in {{TestRecoverStripedFile}}, found two 
issues:
* A buffer size other than the chunkSize defined in the schema is used to 
perform the decoding, which is incorrect and causes a decoding failure as 
below. This was exposed by the latest change in the erasure coder.
{noformat}
2015-05-08 18:50:06,607 WARN  datanode.DataNode 
(ErasureCodingWorker.java:run(386)) - Transfer failed for all targets.
2015-05-08 18:50:06,608 WARN  datanode.DataNode 
(ErasureCodingWorker.java:run(399)) - Failed to recover striped block: 
BP-1597876081-10.239.12.51-1431082199073:blk_-9223372036854775792_1001
2015-05-08 18:50:06,609 INFO  datanode.DataNode 
(BlockReceiver.java:receiveBlock(826)) - Exception for 
BP-1597876081-10.239.12.51-1431082199073:blk_-9223372036854775784_1001
java.io.IOException: Premature EOF from inputStream
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:203)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:472)
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:787)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:803)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:250)
at java.lang.Thread.run(Thread.java:745)
{noformat}
* In the raw erasure coder, a bad optimization in the code below: it assumes 
the heap buffer backed by the byte array always starts at position zero and 
spans the whole array for reading or writing.
{code}
  protected static byte[][] toArrays(ByteBuffer[] buffers) {
byte[][] bytesArr = new byte[buffers.length][];

ByteBuffer buffer;
for (int i = 0; i < buffers.length; i++) {
  buffer = buffers[i];
  if (buffer == null) {
bytesArr[i] = null;
continue;
  }

  if (buffer.hasArray()) {
bytesArr[i] = buffer.array();
  } else {
throw new IllegalArgumentException("Invalid ByteBuffer passed, " +
"expecting heap buffer");
  }
}

return bytesArr;
  }
{code} 

Will attach a patch soon to fix the two issues.





[jira] [Resolved] (HDFS-8136) Client gets and uses EC schema when reading and writing a striped file

2015-04-23 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-8136.
-
   Resolution: Fixed
Fix Version/s: HDFS-7285
 Hadoop Flags: Reviewed

> Client gets and uses EC schema when reading and writing a striped file
> -
>
> Key: HDFS-8136
> URL: https://issues.apache.org/jira/browse/HDFS-8136
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Kai Zheng
>Assignee: Kai Sasaki
> Fix For: HDFS-7285
>
> Attachments: HDFS-8136-005.patch, HDFS-8136.1.patch, 
> HDFS-8136.2.patch, HDFS-8136.3.patch, HDFS-8136.4.patch, HDFS-8136.6.patch, 
> HDFS-8136.7.patch
>
>
> Discussed with [~umamaheswararao] and [~vinayrpet]: when reading and writing 
> a striped file, the client can invoke a separate call to the NameNode to 
> request the EC schema associated with the EC zone the file is in. Then the 
> schema can be used to guide the reading and writing. Currently it uses 
> hard-coded values.
> Optionally, as an optimization, the client may cache schema info per file, 
> per zone, or per schema name. We could add the schema name to 
> {{HdfsFileStatus}} for that.





[jira] [Created] (HDFS-8228) testFileSmallerThanOneStripe2 failed

2015-04-23 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8228:
---

 Summary: testFileSmallerThanOneStripe2 failed
 Key: HDFS-8228
 URL: https://issues.apache.org/jira/browse/HDFS-8228
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kai Zheng
Assignee: Li Bo
 Fix For: HDFS-7285


Playing with the branch, I found the following:
{noformat}
java.io.EOFException: Premature EOF: no length prefix available
at 
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2343)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:244)
at 
org.apache.hadoop.hdfs.DataStreamer$ResponseProcessor.run(DataStreamer.java:823)
2015-04-23 23:21:08,666 INFO  mortbay.log (Slf4jLog.java:info(67)) - Stopped 
SelectChannelConnector@localhost:0
2015-04-23 23:21:08,769 INFO  ipc.Server (Server.java:stop(2540)) - Stopping 
server on 57920
2015-04-23 23:21:08,770 INFO  datanode.DataNode 
(BlockReceiver.java:receiveBlock(826)) - Exception for 
BP-1850767374-10.239.12.51-1429802363548:blk_-9223372036854775737_1007
java.io.InterruptedIOException: Interrupted while waiting for IO on channel 
java.nio.channels.SocketChannel[closed]. 6 millis timeout left.
at 
org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:342)
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:472)
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:787)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:793)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:250)
at java.lang.Thread.run(Thread.java:745)
2015-04-23 23:21:08,769 INFO  datanode.DataNode (BlockReceiver.java:run(1250)) 
- PacketResponder: 
BP-1850767374-10.239.12.51-1429802363548:blk_-9223372036854775737_1007, 
type=LAST_IN_PIPELINE, downstreams=0:[]: Thread is interrupted.
2015-04-23 23:21:08,776 WARN  datanode.DataNode 
(BPServiceActor.java:offerService(756)) - BPOfferService for Block pool 
BP-1850767374-10.239.12.51-1429802363548 (Datanode Uuid 
72b12e39-77cb-463d-a919-0ac06d166fcd) service to localhost/127.0.0.1:36877 
interrupted
2015-04-23 23:21:08,776 INFO  ipc.Server (Server.java:run(846)) - Stopping IPC 
Server Responder
{noformat}

[~libo-intel], would you help with this? 





[jira] [Created] (HDFS-8202) Improve end to end striped file test to add erasure recovery test

2015-04-20 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8202:
---

 Summary: Improve end to end striped file test to add erasure recovery test
 Key: HDFS-8202
 URL: https://issues.apache.org/jira/browse/HDFS-8202
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 


This follows on HDFS-8201 to add erasure recovery testing to the end to end 
striped file test:
* After writing certain blocks to the test file, delete some block files;
* Read the file content back and compare; check for any recovery issues, and 
verify that erasure recovery works.





[jira] [Created] (HDFS-8201) Add an end to end test for striped file writing and reading

2015-04-20 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8201:
---

 Summary: Add an end to end test for striped file writing and reading
 Key: HDFS-8201
 URL: https://issues.apache.org/jira/browse/HDFS-8201
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 


According to off-line discussion with [~zhz] and [~xinwei], we need to 
implement an end to end test for striped file support:
* Create an EC zone;
* Create a file in the zone;
* Write various typical sizes of content to the file, each size maybe a test 
method;
* Read the written content back;
* Compare the written content and the read content to ensure they match.

The test facility is expected to gain more steps for erasure encoding and 
recovery. Will open a separate issue for it.





[jira] [Reopened] (HDFS-7545) Data striping support in HDFS client

2015-04-16 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng reopened HDFS-7545:
-

We still have major sub-tasks to resolve. We may also need to provide or 
update the documentation. Sorry it was resolved too soon.

> Data striping support in HDFS client
> 
>
> Key: HDFS-7545
> URL: https://issues.apache.org/jira/browse/HDFS-7545
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Li Bo
> Attachments: DataStripingSupportinHDFSClient.pdf, 
> HDFS-7545-001-DFSOutputStream.patch, HDFS-7545-PoC.patch, clientStriping.patch
>
>
> Data striping is a commonly used data layout with critical benefits in the 
> context of erasure coding. This JIRA aims to extend HDFS client to work with 
> striped blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7545) Data striping support in HDFS client

2015-04-16 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-7545.
-
Resolution: Fixed

Resolving this client-side master JIRA as the related sub-tasks are done.

> Data striping support in HDFS client
> 
>
> Key: HDFS-7545
> URL: https://issues.apache.org/jira/browse/HDFS-7545
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Li Bo
> Attachments: DataStripingSupportinHDFSClient.pdf, 
> HDFS-7545-001-DFSOutputStream.patch, HDFS-7545-PoC.patch, clientStriping.patch
>
>
> Data striping is a commonly used data layout with critical benefits in the 
> context of erasure coding. This JIRA aims to extend HDFS client to work with 
> striped blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8156) Define some system schemas in code

2015-04-15 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8156:
---

 Summary: Define some system schemas in code
 Key: HDFS-8156
 URL: https://issues.apache.org/jira/browse/HDFS-8156
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


This is to define and add some system schemas in code, and also resolve some 
TODOs left from HDFS-7859 and HDFS-7866, as they're still subject to further 
discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8139) Refine or extend EC command to support all cases

2015-04-13 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-8139.
-
Resolution: Duplicate

> Refine or extend EC command to support all cases
> 
>
> Key: HDFS-8139
> URL: https://issues.apache.org/jira/browse/HDFS-8139
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Vinayakumar B
>
> Currently a {{CODEC}} command is used to distribute striping recovery 
> commands to DataNodes. Discussed with [~umamaheswararao] and [~vinayrpet] 
> further, we may need to refine or extend the command to support all cases, as 
> we may distribute encoding and recovery tasks in both striped and 
> non-striped cases. We may have two ways: 1) a single command that uses a 
> flag to denote the case; 2) more commands.
> This will also consider how to lay out, optimize or document relevant info 
> contained in the corresponding command related to the block group, so the 
> DataNode will be able to restore such info according to the concrete command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8139) Refine or extend EC command to support all cases

2015-04-13 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8139:
---

 Summary: Refine or extend EC command to support all cases
 Key: HDFS-8139
 URL: https://issues.apache.org/jira/browse/HDFS-8139
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Vinayakumar B


Currently a {{CODEC}} command is used to distribute striping recovery commands 
to DataNodes. Discussed with [~umamaheswararao] and [~vinayrpet] further, we may 
need to refine or extend the command to support all cases, as we may distribute 
encoding and recovery tasks in both striped and non-striped cases. We may have 
two ways: 1) a single command that uses a flag to denote the case (see the 
sketch below); 2) more commands.

This will also consider how to lay out, optimize or document relevant info 
contained in the corresponding command related to the block group, so the 
DataNode will be able to restore such info according to the concrete command.
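
A minimal sketch of the single-command option, with a flag enum denoting the 
case; all names here are hypothetical, not the actual DataNode protocol classes:
{code}
// Hypothetical shape of a unified EC command sent from NameNode to DataNode.
public class ErasureCodingCommandSketch {

  /** Which kind of coding work the DataNode should perform. */
  public enum ECTask {
    STRIPED_ENCODE,
    STRIPED_RECOVER,
    NON_STRIPED_ENCODE,
    NON_STRIPED_RECOVER
  }

  private final ECTask task;          // the flag denoting the case
  private final long blockGroupId;    // identifies the affected block group
  private final String schemaName;    // EC schema to apply, e.g. "RS-6-3"

  public ErasureCodingCommandSketch(ECTask task, long blockGroupId,
                                    String schemaName) {
    this.task = task;
    this.blockGroupId = blockGroupId;
    this.schemaName = schemaName;
  }

  public ECTask getTask() { return task; }
  public long getBlockGroupId() { return blockGroupId; }
  public String getSchemaName() { return schemaName; }
}
{code}
Either way, the DataNode could dispatch on the flag rather than needing a 
distinct command type per case.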



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8138) Refine or extend EC command to support all cases

2015-04-13 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8138:
---

 Summary: Refine or extend EC command to support all cases
 Key: HDFS-8138
 URL: https://issues.apache.org/jira/browse/HDFS-8138
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Vinayakumar B


Currently a {{CODEC}} command is used to distribute striping recovery commands 
to DataNodes. Discussed with [~umamaheswararao] and [~vinayrpet] further, we may 
need to refine or extend the command to support all cases, as we may distribute 
encoding and recovery tasks in both striped and non-striped cases. We may have 
two ways: 1) a single command that uses a flag to denote the case; 2) more 
commands.

This will also consider how to lay out, optimize or document relevant info 
contained in the corresponding command related to the block group, so the 
DataNode will be able to restore such info according to the concrete command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8137) Send the EC schema to DataNode as well in the EC encoding/recovery command

2015-04-13 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8137:
---

 Summary: Send the EC schema to DataNode as well in the EC 
encoding/recovery command
 Key: HDFS-8137
 URL: https://issues.apache.org/jira/browse/HDFS-8137
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Uma Maheswara Rao G


Discussed with [~umamaheswararao] and [~vinayrpet]: we should also send the EC 
schema to the DataNode, contained in the EC encoding/recovery command. The 
target DataNode will use it to guide the execution of the task. 

Another way would be for the DataNode to request the schema actively through a 
separate RPC call; as an optimization, the DataNode may cache schemas to avoid 
repeatedly asking for the same schema (see the cache sketch below).
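
A minimal sketch of the caching idea on the DataNode side, assuming a 
hypothetical NameNode RPC {{getECSchema(name)}}; this is not the actual 
protocol:
{code}
// Hypothetical DataNode-side schema cache around a NameNode RPC.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ECSchemaCacheSketch {

  /** Stand-in for the real ECSchema type. */
  public static class ECSchema { /* codec, numDataUnits, numParityUnits... */ }

  /** Stand-in for the RPC that would fetch a schema by name. */
  public interface NameNodeSchemaRpc {
    ECSchema getECSchema(String schemaName) throws java.io.IOException;
  }

  private final Map<String, ECSchema> cache = new ConcurrentHashMap<>();
  private final NameNodeSchemaRpc rpc;

  public ECSchemaCacheSketch(NameNodeSchemaRpc rpc) { this.rpc = rpc; }

  /** Returns the cached schema, fetching it over RPC only on first use. */
  public ECSchema getSchema(String name) throws java.io.IOException {
    ECSchema s = cache.get(name);
    if (s == null) {
      s = rpc.getECSchema(name);   // at most one extra RPC per schema name
      ECSchema prev = cache.putIfAbsent(name, s);
      if (prev != null) s = prev;  // another thread won the race
    }
    return s;
  }
}
{code}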



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8136) Client gets and uses the EC schema when reading and writing a striped file

2015-04-13 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8136:
---

 Summary: Client gets and uses the EC schema when reading and 
writing a striped file
 Key: HDFS-8136
 URL: https://issues.apache.org/jira/browse/HDFS-8136
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: HDFS-7285
Reporter: Kai Zheng
Assignee: Kai Zheng


Discussed with [~umamaheswararao] and [~vinayrpet]: when reading and writing a 
striped file, the client can invoke a separate call to the NameNode to request 
the EC schema associated with the EC zone the file is in. The schema can then 
be used to guide the reading and writing. Currently hard-coded values are used.

Optionally, as an optimization, the client may cache schema info per file, per 
zone or per schema name. We could add the schema name to {{HdfsFileStatus}} 
for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8112) Enforce authorization policy to protect administration operations for EC zone and schemas

2015-04-09 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8112:
---

 Summary: Enforce authorization policy to protect administration 
operations for EC zone and schemas
 Key: HDFS-8112
 URL: https://issues.apache.org/jira/browse/HDFS-8112
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


We should allow enforcing an authorization policy to protect administration 
operations for EC zones and schemas, as such operations would have too much 
impact on the system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8109) ECManager should be able to manage multiple ECSchemas

2015-04-09 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-8109.
-
Resolution: Duplicate

> ECManager should be able to manage multiple ECSchemas
> -
>
> Key: HDFS-8109
> URL: https://issues.apache.org/jira/browse/HDFS-8109
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Hui Zheng
>
> [HDFS-8074|https://issues.apache.org/jira/browse/HDFS-8074] has implemented a 
> default EC Schema.
> But a user may use another predefined schema when creating an EC zone. Maybe 
> we need to implement getting an ECSchema from the ECManager by its schema 
> name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8104) Make hard-coded values consistent with the system default schema first before remove them

2015-04-09 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-8104.
-
   Resolution: Fixed
Fix Version/s: HDFS-7285
 Hadoop Flags: Reviewed

> Make hard-coded values consistent with the system default schema first before 
> removing them
> -
>
> Key: HDFS-8104
> URL: https://issues.apache.org/jira/browse/HDFS-8104
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: HDFS-7285
>
> Attachments: HDFS-8104-v1.patch, HDFS-8104-v2.patch
>
>
> It's not easy to remove the hard-coded values in favor of the system default 
> schema. We may need several steps/issues to cover the relevant aspects. First 
> of all, let's make the hard-coded values consistent with the system default 
> schema. This might not be so easy; as an experimental test indicated, when 
> the following two lines are changed, some tests fail.
> {code}
> -  public static final byte NUM_DATA_BLOCKS = 3;
> -  public static final byte NUM_PARITY_BLOCKS = 2;
> +  public static final byte NUM_DATA_BLOCKS = 6;
> +  public static final byte NUM_PARITY_BLOCKS = 3;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8105) Make TestReadStripedFile work with RS-6-3 etc.

2015-04-08 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-8105.
-
Resolution: Duplicate

Will handle this as part of HDFS-8104.

> Make TestReadStripedFile work with RS-6-3 etc.
> --
>
> Key: HDFS-8105
> URL: https://issues.apache.org/jira/browse/HDFS-8105
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: HDFS-7285
>
>
> {{TestReadStripedFile}} failed to work with RS-6-3 etc. This is to fix it 
> and make it a bit more flexible if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8107) Fix striping-related test failures in TestFSImage

2015-04-08 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8107:
---

 Summary: Fix striping-related test failures in TestFSImage
 Key: HDFS-8107
 URL: https://issues.apache.org/jira/browse/HDFS-8107
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kai Zheng
Assignee: Kai Zheng
 Fix For: HDFS-7285


Fix striping-related test failures in {{TestFSImage}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8106) Fix test failure in TestAddStripedBlocks

2015-04-08 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8106:
---

 Summary: Fix test failure in TestAddStripedBlocks
 Key: HDFS-8106
 URL: https://issues.apache.org/jira/browse/HDFS-8106
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kai Zheng
Assignee: Kai Zheng


When running the test, errors were noticed.
{code}
java.lang.ClassCastException: 
org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoContiguousUnderConstruction
 cannot be cast to 
org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction 
at 
org.apache.hadoop.hdfs.server.namenode.TestAddStripedBlocks.testAddUCReplica(TestAddStripedBlocks.java:275)
{code}

And,
{code}
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.hadoop.hdfs.server.namenode.TestAddStripedBlocks.testAddStripedBlock(TestAddStripedBlocks.java:112)
{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8105) Make TestReadStripedFile work with RS-6-3 etc.

2015-04-08 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8105:
---

 Summary: Make TestReadStripedFile work with RS-6-3 etc.
 Key: HDFS-8105
 URL: https://issues.apache.org/jira/browse/HDFS-8105
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kai Zheng
Assignee: Kai Zheng


{{TestReadStripedFile}} failed to work with RS-6-3 etc. This is to fix it 
and make it a bit more flexible if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8104) Make hard-coded values consistent with the system default schema first before removing them

2015-04-08 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8104:
---

 Summary: Make hard-coded values consistent with the system default 
schema first before removing them
 Key: HDFS-8104
 URL: https://issues.apache.org/jira/browse/HDFS-8104
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


It's not easy to remove the hard-coded values in favor of the system default 
schema. We may need several steps/issues to cover the relevant aspects. First 
of all, let's make the hard-coded values consistent with the system default 
schema. This might not be so easy; as an experimental test indicated, when the 
following two lines are changed, some tests fail.
{code}
-  public static final byte NUM_DATA_BLOCKS = 3;
-  public static final byte NUM_PARITY_BLOCKS = 2;
+  public static final byte NUM_DATA_BLOCKS = 6;
+  public static final byte NUM_PARITY_BLOCKS = 3;
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8074) Define a system-wide default EC schema

2015-04-08 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-8074.
-
   Resolution: Fixed
Fix Version/s: HDFS-7285
 Hadoop Flags: Reviewed

> Define a system-wide default EC schema
> --
>
> Key: HDFS-8074
> URL: https://issues.apache.org/jira/browse/HDFS-8074
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: HDFS-7285
>
> Attachments: HDFS-8074-v1.patch, HDFS-8074-v2.patch, 
> HDFS-8074-v3.patch
>
>
> It's good to have a system default EC schema with fixed values first, before 
> we support more schemas. This helps resolve some dependencies before 
> HDFS-7866 can be done in whole. The default system schema is also essentially 
> needed whenever an admin just wants to use it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8095) Allow to configure the system default EC schema

2015-04-08 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8095:
---

 Summary: Allow to configure the system default EC schema
 Key: HDFS-8095
 URL: https://issues.apache.org/jira/browse/HDFS-8095
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


As suggested by [~umamaheswararao] and [~vinayrpet] in HDFS-8074, we may want 
to allow configuring the system default EC schema, so that in any deployment a 
cluster admin may be able to define their own system default. In the 
discussion, we had two approaches to configure the system default schema: 1) 
predefine it in the {{ecschema-def.xml}} file, making sure it's not changed; 2) 
configure the key parameter values as properties in {{core-site.xml}} (see the 
sketch below). Opening this for future consideration in case it's forgotten.
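
A minimal sketch of the second approach, reading hypothetical property keys 
with the real {{org.apache.hadoop.conf.Configuration}} API; the key names are 
invented for illustration:
{code}
// Hypothetical property keys for a configurable system default EC schema.
import org.apache.hadoop.conf.Configuration;

public class DefaultSchemaConfigSketch {

  // Invented key names; not actual Hadoop configuration keys.
  static final String KEY_CODEC  = "io.erasurecode.schema.default.codec";
  static final String KEY_DATA   = "io.erasurecode.schema.default.numDataUnits";
  static final String KEY_PARITY = "io.erasurecode.schema.default.numParityUnits";

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // An admin would set these in core-site.xml rather than in code.
    String codec = conf.get(KEY_CODEC, "rs");   // fall back to Reed-Solomon
    int dataUnits = conf.getInt(KEY_DATA, 6);   // fall back to RS-6-3
    int parityUnits = conf.getInt(KEY_PARITY, 3);
    System.out.printf("default schema: %s-%d-%d%n",
        codec, dataUnits, parityUnits);
  }
}
{code}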



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8074) Have a system default EC schema first before we support more schemas

2015-04-07 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8074:
---

 Summary: Have a system default EC schema first before we support 
more schemas
 Key: HDFS-8074
 URL: https://issues.apache.org/jira/browse/HDFS-8074
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


It's good to have a system default EC schema first with fixed values before we 
support more schemas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8064) Erasure coding: DataNode support for block recovery of striped block groups

2015-04-06 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-8064.
-
  Resolution: Duplicate
Release Note: This is a duplicate of HDFS-7348.

> Erasure coding: DataNode support for block recovery of striped block groups
> ---
>
> Key: HDFS-8064
> URL: https://issues.apache.org/jira/browse/HDFS-8064
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: HDFS-7285
>Reporter: Yi Liu
>Assignee: Yi Liu
>
> This JIRA is for block recovery of striped block groups on DataNodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8062) Remove hard-coded values in favor of EC schema

2015-04-03 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8062:
---

 Summary: Remove hard-coded values in favor of EC schema
 Key: HDFS-8062
 URL: https://issues.apache.org/jira/browse/HDFS-8062
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


This is to revisit all the places in the NameNode that use hard-coded values 
in favor of {{ECSchema}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8042) Erasure Coding: distribute replication to non-striped erasure coding conversion work to DataNode

2015-04-02 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8042:
---

 Summary: Erasure Coding: distribute replication to non-striped 
erasure coding conversion work to DataNode
 Key: HDFS-8042
 URL: https://issues.apache.org/jira/browse/HDFS-8042
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Vinayakumar B


In the *non-striped* erasure coding case, we need some approach to distribute 
the conversion work from replication to non-striped erasure coding to 
DataNodes. It could be the NameNode, a tool utilizing MR just like the current 
distcp, or another one like the balancer/mover. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8024) Erasure Coding: ECWorker basics, bootstrapping and configuration

2015-03-30 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8024:
---

 Summary: Erasure Coding: ECWorker basics, bootstrapping and 
configuration
 Key: HDFS-8024
 URL: https://issues.apache.org/jira/browse/HDFS-8024
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


This is to come up with the ECWorker itself, considering its basic setup, 
configuration and bootstrapping, which will be used to frame all the related 
work together, like BlockGroup, coding work, block reader, writer, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8023) Erasure Coding: retrieve erasure coding policy and schema for a block or file from NameNode

2015-03-30 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8023:
---

 Summary: Erasure Coding: retrieve erasure coding policy and schema 
for a block or file from NameNode
 Key: HDFS-8023
 URL: https://issues.apache.org/jira/browse/HDFS-8023
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Vinayakumar B


The NameNode needs to provide an RPC call for a client, tool, or DataNode to 
retrieve the erasure coding policy and schema for a block or file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8022) Erasure Coding: handle read/write failure for non-striped coding blocks

2015-03-30 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8022:
---

 Summary: Erasure Coding: handle read/write failure for 
non-striped coding blocks
 Key: HDFS-8022
 URL: https://issues.apache.org/jira/browse/HDFS-8022
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Tsz Wo Nicholas Sze


In the *non-striped* case, for (6, 3) Reed-Solomon, the ECWorker reads 6 data 
blocks and writes 3 parity blocks. We need to handle DataNode or network 
failures when reading or writing local or remote blocks (see the sketch below).
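
A tiny sketch of the basic recoverability arithmetic behind such failure 
handling: with RS(k = 6, m = 3), any 6 healthy blocks out of the 9 suffice, so 
up to 3 combined read failures can be tolerated. The class and method names 
are illustrative:
{code}
// Illustrative recoverability check for an RS(6,3) read or recovery attempt.
public final class RsFailureCheckSketch {

  static final int NUM_DATA = 6;    // k
  static final int NUM_PARITY = 3;  // m

  /**
   * Reed-Solomon decoding needs any k healthy blocks out of k + m,
   * so the work can proceed iff failures <= m.
   */
  static boolean canStillDecode(int failedBlocks) {
    return failedBlocks <= NUM_PARITY;
  }

  public static void main(String[] args) {
    for (int failed = 0; failed <= NUM_DATA + NUM_PARITY; failed++) {
      System.out.printf("failed=%d recoverable=%b%n",
          failed, canStillDecode(failed));
    }
  }
}
{code}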



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8021) Erasure Coding: restore BlockGroup and schema info from non-striped coding command

2015-03-30 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8021:
---

 Summary: Erasure Coding: restore BlockGroup and schema info from 
non-striped coding command
 Key: HDFS-8021
 URL: https://issues.apache.org/jira/browse/HDFS-8021
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


As a task of HDFS-7344, to process *non-striped* coding commands from the 
NameNode, we first need to be able to restore the BlockGroup and schema 
information, which will be used to construct the coding work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8020) Erasure Coding: restore BlockGroup and schema info from striped coding command

2015-03-30 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8020:
---

 Summary: Erasure Coding: restore BlockGroup and schema info from 
striped coding command
 Key: HDFS-8020
 URL: https://issues.apache.org/jira/browse/HDFS-8020
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


As a task of HDFS-7344, to process striped coding commands from the NameNode, 
we first need to be able to restore the BlockGroup and schema information, 
which will be used to construct the coding work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8019) Erasure Coding: erasure coding chunk buffer allocation and management

2015-03-30 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8019:
---

 Summary: Erasure Coding: erasure coding chunk buffer allocation 
and management
 Key: HDFS-8019
 URL: https://issues.apache.org/jira/browse/HDFS-8019
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


As a task of HDFS-7344, this is to come up with a chunk buffer pool allocating 
and managing coding chunk buffers, either on-heap or off-heap (see the sketch 
below).
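
A minimal sketch of such a pool using standard java.nio buffers; real code 
would add a sizing policy and metrics. Everything here is an assumption about 
shape, not the eventual implementation:
{code}
// Illustrative chunk buffer pool; supports on-heap or direct (off-heap) buffers.
import java.nio.ByteBuffer;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class ChunkBufferPoolSketch {

  private final int chunkSize;
  private final boolean direct;
  private final Queue<ByteBuffer> free = new ConcurrentLinkedQueue<>();

  public ChunkBufferPoolSketch(int chunkSize, boolean direct) {
    this.chunkSize = chunkSize;
    this.direct = direct;
  }

  /** Reuses a pooled buffer when available, otherwise allocates a new one. */
  public ByteBuffer borrow() {
    ByteBuffer b = free.poll();
    if (b == null) {
      b = direct ? ByteBuffer.allocateDirect(chunkSize)
                 : ByteBuffer.allocate(chunkSize);
    }
    b.clear();
    return b;
  }

  /** Returns a buffer to the pool after a coding step finishes with it. */
  public void giveBack(ByteBuffer b) {
    if (b.capacity() == chunkSize) {
      free.offer(b);
    } // buffers of the wrong size are simply dropped for GC
  }
}
{code}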



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8018) Erasure Coding: on-demand erasure recovery for erased data serving client request in DataNode

2015-03-30 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8018:
---

 Summary: Erasure Coding: on-demand erasure recovery for erased 
data serving client request in DataNode
 Key: HDFS-8018
 URL: https://issues.apache.org/jira/browse/HDFS-8018
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Yong Zhang


As a task of HDFS-7344, this is to enhance the ECWorker to be able to serve 
data on demand when a client requests erased blocks in the *non-striped* 
erasure coding case. In the striped case, it's done directly on the client side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8017) Erasure Coding: perform non-striped erasure decoding work given block reader and writer

2015-03-30 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8017:
---

 Summary: Erasure Coding: perform non-striped erasure decoding 
work given block reader and writer
 Key: HDFS-8017
 URL: https://issues.apache.org/jira/browse/HDFS-8017
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Zhe Zhang


This assumes the facilities like block reader and writer are ready; it 
implements and performs erasure decoding work in the *non-striped* case, 
utilizing the erasure codec and coder provided by the codec framework.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8016) Erasure Coding: perform non-striped erasure encoding work given block reader and writer

2015-03-30 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8016:
---

 Summary: Erasure Coding: perform non-striped erasure encoding 
work given block reader and writer
 Key: HDFS-8016
 URL: https://issues.apache.org/jira/browse/HDFS-8016
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Yong Zhang


This assumes the facilities like block reader and writer are ready; it 
implements and performs erasure encoding work in the *non-striped* case, 
utilizing the erasure codec and coder provided by the codec framework.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8015) Erasure Coding: local and remote block writer for coding work in DataNode

2015-03-30 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8015:
---

 Summary: Erasure Coding: local and remote block writer for coding 
work in DataNode
 Key: HDFS-8015
 URL: https://issues.apache.org/jira/browse/HDFS-8015
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Li Bo


As a task of HDFS-7344 ECWorker, in either striped or non-striped erasure 
coding, to perform encoding or decoding we need to be able to write data 
blocks locally or remotely. This is to come up with a block writer facility on 
the DataNode side. It's better to keep in mind the similar work done on the 
client side, so that in the future it's possible to unify the two.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8014) Erasure Coding: local and remote block reader for coding work in DataNode

2015-03-30 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8014:
---

 Summary: Erasure Coding: local and remote block reader for coding 
work in DataNode
 Key: HDFS-8014
 URL: https://issues.apache.org/jira/browse/HDFS-8014
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Zhe Zhang


As a task of HDFS-7344 ECWorker, in either striped or non-striped erasure 
coding, to perform encoding or decoding we first need to be able to read data 
blocks locally or remotely. This is to come up with a block reader facility on 
the DataNode side. It's better to keep in mind the similar work done on the 
client side, so that in the future it's possible to unify the two.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8013) Erasure coding: distribute recovery work for erasure coding blocks to DataNode

2015-03-30 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-8013:
---

 Summary: Erasure coding: distribute recovery work for erasure 
coding blocks to DataNode
 Key: HDFS-8013
 URL: https://issues.apache.org/jira/browse/HDFS-8013
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Vinayakumar B


Similar to HDFS-7369, for non-striped erasure coding, the NameNode may detect 
that some erasure coding blocks are erased, and will schedule and distribute 
recovery work to DataNodes to reconstruct such blocks. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7856) DFSOutputStream using outer independent class DataStreamer and Packet

2015-03-03 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-7856.
-
Resolution: Duplicate

This isn't needed as the major work should be done while separating the inner 
classes out.

> DFSOutputStream using outer independent class DataStreamer and Packet
> -
>
> Key: HDFS-7856
> URL: https://issues.apache.org/jira/browse/HDFS-7856
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
>
> DataStreamer and Packet are outer classes after being separated out. 
> DataStreamer could directly change the state of DFSOutputStream when it was 
> an inner class. Now DFSOutputStream has to handle this issue when using the 
> outer class DataStreamer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7866) Erasure coding: Allow NameNode to load, list and sync predefined EC schemas

2015-03-02 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7866:
---

 Summary: Erasure coding: Allow NameNode to load, list and sync 
predefined EC schemas
 Key: HDFS-7866
 URL: https://issues.apache.org/jira/browse/HDFS-7866
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


This is to extend the NameNode to load, list and sync predefined EC schemas in 
an authorized and controlled approach. The provided facilities will be used to 
implement DFSAdmin commands so an admin can list the available EC schemas and 
choose some of them for target EC zones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-02-27 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7859:
---

 Summary: Erasure Coding: Persist EC schemas in NameNode
 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
persist EC schemas in NameNode centrally and reliably, so that EC zones can 
reference them by name efficiently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7793) Refactor DFSOutputStream separating DataStreamer out

2015-02-12 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7793:
---

 Summary: Refactor DFSOutputStream separating DataStreamer out
 Key: HDFS-7793
 URL: https://issues.apache.org/jira/browse/HDFS-7793
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Kai Zheng
Assignee: Li Bo


As proposed by [~jingzhao] and discussed with [~zhz] in HDFS-7729, it would be 
great to refactor DFSOutputStream first and separate DataStreamer out of it, 
before we enhance it to support the striping layout. [~umamaheswararao] 
suggested we have this issue to handle the refactoring. Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7699) Erasure Codec API to possibly consider all the essential aspects for an erasure code

2015-01-28 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7699:
---

 Summary: Erasure Codec API to possibly consider all the essential 
aspects for an erasure code
 Key: HDFS-7699
 URL: https://issues.apache.org/jira/browse/HDFS-7699
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


This is to define the even higher-level API *ErasureCodec* to possibly 
consider all the essential aspects of an erasure code, as discussed in 
HDFS-7337 in detail. Generally, it will cover the necessary configuration 
about which *RawErasureCoder* to use for the code scheme, how to form and lay 
out the BlockGroup, etc. It will also discuss how an *ErasureCodec* will be 
used in both the client and the DataNode, in all the supported modes related 
to EC.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7674) Adding metrics for Erasure Coding

2015-01-24 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7674:
---

 Summary: Adding metrics for Erasure Coding
 Key: HDFS-7674
 URL: https://issues.apache.org/jira/browse/HDFS-7674
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


As the design (in HDFS-7285) indicates, erasure coding involves non-trivial 
impact and workload for the NameNode, DataNode and client; it also allows a 
configurable and pluggable erasure codec and schema with flexible tradeoff 
options (see HDFS-7337). To support the necessary analysis and adjustment, 
we'd better have various meaningful metrics for the EC support, like 
encoding/decoding tasks, recovered blocks, read/transferred data size, 
computation time, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7664) Reed-Solomon ErasureCoder

2015-01-22 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7664:
---

 Summary: Reed-Solomon ErasureCoder
 Key: HDFS-7664
 URL: https://issues.apache.org/jira/browse/HDFS-7664
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


This is to implement a Reed-Solomon ErasureCoder using the API defined in 
HDFS-7662. It supports plugging in a concrete RawErasureCoder via 
configuration, using either the JavaRSErasureCoder added in HDFS-7418 or the 
IsaRSErasureCoder added in HDFS-7338 (see the sketch below).
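
A small sketch of what configuration-based plugging could look like, using the 
real {{Configuration.getClass}} and {{ReflectionUtils.newInstance}} facilities; 
the key name and the coder interface here are assumptions, not the committed 
API:
{code}
// Illustrative plugin loading for a raw coder implementation.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ReflectionUtils;

public class RawCoderFactorySketch {

  /** Stand-in for the real raw coder interface. */
  public interface RawErasureCoder {
    void encode(java.nio.ByteBuffer[] inputs, java.nio.ByteBuffer[] outputs);
  }

  /** Pure-Java default implementation, used when nothing is configured. */
  public static class JavaRSRawCoder implements RawErasureCoder {
    @Override
    public void encode(java.nio.ByteBuffer[] in, java.nio.ByteBuffer[] out) {
      // ... pure-Java Reed-Solomon arithmetic would go here ...
    }
  }

  // Invented configuration key, for illustration only.
  static final String RAW_CODER_KEY = "hdfs.ec.rawcoder.rs.class";

  public static RawErasureCoder create(Configuration conf) {
    Class<? extends RawErasureCoder> clazz = conf.getClass(
        RAW_CODER_KEY, JavaRSRawCoder.class, RawErasureCoder.class);
    return ReflectionUtils.newInstance(clazz, conf);
  }
}
{code}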



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7662) Erasure Coder API for encoding and decoding of BlockGroup

2015-01-22 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7662:
---

 Summary: Erasure Coder API for encoding and decoding of BlockGroup
 Key: HDFS-7662
 URL: https://issues.apache.org/jira/browse/HDFS-7662
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng


This is to define the ErasureCoder API for encoding and decoding of a 
BlockGroup. Given a BlockGroup, the ErasureCoder extracts data chunks from the 
blocks and leverages the RawErasureCoder defined in HDFS-7353 to perform the 
concrete encoding or decoding (see the sketch below).
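
A rough sketch of how such an API might look; all types here are stand-ins 
for illustration, not the committed interfaces:
{code}
// Illustrative ErasureCoder API over a BlockGroup; all types are stand-ins.
import java.nio.ByteBuffer;

public interface ErasureCoderSketch {

  /** A group of data and parity blocks laid out by the codec. */
  interface BlockGroup {
    ByteBuffer[] dataChunks();    // chunks extracted from data blocks
    ByteBuffer[] parityChunks();  // chunks for parity blocks
    int[] erasedIndexes();        // which chunks are missing, for decode
  }

  /** Computes parity chunks from the group's data chunks. */
  void encode(BlockGroup group);

  /** Reconstructs the erased chunks from the remaining healthy ones. */
  void decode(BlockGroup group);
}
{code}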



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7418) RS coder in pure Java

2014-11-20 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7418:
---

 Summary: RS coder in pure Java
 Key: HDFS-7418
 URL: https://issues.apache.org/jira/browse/HDFS-7418
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


This will implement an RS coder by porting the existing code from HDFS-RAID 
into the new codec and coder framework.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7417) XOR codes

2014-11-20 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7417:
---

 Summary: XOR codes
 Key: HDFS-7417
 URL: https://issues.apache.org/jira/browse/HDFS-7417
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng


This will implement XOR codes by porting the code from HDFS-RAID. The coder in 
this algorithm is needed by some higher-level codecs like LRC (see the sketch 
below).
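
For intuition, XOR parity is just the bytewise exclusive-or of the data chunks, 
so a single erased chunk can be rebuilt by XOR-ing the parity with the 
surviving chunks; a tiny self-contained illustration:
{code}
// Minimal XOR coding illustration: encode one parity chunk, then recover
// an erased data chunk from the parity and the surviving data chunks.
public final class XorCodeSketch {

  static byte[] xorEncode(byte[][] chunks) {
    byte[] parity = new byte[chunks[0].length];
    for (byte[] chunk : chunks) {
      for (int i = 0; i < parity.length; i++) {
        parity[i] ^= chunk[i];
      }
    }
    return parity;
  }

  public static void main(String[] args) {
    byte[][] data = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };
    byte[] parity = xorEncode(data);

    // Pretend data[1] was erased; rebuild it from parity plus data[0], data[2].
    byte[][] survivors = { data[0], data[2], parity };
    byte[] rebuilt = xorEncode(survivors);  // XOR is its own inverse
    System.out.println(java.util.Arrays.equals(rebuilt, data[1]));  // true
  }
}
{code}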



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HDFS-7363) Pluggable algorithms to form block groups in erasure coding

2014-11-12 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng reopened HDFS-7363:
-
  Assignee: Kai Zheng

Let me reuse this item as a sub-task of HDFS-7337, where BlockGrouper is 
defined for this purpose as part of a codec. Its role:
Given the desired data blocks, BlockGrouper calculates and arranges a 
BlockGroup for encoding. Different codes can have different layouts for a 
BlockGroup: in LRC(6, 2, 2), we have 3 child block groups (2 local groups plus 
1 global group); in RS, we have 1 block group. Given a BlockGroup with some 
blocks missing, BlockGrouper also calculates and determines how to recover if 
recoverable, e.g. which blocks to use to recover a missing block. With such 
information the corresponding ErasureCoder can perform the concrete decoding 
work (see the sketch below). 
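
A hypothetical shape for the BlockGrouper contract described above; all names 
and types are illustrative:
{code}
// Illustrative BlockGrouper contract: arrange blocks for encoding and
// plan recovery for a damaged group.
import java.util.List;

public interface BlockGrouperSketch {

  /** Stand-in for a block identifier. */
  final class Block {
    public final long id;
    public Block(long id) { this.id = id; }
  }

  /** Arranges the given data blocks into a group layout for encoding. */
  List<List<Block>> makeBlockGroup(List<Block> dataBlocks);

  /**
   * Given the surviving blocks of a group and the missing ones, decides
   * whether recovery is possible and which blocks to read to do it.
   * Returns an empty list when the group is unrecoverable.
   */
  List<Block> planRecovery(List<Block> survivors, List<Block> missing);
}
{code}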

> Pluggable algorithms to form block groups in erasure coding
> ---
>
> Key: HDFS-7363
> URL: https://issues.apache.org/jira/browse/HDFS-7363
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Kai Zheng
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7363) Pluggable algorithms to form block groups in erasure coding

2014-11-09 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-7363.
-
Resolution: Duplicate

> Pluggable algorithms to form block groups in erasure coding
> ---
>
> Key: HDFS-7363
> URL: https://issues.apache.org/jira/browse/HDFS-7363
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7371) Loading EC schemas from configuration

2014-11-06 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7371:
---

 Summary: Loading EC schemas from configuration
 Key: HDFS-7371
 URL: https://issues.apache.org/jira/browse/HDFS-7371
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: Kai Zheng
Assignee: Kai Zheng


A system administrator can configure multiple EC codecs in the hdfs-site.xml 
file, and codec instances or schemas in a new configuration file named 
ec-schema.xml in the conf folder. A codec can be referenced by its instance or 
schema using the codec name, and a schema can be utilized and specified by its 
schema name for a folder to enforce EC (see the loading sketch below).
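
A minimal sketch of loading such a file with the real 
{{Configuration.addResource}} mechanism; the property name for a schema entry 
is invented for illustration:
{code}
// Illustrative loading of schema definitions from a separate resource file.
import org.apache.hadoop.conf.Configuration;

public class SchemaConfigLoaderSketch {

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Picks up ec-schema.xml from the classpath/conf folder, as described.
    conf.addResource("ec-schema.xml");

    // Invented property name for one schema entry, for illustration only.
    String rs63 = conf.get("ec.schema.RS-6-3");
    System.out.println("RS-6-3 schema definition: " + rs63);
  }
}
{code}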



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7353) Common Erasure Codec API and plugin support

2014-11-04 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7353:
---

 Summary: Common Erasure Codec API and plugin support
 Key: HDFS-7353
 URL: https://issues.apache.org/jira/browse/HDFS-7353
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng


This is to abstract and define a common codec API across different codec 
algorithms like RS, XOR, etc. Such an API can be implemented utilizing various 
library support, such as the Intel ISA library and the Jerasure library. It 
provides a default implementation and also allows plugging in vendor-specific 
ones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7348) Process erasure decoding work

2014-11-03 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7348:
---

 Summary: Process erasure decoding work
 Key: HDFS-7348
 URL: https://issues.apache.org/jira/browse/HDFS-7348
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: Kai Zheng
Assignee: Li Bo


As one of the tasks for HDFS-7344, this is to process decoding work, 
recovering data blocks according to block groups and the codec schema.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7346) Process erasure encoding work

2014-11-03 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7346:
---

 Summary: Process erasure encoding work
 Key: HDFS-7346
 URL: https://issues.apache.org/jira/browse/HDFS-7346
 Project: Hadoop HDFS
  Issue Type: Task
  Components: datanode
Reporter: Kai Zheng
Assignee: Li Bo


As one of the tasks for HDFS-7344, this is to process encoding work, 
calculating parity blocks as specified in block groups and codec schema.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7345) Local Reconstruction Codes (LRC)

2014-11-03 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7345:
---

 Summary: Local Reconstruction Codes (LRC)
 Key: HDFS-7345
 URL: https://issues.apache.org/jira/browse/HDFS-7345
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-EC
Reporter: Kai Zheng
Assignee: Kai Zheng


HDFS-7285 proposes to support Erasure Coding inside HDFS, supports multiple 
Erasure Coding codecs via pluggable framework and implements Reed Solomon code 
by default. This is to support a more advanced coding mechanism, Local 
Reconstruction Codes (LRC). As discussed in the paper 
(https://www.usenix.org/system/files/conference/atc12/atc12-final181_0.pdf), 
LRC reduces the number of erasure coding fragments that need to be read when 
reconstructing data fragments that are offline, while still keeping the storage 
overhead low. The important benefits of LRC are that it reduces the bandwidth 
and I/Os required for repair reads over prior codes, while still allowing a 
significant reduction in storage overhead. The Intel ISA library also supports 
LRC in its update and can be leveraged as well. The implementation would also 
consider how to distribute the calculation of local and global parity blocks 
to other relevant DataNodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7344) Erasure Coding worker and support in DataNode

2014-11-03 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7344:
---

 Summary: Erasure Coding worker and support in DataNode
 Key: HDFS-7344
 URL: https://issues.apache.org/jira/browse/HDFS-7344
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: HDFS-EC
Reporter: Kai Zheng
Assignee: Kai Zheng


According to HDFS-7285 and the design, this handles DataNode side extension and 
related support for Erasure Coding, and implements ECWorker. It mainly covers 
the following aspects, and separate tasks may be opened to handle each of them.
* Process encoding work, calculating parity blocks as specified in block groups 
and codec schema;
* Process decoding work, recovering data blocks according to block groups and 
codec schema;
* Handle client requests for passively recovered block data, serving data on 
demand while reconstructing;
* Write parity blocks according to storage policy.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7343) A comprehensive storage policy engine

2014-11-03 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-7343:
---

 Summary: A comprehensive storage policy engine
 Key: HDFS-7343
 URL: https://issues.apache.org/jira/browse/HDFS-7343
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Kai Zheng


As discussed in HDFS-7285, it would be better to have a comprehensive and 
flexible storage policy engine considering file attributes, metadata, data 
temperature, storage type, EC codec, available hardware capabilities, 
user/application preference and etc.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-5152) Avoiding redundant Kerberos login for Zookeeper client in ActiveStandbyElector

2013-08-30 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-5152:
---

 Summary: Avoiding redundant Kerberos login for Zookeeper client in 
ActiveStandbyElector
 Key: HDFS-5152
 URL: https://issues.apache.org/jira/browse/HDFS-5152
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: security
Reporter: Kai Zheng


Based on the fix in HADOOP-8315, it's possible to deploy a secured HA cluster 
with SASL support for the connection with Zookeeper. However, it requires 
extra JAAS configuration to initialize the Zookeeper client, because the 
client performs another login even when the ZKFC service has already completed 
its Kerberos login during startup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-5129) JournalNode should also respect DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY as NN and SNN

2013-08-24 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved HDFS-5129.
-

Resolution: Duplicate

Duplicate of HDFS-5091. Support for spnego keytab separate from the JournalNode 
keytab for secure HA.

> JournalNode should also respect DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY as 
> NN and SNN
> 
>
> Key: HDFS-5129
> URL: https://issues.apache.org/jira/browse/HDFS-5129
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Priority: Minor
>
> JournalNode should also respect DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY as 
> NN and SNN. This will allow the same keytab file for NN, SNN and JournalNode 
> regarding Spnego and would help simplify keytab deployment.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-5129) JournalNode should also respect DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY as NN and SNN

2013-08-24 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-5129:
---

 Summary: JournalNode should also respect 
DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY as NN and SNN
 Key: HDFS-5129
 URL: https://issues.apache.org/jira/browse/HDFS-5129
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Kai Zheng
Priority: Minor


JournalNode should also respect DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY as 
NN and SNN. This will allow the same keytab file for NN, SNN and JournalNode 
regarding Spnego and would help simplify keytab deployment.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-5087) Allowing specific JAVA heap max setting for HDFS related services

2013-08-11 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-5087:
---

 Summary: Allowing specific JAVA heap max setting for HDFS related 
services
 Key: HDFS-5087
 URL: https://issues.apache.org/jira/browse/HDFS-5087
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: scripts
Reporter: Kai Zheng
Priority: Minor


This allows a specific Java max-heap setting for HDFS-related services, as is 
already done for YARN services, for consistency. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

