[jira] [Created] (HDFS-8580) Erasure coding: Persist cellSize in BlockInfoStriped and StripedBlockProto

2015-06-11 Thread Walter Su (JIRA)
Walter Su created HDFS-8580:
---

 Summary: Erasure coding: Persist cellSize in BlockInfoStriped and 
StripedBlockProto
 Key: HDFS-8580
 URL: https://issues.apache.org/jira/browse/HDFS-8580
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Walter Su
Assignee: Walter Su


Zhe Zhang, Kai Zheng and I had an offline discussion. Here is what we thought: 
add a cellSize field to BlockInfoStriped as a workaround, and deal with the 
memory usage in a follow-on.

discussion in HDFS-8494:
from [~jingzhao]
{quote}
Also, we should consider adding a chunk size field to StripedBlockProto and 
removing the cell size field from HdfsFileStatus. In this way we can access the 
chunk size information in the storage layer.
{quote}
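
For illustration only (not actual HDFS code; the class shape below is an 
assumption), the workaround amounts to carrying the cell size on the striped 
block itself:

{code}
// Illustrative sketch -- not the real BlockInfoStriped. The point is simply
// that each striped block records its own cellSize, so the storage layer can
// read it without going through HdfsFileStatus.
class BlockInfoStriped /* extends BlockInfo in the real code */ {
  private final int cellSize;          // proposed new field

  BlockInfoStriped(int cellSize /* , existing constructor args */) {
    this.cellSize = cellSize;
  }

  int getCellSize() {
    return cellSize;
  }
}
{code}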



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8576) Lease recovery returns false, even though recovery happens.

2015-06-11 Thread J.Andreina (JIRA)
J.Andreina created HDFS-8576:


 Summary: Lease recovery returns false, even though recovery happens.
 Key: HDFS-8576
 URL: https://issues.apache.org/jira/browse/HDFS-8576
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: J.Andreina
Assignee: J.Andreina


FSNamesystem#recoverLease returns false even though lease recovery happens. 
Hence only a second call to recover the lease on the file returns success, 
after checking that the file is no longer under construction.
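
For context, a typical caller drives recovery with a loop like the sketch below 
(illustrative only; the helper name, path and sleep interval are made up). 
Because of the behaviour above, the first call returns false even though 
recovery has already happened, so callers only see success on a later call.

{code}
// Illustrative caller-side helper -- not HDFS code. recoverLease() returning
// false forces at least one extra round trip even when recovery succeeded.
static void waitForLeaseRecovery(DistributedFileSystem dfs, Path file)
    throws IOException, InterruptedException {
  while (!dfs.recoverLease(file)) {
    Thread.sleep(1000);               // back off before asking again
  }
}
{code}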



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8577) Avoid retrying to recover lease on a file which does not exist

2015-06-11 Thread J.Andreina (JIRA)
J.Andreina created HDFS-8577:


 Summary: Avoid retrying to recover lease on a file which does not 
exist
 Key: HDFS-8577
 URL: https://issues.apache.org/jira/browse/HDFS-8577
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: J.Andreina
Assignee: J.Andreina



Avoid retrying to recover lease on a file which does not exist

{noformat}
recoverLease got exception:
java.io.FileNotFoundException: File does not exist: /hello_hi
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)

Retrying in 5000 ms...
Retry #1
recoverLease got exception:
java.io.FileNotFoundException: File does not exist: /hello_hi
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)

{noformat}
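
A rough sketch of the intended behaviour (illustrative only, not the actual 
patch; the helper name and retry policy are assumptions): give up as soon as 
recoverLease fails with FileNotFoundException instead of retrying.

{code}
// Illustrative sketch only: retry lease recovery, but give up immediately when
// the file does not exist, since retrying cannot help.
static boolean recoverWithRetries(DistributedFileSystem dfs, Path path,
    int maxRetries) throws IOException, InterruptedException {
  for (int retry = 0; retry <= maxRetries; retry++) {
    try {
      if (dfs.recoverLease(path)) {
        return true;
      }
    } catch (FileNotFoundException e) {
      System.err.println("File does not exist, giving up: " + path);
      return false;                   // pointless to retry a missing file
    }
    Thread.sleep(5000);               // matches the 5000 ms delay shown above
  }
  return false;
}
{code}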



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel

2015-06-11 Thread Raju Bairishetti (JIRA)
Raju Bairishetti created HDFS-8578:
--

 Summary: On upgrade, Datanode should process all storage/data dirs 
in parallel
 Key: HDFS-8578
 URL: https://issues.apache.org/jira/browse/HDFS-8578
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Reporter: Raju Bairishetti
Priority: Critical


Right now, during upgrades the datanode processes all the storage dirs 
sequentially. Assume it takes ~20 minutes to process a single storage dir; then 
a datanode with ~10 disks will take around 3 hours to come up.

{code}
for (int idx = 0; idx < getNumStorageDirs(); idx++) {
  doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
  assert getCTime() == nsInfo.getCTime()
      : "Data-node and name-node CTimes must be the same.";
}
{code}

Can we make the datanode process all the storage dirs in parallel? This would 
save a lot of time during upgrades.
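
A rough sketch of what parallel processing could look like (purely 
illustrative, not a patch; it reuses the names from the snippet above and 
simplifies error handling):

{code}
// Illustrative sketch only: run the per-directory transition on a thread pool
// and surface the first failure. Real error handling would be richer.
ExecutorService pool = Executors.newFixedThreadPool(getNumStorageDirs());
List<Future<Void>> futures = new ArrayList<>();
for (int idx = 0; idx < getNumStorageDirs(); idx++) {
  final StorageDirectory sd = getStorageDir(idx);
  futures.add(pool.submit(() -> {
    doTransition(datanode, sd, nsInfo, startOpt);
    return null;
  }));
}
try {
  for (Future<Void> f : futures) {
    f.get();                          // propagate the first failure, if any
  }
} catch (InterruptedException | ExecutionException e) {
  throw new IOException("Storage directory transition failed", e);
} finally {
  pool.shutdown();
}
assert getCTime() == nsInfo.getCTime()
    : "Data-node and name-node CTimes must be the same.";
{code}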




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8579) Update HDFS usage with missing options

2015-06-11 Thread J.Andreina (JIRA)
J.Andreina created HDFS-8579:


 Summary: Update HDFS usage with missing options
 Key: HDFS-8579
 URL: https://issues.apache.org/jira/browse/HDFS-8579
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: J.Andreina
Assignee: J.Andreina
Priority: Minor


Update hdfs usage with missing options (fetchdt and debug)
{noformat}
1.
./hdfs fetchdt
fetchdt <opts> <token file>
Options:
  --webservice <url>  Url to contact NN on
  --renewer <name>    Name of the delegation token renewer
  --cancel            Cancel the delegation token
  --renew             Renew the delegation token.  Delegation token must have
                      been fetched using the --renewer <name> option.
  --print             Print the delegation token

2.
./hdfs debug
Usage: hdfs debug <command> [arguments]

verify [-meta <metadata-file>] [-block <block-file>]
recoverLease [-path <path>] [-retries <num-retries>]
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Hadoop-Hdfs-trunk-Java8 - Build # 214 - Failure

2015-06-11 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/214/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 7485 lines...]
[INFO] --- maven-source-plugin:2.3:test-jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (dist-enforce) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable 
package
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (depcheck) @ hadoop-hdfs-project 
---
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.15:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:3.0.0:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS Client . SUCCESS [ 47.834 s]
[INFO] Apache Hadoop HDFS  FAILURE [  02:47 h]
[INFO] Apache Hadoop HttpFS .. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal . SKIPPED
[INFO] Apache Hadoop HDFS-NFS  SKIPPED
[INFO] Apache Hadoop HDFS Project  SUCCESS [  0.091 s]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 02:47 h
[INFO] Finished at: 2015-06-11T14:22:24+00:00
[INFO] Final Memory: 52M/160M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on 
project hadoop-hdfs: There are test failures.
[ERROR] 
[ERROR] Please refer to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports
 for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-hdfs
Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to Hadoop-Hdfs-trunk-Java8 #213
Archived 1 artifacts
Archive block size is 32768
Received 0 blocks and 797620 bytes
Compression is 0.0%
Took 19 sec
Recording test results
Updating HADOOP-12074
Updating YARN-3785
Updating HDFS-8549
Updating MAPREDUCE-6389
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
5 tests failed.
REGRESSION:  
org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2

Error Message:
org.apache.hadoop.util.ExitUtil$ExitException: Could not sync enough journals 
to persistent storage due to No journals available to flush. Unsynced 
transactions: 1
 at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:126)
 at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:631)
 at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.endCurrentLogSegment(FSEditLog.java:1298)
 at org.apache.hadoop.hdfs.server.namenode.FSEditLog.close(FSEditLog.java:362)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.stopActiveServices(FSNamesystem.java:1224)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.stopActiveServices(NameNode.java:1717)
 at 
org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.exitState(ActiveState.java:70)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.stop(NameNode.java:863)
 at 
org.apache.hadoop.hdfs.MiniDFSCluster.shutdownNameNode(MiniDFSCluster.java:1796)
 at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1847)
 at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1827)
 at 
org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:493)
 at 
org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2(TestLeaseRecovery2.java:426)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 

Build failed in Jenkins: Hadoop-Hdfs-trunk-Java8 #214

2015-06-11 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/214/changes

Changes:

[devaraj] MAPREDUCE-6389. Fix BaileyBorweinPlouffe CLI usage message. 
Contributed by

[wang] HDFS-8549. Abort the balancer if an upgrade is in progress.

[xgong] YARN-3785. Support for Resource as an argument during submitApp call in

[vinayakumarb] HADOOP-12074. in Shell.java#runCommand() rethrow 
InterruptedException as InterruptedIOException (Contributed by Lavkesh Lahngir)

--
[...truncated 7292 lines...]
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.util.TestBestEffortLongFile
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.231 sec - in 
org.apache.hadoop.hdfs.util.TestBestEffortLongFile
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.util.TestDiff
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.625 sec - in 
org.apache.hadoop.hdfs.util.TestDiff
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.util.TestByteArrayManager
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.499 sec - in 
org.apache.hadoop.hdfs.util.TestByteArrayManager
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.util.TestXMLUtils
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.092 sec - in 
org.apache.hadoop.hdfs.util.TestXMLUtils
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.util.TestLightWeightHashSet
Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.189 sec - in 
org.apache.hadoop.hdfs.util.TestLightWeightHashSet
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.util.TestMD5FileUtils
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.256 sec - in 
org.apache.hadoop.hdfs.util.TestMD5FileUtils
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.util.TestLightWeightLinkedSet
Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.204 sec - in 
org.apache.hadoop.hdfs.util.TestLightWeightLinkedSet
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.util.TestAtomicFileOutputStream
Tests run: 4, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.246 sec - in 
org.apache.hadoop.hdfs.util.TestAtomicFileOutputStream
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.util.TestExactSizeInputStream
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.073 sec - in 
org.apache.hadoop.hdfs.util.TestExactSizeInputStream
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestLease
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.115 sec - in 
org.apache.hadoop.hdfs.TestLease
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.508 sec - in 
org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestHFlush
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 32.766 sec - 
in org.apache.hadoop.hdfs.TestHFlush
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestRemoteBlockReader
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.349 sec - in 
org.apache.hadoop.hdfs.TestRemoteBlockReader
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestHdfsAdmin
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.011 sec - in 
org.apache.hadoop.hdfs.TestHdfsAdmin
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestDistributedFileSystem
Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 52.455 sec - 
in org.apache.hadoop.hdfs.TestDistributedFileSystem
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestRollingUpgradeRollback
Tests run: 3, 

[jira] [Created] (HDFS-8581) count cmd calculate wrong when huge files exist in one folder

2015-06-11 Thread tongshiquan (JIRA)
tongshiquan created HDFS-8581:
-

 Summary: count cmd calculate wrong when huge files exist in one 
folder
 Key: HDFS-8581
 URL: https://issues.apache.org/jira/browse/HDFS-8581
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Reporter: tongshiquan
Assignee: J.Andreina
Priority: Minor


If one directory such as /result contains about 20 files, then when 
"hdfs dfs -count /" is executed the result goes wrong: for all directories 
whose names come after /result, the file count is not included.
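
A hypothetical way to observe the symptom (paths are examples; this would run 
inside a method that may throw IOException):

{code}
// Hypothetical check: compare the aggregate file count reported for "/" with
// the per-child counts; with this bug, the counts for directories after
// /result would be missing from the aggregate.
FileSystem fs = FileSystem.get(new Configuration());
long rootCount = fs.getContentSummary(new Path("/")).getFileCount();
long summed = 0;
for (FileStatus child : fs.listStatus(new Path("/"))) {
  summed += fs.getContentSummary(child.getPath()).getFileCount();
}
System.out.println("count(/) = " + rootCount + ", sum over children = " + summed);
{code}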



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


How to skip a single corrupted SequenceFile in SequenceFileInputFormat?

2015-06-11 Thread Zhang Xiaoyu
Hi, all,
My MR job (consumer pipeline) uses SequenceFileInputFormat as the input format
in MultipleInputs:

for (FileStatus input : inputs) {
  MultipleInputs.addInputPath(job, input.getPath(),
      SequenceFileInputFormat.class, MyMapper.class);
}


My application fails in the following condition: the generator (using
SequenceFile.Writer) has just created a zero-size file and keeps appending
key-value pairs to it, but the content is not yet big enough for anything to be
flushed to the file (not even a block has been written). If the consumer
pipeline kicks off at this moment and consumes the file, it treats it as a
corrupted file with this exception:

java.io.EOFException: null
at java.io.DataInputStream.readFully(DataInputStream.java:197)
~[na:1.7.0_60-ea]
at java.io.DataInputStream.readFully(DataInputStream.java:169)
~[na:1.7.0_60-ea]
at
org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.init(ChecksumFileSystem.java:146)
~[hadoop-common-2.2.0.jar:na]
at
org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339)
[hadoop-common-2.2.0.jar:na]
at
org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1832)
[hadoop-common-2.2.0.jar:na]
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1752)
[hadoop-common-2.2.0.jar:na]
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1773)
[hadoop-common-2.2.0.jar:na]
at
org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54)
[hadoop-mapreduce-client-core-2.2.0.jar:na]
at
org.apache.hadoop.mapreduce.lib.input.DelegatingRecordReader.initialize(DelegatingRecordReader.java:84)
[hadoop-mapreduce-client-core-2.2.0.jar:na]
at
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:524)
[hadoop-mapreduce-client-core-2.2.0.jar:na]
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:762)
[hadoop-mapreduce-client-core-2.2.0.jar:na]
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
[hadoop-mapreduce-client-core-2.2.0.jar:na]
at
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
[hadoop-mapreduce-client-common-2.2.0.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
[na:1.7.0_60-ea]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_60-ea]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[na:1.7.0_60-ea]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[na:1.7.0_60-ea]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_60-ea]

All the code is inside library classes, so there is not much I can do in my MR
job. Is there a way to skip a single *corrupted* SequenceFile?
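
One workaround I am thinking about (rough, untested sketch; the class below is
mine, not a built-in Hadoop feature) is to subclass SequenceFileInputFormat so
that a file whose record reader cannot even be initialized is treated as empty:

import java.io.EOFException;
import java.io.IOException;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;

public class SkipCorruptSequenceFileInputFormat<K, V>
    extends SequenceFileInputFormat<K, V> {

  @Override
  public RecordReader<K, V> createRecordReader(InputSplit split,
      TaskAttemptContext context) throws IOException {
    final RecordReader<K, V> delegate = super.createRecordReader(split, context);
    return new RecordReader<K, V>() {
      private boolean corrupt = false;

      @Override
      public void initialize(InputSplit s, TaskAttemptContext c)
          throws IOException, InterruptedException {
        try {
          delegate.initialize(s, c);
        } catch (EOFException e) {
          corrupt = true;          // unreadable header: treat the file as empty
        }
      }

      @Override
      public boolean nextKeyValue() throws IOException, InterruptedException {
        return !corrupt && delegate.nextKeyValue();
      }

      @Override
      public K getCurrentKey() throws IOException, InterruptedException {
        return delegate.getCurrentKey();
      }

      @Override
      public V getCurrentValue() throws IOException, InterruptedException {
        return delegate.getCurrentValue();
      }

      @Override
      public float getProgress() throws IOException, InterruptedException {
        return corrupt ? 1.0f : delegate.getProgress();
      }

      @Override
      public void close() throws IOException {
        if (!corrupt) {
          delegate.close();        // nothing was opened if initialize failed
        }
      }
    };
  }
}

I would then pass SkipCorruptSequenceFileInputFormat.class to
MultipleInputs.addInputPath instead of SequenceFileInputFormat.class, but I am
not sure this is the right approach.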

Another thing: when the program fails and I open the input file in vim, the
file SEEMS to have the proper header (SEQ, size, etc.), so I am not sure which
part is corrupted. Maybe it is just a timing issue, i.e. when the read happened
the file did not have the header yet.

NOT SURE this will help, but here is the header (plus maybe a little bit of
content) of the corrupted file:

SEQ^F^Yorg.apache.hadoop.io.Textorg.apache.hadoop.io.BytesWritable
^@^@^@^@^@^@ù9añ æfá#¬694I­Ç^@^@^@8c^@^@^@%$


Here is an empty sequence file, which the consumer handles fine:

SEQ^F^Yorg.apache.hadoop.io.Textorg.apache.hadoop.io.BytesWritable
^@^@^@^@^@^@86bÍI§ï897ê=E^OÝ¢^D

Any ideas? Thanks in advance.

Johnny


Re: How to skip a single corrupted SequenceFile in SequenceFileInputFormat?

2015-06-11 Thread Zhang Xiaoyu
Sorry, just to add on top of that: it fails in two conditions:

1. the writer has flushed something to the file, with a header and some data

2. the writer has flushed nothing to the file, so when I open it in vim it is
completely empty

It fails in both cases with the same exception, so it looks like both are
treated as corrupted files. The question is: is there a way to skip those
individual files in the input format? My input is a folder containing many
files, some corrupted and some not, but the MR job should not fail just because
of a single file.

Thanks,
Johnny




[jira] [Created] (HDFS-8582) Spurious failure messages when running datanode reconfiguration

2015-06-11 Thread Lei (Eddy) Xu (JIRA)
Lei (Eddy) Xu created HDFS-8582:
---

 Summary: Spurious failure messages when running datanode 
reconfiguration
 Key: HDFS-8582
 URL: https://issues.apache.org/jira/browse/HDFS-8582
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: HDFS
Affects Versions: 2.7.0
Reporter: Lei (Eddy) Xu
Assignee: Lei (Eddy) Xu
Priority: Minor


When running a DN reconfig to hotswap some drives, it spits out this output:

{noformat}
$ hdfs dfsadmin -reconfig datanode localhost:9023 status
15/06/09 14:58:10 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
Reconfiguring status for DataNode[localhost:9023]: started at Tue Jun 09 
14:57:37 PDT 2015 and finished at Tue Jun 09 14:57:56 PDT 2015.
FAILED: Change property 
rpc.engine.org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolPB
From: org.apache.hadoop.ipc.ProtobufRpcEngine
To: 
Error: Property 
rpc.engine.org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolPB is not 
reconfigurable.
FAILED: Change property mapreduce.client.genericoptionsparser.used
From: true
To: 
Error: Property mapreduce.client.genericoptionsparser.used is not 
reconfigurable.
FAILED: Change property rpc.engine.org.apache.hadoop.ipc.ProtocolMetaInfoPB
From: org.apache.hadoop.ipc.ProtobufRpcEngine
To: 
Error: Property rpc.engine.org.apache.hadoop.ipc.ProtocolMetaInfoPB is 
not reconfigurable.
SUCCESS: Change property dfs.datanode.data.dir
From: file:///data/1/user/dfs
To: file:///data/1/user/dfs,file:///data/2/user/dfs
FAILED: Change property dfs.datanode.startup
From: REGULAR
To: 
Error: Property dfs.datanode.startup is not reconfigurable.
FAILED: Change property 
rpc.engine.org.apache.hadoop.hdfs.protocolPB.InterDatanodeProtocolPB
From: org.apache.hadoop.ipc.ProtobufRpcEngine
To: 
Error: Property 
rpc.engine.org.apache.hadoop.hdfs.protocolPB.InterDatanodeProtocolPB is not 
reconfigurable.
FAILED: Change property 
rpc.engine.org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolPB
From: org.apache.hadoop.ipc.ProtobufRpcEngine
To: 
Error: Property 
rpc.engine.org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolPB is not 
reconfigurable.
FAILED: Change property 
rpc.engine.org.apache.hadoop.tracing.TraceAdminProtocolPB
From: org.apache.hadoop.ipc.ProtobufRpcEngine
To: 
Error: Property 
rpc.engine.org.apache.hadoop.tracing.TraceAdminProtocolPB is not reconfigurable.
{noformat}

These failed messages are spurious and should not be shown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8585) Remove dataBlockNum and parityBlockNum from StripedBlockProto

2015-06-11 Thread Yi Liu (JIRA)
Yi Liu created HDFS-8585:


 Summary: Remove dataBlockNum and parityBlockNum from 
StripedBlockProto 
 Key: HDFS-8585
 URL: https://issues.apache.org/jira/browse/HDFS-8585
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Yi Liu
Assignee: Yi Liu


Since HDFS-8427 removed {{dataBlockNum}} and {{parityBlockNum}} and now gets 
them from {{ECSchema}}, we should also remove them from {{StripedBlockProto}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8584) Support using ramfs partitions on Linux

2015-06-11 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-8584:
---

 Summary: Support using ramfs partitions on Linux
 Key: HDFS-8584
 URL: https://issues.apache.org/jira/browse/HDFS-8584
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 2.7.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


Now that the bulk of the work for HDFS-6919 is complete, the memory limit 
enforcement uses the {{dfs.datanode.max.locked.memory}} setting and not the RAM 
disk free space availability.

We can now use ramfs partitions. This will require fixing the free space 
computation and reservation logic for transient volumes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)