Update: checkstyle, shellcheck, and whitespace tests

2015-04-30 Thread Allen Wittenauer

Hey gang.

With the commit of HADOOP-11866, the output for checkstyle and 
whitespace has been greatly enhanced to actually be useful.  Instead of 
getting weird output, you'll see the file, the line number, and (in the case of 
checkstyle) the error.  Note that line numbers are reported after the patch is 
applied, so any fuzzing done by patch will offset the line numbers if your 
source tree isn't up to date.

Shellcheck output has also been fixed to show errors that might have 
been missed before.  Additionally, I've filed an INFRA ticket to try to get 
shellcheck installed on the Jenkins nodes so that we can start seeing errors 
from it as well.
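	For anyone who wants to run the same check locally before Jenkins does, 
here is a rough sketch (the script name and its contents are made up for 
illustration, and the run is guarded since shellcheck may not be installed yet):

```shell
# Hypothetical example: write a small script containing a common mistake,
# then check it. demo.sh is just an illustrative name, not a repo file.
cat > demo.sh <<'EOF'
#!/bin/bash
for f in $(ls *.txt); do   # word-splitting pattern shellcheck warns about
  echo $f                  # unquoted variable, another common warning
done
EOF

if command -v shellcheck >/dev/null 2>&1; then
  shellcheck demo.sh || true   # non-zero exit just means warnings were found
else
  echo "shellcheck not installed; skipping"
fi
rm -f demo.sh
```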

Thanks.

[jira] [Resolved] (HDFS-8308) Erasure Coding: NameNode may get blocked in waitForLoadingFSImage() when loading editlog

2015-04-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao resolved HDFS-8308.
-
   Resolution: Fixed
Fix Version/s: HDFS-7285

Thanks for the review, Nicholas! I've committed this.

> Erasure Coding: NameNode may get blocked in waitForLoadingFSImage() when 
> loading editlog
> 
>
> Key: HDFS-8308
> URL: https://issues.apache.org/jira/browse/HDFS-8308
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: HDFS-7285
>
> Attachments: HDFS-8308.000.patch
>
>
> If the editlog contains a transaction for creating an EC file, the NN will 
> get blocked in {{waitForLoadingFSImage}} because of the following call path:
> FSDirectory#addFileForEditLog --> FSDirectory#isInECZone --> 
> FSDirectory#getECSchema --> ECZoneManager#getECSchema --> 
> ECZoneManager#getECZoneInfo --> FSNamesystem#getSchema --> 
> waitForLoadingFSImage
> This jira plans to fix this bug and also do some code cleanup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8308) Erasure Coding: NameNode may get blocked in waitForLoadingFSImage() when loading editlog

2015-04-30 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-8308:
---

 Summary: Erasure Coding: NameNode may get blocked in 
waitForLoadingFSImage() when loading editlog
 Key: HDFS-8308
 URL: https://issues.apache.org/jira/browse/HDFS-8308
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao


If the editlog contains a transaction for creating an EC file, the NN will get 
blocked in {{waitForLoadingFSImage}} because of the following call path:

FSDirectory#addFileForEditLog --> FSDirectory#isInECZone --> 
FSDirectory#getECSchema --> ECZoneManager#getECSchema --> 
ECZoneManager#getECZoneInfo --> FSNamesystem#getSchema --> waitForLoadingFSImage

This jira plans to fix this bug and also do some code cleanup.





Re: Planning Hadoop 2.6.1 release

2015-04-30 Thread Chris Nauroth
Thank you, Arpit.  In addition, I suggest we include the following:

HADOOP-11333. Fix deadlock in DomainSocketWatcher when the notification
pipe is full
HADOOP-11604. Prevent ConcurrentModificationException while closing domain
sockets during shutdown of DomainSocketWatcher thread.
HADOOP-11648. Set DomainSocketWatcher thread name explicitly
HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there
is an I/O error during requestShortCircuitShm

HADOOP-11604 and 11648 are not critical by themselves, but they are
prerequisites to getting a clean cherry-pick of 11802, which we believe
finally fixes the root cause of this issue.


--Chris Nauroth




On 4/30/15, 3:55 PM, "Arpit Agarwal"  wrote:

>HDFS candidates for back-porting to Hadoop 2.6.1. The first two were
>requested in [1].
>
>HADOOP-11674. oneByteBuf in CryptoInputStream and CryptoOutputStream
>should be non static
>HADOOP-11710. Make CryptoOutputStream behave like DFSOutputStream wrt
>synchronization
>
>HDFS-7009. Active NN and standby NN have different live nodes.
>HDFS-7035. Make adding a new data directory to the DataNode an atomic and
>improve error handling
>HDFS-7425. NameNode block deletion logging uses incorrect appender.
>HDFS-7443. Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate
>block files are present in the same volume.
>HDFS-7489. Incorrect locking in FsVolumeList#checkDirs can hang datanodes
>HDFS-7503. Namenode restart after large deletions can cause slow
>processReport.
>HDFS-7575. Upgrade should generate a unique storage ID for each volume.
>HDFS-7579. Improve log reporting during block report rpc failure.
>HDFS-7587. Edit log corruption can happen if append fails with a quota
>violation.
>HDFS-7596. NameNode should prune dead storages from storageMap.
>HDFS-7611. deleteSnapshot and delete of a file can leave orphaned blocks
>in the blocksMap on NameNode restart.
>HDFS-7714. Simultaneous restart of HA NameNodes and DataNode can cause
>DataNode to register successfully with only one NameNode.
>HDFS-7733. NFS: readdir/readdirplus return null directory attribute on
>failure.
>HDFS-7831. Fix the starting index and end condition of the loop in
>FileDiffList.findEarlierSnapshotBlocks().
>HDFS-7885. Datanode should not trust the generation stamp provided by
>client.
>HDFS-7960. The full block report should prune zombie storages even if
>they're not empty.
>HDFS-8072. Reserved RBW space is not released if client terminates while
>writing block.
>HDFS-8127. NameNode Failover during HA upgrade can cause DataNode to
>finalize upgrade.
>
>
>Arpit
>
>[1] Will Hadoop 2.6.1 be released soon?
>http://markmail.org/thread/zlsr6prejyogdyvh
>
>
>
>On 4/27/15, 11:47 AM, "Vinod Kumar Vavilapalli" 
>wrote:
>
>>There were several requests on the user lists [1] for a 2.6.1 release. I
>>got many offline comments too.
>>
>>Planning to do a 2.6.1 release in a few weeks time. We already have a
>>bunch
>>of tickets committed to 2.7.1. I created a filter [2] to track pending
>>tickets.
>>
>>We need to collectively come up with a list of critical issues. We can
>>use
>>the JIRA Target Version field for the same. I see some but not a whole
>>lot
>>of new work for this release, most of it is likely going to be pulling in
>>critical patches from 2.7.1/2.8 etc.
>>
>>Thoughts?
>>
>>Thanks
>>+Vinod
>>
>>[1] Will Hadoop 2.6.1 be released soon?
>>http://markmail.org/thread/zlsr6prejyogdyvh
>>[2] 2.6.1 pending tickets
>>https://issues.apache.org/jira/issues/?filter=12331711
>>
>
>



Re: Planning Hadoop 2.6.1 release

2015-04-30 Thread Arpit Agarwal
HDFS candidates for back-porting to Hadoop 2.6.1. The first two were requested 
in [1].

HADOOP-11674. oneByteBuf in CryptoInputStream and CryptoOutputStream should be 
non static
HADOOP-11710. Make CryptoOutputStream behave like DFSOutputStream wrt 
synchronization

HDFS-7009. Active NN and standby NN have different live nodes.
HDFS-7035. Make adding a new data directory to the DataNode an atomic and 
improve error handling
HDFS-7425. NameNode block deletion logging uses incorrect appender.
HDFS-7443. Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block 
files are present in the same volume.
HDFS-7489. Incorrect locking in FsVolumeList#checkDirs can hang datanodes
HDFS-7503. Namenode restart after large deletions can cause slow processReport.
HDFS-7575. Upgrade should generate a unique storage ID for each volume.
HDFS-7579. Improve log reporting during block report rpc failure.
HDFS-7587. Edit log corruption can happen if append fails with a quota 
violation.
HDFS-7596. NameNode should prune dead storages from storageMap.
HDFS-7611. deleteSnapshot and delete of a file can leave orphaned blocks in the 
blocksMap on NameNode restart.
HDFS-7714. Simultaneous restart of HA NameNodes and DataNode can cause DataNode 
to register successfully with only one NameNode.
HDFS-7733. NFS: readdir/readdirplus return null directory attribute on failure.
HDFS-7831. Fix the starting index and end condition of the loop in 
FileDiffList.findEarlierSnapshotBlocks().
HDFS-7885. Datanode should not trust the generation stamp provided by client.
HDFS-7960. The full block report should prune zombie storages even if they're 
not empty.
HDFS-8072. Reserved RBW space is not released if client terminates while 
writing block.
HDFS-8127. NameNode Failover during HA upgrade can cause DataNode to finalize 
upgrade.


Arpit

[1] Will Hadoop 2.6.1 be released soon? 
http://markmail.org/thread/zlsr6prejyogdyvh



On 4/27/15, 11:47 AM, "Vinod Kumar Vavilapalli"  wrote:

>There were several requests on the user lists [1] for a 2.6.1 release. I
>got many offline comments too.
>
>Planning to do a 2.6.1 release in a few weeks time. We already have a bunch
>of tickets committed to 2.7.1. I created a filter [2] to track pending
>tickets.
>
>We need to collectively come up with a list of critical issues. We can use
>the JIRA Target Version field for the same. I see some but not a whole lot
>of new work for this release, most of it is likely going to be pulling in
>critical patches from 2.7.1/2.8 etc.
>
>Thoughts?
>
>Thanks
>+Vinod
>
>[1] Will Hadoop 2.6.1 be released soon?
>http://markmail.org/thread/zlsr6prejyogdyvh
>[2] 2.6.1 pending tickets
>https://issues.apache.org/jira/issues/?filter=12331711
>




[jira] [Created] (HDFS-8307) Spurious DNS Queries from hdfs shell

2015-04-30 Thread Anu Engineer (JIRA)
Anu Engineer created HDFS-8307:
--

 Summary: Spurious DNS Queries from hdfs shell
 Key: HDFS-8307
 URL: https://issues.apache.org/jira/browse/HDFS-8307
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.7.1
Reporter: Anu Engineer
Priority: Trivial


With HA configured, the hdfs shell (org.apache.hadoop.fs.FsShell) seems to 
issue a DNS query for the cluster name. If fs.defaultFS is set to 
hdfs://mycluster, then the shell seems to issue a DNS query for mycluster.FQDN 
or mycluster.

Since mycluster is not a machine name, the DNS query always fails with 
"DNS 85 Standard query response 0x2aeb No such name"

Repro Steps:

# Set up an HA cluster 
# Log on to any node
# Run wireshark monitoring port 53 - "sudo tshark 'port 53'"
# Run "sudo -u hdfs hdfs dfs -ls /" 
# You should be able to see DNS queries to mycluster.FQDN in wireshark
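The failing lookup itself can also be illustrated without a cluster; a minimal 
sketch (the name below is a stand-in, using the reserved .invalid TLD so it is 
guaranteed not to resolve, just like a logical nameservice name):

```shell
# A logical nameservice name is not a resolvable host, so every DNS lookup
# for it fails. "mycluster.invalid" stands in for the real nameservice name.
if getent hosts mycluster.invalid >/dev/null 2>&1; then
  echo "resolved (unexpected)"
else
  echo "no such name"
fi
```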






[jira] [Created] (HDFS-8306) Generate ACL and Xattr outputs in OIV XML outputs

2015-04-30 Thread Lei (Eddy) Xu (JIRA)
Lei (Eddy) Xu created HDFS-8306:
---

 Summary: Generate ACL and Xattr outputs in OIV XML outputs
 Key: HDFS-8306
 URL: https://issues.apache.org/jira/browse/HDFS-8306
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 2.7.0
Reporter: Lei (Eddy) Xu
Assignee: Lei (Eddy) Xu
Priority: Minor


Currently, not all fields of the fsimage are included in the {{hdfs oiv}} XML 
output. This makes inspecting an {{fsimage}} through its XML output less 
practical, and it also prevents recovering an fsimage from the XML file.

This JIRA adds ACLs and XAttrs to the XML output as the first step toward the 
goal described in HDFS-8061.





[jira] [Created] (HDFS-8305) HDFS INotify: the destination argument to RenameOp should always end with the file name

2015-04-30 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-8305:
--

 Summary: HDFS INotify: the destination argument to RenameOp should 
always end with the file name
 Key: HDFS-8305
 URL: https://issues.apache.org/jira/browse/HDFS-8305
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


HDFS INotify: the destination argument to RenameOp should always end with the 
file name rather than sometimes being a directory name.





[jira] [Created] (HDFS-8304) Separate out shared log purging methods for QJM (Paxos directory) and FJM

2015-04-30 Thread Zhe Zhang (JIRA)
Zhe Zhang created HDFS-8304:
---

 Summary: Separate out shared log purging methods for QJM (Paxos 
directory) and FJM
 Key: HDFS-8304
 URL: https://issues.apache.org/jira/browse/HDFS-8304
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Zhe Zhang


With HDFS-8303, QJM will purge its /current dir through FJM. However, its 
Paxos dir needs to be purged separately.

QJM currently uses its own {{JNStorage#purgeMatching}} method while FJM calls 
{{matchEditLogs}} to find all matches first and then uses 
{{DeletionStoragePurger#purgeLog}}.

This JIRA aims to create a unified method for both QJM's Paxos dir and FJM.





[jira] [Created] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)
Zhe Zhang created HDFS-8303:
---

 Summary: QJM should purge old logs in the current directory 
through FJM
 Key: HDFS-8303
 URL: https://issues.apache.org/jira/browse/HDFS-8303
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Zhe Zhang


As the first step of the consolidation effort, QJM should call its FJM to purge 
the current directory. 

The current QJM logic for purging the current dir is very similar to FJM's 
purging logic.

QJM:
{code}
 private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
  ImmutableList.of(
Pattern.compile("edits_\\d+-(\\d+)"),
Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
...
  long txid = Long.parseLong(matcher.group(1));
  if (txid < minTxIdToKeep) {
LOG.info("Purging no-longer needed file " + txid);
if (!f.delete()) {
...
{code}

FJM:
{code}
  private static final Pattern EDITS_REGEX = Pattern.compile(
NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
  private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
  private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
  NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
...
List<EditLogFile> editLogs = matchEditLogs(files, true);
for (EditLogFile log : editLogs) {
  if (log.getFirstTxId() < minTxIdToKeep &&
  log.getLastTxId() < minTxIdToKeep) {
purger.purgeLog(log);
  }
}
{code}

I can see 2 differences:
# FJM has a slightly stricter match for empty/corrupt in-progress files: the 
suffix shouldn't contain blank space
# FJM verifies that both the start and end txIDs of a finalized edit file are 
old enough

Both seem safer than the QJM logic.





[jira] [Created] (HDFS-8302) Consolidate log purging logic in QJM and FJM

2015-04-30 Thread Zhe Zhang (JIRA)
Zhe Zhang created HDFS-8302:
---

 Summary: Consolidate log purging logic in QJM and FJM
 Key: HDFS-8302
 URL: https://issues.apache.org/jira/browse/HDFS-8302
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Zhe Zhang
Assignee: Zhe Zhang


When executing {{purgeLogsOlderThan}}, {{JNStorage}} purges both the current 
directory and the Paxos directory using its own logic:
{code}
  void purgeDataOlderThan(long minTxIdToKeep) throws IOException {
purgeMatching(sd.getCurrentDir(),
CURRENT_DIR_PURGE_REGEXES, minTxIdToKeep);
purgeMatching(getPaxosDir(), PAXOS_DIR_PURGE_REGEXES, minTxIdToKeep);
  }
{code}

Meanwhile, FJM has its own logic for serving {{purgeLogsOlderThan}}, which is 
executed only under the legacy NFS-based journaling configuration.

This JIRA aims to consolidate these two separate purging procedures.





[jira] [Created] (HDFS-8301) Create a separate document for tag storage type explicitly.

2015-04-30 Thread Xiaoyu Yao (JIRA)
Xiaoyu Yao created HDFS-8301:


 Summary:  Create a separate document for tag storage type 
explicitly.
 Key: HDFS-8301
 URL: https://issues.apache.org/jira/browse/HDFS-8301
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Xiaoyu Yao


This is a follow-up JIRA based on 
[comments|https://issues.apache.org/jira/browse/HDFS-7770?focusedCommentId=14512195&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14512195]
  from HDFS-7770 to create a separate document for tag storage type explicitly.





[jira] [Created] (HDFS-8300) Fix unit test failures and findbugs warning caused by HDFS-8283

2015-04-30 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-8300:
---

 Summary: Fix unit test failures and findbugs warning caused by 
HDFS-8283
 Key: HDFS-8300
 URL: https://issues.apache.org/jira/browse/HDFS-8300
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Jing Zhao








Block level compression

2015-04-30 Thread Abhishek Das
Hi,

Is there any way I can get the size of each gzip-compressed block without
actually compressing the data stored in HDFS? For example, I have 200 MB of
uncompressed data in HDFS and the block size is 64 MB. I want to get the size
of each of the 4 compressed blocks. The result might look like: the first
block is 15 MB, the second block is 20 MB, the third one is 18 MB, and the
fourth one is 2 MB.

I was thinking of using a command like hadoop fsck -blocks -files
-locations to get each of the block files, then running something like gzip -c
FILENAME | wc -c to get the size of each compressed block.
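As a local sketch of that approach (the file name and the 1 MB block size 
below are toy stand-ins for the real block files and dfs.blocksize):

```shell
# Split a file into block-sized chunks and measure each chunk's gzip'd size
# without writing any compressed data to disk. On a real cluster the chunks
# would instead be the block files found via 'hadoop fsck -blocks -files
# -locations'; sample.bin and the 1 MB size here are made-up stand-ins.
BLOCK_SIZE=$((1024 * 1024))
head -c $((3 * BLOCK_SIZE + 512)) /dev/urandom > sample.bin

split -b "$BLOCK_SIZE" sample.bin chunk_
for b in chunk_*; do
  # gzip to stdout and count bytes, so nothing compressed touches disk
  echo "$b: $(gzip -c "$b" | wc -c) bytes compressed"
done
rm -f sample.bin chunk_*
```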

Please advise.

Regards,
Abhishek


[jira] [Created] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem

2015-04-30 Thread Hari Sekhon (JIRA)
Hari Sekhon created HDFS-8299:
-

 Summary: HDFS reporting missing blocks when they are actually 
present due to read-only filesystem
 Key: HDFS-8299
 URL: https://issues.apache.org/jira/browse/HDFS-8299
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
 Environment: Fsck shows missing blocks when the blocks can be found on 
a datanode's filesystem and the datanode has been restarted to try to get it to 
recognize that the blocks are indeed present and hence report them to the 
NameNode in a block report.

Fsck output showing an example "missing" block:
{code}/apps/hive/warehouse/.db/someTable/00_0: CORRUPT 
blockpool BP-120244285--1417023863606 block blk_1075202330
 MISSING 1 blocks of total size 3260848 B
0. BP-120244285--1417023863606:blk_1075202330_1484191 len=3260848 
MISSING!{code}
The block is definitely present on more than one datanode. However, here is 
the output from one of them, which I restarted to try to get it to report the 
block to the NameNode:
{code}# ll 
/archive1/dn/current/BP-120244285--1417023863606/current/finalized/subdir22/subdir73/blk_1075202330*
-rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 
/archive1/dn/current/BP-120244285--1417023863606/current/finalized/subdir22/subdir73/blk_1075202330
-rw-r--r-- 1 hdfs 499   25483 Apr 27 15:02 
/archive1/dn/current/BP-120244285--1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta{code}
It's worth noting that this is on HDFS tiered storage on an archive tier going 
to a networked block device that may have become temporarily unavailable but is 
available now. See also feature request HDFS-8297 for online rescan to not have 
to go around restarting datanodes.

It turns out from the datanode log (attached) that this is because the 
datanode fails to get a write lock on the filesystem. I think it would be 
better to serve those blocks read-only, however, since the current behavior 
causes client-visible data unavailability when the data could in fact be read.

{code}2015-04-30 14:11:08,235 WARN  datanode.DataNode 
(DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir 
/archive1/dn :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not 
writable: /archive1/dn
at 
org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193)
at 
org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
at 
org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
{code}

Hari Sekhon
http://www.linkedin.com/in/harisekhon
Reporter: Hari Sekhon
Priority: Critical
 Attachments: datanode.log







[jira] [Created] (HDFS-8298) HA: NameNode should not shut down completely without quorum, doesn't recover from temporary failures

2015-04-30 Thread Hari Sekhon (JIRA)
Hari Sekhon created HDFS-8298:
-

 Summary: HA: NameNode should not shut down completely without 
quorum, doesn't recover from temporary failures
 Key: HDFS-8298
 URL: https://issues.apache.org/jira/browse/HDFS-8298
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha, HDFS, namenode, qjm
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon


In an HDFS HA setup, if there is a temporary problem contacting the journal 
nodes (e.g. a network interruption), the NameNode shuts down entirely, when it 
should instead go into a standby mode so that it can stay online and retry to 
achieve quorum later.

If both NameNodes shut themselves off like this then even after the temporary 
network outage is resolved, the entire cluster remains offline indefinitely 
until operator intervention, whereas it could have self-repaired after 
re-contacting the journalnodes and re-achieving quorum.

{code}2015-04-15 15:59:26,900 FATAL namenode.FSEditLog 
(JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for 
required journal (JournalAndStre
am(mgr=QJM to [:8485, :8485, :8485], stream=QuorumOutputStream 
starting at txid 54270281))
java.io.IOException: Interrupted waiting 2ms for a quorum of nodes to 
respond.
at 
org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:134)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
at 
org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
at 
org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
at 
org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
at 
org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:639)
at 
org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:388)
at java.lang.Thread.run(Thread.java:745)
2015-04-15 15:59:26,901 WARN  client.QuorumJournalManager 
(QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at 
txid 54270281
2015-04-15 15:59:26,904 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
Exiting with status 1
2015-04-15 15:59:27,001 INFO  namenode.NameNode (StringUtils.java:run(659)) - 
SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down NameNode at /
/{code}

Hari Sekhon
http://www.linkedin.com/in/harisekhon





Build failed in Jenkins: Hadoop-Hdfs-trunk-Java8 #170

2015-04-30 Thread Apache Jenkins Server
See 

Changes:

[jing9] HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed 
by Tsz Wo Nicholas Sze.

[wheat9] HDFS-8269. getBlockLocations() does not resolve the .reserved path and 
generates incorrect edit logs when updating the atime. Contributed by Haohui 
Mai.

[tgraves] YARN-3517. RM web ui for dumping scheduler logs should be for admins 
only (Varun Vasudev via tgraves)

[jianhe] YARN-3533. Test: Fix launchAM in MockRM to wait for attempt to be 
scheduled. Contributed by Anubhav Dhoot

[wang] HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. 
Contributed by Charles Lamb.

[devaraj] MAPREDUCE-6339. Job history file is not flushed correctly because

[aajisaka] HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by 
Binglin Chang.

[aajisaka] HADOOP-11821. Fix findbugs warnings in hadoop-sls. Contributed by 
Brahma Reddy Battula.

[aajisaka] HDFS-7770. Need document for storage type label of data node storage 
locations under dfs.data.dir. Contributed by Xiaoyu Yao.

--
[...truncated 4990 lines...]
[INFO] 
+ cd hadoop-hdfs-project
+ /home/jenkins/tools/maven/latest/bin/mvn clean verify checkstyle:checkstyle 
findbugs:findbugs -Drequire.test.libhadoop -Pdist -Pnative -Dtar -Pdocs -fae
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Apache Hadoop HDFS Client
[INFO] Apache Hadoop HDFS
[INFO] Apache Hadoop HttpFS
[INFO] Apache Hadoop HDFS BookKeeper Journal
[INFO] Apache Hadoop HDFS-NFS
[INFO] Apache Hadoop HDFS Project
[INFO] 
[INFO] Using the builder 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder
 with a thread count of 1
[INFO] 
[INFO] 
[INFO] Building Apache Hadoop HDFS Client 3.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hadoop-hdfs-client ---
[INFO] Deleting 

[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-hdfs-client 
---
[INFO] Executing tasks

main:
[mkdir] Created dir: 

[mkdir] Created dir: 

[INFO] Executed tasks
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
hadoop-hdfs-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 

[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ 
hadoop-hdfs-client ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 86 source files to 

[WARNING] 
:
 

 uses or overrides a deprecated API.
[WARNING] 
:
 Recompile with -Xlint:deprecation for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
hadoop-hdfs-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 

[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
hadoop-hdfs-client ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-surefire-plugin:2.17:test (default-test) @ hadoop-hdfs-client 
---
[INFO] No tests to run.
[INFO] 
[INFO] --- maven-jar-plugin:2.5:jar (prepare-jar) @ hadoop-hdfs-client ---
[INFO] Building jar: 

Hadoop-Hdfs-trunk-Java8 - Build # 170 - Still Failing

2015-04-30 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 5183 lines...]
[INFO] 
[INFO] --- maven-source-plugin:2.3:test-jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (dist-enforce) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable 
package
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (depcheck) @ hadoop-hdfs-project 
---
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.12.1:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:3.0.0:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS Client . FAILURE [ 31.949 s]
[INFO] Apache Hadoop HDFS  SKIPPED
[INFO] Apache Hadoop HttpFS .. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal . SKIPPED
[INFO] Apache Hadoop HDFS-NFS  SKIPPED
[INFO] Apache Hadoop HDFS Project  SUCCESS [  0.106 s]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 33.069 s
[INFO] Finished at: 2015-04-30T12:16:05+00:00
[INFO] Final Memory: 50M/154M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-checkstyle-plugin:2.12.1:checkstyle 
(default-cli) on project hadoop-hdfs-client: An error has occurred in 
Checkstyle report generation. Failed during checkstyle execution: Unable to 
find configuration file at location: 
file:///home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/hadoop-hdfs-client/dev-support/checkstyle.xml:
 Could not find resource 
'file:///home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/hadoop-hdfs-client/dev-support/checkstyle.xml'.
 -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to Hadoop-Hdfs-trunk-Java8 #146
Archived 1 artifacts
Archive block size is 32768
Received 0 blocks and 668296 bytes
Compression is 0.0%
Took 40 sec
Recording test results
Updating HDFS-8269
Updating HADOOP-11821
Updating YARN-3533
Updating HDFS-7770
Updating MAPREDUCE-6339
Updating YARN-3517
Updating HDFS-5574
Updating HDFS-8283
Updating HDFS-8214
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

Build failed in Jenkins: Hadoop-Hdfs-trunk #2111

2015-04-30 Thread Apache Jenkins Server
See 

Changes:

[jing9] HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed 
by Tsz Wo Nicholas Sze.

[wheat9] HDFS-8269. getBlockLocations() does not resolve the .reserved path and 
generates incorrect edit logs when updating the atime. Contributed by Haohui 
Mai.

[tgraves] YARN-3517. RM web ui for dumping scheduler logs should be for admins 
only (Varun Vasudev via tgraves)

[jianhe] YARN-3533. Test: Fix launchAM in MockRM to wait for attempt to be 
scheduled. Contributed by Anubhav Dhoot

[wang] HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. 
Contributed by Charles Lamb.

[devaraj] MAPREDUCE-6339. Job history file is not flushed correctly because

[aajisaka] HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by 
Binglin Chang.

[aajisaka] HADOOP-11821. Fix findbugs warnings in hadoop-sls. Contributed by 
Brahma Reddy Battula.

[aajisaka] HDFS-7770. Need document for storage type label of data node storage 
locations under dfs.data.dir. Contributed by Xiaoyu Yao.

--
[...truncated 4989 lines...]
[INFO] 
[INFO] Total time: 04:03 min
[INFO] Finished at: 2015-04-30T12:10:06+00:00
[INFO] Final Memory: 227M/1483M
[INFO] 
+ cd hadoop-hdfs-project
+ /home/jenkins/tools/maven/latest/bin/mvn clean verify checkstyle:checkstyle 
findbugs:findbugs -Drequire.test.libhadoop -Pdist -Pnative -Dtar -Pdocs -fae 
-Dmaven.javadoc.skip=true
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Apache Hadoop HDFS Client
[INFO] Apache Hadoop HDFS
[INFO] Apache Hadoop HttpFS
[INFO] Apache Hadoop HDFS BookKeeper Journal
[INFO] Apache Hadoop HDFS-NFS
[INFO] Apache Hadoop HDFS Project
[INFO] 
[INFO] Using the builder 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder
 with a thread count of 1
[INFO] 
[INFO] 
[INFO] Building Apache Hadoop HDFS Client 3.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hadoop-hdfs-client ---
[INFO] Deleting 

[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-hdfs-client 
---
[INFO] Executing tasks

main:
[mkdir] Created dir: 

[mkdir] Created dir: 

[INFO] Executed tasks
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
hadoop-hdfs-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 

[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ 
hadoop-hdfs-client ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 86 source files to 

[WARNING] 
:
 

 uses or overrides a deprecated API.
[WARNING] 
:
 Recompile with -Xlint:deprecation for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
hadoop-hdfs-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 

[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
hadoop-hdfs-client ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-surefire-plugin:2.17:test (default-test) @ hadoop-hdfs-client 
---
[INFO] No tests to run.
[INFO] 
[INFO] --- maven-jar-plugin:2.5:jar (prepare-jar) @ hadoop-hdfs-client ---
[INFO] Building jar: 


Hadoop-Hdfs-trunk - Build # 2111 - Still Failing

2015-04-30 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 5182 lines...]
[INFO] 
[INFO] --- maven-source-plugin:2.3:test-jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (dist-enforce) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Skipping javadoc generation
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (depcheck) @ hadoop-hdfs-project 
---
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.12.1:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:3.0.0:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS Client ......................... FAILURE [ 25.785 s]
[INFO] Apache Hadoop HDFS ................................ SKIPPED
[INFO] Apache Hadoop HttpFS .............................. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal ............. SKIPPED
[INFO] Apache Hadoop HDFS-NFS ............................ SKIPPED
[INFO] Apache Hadoop HDFS Project ........................ SUCCESS [  0.120 s]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 27.291 s
[INFO] Finished at: 2015-04-30T12:10:36+00:00
[INFO] Final Memory: 58M/723M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-checkstyle-plugin:2.12.1:checkstyle 
(default-cli) on project hadoop-hdfs-client: An error has occurred in 
Checkstyle report generation. Failed during checkstyle execution: Unable to 
find configuration file at location: 
file:///home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/hadoop-hdfs-project/hadoop-hdfs-client/dev-support/checkstyle.xml:
 Could not find resource 
'file:///home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/hadoop-hdfs-project/hadoop-hdfs-client/dev-support/checkstyle.xml'.
 -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to Hadoop-Hdfs-trunk #2088
Archived 1 artifacts
Archive block size is 32768
Received 0 blocks and 315154 bytes
Compression is 0.0%
Took 19 sec
Recording test results
Updating HDFS-8269
Updating HADOOP-11821
Updating YARN-3533
Updating HDFS-7770
Updating MAPREDUCE-6339
Updating YARN-3517
Updating HDFS-5574
Updating HDFS-8283
Updating HDFS-8214
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.
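The console log above shows the actual cause of the failure: maven-checkstyle-plugin aborts because hadoop-hdfs-client has no dev-support/checkstyle.xml at the location the plugin is configured to read. A minimal pre-flight check for that situation might look like the following sketch (the class and method names are illustrative, not part of Hadoop; the path is taken from the error message in the log):

```java
import java.nio.file.Files;
import java.nio.file.Paths;

/** Illustrative pre-flight check mirroring the plugin's config lookup. */
public class CheckstyleConfigCheck {

    /** Returns true if the given checkstyle config path exists and is readable. */
    static boolean configPresent(String path) {
        return Files.isReadable(Paths.get(path));
    }

    public static void main(String[] args) {
        // Path taken from the checkstyle error message in the console log above.
        String config = "dev-support/checkstyle.xml";
        if (!configPresent(config)) {
            System.err.println("checkstyle config missing: " + config);
        }
    }
}
```

Running such a check before invoking `mvn checkstyle:checkstyle` would turn the mid-build Maven error into an immediate, readable failure.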

[jira] [Created] (HDFS-8297) Ability to online trigger data dir rescan for blocks

2015-04-30 Thread Hari Sekhon (JIRA)
Hari Sekhon created HDFS-8297:
-

 Summary: Ability to online trigger data dir rescan for blocks
 Key: HDFS-8297
 URL: https://issues.apache.org/jira/browse/HDFS-8297
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon


Feature request to add functionality to trigger an online data dir rescan for 
available blocks without having to restart the datanode.

Motivation: when using HDFS storage tiering with an archive tier on a separate 
hyperscale storage device accessed over the network (Hedvig in this case), the 
device may go away and then return due to, say, a network interruption or other 
temporary error. This leaves HDFS fsck declaring missing blocks that are clearly 
visible on the mount point for the node's archive directory. An online trigger 
for a data dir rescan for available blocks would avoid having to do a rolling 
restart of all datanodes across a cluster. I did try sending a kill -HUP to the 
datanode process (both the SecureDataNodeStarter parent and child) while tailing 
the log, hoping this might do it, but nothing happened in the log.

Hari Sekhon
http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Doubt on ReplicaInPipeline.stopWriter

2015-04-30 Thread 张铎
We are using hadoop 2.6.0 and recently we found that sometimes the datanode can
hang (not HDFS-7489, which we had already patched).

This one was caused by pipeline recovery of blk_1150368526_76663248

datanode log

2015-04-28 17:51:55,297 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
BP-1553162417-10.0.67.25-1418714798720:blk_1150368526_76663248 src: /
10.0.46.4:33257 dest: /10.0.49.6:50010
2015-04-28 17:53:01,923 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
BP-1553162417-10.0.67.25-1418714798720:blk_1150368526_76663248 src: /
10.0.50.13:63069 dest: /10.0.49.6:50010
2015-04-28 17:53:10,790 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl:
Recover RBW replica
BP-1553162417-10.0.67.25-1418714798720:blk_1150368526_76663248
2015-04-28 17:53:10,790 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl:
Recovering ReplicaBeingWritten, blk_1150368526_76663248, RBW
2015-04-28 17:54:10,791 WARN
org.apache.hadoop.hdfs.server.datanode.DataNode: Join on writer thread
Thread[DataXceiver for client
DFSClient_attempt_1428562732216_171256_r_00_0_670237631_1 at /
10.0.46.4:33257 [Receiving block
BP-1553162417-10.0.67.25-1418714798720:blk_1150368526_76663248],5,dataXceiverServer]
timed out
2015-04-28 17:54:10,791 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock
BP-1553162417-10.0.67.25-1418714798720:blk_1150368526_76663248 received
exception java.io.IOException: Join on writer thread Thread[DataXceiver for
client DFSClient_attempt_1428562732216_171256_r_00_0_670237631_1 at /
10.0.46.4:33257 [Receiving block
BP-1553162417-10.0.67.25-1418714798720:blk_1150368526_76663248],5,dataXceiverServer]
timed out
java.io.IOException: Join on writer thread Thread[DataXceiver for client
DFSClient_attempt_1428562732216_171256_r_00_0_670237631_1 at /
10.0.46.4:33257 [Receiving block
BP-1553162417-10.0.67.25-1418714798720:blk_1150368526_76663248],5,dataXceiverServer]
timed out


And this is the jstack result when datanode hangs

"DataXceiver for client
DFSClient_attempt_1428562732216_171256_r_00_0_670237631_1 at /
10.0.50.13:63069 [Receiving block
BP-1553162417-10.0.67.25-1418714798720:blk_1150368526_76663248]" daemon
prio=10 tid=0x7f0dd052e000 nid=0x66e9 in Object.wait() [0x7f0d9e27
1000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1289)
- locked <0x0006efad2f60> (a org.apache.hadoop.util.Daemon)
at
org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
at
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.recoverRbw(FsDatasetImpl.java:1123)
- locked <0x0006c82026c8> (a
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
at
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.recoverRbw(FsDatasetImpl.java:114)
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:188)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
at java.lang.Thread.run(Thread.java:745)

"DataXceiver for client
DFSClient_attempt_1428562732216_171256_r_00_0_670237631_1 at /
10.0.46.4:33257 [Receiving block
BP-1553162417-10.0.67.25-1418714798720:blk_1150368526_76663248]" daemon
prio=10 tid=0x7f0dd02a1800 nid=0x64ca runnable [0x7f0dae602000]
   java.lang.Thread.State: RUNNABLE
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:345)
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:613)
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:781)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:730)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
at java.lang.Thread.run(Thread.java:745)

"PacketResponder:
BP-1553162417-10.0.67.25-1418714798720:blk_1150368526_76663248,
type=LAST_IN_PIPELINE, downstreams=0:[]" daemon prio=10
tid=0x7f0dd0a2b800 nid=0x6560 in Object.wait() [0x7f0dad0ed000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketRespond
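The first jstack entry above shows recoverRbw blocked in ReplicaInPipeline.stopWriter, joining the old writer thread while that thread is stuck in a native FileOutputStream.writeBytes call. The pattern in question, interrupt the writer and then join with a bounded timeout, can be sketched roughly as follows (a simplification for discussion, not the actual HDFS code; the error message is modeled on the datanode log above):

```java
import java.io.IOException;

/** Simplified sketch of the interrupt-then-bounded-join pattern used by
 *  ReplicaInPipeline.stopWriter; illustrative only, not the real HDFS code. */
public class StopWriterSketch {

    /**
     * Ask the writer thread to stop and wait up to timeoutMs for it to exit.
     * Throws if the writer is still alive afterwards, e.g. when it is stuck
     * in non-interruptible native I/O as in the jstack above.
     */
    static void stopWriter(Thread writer, long timeoutMs)
            throws InterruptedException, IOException {
        if (writer == null || !writer.isAlive()) {
            return;
        }
        writer.interrupt();      // has no effect on a thread blocked in native write
        writer.join(timeoutMs);  // bounded wait instead of joining forever
        if (writer.isAlive()) {
            throw new IOException("Join on writer thread " + writer + " timed out");
        }
    }

    public static void main(String[] args) throws Exception {
        Thread quickWriter = new Thread(() -> { /* finishes immediately */ });
        quickWriter.start();
        stopWriter(quickWriter, 1000);
        System.out.println("writer stopped");
    }
}
```

This is consistent with the log: the join times out after the configured wait, the recovery throws, but the original writer keeps running in the native write, so the block stays busy.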

[jira] [Resolved] (HDFS-8183) Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang resolved HDFS-8183.
-
   Resolution: Fixed
Fix Version/s: HDFS-7285
 Hadoop Flags: Reviewed

The patch LGTM. +1 and I just committed it to the branch (since the change is 
simple we can probably watch Jenkins later). Thanks Rakesh for the contribution!

> Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads
> --
>
> Key: HDFS-8183
> URL: https://issues.apache.org/jira/browse/HDFS-8183
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: HDFS-7285
>
> Attachments: HDFS-8183-001.patch, HDFS-8183-002.patch
>
>
> The idea of this task is to improve the closing of all the streamers. 
> Presently, if any of the streamers throws an exception, it returns 
> immediately, which leaves all the other streamer threads running. Instead it 
> is better to handle the exceptions of each streamer independently.
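The close-each-independently idea described in the quoted summary can be sketched generically as follows (illustrative only, not the HDFS-8183 patch itself; the names are made up for the example):

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

/** Generic sketch of closing many streamers while surviving per-streamer
 *  failures; illustrative only, not the actual DFSStripedOutputStream code. */
public class CloseAllSketch {

    /**
     * Close every streamer, recording the first failure instead of returning
     * on it, so the remaining streamers are still closed.
     */
    static void closeAll(List<? extends Closeable> streamers) throws IOException {
        IOException first = null;
        for (Closeable s : streamers) {
            try {
                s.close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;              // remember, but keep closing the rest
                } else {
                    first.addSuppressed(e); // attach later failures to the first
                }
            }
        }
        if (first != null) {
            throw first;                    // surface the failure after cleanup
        }
    }

    public static void main(String[] args) throws IOException {
        closeAll(java.util.Arrays.<Closeable>asList(
            () -> System.out.println("streamer 1 closed"),
            () -> System.out.println("streamer 2 closed")));
    }
}
```

The point is the ordering: every close attempt runs before any exception propagates, so no streamer thread is left dangling.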



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8296) BlockManager.getUnderReplicatedBlocksCount() is not giving correct count if namenode in safe mode.

2015-04-30 Thread surendra singh lilhore (JIRA)
surendra singh lilhore created HDFS-8296:


 Summary:  BlockManager.getUnderReplicatedBlocksCount() is not 
giving correct count if namenode in safe mode.
 Key: HDFS-8296
 URL: https://issues.apache.org/jira/browse/HDFS-8296
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.6.0
Reporter: surendra singh lilhore
Assignee: surendra singh lilhore


{{underReplicatedBlocksCount}} is updated by the {{updateState()}} API:

{code}
 void updateState() {
pendingReplicationBlocksCount = pendingReplications.size();
underReplicatedBlocksCount = neededReplications.size();
corruptReplicaBlocksCount = corruptReplicas.size();
  }
 {code}

 But this will not be called when the NN is in safe mode, because 
{{computeDatanodeWork()}} returns 0 before reaching it if the NN is in safe mode:

 {code}

  int computeDatanodeWork() {
   .
if (namesystem.isInSafeMode()) {
  return 0;
}


this.updateState();


  }
 {code}
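One possible fix direction implied by the report is to refresh the counters before the safe-mode short-circuit rather than after it. A simplified model of that control flow (not the real BlockManager, and not necessarily the committed fix; field names are stand-ins for the real collections):

```java
/** Simplified model of the counter-update problem described above;
 *  not the real BlockManager, just the control flow in question. */
public class SafeModeCountersSketch {
    long pendingReplicationBlocksCount;
    long underReplicatedBlocksCount;

    // stand-ins for pendingReplications.size() / neededReplications.size()
    long pendingSize;
    long neededSize;
    boolean inSafeMode;

    void updateState() {
        pendingReplicationBlocksCount = pendingSize;
        underReplicatedBlocksCount = neededSize;
    }

    /** Adjusted flow: refresh counters even when safe mode skips the work. */
    int computeDatanodeWork() {
        updateState();      // moved before the early return
        if (inSafeMode) {
            return 0;       // still schedule no replication work in safe mode
        }
        // ... compute replication/invalidation work here ...
        return 1;
    }

    public static void main(String[] args) {
        SafeModeCountersSketch bm = new SafeModeCountersSketch();
        bm.inSafeMode = true;
        bm.neededSize = 3;
        bm.computeDatanodeWork();
        System.out.println("underReplicated while in safe mode: "
            + bm.underReplicatedBlocksCount);
    }
}
```

With this ordering, {{getUnderReplicatedBlocksCount()}} would reflect the current {{neededReplications}} size even while the NN sits in safe mode.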



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8295) Add MODIFY and REMOVE ECSchema editlog operations

2015-04-30 Thread Xinwei Qin (JIRA)
Xinwei Qin  created HDFS-8295:
-

 Summary: Add MODIFY and REMOVE ECSchema editlog operations
 Key: HDFS-8295
 URL: https://issues.apache.org/jira/browse/HDFS-8295
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: Xinwei Qin 
Assignee: Xinwei Qin 


If MODIFY and REMOVE ECSchema operations are supported, then add these editlog 
operations to persist them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)