[jira] [Reopened] (HDFS-4184) Add the ability for Client to provide more hint information for DataNode to manage the OS buffer cache more accurately

2012-11-12 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reopened HDFS-4184:
-


Here, I reopened it for you (in case you can't)

> Add the ability for Client to provide more hint information for DataNode to 
> manage the OS buffer cache more accurately
> 
>
> Key: HDFS-4184
> URL: https://issues.apache.org/jira/browse/HDFS-4184
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: binlijin
>
> HDFS now has the ability to use the posix_fadvise and sync_file_range syscalls 
> to manage the OS buffer cache.
> {code}
> When HBase reads the HLog, we can set dfs.datanode.drop.cache.behind.reads 
> to true to drop data out of the buffer cache when performing sequential reads.
> When HBase writes the HLog, we can set dfs.datanode.drop.cache.behind.writes to 
> true to drop data out of the buffer cache after writing.
> When HBase reads HFiles during compaction, we can set 
> dfs.datanode.readahead.bytes to a non-zero value to trigger readahead for 
> sequential reads.
> And so on... 
> {code}
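
For context, a minimal sketch of how the three existing DataNode-wide keys 
quoted above are set today, assuming only the stock 
org.apache.hadoop.conf.Configuration API (the class name below is illustrative; 
the per-client hint interface this issue asks for would go beyond these):

{code}
// Minimal sketch, assumptions as noted above: these are the existing
// DataNode-wide knobs; HDFS-4184 asks for finer-grained, per-client hints.
import org.apache.hadoop.conf.Configuration;

public class CacheHintSettings {
  public static Configuration buildConf() {
    Configuration conf = new Configuration();
    // Drop pages from the OS buffer cache after sequential reads (e.g. HLog replay).
    conf.setBoolean("dfs.datanode.drop.cache.behind.reads", true);
    // Drop pages from the OS buffer cache once writes reach disk (e.g. HLog writes).
    conf.setBoolean("dfs.datanode.drop.cache.behind.writes", true);
    // Ask the DataNode to read ahead 4 MB on sequential reads (e.g. compactions).
    conf.setLong("dfs.datanode.readahead.bytes", 4L * 1024 * 1024);
    return conf;
  }
}
{code}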

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-4184) Add new interface for Client to provide more information

2012-11-12 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HDFS-4184.
-

Resolution: Invalid

Resolving as invalid since there is not enough detail.

The JIRA subject and description do not seem to match.  As per Ted in the 
previous issue, please add more detail when you create an issue so we can 
better understand what you are referring to.  In the meantime I'm closing 
this.  Open a new one with a better specification (this seems to require a 
particular version of Hadoop, etc.).

Thanks Binlijin.

> Add new interface for Client to provide more information
> 
>
> Key: HDFS-4184
> URL: https://issues.apache.org/jira/browse/HDFS-4184
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: binlijin
>
> When HBase reads or writes the HLog, we can use 
> dfs.datanode.drop.cache.behind.reads and dfs.datanode.drop.cache.behind.writes; 
> when HBase reads HFiles during compaction, we can use readahead, and so on... 



[jira] [Created] (HDFS-4185) Add a metric for number of active leases

2012-11-12 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-4185:


 Summary: Add a metric for number of active leases
 Key: HDFS-4185
 URL: https://issues.apache.org/jira/browse/HDFS-4185
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 2.0.2-alpha, 0.23.4
Reporter: Kihwal Lee


We have seen cases of systematic open file leaks, which could have been 
detected if we had a metric that shows the number of active leases.
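
A hedged sketch of what such a metric might look like with the Hadoop metrics2 
API; the class and field names are hypothetical, not from a patch, and the 
class would still need to be registered with the metrics system:

{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableGaugeLong;

// Hypothetical names; registration with DefaultMetricsSystem is omitted.
@Metrics(about = "LeaseManager metrics", context = "dfs")
public class LeaseManagerMetrics {
  @Metric("Number of active leases")
  MutableGaugeLong numActiveLeases;

  // The LeaseManager would call this whenever a lease is added or removed.
  public void setNumActiveLeases(long count) {
    numActiveLeases.set(count);
  }
}
{code}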



[jira] [Created] (HDFS-4184) Add new interface for Client to provide more information

2012-11-12 Thread binlijin (JIRA)
binlijin created HDFS-4184:
--

 Summary: Add new interface for Client to provide more information
 Key: HDFS-4184
 URL: https://issues.apache.org/jira/browse/HDFS-4184
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: binlijin


When HBase reads or writes the HLog, we can use 
dfs.datanode.drop.cache.behind.reads and dfs.datanode.drop.cache.behind.writes; 
when HBase reads HFiles during compaction, we can use readahead, and so on... 



[jira] [Created] (HDFS-4183) Throttle block recovery

2012-11-12 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-4183:


 Summary: Throttle block recovery
 Key: HDFS-4183
 URL: https://issues.apache.org/jira/browse/HDFS-4183
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 2.0.2-alpha, 0.23.4
Reporter: Kihwal Lee
Priority: Critical


When a large number of files are abandoned without being closed, a storm of 
lease expirations follows in about an hour (the lease hard limit). For the last 
block of each file, block recovery is initiated, and when the datanode is done, 
it calls commitBlockSynchronization() against the namenode. A burst of these 
calls can slow down the namenode considerably. We need to throttle block 
recovery and/or speed up the rate at which commitBlockSynchronization() is 
served.
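
Purely as an illustration of the throttling idea (the issue does not prescribe 
a mechanism, and the names below are hypothetical), a bounded permit pool 
around recovery dispatch could look like this:

{code}
import java.util.concurrent.Semaphore;

// Hypothetical sketch: cap how many block recoveries are in flight at once.
public class BlockRecoveryThrottle {
  private final Semaphore permits;

  public BlockRecoveryThrottle(int maxConcurrentRecoveries) {
    this.permits = new Semaphore(maxConcurrentRecoveries);
  }

  // Block until a recovery slot is free before initiating recovery.
  public void beforeRecovery() throws InterruptedException {
    permits.acquire();
  }

  // Release the slot once commitBlockSynchronization() has been served.
  public void afterCommit() {
    permits.release();
  }
}
{code}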




[jira] [Created] (HDFS-4182) SecondaryNameNode leaks NameCache entries

2012-11-12 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-4182:
-

 Summary: SecondaryNameNode leaks NameCache entries
 Key: HDFS-4182
 URL: https://issues.apache.org/jira/browse/HDFS-4182
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 2.0.2-alpha, 0.23.4, 3.0.0
Reporter: Todd Lipcon
Priority: Critical


We recently saw an issue where a 2NN ran out of memory, even though it had a 
relatively small fsimage. When we looked at the heap dump, we saw that all of 
the memory had gone to entries in the NameCache.

It appears that the NameCache stays in "initializing" mode forever, and 
therefore a long-running 2NN leaks entries.
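
A simplified illustration of the described leak pattern (this is not the real 
NameCache code; the method names follow the description above):

{code}
import java.util.HashMap;
import java.util.Map;

// While "initializing", every name is tracked in a transient map that only
// initialized() clears. A long-running 2NN that never calls initialized()
// therefore grows this map without bound.
class NameCacheSketch {
  private boolean initialized = false;
  private final Map<String, Integer> transientMap = new HashMap<String, Integer>();

  void put(String name) {
    if (!initialized) {
      Integer n = transientMap.get(name);
      transientMap.put(name, n == null ? 1 : n + 1);  // grows forever
    }
  }

  void initialized() {
    initialized = true;
    transientMap.clear();  // the step a long-running 2NN never reaches
  }
}
{code}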



[jira] [Created] (HDFS-4181) LeaseManager tries to double remove and prints extra messages

2012-11-12 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-4181:


 Summary: LeaseManager tries to double remove and prints extra 
messages
 Key: HDFS-4181
 URL: https://issues.apache.org/jira/browse/HDFS-4181
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 2.0.2-alpha, 0.23.4
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Critical


When checkLeases() runs, internalReleaseLease() is called on the expired 
leases. When it returns true, the lease has already been removed, yet the 
removal is attempted again in checkLease(). This causes unnecessary ERROR 
messages to be logged. The line doing {{removing.add(p)}} should be removed.

The internalReleaseLease() method logs a detailed message per call, so the 
extra INFO log message from checkLease() is redundant. 

The error message from removeLease() can be very big and needs to be cut down. 
When the namenode itself is holding a lot of leases for block recovery, hitting 
this error is very expensive. In one instance, slow block recovery caused the 
namenode to hold more than 42K leases, and the single log line in that case was 
over 4 MB.  The dump of the data structure should only be enabled in debug mode.
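
A sketch of the suggested debug-mode guard, using the commons-logging idiom 
Hadoop already uses (the names here are illustrative):

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class LeaseLogGuard {
  private static final Log LOG = LogFactory.getLog(LeaseLogGuard.class);

  // Only build and emit the potentially multi-megabyte lease dump when
  // debug logging is actually enabled.
  static void dumpLeases(Object leaseState) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Lease state: " + leaseState);
    }
  }
}
{code}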



[jira] [Created] (HDFS-4180) TestFileCreation fails in branch-1 but not branch-1.1

2012-11-12 Thread Tsz Wo (Nicholas), SZE (JIRA)
Tsz Wo (Nicholas), SZE created HDFS-4180:


 Summary: TestFileCreation fails in branch-1 but not branch-1.1
 Key: HDFS-4180
 URL: https://issues.apache.org/jira/browse/HDFS-4180
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Tsz Wo (Nicholas), SZE


{noformat}
Testcase: testFileCreation took 3.419 sec
Caused an ERROR
java.io.IOException: Cannot create /test_dir; already exists as a directory
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1374)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1334)
...
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

org.apache.hadoop.ipc.RemoteException: java.io.IOException: Cannot create 
/test_dir; already exists as a directory
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1374)
...
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:443)
at 
org.apache.hadoop.hdfs.TestFileCreation.checkFileCreation(TestFileCreation.java:249)
at 
org.apache.hadoop.hdfs.TestFileCreation.testFileCreation(TestFileCreation.java:179)
{noformat}




[jira] [Created] (HDFS-4179) BackupNode: allow reads, fix checkpointing, safeMode

2012-11-12 Thread Konstantin Shvachko (JIRA)
Konstantin Shvachko created HDFS-4179:
-

 Summary: BackupNode: allow reads, fix checkpointing, safeMode
 Key: HDFS-4179
 URL: https://issues.apache.org/jira/browse/HDFS-4179
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 2.0.2-alpha
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko


The BackupNode should be allowed to accept read commands. This needs some 
adjustments to checkpointing and safe mode.



[jira] [Created] (HDFS-4178) shell scripts should not close stderr

2012-11-12 Thread Andy Isaacson (JIRA)
Andy Isaacson created HDFS-4178:
---

 Summary: shell scripts should not close stderr
 Key: HDFS-4178
 URL: https://issues.apache.org/jira/browse/HDFS-4178
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: scripts
Affects Versions: 2.0.2-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson


The {{start-dfs.sh}} and {{stop-dfs.sh}} scripts close stderr for some 
subprocesses using the construct
bq. {{2>&-}}
This is dangerous because child processes started under this scenario will 
re-use file descriptor 2 for opened files.  Since libc and many other code 
paths assume that file descriptor 2 can be written to in error conditions, this 
can potentially result in data corruption.

It is much better to redirect stderr using the construct {{2>/dev/null}}.



[jira] [Created] (HDFS-4177) Add a snapshot parameter to INodeDirectory.getChildrenList()

2012-11-12 Thread Tsz Wo (Nicholas), SZE (JIRA)
Tsz Wo (Nicholas), SZE created HDFS-4177:


 Summary: Add a snapshot parameter to 
INodeDirectory.getChildrenList()
 Key: HDFS-4177
 URL: https://issues.apache.org/jira/browse/HDFS-4177
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE


With the snapshot feature, the children list of a directory can differ between 
snapshots and the current view.  The snapshot information is therefore required 
to select the correct children list.
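
A minimal sketch of the proposed API shape; the types and names here are 
assumptions drawn from this description, not the committed patch:

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A null snapshot selects the current view; any other snapshot selects the
// children list recorded for it.
class INodeDirectorySketch {
  private final List<String> currentChildren = new ArrayList<String>();
  private final Map<String, List<String>> snapshotChildren =
      new HashMap<String, List<String>>();

  List<String> getChildrenList(String snapshot) {
    if (snapshot == null) {
      return currentChildren;  // current view
    }
    List<String> saved = snapshotChildren.get(snapshot);
    return saved != null ? saved : new ArrayList<String>();  // snapshot view
  }
}
{code}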



[jira] [Created] (HDFS-4176) EditLogTailer should call rollEdits with a timeout

2012-11-12 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-4176:
-

 Summary: EditLogTailer should call rollEdits with a timeout
 Key: HDFS-4176
 URL: https://issues.apache.org/jira/browse/HDFS-4176
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha, name-node
Affects Versions: 2.0.2-alpha, 3.0.0
Reporter: Todd Lipcon


When the EditLogTailer thread calls rollEdits() on the active NN via RPC, it 
currently does so without a timeout. So, if the active NN has frozen (but not 
actually crashed), this call can hang forever. This can then potentially 
prevent the standby from becoming active.

This may actually be considered a side effect of HADOOP-6762 -- if the RPC were 
interruptible, that would also fix the issue.
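
One hypothetical way to bound the RPC with a deadline, sketched with plain 
java.util.concurrent (this is not the EditLogTailer code):

{code}
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedRpc {
  private final ExecutorService executor = Executors.newSingleThreadExecutor();

  // Run the (hypothetical) rollEdits RPC on a helper thread and give up
  // after timeoutMs instead of hanging on a frozen active NN.
  public void callWithTimeout(Callable<Void> rollEditsRpc, long timeoutMs)
      throws Exception {
    Future<Void> f = executor.submit(rollEditsRpc);
    try {
      f.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException te) {
      f.cancel(true);  // interrupt rather than block failover forever
      throw te;
    }
  }
}
{code}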



Hadoop-Hdfs-0.23-Build - Build # 433 - Unstable

2012-11-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/433/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 12268 lines...]
[mkdir] Created dir: 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/target/test-dir
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-dependency-plugin:2.1:build-classpath (build-classpath) @ 
hadoop-hdfs-project ---
[INFO] Wrote classpath file 
'/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/target/classes/mrapp-generated-classpath'.
[INFO] 
[INFO] --- maven-source-plugin:2.1.2:jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.0:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable 
package
[INFO] 
[INFO] --- maven-install-plugin:2.3.1:install (default-install) @ 
hadoop-hdfs-project ---
[INFO] Installing 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/pom.xml
 to 
/home/jenkins/.m2/repository/org/apache/hadoop/hadoop-hdfs-project/0.23.5-SNAPSHOT/hadoop-hdfs-project-0.23.5-SNAPSHOT.pom
[INFO] 
[INFO] --- maven-antrun-plugin:1.6:run (create-testdirs) @ hadoop-hdfs-project 
---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-dependency-plugin:2.1:build-classpath (build-classpath) @ 
hadoop-hdfs-project ---
[INFO] Skipped writing classpath file 
'/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/target/classes/mrapp-generated-classpath'.
  No changes found.
[INFO] 
[INFO] --- maven-source-plugin:2.1.2:jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.0:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable 
package
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.6:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:2.3.2:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] ** FindBugsMojo execute ***
[INFO] canGenerate is false
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  SUCCESS [5:22.523s]
[INFO] Apache Hadoop HttpFS .. SUCCESS [47.681s]
[INFO] Apache Hadoop HDFS Project  SUCCESS [0.059s]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 6:10.874s
[INFO] Finished at: Mon Nov 12 11:40:15 UTC 2012
[INFO] Final Memory: 51M/743M
[INFO] 
+ /home/jenkins/tools/maven/latest/bin/mvn test 
-Dmaven.test.failure.ignore=true -Pclover 
-DcloverLicenseLocation=/home/jenkins/tools/clover/latest/lib/clover.license
Archiving artifacts
Recording test results
Build step 'Publish JUnit test result report' changed build result to UNSTABLE
Publishing Javadoc
Recording fingerprints
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Unstable
Sending email for trigger: Unstable



###
## FAILED TESTS (if any) 
##
1 tests failed.
REGRESSION:  org.apache.hadoop.hdfs.TestCrcCorruption.testCrcCorruption

Error Message:
IPC server unable to read call parameters: readObject can't find class 
org.apache.hadoop.hdfs.protocol.ExtendedBlock

Stack Trace:
java.lang.RuntimeException: IPC server unable to read call parameters: 
readObject can't find class org.apache.hadoop.hdfs.protocol.ExtendedBlock
at org.apache.hadoop.ipc.Client.call(Client.java:1088)
at 
org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:195)
at $Proxy13.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:102)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:67)
at $Proxy13.addBlock(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBl

Jenkins build became unstable: Hadoop-Hdfs-0.23-Build #433

2012-11-12 Thread Apache Jenkins Server
See