[jira] [Created] (HADOOP-16159) Deadlock when using FsUrlStreamHandlerFactory
Ajith S created HADOOP-16159: Summary: Deadlock when using FsUrlStreamHandlerFactory Key: HADOOP-16159 URL: https://issues.apache.org/jira/browse/HADOOP-16159 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.1.2, 2.8.5 Reporter: Ajith S

Calling URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory()) induces a deadlock between two threads.

Thread 1: loads a class, which does the following:

- waiting to lock <0x0005c0c1e5e0> (a org.apache.hadoop.conf.Configuration)
at org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088)
at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
at org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
at java.net.URL.getURLStreamHandler(URL.java:1142)
at java.net.URL.<init>(URL.java:420)
at sun.misc.URLClassPath$JarLoader.<init>(URLClassPath.java:812)
at sun.misc.URLClassPath$JarLoader$3.run(URLClassPath.java:1094)
at sun.misc.URLClassPath$JarLoader$3.run(URLClassPath.java:1091)
at java.security.AccessController.doPrivileged(Native Method)
at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:1090)
at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:1050)
at sun.misc.URLClassPath.getResource(URLClassPath.java:239)
at java.net.URLClassLoader$1.run(URLClassLoader.java:365)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
- locked <0x0005b7991168> (a org.apache.spark.util.MutableURLClassLoader)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
Thread 2: creates a new URL:

- waiting to lock <0x0005b7991168> (a org.apache.spark.util.MutableURLClassLoader)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.xerces.parsers.ObjectFactory.findProviderClass(Unknown Source)
at org.apache.xerces.parsers.ObjectFactory.newInstance(Unknown Source)
at org.apache.xerces.parsers.ObjectFactory.createObject(Unknown Source)
at org.apache.xerces.parsers.ObjectFactory.createObject(Unknown Source)
at org.apache.xerces.parsers.DOMParser.<init>(Unknown Source)
at org.apache.xerces.parsers.DOMParser.<init>(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.<init>(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderFactoryImpl.newDocumentBuilder(Unknown Source)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2737)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2696)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2579)
- locked <0x0005c0c1e5e0> (a org.apache.hadoop.conf.Configuration)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:1091)
at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
at org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
at java.net.URL.getURLStreamHandler(URL.java:1142)
at java.net.URL.<init>(URL.java:599)

-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
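The cycle above is a classic lock-order inversion: thread 1 holds the class-loader monitor and wants the Configuration monitor, while thread 2 holds the Configuration monitor and (via Xerces class loading) wants the class-loader monitor. A minimal standalone sketch of that inversion, using plain object monitors as stand-ins for the two locks (illustrative names only, not Hadoop/Spark API):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Standalone sketch of the lock-order inversion in the report (illustrative
// names; not Hadoop/Spark code). "classLoaderLock" stands in for the
// MutableURLClassLoader monitor, "confLock" for the Configuration monitor.
class LockInversionSketch {
    static final Object classLoaderLock = new Object();
    static final Object confLock = new Object();

    static boolean demonstrate() {
        CountDownLatch holdingFirst = new CountDownLatch(2);

        // Thread 1: loadClass -> takes the classloader monitor, then the
        // URL handler factory consults Configuration -> needs confLock.
        Thread t1 = daemon(() -> {
            synchronized (classLoaderLock) {
                holdingFirst.countDown();
                await(holdingFirst);
                synchronized (confLock) { }
            }
        });

        // Thread 2: Configuration.getProps -> takes the conf monitor, then
        // XML parsing triggers class loading -> needs classLoaderLock.
        Thread t2 = daemon(() -> {
            synchronized (confLock) {
                holdingFirst.countDown();
                await(holdingFirst);
                synchronized (classLoaderLock) { }
            }
        });

        t1.start();
        t2.start();
        try { Thread.sleep(500); } catch (InterruptedException ignored) { }

        // The JVM itself can confirm the monitor deadlock.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] deadlocked = mx.findMonitorDeadlockedThreads();
        return deadlocked != null && deadlocked.length >= 2;
    }

    private static Thread daemon(Runnable r) {
        Thread t = new Thread(r);
        t.setDaemon(true); // JVM can still exit while they stay deadlocked
        return t;
    }

    private static void await(CountDownLatch l) {
        try { l.await(); } catch (InterruptedException ignored) { }
    }
}
```

The latch guarantees both threads hold their first monitor before either requests the second, so the deadlock is reproduced deterministically.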
[jira] [Resolved] (HADOOP-14792) Package on windows fail
[ https://issues.apache.org/jira/browse/HADOOP-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S resolved HADOOP-14792. -- Resolution: Not A Problem As BUILDING.txt clearly states, Cygwin is not supported; removing Cygwin from PATH and using the Git Unix tools instead resolved this. Marking as not an issue. > Package on windows fail > --- > > Key: HADOOP-14792 > URL: https://issues.apache.org/jira/browse/HADOOP-14792 > Project: Hadoop Common > Issue Type: Bug >Reporter: Ajith S >Assignee: Ajith S > Attachments: packagefail.png > > > {{mvn package -Pdist -Pnative-win -DskipTests -Dtar > -Dmaven.javadoc.skip=true}} > command fails on windows > this is because > dev-support/bin/dist-copynativelibs need dos2unix conversion > to avoid failure, we can add the conversion before bash execute
[jira] [Commented] (HADOOP-14792) Package on windows fail
[ https://issues.apache.org/jira/browse/HADOOP-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134802#comment-16134802 ] Ajith S commented on HADOOP-14792: -- As Cygwin does not have the dos2unix command installed by default, we can strip the carriage returns with an awk or tr command, as below:
{noformat}
/cygdrive/d/hadoop/code/hadoop
$ D:/hadoop/code/hadoop/hadoop-project/../dev-support/bin/dist-copynativelibs --version=3.0.0-beta1-SNAPSHOT --builddir=D:\hadoop\code\hadoop\hadoop-project-dist\target --artifact id=hadoop-project-dist --isalbundle=false --isallib= --openssllib= --opensslbinbundle=false --openssllibbundle=false --snappybinbundle=false --snappylib= --snappylibbundle=false - -zstdbinbundle=false --zstdlib= --zstdlibbundle=false
D:/hadoop/code/hadoop/hadoop-project/../dev-support/bin/dist-copynativelibs: line 16: $'\r': command not found
: invalid option name/hadoop-project/../dev-support/bin/dist-copynativelibs: line 17: set: pipefail
D:/hadoop/code/hadoop/hadoop-project/../dev-support/bin/dist-copynativelibs: line 18: $'\r': command not found
D:/hadoop/code/hadoop/hadoop-project/../dev-support/bin/dist-copynativelibs: line 21: syntax error near unexpected token `$'\r''
':/hadoop/code/hadoop/hadoop-project/../dev-support/bin/dist-copynativelibs: line 21: `function bundle_native_lib()
/cygdrive/d/hadoop/code/hadoop
$ tr -d '\15\32' < D:/hadoop/code/hadoop/hadoop-project/../dev-support/bin/dist-copynativelibs > D:/hadoop/code/hadoop/hadoop-project/../dev-support/bin/dist-copynativelibs
/cygdrive/d/hadoop/code/hadoop
$ D:/hadoop/code/hadoop/hadoop-project/../dev-support/bin/dist-copynativelibs --version=3.0.0-beta1-SNAPSHOT --builddir=D:\hadoop\code\hadoop\hadoop-project-dist\target --artifact id=hadoop-project-dist --isalbundle=false --isallib= --openssllib= --opensslbinbundle=false --openssllibbundle=false --snappybinbundle=false --snappylib= --snappylibbundle=false - -zstdbinbundle=false --zstdlib= --zstdlibbundle=false
/cygdrive/d/hadoop/code/hadoop
{noformat}
> Package on windows fail > --- > > Key: HADOOP-14792 > URL: https://issues.apache.org/jira/browse/HADOOP-14792 > Project: Hadoop Common > Issue Type: Bug >Reporter: Ajith S >Assignee: Ajith S > Attachments: packagefail.png > > > {{mvn package -Pdist -Pnative-win -DskipTests -Dtar > -Dmaven.javadoc.skip=true}} > command fails on windows > this is because > dev-support/bin/dist-copynativelibs need dos2unix conversion > to avoid failure, we can add the conversion before bash execute
[jira] [Updated] (HADOOP-14792) Package on windows fail
[ https://issues.apache.org/jira/browse/HADOOP-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-14792: - Attachment: packagefail.png > Package on windows fail > --- > > Key: HADOOP-14792 > URL: https://issues.apache.org/jira/browse/HADOOP-14792 > Project: Hadoop Common > Issue Type: Bug >Reporter: Ajith S >Assignee: Ajith S > Attachments: packagefail.png > > > {{mvn package -Pdist -Pnative-win -DskipTests -Dtar > -Dmaven.javadoc.skip=true}} > command fails on windows > this is because > dev-support/bin/dist-copynativelibs need dos2unix conversion > to avoid failure, we can add the conversion before bash execute
[jira] [Assigned] (HADOOP-14451) Deadlock in NativeIO
[ https://issues.apache.org/jira/browse/HADOOP-14451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S reassigned HADOOP-14451: Assignee: (was: Ajith S) > Deadlock in NativeIO > > > Key: HADOOP-14451 > URL: https://issues.apache.org/jira/browse/HADOOP-14451 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Ajith S >Priority: Blocker > Attachments: HADOOP-14451-01.patch, Nodemanager.jstack > >
[jira] [Commented] (HADOOP-14451) Deadlock in NativeIO
[ https://issues.apache.org/jira/browse/HADOOP-14451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024697#comment-16024697 ] Ajith S commented on HADOOP-14451: -- I think people here are more interested in uploading a patch than in discussing with the submitter whether the analysis and approach are right. Having added a good amount of analysis, I would probably have arrived at a solution too. Anyway, feel free to assign it to yourself if that's the case. > Deadlock in NativeIO > > > Key: HADOOP-14451 > URL: https://issues.apache.org/jira/browse/HADOOP-14451 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Ajith S >Assignee: Ajith S >Priority: Blocker > Attachments: HADOOP-14451-01.patch, Nodemanager.jstack > >
[jira] [Commented] (HADOOP-14451) Deadlock in NativeIO
[ https://issues.apache.org/jira/browse/HADOOP-14451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022919#comment-16022919 ] Ajith S commented on HADOOP-14451: -- Some references https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/functions.html#FindClass https://docs.oracle.com/javase/specs/jls/se8/html/jls-12.html#jls-12.4.2 > Deadlock in NativeIO > > > Key: HADOOP-14451 > URL: https://issues.apache.org/jira/browse/HADOOP-14451 > Project: Hadoop Common > Issue Type: Bug >Reporter: Ajith S >Assignee: Ajith S >Priority: Critical > Attachments: Nodemanager.jstack > >
[jira] [Comment Edited] (HADOOP-14451) Deadlock in NativeIO
[ https://issues.apache.org/jira/browse/HADOOP-14451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022896#comment-16022896 ] Ajith S edited comment on HADOOP-14451 at 5/24/17 1:37 PM: --- Attaching the affected nodemanager jstack. For the background of the issue: the test NM seemed to have all containers time out after restart, and hence, from the jstack, I arrived at the above conclusion of a deadlock in NativeIO. was (Author: ajithshetty): attaching affected nodemanager jstack > Deadlock in NativeIO > > > Key: HADOOP-14451 > URL: https://issues.apache.org/jira/browse/HADOOP-14451 > Project: Hadoop Common > Issue Type: Bug >Reporter: Ajith S >Assignee: Ajith S >Priority: Critical > Attachments: Nodemanager.jstack > >
[jira] [Updated] (HADOOP-14451) Deadlock in NativeIO
[ https://issues.apache.org/jira/browse/HADOOP-14451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-14451: - Attachment: Nodemanager.jstack Attaching the affected nodemanager jstack. > Deadlock in NativeIO > > > Key: HADOOP-14451 > URL: https://issues.apache.org/jira/browse/HADOOP-14451 > Project: Hadoop Common > Issue Type: Bug >Reporter: Ajith S >Assignee: Ajith S >Priority: Critical > Attachments: Nodemanager.jstack > >
[jira] [Moved] (HADOOP-14451) Deadlock in NativeIO
[ https://issues.apache.org/jira/browse/HADOOP-14451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S moved YARN-6637 to HADOOP-14451: Key: HADOOP-14451 (was: YARN-6637) Project: Hadoop Common (was: Hadoop YARN) > Deadlock in NativeIO > > > Key: HADOOP-14451 > URL: https://issues.apache.org/jira/browse/HADOOP-14451 > Project: Hadoop Common > Issue Type: Bug >Reporter: Ajith S >Assignee: Ajith S >Priority: Critical > >
[jira] [Updated] (HADOOP-12309) [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException
[ https://issues.apache.org/jira/browse/HADOOP-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-12309: - Attachment: HADOOP-12309.3.patch Attached after rebase. Please review. > [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class > org.apache.hadoop.io.MultipleIOException > - > > Key: HADOOP-12309 > URL: https://issues.apache.org/jira/browse/HADOOP-12309 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Ajith S >Assignee: Ajith S >Priority: Minor > Attachments: HADOOP-12309.2.patch, HADOOP-12309.3.patch, > HADOOP-12309.patch > > > Can use java.lang.Throwable.addSuppressed(Throwable) instead of > org.apache.hadoop.io.MultipleIOException as 1.7+ java provides support for > this. org.apache.hadoop.io.MultipleIOException can be deprecated as for now > {code} > > catch (IOException e) { > if(generalException == null) > { > generalException = new IOException("General exception"); > } > generalException.addSuppressed(e); > } > {code}
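The pattern quoted in the issue description can be sketched end-to-end as a small self-contained helper (illustrative code, not the attached patch):

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

// Illustrative sketch of the proposed refactor (not the actual patch):
// collect every close() failure as a suppressed exception on one summary
// IOException instead of wrapping them in MultipleIOException.
class SuppressedClose {
    static void closeAll(List<Closeable> resources) throws IOException {
        IOException general = null;
        for (Closeable c : resources) {
            try {
                c.close();
            } catch (IOException e) {
                if (general == null) {
                    general = new IOException("General exception");
                }
                general.addSuppressed(e); // Java 7+ keeps every cause
            }
        }
        if (general != null) {
            throw general; // getSuppressed() exposes the individual failures
        }
    }
}
```

A caller can inspect each underlying failure via Throwable.getSuppressed(), which is why the separate MultipleIOException wrapper becomes unnecessary on Java 7+.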
[jira] [Updated] (HADOOP-12253) ViewFileSystem getFileStatus java.lang.ArrayIndexOutOfBoundsException: 0
[ https://issues.apache.org/jira/browse/HADOOP-12253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-12253: - Attachment: HADOOP-12253.2.patch Thanks for the input. Attached patch after modifications. Please review > ViewFileSystem getFileStatus java.lang.ArrayIndexOutOfBoundsException: 0 > > > Key: HADOOP-12253 > URL: https://issues.apache.org/jira/browse/HADOOP-12253 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.6.0 > Environment: hadoop 2.6.0 hive 1.1.0 tez0.7 cenos6.4 >Reporter: tangjunjie >Assignee: Ajith S > Attachments: HADOOP-12253.2.patch, HADOOP-12253.patch > > > When I enable hdfs federation.I run a query on hive on tez. Then it occur a > exception: > {noformat} > 8.784 PM WARNorg.apache.hadoop.security.UserGroupInformation No > groups available for user tangjijun > 3:12:28.784 PMERROR org.apache.hadoop.hive.ql.exec.Task Failed > to execute tez graph. > java.lang.ArrayIndexOutOfBoundsException: 0 > at > org.apache.hadoop.fs.viewfs.ViewFileSystem$InternalDirOfViewFs.getFileStatus(ViewFileSystem.java:771) > at > org.apache.hadoop.fs.viewfs.ViewFileSystem.getFileStatus(ViewFileSystem.java:359) > at > org.apache.tez.client.TezClientUtils.checkAncestorPermissionsForAllUsers(TezClientUtils.java:955) > at > org.apache.tez.client.TezClientUtils.setupTezJarsLocalResources(TezClientUtils.java:184) > at > org.apache.tez.client.TezClient.getTezJarResources(TezClient.java:787) > at org.apache.tez.client.TezClient.start(TezClient.java:337) > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:191) > at > org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:234) > at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:136) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) > at 
org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1640) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1399) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:144) > at > org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69) > at > org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at > org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {noformat} > I digging into the issue,I found the code snippet in ViewFileSystem.java as > follows: > {noformat} > @Override > public FileStatus getFileStatus(Path f) throws IOException { > checkPathIsSlash(f); > return new FileStatus(0, true, 0, 0, creationTime, creationTime, > PERMISSION_555, ugi.getUserName(), ugi.getGroupNames()[0], > new Path(theInternalDir.fullPath).makeQualified( > myUri, ROOT_PATH)); > } > {noformat} > If the node in cluster haven't creat user like > tangjijun,ugi.getGroupNames()[0] will throw > ArrayIndexOutOfBoundsException.Because no user mean no group. > I create user tangjijun on that node. Then the job was executed normally. 
> I think this code should check whether ugi.getGroupNames() is empty. When it is empty, print a log message rather than throwing an exception.
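The guard suggested above could look like the following sketch (hypothetical helper, not the actual ViewFileSystem fix; the fallback-to-user-name behaviour is an assumption for illustration):

```java
// Hypothetical helper (not the actual ViewFileSystem patch): fall back to
// the user name when the UGI reports no groups, instead of indexing
// groupNames[0] unconditionally as ViewFileSystem.getFileStatus() does.
class GroupNameGuard {
    static String primaryGroup(String userName, String[] groupNames) {
        if (groupNames == null || groupNames.length == 0) {
            // Matches the report: a user with no local account on the node
            // has no groups, so groupNames[0] would throw
            // ArrayIndexOutOfBoundsException. Log a warning and fall back.
            return userName;
        }
        return groupNames[0];
    }
}
```

With this guard, a missing local account degrades to a warning plus a best-effort group value instead of failing the whole getFileStatus call.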
[jira] [Commented] (HADOOP-12376) S3NInputStream.close() downloads the remaining bytes of the object from S3
[ https://issues.apache.org/jira/browse/HADOOP-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012883#comment-15012883 ] Ajith S commented on HADOOP-12376: -- This was discussed with the JetS3t developer. The bug was introduced between JetS3t releases 0.8.1 and 0.9.0 and will be fixed in the upcoming release of JetS3t: https://bitbucket.org/jmurty/jets3t/issues/218/httpmethodreleaseinputstream-close-abort > S3NInputStream.close() downloads the remaining bytes of the object from S3 > -- > > Key: HADOOP-12376 > URL: https://issues.apache.org/jira/browse/HADOOP-12376 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 2.6.0, 2.7.1 >Reporter: Steve Loughran >Assignee: Ajith S > > This is the same as HADOOP-11570, possibly the swift code has the same > problem. > Apparently (as raised on ASF lists), when you close an s3n input stream, it > reads through the remainder of the file. This kills performance on partial > reads of large files.
[jira] [Commented] (HADOOP-12382) Add Documentation for FairCallQueue
[ https://issues.apache.org/jira/browse/HADOOP-12382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738602#comment-14738602 ] Ajith S commented on HADOOP-12382: -- Sure, feel free to assign to yourself > Add Documentation for FairCallQueue > --- > > Key: HADOOP-12382 > URL: https://issues.apache.org/jira/browse/HADOOP-12382 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Ajith S >Assignee: Ajith S > > Added supporting documentation explaining the FairCallQueue and mention all > the properties introduced by its subtasks accordingly
[jira] [Created] (HADOOP-12382) Add Documentation for FairCallQueue
Ajith S created HADOOP-12382: Summary: Add Documentation for FairCallQueue Key: HADOOP-12382 URL: https://issues.apache.org/jira/browse/HADOOP-12382 Project: Hadoop Common Issue Type: Sub-task Reporter: Ajith S Assignee: Ajith S Add supporting documentation explaining the FairCallQueue and mention all the properties introduced by its subtasks accordingly.
[jira] [Commented] (HADOOP-9640) RPC Congestion Control with FairCallQueue
[ https://issues.apache.org/jira/browse/HADOOP-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14733724#comment-14733724 ] Ajith S commented on HADOOP-9640: - I missed checking the sub-tasks. I would like to update the documentation for FairCallQueue, as it introduces a lot of configuration. It would be better if we could explain it briefly, similar to http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html. Added a sub-task for it. > RPC Congestion Control with FairCallQueue > - > > Key: HADOOP-9640 > URL: https://issues.apache.org/jira/browse/HADOOP-9640 > Project: Hadoop Common > Issue Type: Improvement >Affects Versions: 2.2.0, 3.0.0 >Reporter: Xiaobo Peng >Assignee: Chris Li > Labels: hdfs, qos, rpc > Attachments: FairCallQueue-PerformanceOnCluster.pdf, > MinorityMajorityPerformance.pdf, NN-denial-of-service-updated-plan.pdf, > faircallqueue.patch, faircallqueue2.patch, faircallqueue3.patch, > faircallqueue4.patch, faircallqueue5.patch, faircallqueue6.patch, > faircallqueue7_with_runtime_swapping.patch, > rpc-congestion-control-draft-plan.pdf > > > For an easy-to-read summary see: > http://www.ebaytechblog.com/2014/08/21/quality-of-service-in-hadoop/ > Several production Hadoop cluster incidents occurred where the Namenode was > overloaded and failed to respond. > We can improve quality of service for users during namenode peak loads by > replacing the FIFO call queue with a [Fair Call > Queue|https://issues.apache.org/jira/secure/attachment/12616864/NN-denial-of-service-updated-plan.pdf]. > (this plan supersedes rpc-congestion-control-draft-plan). > Excerpted from the communication of one incident, “The map task of a user was > creating huge number of small files in the user directory. Due to the heavy > load on NN, the JT also was unable to communicate with NN...The cluster > became responsive only once the job was killed.” > Excerpted from the communication of another incident, “Namenode was > overloaded by GetBlockLocation requests (Correction: should be getFileInfo > requests. the job had a bug that called getFileInfo for a nonexistent file in > an endless loop). All other requests to namenode were also affected by this > and hence all jobs slowed down. Cluster almost came to a grinding > halt…Eventually killed jobtracker to kill all jobs that are running.” > Excerpted from HDFS-945, “We've seen defective applications cause havoc on > the NameNode, for e.g. by doing 100k+ 'listStatus' on very large directories > (60k files) etc.”
[jira] [Commented] (HADOOP-12376) S3NInputStream.close() downloads the remaining bytes of the object from S3
[ https://issues.apache.org/jira/browse/HADOOP-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730525#comment-14730525 ] Ajith S commented on HADOOP-12376: -- Hi. After initial analysis, I found that in the jets3t jar, this particular change https://bitbucket.org/jmurty/jets3t/diff/src/org/jets3t/service/impl/rest/httpclient/HttpMethodReleaseInputStream.java?diff2=3709f8458ba6=default
{code}
 if (!underlyingStreamConsumed) {
     // Underlying input stream has not been consumed, abort method
     // to force connection to be closed and cleaned-up.
-    httpMethod.abort();
+    httpResponse.getEntity().consumeContent(); // Current version consumes entity in a utility
 }
-httpMethod.releaseConnection();
 alreadyReleased = true;
}
{code}
is causing the issue: instead of aborting, it chooses to consume the stream before closing. > S3NInputStream.close() downloads the remaining bytes of the object from S3 > -- > > Key: HADOOP-12376 > URL: https://issues.apache.org/jira/browse/HADOOP-12376 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 2.6.0, 2.7.1 >Reporter: Steve Loughran >Assignee: Ajith S > > This is the same as HADOOP-11570, possibly the swift code has the same > problem. > Apparently (as raised on ASF lists), when you close an s3n input stream, it > reads through the remainder of the file. This kills performance on partial > reads of large files.
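The cost of consume-on-close can be seen in a standalone sketch (plain java.io, not JetS3t or Hadoop code): a close() that drains the remaining bytes must read the whole tail of the stream, which for S3N means downloading the rest of the object.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Standalone sketch (not JetS3t code): a close() that drains the underlying
// stream, mimicking the consume-on-close behaviour described above.
class DrainingStream extends FilterInputStream {
    long drained; // bytes read purely to satisfy close()

    DrainingStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() throws IOException {
        // Consuming the remainder costs one read for every remaining byte.
        // On a partial read of a large S3 object this is the "downloads the
        // remaining bytes" problem; an abort() would avoid it entirely.
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            drained += n;
        }
        super.close();
    }
}
```

Reading 10 bytes of a 1 MB stream and then closing drains the other ~1 MB, which is exactly the performance cliff the issue describes for partial reads.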
[jira] [Commented] (HADOOP-9640) RPC Congestion Control with FairCallQueue
[ https://issues.apache.org/jira/browse/HADOOP-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730272#comment-14730272 ] Ajith S commented on HADOOP-9640: - Hi [~chrilisf], any progress on the issue? If you are not looking into this, I would like to continue the work :) > RPC Congestion Control with FairCallQueue > - > > Key: HADOOP-9640 > URL: https://issues.apache.org/jira/browse/HADOOP-9640 > Project: Hadoop Common > Issue Type: Improvement >Affects Versions: 2.2.0, 3.0.0 >Reporter: Xiaobo Peng >Assignee: Chris Li > Labels: hdfs, qos, rpc > Attachments: FairCallQueue-PerformanceOnCluster.pdf, > MinorityMajorityPerformance.pdf, NN-denial-of-service-updated-plan.pdf, > faircallqueue.patch, faircallqueue2.patch, faircallqueue3.patch, > faircallqueue4.patch, faircallqueue5.patch, faircallqueue6.patch, > faircallqueue7_with_runtime_swapping.patch, > rpc-congestion-control-draft-plan.pdf > > > For an easy-to-read summary see: > http://www.ebaytechblog.com/2014/08/21/quality-of-service-in-hadoop/ > Several production Hadoop cluster incidents occurred where the Namenode was > overloaded and failed to respond. > We can improve quality of service for users during namenode peak loads by > replacing the FIFO call queue with a [Fair Call > Queue|https://issues.apache.org/jira/secure/attachment/12616864/NN-denial-of-service-updated-plan.pdf]. > (this plan supersedes rpc-congestion-control-draft-plan). > Excerpted from the communication of one incident, “The map task of a user was > creating huge number of small files in the user directory. Due to the heavy > load on NN, the JT also was unable to communicate with NN...The cluster > became responsive only once the job was killed.” > Excerpted from the communication of another incident, “Namenode was > overloaded by GetBlockLocation requests (Correction: should be getFileInfo > requests. the job had a bug that called getFileInfo for a nonexistent file in > an endless loop). All other requests to namenode were also affected by this > and hence all jobs slowed down. Cluster almost came to a grinding > halt…Eventually killed jobtracker to kill all jobs that are running.” > Excerpted from HDFS-945, “We've seen defective applications cause havoc on > the NameNode, for e.g. by doing 100k+ 'listStatus' on very large directories > (60k files) etc.”
[jira] [Assigned] (HADOOP-12376) S3NInputStream.close() downloads the remaining bytes of the object from S3
[ https://issues.apache.org/jira/browse/HADOOP-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S reassigned HADOOP-12376: Assignee: Ajith S > S3NInputStream.close() downloads the remaining bytes of the object from S3 > -- > > Key: HADOOP-12376 > URL: https://issues.apache.org/jira/browse/HADOOP-12376 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 2.6.0, 2.7.1 >Reporter: Steve Loughran >Assignee: Ajith S > > This is the same as HADOOP-11570, possibly the swift code has the same > problem. > Apparently (as raised on ASF lists), when you close an s3n input stream, it > reads through the remainder of the file. This kills performance on partial > reads of large files.
[jira] [Updated] (HADOOP-12309) [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException
[ https://issues.apache.org/jira/browse/HADOOP-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-12309: - Attachment: HADOOP-12309.2.patch Attaching the patch after working on the comments. Please review. [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException - Key: HADOOP-12309 URL: https://issues.apache.org/jira/browse/HADOOP-12309 Project: Hadoop Common Issue Type: Improvement Reporter: Ajith S Assignee: Ajith S Priority: Minor Attachments: HADOOP-12309.2.patch, HADOOP-12309.patch Can use java.lang.Throwable.addSuppressed(Throwable) instead of org.apache.hadoop.io.MultipleIOException as 1.7+ java provides support for this. org.apache.hadoop.io.MultipleIOException can be deprecated as for now {code} catch (IOException e) { if (generalException == null) { generalException = new IOException("General exception"); } generalException.addSuppressed(e); } {code}
[jira] [Commented] (HADOOP-12309) [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException
[ https://issues.apache.org/jira/browse/HADOOP-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700795#comment-14700795 ] Ajith S commented on HADOOP-12309: -- Okay :) Thanks for the input. [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException - Key: HADOOP-12309 URL: https://issues.apache.org/jira/browse/HADOOP-12309 Project: Hadoop Common Issue Type: Improvement Reporter: Ajith S Assignee: Ajith S Priority: Minor Attachments: HADOOP-12309.patch Can use java.lang.Throwable.addSuppressed(Throwable) instead of org.apache.hadoop.io.MultipleIOException as 1.7+ java provides support for this. org.apache.hadoop.io.MultipleIOException can be deprecated as for now {code} catch (IOException e) { if (generalException == null) { generalException = new IOException("General exception"); } generalException.addSuppressed(e); } {code}
[jira] [Commented] (HADOOP-12328) Hadoop/MapReduce ignores hidden (dot) files
[ https://issues.apache.org/jira/browse/HADOOP-12328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699061#comment-14699061 ] Ajith S commented on HADOOP-12328: -- Hi. Dot files are hidden files internal to the system, and by default, in org.apache.hadoop.mapred.FileInputFormat<K, V>, we have a hiddenFileFilter:
{code}
private static final PathFilter hiddenFileFilter = new PathFilter() {
  public boolean accept(Path p) {
    String name = p.getName();
    return !name.startsWith("_") && !name.startsWith(".");
  }
};
{code}
So in case you use TextInputFormat, SequenceFileInputFormat, etc., it is expected to ignore them. If you want to list them for some reason, you can override org.apache.hadoop.mapred.FileInputFormat.listStatus(JobConf) in your custom format class. Hadoop/MapReduce ignores hidden (dot) files --- Key: HADOOP-12328 URL: https://issues.apache.org/jira/browse/HADOOP-12328 Project: Hadoop Common Issue Type: Bug Reporter: Naga For some reason Hadoop/MapReduce does not pick up hidden (dot) files for processing.
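The filter's behaviour can be checked in isolation with a plain-Java stand-in for the name test (sketch only; the real filter receives an org.apache.hadoop.fs.Path and calls getName() on it):

```java
// Plain-Java stand-in for hiddenFileFilter's name check (sketch only; the
// real filter operates on org.apache.hadoop.fs.Path objects).
class HiddenNameFilter {
    static boolean accept(String name) {
        // Same rule as FileInputFormat: skip "_"-prefixed entries
        // (e.g. _SUCCESS, _logs) and "."-prefixed (hidden) entries.
        return !name.startsWith("_") && !name.startsWith(".");
    }
}
```

This is why _SUCCESS markers and .crc side files never reach mappers unless the input format's listStatus is overridden.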
[jira] [Assigned] (HADOOP-12328) Hadoop/MapReduce ignores hidden (dot) files
[ https://issues.apache.org/jira/browse/HADOOP-12328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S reassigned HADOOP-12328: Assignee: Ajith S Hadoop/MapReduce ignores hidden (dot) files --- Key: HADOOP-12328 URL: https://issues.apache.org/jira/browse/HADOOP-12328 Project: Hadoop Common Issue Type: Bug Reporter: Naga Assignee: Ajith S For some reason Hadoop/MapReduce does not pick up hidden (dot) files for processing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12309) [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException
[ https://issues.apache.org/jira/browse/HADOOP-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694543#comment-14694543 ] Ajith S commented on HADOOP-12309: -- Hi [~szetszwo] {quote} Otherwise, the stack trace won't be correct {quote} Thanks for the input. I have one small doubt: wouldn't the stack trace come from the line where I throw the exception, regardless of where I create the exception object? Please correct me if I am wrong. [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException - Key: HADOOP-12309 URL: https://issues.apache.org/jira/browse/HADOOP-12309 Project: Hadoop Common Issue Type: Improvement Reporter: Ajith S Assignee: Ajith S Priority: Minor Attachments: HADOOP-12309.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12309) [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException
[ https://issues.apache.org/jira/browse/HADOOP-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694546#comment-14694546 ] Ajith S commented on HADOOP-12309: -- Thanks for the input. I will change the patch accordingly [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException - Key: HADOOP-12309 URL: https://issues.apache.org/jira/browse/HADOOP-12309 Project: Hadoop Common Issue Type: Improvement Reporter: Ajith S Assignee: Ajith S Priority: Minor Attachments: HADOOP-12309.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11252) RPC client write does not time out by default
[ https://issues.apache.org/jira/browse/HADOOP-11252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681346#comment-14681346 ] Ajith S commented on HADOOP-11252: -- Patch looks good; can we commit this patch and track decoupling ping and timeout in a separate JIRA? RPC client write does not time out by default - Key: HADOOP-11252 URL: https://issues.apache.org/jira/browse/HADOOP-11252 Project: Hadoop Common Issue Type: Bug Components: ipc Affects Versions: 2.5.0 Reporter: Wilfred Spiegelenburg Assignee: Wilfred Spiegelenburg Priority: Critical Attachments: HADOOP-11252.patch The RPC client has a default timeout set to 0 when no timeout is passed in. This means that the network connection created will not time out when used to write data. The issue has shown up in YARN-2578 and HDFS-4858. Timeouts for writes then fall back to the TCP-level retry (configured via tcp_retries2) and time out after between 15 and 30 minutes, which is too long for a default behaviour. Using 0 as the default value for the timeout is incorrect. We should use a sane value for the timeout, and the ipc.ping.interval configuration value is a logical choice for it. The default behaviour should be changed from 0 to the value read for the ping interval from the Configuration. Fixing it in common makes more sense than finding and changing all other points in the code that do not pass in a timeout. Offending code lines: https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java#L488 and https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java#L350 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
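The default-resolution the report proposes (substitute the ping interval when no timeout was passed in) can be sketched as below. The method and constant names are hypothetical; the actual patch wires the Configuration's ipc.ping.interval value into the RPC client rather than hard-coding it.

```java
public class TimeoutDefaultDemo {
    // Hypothetical stand-in for the configured ipc.ping.interval (1 minute).
    static final int DEFAULT_PING_INTERVAL_MS = 60_000;

    // A requested timeout of 0 means "never time out" at the socket level,
    // which is the bug; fall back to the ping interval instead.
    static int effectiveTimeout(int requestedTimeoutMs, int pingIntervalMs) {
        return requestedTimeoutMs == 0 ? pingIntervalMs : requestedTimeoutMs;
    }

    public static void main(String[] args) {
        System.out.println(effectiveTimeout(0, DEFAULT_PING_INTERVAL_MS));     // prints 60000
        System.out.println(effectiveTimeout(5_000, DEFAULT_PING_INTERVAL_MS)); // prints 5000
    }
}
```

Explicit timeouts passed by callers are left untouched; only the "no timeout given" case changes, which is why fixing it once in common is preferable to chasing every call site.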
[jira] [Commented] (HADOOP-9654) IPC timeout doesn't seem to be kicking in
[ https://issues.apache.org/jira/browse/HADOOP-9654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681335#comment-14681335 ] Ajith S commented on HADOOP-9654: - As per [~rohithsharma]'s comments, the issue is the same as HADOOP-11252, so I suggest we close this, as most of the discussion is in the other JIRA. IPC timeout doesn't seem to be kicking in - Key: HADOOP-9654 URL: https://issues.apache.org/jira/browse/HADOOP-9654 Project: Hadoop Common Issue Type: Bug Components: ipc Affects Versions: 2.1.0-beta Reporter: Roman Shaposhnik Assignee: Ajith S During my Bigtop testing I made the NN OOM. This, in turn, made all of the clients stuck in the IPC call (even the new clients that I run *after* the NN went OOM). Here's an example of a jstack output on the client that was running: {noformat} $ hadoop fs -lsr / {noformat} Stacktrace: {noformat} /usr/java/jdk1.6.0_21/bin/jstack 19078 2013-06-19 23:14:00 Full thread dump Java HotSpot(TM) 64-Bit Server VM (17.0-b16 mixed mode): Attach Listener daemon prio=10 tid=0x7fcd8c8c1800 nid=0x5105 waiting on condition [0x] java.lang.Thread.State: RUNNABLE IPC Client (1223039541) connection to ip-10-144-82-213.ec2.internal/10.144.82.213:17020 from root daemon prio=10 tid=0x7fcd8c7ea000 nid=0x4aa0 runnable [0x7fcd443e2000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210) at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69) - locked 0x7fcd7529de18 (a sun.nio.ch.Util$1) - locked 0x7fcd7529de00 (a java.util.Collections$UnmodifiableSet) - locked 0x7fcd7529da80 (a sun.nio.ch.EPollSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80) at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335) at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157) at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131) at java.io.FilterInputStream.read(FilterInputStream.java:116) at java.io.FilterInputStream.read(FilterInputStream.java:116) at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:421) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) - locked 0x7fcd752aaf18 (a java.io.BufferedInputStream) at java.io.DataInputStream.readInt(DataInputStream.java:370) at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:943) at org.apache.hadoop.ipc.Client$Connection.run(Client.java:840) Low Memory Detector daemon prio=10 tid=0x7fcd8c09 nid=0x4a9b runnable [0x] java.lang.Thread.State: RUNNABLE CompilerThread1 daemon prio=10 tid=0x7fcd8c08d800 nid=0x4a9a waiting on condition [0x] java.lang.Thread.State: RUNNABLE CompilerThread0 daemon prio=10 tid=0x7fcd8c08a800 nid=0x4a99 waiting on condition [0x] java.lang.Thread.State: RUNNABLE Signal Dispatcher daemon prio=10 tid=0x7fcd8c088800 nid=0x4a98 runnable [0x] java.lang.Thread.State: RUNNABLE Finalizer daemon prio=10 tid=0x7fcd8c06a000 nid=0x4a97 in Object.wait() [0x7fcd902e9000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x7fcd75fc0470 (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118) - locked 0x7fcd75fc0470 (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159) Reference Handler daemon prio=10 tid=0x7fcd8c068000 nid=0x4a96 in Object.wait() [0x7fcd903ea000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x7fcd75fc0550 (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:485) 
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116) - locked 0x7fcd75fc0550 (a java.lang.ref.Reference$Lock) main prio=10 tid=0x7fcd8c00a800 nid=0x4a92 in Object.wait() [0x7fcd91b06000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on
[jira] [Resolved] (HADOOP-9654) IPC timeout doesn't seem to be kicking in
[ https://issues.apache.org/jira/browse/HADOOP-9654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S resolved HADOOP-9654. - Resolution: Duplicate Closing the issue as a duplicate; if anyone disagrees, please feel free to reopen. IPC timeout doesn't seem to be kicking in - Key: HADOOP-9654 URL: https://issues.apache.org/jira/browse/HADOOP-9654 Project: Hadoop Common Issue Type: Bug Components: ipc Affects Versions: 2.1.0-beta Reporter: Roman Shaposhnik Assignee: Ajith S During my Bigtop testing I made the NN OOM. This, in turn, made all of the clients stuck in the IPC call (even the new clients that I run *after* the NN went OOM).
[jira] [Commented] (HADOOP-9654) IPC timeout doesn't seem to be kicking in
[ https://issues.apache.org/jira/browse/HADOOP-9654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681334#comment-14681334 ] Ajith S commented on HADOOP-9654: - +1 IPC timeout doesn't seem to be kicking in - Key: HADOOP-9654 URL: https://issues.apache.org/jira/browse/HADOOP-9654 Project: Hadoop Common Issue Type: Bug Components: ipc Affects Versions: 2.1.0-beta Reporter: Roman Shaposhnik Assignee: Ajith S During my Bigtop testing I made the NN OOM. This, in turn, made all of the clients stuck in the IPC call (even the new clients that I run *after* the NN went OOM).
[jira] [Updated] (HADOOP-12309) [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException
[ https://issues.apache.org/jira/browse/HADOOP-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-12309: - Attachment: HADOOP-12309.patch Submitting patch, please review. [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException - Key: HADOOP-12309 URL: https://issues.apache.org/jira/browse/HADOOP-12309 Project: Hadoop Common Issue Type: Improvement Reporter: Ajith S Assignee: Ajith S Priority: Minor Attachments: HADOOP-12309.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12309) [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException
[ https://issues.apache.org/jira/browse/HADOOP-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-12309: - Status: Patch Available (was: Open) [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException - Key: HADOOP-12309 URL: https://issues.apache.org/jira/browse/HADOOP-12309 Project: Hadoop Common Issue Type: Improvement Reporter: Ajith S Assignee: Ajith S Priority: Minor Attachments: HADOOP-12309.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-9654) IPC timeout doesn't seem to be kicking in
[ https://issues.apache.org/jira/browse/HADOOP-9654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681206#comment-14681206 ] Ajith S commented on HADOOP-9654: - +1 for [~rvs]. I think we can introduce a new *ipc.client.timeout* property that can be used in case ipc.client.ping=false (which is the default now). -1 is not a reasonable timeout value; we can set the new property to, say, 3600 seconds? Reasonable? IPC timeout doesn't seem to be kicking in - Key: HADOOP-9654 URL: https://issues.apache.org/jira/browse/HADOOP-9654 Project: Hadoop Common Issue Type: Bug Components: ipc Affects Versions: 2.1.0-beta Reporter: Roman Shaposhnik Assignee: Ajith S During my Bigtop testing I made the NN OOM. This, in turn, made all of the clients stuck in the IPC call (even the new clients that I run *after* the NN went OOM).
[jira] [Updated] (HADOOP-12253) ViewFileSystem getFileStatus java.lang.ArrayIndexOutOfBoundsException: 0
[ https://issues.apache.org/jira/browse/HADOOP-12253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-12253: - Attachment: HADOOP-12253.patch Will avoid ArrayIndexOutOfBoundsException ViewFileSystem getFileStatus java.lang.ArrayIndexOutOfBoundsException: 0 Key: HADOOP-12253 URL: https://issues.apache.org/jira/browse/HADOOP-12253 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.6.0 Environment: hadoop 2.6.0, hive 1.1.0, tez 0.7, centos 6.4 Reporter: tangjunjie Assignee: Ajith S Attachments: HADOOP-12253.patch When I enable HDFS federation and run a query on Hive on Tez, it throws an exception: {noformat} 8.784 PM WARN org.apache.hadoop.security.UserGroupInformation No groups available for user tangjijun 3:12:28.784 PM ERROR org.apache.hadoop.hive.ql.exec.Task Failed to execute tez graph. java.lang.ArrayIndexOutOfBoundsException: 0 at org.apache.hadoop.fs.viewfs.ViewFileSystem$InternalDirOfViewFs.getFileStatus(ViewFileSystem.java:771) at org.apache.hadoop.fs.viewfs.ViewFileSystem.getFileStatus(ViewFileSystem.java:359) at org.apache.tez.client.TezClientUtils.checkAncestorPermissionsForAllUsers(TezClientUtils.java:955) at org.apache.tez.client.TezClientUtils.setupTezJarsLocalResources(TezClientUtils.java:184) at org.apache.tez.client.TezClient.getTezJarResources(TezClient.java:787) at org.apache.tez.client.TezClient.start(TezClient.java:337) at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:191) at org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:234) at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:136) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1640) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1399) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183) at 
org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:144) at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69) at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} Digging into the issue, I found the following code snippet in ViewFileSystem.java:
{noformat}
@Override
public FileStatus getFileStatus(Path f) throws IOException {
  checkPathIsSlash(f);
  return new FileStatus(0, true, 0, 0, creationTime, creationTime,
      PERMISSION_555, ugi.getUserName(), ugi.getGroupNames()[0],
      new Path(theInternalDir.fullPath).makeQualified(myUri, ROOT_PATH));
}
{noformat}
If the node in the cluster hasn't created a user like tangjijun, ugi.getGroupNames()[0] will throw ArrayIndexOutOfBoundsException, because no user means no group. I created user tangjijun on that node, and then the job executed normally. I think this code should check whether ugi.getGroupNames() is empty; when it is empty, print a log message instead of throwing an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
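The guard the reporter suggests can be sketched without a Hadoop dependency. The class and method names below are hypothetical, and the choice of falling back to the user name is illustrative only, not the committed fix:

```java
public class PrimaryGroupDemo {
    // Guarded lookup: log and fall back instead of indexing an empty
    // group array, which is what triggers the AIOOBE in the report above.
    static String primaryGroup(String user, String[] groups) {
        if (groups == null || groups.length == 0) {
            System.err.println("No groups available for user " + user);
            return user; // fallback choice is illustrative
        }
        return groups[0];
    }

    public static void main(String[] args) {
        System.out.println(primaryGroup("tangjijun", new String[0]));      // prints tangjijun
        System.out.println(primaryGroup("alice", new String[]{"staff"}));  // prints staff
    }
}
```

Since getFileStatus here only fabricates metadata for an internal mount-point directory, a degraded group name is far less harmful than failing the whole Tez session setup.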
[jira] [Updated] (HADOOP-12253) ViewFileSystem getFileStatus java.lang.ArrayIndexOutOfBoundsException: 0
[ https://issues.apache.org/jira/browse/HADOOP-12253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-12253: - Status: Patch Available (was: Open) Submitting patch. Please review ViewFileSystem getFileStatus java.lang.ArrayIndexOutOfBoundsException: 0 Key: HADOOP-12253 URL: https://issues.apache.org/jira/browse/HADOOP-12253 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.6.0 Environment: hadoop 2.6.0 hive 1.1.0 tez0.7 cenos6.4 Reporter: tangjunjie Assignee: Ajith S Attachments: HADOOP-12253.patch When I enable hdfs federation.I run a query on hive on tez. Then it occur a exception: {noformat} 8.784 PM WARNorg.apache.hadoop.security.UserGroupInformation No groups available for user tangjijun 3:12:28.784 PMERROR org.apache.hadoop.hive.ql.exec.Task Failed to execute tez graph. java.lang.ArrayIndexOutOfBoundsException: 0 at org.apache.hadoop.fs.viewfs.ViewFileSystem$InternalDirOfViewFs.getFileStatus(ViewFileSystem.java:771) at org.apache.hadoop.fs.viewfs.ViewFileSystem.getFileStatus(ViewFileSystem.java:359) at org.apache.tez.client.TezClientUtils.checkAncestorPermissionsForAllUsers(TezClientUtils.java:955) at org.apache.tez.client.TezClientUtils.setupTezJarsLocalResources(TezClientUtils.java:184) at org.apache.tez.client.TezClient.getTezJarResources(TezClient.java:787) at org.apache.tez.client.TezClient.start(TezClient.java:337) at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:191) at org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:234) at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:136) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1640) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1399) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183) at 
org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:144) at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69) at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} Digging into the issue, I found the following code snippet in ViewFileSystem.java: {noformat} @Override public FileStatus getFileStatus(Path f) throws IOException { checkPathIsSlash(f); return new FileStatus(0, true, 0, 0, creationTime, creationTime, PERMISSION_555, ugi.getUserName(), ugi.getGroupNames()[0], new Path(theInternalDir.fullPath).makeQualified( myUri, ROOT_PATH)); } {noformat} If the cluster node does not have a user such as tangjijun, ugi.getGroupNames()[0] will throw ArrayIndexOutOfBoundsException, because a missing user means no groups. After I created user tangjijun on that node, the job was executed normally. I think this code should check whether ugi.getGroupNames() is empty and, when it is, log a message instead of throwing an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
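A minimal, self-contained sketch of the defensive check proposed above. Plain arguments stand in for ugi.getUserName() and ugi.getGroupNames(), and the helper name primaryGroup() is hypothetical; the actual fix would live inside ViewFileSystem.

```java
public class GroupFallback {
    // Sketch of the proposed behavior: log and fall back to the user name
    // when the group list is empty, instead of indexing groups[0] and
    // throwing ArrayIndexOutOfBoundsException.
    static String primaryGroup(String userName, String[] groups) {
        if (groups == null || groups.length == 0) {
            System.err.println("No groups available for user " + userName);
            return userName;
        }
        return groups[0];
    }

    public static void main(String[] args) {
        System.out.println(primaryGroup("tangjijun", new String[0]));      // tangjijun
        System.out.println(primaryGroup("tangjijun", new String[] {"hadoop"})); // hadoop
    }
}
```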
[jira] [Created] (HADOOP-12309) [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException
Ajith S created HADOOP-12309: Summary: [Refactor] Use java.lang.Throwable.addSuppressed(Throwable) instead of class org.apache.hadoop.io.MultipleIOException Key: HADOOP-12309 URL: https://issues.apache.org/jira/browse/HADOOP-12309 Project: Hadoop Common Issue Type: Improvement Reporter: Ajith S Assignee: Ajith S Priority: Minor We can use java.lang.Throwable.addSuppressed(Throwable) instead of org.apache.hadoop.io.MultipleIOException, since Java 1.7+ provides support for this. org.apache.hadoop.io.MultipleIOException can be deprecated for now. {code} catch (IOException e) { if (generalException == null) { generalException = new IOException("General exception"); } generalException.addSuppressed(e); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
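A self-contained sketch of the pattern above, using only the JDK: several failures are collected into one IOException via Throwable.addSuppressed (Java 7+), instead of wrapping them in org.apache.hadoop.io.MultipleIOException. The collect() helper is hypothetical, for illustration only.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class SuppressedDemo {
    // Fold a list of failures into a single exception; each failure
    // travels along as a suppressed throwable.
    static IOException collect(List<IOException> failures) {
        IOException generalException = null;
        for (IOException e : failures) {
            if (generalException == null) {
                generalException = new IOException("General exception");
            }
            generalException.addSuppressed(e);
        }
        return generalException;
    }

    public static void main(String[] args) {
        IOException combined = collect(Arrays.asList(
                new IOException("first"), new IOException("second")));
        // Both underlying causes are preserved on the one exception.
        System.out.println(combined.getSuppressed().length); // 2
    }
}
```

Callers that catch the combined exception can recover every individual failure via getSuppressed(), which is what MultipleIOException provided.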
[jira] [Assigned] (HADOOP-10019) make symlinks production-ready
[ https://issues.apache.org/jira/browse/HADOOP-10019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S reassigned HADOOP-10019: Assignee: Ajith S make symlinks production-ready -- Key: HADOOP-10019 URL: https://issues.apache.org/jira/browse/HADOOP-10019 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 2.3.0 Reporter: Colin Patrick McCabe Assignee: Ajith S This is an umbrella JIRA for all the things we have to do to make symlinks production-ready for Hadoop 2.3. Note that some of these subtasks are scheduled for 2.1.2 / 2.2, but the overall effort is for 2.3. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12253) ViewFileSystem getFileStatus java.lang.ArrayIndexOutOfBoundsException: 0
[ https://issues.apache.org/jira/browse/HADOOP-12253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653190#comment-14653190 ] Ajith S commented on HADOOP-12253: -- Hi. {noformat}ugi.getGroupNames()[0]{noformat} also seems to cause trouble in other commands such as listStatus or getAclStatus. So as a fix, can we check that the user belongs to at least one group when ViewFileSystem is initialized, i.e. in the ViewFileSystem() constructor, so that the issue surfaces during the setup stage? ViewFileSystem getFileStatus java.lang.ArrayIndexOutOfBoundsException: 0 Key: HADOOP-12253 URL: https://issues.apache.org/jira/browse/HADOOP-12253 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.6.0 Environment: hadoop 2.6.0 hive 1.1.0 tez0.7 cenos6.4 Reporter: tangjunjie Assignee: Ajith S -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HADOOP-12253) ViewFileSystem getFileStatus java.lang.ArrayIndexOutOfBoundsException: 0
[ https://issues.apache.org/jira/browse/HADOOP-12253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S reassigned HADOOP-12253: Assignee: Ajith S ViewFileSystem getFileStatus java.lang.ArrayIndexOutOfBoundsException: 0 Key: HADOOP-12253 URL: https://issues.apache.org/jira/browse/HADOOP-12253 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.6.0 Environment: hadoop 2.6.0 hive 1.1.0 tez0.7 cenos6.4 Reporter: tangjunjie Assignee: Ajith S -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HADOOP-9654) IPC timeout doesn't seem to be kicking in
[ https://issues.apache.org/jira/browse/HADOOP-9654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S reassigned HADOOP-9654: --- Assignee: Ajith S IPC timeout doesn't seem to be kicking in - Key: HADOOP-9654 URL: https://issues.apache.org/jira/browse/HADOOP-9654 Project: Hadoop Common Issue Type: Bug Components: ipc Affects Versions: 2.1.0-beta Reporter: Roman Shaposhnik Assignee: Ajith S During my Bigtop testing I made the NN OOM. This, in turn, made all of the clients stuck in the IPC call (even the new clients that I run *after* the NN went OOM). Here's an example of a jstack output on the client that was running: {noformat} $ hadoop fs -lsr / {noformat} Stacktrace: {noformat} /usr/java/jdk1.6.0_21/bin/jstack 19078 2013-06-19 23:14:00 Full thread dump Java HotSpot(TM) 64-Bit Server VM (17.0-b16 mixed mode): Attach Listener daemon prio=10 tid=0x7fcd8c8c1800 nid=0x5105 waiting on condition [0x] java.lang.Thread.State: RUNNABLE IPC Client (1223039541) connection to ip-10-144-82-213.ec2.internal/10.144.82.213:17020 from root daemon prio=10 tid=0x7fcd8c7ea000 nid=0x4aa0 runnable [0x7fcd443e2000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210) at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69) - locked 0x7fcd7529de18 (a sun.nio.ch.Util$1) - locked 0x7fcd7529de00 (a java.util.Collections$UnmodifiableSet) - locked 0x7fcd7529da80 (a sun.nio.ch.EPollSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80) at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335) at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131) at 
java.io.FilterInputStream.read(FilterInputStream.java:116) at java.io.FilterInputStream.read(FilterInputStream.java:116) at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:421) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) - locked 0x7fcd752aaf18 (a java.io.BufferedInputStream) at java.io.DataInputStream.readInt(DataInputStream.java:370) at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:943) at org.apache.hadoop.ipc.Client$Connection.run(Client.java:840) Low Memory Detector daemon prio=10 tid=0x7fcd8c09 nid=0x4a9b runnable [0x] java.lang.Thread.State: RUNNABLE CompilerThread1 daemon prio=10 tid=0x7fcd8c08d800 nid=0x4a9a waiting on condition [0x] java.lang.Thread.State: RUNNABLE CompilerThread0 daemon prio=10 tid=0x7fcd8c08a800 nid=0x4a99 waiting on condition [0x] java.lang.Thread.State: RUNNABLE Signal Dispatcher daemon prio=10 tid=0x7fcd8c088800 nid=0x4a98 runnable [0x] java.lang.Thread.State: RUNNABLE Finalizer daemon prio=10 tid=0x7fcd8c06a000 nid=0x4a97 in Object.wait() [0x7fcd902e9000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x7fcd75fc0470 (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118) - locked 0x7fcd75fc0470 (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159) Reference Handler daemon prio=10 tid=0x7fcd8c068000 nid=0x4a96 in Object.wait() [0x7fcd903ea000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x7fcd75fc0550 (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:485) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116) - locked 0x7fcd75fc0550 (a java.lang.ref.Reference$Lock) main prio=10 
tid=0x7fcd8c00a800 nid=0x4a92 in Object.wait() [0x7fcd91b06000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x7fcd752528e8 (a org.apache.hadoop.ipc.Client$Call) at java.lang.Object.wait(Object.java:485) at org.apache.hadoop.ipc.Client.call(Client.java:1284) -
[jira] [Assigned] (HADOOP-9819) FileSystem#rename is broken, deletes target when renaming link to itself
[ https://issues.apache.org/jira/browse/HADOOP-9819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S reassigned HADOOP-9819: --- Assignee: Ajith S FileSystem#rename is broken, deletes target when renaming link to itself Key: HADOOP-9819 URL: https://issues.apache.org/jira/browse/HADOOP-9819 Project: Hadoop Common Issue Type: Sub-task Components: fs Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Ajith S Uncovered while fixing TestSymlinkLocalFsFileSystem on Windows. This block of code deletes the symlink; the correct behavior is to do nothing. {code:java} try { dstStatus = getFileLinkStatus(dst); } catch (IOException e) { dstStatus = null; } if (dstStatus != null) { if (srcStatus.isDirectory() != dstStatus.isDirectory()) { throw new IOException("Source " + src + " Destination " + dst + " both should be either file or directory"); } if (!overwrite) { throw new FileAlreadyExistsException("rename destination " + dst + " already exists."); } // Delete the destination that is a file or an empty directory if (dstStatus.isDirectory()) { FileStatus[] list = listStatus(dst); if (list != null && list.length != 0) { throw new IOException( "rename cannot overwrite non empty destination directory " + dst); } } delete(dst, false); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-7614) Reloading configuration when using imputstream resources results in org.xml.sax.SAXParseException
[ https://issues.apache.org/jira/browse/HADOOP-7614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-7614: Labels: (was: BB2015-05-TBR) Reloading configuration when using imputstream resources results in org.xml.sax.SAXParseException - Key: HADOOP-7614 URL: https://issues.apache.org/jira/browse/HADOOP-7614 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 0.21.0 Reporter: Ferdy Galema Priority: Minor Fix For: 2.7.0 Attachments: HADOOP-7614-v1.patch, HADOOP-7614-v2.patch When using an inputstream as a resource for configuration, reloading this configuration will throw the following exception: Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException: Premature end of file. at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1576) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1445) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1381) at org.apache.hadoop.conf.Configuration.get(Configuration.java:569) ... Caused by: org.xml.sax.SAXParseException: Premature end of file. at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:249) at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284) at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1504) ... 4 more To reproduce, see the following test code: Configuration conf = new Configuration(); ByteArrayInputStream bais = new ByteArrayInputStream("<configuration></configuration>".getBytes()); conf.addResource(bais); System.out.println(conf.get("blah")); conf.addResource("core-site.xml"); //just add a named resource, doesn't matter which one System.out.println(conf.get("blah")); Allowing inputstream resources is flexible, but in cases such as this it can lead to difficult-to-debug problems. What do you think is the best solution? 
We could: A) reset the inputstream after it is read instead of closing it (but what to do when the stream does not support marking?) B) leave it up to the client (for example, make sure you implement close() so that it resets the stream) C) when reading the inputstream for the first time, cache or wrap the contents somehow so that it can be read multiple times (let's at least document it) D) remove the inputstream method altogether E) something else? For now I have attached a patch for solution A. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
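The failure mode and solution A above can be shown in miniature with plain JDK streams: once the stream has been consumed, a second parse sees zero bytes, which is exactly the "Premature end of file" the XML parser reports. The drain() helper is hypothetical; ByteArrayInputStream stands in for the Configuration resource stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReloadDemo {
    // Consume the stream to the end, counting bytes read.
    static int drain(InputStream in) throws IOException {
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayInputStream bais =
                new ByteArrayInputStream("<configuration></configuration>".getBytes());
        System.out.println(drain(bais)); // 31 bytes on the first read
        System.out.println(drain(bais)); // 0 on the second read: the reload problem
        // Solution A: reset after reading, when the stream supports it.
        bais.reset();
        System.out.println(drain(bais)); // 31 again after the reset
    }
}
```

Note the caveat raised in option A: ByteArrayInputStream always supports reset, but an arbitrary InputStream only does if markSupported() is true, so the patch has to handle the unsupported case.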
[jira] [Updated] (HADOOP-7614) Reloading configuration when using imputstream resources results in org.xml.sax.SAXParseException
[ https://issues.apache.org/jira/browse/HADOOP-7614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-7614: Resolution: Cannot Reproduce Fix Version/s: 2.7.0 Status: Resolved (was: Patch Available) Tested the issue in 2.7.0 and in trunk. Unable to reproduce using the testcase. Closing the issue. Feel free to reopen. Reloading configuration when using imputstream resources results in org.xml.sax.SAXParseException - Key: HADOOP-7614 URL: https://issues.apache.org/jira/browse/HADOOP-7614 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 0.21.0 Reporter: Ferdy Galema Priority: Minor Fix For: 2.7.0 Attachments: HADOOP-7614-v1.patch, HADOOP-7614-v2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HADOOP-7814) Add ability to add the contents of a properties file to Configuration
[ https://issues.apache.org/jira/browse/HADOOP-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S reassigned HADOOP-7814: --- Assignee: Ajith S Add ability to add the contents of a properties file to Configuration - Key: HADOOP-7814 URL: https://issues.apache.org/jira/browse/HADOOP-7814 Project: Hadoop Common Issue Type: Improvement Affects Versions: 1.0.0 Reporter: Kristofer Tomasette Assignee: Ajith S Priority: Minor Attachments: HADOOP-7814.patch, HADOOP-7814.patch, HADOOP-7814.patch Original Estimate: 2h Remaining Estimate: 2h Add a method to Configuration that will take a location on the local filesystem of a properties file. Method should read in the file's properties and add them to the Configuration object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
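A hedged sketch of what the requested method could do: load a java.util.Properties source and copy each entry across, the way Configuration.set(name, value) would be invoked. The helper name addPropertiesResource() and the Map standing in for Configuration are hypothetical, not the patch's actual API.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class PropsToConf {
    // Parse a properties source and return its entries as name/value
    // pairs, mirroring what Configuration.set would receive.
    static Map<String, String> addPropertiesResource(StringReader src)
            throws IOException {
        Properties props = new Properties();
        props.load(src);
        Map<String, String> conf = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            conf.put(name, props.getProperty(name));
        }
        return conf;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> conf = addPropertiesResource(new StringReader(
                "fs.defaultFS=hdfs://nn:8020\nio.file.buffer.size=4096\n"));
        System.out.println(conf.get("fs.defaultFS")); // hdfs://nn:8020
    }
}
```

In the real patch the input would presumably be a local filesystem path opened as a FileReader rather than a StringReader.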
[jira] [Updated] (HADOOP-7814) Add ability to add the contents of a properties file to Configuration
[ https://issues.apache.org/jira/browse/HADOOP-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-7814: Labels: (was: BB2015-05-TBR) Add ability to add the contents of a properties file to Configuration - Key: HADOOP-7814 URL: https://issues.apache.org/jira/browse/HADOOP-7814 Project: Hadoop Common Issue Type: Improvement Affects Versions: 1.0.0 Reporter: Kristofer Tomasette Priority: Minor Attachments: HADOOP-7814.patch, HADOOP-7814.patch, HADOOP-7814.patch Original Estimate: 2h Remaining Estimate: 2h Add a method to Configuration that will take a location on the local filesystem of a properties file. Method should read in the file's properties and add them to the Configuration object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-7814) Add ability to add the contents of a properties file to Configuration
[ https://issues.apache.org/jira/browse/HADOOP-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533999#comment-14533999 ] Ajith S commented on HADOOP-7814: - I would like to work on this Add ability to add the contents of a properties file to Configuration - Key: HADOOP-7814 URL: https://issues.apache.org/jira/browse/HADOOP-7814 Project: Hadoop Common Issue Type: Improvement Affects Versions: 1.0.0 Reporter: Kristofer Tomasette Assignee: Ajith S Priority: Minor Attachments: HADOOP-7814.patch, HADOOP-7814.patch, HADOOP-7814.patch Original Estimate: 2h Remaining Estimate: 2h Add a method to Configuration that will take a location on the local filesystem of a properties file. Method should read in the file's properties and add them to the Configuration object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-7814) Add ability to add the contents of a properties file to Configuration
[ https://issues.apache.org/jira/browse/HADOOP-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-7814: Status: Open (was: Patch Available) I would like to work on this. I am cancelling the patch and will upload a new one Add ability to add the contents of a properties file to Configuration - Key: HADOOP-7814 URL: https://issues.apache.org/jira/browse/HADOOP-7814 Project: Hadoop Common Issue Type: Improvement Affects Versions: 1.0.0 Reporter: Kristofer Tomasette Assignee: Ajith S Priority: Minor Attachments: HADOOP-7814.patch, HADOOP-7814.patch, HADOOP-7814.patch Original Estimate: 2h Remaining Estimate: 2h Add a method to Configuration that will take a location on the local filesystem of a properties file. Method should read in the file's properties and add them to the Configuration object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-8087) Paths that start with a double slash cause No filesystem for scheme: null errors
[ https://issues.apache.org/jira/browse/HADOOP-8087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-8087: Labels: (was: BB2015-05-TBR) Paths that start with a double slash cause No filesystem for scheme: null errors -- Key: HADOOP-8087 URL: https://issues.apache.org/jira/browse/HADOOP-8087 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.23.0, 0.24.0 Reporter: Daryn Sharp Assignee: Rajesh Kartha Attachments: HADOOP-8087.001.patch, HADOOP-8087.002.patch, HADOOP-8087.003.patch, HADOOP-8087.004.patch, HADOOP-8087.005.patch {{Path}} is incorrectly parsing {{//dir/path}} in a very unexpected way. While it should translate to the directory {{${fs.default.name}/dir/path}}, it instead discards the {{//dir}} and returns {{${fs.default.name}/path}}. The problem is that {{Path}} tries to parse an authority even when a scheme is not present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-8087) Paths that start with a double slash cause No filesystem for scheme: null errors
[ https://issues.apache.org/jira/browse/HADOOP-8087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14534247#comment-14534247 ] Ajith S commented on HADOOP-8087: - Looks like the issue is similar to HADOOP-7418. I'm closing this as a duplicate; please feel free to reopen if someone feels it's valid. [Reference discussion thread|https://issues.apache.org/jira/browse/HADOOP-7418?focusedCommentId=14534243page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14534243] Paths that start with a double slash cause No filesystem for scheme: null errors -- Key: HADOOP-8087 URL: https://issues.apache.org/jira/browse/HADOOP-8087 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.23.0, 0.24.0 Reporter: Daryn Sharp Assignee: Rajesh Kartha Labels: BB2015-05-TBR Attachments: HADOOP-8087.001.patch, HADOOP-8087.002.patch, HADOOP-8087.003.patch, HADOOP-8087.004.patch, HADOOP-8087.005.patch {{Path}} is incorrectly parsing {{//dir/path}} in a very unexpected way. While it should translate to the directory {{${fs.default.name}/dir/path}}, it instead discards the {{//dir}} and returns {{${fs.default.name}/path}}. The problem is that {{Path}} tries to parse an authority even when a scheme is not present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-8087) Paths that start with a double slash cause No filesystem for scheme: null errors
[ https://issues.apache.org/jira/browse/HADOOP-8087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-8087: Resolution: Won't Fix Status: Resolved (was: Patch Available) Paths that start with a double slash cause No filesystem for scheme: null errors -- Key: HADOOP-8087 URL: https://issues.apache.org/jira/browse/HADOOP-8087 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.23.0, 0.24.0 Reporter: Daryn Sharp Assignee: Rajesh Kartha Attachments: HADOOP-8087.001.patch, HADOOP-8087.002.patch, HADOOP-8087.003.patch, HADOOP-8087.004.patch, HADOOP-8087.005.patch {{Path}} is incorrectly parsing {{//dir/path}} in a very unexpected way. While it should translate to the directory {{$fs.default.name}/dir/path}}, it instead discards the {{//dir}} and returns {{$fs.default.name/path}}. The problem is {{Path}} is trying to parsing an authority even when a scheme is not present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-9723) Improve error message when hadoop archive output path already exists
[ https://issues.apache.org/jira/browse/HADOOP-9723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-9723: Labels: BB2015-05-RFC (was: BB-2015-05-RFC) Improve error message when hadoop archive output path already exists Key: HADOOP-9723 URL: https://issues.apache.org/jira/browse/HADOOP-9723 Project: Hadoop Common Issue Type: Improvement Affects Versions: 3.0.0, 2.0.4-alpha Reporter: Stephen Chu Assignee: Akira AJISAKA Priority: Trivial Labels: BB2015-05-RFC Attachments: HADOOP-9723.2.patch, HADOOP-9723.patch When creating a hadoop archive and specifying an output path of an already existing file, we get an Invalid Output error message. {code} [schu@hdfs-vanilla-1 ~]$ hadoop archive -archiveName foo.har -p /user/schu testDir1 /user/schu Invalid Output: /user/schu/foo.har {code} This error can be improved to tell users immediately that the output path already exists. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-8174) Remove confusing comment in Path#isAbsolute()
[ https://issues.apache.org/jira/browse/HADOOP-8174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HADOOP-8174: Labels: BB2015-05-RFC (was: BB2015-05-TBR) Remove confusing comment in Path#isAbsolute() - Key: HADOOP-8174 URL: https://issues.apache.org/jira/browse/HADOOP-8174 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Labels: BB2015-05-RFC Attachments: HADOOP-8174.patch, HADOOP-8174.txt The method is checking for absolute path correctly. When scheme and authority are present, the path is absolute. So the following comment needs to be removed. {noformat} /** * There is some ambiguity here. An absolute path is a slash * relative name without a scheme or an authority. * So either this method was incorrectly named or its * implementation is incorrect. This method returns true * even if there is a scheme and authority. */ {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-8087) Paths that start with a double slash cause No filesystem for scheme: null errors
[ https://issues.apache.org/jira/browse/HADOOP-8087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534327#comment-14534327 ]

Ajith S commented on HADOOP-8087:
---------------------------------
Thanks for the pointers. I have a query: according to http://tools.ietf.org/html/rfc3986#section-3.3, "If a URI does not contain an authority component, then the path cannot begin with two slash characters (//)". I find this conflicting with your comment. Please correct me if I'm wrong.

Paths that start with a double slash cause No filesystem for scheme: null errors
--------------------------------------------------------------------------------
Key: HADOOP-8087
URL: https://issues.apache.org/jira/browse/HADOOP-8087
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 0.23.0, 0.24.0
Reporter: Daryn Sharp
Assignee: Rajesh Kartha
Attachments: HADOOP-8087.001.patch, HADOOP-8087.002.patch, HADOOP-8087.003.patch, HADOOP-8087.004.patch, HADOOP-8087.005.patch

{{Path}} is incorrectly parsing {{//dir/path}} in a very unexpected way. While it should translate to the directory {{$fs.default.name/dir/path}}, it instead discards the {{//dir}} and returns {{$fs.default.name/path}}. The problem is that {{Path}} is trying to parse an authority even when a scheme is not present.
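The RFC 3986 rule under discussion can be observed directly with java.net.URI, which parses a scheme-less string beginning with `//` as a network-path reference whose first segment becomes the authority. This is a standalone demonstration of the standard parsing behaviour, not Hadoop's own Path code:

```java
import java.net.URI;

public class DoubleSlashDemo {
    public static void main(String[] args) {
        // A scheme-less string starting with "//" is a network-path
        // reference: "dir" is parsed as the authority, not the path.
        URI u = URI.create("//dir/path");
        System.out.println(u.getAuthority()); // dir
        System.out.println(u.getPath());      // /path

        // A single leading slash keeps the whole string as the path.
        URI v = URI.create("/dir/path");
        System.out.println(v.getAuthority()); // null
        System.out.println(v.getPath());      // /dir/path
    }
}
```

This matches the bug description: a parser that looks for an authority whenever it sees `//`, regardless of whether a scheme is present, will silently swallow the first path segment of `//dir/path`.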
[jira] [Commented] (HADOOP-7947) Validate XMLs if a relevant tool is available, when using scripts
[ https://issues.apache.org/jira/browse/HADOOP-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534239#comment-14534239 ]

Ajith S commented on HADOOP-7947:
---------------------------------
Thanks for the patch. I was testing it and have a few comments/questions.

1. According to the help, *-conffile* can accept a file or a folder, but when we give neither, it throws a stack trace. It would be better to instead return a normal message like "Expected a File or Folder". Just a suggestion.
{quote}
root@voltan:/opt/app/hadoop-2.7.0/bin# ./hadoop conftest -h
Usage: hadoop conftest [-conffile path|-h|--help]
Options:
-conffile path   If specified, that path will be verified. You can specify this option multiple times. Both file and directory are acceptable. If a directory is specified, the files whose names end with .xml in that directory will be verified. If not specified, the files whose names end with .xml in $HADOOP_CONF_DIR will be verified.
-h, --help       Print this help

root@voltan:/opt/app/hadoop-2.7.0/bin# ./hadoop conftest -conffile
Exception in thread "main" org.apache.commons.cli.MissingArgumentException: Missing argument for option: conffile
at org.apache.commons.cli.Parser.processArgs(Parser.java:343)
at org.apache.commons.cli.Parser.processOption(Parser.java:393)
at org.apache.commons.cli.Parser.parse(Parser.java:199)
at org.apache.commons.cli.Parser.parse(Parser.java:85)
at org.apache.hadoop.util.ConfTest.main(ConfTest.java:220)
{quote}
2. The *conftest -conf* behaviour is quite confusing when the argument is missing: it prints the help *and* also validates the default folder. Is this fine? I feel in this scenario it should print the error with the generic usage and return.
{quote}
root@voltan:/opt/app/hadoop-2.7.0/bin# ./hadoop conftest -conf
15/05/08 15:03:25 WARN util.GenericOptionsParser: options parsing failed: Missing argument for option: conf
usage: general options are:
-archives paths                 comma separated archives to be unarchived on the compute machines.
-conf configuration file        specify an application configuration file
-D property=value               use value for given property
-files paths                    comma separated files to be copied to the map reduce cluster
-fs local|namenode:port         specify a namenode
-jt local|resourcemanager:port  specify a ResourceManager
-libjars paths                  comma separated jar files to include in the classpath.
-tokenCacheFile tokensFile      name of the file with the tokens
/opt/app/hadoop-2.7.0/etc/hadoop/hdfs-site.xml: valid
/opt/app/hadoop-2.7.0/etc/hadoop/capacity-scheduler.xml: valid
/opt/app/hadoop-2.7.0/etc/hadoop/hadoop-policy.xml: valid
/opt/app/hadoop-2.7.0/etc/hadoop/core-site.xml: valid
/opt/app/hadoop-2.7.0/etc/hadoop/kms-acls.xml: valid
/opt/app/hadoop-2.7.0/etc/hadoop/kms-site.xml: valid
/opt/app/hadoop-2.7.0/etc/hadoop/mapred-site.xml: valid
/opt/app/hadoop-2.7.0/etc/hadoop/yarn-site.xml: valid
/opt/app/hadoop-2.7.0/etc/hadoop/httpfs-site.xml: valid
OK
root@voltan:/opt/app/hadoop-2.7.0/bin#
{quote}

Validate XMLs if a relevant tool is available, when using scripts
-----------------------------------------------------------------
Key: HADOOP-7947
URL: https://issues.apache.org/jira/browse/HADOOP-7947
Project: Hadoop Common
Issue Type: Wish
Components: scripts
Affects Versions: 2.7.0
Reporter: Harsh J
Assignee: Kengo Seki
Labels: BB2015-05-TBR, newbie
Attachments: HADOOP-7947.001.patch, HADOOP-7947.002.patch, HADOOP-7947.003.patch, HADOOP-7947.004.patch

Given that we are locked down to using only XML for configuration and most administrators need to manage it by themselves (unless a tool that manages it for you is used), it would be good to also validate the provided config XML (*-site.xml) files with a tool like {{xmllint}} or maybe Xerces somehow, when running a command or (at least) when starting up daemons. We should use this only if a relevant tool is available, and optionally be silent if the env. requests.
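The friendlier handling suggested in point 1 can be sketched by scanning the arguments up front instead of letting the parser throw. This is plain Java with no commons-cli dependency; the method name and message are illustrative, not taken from the patch:

```java
public class ConfTestArgs {

    // Returns an error message when -conffile is the last token (so its
    // required value is missing), instead of surfacing a raw
    // MissingArgumentException stack trace to the user.
    static String checkConffileArg(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if ("-conffile".equals(args[i]) && i == args.length - 1) {
                return "Expected a File or Folder after -conffile";
            }
        }
        return null; // arguments look well-formed; hand off to the real parser
    }

    public static void main(String[] args) {
        System.out.println(checkConffileArg(new String[]{"-conffile"}));
        System.out.println(checkConffileArg(new String[]{"-conffile", "/etc/hadoop"}));
    }
}
```

The same pre-check also addresses point 2's concern: by returning an error before any validation runs, the tool never falls through to validating the default folder after a parse failure.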
[jira] [Updated] (HADOOP-11709) Time.NANOSECONDS_PER_MILLISECOND - use class level final constant instead of method variable
[ https://issues.apache.org/jira/browse/HADOOP-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajith S updated HADOOP-11709:
-----------------------------
Attachment: 001-HDFS-7919.patch

Time.NANOSECONDS_PER_MILLISECOND - use class level final constant instead of method variable
--------------------------------------------------------------------------------------------
Key: HADOOP-11709
URL: https://issues.apache.org/jira/browse/HADOOP-11709
Project: Hadoop Common
Issue Type: Improvement
Reporter: Ajith S
Assignee: Ajith S
Priority: Trivial
Labels: beginner, newbie
Attachments: 001-HDFS-7919.patch

The NANOSECONDS_PER_MILLISECOND constant can be moved to class level instead of being created on each method call.
{code}
// org.apache.hadoop.util.Time
public static long monotonicNow() {
  final long NANOSECONDS_PER_MILLISECOND = 1000000;
  return System.nanoTime() / NANOSECONDS_PER_MILLISECOND;
}
{code}
[jira] [Updated] (HADOOP-11709) Time.NANOSECONDS_PER_MILLISECOND - use class level final constant instead of method variable
[ https://issues.apache.org/jira/browse/HADOOP-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajith S updated HADOOP-11709:
-----------------------------
Status: Patch Available (was: Open)

Key: HADOOP-11709
URL: https://issues.apache.org/jira/browse/HADOOP-11709
[jira] [Commented] (HADOOP-11709) Time.NANOSECONDS_PER_MILLISECOND - use class level final constant instead of method variable
[ https://issues.apache.org/jira/browse/HADOOP-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14369285#comment-14369285 ]

Ajith S commented on HADOOP-11709:
----------------------------------
Thank you for the inputs. Submitting a patch to move the variable to a class-level constant. Please review.

Key: HADOOP-11709
URL: https://issues.apache.org/jira/browse/HADOOP-11709
[jira] [Commented] (HADOOP-11709) Time.NANOSECONDS_PER_MILLISECOND - use class level final constant instead of method variable
[ https://issues.apache.org/jira/browse/HADOOP-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370771#comment-14370771 ]

Ajith S commented on HADOOP-11709:
----------------------------------
Test cases are not required, as no new scenario is introduced; the existing test cases passing is sufficient.

Key: HADOOP-11709
URL: https://issues.apache.org/jira/browse/HADOOP-11709
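The change this issue proposes can be sketched as hoisting the divisor to a class-level constant, so it is declared once instead of on every call. This is an illustrative standalone version; the real field lives in org.apache.hadoop.util.Time, where nanoseconds per millisecond is 1,000,000:

```java
public class Time {
    // Hoisted to class level, as the patch proposes: one declaration
    // shared by all callers instead of a new local per invocation.
    private static final long NANOSECONDS_PER_MILLISECOND = 1_000_000L;

    // Monotonic wall-clock-independent time in milliseconds, suitable
    // for measuring elapsed durations.
    public static long monotonicNow() {
        return System.nanoTime() / NANOSECONDS_PER_MILLISECOND;
    }

    public static void main(String[] args) {
        System.out.println(monotonicNow());
    }
}
```

Behaviour is unchanged (the JIT treats both forms identically in practice), so as the comment above notes, the existing tests continue to cover it; the win is purely readability, with the constant named and scoped where it belongs.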