Not getting JIRA email on cc: or tag
Hello,

I haven't been getting emailed when someone tags or cc:s me in a JIRA. Is there a way to change this?

Thanks!
Aaron
[jira] [Created] (HADOOP-16282) AvoidFileStream to improve performance
Ayush Saxena created HADOOP-16282:
-------------------------------------

             Summary: AvoidFileStream to improve performance
                 Key: HADOOP-16282
                 URL: https://issues.apache.org/jira/browse/HADOOP-16282
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Ayush Saxena
            Assignee: Ayush Saxena

The FileInputStream and FileOutputStream classes contain a finalizer method which will cause garbage collection pauses. See [JDK-8080225|https://bugs.openjdk.java.net/browse/JDK-8080225] for details. The FileReader and FileWriter constructors instantiate FileInputStream and FileOutputStream, again causing garbage collection issues while finalizer methods are called.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
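A minimal sketch of the kind of substitution the ticket proposes (the class and method names here are illustrative, not the actual Hadoop patch): `Files.newInputStream`/`Files.newOutputStream` return channel-backed streams that do not carry the `finalize()` method of `FileInputStream`/`FileOutputStream`, so they avoid the GC pressure described in JDK-8080225.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class StreamExample {
    // Copy a file using NIO stream factories instead of FileInputStream /
    // FileOutputStream, whose finalize() methods add garbage-collection work.
    static void copy(Path src, Path dst) throws IOException {
        try (InputStream in = Files.newInputStream(src);
             OutputStream out = Files.newOutputStream(dst)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }
}
```

The same pattern applies to `FileReader`/`FileWriter`, which can be replaced with `Files.newBufferedReader`/`Files.newBufferedWriter`.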
[jira] [Created] (HADOOP-16281) ABFS: Rename operation, GetFileStatus before rename operation and throw exception on the driver side
Da Zhou created HADOOP-16281:
--------------------------------

             Summary: ABFS: Rename operation, GetFileStatus before rename operation and throw exception on the driver side
                 Key: HADOOP-16281
                 URL: https://issues.apache.org/jira/browse/HADOOP-16281
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/azure
    Affects Versions: 3.2.0
            Reporter: Da Zhou
            Assignee: Da Zhou
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/

[Apr 29, 2019 3:57:59 AM] (github) HDDS-1471. Update ratis dependency to 0.3.0. Contributed by Ajay Kumar.
[Apr 29, 2019 12:27:28 PM] (stevel) HADOOP-16242. ABFS: add bufferpool to AbfsOutputStream.
[Apr 29, 2019 2:46:01 PM] (xyao) HDDS-1472. Add retry to kinit command in smoketests. Contributed by Ajay
[Apr 29, 2019 6:18:11 PM] (github) HDDS-1455. Inconsistent naming convention with Ozone Kerberos
[Apr 29, 2019 7:05:38 PM] (bharat) HDDS-1476. Fix logIfNeeded logic in EndPointStateMachine. (#779)
[Apr 29, 2019 8:28:19 PM] (7813154+ajayydv) HDDS-1462. Fix content and format of Ozone documentation. Contributed by
[Apr 29, 2019 9:07:23 PM] (github) HDDS-1430. NPE if secure ozone if KMS uri is not defined. Contributed by
[Apr 29, 2019 9:49:35 PM] (arp) HDFS-13677. Dynamic refresh Disk configuration results in overwriting

-1 overall

The following subsystems voted -1:
    asflicense findbugs hadolint pathlen unit

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    FindBugs :

       module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-documentstore
       Unread field:TimelineEventSubDoc.java:[line 56]
       Unread field:TimelineMetricSubDoc.java:[line 44]

    FindBugs :

       module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-mawo/hadoop-yarn-applications-mawo-core
       Class org.apache.hadoop.applications.mawo.server.common.TaskStatus implements Cloneable but does not define or use clone method At TaskStatus.java:does not define or use clone method At TaskStatus.java:[lines 39-346]
       Equals method for org.apache.hadoop.applications.mawo.server.worker.WorkerId assumes the argument is of type WorkerId At WorkerId.java:the argument is of type WorkerId At WorkerId.java:[line 114]
       org.apache.hadoop.applications.mawo.server.worker.WorkerId.equals(Object) does not check for null argument At WorkerId.java:null argument At WorkerId.java:[lines 114-115]

    Failed junit tests :

       hadoop.hdfs.web.TestWebHdfsTimeouts
       hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor
       hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2
       hadoop.yarn.applications.distributedshell.TestDistributedShell
       hadoop.mapreduce.v2.app.TestRuntimeEstimators
       hadoop.ozone.freon.TestDataValidateWithSafeByteOperations

   cc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/diff-compile-cc-root.txt [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/diff-compile-javac-root.txt [332K]

   checkstyle:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/diff-checkstyle-root.txt [17M]

   hadolint:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/diff-patch-hadolint.txt [4.0K]

   pathlen:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/pathlen.txt [12K]

   pylint:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/diff-patch-pylint.txt [84K]

   shellcheck:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/diff-patch-shellcheck.txt [20K]

   shelldocs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/diff-patch-shelldocs.txt [44K]

   whitespace:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/whitespace-eol.txt [9.6M]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/whitespace-tabs.txt [1.1M]

   findbugs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-documentstore-warnings.html [8.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-applications-mawo_hadoop-yarn-applications-mawo-core-warnings.html [8.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/branch-findbugs-hadoop-submarine_hadoop-submarine-tony-runtime.txt [4.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1122/artifact/out/branch-findbugs-hadoop-submarine_hadoop-submarine-yarnservice-runtime.txt [4.0K]

   javadoc:
[jira] [Created] (HADOOP-16280) S3Guard: Retry failed read with backoff in Authoritative mode when file can be opened
Gabor Bota created HADOOP-16280:
-----------------------------------

             Summary: S3Guard: Retry failed read with backoff in Authoritative mode when file can be opened
                 Key: HADOOP-16280
                 URL: https://issues.apache.org/jira/browse/HADOOP-16280
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
            Reporter: Gabor Bota

When using S3Guard in authoritative mode, AWS S3 can report that a file is missing, as in the following exception:
{noformat}
java.io.FileNotFoundException: re-open s3a://cloudera-dev-gabor-ireland/test/TMCDOR-021df1ad-633f-47b8-97f5-6cd93f0b82d0 at 0 on s3a://cloudera-dev-gabor-ireland/test/TMCDOR-021df1ad-633f-47b8-97f5-6cd93f0b82d0: com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: E1FF9EA9B5DBBD7E; S3 Extended Request ID: NzNIL4+dyA89WTnfbcwuYQK+hCfx51TfavwgC3oEvQI0IQ9M/zAspbXOfBIis8/nTolc4tRB9ik=), S3 Extended Request ID: NzNIL4+dyA89WTnfbcwuYQK+hCfx51TfavwgC3oEvQI0IQ9M/zAspbXOfBIis8/nTolc4tRB9ik=:NoSuchKey
{noformat}
But the metadata in S3Guard (e.g. DynamoDB) is there, so the file can be opened. The operation does not fail on open; it fails when we try to read. So the call
{noformat}
FSDataInputStream is = guardedFs.open(testFilePath);
{noformat}
won't fail, but the next call
{noformat}
byte[] firstRead = new byte[text.length()];
is.read(firstRead, 0, firstRead.length);
{noformat}
will fail with an exception like the one above. When authoritative mode is on, we assume there are no out-of-band operations, so the file will appear eventually. We should retry in this case.
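The retry-with-backoff behaviour the ticket asks for could be sketched like this (a minimal illustration, not the actual S3Guard implementation; the `ReadOp` interface and method names are hypothetical): a `FileNotFoundException` from the read path is treated as "key not yet visible", and the read is retried with exponentially growing delays until it succeeds or attempts run out.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

public class RetryRead {
    // Hypothetical read operation; in S3A this would be the actual
    // InputStream read against S3.
    interface ReadOp { int run() throws IOException; }

    // Retry a read that may hit an eventually-consistent 404, backing off
    // between attempts. In authoritative mode S3Guard says the key exists,
    // so a NoSuchKey from S3 means "not visible yet", not "deleted".
    static int readWithBackoff(ReadOp op, int maxAttempts, long initialDelayMs)
            throws IOException, InterruptedException {
        long delay = initialDelayMs;
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return op.run();
            } catch (FileNotFoundException e) {
                last = e;           // key not visible yet; retry after backoff
                Thread.sleep(delay);
                delay *= 2;         // exponential backoff
            }
        }
        throw last;                 // attempts exhausted; surface the 404
    }
}
```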
[jira] [Created] (HADOOP-16279) S3Guard: Implement time-based (TTL) expiry for entries (and tombstones)
Gabor Bota created HADOOP-16279:
-----------------------------------

             Summary: S3Guard: Implement time-based (TTL) expiry for entries (and tombstones)
                 Key: HADOOP-16279
                 URL: https://issues.apache.org/jira/browse/HADOOP-16279
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
            Reporter: Gabor Bota

In HADOOP-15621 we implemented TTL for Authoritative Directory Listings and added {{ExpirableMetadata}}. {{DDBPathMetadata}} extends {{PathMetadata}} extends {{ExpirableMetadata}}, so all metadata entries in ddb can expire, but the implementation is not done yet.

To complete this feature the following should be done:
* Add new tests for metadata entry and tombstone expiry to {{ITestS3GuardTtl}}
* Implement metadata entry and tombstone expiry

I would like to start a debate on whether we need to use separate expiry times for entries and tombstones. My +1 on not using separate settings - so only one config name and value.

Notes:
* In HADOOP-13649 the metadata TTL is implemented in LocalMetadataStore, using an existing feature in guava's cache implementation. Expiry is set with {{fs.s3a.s3guard.local.ttl}}.
* This is not the same as, and does not use, DDB's TTL feature [(DOCS)|https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/TTL.html]. We need stronger consistency guarantees than what ddb promises: [cleaning once a day with a background job|https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/howitworks-ttl.html] is not usable for this feature - although it can be used as a general cleanup solution separately and independently from S3Guard.
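The "one config name and value" option the ticket argues for amounts to a single TTL check shared by entries and tombstones. A minimal sketch (class and field names here are illustrative, not the actual {{ExpirableMetadata}} hierarchy):

```java
public class TtlExample {
    // Sketch of a metadata record that can expire. A single TTL covers both
    // regular entries and tombstones, per the "only one config" proposal.
    static class ExpirableEntry {
        final long lastUpdatedMs;   // when the entry was last written
        final boolean isTombstone;  // deleted-marker vs. live entry

        ExpirableEntry(long lastUpdatedMs, boolean isTombstone) {
            this.lastUpdatedMs = lastUpdatedMs;
            this.isTombstone = isTombstone;
        }

        // An expired entry (or tombstone) is no longer trusted and should be
        // re-fetched from the backing store.
        boolean isExpired(long ttlMs, long nowMs) {
            return nowMs - lastUpdatedMs > ttlMs;
        }
    }
}
```

With separate settings, `isExpired` would instead pick one of two TTL values based on `isTombstone`; the debate in the ticket is whether that extra knob is worth the configuration surface.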
Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/

No changes

-1 overall

The following subsystems voted -1:
    asflicense findbugs hadolint pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    XML :

       Parsing Error(s):
       hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
       hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml

    FindBugs :

       module:hadoop-common-project/hadoop-common
       Class org.apache.hadoop.fs.GlobalStorageStatistics defines non-transient non-serializable instance field map In GlobalStorageStatistics.java:instance field map In GlobalStorageStatistics.java

    FindBugs :

       module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
       Boxed value is unboxed and then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:[line 335]

    Failed CTEST tests :

       test_test_libhdfs_zerocopy_hdfs_static

    Failed junit tests :

       hadoop.minikdc.TestChangeOrgNameAndDomain
       hadoop.security.authentication.client.TestKerberosAuthenticator
       hadoop.contrib.bkjournal.TestBookKeeperSpeculativeRead
       hadoop.contrib.bkjournal.TestBookKeeperJournalManager
       hadoop.hdfs.server.federation.router.TestRouterRpc
       hadoop.registry.secure.TestSecureLogins
       hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor
       hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2

   cc:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt [328K]

   cc:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/diff-compile-cc-root-jdk1.8.0_191.txt [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/diff-compile-javac-root-jdk1.8.0_191.txt [308K]

   checkstyle:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/diff-checkstyle-root.txt [16M]

   hadolint:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/diff-patch-hadolint.txt [4.0K]

   pathlen:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/pathlen.txt [12K]

   pylint:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/diff-patch-pylint.txt [24K]

   shellcheck:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/diff-patch-shellcheck.txt [72K]

   shelldocs:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/diff-patch-shelldocs.txt [8.0K]

   whitespace:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/whitespace-eol.txt [12M]
       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/whitespace-tabs.txt [1.2M]

   xml:

       https://builds.apache.org/job/hadoop-qbt-bran2-java7-linux-x86/307/artifact/out/xml.txt [12K]

   findbugs:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html [8.0K]
       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html [8.0K]

   javadoc:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/307/artifact/out/diff-javadoc-javadoc-root-jdk1.7.0_95.txt [16K]
[jira] [Resolved] (HADOOP-16221) S3Guard: fail write that doesn't update metadata store
     [ https://issues.apache.org/jira/browse/HADOOP-16221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-16221.
-------------------------------------
       Resolution: Fixed
    Fix Version/s: 3.3.0

+1, PR#666 committed. Thanks!

> S3Guard: fail write that doesn't update metadata store
> ------------------------------------------------------
>
>                 Key: HADOOP-16221
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16221
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.2.0
>            Reporter: Ben Roling
>            Assignee: Ben Roling
>            Priority: Major
>             Fix For: 3.3.0
>
>
> Right now, a failure to write to the S3Guard metadata store (e.g. DynamoDB) is [merely logged|https://github.com/apache/hadoop/blob/rel/release-3.1.2/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2708-L2712]. It does not fail the S3AFileSystem write operation itself. As such, the writer has no idea that anything went wrong. The implication of this is that S3Guard doesn't always provide the consistency it advertises.
> For example [this article|https://blog.cloudera.com/blog/2017/08/introducing-s3guard-s3-consistency-for-apache-hadoop/] states:
> {quote}If a Hadoop S3A client creates or moves a file, and then a client lists its directory, that file is now guaranteed to be included in the listing.
> {quote}
> Unfortunately, this is sort of untrue and could result in exactly the sort of problem S3Guard is supposed to avoid:
> {quote}Missing data that is silently dropped. Multi-step Hadoop jobs that depend on output of previous jobs may silently omit some data. This omission happens when a job chooses which files to consume based on a directory listing, which may not include recently-written items.
> {quote}
> Imagine the typical multi-job Hadoop processing pipeline. Job 1 runs and succeeds, but one (or more) S3Guard metadata write failed under the covers. Job 2 picks up the output directory from Job 1 and runs its processing, potentially seeing an inconsistent listing, silently missing some of the Job 1 output files.
> S3Guard should at least provide a configuration option to fail if the metadata write fails. It seems even ideally this should be the default?
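The behaviour change the ticket asks for boils down to propagating the metadata-store failure to the caller instead of only logging it, gated by a configuration flag. A minimal sketch (the method and flag are illustrative, not the actual S3AFileSystem code):

```java
public class FailOnMetadataError {
    // After a successful S3 write, decide what to do with a failure from the
    // metadata-store update. failOnError models a config flag such as the
    // one the ticket proposes (name hypothetical).
    static void afterMetadataWrite(Exception metadataError, boolean failOnError)
            throws Exception {
        if (metadataError == null) {
            return;                      // metadata store updated fine
        }
        if (failOnError) {
            throw metadataError;         // fail the write, so the writer knows
        }
        // Legacy behaviour: log and continue, hiding the inconsistency.
        System.err.println("metadata update failed: " + metadataError.getMessage());
    }
}
```

With the flag on, Job 1 in the pipeline example would fail loudly instead of letting Job 2 consume an incomplete listing.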
[jira] [Created] (HADOOP-16278) With S3 Filesystem, Long Running services End up Doing lot of GC and eventually die
Rajat Khandelwal created HADOOP-16278:
-----------------------------------------

             Summary: With S3 Filesystem, Long Running services End up Doing lot of GC and eventually die
                 Key: HADOOP-16278
                 URL: https://issues.apache.org/jira/browse/HADOOP-16278
             Project: Hadoop Common
          Issue Type: Bug
          Components: common, hadoop-aws, metrics
    Affects Versions: 3.1.2, 3.1.1, 3.1.0
            Reporter: Rajat Khandelwal
             Fix For: 3.1.3
         Attachments: Screenshot 2019-04-30 at 12.52.42 PM.png, Screenshot 2019-04-30 at 2.33.59 PM.png

I'll start with the symptoms and eventually come to the cause.

We are using HDP 3.1 and noticed that every couple of days the Hive Metastore starts doing GC, sometimes with 30-minute-long pauses, although nothing is collected and the heap remains fully used.

Next, we looked at the heap dump and found that 99% of the memory is taken up by one ExecutorService for its task queue.

!Screenshot 2019-04-30 at 12.52.42 PM.png!

The instance is created like this:

{{ private static final ScheduledExecutorService scheduler = Executors}}
{{ .newScheduledThreadPool(1, new ThreadFactoryBuilder().setDaemon(true)}}
{{ .setNameFormat("MutableQuantiles-%d").build());}}

So all instances of MutableQuantiles share a single-threaded ExecutorService.

The second thing to notice is this block of code in the constructor of MutableQuantiles:

{{this.scheduledTask = scheduler.scheduleAtFixedRate(new MutableQuantiles.RolloverSample(this), (long)interval, (long)interval, TimeUnit.SECONDS);}}

So as soon as a MutableQuantiles instance is created, one task is scheduled at a fixed rate. Instead, it could be scheduled with a fixed delay (refer HADOOP-16248).

Now coming to why it's related to S3. S3AFileSystem creates an instance of S3AInstrumentation, which creates two quantiles (related to S3Guard) with a 1s (hardcoded) interval and leaves them hanging. By hanging I mean perpetually scheduled. As and when new instances of S3AFileSystem are created, two new quantiles are created, which in turn create two scheduled tasks and never cancel them. This way the number of scheduled tasks keeps growing without ever getting cleaned up, leading to GC/OOM/crash.

MutableQuantiles has a numInfo field which holds things like the name of the metric. From the heap dump, I found one numInfo and traced all objects referring to it.

!Screenshot 2019-04-30 at 2.33.59 PM.png!

There seem to be 300K objects for the same metric (S3Guard_metadatastore_throttle_rate). As expected, there are another 300K objects for the other MutableQuantiles created by the S3AInstrumentation class, although the number of instances of S3AInstrumentation is only 4. Clearly, there is a leak: one S3AInstrumentation instance creates two scheduled tasks to run every second, and these tasks are left scheduled rather than cancelled when S3AInstrumentation.close() is called. Hence, they are never cleaned up, and GC cannot collect them since they are referenced by the scheduler.

Who creates S3AInstrumentation instances? S3AFileSystem.initialize(), which is called in FileSystem.get(URI, Configuration). Since the Hive Metastore is a service that deals with a lot of Path objects and hence needs to make a lot of calls to FileSystem.get, it's the first to show these symptoms. We're seeing similar symptoms in AMs for long-running jobs (for both Tez AM and MR AM).
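The fix the report implies can be sketched as follows (a minimal illustration, not the actual Hadoop patch; class and method names are hypothetical): keep the handle returned by `scheduleAtFixedRate` and cancel it in `close()`, so the shared scheduler's queue does not accumulate tasks from closed instances.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class QuantileLifecycle implements AutoCloseable {
    // Shared single-threaded scheduler, mirroring the MutableQuantiles setup
    // described in the report.
    private static final ScheduledExecutorService SCHEDULER =
        Executors.newScheduledThreadPool(1, r -> {
            Thread t = new Thread(r, "MutableQuantiles-sketch");
            t.setDaemon(true);
            return t;
        });

    private final ScheduledFuture<?> task;

    QuantileLifecycle(long intervalSeconds, Runnable rollover) {
        // Keep the handle so the task can be cancelled later. Without this,
        // every instance leaves a task scheduled forever - the leak.
        task = SCHEDULER.scheduleAtFixedRate(
            rollover, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
    }

    boolean isCancelled() {
        return task.isCancelled();
    }

    @Override
    public void close() {
        // Cancelling removes the task's grip on the scheduler queue, letting
        // GC reclaim this instance and everything it references.
        task.cancel(false);
    }
}
```

Calling this from the owning instrumentation's `close()` would stop the 300K-task pileup the heap dump shows.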