[jira] [Created] (HADOOP-16418) Fix checkstyle and findbugs warnings in hadoop-dynamometer
Masatake Iwasaki created HADOOP-16418: - Summary: Fix checkstyle and findbugs warnings in hadoop-dynamometer Key: HADOOP-16418 URL: https://issues.apache.org/jira/browse/HADOOP-16418 Project: Hadoop Common Issue Type: Bug Components: tools Reporter: Masatake Iwasaki
[jira] [Created] (HADOOP-16417) abfs can't access storage account without password
Jose Luis Pedrosa created HADOOP-16417: -- Summary: abfs can't access storage account without password Key: HADOOP-16417 URL: https://issues.apache.org/jira/browse/HADOOP-16417 Project: Hadoop Common Issue Type: Bug Components: fs/azure Affects Versions: 3.2.0 Reporter: Jose Luis Pedrosa

It does not seem possible to access storage accounts without passwords using abfs, but it is possible using wasb.

The following sample code (Spark based) illustrates the problem: using abfs_path throws the exception below, while using wasbs_path works normally.

{noformat}
Exception in thread "main" java.lang.IllegalArgumentException: Invalid account key.
	at org.apache.hadoop.fs.azurebfs.services.SharedKeyCredentials.<init>(SharedKeyCredentials.java:70)
	at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:812)
	at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.<init>(AzureBlobFileSystemStore.java:149)
	at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:108)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3303)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
{noformat}

{code:java}
import org.apache.spark.sql.RuntimeConfig;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

public class SimpleApp {
    static String blob_account_name = "azureopendatastorage";
    static String blob_container_name = "gfsweatherdatacontainer";
    static String blob_relative_path = "GFSWeather/GFSProcessed";
    static String blob_sas_token = "";
    static String abfs_path = "abfs://" + blob_container_name + "@" + blob_account_name
        + ".dfs.core.windows.net/" + blob_relative_path;
    static String wasbs_path = "wasbs://" + blob_container_name + "@" + blob_account_name
        + ".blob.core.windows.net/" + blob_relative_path;

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("NOAAGFS Run").getOrCreate();
        configureAzureHadoopConnector(spark);

        RuntimeConfig conf = spark.conf();
        conf.set("fs.azure.account.key." + blob_account_name + ".dfs.core.windows.net", blob_sas_token);
        conf.set("fs.azure.account.key." + blob_account_name + ".blob.core.windows.net", blob_sas_token);

        System.out.println("Creating parquet dataset");
        Dataset<Row> logData = spark.read().parquet(abfs_path);

        System.out.println("Creating temp view");
        logData.createOrReplaceTempView("source");

        System.out.println("SQL");
        spark.sql("SELECT * FROM source LIMIT 10").show();
        spark.stop();
    }

    public static void configureAzureHadoopConnector(SparkSession session) {
        RuntimeConfig conf = session.conf();
        conf.set("fs.AbstractFileSystem.wasb.impl", "org.apache.hadoop.fs.azure.Wasb");
        conf.set("fs.AbstractFileSystem.wasbs.impl", "org.apache.hadoop.fs.azure.Wasbs");
        conf.set("fs.wasb.impl", "org.apache.hadoop.fs.azure.NativeAzureFileSystem");
        conf.set("fs.wasbs.impl", "org.apache.hadoop.fs.azure.NativeAzureFileSystem$Secure");
        conf.set("fs.azure.secure.mode", false);
        conf.set("fs.abfs.impl", "org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem");
        conf.set("fs.abfss.impl", "org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem");
        conf.set("fs.AbstractFileSystem.abfs.impl", "org.apache.hadoop.fs.azurebfs.Abfs");
        conf.set("fs.AbstractFileSystem.abfss.impl", "org.apache.hadoop.fs.azurebfs.Abfss");

        // Works in conjunction with fs.azure.secure.mode. Setting this config to true
        // results in fs.azure.NativeAzureFileSystem using the local SAS key generation,
        // where the SAS keys are generated in the same process as fs.azure.NativeAzureFileSystem.
        // If the fs.azure.secure.mode flag is set to false, this flag has no effect.
        conf.set("fs.azure.local.sas.key.mode", false);
    }
}
{code}

Sample build.gradle:

{noformat}
plugins {
    id 'java'
}

group 'org.samples'
version '1.0-SNAPSHOT'

sourceCompatibility = 1.8

repositories {
    mavenCentral()
}

dependencies {
    compile 'org.apache.spark:spark-sql_2.12:2.4.3'
}
{noformat}
[jira] [Created] (HADOOP-16416) mark DynamoDBMetadataStore.deleteTrackingValueMap as final
Steve Loughran created HADOOP-16416: --- Summary: mark DynamoDBMetadataStore.deleteTrackingValueMap as final Key: HADOOP-16416 URL: https://issues.apache.org/jira/browse/HADOOP-16416 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3 Affects Versions: 3.2.0 Reporter: Steve Loughran

S3Guard's {{DynamoDBMetadataStore.deleteTrackingValueMap}} field is static and can/should be marked as final; its name should also be changed to upper case to match the coding conventions.
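For illustration, a minimal sketch of the convention being asked for; the field's actual type and initializer in DynamoDBMetadataStore may differ, so treat the specifics here as placeholders:

{code:java}
import java.util.Collections;
import java.util.Map;

class DeleteTrackingExample {
  // Before: static and effectively constant, but mutable-looking and camelCase.
  private static Map<String, Boolean> deleteTrackingValueMap =
      Collections.singletonMap(":false", Boolean.FALSE);

  // After: final, and renamed to UPPER_SNAKE_CASE per the usual Java
  // convention for static final constants.
  private static final Map<String, Boolean> DELETE_TRACKING_VALUE_MAP =
      Collections.singletonMap(":false", Boolean.FALSE);
}
{code}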
[jira] [Resolved] (HADOOP-16409) Allow authoritative mode on non-qualified paths
[ https://issues.apache.org/jira/browse/HADOOP-16409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Bota resolved HADOOP-16409.
---------------------------------
Resolution: Fixed
Fix Version/s: 3.3.0

> Allow authoritative mode on non-qualified paths
> -----------------------------------------------
>
> Key: HADOOP-16409
> URL: https://issues.apache.org/jira/browse/HADOOP-16409
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Sean Mackrory
> Assignee: Sean Mackrory
> Priority: Major
> Fix For: 3.3.0
>
> fs.s3a.authoritative.path currently requires a qualified URI (e.g.
> s3a://bucket/path), which is how I see this being used most immediately, but
> it also makes sense for someone to just be able to configure /path, if all of
> their buckets follow that pattern, or if they're providing configuration
> already in a bucket-specific context (e.g. job-level configs, etc.). Just need
> to qualify whatever is passed in to allowAuthoritative to make that work.
> Also, in HADOOP-16396 Gabor pointed out a few whitespace nits that I neglected
> to fix before merging.
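A hedged sketch of the fix shape described above: qualify whatever path is configured before the authoritative-path check. The config key comes from the issue; the method, variable names, and default value are illustrative, not the actual S3A code:

{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class AuthoritativePathExample {
  // Qualify a possibly bare configured path against the filesystem, so that
  // "/path" resolves to "s3a://bucket/path" for the current bucket.
  static Path qualifiedAuthoritativePath(FileSystem fs, Configuration conf)
      throws IOException {
    // Default of "/" is illustrative only.
    String configured = conf.getTrimmed("fs.s3a.authoritative.path", "/");
    return fs.makeQualified(new Path(configured));
  }
}
{code}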
[jira] [Created] (HADOOP-16415) Speed up S3A test runs
Steve Loughran created HADOOP-16415: --- Summary: Speed up S3A test runs Key: HADOOP-16415 URL: https://issues.apache.org/jira/browse/HADOOP-16415 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3 Affects Versions: 3.3.0 Reporter: Steve Loughran

S3A test runs are way too slow. Speed them up by:
* reducing test setup/teardown costs
* eliminating obsolete test cases
* merging small tests into larger ones

One thing I see is that the main S3A test cases create and destroy new FS instances; there's both a setup and a teardown cost there, but it does guarantee better isolation. Maybe if we know all test cases in a specific suite need the same options, we can manage that better: demand-create the FS but only delete it in an @AfterClass method. That'd give us the OO-inheritance-based setup of tests, but mean only one instance is created per suite.
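A minimal sketch of the demand-create/@AfterClass idea, assuming JUnit 4; the class and method names are illustrative, not the actual hadoop-aws test framework:

{code:java}
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.junit.AfterClass;

public abstract class AbstractSuiteScopedFSTest {
  private static FileSystem sharedFs;

  // Demand-create one FS for the whole suite instead of one per test case.
  protected static synchronized FileSystem getSharedFS(URI uri, Configuration conf)
      throws IOException {
    if (sharedFs == null) {
      sharedFs = FileSystem.newInstance(uri, conf);
    }
    return sharedFs;
  }

  // Tear it down once, after the last test in the suite has run.
  @AfterClass
  public static synchronized void closeSharedFS() throws IOException {
    if (sharedFs != null) {
      sharedFs.close();
      sharedFs = null;
    }
  }
}
{code}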
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/

No changes

-1 overall

The following subsystems voted -1:
    asflicense findbugs hadolint pathlen unit

The following subsystems voted -1 but
were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    FindBugs :

       module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-documentstore
       Unread field:TimelineEventSubDoc.java:[line 56]
       Unread field:TimelineMetricSubDoc.java:[line 44]

    FindBugs :

       module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-mawo/hadoop-yarn-applications-mawo-core
       Class org.apache.hadoop.applications.mawo.server.common.TaskStatus implements Cloneable but does not define or use clone method At TaskStatus.java:does not define or use clone method At TaskStatus.java:[lines 39-346]
       Equals method for org.apache.hadoop.applications.mawo.server.worker.WorkerId assumes the argument is of type WorkerId At WorkerId.java:the argument is of type WorkerId At WorkerId.java:[line 114]
       org.apache.hadoop.applications.mawo.server.worker.WorkerId.equals(Object) does not check for null argument At WorkerId.java:null argument At WorkerId.java:[lines 114-115]

    FindBugs :

       module:hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-infra
       org.apache.hadoop.tools.dynamometer.Client.addFileToZipRecursively(File, File, ZipOutputStream) may fail to clean up java.io.InputStream on checked exception Obligation to clean up resource created at Client.java:to clean up java.io.InputStream on checked exception Obligation to clean up resource created at Client.java:[line 859] is not discharged
       Exceptional return value of java.io.File.mkdirs() ignored in org.apache.hadoop.tools.dynamometer.DynoInfraUtils.fetchHadoopTarball(File, String, Configuration, Logger) At DynoInfraUtils.java:ignored in org.apache.hadoop.tools.dynamometer.DynoInfraUtils.fetchHadoopTarball(File, String, Configuration, Logger) At DynoInfraUtils.java:[line 138]
       Found reliance on default encoding in org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]):in org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]): new java.io.InputStreamReader(InputStream) At SimulatedDataNodes.java:[line 149]
       org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]) invokes System.exit(...), which shuts down the entire virtual machine At SimulatedDataNodes.java:down the entire virtual machine At SimulatedDataNodes.java:[line 123]
       org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]) may fail to close stream At SimulatedDataNodes.java:stream At SimulatedDataNodes.java:[line 149]

    FindBugs :

       module:hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-blockgen
       Self assignment of field BlockInfo.replication in new org.apache.hadoop.tools.dynamometer.blockgenerator.BlockInfo(BlockInfo) At BlockInfo.java:in new org.apache.hadoop.tools.dynamometer.blockgenerator.BlockInfo(BlockInfo) At BlockInfo.java:[line 78]

    Failed junit tests :

       hadoop.util.TestDiskCheckerWithDiskIo
       hadoop.hdfs.server.datanode.TestDirectoryScanner
       hadoop.hdfs.web.TestWebHdfsTimeouts
       hadoop.hdfs.server.federation.router.TestRouterWithSecureStartup
       hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken
       hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination
       hadoop.ozone.container.ozoneimpl.TestOzoneContainer
       hadoop.ozone.om.TestOzoneManagerHA
       hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis
       hadoop.ozone.client.rpc.TestOzoneAtRestEncryption
       hadoop.ozone.client.rpc.TestOzoneRpcClient
       hadoop.ozone.client.rpc.TestSecureOzoneRpcClient
       hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures

   cc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-compile-cc-root.txt [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-compile-javac-root.txt [336K]

   checkstyle:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-checkstyle-root.txt [17M]

   hadolint:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-patch-hadolint.txt [8.0K]

   pathlen:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/pathlen.txt [12K]

   pylint:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-patch-pylint.txt [120K]

   shellcheck:
[jira] [Created] (HADOOP-16414) ITestS3AMiniYarnCluster fails on sequential runs with Kerberos error
Steve Loughran created HADOOP-16414: --- Summary: ITestS3AMiniYarnCluster fails on sequential runs with Kerberos error Key: HADOOP-16414 URL: https://issues.apache.org/jira/browse/HADOOP-16414 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3, test Affects Versions: 3.3.0 Reporter: Steve Loughran

If you do a sequential test run of hadoop-aws, you get a failure on {{ITestS3AMiniYarnCluster}}, with a message about Kerberos coming from inside job launch.

{code}
[ERROR] testWithMiniCluster(org.apache.hadoop.fs.s3a.yarn.ITestS3AMiniYarnCluster)  Time elapsed: 3.438 s  <<< ERROR!
java.io.IOException: Can't get Master Kerberos principal for use as renewer
	at org.apache.hadoop.fs.s3a.yarn.ITestS3AMiniYarnCluster.testWithMiniCluster(ITestS3AMiniYarnCluster.java:117)
{code}

Assumption: some state in the single JVM is making this test think it should be using Kerberos.
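If the leaked state really is UserGroupInformation's static security configuration, one hedged guess at an isolation fix would be to reset it in test setup; whether this is the actual cause has not been confirmed in the issue:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

class ResetSecurityState {
  // Re-initialize UGI's static security settings so a Kerberos-enabled
  // suite run earlier in the same JVM doesn't leak into this test.
  static void resetToSimpleAuth() {
    Configuration conf = new Configuration();
    conf.set("hadoop.security.authentication", "simple");
    UserGroupInformation.setConfiguration(conf);
  }
}
{code}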
Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/

No changes

-1 overall

The following subsystems voted -1:
    asflicense findbugs hadolint pathlen unit xml

The following subsystems voted -1 but
were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    XML :

       Parsing Error(s):
       hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
       hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml

    FindBugs :

       module:hadoop-common-project/hadoop-common
       Class org.apache.hadoop.fs.GlobalStorageStatistics defines non-transient non-serializable instance field map In GlobalStorageStatistics.java:instance field map In GlobalStorageStatistics.java

    FindBugs :

       module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
       Boxed value is unboxed and then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:[line 335]

    Failed junit tests :

       hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys
       hadoop.hdfs.server.datanode.TestDirectoryScanner
       hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead
       hadoop.registry.secure.TestSecureLogins
       hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
       hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2
       hadoop.yarn.sls.TestSLSRunner

   cc:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt [328K]

   cc:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-cc-root-jdk1.8.0_212.txt [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-javac-root-jdk1.8.0_212.txt [308K]

   checkstyle:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-checkstyle-root.txt [16M]

   hadolint:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-hadolint.txt [4.0K]

   pathlen:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/pathlen.txt [12K]

   pylint:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-pylint.txt [24K]

   shellcheck:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-shellcheck.txt [72K]

   shelldocs:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-shelldocs.txt [8.0K]

   whitespace:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/whitespace-eol.txt [12M]
       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/whitespace-tabs.txt [1.2M]

   xml:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/xml.txt [12K]

   findbugs:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html [8.0K]
       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html [8.0K]

   javadoc:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-javadoc-javadoc-root-jdk1.7.0_95.txt [16K]
       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-javadoc-javadoc-root-jdk1.8.0_212.txt [1.1M]

   unit:

       https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [228K]
[Hadoop] Question about protobuf
Hi Hadoop:

I was trying to build Hadoop on the AArch64 platform recently, and I've encountered some issues. One of them is about protobuf 2.5.0: the lowest version of protobuf that can be provided on AArch64 is 2.6.1, which causes a version mismatch error when building Hadoop, since Hadoop requires 2.5.0. I noticed an issue filed 3 years ago, https://issues.apache.org/jira/browse/HADOOP-13363, with quite a lot of discussion at the beginning but not much action since last year.

I'm just wondering: are there any conclusions or updates on this issue? Or is there any official workaround? I can find some pretty hacky ways to solve the problem, but they seem too hacky... for example: https://groups.google.com/forum/#!topic/protobuf/fwLF5_t3q3U and https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/do-component-build#L45-L105

Thanks in advance,
Kevin Zheng
[jira] [Created] (HADOOP-16413) ITestS3ARemoteFileChanged doesn't overwrite test data creation
Steve Loughran created HADOOP-16413: --- Summary: ITestS3ARemoteFileChanged doesn't overwrite test data creation Key: HADOOP-16413 URL: https://issues.apache.org/jira/browse/HADOOP-16413 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3, test Affects Versions: 3.3.0 Reporter: Steve Loughran

The tests in {{ITestS3ARemoteFileChanged}} write files with overwrite = false, and so when run against a store which (for any reason) already has those files, the tests fail.
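The likely fix shape, sketched with the standard FileSystem API; the actual test helper being changed isn't quoted in the report, so the method below is illustrative:

{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class OverwriteTestData {
  // create(path, true) overwrites any file left over from a previous run,
  // so the test no longer fails against a store that already has the data.
  static void writeTestFile(FileSystem fs, Path path, String data)
      throws IOException {
    try (FSDataOutputStream out = fs.create(path, true /* overwrite */)) {
      out.write(data.getBytes(StandardCharsets.UTF_8));
    }
  }
}
{code}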
[jira] [Created] (HADOOP-16412) S3a getFileStatus to update DDB if an S3 query returns etag/versionID
Steve Loughran created HADOOP-16412: --- Summary: S3a getFileStatus to update DDB if an S3 query returns etag/versionID Key: HADOOP-16412 URL: https://issues.apache.org/jira/browse/HADOOP-16412 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3 Affects Versions: 3.3.0 Reporter: Steve Loughran

Now that S3Guard tables support etags and version IDs, we should do more to populate them.

# listStatus/listFiles doesn't give us all the information; the AWS v1 and v2 list operations only return the etags
# a treewalk on import with a HEAD on each object would be expensive and slow

What we can do is, on a getFileStatus call, update the version marker for any S3Guard table entry where
* the etag is already in the S3Guard entry
* the probe of the store returns an entry with the same etag and a version ID

In that situation we know the S3 data and S3Guard data are consistent, so updating the version ID fills out the data.

We could also think about updating etags for entries created by older versions of S3Guard; it'd be a bit trickier there to decide if the S3 store entry was current. Probably safest to leave alone...
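A sketch of what the getFileStatus-time check could look like; the types below are hypothetical stand-ins for S3Guard's actual metadata classes, which aren't quoted in the issue:

{code:java}
// Hypothetical stand-ins for S3Guard's metadata entry and store types.
class MetadataEntry {
  String etag;
  String versionId;
}

interface MetaStore {
  void put(MetadataEntry entry);
}

class VersionBackfill {
  // On getFileStatus: if the etag from the S3 probe matches the one S3Guard
  // already recorded, the two views are consistent, so it is safe to copy
  // across a version ID the table is missing.
  static void maybeBackfill(MetaStore store, MetadataEntry tableEntry,
      String s3Etag, String s3VersionId) {
    if (s3Etag != null && s3Etag.equals(tableEntry.etag)
        && tableEntry.versionId == null && s3VersionId != null) {
      tableEntry.versionId = s3VersionId;
      store.put(tableEntry);
    }
  }
}
{code}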