[jira] [Created] (HADOOP-16418) Fix checkstyle and findbugs warnings in hadoop-dynamometer

2019-07-08 Thread Masatake Iwasaki (JIRA)
Masatake Iwasaki created HADOOP-16418:

 Summary: Fix checkstyle and findbugs warnings in hadoop-dynamometer
 Key: HADOOP-16418
 URL: https://issues.apache.org/jira/browse/HADOOP-16418
 Project: Hadoop Common
  Issue Type: Bug
  Components: tools
Reporter: Masatake Iwasaki









[jira] [Created] (HADOOP-16417) abfs can't access storage account without password

2019-07-08 Thread Jose Luis Pedrosa (JIRA)
Jose Luis Pedrosa created HADOOP-16417:

 Summary: abfs can't access storage account without password
 Key: HADOOP-16417
 URL: https://issues.apache.org/jira/browse/HADOOP-16417
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/azure
Affects Versions: 3.2.0
Reporter: Jose Luis Pedrosa


It does not seem possible to access storage accounts without passwords using
abfs, but it is possible using wasb.

This Spark-based sample code illustrates the issue: the code below, using the
abfs_path, throws the following exception:
{noformat}
Exception in thread "main" java.lang.IllegalArgumentException: Invalid account key.
    at org.apache.hadoop.fs.azurebfs.services.SharedKeyCredentials.<init>(SharedKeyCredentials.java:70)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:812)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.<init>(AzureBlobFileSystemStore.java:149)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:108)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3303)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
{noformat}
Using the wasbs_path, by contrast, works normally:
{code:java}
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RuntimeConfig;
import org.apache.spark.sql.SparkSession;

public class SimpleApp {

    static String blob_account_name = "azureopendatastorage";
    static String blob_container_name = "gfsweatherdatacontainer";
    static String blob_relative_path = "GFSWeather/GFSProcessed";
    static String blob_sas_token = "";
    static String abfs_path = "abfs://" + blob_container_name + "@"
            + blob_account_name + ".dfs.core.windows.net/" + blob_relative_path;
    static String wasbs_path = "wasbs://" + blob_container_name + "@"
            + blob_account_name + ".blob.core.windows.net/" + blob_relative_path;

    public static void main(String[] args) {
        SparkSession spark =
                SparkSession.builder().appName("NOAAGFS Run").getOrCreate();
        configureAzureHadoopConnector(spark);
        RuntimeConfig conf = spark.conf();

        conf.set("fs.azure.account.key." + blob_account_name
                + ".dfs.core.windows.net", blob_sas_token);
        conf.set("fs.azure.account.key." + blob_account_name
                + ".blob.core.windows.net", blob_sas_token);

        System.out.println("Creating parquet dataset");
        Dataset<Row> logData = spark.read().parquet(abfs_path);

        System.out.println("Creating temp view");
        logData.createOrReplaceTempView("source");

        System.out.println("SQL");
        spark.sql("SELECT * FROM source LIMIT 10").show();
        spark.stop();
    }

    public static void configureAzureHadoopConnector(SparkSession session) {
        RuntimeConfig conf = session.conf();

        conf.set("fs.AbstractFileSystem.wasb.impl", "org.apache.hadoop.fs.azure.Wasb");
        conf.set("fs.AbstractFileSystem.wasbs.impl", "org.apache.hadoop.fs.azure.Wasbs");
        conf.set("fs.wasb.impl", "org.apache.hadoop.fs.azure.NativeAzureFileSystem");
        conf.set("fs.wasbs.impl", "org.apache.hadoop.fs.azure.NativeAzureFileSystem$Secure");
        conf.set("fs.azure.secure.mode", false);

        conf.set("fs.abfs.impl", "org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem");
        conf.set("fs.abfss.impl", "org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem");
        conf.set("fs.AbstractFileSystem.abfs.impl", "org.apache.hadoop.fs.azurebfs.Abfs");
        conf.set("fs.AbstractFileSystem.abfss.impl", "org.apache.hadoop.fs.azurebfs.Abfss");

        // Works in conjunction with fs.azure.secure.mode. Setting this config to true
        // results in fs.azure.NativeAzureFileSystem using local SAS key generation,
        // where the SAS keys are generated in the same process as
        // fs.azure.NativeAzureFileSystem. If fs.azure.secure.mode is set to false,
        // this flag has no effect.
        conf.set("fs.azure.local.sas.key.mode", false);
    }
}
{code}
Sample build.gradle
{noformat}
plugins {
    id 'java'
}

group 'org.samples'
version '1.0-SNAPSHOT'

sourceCompatibility = 1.8

repositories {
    mavenCentral()
}

dependencies {
    compile 'org.apache.spark:spark-sql_2.12:2.4.3'
}
{noformat}




[jira] [Created] (HADOOP-16416) mark DynamoDBMetadataStore.deleteTrackingValueMap as final

2019-07-08 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-16416:

 Summary: mark DynamoDBMetadataStore.deleteTrackingValueMap as final
 Key: HADOOP-16416
 URL: https://issues.apache.org/jira/browse/HADOOP-16416
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.2.0
Reporter: Steve Loughran


S3Guard's {{DynamoDBMetadataStore.deleteTrackingValueMap}} field is static and
can (and should) be marked final, with its name changed to upper case to match
the coding conventions.
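
A rough sketch of the shape of that change; the map's type and contents below
are made-up stand-ins for illustration, not the real field:

{code:java}
import java.util.Collections;
import java.util.Map;

class DeleteTrackingFieldSketch {
  // Before: static but not final, lower-camel-case name.
  // private static Map<String, String> deleteTrackingValueMap = ...;

  // After: final, and upper-cased per the Java constant-naming convention.
  private static final Map<String, String> DELETE_TRACKING_VALUE_MAP =
      Collections.singletonMap("tracked", "true");
}
{code}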






[jira] [Resolved] (HADOOP-16409) Allow authoritative mode on non-qualified paths

2019-07-08 Thread Gabor Bota (JIRA)


 [ https://issues.apache.org/jira/browse/HADOOP-16409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Bota resolved HADOOP-16409.
   Resolution: Fixed
Fix Version/s: 3.3.0

> Allow authoritative mode on non-qualified paths
>
> Key: HADOOP-16409
> URL: https://issues.apache.org/jira/browse/HADOOP-16409
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>Priority: Major
> Fix For: 3.3.0
>
>
> fs.s3a.authoritative.path currently requires a qualified URI (e.g.
> s3a://bucket/path), which is how I see this being used most immediately, but
> it also makes sense for someone to just be able to configure /path, if all of
> their buckets follow that pattern, or if they're providing configuration
> already in a bucket-specific context (e.g. job-level configs, etc.). We just
> need to qualify whatever is passed in to allowAuthoritative to make that work.
> Also, in HADOOP-16396 Gabor pointed out a few whitespace nits that I neglected
> to fix before merging.
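
As a rough illustration of the qualification step described above (a sketch
only; the real change lives inside the S3A allowAuthoritative logic),
qualifying a bare /path against the active filesystem makes it comparable with
fully qualified URIs:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class QualifySketch {
  public static void main(String[] args) throws Exception {
    // The local FS stands in here; against an S3A instance the same call
    // would turn "/path" into s3a://bucket/path.
    FileSystem fs = FileSystem.getLocal(new Configuration());
    Path configured = new Path("/path");
    Path qualified = configured.makeQualified(fs.getUri(), fs.getWorkingDirectory());
    System.out.println(qualified);  // file:/path on the local FS
  }
}
{code}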






[jira] [Created] (HADOOP-16415) Speed up S3A test runs

2019-07-08 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-16415:

 Summary: Speed up S3A test runs
 Key: HADOOP-16415
 URL: https://issues.apache.org/jira/browse/HADOOP-16415
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.3.0
Reporter: Steve Loughran


S3A test runs are way too slow.

Speed them up by

* reducing test setup/teardown costs
* eliminating obsolete test cases
* merging small tests into larger ones

One thing I see is that the main S3A test cases create and destroy new FS
instances; there's both a setup and a teardown cost there, though it does
guarantee better isolation.

Maybe if we know all test cases in a specific suite need the same options, we
can manage that better: demand-create the FS but only delete it in an
@AfterClass method (see the sketch below). That'd give us the OO-inheritance
based setup of tests, but mean only one instance is created per suite.
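
A minimal JUnit 4 sketch of that pattern, with hypothetical names; real suites
would plug in their own configuration:

{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.AfterClass;
import org.junit.Test;

public class DemandCreatedFsSuite {
  private static FileSystem sharedFs;

  // Demand-create the FS once; every case in the suite reuses it.
  static synchronized FileSystem getSharedFs() throws Exception {
    if (sharedFs == null) {
      Configuration conf = new Configuration();
      // suite-specific options would be set here
      sharedFs = FileSystem.get(URI.create("s3a://test-bucket/"), conf);
    }
    return sharedFs;
  }

  // Torn down once per suite instead of once per test case.
  @AfterClass
  public static void teardownSuite() throws Exception {
    if (sharedFs != null) {
      sharedFs.close();
      sharedFs = null;
    }
  }

  @Test
  public void testSomething() throws Exception {
    getSharedFs().exists(new Path("/"));
  }
}
{code}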






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2019-07-08 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/

No changes




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-documentstore
 
   Unread field:TimelineEventSubDoc.java:[line 56] 
   Unread field:TimelineMetricSubDoc.java:[line 44] 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-mawo/hadoop-yarn-applications-mawo-core
 
   Class org.apache.hadoop.applications.mawo.server.common.TaskStatus 
implements Cloneable but does not define or use clone method At 
TaskStatus.java:does not define or use clone method At TaskStatus.java:[lines 
39-346] 
   Equals method for 
org.apache.hadoop.applications.mawo.server.worker.WorkerId assumes the argument 
is of type WorkerId At WorkerId.java:the argument is of type WorkerId At 
WorkerId.java:[line 114] 
   
org.apache.hadoop.applications.mawo.server.worker.WorkerId.equals(Object) does 
not check for null argument At WorkerId.java:null argument At 
WorkerId.java:[lines 114-115] 

FindBugs :

   module:hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-infra 
   org.apache.hadoop.tools.dynamometer.Client.addFileToZipRecursively(File, 
File, ZipOutputStream) may fail to clean up java.io.InputStream on checked 
exception Obligation to clean up resource created at Client.java:to clean up 
java.io.InputStream on checked exception Obligation to clean up resource 
created at Client.java:[line 859] is not discharged 
   Exceptional return value of java.io.File.mkdirs() ignored in 
org.apache.hadoop.tools.dynamometer.DynoInfraUtils.fetchHadoopTarball(File, 
String, Configuration, Logger) At DynoInfraUtils.java:ignored in 
org.apache.hadoop.tools.dynamometer.DynoInfraUtils.fetchHadoopTarball(File, 
String, Configuration, Logger) At DynoInfraUtils.java:[line 138] 
   Found reliance on default encoding in 
org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]):in 
org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]): new 
java.io.InputStreamReader(InputStream) At SimulatedDataNodes.java:[line 149] 
   org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]) 
invokes System.exit(...), which shuts down the entire virtual machine At 
SimulatedDataNodes.java:down the entire virtual machine At 
SimulatedDataNodes.java:[line 123] 
   org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]) may 
fail to close stream At SimulatedDataNodes.java:stream At 
SimulatedDataNodes.java:[line 149] 

FindBugs :

   module:hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-blockgen 
   Self assignment of field BlockInfo.replication in new 
org.apache.hadoop.tools.dynamometer.blockgenerator.BlockInfo(BlockInfo) At 
BlockInfo.java:in new 
org.apache.hadoop.tools.dynamometer.blockgenerator.BlockInfo(BlockInfo) At 
BlockInfo.java:[line 78] 

Failed junit tests :

   hadoop.util.TestDiskCheckerWithDiskIo 
   hadoop.hdfs.server.datanode.TestDirectoryScanner 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.hdfs.server.federation.router.TestRouterWithSecureStartup 
   hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken 
   hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination 
   hadoop.ozone.container.ozoneimpl.TestOzoneContainer 
   hadoop.ozone.om.TestOzoneManagerHA 
   hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis 
   hadoop.ozone.client.rpc.TestOzoneAtRestEncryption 
   hadoop.ozone.client.rpc.TestOzoneRpcClient 
   hadoop.ozone.client.rpc.TestSecureOzoneRpcClient 
   hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-compile-javac-root.txt
  [336K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-checkstyle-root.txt
  [17M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-patch-hadolint.txt
  [8.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-patch-pylint.txt
  [120K]

   shellcheck:

   

[jira] [Created] (HADOOP-16414) ITestS3AMiniYarnCluster fails on sequential runs with Kerberos error

2019-07-08 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-16414:

 Summary: ITestS3AMiniYarnCluster fails on sequential runs with 
Kerberos error
 Key: HADOOP-16414
 URL: https://issues.apache.org/jira/browse/HADOOP-16414
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3, test
Affects Versions: 3.3.0
Reporter: Steve Loughran


If you do a sequential test run of hadoop-aws, you get a failure in
{{ITestS3AMiniYarnCluster}}, with a message about Kerberos coming from inside
the job launch.

{code}
[ERROR] testWithMiniCluster(org.apache.hadoop.fs.s3a.yarn.ITestS3AMiniYarnCluster)  Time elapsed: 3.438 s  <<< ERROR!
java.io.IOException: Can't get Master Kerberos principal for use as renewer
    at org.apache.hadoop.fs.s3a.yarn.ITestS3AMiniYarnCluster.testWithMiniCluster(ITestS3AMiniYarnCluster.java:117)
{code}

Assumption: some state in the single JVM is making this test think it should be 
using Kerberos.
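
If that assumption holds, one way to probe it (a sketch, not the actual test
code) is to reset the static UGI state to a fresh configuration before the
test, clearing whatever earlier suites in the same JVM left behind:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

class ResetUgiSketch {
  public static void main(String[] args) {
    // Force UGI back to the defaults of a fresh Configuration,
    // discarding security settings a previous suite may have set.
    UserGroupInformation.setConfiguration(new Configuration());
    System.out.println("security enabled: "
        + UserGroupInformation.isSecurityEnabled());
  }
}
{code}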






Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2019-07-08 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/

No changes




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
 
   hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
 

FindBugs :

   module:hadoop-common-project/hadoop-common 
   Class org.apache.hadoop.fs.GlobalStorageStatistics defines non-transient 
non-serializable instance field map In GlobalStorageStatistics.java:instance 
field map In GlobalStorageStatistics.java 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
 
   Boxed value is unboxed and then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:[line 335] 

Failed junit tests :

   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   hadoop.hdfs.server.datanode.TestDirectoryScanner 
   hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead 
   hadoop.registry.secure.TestSecureLogins 
   
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
 
   hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 
   hadoop.yarn.sls.TestSLSRunner 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt
  [328K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-cc-root-jdk1.8.0_212.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-javac-root-jdk1.8.0_212.txt
  [308K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-checkstyle-root.txt
  [16M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-pylint.txt
  [24K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-shellcheck.txt
  [72K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-shelldocs.txt
  [8.0K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/whitespace-eol.txt
  [12M]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/whitespace-tabs.txt
  [1.2M]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/xml.txt
  [12K]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html
  [8.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-javadoc-javadoc-root-jdk1.7.0_95.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-javadoc-javadoc-root-jdk1.8.0_212.txt
  [1.1M]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [228K]
   

[Hadoop] Question about protobuf

2019-07-08 Thread Zhenyu Zheng
Hi Hadoop:

I was trying to build Hadoop on the AArch64 platform recently, and I've
encountered some issues. One of them is about protobuf 2.5.0: the lowest
version of protobuf that can be provided on AArch64 is 2.6.1, and that causes
a version-mismatch error when trying to build Hadoop, since Hadoop requires
2.5.0. I noticed there was an issue filed 3 years ago:
https://issues.apache.org/jira/browse/HADOOP-13363 and there was quite a lot
of discussion at the beginning, but not much action since last year. I'm just
wondering: is there any conclusion or update on this issue? Or is there any
official workaround? I can find some pretty hacky ways to solve the problem,
but they seem too hacky... for example:
https://groups.google.com/forum/#!topic/protobuf/fwLF5_t3q3U and
https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/do-component-build#L45-L105

Thanks in advance,

Kevin Zheng


[jira] [Created] (HADOOP-16413) ITestS3ARemoteFileChanged doesn't overwrite test data creation

2019-07-08 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-16413:

 Summary: ITestS3ARemoteFileChanged doesn't overwrite test data 
creation
 Key: HADOOP-16413
 URL: https://issues.apache.org/jira/browse/HADOOP-16413
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3, test
Affects Versions: 3.3.0
Reporter: Steve Loughran


The tests in {{ITestS3ARemoteFileChanged}} write files with overwrite = false,
so when run against a store which (for any reason) already has those files,
the tests fail.
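
The likely fix is to create the test files with the overwriting variant of
create(); a minimal sketch against the local filesystem (the path is
hypothetical):

{code:java}
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class OverwriteSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.getLocal(new Configuration());
    Path path = new Path("/tmp/test-data");  // hypothetical test path
    // create(path, true) overwrites any existing file, so a rerun against
    // a store that already holds the file cannot fail on creation.
    try (FSDataOutputStream out = fs.create(path, true)) {
      out.write("test data".getBytes(StandardCharsets.UTF_8));
    }
  }
}
{code}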








[jira] [Created] (HADOOP-16412) S3a getFileStatus to update DDB if an S3 query returns etag/versionID

2019-07-08 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-16412:

 Summary: S3a getFileStatus to update DDB if an S3 query returns 
etag/versionID
 Key: HADOOP-16412
 URL: https://issues.apache.org/jira/browse/HADOOP-16412
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.3.0
Reporter: Steve Loughran


Now that S3Guard tables support etags and version IDs, we should do more to
populate them.

# listStatus/listFiles doesn't give us all the information; the AWS v1 and v2
list operations only return the etags
# a treewalk on import with a HEAD on each object would be expensive and slow

What we can do is, on a getFileStatus call, update the version marker of any
S3Guard table entry where

* the etag is already in the S3Guard entry
* the probe of the store returns an entry with the same etag and a version ID

In that situation we know the S3 data and the S3Guard data are consistent, so
updating the version ID fills out the data.

We could also think about updating etags on entries created by older versions
of S3Guard; it'd be a bit trickier there to decide whether the S3 store entry
was current. Probably safest to leave alone...
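
A sketch of the decision logic, using made-up stand-ins for the S3Guard entry
and the S3 probe result; the real DynamoDBMetadataStore types and API differ:

{code:java}
import java.util.Objects;

class VersionBackfillSketch {
  // Hypothetical projections of an S3Guard table entry and an S3 HEAD result.
  static class GuardEntry { String etag; String versionId; }
  static class S3Status   { String etag; String versionId; }

  /**
   * Copy the version ID onto the S3Guard entry only when both sides carry
   * the same etag, i.e. when we know they describe the same object.
   */
  static boolean maybeBackfillVersionId(GuardEntry guard, S3Status s3) {
    if (guard.etag != null
        && Objects.equals(guard.etag, s3.etag)
        && s3.versionId != null
        && guard.versionId == null) {
      guard.versionId = s3.versionId;
      return true;  // caller would persist the updated entry
    }
    return false;   // etags differ or nothing to fill: leave the entry alone
  }
}
{code}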


