[ 
https://issues.apache.org/jira/browse/HADOOP-19093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17831082#comment-17831082
 ] 

ASF GitHub Bot commented on HADOOP-19093:
-----------------------------------------

steveloughran commented on PR #6596:
URL: https://github.com/apache/hadoop/pull/6596#issuecomment-2021481003

   I am having fun testing at scale with a throttled store. 
   ```
   [ERROR] testSeekZeroByteFile(org.apache.hadoop.fs.azurebfs.contract.ITestAbfsFileSystemContractSeek)  Time elapsed: 0.339 s  <<< ERROR!
   java.io.IOException: Failed with java.io.IOException while processing file/directory :[/fork-0003/test/seekfile.txt] in method:[Operation failed: "The resource was created or modified by the Azure Blob Service API and cannot be written to by the Azure Data Lake Storage Service API.", 409, PUT, https://stevelukwest.dfs.core.windows.net/stevel-testing/fork-0003/test/seekfile.txt?action=flush&retainUncommittedData=false&position=1024&close=true&timeout=90, rId: 2c806902-b01f-0087-20c1-7fec44000000, InvalidFlushOperation, "The resource was created or modified by the Azure Blob Service API and cannot be written to by the Azure Data Lake Storage Service API. RequestId:2c806902-b01f-0087-20c1-7fec44000000 Time:2024-03-26T21:04:12.1799296Z"]
   ```
   
   Plus a likely regression from the manifest changes:
   ```
   [ERROR] Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 10.484 s <<< FAILURE! - in org.apache.hadoop.fs.azurebfs.commit.ITestAbfsCreateOutputDirectoriesStage
   [ERROR] testPrepareDirtyTree(org.apache.hadoop.fs.azurebfs.commit.ITestAbfsCreateOutputDirectoriesStage)  Time elapsed: 9.767 s  <<< FAILURE!
   java.lang.AssertionError: Expected a java.io.IOException to be thrown, but got the result: : Result{directory count=64}
           at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:499)
           at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:384)
           at org.apache.hadoop.mapreduce.lib.output.committer.manifest.TestCreateOutputDirectoriesStage.testPrepareDirtyTree(TestCreateOutputDirectoriesStage.java:254)
           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.lang.reflect.Method.invoke(Method.java:498)
           at 
   ```
   
   ```
   [ERROR] test_120_terasort(org.apache.hadoop.fs.azurebfs.commit.ITestAbfsTerasort)  Time elapsed: 27.232 s  <<< ERROR!
   java.io.IOException: 
   java.io.IOException: Unknown Job job_1711487009902_0002
           at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.verifyAndGetJob(HistoryClientService.java:240)
           at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getCounters(HistoryClientService.java:254)
           at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getCounters(MRClientProtocolPBServiceImpl.java:159)
           at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$2.callBlockingMethod(MRClientProtocol.java:287)
           at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621)
           at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
           at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
           at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
           at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1249)
           at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1172)
   ```
   I will need to isolate this one.




> Improve rate limiting through ABFS in Manifest Committer
> --------------------------------------------------------
>
>                 Key: HADOOP-19093
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19093
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure, test
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>
> I need a load test to verify that the rename resilience of the manifest 
> committer actually works as intended:
> * test suite with the ILoadTest* name prefix (as with s3)
> * parallel test running with many threads doing many renames
> * verify that rename recovery is detected
> * and that renames MUST NOT fail.
> Maybe also: metrics for this in fs and a doc update. 
> Possibly: LogExactlyOnce to warn of load issues.
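
The "many threads doing many renames" shape described in the issue could be sketched roughly as below. This is only an illustration under loud assumptions: it uses java.nio.file on a local temporary directory as a stand-in for the store, so it exercises neither ABFS, throttling, nor rename recovery, and the class name RenameLoadSketch and its run() method are hypothetical and not part of Hadoop or of PR #6596.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: many threads each performing many renames. */
public class RenameLoadSketch {

  /**
   * Run threads * renamesPerThread renames under a local temp directory
   * (an assumption; a real test would target the store under test).
   * @return the number of renames that failed
   */
  public static int run(int threads, int renamesPerThread) throws Exception {
    Path base = Files.createTempDirectory("rename-load");
    Path srcDir = Files.createDirectory(base.resolve("src"));
    Path destDir = Files.createDirectory(base.resolve("dest"));
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Integer>> results = new ArrayList<>();
      for (int t = 0; t < threads; t++) {
        final int worker = t;
        Callable<Integer> task = () -> {
          int failed = 0;
          for (int i = 0; i < renamesPerThread; i++) {
            // Unique name per worker and iteration, so renames never collide.
            String name = "part-" + worker + "-" + i;
            try {
              Path src = Files.createFile(srcDir.resolve(name));
              Files.move(src, destDir.resolve(name));
            } catch (IOException e) {
              // Count the failure rather than aborting the whole run.
              failed++;
            }
          }
          return failed;
        };
        results.add(pool.submit(task));
      }
      int failures = 0;
      for (Future<Integer> f : results) {
        failures += f.get(60, TimeUnit.SECONDS);
      }
      return failures;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    int failures = run(8, 100);
    System.out.println("failed renames: " + failures);
    if (failures != 0) {
      throw new AssertionError("renames MUST NOT fail, but " + failures + " did");
    }
  }
}
```

A real ILoadTest* suite would presumably swap the local paths for the filesystem under test and also check whatever signal shows that rename recovery took place; neither is attempted here.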


