[GitHub] [hadoop] pranavsaxena-microsoft commented on pull request #5176: HADOOP-18546. ABFS:disable purging list of in progress reads in abfs stream closed
pranavsaxena-microsoft commented on PR #5176: URL: https://github.com/apache/hadoop/pull/5176#issuecomment-1343863437

> getting a test failure locally, ITestReadBufferManager failing as one of its asserts isn't valid.
>
> going to reopen the jira. @pranavsaxena-microsoft can you see if you can replicate the problem and add a followup patch (use the same jira). do make sure you are running this test _first_, and that it is failing for you. thanks
>
> ```
> [INFO] Running org.apache.hadoop.fs.azurebfs.services.ITestReadBufferManager
> [ERROR] Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.816 s <<< FAILURE! - in org.apache.hadoop.fs.azurebfs.services.ITestReadBufferManager
> [ERROR] testPurgeBufferManagerForSequentialStream(org.apache.hadoop.fs.azurebfs.services.ITestReadBufferManager)  Time elapsed: 1.995 s  <<< FAILURE!
> java.lang.AssertionError:
> [Buffers associated with closed input streams shouldn't be present]
> Expecting:
> gauges=();
> minimums=((action_http_get_request.failures.min=-1) (action_http_get_request.min=-1));
> maximums=((action_http_get_request.max=-1) (action_http_get_request.failures.max=-1));
> means=((action_http_get_request.failures.mean=(samples=0, sum=0, mean=0.)) (action_http_get_request.mean=(samples=0, sum=0, mean=0.)));
> }AbfsInputStream@(1517329307){StreamStatistics{counters=((stream_read_seek_bytes_skipped=0) (seek_in_buffer=0) (stream_read_bytes=1) (stream_read_seek_operations=0) (remote_bytes_read=81920) (stream_read_operations=1) (bytes_read_buffer=1) (action_http_get_request.failures=0) (action_http_get_request=0) (stream_read_seek_forward_operations=0) (stream_read_bytes_backwards_on_seek=0) (read_ahead_bytes_read=16384) (stream_read_seek_backward_operations=0) (remote_read_op=8));
> gauges=();
> minimums=((action_http_get_request.min=-1) (action_http_get_request.failures.min=-1));
> maximums=((action_http_get_request.max=-1) (action_http_get_request.failures.max=-1));
> means=((action_http_get_request.mean=(samples=0, sum=0, mean=0.)) (action_http_get_request.failures.mean=(samples=0, sum=0, mean=0.)));
> }}>
> not to be equal to:
> gauges=();
> minimums=((action_http_get_request.min=-1) (action_http_get_request.failures.min=-1));
> maximums=((action_http_get_request.max=-1) (action_http_get_request.failures.max=-1));
> means=((action_http_get_request.mean=(samples=0, sum=0, mean=0.)) (action_http_get_request.failures.mean=(samples=0, sum=0, mean=0.)));
> }AbfsInputStream@(1517329307){StreamStatistics{counters=((remote_read_op=8) (stream_read_seek_forward_operations=0) (stream_read_seek_backward_operations=0) (read_ahead_bytes_read=16384) (action_http_get_request.failures=0) (bytes_read_buffer=1) (stream_read_seek_operations=0) (stream_read_bytes=1) (stream_read_bytes_backwards_on_seek=0) (action_http_get_request=0) (seek_in_buffer=0) (stream_read_seek_bytes_skipped=0) (remote_bytes_read=81920) (stream_read_operations=1));
> gauges=();
> minimums=((action_http_get_request.failures.min=-1) (action_http_get_request.min=-1));
> maximums=((action_http_get_request.failures.max=-1) (action_http_get_request.max=-1));
> means=((action_http_get_request.failures.mean=(samples=0, sum=0, mean=0.)) (action_http_get_request.mean=(samples=0, sum=0, mean=0.)));
> }}>
>
> 	at org.apache.hadoop.fs.azurebfs.services.ITestReadBufferManager.assertListDoesnotContainBuffersForIstream(ITestReadBufferManager.java:145)
> 	at org.apache.hadoop.fs.azurebfs.services.ITestReadBufferManager.testPurgeBufferManagerForSequentialStream(ITestReadBufferManager.java:120)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at jav
> ```
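For context, the failing assertion at ITestReadBufferManager.java:145 compares each buffer's owning stream against the closed stream. Below is a minimal sketch of that style of check with assumed names, not the actual test code; it uses AssertJ's `isNotSameAs` for an identity comparison, which sidesteps `equals()` over the statistics state shown in the failure above:

```java
import java.util.List;

import org.assertj.core.api.Assertions;

// Hypothetical sketch, not the real ITestReadBufferManager implementation;
// assumes the test lives in org.apache.hadoop.fs.azurebfs.services so that
// ReadBuffer and AbfsInputStream are visible. Verifies that no buffer still
// tracked by the ReadBufferManager belongs to a closed input stream.
private void assertListDoesNotContainBuffersForStream(List<ReadBuffer> buffers,
    AbfsInputStream closedStream) {
  for (ReadBuffer buffer : buffers) {
    Assertions.assertThat(buffer.getStream())
        .describedAs("Buffers associated with closed input streams shouldn't be present")
        .isNotSameAs(closedStream);  // identity check, independent of statistics state
  }
}
```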
[GitHub] [hadoop] pranavsaxena-microsoft commented on pull request #5176: HADOOP-18546. ABFS:disable purging list of in progress reads in abfs stream closed
pranavsaxena-microsoft commented on PR #5176: URL: https://github.com/apache/hadoop/pull/5176#issuecomment-1339119467

> one final change; the cleanup of the input stream in the test.
>
> giving a +1 pending that, and I'm going to test this through spark today ... writing a test to replicate the failure and then verify that all is good when the jar is updated

Thanks. We are already calling inputStream.close() at https://github.com/apache/hadoop/pull/5176/files#diff-bdc464e1bfa3d270e552bdf740fc29ec808be9ab2c4f77a99bf896ac605a5698R546. Please advise what else is expected for the inputStream cleanup. I agree with the comment on String.format and shall refactor the code accordingly. Regards.
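For reference, one common way to guarantee the stream is cleaned up whether or not the assertions pass is try-with-resources; a minimal sketch with placeholder names (`fs`, the path, and the buffer size are assumptions, not the PR's actual test code):

```java
import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: the stream is closed even if an assertion inside the block throws.
static void readAndAssert(FileSystem fs) throws IOException {
  Path testFilePath = new Path("/testPurge/file");  // placeholder path
  try (FSDataInputStream inputStream = fs.open(testFilePath)) {
    byte[] buffer = new byte[1024];
    inputStream.read(buffer, 0, buffer.length);
    // ... assertions on ReadBufferManager state go here ...
  }
  // inputStream.close() has already run at this point, pass or fail
}
```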
[GitHub] [hadoop] pranavsaxena-microsoft commented on pull request #5176: HADOOP-18546. ABFS:disable purging list of in progress reads in abfs stream closed
pranavsaxena-microsoft commented on PR #5176: URL: https://github.com/apache/hadoop/pull/5176#issuecomment-1339051008

> sorry, should have been clearer: a local spark build and spark-shell process is ideal for replication and validation - as all splits are processed in different worker threads in that process, it recreates the exact failure mode.
>
> script you can take and tune for your system; uses the mkcsv command in the cloudstore JAR.
>
> I am going to add this as a scalatest suite in the same module
>
> https://github.com/hortonworks-spark/cloud-integration/blob/master/spark-cloud-integration/src/scripts/validating-csv-record-io.sc

Thanks for the script. I applied the following changes to it: https://github.com/pranavsaxena-microsoft/cloud-integration/commit/1d779f22150be3102635819e4525967573602dd9.

With trunk's jar, I got this exception:

```
22/12/05 23:51:27 ERROR Executor: Exception in task 4.0 in stage 1.0 (TID 5)
java.lang.NullPointerException: Null value appeared in non-nullable field:
- field (class: "scala.Long", name: "rowId")
- root class: "$line85.$read.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.CsvRecord"
If the schema is inferred from a Scala tuple/case class, or a Java bean, please try to use scala.Option[_] or other nullable types (e.g. java.lang.Integer instead of int/scala.Int).
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply_0_0$(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown Source)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at scala.collection.Iterator.foreach(Iterator.scala:943)
	at scala.collection.Iterator.foreach$(Iterator.scala:943)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
	at org.apache.spark.rdd.RDD.$anonfun$foreach$2(RDD.scala:1001)
	at org.apache.spark.rdd.RDD.$anonfun$foreach$2$adapted(RDD.scala:1001)
	at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2302)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
	at org.apache.spark.scheduler.Task.run(Task.scala:139)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1502)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
```

With the jar built from this PR's code, the run completed successfully:

```
minimums=((action_http_get_request.min=-1) (action_http_get_request.failures.min=-1));
maximums=((action_http_get_request.max=-1) (action_http_get_request.failures.max=-1));
means=((action_http_get_request.failures.mean=(samples=0, sum=0, mean=0.)) (action_http_get_request.mean=(samples=0, sum=0, mean=0.)));
}}
22/12/06 01:04:22 INFO TaskSetManager: Finished task 8.0 in stage 1.0 (TID 9) in 14727 ms on snvijaya-Virtual-Machine.mshome.net (executor driver) (9/9)
22/12/06 01:04:22 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
22/12/06 01:04:22 INFO DAGScheduler: ResultStage 1 (foreach at /home/snvijaya/Desktop/cloud-integration/spark-cloud-integration/src/scripts/validating-csv-record-io.sc:46) finished in 115.333 s
22/12/06 01:04:22 INFO DAGScheduler: Job 1 is finished. Cancelling potential speculative or zombie tasks for this job
22/12/06 01:04:22 INFO TaskSchedulerImpl: Killing all running tasks in stage 1: Stage finished
22/12/06 01:04:22 INFO DAGScheduler: Job 1 finished: foreach at /home/snvijaya/Desktop/cloud-integration/spark-cloud-integration/src/scripts/validating-csv-record-io.sc:46, took 115.337621 s
res35: String = validation completed [start: string, rowId: bigint ... 6 more fields]
```

Commands executed:

```
:load /home/snvijaya/Desktop/cloud-integration/spark-cloud-integration/src/scripts/validating-csv-record-io.sc
validateDS(rowsDS)
```
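The multi-threaded Spark run exercises exactly the pattern the bug needs: many short-lived streams opened, partially read ahead, and closed concurrently. A rough Java sketch of that failure mode follows (names such as `fs`, `splits`, and the thread count are illustrative placeholders, not taken from the script):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch of the failure mode: worker threads open, read, and close streams
// concurrently. Before this PR, close() purged in-progress read-ahead buffers,
// allowing a buffer to be recycled while its network read was still in flight
// and potentially serving corrupt bytes to another stream (surfacing as the
// NullPointerException above when Spark decodes the corrupted records).
static void readSplits(FileSystem fs, List<Path> splits) {
  ExecutorService pool = Executors.newFixedThreadPool(8);
  for (Path split : splits) {
    pool.submit(() -> {
      try (FSDataInputStream in = fs.open(split)) {
        byte[] buf = new byte[8192];
        while (in.read(buf, 0, buf.length) > 0) {
          // consume bytes; a wrong byte here is what the CSV validation catches
        }
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    });
  }
  pool.shutdown();
}
```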
[GitHub] [hadoop] pranavsaxena-microsoft commented on pull request #5176: HADOOP-18546. ABFS:disable purging list of in progress reads in abfs stream closed
pranavsaxena-microsoft commented on PR #5176: URL: https://github.com/apache/hadoop/pull/5176#issuecomment-1334834625

Test results:

```
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] Errors:
[ERROR]   TestExponentialRetryPolicy.testOperationOnAccountIdle:216 » AccessDenied Opera...
[INFO]
[ERROR] Tests run: 111, Failures: 1, Errors: 1, Skipped: 1
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR]   ITestAzureBlobFileSystemLease.testAcquireRetry:329 » TestTimedOut test timed o...
[ERROR]   ITestAzureBlobFileSystemOauth.testBlobDataContributor:84 » AccessDenied Operat...
[ERROR]   ITestAzureBlobFileSystemOauth.testBlobDataReader:143 » AccessDenied Operation ...
[INFO]
[ERROR] Tests run: 567, Failures: 0, Errors: 3, Skipped: 99
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   ITestAbfsFileSystemContractSeek.testSeekAndReadWithReadAhead:130->assertNoIncrementInRemoteReadOps:258 [Number of remote read ops shouldn't increase] expected:<[1]L> but was:<[2]L>
[ERROR] Errors:
[ERROR]   ITestAbfsTerasort.test_120_terasort:262->executeStage:206 » IO The ownership o...
[INFO]
[ERROR] Tests run: 335, Failures: 1, Errors: 1, Skipped: 54

Time taken: 9 mins 40 secs.
Find test result for the combination (AppendBlob-HNS-OAuth) in: dev-support/testlogs/2022-12-02_06-12-45/Test-Logs-AppendBlob-HNS-OAuth.txt
Consolidated test result is saved in: dev-support/testlogs/2022-12-02_06-12-45/Test-Results.txt

AGGREGATED TEST RESULT

HNS-OAuth
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] Errors:
[ERROR]   TestExponentialRetryPolicy.testOperationOnAccountIdle:216 » AccessDenied Opera...
[INFO]
[ERROR] Tests run: 111, Failures: 1, Errors: 1, Skipped: 1
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR]   ITestAzureBlobFileSystemLease.testAcquireRetry:329 » TestTimedOut test timed o...
[ERROR]   ITestAzureBlobFileSystemOauth.testBlobDataContributor:84 » AccessDenied Operat...
[ERROR]   ITestAzureBlobFileSystemOauth.testBlobDataReader:143 » AccessDenied Operation ...
[INFO]
[ERROR] Tests run: 567, Failures: 0, Errors: 3, Skipped: 99
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   ITestAbfsFileSystemContractSeek.testSeekAndReadWithReadAhead:130->assertNoIncrementInRemoteReadOps:258 [Number of remote read ops shouldn't increase] expected:<[1]L> but was:<[2]L>
[ERROR] Errors:
[ERROR]   ITestAbfsTerasort.test_120_terasort:262->executeStage:206 » IO The ownership o...
[INFO]
[ERROR] Tests run: 335, Failures: 1, Errors: 1, Skipped: 54

HNS-SharedKey
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR]   TestAbfsClientThrottlingAnalyzer.testManySuccessAndErrorsAndWaiting:181->fuzzyValidate:64 The actual value 9 is not within the expected range: [5.60, 8.40].
[ERROR] Errors:
[ERROR]   TestExponentialRetryPolicy.testOperationOnAccountIdle:216 » AccessDenied Opera...
[INFO]
[ERROR] Tests run: 111, Failures: 2, Errors: 1, Skipped: 2
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR]   ITestAzureBlobFileSystemLease.testAcquireRetry:329 » TestTimedOut test timed o...
[INFO]
[ERROR] Tests run: 567, Failures: 0, Errors: 1, Skipped: 54
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   ITestAbfsFileSystemContractSeek.testSeekAndReadWithReadAhead:130->assertNoIncrementInRemoteReadOps:258 [Number of remote read ops shouldn't increase] expected:<[1]L> but was:<[2]L>
[INFO]
[ERROR] Tests run: 335, Failures: 1, Errors: 0, Skipped: 41

NonHNS-SharedKey
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] Errors:
[ERROR]   TestExponentialRetryPolic
```
[GitHub] [hadoop] pranavsaxena-microsoft commented on pull request #5176: HADOOP-18546. ABFS:disable purging list of in progress reads in abfs stream closed
pranavsaxena-microsoft commented on PR #5176: URL: https://github.com/apache/hadoop/pull/5176#issuecomment-1333736063

> The ReadAhead feature can be re-enabled by default, since we are undoing the known problem behind the corruption issue reported earlier. Please include that change in this PR.
>
> Also have some comments on the tests. Please take a look.

Commit https://github.com/apache/hadoop/commit/69e50c7b4499bffc1eb372799ccba3f26c5fe54e ([HADOOP-18528](https://issues.apache.org/jira/browse/HADOOP-18528): Disable abfs prefetching by default, https://github.com/apache/hadoop/pull/5134) is reverted in this PR by commit https://github.com/apache/hadoop/pull/5176/commits/02d39ca453c35cfe69c7c78ed3fcae00c7211615.
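With that revert in place, prefetching is again governed by the ABFS read-ahead switch introduced by HADOOP-18528 (`fs.azure.enable.readahead`). A minimal sketch of toggling it explicitly; the method name and account URI are placeholders, and the default shown is per the revert described above:

```java
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Illustrative only: after the revert, read-ahead defaults to enabled again,
// but the flag can still be set per-job to rule a regression in or out.
static FileSystem openAbfs() throws IOException {
  Configuration conf = new Configuration();
  conf.setBoolean("fs.azure.enable.readahead", true);  // key from HADOOP-18528
  return FileSystem.get(
      URI.create("abfs://container@account.dfs.core.windows.net/"), conf);  // placeholder URI
}
```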