[GitHub] [hadoop] steveloughran commented on pull request #5176: HADOOP-18546. ABFS:disable purging list of in progress reads in abfs stream closed

2022-12-09 Thread GitBox


steveloughran commented on PR #5176:
URL: https://github.com/apache/hadoop/pull/5176#issuecomment-1344513110

   #5205 is another follow-up, adding the logging and a probe through path capabilities; the probe lets me verify that backports are in.
   
   An ABFS instance is vulnerable if
   ```java
   // path: any Path within the store being probed
   fs.hasPathCapability(path, "fs.capability.paths.acls")
       && !fs.hasPathCapability(path, "HADOOP-18546")
   ```
   If that holds, you need to make sure that readahead is disabled or that the readahead queue depth is zero. Setting the queue depth is the one option guaranteed to work everywhere.
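
The check above can be wrapped in a small helper. What follows is a minimal sketch, not the code from #5205. `CapabilityProbe` is a hypothetical stand-in for `FileSystem.hasPathCapability(Path, String)` so that the example runs without Hadoop on the classpath. `fs.azure.readaheadqueue.depth` is assumed to be the queue-depth option meant here, since the comment does not name it. The path in `main` is a placeholder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of the "is this ABFS client vulnerable?" probe described above. */
final class ReadaheadCheck {

  /** Hypothetical stand-in for FileSystem.hasPathCapability(Path, String). */
  interface CapabilityProbe {
    boolean hasPathCapability(String path, String capability);
  }

  // Capability names as given in the comment.
  static final String ACLS = "fs.capability.paths.acls";
  static final String FIX_MARKER = "HADOOP-18546";

  // Assumption: the ABFS readahead queue depth option; not named in the comment.
  static final String QUEUE_DEPTH_KEY = "fs.azure.readaheadqueue.depth";

  /** Vulnerable when the ACL capability is present but the fix marker is not. */
  static boolean isVulnerable(CapabilityProbe fs, String path) {
    return fs.hasPathCapability(path, ACLS)
        && !fs.hasPathCapability(path, FIX_MARKER);
  }

  /** Options to apply to the client configuration; empty when not vulnerable. */
  static Map<String, String> mitigation(CapabilityProbe fs, String path) {
    Map<String, String> options = new LinkedHashMap<>();
    if (isVulnerable(fs, path)) {
      options.put(QUEUE_DEPTH_KEY, "0");
    }
    return options;
  }

  public static void main(String[] args) {
    String path = "abfs://container@account.dfs.core.windows.net/"; // placeholder
    CapabilityProbe unpatched = (p, capability) -> ACLS.equals(capability);
    CapabilityProbe patched = (p, capability) -> true;
    System.out.println("unpatched: " + mitigation(unpatched, path));
    System.out.println("patched:   " + mitigation(patched, path));
  }
}
```

With a real client, the lambda would be replaced by a call through to the filesystem instance, and the returned options would be set on the job or cluster configuration before the filesystem is created.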


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] steveloughran commented on pull request #5176: HADOOP-18546. ABFS:disable purging list of in progress reads in abfs stream closed

2022-12-09 Thread GitBox


steveloughran commented on PR #5176:
URL: https://github.com/apache/hadoop/pull/5176#issuecomment-1344461171

   Oh, and on my personal backport I have added a TRACE log in the buffer manager to record its state; AbfsInputStream.toString() does it too.
   ```
   private ReadBufferManager() {
     LOGGER.trace("Creating readbuffer manager with HADOOP-18546 patch");
   }
   ```
   I think I will retain those internally as a debug option.





[GitHub] [hadoop] steveloughran commented on pull request #5176: HADOOP-18546. ABFS:disable purging list of in progress reads in abfs stream closed

2022-12-09 Thread GitBox


steveloughran commented on PR #5176:
URL: https://github.com/apache/hadoop/pull/5176#issuecomment-1344325189

   Update: full end-to-end tests through spark-shell are happy! I was trying to write scalatest tests for this, but have not been able to replicate the failure through my test suite (which rebuilds the .csv file on every run, so it was also very slow). With the manual tests passing and #5198 in, all is good.





[GitHub] [hadoop] steveloughran commented on pull request #5176: HADOOP-18546. ABFS:disable purging list of in progress reads in abfs stream closed

2022-12-08 Thread GitBox


steveloughran commented on PR #5176:
URL: https://github.com/apache/hadoop/pull/5176#issuecomment-1343192257

   It's a race condition in the test, which is why you didn't see it: different machine, network, etc.





[GitHub] [hadoop] steveloughran commented on pull request #5176: HADOOP-18546. ABFS:disable purging list of in progress reads in abfs stream closed

2022-12-08 Thread GitBox


steveloughran commented on PR #5176:
URL: https://github.com/apache/hadoop/pull/5176#issuecomment-1343033763

   I am getting a test failure locally: ITestReadBufferManager is failing because one of its asserts isn't valid.
   
   I am going to reopen the JIRA.
   @pranavsaxena-microsoft, can you see if you can replicate the problem and add a follow-up patch (use the same JIRA)?
   Do make sure you are running this test *first*, and that it is failing for you. Thanks.
   
   ```
   [INFO] Running org.apache.hadoop.fs.azurebfs.services.ITestReadBufferManager
   [ERROR] Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.816 s <<< FAILURE! - in org.apache.hadoop.fs.azurebfs.services.ITestReadBufferManager
   [ERROR] testPurgeBufferManagerForSequentialStream(org.apache.hadoop.fs.azurebfs.services.ITestReadBufferManager)  Time elapsed: 1.995 s  <<< FAILURE!
   java.lang.AssertionError:
   [Buffers associated with closed input streams shouldn't be present]
   Expecting:

   not to be equal to:

       at org.apache.hadoop.fs.azurebfs.services.ITestReadBufferManager.assertListDoesnotContainBuffersForIstream(ITestReadBufferManager.java:145)
       at org.apache.hadoop.fs.azurebfs.services.ITestReadBufferManager.testPurgeBufferManagerForSequentialStream(ITestReadBufferManager.java:120)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
       at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
       at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
       at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
       at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
       at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
       at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
       at java.lang.Thread.run(Thread.java:750)
   
   ```
   





[GitHub] [hadoop] steveloughran commented on pull request #5176: HADOOP-18546. ABFS:disable purging list of in progress reads in abfs stream closed

2022-12-06 Thread GitBox


steveloughran commented on PR #5176:
URL: https://github.com/apache/hadoop/pull/5176#issuecomment-1339526719

   clarified the cleanup problem





[GitHub] [hadoop] steveloughran commented on pull request #5176: HADOOP-18546. ABFS:disable purging list of in progress reads in abfs stream closed

2022-12-03 Thread GitBox


steveloughran commented on PR #5176:
URL: https://github.com/apache/hadoop/pull/5176#issuecomment-1336158644

   Sorry, I should have been clearer: a local Spark build and spark-shell process is ideal for replication and validation. As all splits are processed in different worker threads in that one process, it recreates the exact failure mode.
   
   Here is a script you can take and tune for your system; it uses the mkcsv command in the cloudstore JAR.
   
   I am going to add this as a scalatest suite in the same module.
   
https://github.com/hortonworks-spark/cloud-integration/blob/master/spark-cloud-integration/src/scripts/validating-csv-record-io.sc

