[ 
https://issues.apache.org/jira/browse/HADOOP-17313?focusedWorklogId=514259&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-514259
 ]

ASF GitHub Bot logged work on HADOOP-17313:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 19/Nov/20 18:15
            Start Date: 19/Nov/20 18:15
    Worklog Time Spent: 10m 
      Work Description: hadoop-yetus commented on pull request #2396:
URL: https://github.com/apache/hadoop/pull/2396#issuecomment-730550232


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 34s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 44s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  22m  8s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 46s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  17m 55s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 51s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 31s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 20s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 15s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 24s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 42s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 36s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 55s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  20m 55s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m  9s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  18m  9s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   2m 46s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 29s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  15m 44s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 55s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  10m  0s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   1m 40s |  |  hadoop-aws in the patch passed. 
 |
   | -1 :x: |  asflicense  |   0m 56s | 
[/patch-asflicense-problems.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2396/5/artifact/out/patch-asflicense-problems.txt)
 |  The patch generated 2 ASF License warnings.  |
   |  |   | 194m 37s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2396/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2396 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle markdownlint |
   | uname | Linux 1c1e123c2239 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 34aa6137bd8 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2396/5/testReport/ |
   | Max. process+thread count | 2427 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws 
U: . |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2396/5/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 514259)
    Time Spent: 4h  (was: 3h 50m)

> FileSystem.get to support slow-to-instantiate FS clients
> --------------------------------------------------------
>
>                 Key: HADOOP-17313
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17313
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs, fs/azure, fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> A recurrent problem in processes with many worker threads (hive, spark etc) 
> is that calling `FileSystem.get(URI-to-object-store)` triggers the creation 
> and then discard of many FS clients -all but one for the same URL. As well as 
> the direct performance hit, this can exacerbate locking problems and make 
> instantiation a lot slower than it would otherwise be.
> This has been observed with the S3A and ABFS connectors.
> The ultimate solution here would probably be something more complicated to 
> ensure that only one thread was ever creating a connector for a given URL 
> -the rest would wait for it to be initialized. This would (a) reduce 
> contention & CPU, IO network load, and (b) reduce the time for all but the 
> first thread to resume processing to that of the remaining time in 
> .initialize(). This would also benefit the S3A connector.
> We'd need something like
> # A (per-user) map of filesystems being created <URI, FileSystem>
> # split createFileSystem into two: instantiateFileSystem and 
> initializeFileSystem
> # each thread to instantiate the FS, put() it into the new map
> # If there was one already, discard the old one and wait for the new one to 
> be ready via a call to Object.wait()
> # If there wasn't an entry, call initializeFileSystem) and then, finally, 
> call Object.notifyAll(), and move it from the map of filesystems being 
> initialized to the map of created filesystems
> This sounds too straightforward to be that simple; the troublespots are 
> probably related to race conditions moving entries between the two maps and 
> making sure that no thread will block on the FS being initialized while it 
> has already been initialized (and so wait() will block forever).
> Rather than seek perfection, it may be safest go for a best-effort 
> optimisation of the #of FS instances created/initialized. That is: its better 
> to maybe create a few more FS instances than needed than it is to block 
> forever.
> Something is doable here, it's just not quick-and-dirty. Testing will be 
> "fun"; probably best to isolate this new logic somewhere where we can 
> simulate slow starts on one thread with many other threads waiting for it.
> A simpler option would be to have a lock on the construction process: only 
> one FS can be instantiated per user at a a time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to