[ 
https://issues.apache.org/jira/browse/VFS-862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilan Goldfeld updated VFS-862:
------------------------------
    Description: 
h2. Description
h3. Steps to reproduce
{code:java}
DefaultFileSystemManager manager = new DefaultFileSystemManager();
manager.addProvider("ftp", new FtpFileProvider());
manager.setCacheStrategy(CacheStrategy.ON_RESOLVE);
manager.init();
{code}
*Example 1 – pre-existing, before PR #758:*
Checking existence of two sibling files should require 1 LIST on the parent, 
but the second file's internal {{getParent()}} re-resolves the parent via 
{{{}resolveFile(){}}}, triggering an ON_RESOLVE refresh that forces a redundant 
LIST.
{code:java}
FileObject file1 = manager.resolveFile("ftp://user:pass@host/dir/file1.txt";, 
options);
FileObject file2 = manager.resolveFile("ftp://user:pass@host/dir/file2.txt";, 
options);
file1.exists();  // LIST on parent to determine type
file2.exists();  // redundant LIST — getParent() refreshes parent via ON_RESOLVE
// Expected: 1 LIST
// Actual:   2 LIST commands
{code}
*Example 2 – after PR #758, amplified to O(N):*
Scanning a directory with N files now produces ~N LIST commands instead of 1.
{code:java}
FileObject dir = manager.resolveFile("ftp://user:pass@host/directory";, options);
dir.refresh();
FileObject[] children = dir.findFiles(new FileDepthSelector(1, 1));
// Expected: 1 LIST
// Actual:   ~N LIST commands — each child's getParent() refreshes the parent,
//           and the unconditional childMap = null in refresh() clears the
//           cached listing every time
{code}
h3. The issue

{{CacheStrategy.ON_RESOLVE}} is designed to refresh a file when the user 
resolves it via the public API. However, internal VFS navigation methods also 
call {{{}fileSystem.resolveFile(){}}}, which triggers the same ON_RESOLVE 
refresh. This means that internal plumbing – not user actions – invalidates 
cached state on files the user never asked to refresh.

Affected internal call sites:
 * {{AbstractFileObject.getParent()}} – resolves the parent directory
 * {{AbstractFileObject.getChildren()}} – resolves each child via 
{{resolveFiles()}}
 * {{AbstractFileSystem.getRoot()}} – resolves the root file object (called by 
{{getParent()}} for the root check)
 * {{FtpFileObject.getLinkDestination()}} – resolves symlink targets
 * {{SftpFileObject.doListChildrenResolved()}} – resolves children during SFTP 
listing

For example, resolving two sibling files and checking their existence should 
require a single LIST on the parent directory. Instead, the second file's 
internal {{getParent()}} call re-resolves the parent via {{{}resolveFile(){}}}, 
triggering an ON_RESOLVE refresh that clears the parent's cached directory 
listing, forcing a redundant LIST.
h3. History

This was always a wasteful pattern, but it became a severe performance 
regression after PR #758 ("Fix refresh() not clearing cached state when file 
was never attached").

PR #758 addressed a real correctness issue: provider-specific cached fields 
like {{FtpFileObject.childMap}} could be populated without going through 
{{attach()}} (e.g. via {{getChildFile()}} -> {{{}doGetChildren(){}}}). When 
{{refresh()}} was called on such a file, {{detach()}} skipped {{doDetach()}} 
because {{{}attached == false{}}}, silently preserving stale data. The fix 
added {{childMap = null}} to {{FtpFileObject.refresh()}} to ensure cached state 
is always cleared on refresh.

However, this turned the unnecessary internal refreshes into an O(N) 
amplification: when scanning a directory with N files via {{{}findFiles(){}}}, 
each child's {{getType()}} call triggers {{{}getParent(){}}}, which re-resolves 
the parent via {{{}resolveFile(){}}}. With ON_RESOLVE, this refreshes the 
parent, and the new unconditional {{childMap = null}} clears the parent's 
cached directory listing for every child – forcing a new FTP LIST command each 
time. A directory with 50 files produces ~50 LIST commands instead of 1.

Before PR #758, the parent's {{childMap}} survived these unnecessary refreshes 
because {{refresh()}} only cleared {{childMap}} via {{{}doDetach(){}}}, which 
was guarded by {{{}if (attached){}}}. After the first child's refresh set the 
parent to {{{}attached=false{}}}, subsequent refreshes left {{childMap}} 
intact. PR #758 removed this accidental protection.

Even without PR #758, the pattern was wasteful. For example, resolving two 
sibling files and checking their existence would produce 2 LIST commands 
instead of 1, because the second file's internal {{getParent()}} call refreshes 
the parent (clearing its state via {{doDetach()}} when the parent is attached).
h3. Fix

The fix has two parts:

*1. Internal navigation skips ON_RESOLVE.* A new {{resolveFileInternal()}} 
method resolves files from the cache without applying the ON_RESOLVE refresh 
policy. All internal navigation call sites use this instead of the public 
{{{}resolveFile(){}}}:
 * {{AbstractFileObject.getParent()}} – parent resolution
 * {{AbstractFileObject.resolveFile(FileName)}} – child resolution in 
{{{}resolveFiles(){}}}/{{{}getChild(){}}}
 * {{AbstractFileSystem.getRoot()}} – root file object lookup
 * {{FtpFileObject.getLinkDestination()}} – symlink target resolution
 * {{SftpFileObject.doListChildrenResolved()}} – SFTP child resolution

A protected bridge method in {{AbstractFileObject}} makes 
{{resolveFileInternal()}} available to provider subclasses.

*2. Fresh directory listings propagate metadata to cached children in-place.* 
When a directory listing is performed after a refresh, FTP and SFTP providers 
update cached child objects directly with the fresh metadata from the listing, 
preserving object identity:
 * {*}FTP{*}: {{FtpFileObject.getChildren()}} detects a fresh LIST (because 
{{childMap}} was null after refresh). After the LIST populates 
{{{}childMap{}}}, it iterates the returned children and sets each child's 
{{ftpFile}} to the matching entry from the fresh {{{}childMap{}}}, then clears 
the cached {{type}} via {{injectType(null)}} so it re-evaluates from the fresh 
metadata.
 * {*}SFTP{*}: {{SftpFileObject.doListChildrenResolved()}} already sets fresh 
{{attrs}} on each child via {{{}setStat(){}}}. Now it also clears the cached 
{{type}} via {{injectType(null)}} so it re-evaluates from the updated 
{{{}attrs{}}}.

This approach was chosen over cache eviction (removing and recreating child 
objects) because eviction breaks object identity – existing VFS tests assert 
that resolving the same path returns the same FileObject instance. In-place 
propagation updates the same objects with fresh data.

Other providers like SMB and local files don't need this – their 
{{doGetType()}} resolves directly from the server/filesystem on every call, so 
after {{detach()}} clears the cached {{{}type{}}}, children naturally get fresh 
data.

Together, these changes establish a clear contract: *cached state is used until 
the user explicitly calls {{refresh()}} or resolves a file via the public API.* 
Internal navigation never triggers server operations. When a directory is 
refreshed, both the directory listing and its children are fresh.

 h3. Jackrabbit 1.x test fix

  The {{resolveFileInternal}} change in {{getParent()}} exposed a pre-existing 
bug in the Jackrabbit 1.x embedded test server 
({{WebdavProviderTest.testFileCreate}} failed with "expected size 0, got 112").

  *Root cause:* Jackrabbit 1.x bundles Jetty 6.x, which does not drain 
unconsumed request bodies before sending error responses on persistent HTTP 
connections. When a PROPFIND with a 112-byte XML request body
  receives a 404, and the connection is reused for a subsequent PUT to the same 
URL, the server reads the stale PROPFIND bytes as the PUT body — storing the 
XML as file content instead of an empty file.

  Previously this was masked: {{getParent()}} used {{resolveFile()}} which 
triggered an ON_RESOLVE refresh on the parent directory, sending an extra 
PROPFIND request that happened to flush the stale bytes from
  the connection. With {{resolveFileInternal}}, the PUT follows the 404 
immediately on the same connection, exposing the server bug.

  This is a known class of issue in Jetty:
  * [jetty#651|https://github.com/jetty/jetty.project/issues/651]
  * [jetty#4117|https://github.com/jetty/jetty.project/issues/4117]
  * [jetty#6168|https://github.com/jetty/jetty.project/issues/6168]

  Jetty 9.4+ drains or closes connections with unconsumed content; Jetty 6.x 
does not.

  *Fix:* Add {{Connection: close}} to the Jackrabbit 1.x test {{HttpClient}} 
configuration, disabling persistent connection reuse for the embedded test 
server. The Jackrabbit 2 tests (newer Jetty + HttpClient
  4.x) pass without any changes. No production VFS code is modified.

  Note: Jackrabbit 1.x tests are already skipped on Java 14+ 
({{<jdk>[14,)</jdk>}} in pom.xml) since Jackrabbit 1.6.5 (2011) cannot run on 
modern JVMs.
 
h3. Tests
 * {{{}FtpGetChildrenListCommandTest{}}}: verifies that {{findFiles()}} on an 
FTP directory with 50 files issues exactly 1 LIST command with 
{{{}CacheStrategy.ON_RESOLVE{}}}.
 * {{{}SftpGetChildrenListCommandTest{}}}: verifies that refreshing an SFTP 
directory and calling {{findFiles()}} returns fresh children reflecting 
filesystem changes, proving in-place metadata propagation works for SFTP.

h3. Compatibility note

This changes the observable behavior of {{CacheStrategy.ON_RESOLVE}} for files 
reached through internal navigation. The following call sites no longer trigger 
ON_RESOLVE refresh:
 * {{AbstractFileObject.getParent()}} – resolving the parent directory
 * {{AbstractFileObject.getChildren()}} / {{getChild()}} – resolving children
 * {{AbstractFileSystem.getRoot()}} – resolving the root file object
 * {{FtpFileObject.getLinkDestination()}} – resolving symlink targets
 * {{SftpFileObject.doListChildrenResolved()}} – resolving children during SFTP 
listing

Previously, these methods called {{{}fileSystem.resolveFile(){}}}, which 
triggered an ON_RESOLVE refresh on each returned file. After this fix, 
internally navigated files are served from the cache and only refreshed when 
the user explicitly calls {{refresh()}} or re-resolves them via the public API.

When a directory is refreshed and its children are listed, the FTP and SFTP 
providers propagate fresh metadata from the listing to cached child objects 
in-place (updating {{{}ftpFile{}}}/{{{}attrs{}}} and clearing the cached 
{{{}type{}}}). Object identity is preserved – the same FileObject instances are 
returned with updated data.

This behavior is consistent with {{CacheStrategy.MANUAL}} and aligns with the 
intended ON_RESOLVE contract: refresh on explicit user resolution, not on 
internal navigation.

  was:
h2. Description

h3. Steps to reproduce

{code:java}
DefaultFileSystemManager manager = new DefaultFileSystemManager();
manager.addProvider("ftp", new FtpFileProvider());
manager.setCacheStrategy(CacheStrategy.ON_RESOLVE);
manager.init();
{code}

*Example 1 -- pre-existing, before PR #758:*
Checking existence of two sibling files should require 1 LIST on the parent, 
but the second file's internal {{getParent()}} re-resolves the parent via 
{{resolveFile()}}, triggering an ON_RESOLVE refresh that forces a redundant 
LIST.

{code:java}
FileObject file1 = manager.resolveFile("ftp://user:pass@host/dir/file1.txt";, 
options);
FileObject file2 = manager.resolveFile("ftp://user:pass@host/dir/file2.txt";, 
options);
file1.exists();  // LIST on parent to determine type
file2.exists();  // redundant LIST — getParent() refreshes parent via ON_RESOLVE
// Expected: 1 LIST
// Actual:   2 LIST commands
{code}

*Example 2 -- after PR #758, amplified to O(N):*
Scanning a directory with N files now produces ~N LIST commands instead of 1.

{code:java}
FileObject dir = manager.resolveFile("ftp://user:pass@host/directory";, options);
dir.refresh();
FileObject[] children = dir.findFiles(new FileDepthSelector(1, 1));
// Expected: 1 LIST
// Actual:   ~N LIST commands — each child's getParent() refreshes the parent,
//           and the unconditional childMap = null in refresh() clears the
//           cached listing every time
{code}

h3. The issue

{{CacheStrategy.ON_RESOLVE}} is designed to refresh a file when the user 
resolves it via the public API. However, internal VFS navigation methods also 
call {{fileSystem.resolveFile()}}, which triggers the same ON_RESOLVE refresh. 
This means that internal plumbing -- not user actions -- invalidates cached 
state on files the user never asked to refresh.

Affected internal call sites:
* {{AbstractFileObject.getParent()}} -- resolves the parent directory
* {{AbstractFileObject.getChildren()}} -- resolves each child via 
{{resolveFiles()}}
* {{AbstractFileSystem.getRoot()}} -- resolves the root file object (called by 
{{getParent()}} for the root check)
* {{FtpFileObject.getLinkDestination()}} -- resolves symlink targets
* {{SftpFileObject.doListChildrenResolved()}} -- resolves children during SFTP 
listing

For example, resolving two sibling files and checking their existence should 
require a single LIST on the parent directory. Instead, the second file's 
internal {{getParent()}} call re-resolves the parent via {{resolveFile()}}, 
triggering an ON_RESOLVE refresh that clears the parent's cached directory 
listing, forcing a redundant LIST.

h3. History

This was always a wasteful pattern, but it became a severe performance 
regression after PR #758 ("Fix refresh() not clearing cached state when file 
was never attached").

PR #758 addressed a real correctness issue: provider-specific cached fields 
like {{FtpFileObject.childMap}} could be populated without going through 
{{attach()}} (e.g. via {{getChildFile()}} -> {{doGetChildren()}}). When 
{{refresh()}} was called on such a file, {{detach()}} skipped {{doDetach()}} 
because {{attached == false}}, silently preserving stale data. The fix added 
{{childMap = null}} to {{FtpFileObject.refresh()}} to ensure cached state is 
always cleared on refresh.

However, this turned the unnecessary internal refreshes into an O(N) 
amplification: when scanning a directory with N files via {{findFiles()}}, each 
child's {{getType()}} call triggers {{getParent()}}, which re-resolves the 
parent via {{resolveFile()}}. With ON_RESOLVE, this refreshes the parent, and 
the new unconditional {{childMap = null}} clears the parent's cached directory 
listing for every child -- forcing a new FTP LIST command each time. A 
directory with 50 files produces ~50 LIST commands instead of 1.

Before PR #758, the parent's {{childMap}} survived these unnecessary refreshes 
because {{refresh()}} only cleared {{childMap}} via {{doDetach()}}, which was 
guarded by {{if (attached)}}. After the first child's refresh set the parent to 
{{attached=false}}, subsequent refreshes left {{childMap}} intact. PR #758 
removed this accidental protection.

Even without PR #758, the pattern was wasteful. For example, resolving two 
sibling files and checking their existence would produce 2 LIST commands 
instead of 1, because the second file's internal {{getParent()}} call refreshes 
the parent (clearing its state via {{doDetach()}} when the parent is attached).

h3. Fix

The fix has two parts:

*1. Internal navigation skips ON_RESOLVE.* A new {{resolveFileInternal()}} 
method resolves files from the cache without applying the ON_RESOLVE refresh 
policy. All internal navigation call sites use this instead of the public 
{{resolveFile()}}:

* {{AbstractFileObject.getParent()}} -- parent resolution
* {{AbstractFileObject.resolveFile(FileName)}} -- child resolution in 
{{resolveFiles()}}/{{getChild()}}
* {{AbstractFileSystem.getRoot()}} -- root file object lookup
* {{FtpFileObject.getLinkDestination()}} -- symlink target resolution
* {{SftpFileObject.doListChildrenResolved()}} -- SFTP child resolution

A protected bridge method in {{AbstractFileObject}} makes 
{{resolveFileInternal()}} available to provider subclasses.

*2. Fresh directory listings propagate metadata to cached children in-place.* 
When a directory listing is performed after a refresh, FTP and SFTP providers 
update cached child objects directly with the fresh metadata from the listing, 
preserving object identity:

* *FTP*: {{FtpFileObject.getChildren()}} detects a fresh LIST (because 
{{childMap}} was null after refresh). After the LIST populates {{childMap}}, it 
iterates the returned children and sets each child's {{ftpFile}} to the 
matching entry from the fresh {{childMap}}, then clears the cached {{type}} via 
{{injectType(null)}} so it re-evaluates from the fresh metadata.
* *SFTP*: {{SftpFileObject.doListChildrenResolved()}} already sets fresh 
{{attrs}} on each child via {{setStat()}}. Now it also clears the cached 
{{type}} via {{injectType(null)}} so it re-evaluates from the updated {{attrs}}.

This approach was chosen over cache eviction (removing and recreating child 
objects) because eviction breaks object identity -- existing VFS tests assert 
that resolving the same path returns the same FileObject instance. In-place 
propagation updates the same objects with fresh data.

Other providers like SMB and local files don't need this -- their 
{{doGetType()}} resolves directly from the server/filesystem on every call, so 
after {{detach()}} clears the cached {{type}}, children naturally get fresh 
data.

Together, these changes establish a clear contract: *cached state is used until 
the user explicitly calls {{refresh()}} or resolves a file via the public API.* 
Internal navigation never triggers server operations. When a directory is 
refreshed, both the directory listing and its children are fresh.

h3. Tests

* {{FtpGetChildrenListCommandTest}}: verifies that {{findFiles()}} on an FTP 
directory with 50 files issues exactly 1 LIST command with 
{{CacheStrategy.ON_RESOLVE}}.
* {{SftpGetChildrenListCommandTest}}: verifies that refreshing an SFTP 
directory and calling {{findFiles()}} returns fresh children reflecting 
filesystem changes, proving in-place metadata propagation works for SFTP.

h3. Compatibility note

This changes the observable behavior of {{CacheStrategy.ON_RESOLVE}} for files 
reached through internal navigation. The following call sites no longer trigger 
ON_RESOLVE refresh:

* {{AbstractFileObject.getParent()}} -- resolving the parent directory
* {{AbstractFileObject.getChildren()}} / {{getChild()}} -- resolving children
* {{AbstractFileSystem.getRoot()}} -- resolving the root file object
* {{FtpFileObject.getLinkDestination()}} -- resolving symlink targets
* {{SftpFileObject.doListChildrenResolved()}} -- resolving children during SFTP 
listing

Previously, these methods called {{fileSystem.resolveFile()}}, which triggered 
an ON_RESOLVE refresh on each returned file. After this fix, internally 
navigated files are served from the cache and only refreshed when the user 
explicitly calls {{refresh()}} or re-resolves them via the public API.

When a directory is refreshed and its children are listed, the FTP and SFTP 
providers propagate fresh metadata from the listing to cached child objects 
in-place (updating {{ftpFile}}/{{attrs}} and clearing the cached {{type}}). 
Object identity is preserved -- the same FileObject instances are returned with 
updated data.

This behavior is consistent with {{CacheStrategy.MANUAL}} and aligns with the 
intended ON_RESOLVE contract: refresh on explicit user resolution, not on 
internal navigation.



> CacheStrategy.ON_RESOLVE triggers refresh on internal VFS navigation, causing 
> redundant FTP/SFTP operations
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: VFS-862
>                 URL: https://issues.apache.org/jira/browse/VFS-862
>             Project: Commons VFS
>          Issue Type: Bug
>            Reporter: Ilan Goldfeld
>            Priority: Major
>             Fix For: 2.11.0
>
>
> h2. Description
> h3. Steps to reproduce
> {code:java}
> DefaultFileSystemManager manager = new DefaultFileSystemManager();
> manager.addProvider("ftp", new FtpFileProvider());
> manager.setCacheStrategy(CacheStrategy.ON_RESOLVE);
> manager.init();
> {code}
> *Example 1 – pre-existing, before PR #758:*
> Checking existence of two sibling files should require 1 LIST on the parent, 
> but the second file's internal {{getParent()}} re-resolves the parent via 
> {{{}resolveFile(){}}}, triggering an ON_RESOLVE refresh that forces a 
> redundant LIST.
> {code:java}
> FileObject file1 = manager.resolveFile("ftp://user:pass@host/dir/file1.txt";, 
> options);
> FileObject file2 = manager.resolveFile("ftp://user:pass@host/dir/file2.txt";, 
> options);
> file1.exists();  // LIST on parent to determine type
> file2.exists();  // redundant LIST — getParent() refreshes parent via 
> ON_RESOLVE
> // Expected: 1 LIST
> // Actual:   2 LIST commands
> {code}
> *Example 2 – after PR #758, amplified to O(N):*
> Scanning a directory with N files now produces ~N LIST commands instead of 1.
> {code:java}
> FileObject dir = manager.resolveFile("ftp://user:pass@host/directory";, 
> options);
> dir.refresh();
> FileObject[] children = dir.findFiles(new FileDepthSelector(1, 1));
> // Expected: 1 LIST
> // Actual:   ~N LIST commands — each child's getParent() refreshes the parent,
> //           and the unconditional childMap = null in refresh() clears the
> //           cached listing every time
> {code}
> h3. The issue
> {{CacheStrategy.ON_RESOLVE}} is designed to refresh a file when the user 
> resolves it via the public API. However, internal VFS navigation methods also 
> call {{{}fileSystem.resolveFile(){}}}, which triggers the same ON_RESOLVE 
> refresh. This means that internal plumbing – not user actions – invalidates 
> cached state on files the user never asked to refresh.
> Affected internal call sites:
>  * {{AbstractFileObject.getParent()}} – resolves the parent directory
>  * {{AbstractFileObject.getChildren()}} – resolves each child via 
> {{resolveFiles()}}
>  * {{AbstractFileSystem.getRoot()}} – resolves the root file object (called 
> by {{getParent()}} for the root check)
>  * {{FtpFileObject.getLinkDestination()}} – resolves symlink targets
>  * {{SftpFileObject.doListChildrenResolved()}} – resolves children during 
> SFTP listing
> For example, resolving two sibling files and checking their existence should 
> require a single LIST on the parent directory. Instead, the second file's 
> internal {{getParent()}} call re-resolves the parent via 
> {{{}resolveFile(){}}}, triggering an ON_RESOLVE refresh that clears the 
> parent's cached directory listing, forcing a redundant LIST.
> h3. History
> This was always a wasteful pattern, but it became a severe performance 
> regression after PR #758 ("Fix refresh() not clearing cached state when file 
> was never attached").
> PR #758 addressed a real correctness issue: provider-specific cached fields 
> like {{FtpFileObject.childMap}} could be populated without going through 
> {{attach()}} (e.g. via {{getChildFile()}} -> {{{}doGetChildren(){}}}). When 
> {{refresh()}} was called on such a file, {{detach()}} skipped {{doDetach()}} 
> because {{{}attached == false{}}}, silently preserving stale data. The fix 
> added {{childMap = null}} to {{FtpFileObject.refresh()}} to ensure cached 
> state is always cleared on refresh.
> However, this turned the unnecessary internal refreshes into an O(N) 
> amplification: when scanning a directory with N files via 
> {{{}findFiles(){}}}, each child's {{getType()}} call triggers 
> {{{}getParent(){}}}, which re-resolves the parent via {{{}resolveFile(){}}}. 
> With ON_RESOLVE, this refreshes the parent, and the new unconditional 
> {{childMap = null}} clears the parent's cached directory listing for every 
> child – forcing a new FTP LIST command each time. A directory with 50 files 
> produces ~50 LIST commands instead of 1.
> Before PR #758, the parent's {{childMap}} survived these unnecessary 
> refreshes because {{refresh()}} only cleared {{childMap}} via 
> {{{}doDetach(){}}}, which was guarded by {{{}if (attached){}}}. After the 
> first child's refresh set the parent to {{{}attached=false{}}}, subsequent 
> refreshes left {{childMap}} intact. PR #758 removed this accidental 
> protection.
> Even without PR #758, the pattern was wasteful. For example, resolving two 
> sibling files and checking their existence would produce 2 LIST commands 
> instead of 1, because the second file's internal {{getParent()}} call 
> refreshes the parent (clearing its state via {{doDetach()}} when the parent 
> is attached).
> h3. Fix
> The fix has two parts:
> *1. Internal navigation skips ON_RESOLVE.* A new {{resolveFileInternal()}} 
> method resolves files from the cache without applying the ON_RESOLVE refresh 
> policy. All internal navigation call sites use this instead of the public 
> {{{}resolveFile(){}}}:
>  * {{AbstractFileObject.getParent()}} – parent resolution
>  * {{AbstractFileObject.resolveFile(FileName)}} – child resolution in 
> {{{}resolveFiles(){}}}/{{{}getChild(){}}}
>  * {{AbstractFileSystem.getRoot()}} – root file object lookup
>  * {{FtpFileObject.getLinkDestination()}} – symlink target resolution
>  * {{SftpFileObject.doListChildrenResolved()}} – SFTP child resolution
> A protected bridge method in {{AbstractFileObject}} makes 
> {{resolveFileInternal()}} available to provider subclasses.
> *2. Fresh directory listings propagate metadata to cached children in-place.* 
> When a directory listing is performed after a refresh, FTP and SFTP providers 
> update cached child objects directly with the fresh metadata from the 
> listing, preserving object identity:
>  * {*}FTP{*}: {{FtpFileObject.getChildren()}} detects a fresh LIST (because 
> {{childMap}} was null after refresh). After the LIST populates 
> {{{}childMap{}}}, it iterates the returned children and sets each child's 
> {{ftpFile}} to the matching entry from the fresh {{{}childMap{}}}, then 
> clears the cached {{type}} via {{injectType(null)}} so it re-evaluates from 
> the fresh metadata.
>  * {*}SFTP{*}: {{SftpFileObject.doListChildrenResolved()}} already sets fresh 
> {{attrs}} on each child via {{{}setStat(){}}}. Now it also clears the cached 
> {{type}} via {{injectType(null)}} so it re-evaluates from the updated 
> {{{}attrs{}}}.
> This approach was chosen over cache eviction (removing and recreating child 
> objects) because eviction breaks object identity – existing VFS tests assert 
> that resolving the same path returns the same FileObject instance. In-place 
> propagation updates the same objects with fresh data.
> Other providers like SMB and local files don't need this – their 
> {{doGetType()}} resolves directly from the server/filesystem on every call, 
> so after {{detach()}} clears the cached {{{}type{}}}, children naturally get 
> fresh data.
> Together, these changes establish a clear contract: *cached state is used 
> until the user explicitly calls {{refresh()}} or resolves a file via the 
> public API.* Internal navigation never triggers server operations. When a 
> directory is refreshed, both the directory listing and its children are fresh.
>  h3. Jackrabbit 1.x test fix
>   The {{resolveFileInternal}} change in {{getParent()}} exposed a 
> pre-existing bug in the Jackrabbit 1.x embedded test server 
> ({{WebdavProviderTest.testFileCreate}} failed with "expected size 0, got 
> 112").
>   *Root cause:* Jackrabbit 1.x bundles Jetty 6.x, which does not drain 
> unconsumed request bodies before sending error responses on persistent HTTP 
> connections. When a PROPFIND with a 112-byte XML request body
>   receives a 404, and the connection is reused for a subsequent PUT to the 
> same URL, the server reads the stale PROPFIND bytes as the PUT body — storing 
> the XML as file content instead of an empty file.
>   Previously this was masked: {{getParent()}} used {{resolveFile()}} which 
> triggered an ON_RESOLVE refresh on the parent directory, sending an extra 
> PROPFIND request that happened to flush the stale bytes from
>   the connection. With {{resolveFileInternal}}, the PUT follows the 404 
> immediately on the same connection, exposing the server bug.
>   This is a known class of issue in Jetty:
>   * [jetty#651|https://github.com/jetty/jetty.project/issues/651]
>   * [jetty#4117|https://github.com/jetty/jetty.project/issues/4117]
>   * [jetty#6168|https://github.com/jetty/jetty.project/issues/6168]
>   Jetty 9.4+ drains or closes connections with unconsumed content; Jetty 6.x 
> does not.
>   *Fix:* Add {{Connection: close}} to the Jackrabbit 1.x test {{HttpClient}} 
> configuration, disabling persistent connection reuse for the embedded test 
> server. The Jackrabbit 2 tests (newer Jetty + HttpClient
>   4.x) pass without any changes. No production VFS code is modified.
>   Note: Jackrabbit 1.x tests are already skipped on Java 14+ 
> ({{<jdk>[14,)</jdk>}} in pom.xml) since Jackrabbit 1.6.5 (2011) cannot run on 
> modern JVMs.
>  
> h3. Tests
>  * {{{}FtpGetChildrenListCommandTest{}}}: verifies that {{findFiles()}} on an 
> FTP directory with 50 files issues exactly 1 LIST command with 
> {{{}CacheStrategy.ON_RESOLVE{}}}.
>  * {{{}SftpGetChildrenListCommandTest{}}}: verifies that refreshing an SFTP 
> directory and calling {{findFiles()}} returns fresh children reflecting 
> filesystem changes, proving in-place metadata propagation works for SFTP.
> h3. Compatibility note
> This changes the observable behavior of {{CacheStrategy.ON_RESOLVE}} for 
> files reached through internal navigation. The following call sites no longer 
> trigger ON_RESOLVE refresh:
>  * {{AbstractFileObject.getParent()}} – resolving the parent directory
>  * {{AbstractFileObject.getChildren()}} / {{getChild()}} – resolving children
>  * {{AbstractFileSystem.getRoot()}} – resolving the root file object
>  * {{FtpFileObject.getLinkDestination()}} – resolving symlink targets
>  * {{SftpFileObject.doListChildrenResolved()}} – resolving children during 
> SFTP listing
> Previously, these methods called {{{}fileSystem.resolveFile(){}}}, which 
> triggered an ON_RESOLVE refresh on each returned file. After this fix, 
> internally navigated files are served from the cache and only refreshed when 
> the user explicitly calls {{refresh()}} or re-resolves them via the public 
> API.
> When a directory is refreshed and its children are listed, the FTP and SFTP 
> providers propagate fresh metadata from the listing to cached child objects 
> in-place (updating {{{}ftpFile{}}}/{{{}attrs{}}} and clearing the cached 
> {{{}type{}}}). Object identity is preserved – the same FileObject instances 
> are returned with updated data.
> This behavior is consistent with {{CacheStrategy.MANUAL}} and aligns with the 
> intended ON_RESOLVE contract: refresh on explicit user resolution, not on 
> internal navigation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to