[
https://issues.apache.org/jira/browse/VFS-862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ilan Goldfeld updated VFS-862:
------------------------------
Description:
h2. Description
h3. Steps to reproduce
{code:java}
DefaultFileSystemManager manager = new DefaultFileSystemManager();
manager.addProvider("ftp", new FtpFileProvider());
manager.setCacheStrategy(CacheStrategy.ON_RESOLVE);
manager.init();
{code}
*Example 1 – pre-existing, before PR #758:*
Checking existence of two sibling files should require 1 LIST on the parent,
but the second file's internal {{getParent()}} re-resolves the parent via
{{{}resolveFile(){}}}, triggering an ON_RESOLVE refresh that forces a redundant
LIST.
{code:java}
FileObject file1 = manager.resolveFile("ftp://user:pass@host/dir/file1.txt",
options);
FileObject file2 = manager.resolveFile("ftp://user:pass@host/dir/file2.txt",
options);
file1.exists(); // LIST on parent to determine type
file2.exists(); // redundant LIST — getParent() refreshes parent via ON_RESOLVE
// Expected: 1 LIST
// Actual: 2 LIST commands
{code}
*Example 2 – after PR #758, amplified to O(N):*
Scanning a directory with N files now produces ~N LIST commands instead of 1.
{code:java}
FileObject dir = manager.resolveFile("ftp://user:pass@host/directory", options);
dir.refresh();
FileObject[] children = dir.findFiles(new FileDepthSelector(1, 1));
// Expected: 1 LIST
// Actual: ~N LIST commands — each child's getParent() refreshes the parent,
// and the unconditional childMap = null in refresh() clears the
// cached listing every time
{code}
h3. The issue
{{CacheStrategy.ON_RESOLVE}} is designed to refresh a file when the user
resolves it via the public API. However, internal VFS navigation methods also
call {{{}fileSystem.resolveFile(){}}}, which triggers the same ON_RESOLVE
refresh. This means that internal plumbing – not user actions – invalidates
cached state on files the user never asked to refresh.
Affected internal call sites:
* {{AbstractFileObject.getParent()}} – resolves the parent directory
* {{AbstractFileObject.getChildren()}} – resolves each child via
{{resolveFiles()}}
* {{AbstractFileSystem.getRoot()}} – resolves the root file object (called by
{{getParent()}} for the root check)
* {{FtpFileObject.getLinkDestination()}} – resolves symlink targets
* {{SftpFileObject.doListChildrenResolved()}} – resolves children during SFTP
listing
For example, resolving two sibling files and checking their existence should
require a single LIST on the parent directory. Instead, the second file's
internal {{getParent()}} call re-resolves the parent via {{{}resolveFile(){}}},
triggering an ON_RESOLVE refresh that clears the parent's cached directory
listing, forcing a redundant LIST.
h3. History
This was always a wasteful pattern, but it became a severe performance
regression after PR #758 ("Fix refresh() not clearing cached state when file
was never attached").
PR #758 addressed a real correctness issue: provider-specific cached fields
like {{FtpFileObject.childMap}} could be populated without going through
{{attach()}} (e.g. via {{getChildFile()}} -> {{{}doGetChildren(){}}}). When
{{refresh()}} was called on such a file, {{detach()}} skipped {{doDetach()}}
because {{{}attached == false{}}}, silently preserving stale data. The fix
added {{childMap = null}} to {{FtpFileObject.refresh()}} to ensure cached state
is always cleared on refresh.
However, this turned the unnecessary internal refreshes into an O(N)
amplification: when scanning a directory with N files via {{{}findFiles(){}}},
each child's {{getType()}} call triggers {{{}getParent(){}}}, which re-resolves
the parent via {{{}resolveFile(){}}}. With ON_RESOLVE, this refreshes the
parent, and the new unconditional {{childMap = null}} clears the parent's
cached directory listing for every child – forcing a new FTP LIST command each
time. A directory with 50 files produces ~50 LIST commands instead of 1.
Before PR #758, the parent's {{childMap}} survived these unnecessary refreshes
because {{refresh()}} only cleared {{childMap}} via {{{}doDetach(){}}}, which
was guarded by {{{}if (attached){}}}. After the first child's refresh set the
parent to {{{}attached=false{}}}, subsequent refreshes left {{childMap}}
intact. PR #758 removed this accidental protection.
Even without PR #758, the pattern was wasteful. For example, resolving two
sibling files and checking their existence would produce 2 LIST commands
instead of 1, because the second file's internal {{getParent()}} call refreshes
the parent (clearing its state via {{doDetach()}} when the parent is attached).
h3. Fix
The fix has two parts:
*1. Internal navigation skips ON_RESOLVE.* A new {{resolveFileInternal()}}
method resolves files from the cache without applying the ON_RESOLVE refresh
policy. All internal navigation call sites use this instead of the public
{{{}resolveFile(){}}}:
* {{AbstractFileObject.getParent()}} – parent resolution
* {{AbstractFileObject.resolveFile(FileName)}} – child resolution in
{{{}resolveFiles(){}}}/{{{}getChild(){}}}
* {{AbstractFileSystem.getRoot()}} – root file object lookup
* {{FtpFileObject.getLinkDestination()}} – symlink target resolution
* {{SftpFileObject.doListChildrenResolved()}} – SFTP child resolution
A protected bridge method in {{AbstractFileObject}} makes
{{resolveFileInternal()}} available to provider subclasses.
*2. Fresh directory listings propagate metadata to cached children in-place.*
When a directory listing is performed after a refresh, FTP and SFTP providers
update cached child objects directly with the fresh metadata from the listing,
preserving object identity:
* {*}FTP{*}: {{FtpFileObject.getChildren()}} detects a fresh LIST (because
{{childMap}} was null after refresh). After the LIST populates
{{{}childMap{}}}, it iterates the returned children and sets each child's
{{ftpFile}} to the matching entry from the fresh {{{}childMap{}}}, then clears
the cached {{type}} via {{injectType(null)}} so it re-evaluates from the fresh
metadata.
* {*}SFTP{*}: {{SftpFileObject.doListChildrenResolved()}} already sets fresh
{{attrs}} on each child via {{{}setStat(){}}}. Now it also clears the cached
{{type}} via {{injectType(null)}} so it re-evaluates from the updated
{{{}attrs{}}}.
This approach was chosen over cache eviction (removing and recreating child
objects) because eviction breaks object identity – existing VFS tests assert
that resolving the same path returns the same FileObject instance. In-place
propagation updates the same objects with fresh data.
Other providers like SMB and local files don't need this – their
{{doGetType()}} resolves directly from the server/filesystem on every call, so
after {{detach()}} clears the cached {{{}type{}}}, children naturally get fresh
data.
Together, these changes establish a clear contract: *cached state is used until
the user explicitly calls {{refresh()}} or resolves a file via the public API.*
Internal navigation never triggers server operations. When a directory is
refreshed, both the directory listing and its children are fresh.
h3. Jackrabbit 1.x test fix
The {{resolveFileInternal}} change in {{getParent()}} exposed a pre-existing
bug in the Jackrabbit 1.x embedded test server
({{WebdavProviderTest.testFileCreate}} failed with "expected size 0, got 112").
*Root cause:* Jackrabbit 1.x bundles Jetty 6.x, which does not drain
unconsumed request bodies before sending error responses on persistent HTTP
connections. When a PROPFIND with a 112-byte XML request body
receives a 404, and the connection is reused for a subsequent PUT to the same
URL, the server reads the stale PROPFIND bytes as the PUT body — storing the
XML as file content instead of an empty file.
Previously this was masked: {{getParent()}} used {{resolveFile()}} which
triggered an ON_RESOLVE refresh on the parent directory, sending an extra
PROPFIND request that happened to flush the stale bytes from
the connection. With {{resolveFileInternal}}, the PUT follows the 404
immediately on the same connection, exposing the server bug.
This is a known class of issue in Jetty:
* [jetty#651|https://github.com/jetty/jetty.project/issues/651]
* [jetty#4117|https://github.com/jetty/jetty.project/issues/4117]
* [jetty#6168|https://github.com/jetty/jetty.project/issues/6168]
Jetty 9.4+ drains or closes connections with unconsumed content; Jetty 6.x
does not.
*Fix:* Add {{Connection: close}} to the Jackrabbit 1.x test {{HttpClient}}
configuration, disabling persistent connection reuse for the embedded test
server. The Jackrabbit 2 tests (newer Jetty + HttpClient
4.x) pass without any changes. No production VFS code is modified.
Note: Jackrabbit 1.x tests are already skipped on Java 14+
({{<jdk>[14,)</jdk>}} in pom.xml) since Jackrabbit 1.6.5 (2011) cannot run on
modern JVMs.
h3. Tests
* {{{}FtpGetChildrenListCommandTest{}}}: verifies that {{findFiles()}} on an
FTP directory with 50 files issues exactly 1 LIST command with
{{{}CacheStrategy.ON_RESOLVE{}}}.
* {{{}SftpGetChildrenListCommandTest{}}}: verifies that refreshing an SFTP
directory and calling {{findFiles()}} returns fresh children reflecting
filesystem changes, proving in-place metadata propagation works for SFTP.
h3. Compatibility note
This changes the observable behavior of {{CacheStrategy.ON_RESOLVE}} for files
reached through internal navigation. The following call sites no longer trigger
ON_RESOLVE refresh:
* {{AbstractFileObject.getParent()}} – resolving the parent directory
* {{AbstractFileObject.getChildren()}} / {{getChild()}} – resolving children
* {{AbstractFileSystem.getRoot()}} – resolving the root file object
* {{FtpFileObject.getLinkDestination()}} – resolving symlink targets
* {{SftpFileObject.doListChildrenResolved()}} – resolving children during SFTP
listing
Previously, these methods called {{{}fileSystem.resolveFile(){}}}, which
triggered an ON_RESOLVE refresh on each returned file. After this fix,
internally navigated files are served from the cache and only refreshed when
the user explicitly calls {{refresh()}} or re-resolves them via the public API.
When a directory is refreshed and its children are listed, the FTP and SFTP
providers propagate fresh metadata from the listing to cached child objects
in-place (updating {{{}ftpFile{}}}/{{{}attrs{}}} and clearing the cached
{{{}type{}}}). Object identity is preserved – the same FileObject instances are
returned with updated data.
This behavior is consistent with {{CacheStrategy.MANUAL}} and aligns with the
intended ON_RESOLVE contract: refresh on explicit user resolution, not on
internal navigation.
was:
h2. Description
h3. Steps to reproduce
{code:java}
DefaultFileSystemManager manager = new DefaultFileSystemManager();
manager.addProvider("ftp", new FtpFileProvider());
manager.setCacheStrategy(CacheStrategy.ON_RESOLVE);
manager.init();
{code}
*Example 1 -- pre-existing, before PR #758:*
Checking existence of two sibling files should require 1 LIST on the parent,
but the second file's internal {{getParent()}} re-resolves the parent via
{{resolveFile()}}, triggering an ON_RESOLVE refresh that forces a redundant
LIST.
{code:java}
FileObject file1 = manager.resolveFile("ftp://user:pass@host/dir/file1.txt",
options);
FileObject file2 = manager.resolveFile("ftp://user:pass@host/dir/file2.txt",
options);
file1.exists(); // LIST on parent to determine type
file2.exists(); // redundant LIST — getParent() refreshes parent via ON_RESOLVE
// Expected: 1 LIST
// Actual: 2 LIST commands
{code}
*Example 2 -- after PR #758, amplified to O(N):*
Scanning a directory with N files now produces ~N LIST commands instead of 1.
{code:java}
FileObject dir = manager.resolveFile("ftp://user:pass@host/directory", options);
dir.refresh();
FileObject[] children = dir.findFiles(new FileDepthSelector(1, 1));
// Expected: 1 LIST
// Actual: ~N LIST commands — each child's getParent() refreshes the parent,
// and the unconditional childMap = null in refresh() clears the
// cached listing every time
{code}
h3. The issue
{{CacheStrategy.ON_RESOLVE}} is designed to refresh a file when the user
resolves it via the public API. However, internal VFS navigation methods also
call {{fileSystem.resolveFile()}}, which triggers the same ON_RESOLVE refresh.
This means that internal plumbing -- not user actions -- invalidates cached
state on files the user never asked to refresh.
Affected internal call sites:
* {{AbstractFileObject.getParent()}} -- resolves the parent directory
* {{AbstractFileObject.getChildren()}} -- resolves each child via
{{resolveFiles()}}
* {{AbstractFileSystem.getRoot()}} -- resolves the root file object (called by
{{getParent()}} for the root check)
* {{FtpFileObject.getLinkDestination()}} -- resolves symlink targets
* {{SftpFileObject.doListChildrenResolved()}} -- resolves children during SFTP
listing
For example, resolving two sibling files and checking their existence should
require a single LIST on the parent directory. Instead, the second file's
internal {{getParent()}} call re-resolves the parent via {{resolveFile()}},
triggering an ON_RESOLVE refresh that clears the parent's cached directory
listing, forcing a redundant LIST.
h3. History
This was always a wasteful pattern, but it became a severe performance
regression after PR #758 ("Fix refresh() not clearing cached state when file
was never attached").
PR #758 addressed a real correctness issue: provider-specific cached fields
like {{FtpFileObject.childMap}} could be populated without going through
{{attach()}} (e.g. via {{getChildFile()}} -> {{doGetChildren()}}). When
{{refresh()}} was called on such a file, {{detach()}} skipped {{doDetach()}}
because {{attached == false}}, silently preserving stale data. The fix added
{{childMap = null}} to {{FtpFileObject.refresh()}} to ensure cached state is
always cleared on refresh.
However, this turned the unnecessary internal refreshes into an O(N)
amplification: when scanning a directory with N files via {{findFiles()}}, each
child's {{getType()}} call triggers {{getParent()}}, which re-resolves the
parent via {{resolveFile()}}. With ON_RESOLVE, this refreshes the parent, and
the new unconditional {{childMap = null}} clears the parent's cached directory
listing for every child -- forcing a new FTP LIST command each time. A
directory with 50 files produces ~50 LIST commands instead of 1.
Before PR #758, the parent's {{childMap}} survived these unnecessary refreshes
because {{refresh()}} only cleared {{childMap}} via {{doDetach()}}, which was
guarded by {{if (attached)}}. After the first child's refresh set the parent to
{{attached=false}}, subsequent refreshes left {{childMap}} intact. PR #758
removed this accidental protection.
Even without PR #758, the pattern was wasteful. For example, resolving two
sibling files and checking their existence would produce 2 LIST commands
instead of 1, because the second file's internal {{getParent()}} call refreshes
the parent (clearing its state via {{doDetach()}} when the parent is attached).
h3. Fix
The fix has two parts:
*1. Internal navigation skips ON_RESOLVE.* A new {{resolveFileInternal()}}
method resolves files from the cache without applying the ON_RESOLVE refresh
policy. All internal navigation call sites use this instead of the public
{{resolveFile()}}:
* {{AbstractFileObject.getParent()}} -- parent resolution
* {{AbstractFileObject.resolveFile(FileName)}} -- child resolution in
{{resolveFiles()}}/{{getChild()}}
* {{AbstractFileSystem.getRoot()}} -- root file object lookup
* {{FtpFileObject.getLinkDestination()}} -- symlink target resolution
* {{SftpFileObject.doListChildrenResolved()}} -- SFTP child resolution
A protected bridge method in {{AbstractFileObject}} makes
{{resolveFileInternal()}} available to provider subclasses.
*2. Fresh directory listings propagate metadata to cached children in-place.*
When a directory listing is performed after a refresh, FTP and SFTP providers
update cached child objects directly with the fresh metadata from the listing,
preserving object identity:
* *FTP*: {{FtpFileObject.getChildren()}} detects a fresh LIST (because
{{childMap}} was null after refresh). After the LIST populates {{childMap}}, it
iterates the returned children and sets each child's {{ftpFile}} to the
matching entry from the fresh {{childMap}}, then clears the cached {{type}} via
{{injectType(null)}} so it re-evaluates from the fresh metadata.
* *SFTP*: {{SftpFileObject.doListChildrenResolved()}} already sets fresh
{{attrs}} on each child via {{setStat()}}. Now it also clears the cached
{{type}} via {{injectType(null)}} so it re-evaluates from the updated {{attrs}}.
This approach was chosen over cache eviction (removing and recreating child
objects) because eviction breaks object identity -- existing VFS tests assert
that resolving the same path returns the same FileObject instance. In-place
propagation updates the same objects with fresh data.
Other providers like SMB and local files don't need this -- their
{{doGetType()}} resolves directly from the server/filesystem on every call, so
after {{detach()}} clears the cached {{type}}, children naturally get fresh
data.
Together, these changes establish a clear contract: *cached state is used until
the user explicitly calls {{refresh()}} or resolves a file via the public API.*
Internal navigation never triggers server operations. When a directory is
refreshed, both the directory listing and its children are fresh.
h3. Tests
* {{FtpGetChildrenListCommandTest}}: verifies that {{findFiles()}} on an FTP
directory with 50 files issues exactly 1 LIST command with
{{CacheStrategy.ON_RESOLVE}}.
* {{SftpGetChildrenListCommandTest}}: verifies that refreshing an SFTP
directory and calling {{findFiles()}} returns fresh children reflecting
filesystem changes, proving in-place metadata propagation works for SFTP.
h3. Compatibility note
This changes the observable behavior of {{CacheStrategy.ON_RESOLVE}} for files
reached through internal navigation. The following call sites no longer trigger
ON_RESOLVE refresh:
* {{AbstractFileObject.getParent()}} -- resolving the parent directory
* {{AbstractFileObject.getChildren()}} / {{getChild()}} -- resolving children
* {{AbstractFileSystem.getRoot()}} -- resolving the root file object
* {{FtpFileObject.getLinkDestination()}} -- resolving symlink targets
* {{SftpFileObject.doListChildrenResolved()}} -- resolving children during SFTP
listing
Previously, these methods called {{fileSystem.resolveFile()}}, which triggered
an ON_RESOLVE refresh on each returned file. After this fix, internally
navigated files are served from the cache and only refreshed when the user
explicitly calls {{refresh()}} or re-resolves them via the public API.
When a directory is refreshed and its children are listed, the FTP and SFTP
providers propagate fresh metadata from the listing to cached child objects
in-place (updating {{ftpFile}}/{{attrs}} and clearing the cached {{type}}).
Object identity is preserved -- the same FileObject instances are returned with
updated data.
This behavior is consistent with {{CacheStrategy.MANUAL}} and aligns with the
intended ON_RESOLVE contract: refresh on explicit user resolution, not on
internal navigation.
> CacheStrategy.ON_RESOLVE triggers refresh on internal VFS navigation, causing
> redundant FTP/SFTP operations
> -----------------------------------------------------------------------------------------------------------
>
> Key: VFS-862
> URL: https://issues.apache.org/jira/browse/VFS-862
> Project: Commons VFS
> Issue Type: Bug
> Reporter: Ilan Goldfeld
> Priority: Major
> Fix For: 2.11.0
>
>
> h2. Description
> h3. Steps to reproduce
> {code:java}
> DefaultFileSystemManager manager = new DefaultFileSystemManager();
> manager.addProvider("ftp", new FtpFileProvider());
> manager.setCacheStrategy(CacheStrategy.ON_RESOLVE);
> manager.init();
> {code}
> *Example 1 – pre-existing, before PR #758:*
> Checking existence of two sibling files should require 1 LIST on the parent,
> but the second file's internal {{getParent()}} re-resolves the parent via
> {{{}resolveFile(){}}}, triggering an ON_RESOLVE refresh that forces a
> redundant LIST.
> {code:java}
> FileObject file1 = manager.resolveFile("ftp://user:pass@host/dir/file1.txt",
> options);
> FileObject file2 = manager.resolveFile("ftp://user:pass@host/dir/file2.txt",
> options);
> file1.exists(); // LIST on parent to determine type
> file2.exists(); // redundant LIST — getParent() refreshes parent via
> ON_RESOLVE
> // Expected: 1 LIST
> // Actual: 2 LIST commands
> {code}
> *Example 2 – after PR #758, amplified to O(N):*
> Scanning a directory with N files now produces ~N LIST commands instead of 1.
> {code:java}
> FileObject dir = manager.resolveFile("ftp://user:pass@host/directory",
> options);
> dir.refresh();
> FileObject[] children = dir.findFiles(new FileDepthSelector(1, 1));
> // Expected: 1 LIST
> // Actual: ~N LIST commands — each child's getParent() refreshes the parent,
> // and the unconditional childMap = null in refresh() clears the
> // cached listing every time
> {code}
> h3. The issue
> {{CacheStrategy.ON_RESOLVE}} is designed to refresh a file when the user
> resolves it via the public API. However, internal VFS navigation methods also
> call {{{}fileSystem.resolveFile(){}}}, which triggers the same ON_RESOLVE
> refresh. This means that internal plumbing – not user actions – invalidates
> cached state on files the user never asked to refresh.
> Affected internal call sites:
> * {{AbstractFileObject.getParent()}} – resolves the parent directory
> * {{AbstractFileObject.getChildren()}} – resolves each child via
> {{resolveFiles()}}
> * {{AbstractFileSystem.getRoot()}} – resolves the root file object (called
> by {{getParent()}} for the root check)
> * {{FtpFileObject.getLinkDestination()}} – resolves symlink targets
> * {{SftpFileObject.doListChildrenResolved()}} – resolves children during
> SFTP listing
> For example, resolving two sibling files and checking their existence should
> require a single LIST on the parent directory. Instead, the second file's
> internal {{getParent()}} call re-resolves the parent via
> {{{}resolveFile(){}}}, triggering an ON_RESOLVE refresh that clears the
> parent's cached directory listing, forcing a redundant LIST.
> h3. History
> This was always a wasteful pattern, but it became a severe performance
> regression after PR #758 ("Fix refresh() not clearing cached state when file
> was never attached").
> PR #758 addressed a real correctness issue: provider-specific cached fields
> like {{FtpFileObject.childMap}} could be populated without going through
> {{attach()}} (e.g. via {{getChildFile()}} -> {{{}doGetChildren(){}}}). When
> {{refresh()}} was called on such a file, {{detach()}} skipped {{doDetach()}}
> because {{{}attached == false{}}}, silently preserving stale data. The fix
> added {{childMap = null}} to {{FtpFileObject.refresh()}} to ensure cached
> state is always cleared on refresh.
> However, this turned the unnecessary internal refreshes into an O(N)
> amplification: when scanning a directory with N files via
> {{{}findFiles(){}}}, each child's {{getType()}} call triggers
> {{{}getParent(){}}}, which re-resolves the parent via {{{}resolveFile(){}}}.
> With ON_RESOLVE, this refreshes the parent, and the new unconditional
> {{childMap = null}} clears the parent's cached directory listing for every
> child – forcing a new FTP LIST command each time. A directory with 50 files
> produces ~50 LIST commands instead of 1.
> Before PR #758, the parent's {{childMap}} survived these unnecessary
> refreshes because {{refresh()}} only cleared {{childMap}} via
> {{{}doDetach(){}}}, which was guarded by {{{}if (attached){}}}. After the
> first child's refresh set the parent to {{{}attached=false{}}}, subsequent
> refreshes left {{childMap}} intact. PR #758 removed this accidental
> protection.
> Even without PR #758, the pattern was wasteful. For example, resolving two
> sibling files and checking their existence would produce 2 LIST commands
> instead of 1, because the second file's internal {{getParent()}} call
> refreshes the parent (clearing its state via {{doDetach()}} when the parent
> is attached).
> h3. Fix
> The fix has two parts:
> *1. Internal navigation skips ON_RESOLVE.* A new {{resolveFileInternal()}}
> method resolves files from the cache without applying the ON_RESOLVE refresh
> policy. All internal navigation call sites use this instead of the public
> {{{}resolveFile(){}}}:
> * {{AbstractFileObject.getParent()}} – parent resolution
> * {{AbstractFileObject.resolveFile(FileName)}} – child resolution in
> {{{}resolveFiles(){}}}/{{{}getChild(){}}}
> * {{AbstractFileSystem.getRoot()}} – root file object lookup
> * {{FtpFileObject.getLinkDestination()}} – symlink target resolution
> * {{SftpFileObject.doListChildrenResolved()}} – SFTP child resolution
> A protected bridge method in {{AbstractFileObject}} makes
> {{resolveFileInternal()}} available to provider subclasses.
> *2. Fresh directory listings propagate metadata to cached children in-place.*
> When a directory listing is performed after a refresh, FTP and SFTP providers
> update cached child objects directly with the fresh metadata from the
> listing, preserving object identity:
> * {*}FTP{*}: {{FtpFileObject.getChildren()}} detects a fresh LIST (because
> {{childMap}} was null after refresh). After the LIST populates
> {{{}childMap{}}}, it iterates the returned children and sets each child's
> {{ftpFile}} to the matching entry from the fresh {{{}childMap{}}}, then
> clears the cached {{type}} via {{injectType(null)}} so it re-evaluates from
> the fresh metadata.
> * {*}SFTP{*}: {{SftpFileObject.doListChildrenResolved()}} already sets fresh
> {{attrs}} on each child via {{{}setStat(){}}}. Now it also clears the cached
> {{type}} via {{injectType(null)}} so it re-evaluates from the updated
> {{{}attrs{}}}.
> This approach was chosen over cache eviction (removing and recreating child
> objects) because eviction breaks object identity – existing VFS tests assert
> that resolving the same path returns the same FileObject instance. In-place
> propagation updates the same objects with fresh data.
> Other providers like SMB and local files don't need this – their
> {{doGetType()}} resolves directly from the server/filesystem on every call,
> so after {{detach()}} clears the cached {{{}type{}}}, children naturally get
> fresh data.
> Together, these changes establish a clear contract: *cached state is used
> until the user explicitly calls {{refresh()}} or resolves a file via the
> public API.* Internal navigation never triggers server operations. When a
> directory is refreshed, both the directory listing and its children are fresh.
> h3. Jackrabbit 1.x test fix
> The {{resolveFileInternal}} change in {{getParent()}} exposed a
> pre-existing bug in the Jackrabbit 1.x embedded test server
> ({{WebdavProviderTest.testFileCreate}} failed with "expected size 0, got
> 112").
> *Root cause:* Jackrabbit 1.x bundles Jetty 6.x, which does not drain
> unconsumed request bodies before sending error responses on persistent HTTP
> connections. When a PROPFIND with a 112-byte XML request body
> receives a 404, and the connection is reused for a subsequent PUT to the
> same URL, the server reads the stale PROPFIND bytes as the PUT body — storing
> the XML as file content instead of an empty file.
> Previously this was masked: {{getParent()}} used {{resolveFile()}} which
> triggered an ON_RESOLVE refresh on the parent directory, sending an extra
> PROPFIND request that happened to flush the stale bytes from
> the connection. With {{resolveFileInternal}}, the PUT follows the 404
> immediately on the same connection, exposing the server bug.
> This is a known class of issue in Jetty:
> * [jetty#651|https://github.com/jetty/jetty.project/issues/651]
> * [jetty#4117|https://github.com/jetty/jetty.project/issues/4117]
> * [jetty#6168|https://github.com/jetty/jetty.project/issues/6168]
> Jetty 9.4+ drains or closes connections with unconsumed content; Jetty 6.x
> does not.
> *Fix:* Add {{Connection: close}} to the Jackrabbit 1.x test {{HttpClient}}
> configuration, disabling persistent connection reuse for the embedded test
> server. The Jackrabbit 2 tests (newer Jetty + HttpClient
> 4.x) pass without any changes. No production VFS code is modified.
> Note: Jackrabbit 1.x tests are already skipped on Java 14+
> ({{<jdk>[14,)</jdk>}} in pom.xml) since Jackrabbit 1.6.5 (2011) cannot run on
> modern JVMs.
>
> h3. Tests
> * {{{}FtpGetChildrenListCommandTest{}}}: verifies that {{findFiles()}} on an
> FTP directory with 50 files issues exactly 1 LIST command with
> {{{}CacheStrategy.ON_RESOLVE{}}}.
> * {{{}SftpGetChildrenListCommandTest{}}}: verifies that refreshing an SFTP
> directory and calling {{findFiles()}} returns fresh children reflecting
> filesystem changes, proving in-place metadata propagation works for SFTP.
> h3. Compatibility note
> This changes the observable behavior of {{CacheStrategy.ON_RESOLVE}} for
> files reached through internal navigation. The following call sites no longer
> trigger ON_RESOLVE refresh:
> * {{AbstractFileObject.getParent()}} – resolving the parent directory
> * {{AbstractFileObject.getChildren()}} / {{getChild()}} – resolving children
> * {{AbstractFileSystem.getRoot()}} – resolving the root file object
> * {{FtpFileObject.getLinkDestination()}} – resolving symlink targets
> * {{SftpFileObject.doListChildrenResolved()}} – resolving children during
> SFTP listing
> Previously, these methods called {{{}fileSystem.resolveFile(){}}}, which
> triggered an ON_RESOLVE refresh on each returned file. After this fix,
> internally navigated files are served from the cache and only refreshed when
> the user explicitly calls {{refresh()}} or re-resolves them via the public
> API.
> When a directory is refreshed and its children are listed, the FTP and SFTP
> providers propagate fresh metadata from the listing to cached child objects
> in-place (updating {{{}ftpFile{}}}/{{{}attrs{}}} and clearing the cached
> {{{}type{}}}). Object identity is preserved – the same FileObject instances
> are returned with updated data.
> This behavior is consistent with {{CacheStrategy.MANUAL}} and aligns with the
> intended ON_RESOLVE contract: refresh on explicit user resolution, not on
> internal navigation.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)