[jira] [Updated] (SOLR-17351) Cosmetic changes to v2 filestore "get file" API

2025-02-14 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-17351:
--
Attachment: SOLR-17351.test-failures.tgz
Status: Open  (was: Open)

{quote}...the PR definitely touches ObjectReleaseTraacker stuff, so the failure 
makes sense even though I can't reproduce, ...
{quote}
I find that very troubling.

I'm attaching a TGZ with the full failure logs from re-running the 5 reproduce 
lines i mentioned above as of 2ab12431c04

AFAICT it looks like there are 2 distinct code paths that are problematic
{noformat}
   > 
org.eclipse.jetty.client.util.InputStreamResponseListener$Input:org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException:
 org.eclipse.jetty.client.util.InputStreamResponseListener$Input
   >at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:54)
   >at 
org.apache.solr.client.solrj.impl.Http2SolrClient$InputStreamReleaseTrackingResponseListener$ObjectReleaseTrackedInputStream.(Http2SolrClient.java:1213)
   >at 
org.apache.solr.client.solrj.impl.Http2SolrClient$InputStreamReleaseTrackingResponseListener.getInputStream(Http2SolrClient.java:1207)
   >at 
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:519)
   >at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:260)
   >at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:276)
   >at 
org.apache.solr.filestore.DistribFileStore.distribute(DistribFileStore.java:389)
{noformat}
{noformat}
   > 
org.eclipse.jetty.client.util.InputStreamResponseListener$Input:org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException:
 org.eclipse.jetty.client.util.InputStreamRe
sponseListener$Input
   >at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:54)
   >at 
org.apache.solr.client.solrj.impl.Http2SolrClient$InputStreamReleaseTrackingResponseListener$ObjectReleaseTrackedInputStream.(Http2SolrClient.java:1213)
   >at 
org.apache.solr.client.solrj.impl.Http2SolrClient$InputStreamReleaseTrackingResponseListener.getInputStream(Http2SolrClient.java:1207)
   >at 
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:519)
   >at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:260)
   >at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:276)
   >at 
org.apache.solr.filestore.DistribFileStore$FileInfo.fetchFileFromNodeAndPersist(DistribFileStore.java:192)
{noformat}
At a glance, i don't see an obvious explanation for the problem with 
{{DistribFileStore.distribute(DistribFileStore.java:389)}} – but i also didn't 
drill down into the impl of the request/response being used.

On the other hand: 
{{DistribFileStore$FileInfo.fetchFileFromNodeAndPersist(DistribFileStore.java:192)}}
 seems like a very cut & dry problem...
{code:java}
metadata =
Utils.newBytesConsumer((int) MAX_PKG_SIZE)
.accept(response.getResponseStreamIfSuccessful());
{code}
 * {{response.getResponseStreamIfSuccessful()}} returns the tracked stream 
(IIUC)
 * the consumer returned by {{Utils.newBytesConsumer}} does nothing to close 
that stream.

A suspicious code smell is that a few lines before this an {{InputStream is = 
null;}} is declared, and a few line after this is 
{{org.apache.solr.common.util.IOUtils.closeQuietly(is);}} ... but other then 
that {{is}} is dead code, never assigned. (but prior to 2ab12431c04 it was)

Also, FWIW, the one other line in your patch where you added a call to 
{{getResponseStreamIfSuccessful()}} you did so in a "try-with-resources" .. but 
here you did not.

> Cosmetic changes to v2 filestore "get file" API
> ---
>
> Key: SOLR-17351
> URL: https://issues.apache.org/jira/browse/SOLR-17351
> Project: Solr
>  Issue Type: Sub-task
>  Components: Package Manager, v2 API
>Affects Versions: 9.6.1
>Reporter: Jason Gerlowski
>Priority: Minor
>  Labels: pull-request-available
> Attachments: SOLR-17351.test-failures.tgz
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Solr's filestore APIs fit well with the REST-ful design we're targeting with 
> our v2 APIs, with one large exception: the "get file" API current available 
> at {{GET /api/node/files/somePath.txt}}.  This API stands out for a few 
> reasons:
> 1. It uses a different path-prefix than all other filestore APIs.  (i.e. 
> {{/api/node/files}} instead of {{/api/cluster/files}})
> 2. It exposes 4 or 5 conceptually distinct operations. Obviously in the 
> "default" case it allows callers to r

[jira] [Updated] (SOLR-17351) Cosmetic changes to v2 filestore "get file" API

2025-02-14 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-17351:
--
Status: Open  (was: Open)

> Cosmetic changes to v2 filestore "get file" API
> ---
>
> Key: SOLR-17351
> URL: https://issues.apache.org/jira/browse/SOLR-17351
> Project: Solr
>  Issue Type: Sub-task
>  Components: Package Manager, v2 API
>Affects Versions: 9.6.1
>Reporter: Jason Gerlowski
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Solr's filestore APIs fit well with the REST-ful design we're targeting with 
> our v2 APIs, with one large exception: the "get file" API current available 
> at {{GET /api/node/files/somePath.txt}}.  This API stands out for a few 
> reasons:
> 1. It uses a different path-prefix than all other filestore APIs.  (i.e. 
> {{/api/node/files}} instead of {{/api/cluster/files}})
> 2. It exposes 4 or 5 conceptually distinct operations. Obviously in the 
> "default" case it allows callers to retrieve filestore contents, but based on 
> query params it can instead:
>   - return filestore entry metadata (when {{meta=true}} is specified)
>   - instruct the receiving Solr node to pull a file from another node's 
> filestore and cache it locally (when {{getFrom=someOtherNode}} is specified)
>   - instruct the receiving Solr node to push its cached copy of a file out to 
> all other Solr nodes (when {{sync=true}} is specified)
> 3. Even in the default case of returning "raw" filestore contents, the API 
> can provide two different styles of response:
>   - if {{wt=json}} is specified Solr will take the filestore entry bytes, 
> attempt to stringify them, and then return a JSON object that uses this 
> string as the value for a "response" key.  It's unclear how this would work 
> for binary content 
>   - for all other values of "wt", the API will return the raw file content.
> We should reconsider this endpoint and see if it can't be massaged into being 
> more in line with our other v2 APIs.  Some cosmetic tweaks will go a long 
> way, but the biggest benefit is likely to come from breaking the endpoint up 
> into multiple distinct APIs.  In its current form, the API returns such a 
> variety of responses that it's hard to write client code for.  (I suspect 
> this is the main reason these "filestore" APIs were never made available in 
> SolrJ.)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Updated] (SOLR-17351) Cosmetic changes to v2 filestore "get file" API

2025-01-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SOLR-17351:
--
Labels: pull-request-available  (was: )

> Cosmetic changes to v2 filestore "get file" API
> ---
>
> Key: SOLR-17351
> URL: https://issues.apache.org/jira/browse/SOLR-17351
> Project: Solr
>  Issue Type: Sub-task
>  Components: Package Manager, v2 API
>Affects Versions: 9.6.1
>Reporter: Jason Gerlowski
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Solr's filestore APIs fit well with the REST-ful design we're targeting with 
> our v2 APIs, with one large exception: the "get file" API current available 
> at {{GET /api/node/files/somePath.txt}}.  This API stands out for a few 
> reasons:
> 1. It uses a different path-prefix than all other filestore APIs.  (i.e. 
> {{/api/node/files}} instead of {{/api/cluster/files}})
> 2. It exposes 4 or 5 conceptually distinct operations. Obviously in the 
> "default" case it allows callers to retrieve filestore contents, but based on 
> query params it can instead:
>   - return filestore entry metadata (when {{meta=true}} is specified)
>   - instruct the receiving Solr node to pull a file from another node's 
> filestore and cache it locally (when {{getFrom=someOtherNode}} is specified)
>   - instruct the receiving Solr node to push its cached copy of a file out to 
> all other Solr nodes (when {{sync=true}} is specified)
> 3. Even in the default case of returning "raw" filestore contents, the API 
> can provide two different styles of response:
>   - if {{wt=json}} is specified Solr will take the filestore entry bytes, 
> attempt to stringify them, and then return a JSON object that uses this 
> string as the value for a "response" key.  It's unclear how this would work 
> for binary content 
>   - for all other values of "wt", the API will return the raw file content.
> We should reconsider this endpoint and see if it can't be massaged into being 
> more in line with our other v2 APIs.  Some cosmetic tweaks will go a long 
> way, but the biggest benefit is likely to come from breaking the endpoint up 
> into multiple distinct APIs.  In its current form, the API returns such a 
> variety of responses that it's hard to write client code for.  (I suspect 
> this is the main reason these "filestore" APIs were never made available in 
> SolrJ.)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Updated] (SOLR-17351) Cosmetic changes to v2 filestore "get file" API

2025-01-04 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski updated SOLR-17351:
---
Description: 
Solr's filestore APIs fit well with the REST-ful design we're targeting with 
our v2 APIs, with one large exception: the "get file" API current available at 
{{GET /api/node/files/somePath.txt}}.  This API stands out for a few reasons:

1. It uses a different path-prefix than all other filestore APIs.  (i.e. 
{{/api/node/files}} instead of {{/api/cluster/files}})
2. It exposes 4 or 5 conceptually distinct operations. Obviously in the 
"default" case it allows callers to retrieve filestore contents, but based on 
query params it can instead:
  - return filestore entry metadata (when {{meta=true}} is specified)
  - instruct the receiving Solr node to pull a file from another node's 
filestore and cache it locally (when {{getFrom=someOtherNode}} is specified)
  - instruct the receiving Solr node to push its cached copy of a file out to 
all other Solr nodes (when {{sync=true}} is specified)

3. Even in the default case of returning "raw" filestore contents, the API can 
provide two different styles of response:
  - if {{wt=json}} is specified Solr will take the filestore entry bytes, 
attempt to stringify them, and then return a JSON object that uses this string 
as the value for a "response" key.  It's unclear how this would work for binary 
content 
  - for all other values of "wt", the API will return the raw file content.

We should reconsider this endpoint and see if it can't be massaged into being 
more in line with our other v2 APIs.  Some cosmetic tweaks will go a long way, 
but the biggest benefit is likely to come from breaking the endpoint up into 
multiple distinct APIs.  In its current form, the API returns such a variety of 
responses that it's hard to write client code for.  (I suspect this is the main 
reason these "filestore" APIs were never made available in SolrJ.)

  was:
Solr's filestore APIs fit well with the REST-ful design we're targeting with 
our v2 APIs, with one large exception: the "get file" API current available at 
{{GET /api/node/files/somePath.txt}}.  This API stands out for a few reasons:

1. It uses a different path-prefix than all other filestore APIs.  (i.e. 
{{/api/node/files}} instead of {{/api/cluster/files}})
2. It exposes 4 or 5 conceptually distinct operations. Obviously in the 
"default" case it allows callers to retrieve filestore contents, but based on 
query params it can instead:
  - return filestore entry metadata (when {{meta=true}} is specified)
  - instruct the receiving Solr node to pull a file from another node's 
filestore and cache it locally (when {{getFrom=someOtherNode}} is specified)
  - instruct the receiving Solr node to push its cached copy of a file out to 
all other Solr nodes (when {{sync=true}} is specified)
3. Even in the default case of returning "raw" filestore contents, the API can 
provide two different styles of response:
  - if {{wt=json}} is specified Solr will take the filestore entry bytes, 
attempt to stringify them, and then return a JSON object that uses this string 
as the value for a "response" key.  It's unclear how this would work for binary 
content 
  - for all other values of "wt", the API will return the raw file content.

We should reconsider this endpoint and see if it can't be massaged into being 
more in line with our other v2 APIs.  Some cosmetic tweaks will go a long way, 
but the biggest benefit is likely to come from breaking the endpoint up into 
multiple distinct APIs.  In its current form, the API returns such a variety of 
responses that it's hard to write client code for.  (I suspect this is the main 
reason these "filestore" APIs were never made available in SolrJ.)


> Cosmetic changes to v2 filestore "get file" API
> ---
>
> Key: SOLR-17351
> URL: https://issues.apache.org/jira/browse/SOLR-17351
> Project: Solr
>  Issue Type: Sub-task
>  Components: Package Manager, v2 API
>Affects Versions: 9.6.1
>Reporter: Jason Gerlowski
>Priority: Minor
>
> Solr's filestore APIs fit well with the REST-ful design we're targeting with 
> our v2 APIs, with one large exception: the "get file" API current available 
> at {{GET /api/node/files/somePath.txt}}.  This API stands out for a few 
> reasons:
> 1. It uses a different path-prefix than all other filestore APIs.  (i.e. 
> {{/api/node/files}} instead of {{/api/cluster/files}})
> 2. It exposes 4 or 5 conceptually distinct operations. Obviously in the 
> "default" case it allows callers to retrieve filestore contents, but based on 
> query params it can instead:
>   - return filestore entry metadata (when {{meta=true}} is specified)
>   - instruct the receiving Solr node to pull a file from another node's 
> filestore and cache