[jira] [Comment Edited] (SLING-10118) Fully cache the GraphQLSchema objects

Bertrand Delacretaz (Jira) Fri, 23 Apr 2021 07:24:04 -0700


    [ 
https://issues.apache.org/jira/browse/SLING-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330813#comment-17330813
 ]


Bertrand Delacretaz edited comment on SLING-10118 at 4/23/21, 2:23 PM:
-----------------------------------------------------------------------

Thank you for the clarification, I understand the "wired" part now. I have 
slightly refactored the schema generation as there was some code duplication, 
in [commit 
9abddb77|https://github.com/apache/sling-org-apache-sling-graphql-core/commit/9abddb77904a079a5c3c58de0901ac576fb6b53f].

As you say the {{GraphQLSchema}} object is created in several steps:
{code:java}
String schemaSdl = prepareSchemaDefinition(schemaProvider, queryResource, 
selectors);
TypeDefinitionRegistry typeDefinitionRegistry = 
getTypeDefinitionRegistry(schemaSdl, queryResource, selectors);
schema = buildSchema(typeDefinitionRegistry, queryResource);
{code}
And currently only the result of {{getTypeDefinitionRegistry}} is cached. 
Caching the {{GraphQLSchema}} objects should be more efficient ([~tyge] do you 
have metrics that demonstrate that?).
h2. How about caching based on a "vary" schema annotation ?

The problem for caching is that the generated schema can potentially depend on 
many things:
 * 1) The Resource to which the query applies, including potentially all its 
attributes
 * 2) The Request's selectors
 * 3) The current set of OSGi services and Sling scripts or servlets, which 
might change at runtime.

However, in practice and especially on production systems it can be safe to 
assume that 3) never changes after startup - which might be expressed by a 
configuration of the GraphQL Core module ("assume stable OSGi services and 
scripts").

To allow whatever generates the schema to indicate what can cause it to change, 
we might use a schema comment similar to the HTTP Vary header:
{code:java}
# sling-schema-vary: resource-type, selectors
{code}
This example value indicates that the generated schema depends only on the 
current resource type and on the request selectors, which is probably true in 
the vast majority of cases.

So I think the below rules would allow us to implement caching of the 
{{GraphQLSchema}} in a simple way, using the resource type + selectors as the 
key:
 * To activate the cache, the "assume stable OSGi services and scripts" option 
must be set. That might actually be a cache lifetime value, where zero disables 
the cache and N meaning "assume stability for N minutes". That N value can then 
define the cache entries time to live.
 * For a specific schema to be cached, it must include the 
{{sling-schema-vary}} comment (or directive, TBD)
 * We initially recognize only {{resource-type}} and {{selectors}} in that 
"vary" value and build the cache keys accordingly.

I think this allows for fully caching the schema generation steps up to the 
creation of the {{GraphQLSchema}} object, with constraints that should be 
acceptable.

WDYT?


was (Author: bdelacretaz):
Thank you for the clarification, I understand the "wired" part now. I have 
slightly refactored the schema generation as there was some code duplication, 
in [commit 
9abddb77|https://github.com/apache/sling-org-apache-sling-graphql-core/commit/9abddb77904a079a5c3c58de0901ac576fb6b53f].

As you say the {{GraphQLSchema}} object is created in several steps:

{code}
String schemaSdl = prepareSchemaDefinition(schemaProvider, queryResource, 
selectors);
TypeDefinitionRegistry typeDefinitionRegistry = 
getTypeDefinitionRegistry(schemaSdl, queryResource, selectors);
schema = buildSchema(typeDefinitionRegistry, queryResource);
{code}

And currently only the result of {{getTypeDefinitionRegistry}} is cached. 
Caching the {{GraphQLSchema}} objects should be more efficient ([~tyge] do you 
have metrics that demonstrate that?).

The problem for caching is that the generated schema can potentially depend on 
many things: 

* 1) The Resource to which the query applies, including potentially all its 
attributes
* 2) The Request's selectors
* 3) The current set of OSGi services and Sling scripts or servlets, which 
might change at runtime.

However, in practice and especially on production systems it can be safe to 
assume that 3) never changes after startup - which might be expressed by a 
configuration of the GraphQL Core module ("assume stable OSGi services and 
scripts").

To allow whatever generates the schema to indicate what can cause it to change, 
we might use a schema comment similar to the HTTP Vary header:

{code}
# sling-schema-vary: resource-type, selectors
{code}

This example value indicates that the generated schema depends only on the 
current resource type and on the request selectors, which is probably true in 
the vast majority of cases.

So I think the below rules would allow us to implement caching of the 
{{GraphQLSchema}} in a simple way, using the resource type + selectors as the 
key:

* To activate the cache, the "assume stable OSGi services and scripts" option 
must be set. That might actually be a cache lifetime value, where zero disables 
the cache and N meaning "assume stability for N minutes". That N value can then 
define the cache entries time to live.
* For a specific schema to be cached, it must include the {{sling-schema-vary}} 
comment (or directive, TBD)
* We initially recognize only {{resource-type}} and {{selectors}} in that 
"vary" value and build the cache keys accordingly.

I think this allows for fully caching the schema generation steps up to the 
creation of the {{GraphQLSchema}} object, with constraints that should be 
acceptable.

WDYT?

> Fully cache the GraphQLSchema objects
> -------------------------------------
>
>                 Key: SLING-10118
>                 URL: https://issues.apache.org/jira/browse/SLING-10118
>             Project: Sling
>          Issue Type: Bug
>          Components: GraphQL
>    Affects Versions: GraphQL Core 0.0.10
>            Reporter: Thierry Ygé
>            Priority: Major
>
> Currently in SLING-10085 the GraphQLSchema couldn't be cached due to wired 
> resource.
> Thus it still need half the time to spend on building the schema, while 
> generally the resource is only used to be passed later to the fetcher. As 
> that resource then change, I suggested to wrap it via a proxy Resource 
> implementation that would be passed instead of the real resource.
> That proxy will then use a map  to lookup for the current resource used at 
> execute/validate time. The key for the map is to use the current thread.
> I will submit a PR with the solution I suggest to use.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Comment Edited] (SLING-10118) Fully cache the GraphQLSchema objects

Reply via email to