looking forward to the new resource mapping strategy!

One use case that probably also should be considered here is the case where static pages are generated with sling. In that case the rendering of the page is not tied to a request.

Ruben

On 8/23/2020 2:24 AM, Carsten Ziegeler wrote:
Thanks for taking this up.

As noted in the issue, I think we should investigate whether this is a good opportunity to rethink mapping / resolving in general. In hindsight, we probably shouldn't have added the mapping part to the resource resolver. The resolve() method does two things: it applies mappings (which is independent of the current user) and it resolves to a resource (finding the resource and figuring out the selectors, extensions etc.) based on the user.

So how about moving the mapping part out of the resource resolver? As we've seen recently other parts of Sling like auth core could benefit from this as well.

I haven't fully thought this through, but this is the rough idea:

We create a new OSGi service, PathMapperFactory (its probably not a good name but I don't want to invest time into naming at this stage). It has one method:

   PathMapper newPathMapper(HttpServletRequest request)

where the request parameter is optional. PathMapper has the resolve/map methods doing the mapping (based on the newly suggested SPI). If a request has been provided to the factory method, that object is used for resolving/mapping. I'm unsure about the exact methods of PathMapper, but it needs a resolve method taking in a path (the request path) and returning the resolved path (vanity path, aliases etc are applied) and a flag whether a redirect should happen.

Sling's main servlet (or more precise the processor) is changed: for an incoming request it calls the PathMapperFactory with the request and then the resolve method on PathMapper. If the result is absolut or the redirect flag is set, the redirect happens.

Otherwise, authentication is invoked with the resolved path - this frees auth core (and all auth handlers etc) from dealing with mappings.

If authentication is successful, a new resource resolver is created and the PathMapper is passed to the resource resolver via the attributes of the resource resolver factory. ResourceResolver gets a new method "getPathMapper()" - if no PathMapper is passed to the resource resolver factory, the factory creates a new one by invoking newPathMapper(null) on the PathMapperFactory. In all cases, a resource resolver has a PathMapper.

We add a new method to Resource Resolver, similar to the resolve() method but just doing the resolution part - no mapping. This method is used by the main servlet instead of the existing resolve() method. We deprecate resolve() and map() and change the implementation to internally use the passed in PathMapper instance.

This way we have a clear separation between mapping and resolving and in which contexts the operations happen. With adding the PathMapper to the resource resolver we make it available to all parts in the system which need to call todays map() method - and as the PathMapper instance knows the request, there is no need to pass it around anymore.

Now, again, just a rough write down of the idea, don't get hung up on the naming that can and needs to improve. Its also not directly about the new SPI, but rather how we make the SPI available to the rest of the system.

Regards
Carsten

Am 14.08.2020 um 07:57 schrieb Georg Henzler:
Hi all,

so I eventually got around to continue on this. I created JIRA
issue [1] and the branches [2] in API and [3] in resourceresolver
with a first working version.


I definitely prefer immutable classes with a dedicated builder

@Konrad I was critical first because the verbosity of the code,
but it made implementing ResourceUriMappingChain actually easier.
My concern about the verbosity was not that much a problem with
Java 8 anymore.

...And I do believe that mapping is
about the relationship between requests and resources ...
and
- RequestMapper: it maps a "request" to a Resource OR it maps a
Resource to a form that can be requested.

@Julian: So these are very good points, I think RequestMapper is
not ideal because in async code (e.g. Sling jobs), you might
have the case that you have a resource and you map it to a URI
(e.g. for sending an email). in rr.map() you use no
request and usually some manual work is needed to add the host.
My idea for the future would be to provide a simple
ExampleRequest that can be used to provide a hostname or headers
which will then tell the active mappers to do the right thing for
the async, non-request use case. I ended up calling the
SPI interface itself ResourceToUriMapper (which I think is
best spot on for what it does).

So overall I have it all working now in the branches [2] and [3],
currently ResourceUriMappingChain is called from
ResourceResolverImpl.resolveInternal() and
ResourceMapperImpl.getAllMappings(). Other aspects like Sling
alias/vanity path handling, namespace mangling and context
paths are still handled there but could potentially be moved
to a ResourceToUriMapper in further steps. Also see [4]
for some examples on what ResourceToUriMapper can look like.

Let me know what you think!

-Georg


[1] https://issues.apache.org/jira/browse/SLING-9662
[2] https://github.com/apache/sling-org-apache-sling-api/tree/feature/SLING-9662-Introduce-Resource-Mapping-SPI [3] https://github.com/apache/sling-org-apache-sling-resourceresolver/tree/feature/SLING-9662-Introduce-Resource-Mapping-SPI [4] https://github.com/apache/sling-org-apache-sling-resourceresolver/tree/feature/SLING-9662-Introduce-Resource-Mapping-SPI/src/main/java/org/apache/sling/resourceresolver/impl/mappingchain/defaultmappers

On 2020-04-01 10:04, Georg Henzler wrote:
Hi Julian,

I'm all for disentangling the mapping implementations (/etc/map,
vanity, alias etc.) and making the mapping pluggable.

great - this is really the main purpose of this thread, to sense
if our community will support the move (because it'll be a bit of
work and for that I'd like to have an agreement on the direction
first)

Despite your suggestion to discuss naming later, Georg, I believe that
naming helps structure thoughts.

yes, we shouldn't be too strict :)

- RequestMapper: it maps a "request" to a Resource OR it maps a
Resource to a form that can be requested.

or maybe UriMapper - while many use cases have a request at hand, there
is always the default mapping that kicks in when null is passed as
request (mostly relevant for for async processes like e.g. sling jobs)

I would rather give the Sling instance an identity (e.g.
protocol, hostname and port)

This is an interesting idea! It wouldn't have to be protocol, hostname
and port, just the protocol would make it a valid uri already, the
internal representation could be slinguri:/path/to/resource.sel.html,
ResourceUri.toString() could just return /path/to/resource.sel.html.

-Georg

On 2020-04-01 00:28, Julian Sedding wrote:
Hi

I'm all for disentangling the mapping implementations (/etc/map,
vanity, alias etc.) and making the mapping pluggable.

Over the years I have come to think that the ResourceResolver methods
"resolve" and "map" are actually in the "wrong" interface, because the
handle a different concern than the rest of the RR. Both "resolve" and
"map" are concerned about the HttpServletRequest, which is none of the
RR's business. We cannot fix this due to backwards compatibility, but
maybe we can improve the current situation.

Despite your suggestion to discuss naming later, Georg, I believe that
naming helps structure thoughts. And I do believe that mapping is
about the relationship between requests and resources. Pretty much
everything in Sling is about Resources. Therefore, I suggest
reflecting the request aspect in the naming. This does not
fundamentally change what you are suggesting, and I fully agree on the
ordered chaining of mappers.

- RequestMapper: it maps a "request" to a Resource OR it maps a
Resource to a form that can be requested.
- RequestUri: an abstraction that can be serialized to the string
representation of a URI

I'm not sure we need an external URI and a local one variant of
RequestUri. I would rather give the Sling instance an identity (e.g.
protocol, hostname and port) and use that in conjunction with the
RequestUri in order to decide whether the string representation should
be absolute or relative (i.e. if protocol, hostname and port match, it
can be relative). Or maybe that identity is per HTTP request, I'm not
sure yet.

Just some (yet) unstructured thoughts. I hope I'm making sense.

Regards
Julian


On Tue, Mar 31, 2020 at 11:50 PM Georg Henzler <ghenz...@apache.org> wrote:

Hi Konrad,

yes, ResourceURI at the moment would be both for internal and external
links.

Immutable vs. Mutable is to be discussed (both ways have their own
advantages)

> ResourceUri resolve(ResourceURI, ...) (not useful for absolute/external
> URIs)
> SlingUri map(SlingUri, ...) (can be useful even to map an external URI)

Because it is meant to be a pipeline of OSGi services in a row of which
any
could make the decision to switch from a plain path to a full URI IMHO
we need
to pass an interface that covers both. This approach could mean (names
are
more descriptive than a real suggestion for now):

interface ResourceUri extends ArbitraryLink {...}
interface ResourcePath extends ArbitraryLink {...} #
RequestPathInfo+query+fragment

The ResourceMapping interface would then look symmetric again:

ArbitraryLink resolve(ArbitraryLink, ...)
ArbitraryLink map(ArbitraryLink, ...)

-Georg

On 2020-03-30 21:17, Konrad Windszus wrote:
> Hi Georg,
>
> Is the class
> https://github.com/apache/sling-org-apache-sling-api/blob/7dd23418280fcef234b42dbabf6e13e43ef2d4d0/src/main/java/org/apache/sling/api/resource/uri/ResourceURI.java > <https://github.com/apache/sling-org-apache-sling-api/blob/7dd23418280fcef234b42dbabf6e13e43ef2d4d0/src/main/java/org/apache/sling/api/resource/uri/ResourceURI.java> > supposed to be used for both internal as well as for external links? > I would rather come up with two separate classes (they don't have much
> in common).
>
> What about a generic SlingUri class with the two specializations
> ResourceUri and AbsoluteUri (or ExternalUri)?
>
> That you would lead the following signatures in
> https://github.com/apache/sling-org-apache-sling-api/blob/7dd23418280fcef234b42dbabf6e13e43ef2d4d0/src/main/java/org/apache/sling/api/resource/mapping/spi/ResourceMapping.java > <https://github.com/apache/sling-org-apache-sling-api/blob/7dd23418280fcef234b42dbabf6e13e43ef2d4d0/src/main/java/org/apache/sling/api/resource/mapping/spi/ResourceMapping.java>:
>

>
> I definitely prefer immutable classes with a dedicated builder but
> that is up for another discussion I guess,
>
> Konrad
>
>> On 30. Mar 2020, at 02:13, Georg Henzler <ghenz...@apache.org> wrote:
>>
>> Hi all,
>>
>> so to eventually follow up on what has been discussed at last year's >> adaptTo Hackathon and in the mailing list afterwards [1]/[2] - I now
>> found the time to draft the proposal for a Resource Mapping SPI.
>>
>> Two important points before we start:
>> * This is not meant to replace the Rewriting Pipelines as they exist >>  today - it just would allow some Sling setups (maybe > 70%?) to move >>  to a simpler solution avoiding the overhead of the rewriter pipeline
>>  while obviously losing its flexibility, but for many that will
>>  be ok (I fully agree with Bertrand [3] here)
>> * Let's focus the discussion first on if we would like to go into this >>  direction instead of details (like mutable vs. immutable or naming, >>  we can have detailed discussions around it once we decide to move on)
>>
>> But now to the actual proposal to introduce two new interfaces to the
>> Sling API:
>>
>> ResourceURI:
>>
>> On Sling API level we treat links as strings (e.g. rr.map() returns >> a String). Using Strings for links has produced many bugs in the past,
>> I think having an abstraction for a URI/link that is tailored for
>> Sling would help a lot (obviously plain JDK URI can be used, but
>> there is no support for selectors, suffix etc.).
>>
>> Besides being generally a good idea IMHO, ResourceURI shall be used
>> for the other interface:
>>
>> ResourceMapper:
>>
>> Allows to hook into the rr.resolve() and rr.map() by implementing
>> exactly those two methods that work on instances of ResourceURI. The
>> idea is that n OSGi service instances of ResourceMapper build a
>> pipeline of mappers that rr would run through when rr.resolve() or
>> rr.map() is called (in the opposite direction for map). The request
>> is passed in as context to both methods but may be null as it is
>> today, implementations of ResourceMapper need to be able to handle
>> null.
>>
>> The following mechanisms exist today for resource mapping in ootb
>> Sling:
>>
>> * Config "resource.resolver.mapping" in JcrResourceResolverFactoryImpl
>> * Mappings via /etc/map
>> * Vanity Paths
>> * sling:alias
>>
>> Those could be broken up into four implementations of ResourceMapper
>> that are configured by default.
>>
>> About backwards-compatibility/breaking existing implementations: So
>> this is a BIG CHANGE. However to keep it safe I would start with
>> exactly one ResourceMapper active that is backed by today's
>> implementation. The next step is to break it in separate resource
>> mappers (as mentioned above) and just put them in a row.
>>
>> The Resource Mapping SPI would bring the following benefits:
>>
>> * Being able to hook into the resource mapping via SPI allows
>>  for flexibility that is otherwise only possible Rewriting
>>  Pipelines - while link rewriting/checking is the only
>>  reason most projects use Rewriting Pipelines (with all its
>>  performance overhead)
>>
>> * Extending the mapping process via a well-defined API that
>>  allows to write Touring-complete code is better than working
>>  with Strings in the Transformer (also for many cases it can
>>  be cleaner and better 'unit-testable' than /etc/maps)
>>
>> * HTL there can be an option to automatically map all
>>  attributes of context uri (better performance, better for
>>  debugging if map() is called synchronously)
>>
>> * Introducing the interface ResourceURI (with a backing helper
>>  class for everybody to use) is useful in general
>>
>> See branch [4] for some first code of the proposal, in
>> particular ResourceMapping.java [5] and ResourceURI.java [6]
>>
>> -Georg
>>
>> [1] "[hackathon] externalization / resource mapping / rewriter"
>> https://www.mail-archive.com/dev@sling.apache.org/msg87736.html
>> [2] "Why get rid of the rewriter?"
>> https://www.mail-archive.com/dev@sling.apache.org/msg87867.html
>> [3] https://www.mail-archive.com/dev@sling.apache.org/msg87855.html
>>
>> [4]
>> https://github.com/apache/sling-org-apache-sling-api/compare/feature/Resource_Mapping_SPI
>> [5]
>> https://github.com/apache/sling-org-apache-sling-api/blob/7dd23418280fcef234b42dbabf6e13e43ef2d4d0/src/main/java/org/apache/sling/api/resource/mapping/spi/ResourceMapping.java
>> [6]
>> https://github.com/apache/sling-org-apache-sling-api/blob/7dd23418280fcef234b42dbabf6e13e43ef2d4d0/src/main/java/org/apache/sling/api/resource/uri/ResourceURI.java
>>
>>
>>
>> On 2020-01-17 10:45, Georg Henzler wrote:
>>> Hi Konrad,
>>> I think SLING-9005 is a good idea, but that is independent.
>>> My idea idea was to leave resourceResolver.map() unchanged (as there >>> is a lot of code using it out there, also it makes conceptually sense >>> as is). Rather I‘d like to provide an SPI that allows to customize
>>> the
>>> way of the behavior of resourceResolver.map() - I‘m thinking of a
>>> ranked list of ResourceMapper services (default one being today’s
>>> behavior). Robert also has done some work into this direction
>>> already,
>>> I‘d like to extend on that.
>>> I‘m currently on vacation but I have it on my list for when I‘m back.
>>> -Georg
>>>> On 16. Jan 2020, at 15:01, Konrad Windszus <konra...@gmx.de> wrote:
>>>> I would like to revive this.
>>>> @Georg: IIRC you wanted to come up with a proposal for a
>>>> externalizer API. Are you working on this?
>>>> Can we start by creating a JIRA ticket?
>>>> There has been a related ticket opened today:
>>>> https://issues.apache.org/jira/browse/SLING-9005
>>>> <https://issues.apache.org/jira/browse/SLING-9005>
>>>> Konrad
>>>>> On 5. Sep 2019, at 17:54, Jason E Bailey <jason.bai...@24601.org>
>>>>> wrote:
>>>>> Specifically with regard to the re-writer
>>>>> I'm working on a replacement for this which I'm currently calling
>>>>> the transformer ->
>>>>> https://github.com/apache/sling-whiteboard/tree/master/transformer >>>>> 1. It uses an HTML tokenizer that was added to the HTML utilities
>>>>> to generate a stream of  elements that can be modified and
>>>>> recombined. It's much faster then tagsoup or jsoup since it's not >>>>> trying to build a valid document. This also means that it supports
>>>>> anything that you write out in html. HTML4,5+
>>>>> 2. It uses services with an ordering priority and does pattern
>>>>> matching so that you can fine tune when the transformer is applied >>>>> 3. The specific use case I was working on is creating a CSP header >>>>> with a nonce or hash attribute to lock down the javascript that's
>>>>> on the page.
>>>>> It's currently working but it's not finished.
>>>>> Are there other problems with the rewriter that I haven't
>>>>> addressed?
>>>>> --
>>>>> Jason
>>>>>> On Thu, Sep 5, 2019, at 10:34 AM, Stefan Seifert wrote:
>>>>>> - resource mapping
>>>>>> - add a new SPI to define the mapping
>>>>>> - add a default impl that reads the mapping from /etc/map as it is
>>>>>> done today
>>>>>> - possible to override with service ranking
>>>>>> - but: there is already the ResourceMapper interface
>>>>>>  - introduced 1-2 years ago, use case was to get all existing
>>>>>> mappings
>>>>>>  - with this it's currently possible to replace mapping completely
>>>>>> with new logic
>>>>>> - maybe add a support to "contribute" additional mappings via a
>>>>>> service interface additional to this
>>>>>> - generic externalizer API
>>>>>> - naming not fixed yet, should not named "link" as this is too
>>>>>> specific. probably "URL".
>>>>>> - needs a java API for using, and an SPI for hooking in special
>>>>>> rules
>>>>>> - then add mappings for views in HTL*, JSP, model exporter/JSON >>>>>>  * probably starting with a sling-only HTL extension for try-out,
>>>>>> add it later to the spec when design is validated
>>>>>> - might be inspired by the basic features for wcm.io URL handler
>>>>>> [1]
>>>>>> - rewriter pipeline
>>>>>> - should be deprecated and no longer be used - especially not for
>>>>>> link externalization via postprocessing
>>>>>> - it may still be in use out there for more exotic use cases like
>>>>>> PDF
>>>>>> generation
>>>>>> - should probably removed from sling starter
>>>>>> stefan
>>>>>> [1] https://wcm.io/handler/url/

Reply via email to