Re: [Geotools-devel] [Geoserver-devel] GSIP 218: Control remote HTTP requests sent by GeoTools \ GeoServer

2023-03-23 Thread Roar Brænden
Hi,

I understand that this suggestion is more involved in GeoServer codebase than 
GeoTools.
And I also understand you want to keep a narrow scope.

But it brings on some interesting thoughts. What about MonitoringHTTPClient?

Hilsen Roar




> 23. mar. 2023 kl. 10:50 skrev Andrea Aime :
> 
> HI Roar,
> for the first round of implementation, I have no place that's actually using 
> the GeoTools HTTPClient interface.
> Given this would be a wrapper, it does not seem much of a HTTPBehavior, but 
> more of a hint, where the hint
> value would be the URLChecker to use.
> 
> Generally speaking, I'd expect it to  be handled in a similar way to the 
> LoggingHTTPClient.
> 
> Cheers
> Andrea
> 
> 
> On Wed, Mar 22, 2023 at 8:49 PM Roar Brænden  > wrote:
> Hi,
> 
> This looks like something I've been thinking about. Would love to implement 
> such a solution. Too bad I'm not in a position to do so.
> 
> Could that blocking, you wanted Jody, be handled by throwing an exception?
> 
> Should this involve an addition to the HTTPClient interface as well? How to 
> react on a failure to evaluate the url. Should it log a message or throw an 
> exception? I suppose that is the thinking behind that SecureHTTPClient.
> 
> Often it's hard to know exactly where a URL is specified. Especially if we 
> include the XML Schema locations. Since it will be the HTTPClient that will 
> log a warning or throw an exception. How could that object get enough 
> information to give the user good feedback on which store / layer / process 
> the admin needs to change? I suppose that varies from case to case and would 
> be for a later version.
> 
> Best regards,
> Roar Brænden
> 
> 
> 
>> 22. mar. 2023 kl. 19:00 skrev Jody Garnett > >:
>> 
>> Idea (feel free to indicate if it is out of scope).
>> 
>> Environmental variables were introduced to control access for entity 
>> resolution:
>> - It may be possible to replace these with the new URLChecker and simplify 
>> the application. Or;
>> - show them as a URLChecker that cannot be disabled in the user interface 
>> (to make it clear they are in play)
>> 
>> Reference: 
>> https://docs.geoserver.org/2.19.x/en/user/production/config.html#production-config-external-entities
>>  
>> 
>> --
>> Jody Garnett
>> 
>> 
>> On Wed, Mar 22, 2023 at 10:44 AM Andrea Aime 
>> > > wrote:
>> Yep, makes sense, proposal updated.
>> 
>> Cheers
>> Andrea
>> 
>> On Wed, Mar 22, 2023 at 6:31 PM Jody Garnett > > wrote:
>> Indeed if you are just intended to back from a regex; then rephrase the 
>> javadoc or make the method name more clear than "evaluate": 
>> 
>> /**
>>  * Provide implementation to evaluate location/URL/URI passed in string 
>> form
>>  *
>>  * @param location the subject of evaluation
>>  * @return true if the location is accepted, false otherwise
>>  */
>> boolean evaluate(String location);
>> 
>> "otherwise" above indicates the location would not be accepted.
>> 
>> To clarify intent:
>> 
>> /**
>>  * Used to confirm location is allowed for use. 
>> *
>>  * URLChecker is used to confirm if a location is allowed for use, 
>> returning {@true} when they recognize a location as permitted.
>>  * Several URLChecker instances are expected to be available, as long as 
>> one URLChecker can confirm a location it is permitted for use.
>>  * 
>>  * @param location Location expressed as URL, URI or path.
>>  * @return {@code true} indicates the URLChecker can confirm the 
>> location is allowed for use, {@code false} indicates the URLChecker is 
>> unable to confirm.
>>  */
>> boolean confirm(String location);
>> 
>> --
>> Jody Garnett
>> 
>> 
>> On Wed, Mar 22, 2023 at 10:07 AM Andrea Aime 
>> > > wrote:
>> Hi Jody,
>> while the suggestion seems to clarify things, it seems to me it's making the 
>> implementation harder.
>> 
>> With a regular expression based system, how do you distinguish BLOCK and 
>> NO_OPINION (imagine we'd have different implementations, one based on 
>> regexes for user configured sites, and another one for the well known schema 
>> sites, such as schemas.opengis.org  and xml.org 
>> , or a dynamic one allowing a store to declare that the 
>> server it's talking to is safe).
>> 
>> The idea here is that the URL is now allowed, unless explicitly approved. 
>> All that we're looking for is a "yes".
>> The problem with the other state, is that it's really just "not yes", 
>> without any extra useful semantic attached to it.
>> 
>> Having a state like "BLOCK" would imply the implementation is based on a 
>> black list instead (anything but not this one).
>> Do you have a use case for it?
>> 
>> 

Re: [Geotools-devel] [Geoserver-devel] GSIP 218: Control remote HTTP requests sent by GeoTools \ GeoServer

2023-03-23 Thread Andrea Aime
HI Roar,
for the first round of implementation, I have no place that's actually
using the GeoTools HTTPClient interface.
Given this would be a wrapper, it does not seem much of a HTTPBehavior, but
more of a hint, where the hint
value would be the URLChecker to use.

Generally speaking, I'd expect it to  be handled in a similar way to
the LoggingHTTPClient.

Cheers
Andrea


On Wed, Mar 22, 2023 at 8:49 PM Roar Brænden 
wrote:

> Hi,
>
> This looks like something I've been thinking about. Would love to
> implement such a solution. Too bad I'm not in a position to do so.
>
> Could that blocking, you wanted Jody, be handled by throwing an exception?
>
> Should this involve an addition to the HTTPClient interface as well? How
> to react on a failure to evaluate the url. Should it log a message or throw
> an exception? I suppose that is the thinking behind that SecureHTTPClient.
>
> Often it's hard to know exactly where a URL is specified. Especially if we
> include the XML Schema locations. Since it will be the HTTPClient that will
> log a warning or throw an exception. How could that object get enough
> information to give the user good feedback on which store / layer / process
> the admin needs to change? I suppose that varies from case to case and
> would be for a later version.
>
> Best regards,
> Roar Brænden
>
>
>
> 22. mar. 2023 kl. 19:00 skrev Jody Garnett :
>
> Idea (feel free to indicate if it is out of scope).
>
> Environmental variables were introduced to control access for entity
> resolution:
> - It may be possible to replace these with the new URLChecker and simplify
> the application. Or;
> - show them as a URLChecker that cannot be disabled in the user interface
> (to make it clear they are in play)
>
> Reference:
> https://docs.geoserver.org/2.19.x/en/user/production/config.html#production-config-external-entities
> --
> Jody Garnett
>
>
> On Wed, Mar 22, 2023 at 10:44 AM Andrea Aime <
> andrea.a...@geosolutionsgroup.com> wrote:
>
>> Yep, makes sense, proposal updated.
>>
>> Cheers
>> Andrea
>>
>> On Wed, Mar 22, 2023 at 6:31 PM Jody Garnett 
>> wrote:
>>
>>> Indeed if you are just intended to back from a regex; then rephrase the
>>> javadoc or make the method name more clear than "evaluate":
>>>
>>> /**
>>>  * Provide implementation to evaluate location/URL/URI passed in
>>> string form
>>>  *
>>>  * @param location the subject of evaluation
>>>  * @return true if the location is accepted, false otherwise
>>>  */
>>> boolean evaluate(String location);
>>>
>>> "otherwise" above indicates the location would not be accepted.
>>>
>>> To clarify intent:
>>>
>>> /**
>>>  * Used to confirm location is allowed for use.
>>> *
>>>  * URLChecker is used to confirm if a location is allowed for use,
>>> returning {@true} when they recognize a location as permitted.
>>>  * Several URLChecker instances are expected to be available, as
>>> long as one URLChecker can confirm a location it is permitted for use.
>>>  *
>>>  * @param location Location expressed as URL, URI or path.
>>>  * @return {@code true} indicates the URLChecker can confirm the
>>> location is allowed for use, {@code false} indicates the URLChecker is
>>> unable to confirm.
>>>  */
>>> boolean confirm(String location);
>>>
>>> --
>>> Jody Garnett
>>>
>>>
>>> On Wed, Mar 22, 2023 at 10:07 AM Andrea Aime <
>>> andrea.a...@geosolutionsgroup.com> wrote:
>>>
 Hi Jody,
 while the suggestion seems to clarify things, it seems to me it's
 making the implementation harder.

 With a regular expression based system, how do you distinguish BLOCK
 and NO_OPINION (imagine we'd have different implementations, one based on
 regexes for user configured sites, and another one for the well known
 schema sites, such as schemas.opengis.org and xml.org, or a dynamic
 one allowing a store to declare that the server it's talking to is safe).

 The idea here is that the URL is now allowed, unless
 explicitly approved. All that we're looking for is a "yes".
 The problem with the other state, is that it's really just "not yes",
 without any extra useful semantic attached to it.

 Having a state like "BLOCK" would imply the implementation is based on
 a black list instead (anything but not this one).
 Do you have a use case for it?

 Cheers
 Andrea



 On Wed, Mar 22, 2023 at 5:45 PM Jody Garnett 
 wrote:

> The URL checker has a yes/no response - but is written as a yes/don’t
> care - since to access only one URL checker needs to say yes.
>
> To address feedback:
> - Adjust javadoc, or
> - Provide three states: ALLOW, BLOCK, NO_OPINION
>
> My preference is to return an Enum even if just two states are
> permitted to prevent any confusion.
>
> On Wed, Mar 22, 2023 at 9:15 AM Andrea Aime <
> andrea.a...@geosolutionsgroup.com> wrote:
>
>> HI 

Re: [Geotools-devel] GSIP 218: Control remote HTTP requests sent by GeoTools \ GeoServer

2023-03-23 Thread Andrea Aime
Hi Jody,
the proposal indicates the scope of the initial implementation: WMS remote
SLD, WMS feature portrayal, WPS remote inputs.

That said, yes, entity resolution as set up in AllowListEntityResolver
could be rewritten to operate
against a URLChecker... I guess the GeoServer URL checker could bind
together the list of patterns
configured in the UI, and add the ones that are built-in for the
AllowListEntityResolver (the GUI would
have to clarify that I guess, or have a separate list for those?).

For now it's out of scope, but it's certainly up for consideration down the
line

Cheers
Andrea

On Wed, Mar 22, 2023 at 7:01 PM Jody Garnett  wrote:

> Idea (feel free to indicate if it is out of scope).
>
> Environmental variables were introduced to control access for entity
> resolution:
> - It may be possible to replace these with the new URLChecker and simplify
> the application. Or;
> - show them as a URLChecker that cannot be disabled in the user interface
> (to make it clear they are in play)
>
> Reference:
> https://docs.geoserver.org/2.19.x/en/user/production/config.html#production-config-external-entities
> --
> Jody Garnett
>
>
> On Wed, Mar 22, 2023 at 10:44 AM Andrea Aime <
> andrea.a...@geosolutionsgroup.com> wrote:
>
>> Yep, makes sense, proposal updated.
>>
>> Cheers
>> Andrea
>>
>> On Wed, Mar 22, 2023 at 6:31 PM Jody Garnett 
>> wrote:
>>
>>> Indeed if you are just intended to back from a regex; then rephrase the
>>> javadoc or make the method name more clear than "evaluate":
>>>
>>> /**
>>>  * Provide implementation to evaluate location/URL/URI passed in
>>> string form
>>>  *
>>>  * @param location the subject of evaluation
>>>  * @return true if the location is accepted, false otherwise
>>>  */
>>> boolean evaluate(String location);
>>>
>>> "otherwise" above indicates the location would not be accepted.
>>>
>>> To clarify intent:
>>>
>>> /**
>>>  * Used to confirm location is allowed for use.
>>> *
>>>  * URLChecker is used to confirm if a location is allowed for use,
>>> returning {@true} when they recognize a location as permitted.
>>>  * Several URLChecker instances are expected to be available, as
>>> long as one URLChecker can confirm a location it is permitted for use.
>>>  *
>>>  * @param location Location expressed as URL, URI or path.
>>>  * @return {@code true} indicates the URLChecker can confirm the
>>> location is allowed for use, {@code false} indicates the URLChecker is
>>> unable to confirm.
>>>  */
>>> boolean confirm(String location);
>>>
>>> --
>>> Jody Garnett
>>>
>>>
>>> On Wed, Mar 22, 2023 at 10:07 AM Andrea Aime <
>>> andrea.a...@geosolutionsgroup.com> wrote:
>>>
 Hi Jody,
 while the suggestion seems to clarify things, it seems to me it's
 making the implementation harder.

 With a regular expression based system, how do you distinguish BLOCK
 and NO_OPINION (imagine we'd have different implementations, one based on
 regexes for user configured sites, and another one for the well known
 schema sites, such as schemas.opengis.org and xml.org, or a dynamic
 one allowing a store to declare that the server it's talking to is safe).

 The idea here is that the URL is now allowed, unless
 explicitly approved. All that we're looking for is a "yes".
 The problem with the other state, is that it's really just "not yes",
 without any extra useful semantic attached to it.

 Having a state like "BLOCK" would imply the implementation is based on
 a black list instead (anything but not this one).
 Do you have a use case for it?

 Cheers
 Andrea



 On Wed, Mar 22, 2023 at 5:45 PM Jody Garnett 
 wrote:

> The URL checker has a yes/no response - but is written as a yes/don’t
> care - since to access only one URL checker needs to say yes.
>
> To address feedback:
> - Adjust javadoc, or
> - Provide three states: ALLOW, BLOCK, NO_OPINION
>
> My preference is to return an Enum even if just two states are
> permitted to prevent any confusion.
>
> On Wed, Mar 22, 2023 at 9:15 AM Andrea Aime <
> andrea.a...@geosolutionsgroup.com> wrote:
>
>> HI all,
>> this is a revival of the old GSIP-189, a bit modernized, with a
>> smaller initial scope (that should help us get an implementation going
>> safeguarding some remote access functionality sooner rather than later).
>>
>> Please review, discuss, vote:
>> https://github.com/geoserver/geoserver/wiki/GSIP-218
>>
>> Best regards
>> Andrea
>>
>>
>> ==
>> GeoServer Professional Services from the experts!
>>
>> Visit http://bit.ly/gs-services-us for more information.
>> ==
>>
>> Ing. Andrea Aime
>> @geowolf
>> Technical Lead
>>
>> GeoSolutions Group
>> phone: +39 0584 962313
>>
>> fax: +39 0584 1660272
>>
>> mob: 

Re: [Geotools-devel] [Geoserver-devel] GSIP 218: Control remote HTTP requests sent by GeoTools \ GeoServer

2023-03-23 Thread Roar Brænden
Hi,

Ok, got that.

Regards, Roar

> 23. mar. 2023 kl. 00:09 skrev Jody Garnett :
> 
> Roar:
> 
> I was mostly interested in clarifying the api; I just had an experience with 
> enabling/disabling resources for different layers that had a similar OR test 
> where any true was sufficient - and it was very confusing.
> 
> I do think that when this is ready it can be applied to geotools codebase as 
> a wrapper on the http client… but that is an implementation detail. 
> 
> The important thing is that this is a good addition and we should have an 
> opportunity for future work simplifying the application.  
> 
> On Wed, Mar 22, 2023 at 12:49 PM Roar Brænden  > wrote:
> Hi,
> 
> This looks like something I've been thinking about. Would love to implement 
> such a solution. Too bad I'm not in a position to do so.
> 
> Could that blocking, you wanted Jody, be handled by throwing an exception?
> 
> Should this involve an addition to the HTTPClient interface as well? How to 
> react on a failure to evaluate the url. Should it log a message or throw an 
> exception? I suppose that is the thinking behind that SecureHTTPClient.
> 
> Often it's hard to know exactly where a URL is specified. Especially if we 
> include the XML Schema locations. Since it will be the HTTPClient that will 
> log a warning or throw an exception. How could that object get enough 
> information to give the user good feedback on which store / layer / process 
> the admin needs to change? I suppose that varies from case to case and would 
> be for a later version.
> 
> Best regards,
> Roar Brænden
> 
> 
> 
> 
>> 22. mar. 2023 kl. 19:00 skrev Jody Garnett > >:
>> 
> 
> 
>> Idea (feel free to indicate if it is out of scope).
>> 
>> Environmental variables were introduced to control access for entity 
>> resolution:
>> - It may be possible to replace these with the new URLChecker and simplify 
>> the application. Or;
>> - show them as a URLChecker that cannot be disabled in the user interface 
>> (to make it clear they are in play)
>> 
>> Reference: 
>> https://docs.geoserver.org/2.19.x/en/user/production/config.html#production-config-external-entities
>>  
>> 
>> --
>> Jody Garnett
>> 
>> 
>> On Wed, Mar 22, 2023 at 10:44 AM Andrea Aime 
>> > > wrote:
>> Yep, makes sense, proposal updated.
>> 
>> Cheers
>> Andrea
>> 
>> On Wed, Mar 22, 2023 at 6:31 PM Jody Garnett > > wrote:
>> Indeed if you are just intended to back from a regex; then rephrase the 
>> javadoc or make the method name more clear than "evaluate": 
>> 
>> /**
>>  * Provide implementation to evaluate location/URL/URI passed in string 
>> form
>>  *
>>  * @param location the subject of evaluation
>>  * @return true if the location is accepted, false otherwise
>>  */
>> boolean evaluate(String location);
>> 
>> "otherwise" above indicates the location would not be accepted.
>> 
>> To clarify intent:
>> 
>> /**
>>  * Used to confirm location is allowed for use. 
>> *
>>  * URLChecker is used to confirm if a location is allowed for use, 
>> returning {@true} when they recognize a location as permitted.
>>  * Several URLChecker instances are expected to be available, as long as 
>> one URLChecker can confirm a location it is permitted for use.
>>  * 
>>  * @param location Location expressed as URL, URI or path.
>>  * @return {@code true} indicates the URLChecker can confirm the 
>> location is allowed for use, {@code false} indicates the URLChecker is 
>> unable to confirm.
>>  */
>> boolean confirm(String location);
>> 
>> --
>> Jody Garnett
>> 
>> 
>> On Wed, Mar 22, 2023 at 10:07 AM Andrea Aime 
>> > > wrote:
>> Hi Jody,
>> while the suggestion seems to clarify things, it seems to me it's making the 
>> implementation harder.
>> 
>> With a regular expression based system, how do you distinguish BLOCK and 
>> NO_OPINION (imagine we'd have different implementations, one based on 
>> regexes for user configured sites, and another one for the well known schema 
>> sites, such as schemas.opengis.org  and xml.org 
>> , or a dynamic one allowing a store to declare that the 
>> server it's talking to is safe).
>> 
>> The idea here is that the URL is now allowed, unless explicitly approved. 
>> All that we're looking for is a "yes".
>> The problem with the other state, is that it's really just "not yes", 
>> without any extra useful semantic attached to it.
>> 
>> Having a state like "BLOCK" would imply the implementation is based on a 
>> black list instead (anything but not this one).
>> Do you have a use case for it?
>> 
>> Cheers
>> Andrea
>> 
>> 
>> 
>> On Wed, Mar 22, 2023 at 5:45 PM Jody Garnett >