> and needs some explaining why we put open endpoints on the web without great 
> restrictions

I've always been puzzled by this as well. You almost never see a publicly 
reachable PostgreSQL or MariaDB server, or any other database. There is always 
a layer in between that defines a list of possible requests, and every request 
can then be optimized to retrieve data from the database. With a public 
endpoint this optimization is not possible, since anybody can write any query. 
I think the reason is simply that a SPARQL endpoint is supposed to answer any 
type of query, traversing paths that are not defined a priori. If you only 
want the server to serve specific kinds of queries, you can in fact put some 
kind of REST API in front of it and translate every request to a SPARQL query. 
In this scenario you don't need the endpoint to be public, but you're limiting 
the types of queries that a user can ask.
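
A minimal sketch of that "fixed list of requests" idea, in Python. Everything 
here is hypothetical and only for illustration: the request names, the 
templates, MAX_LIMIT and build_query() are not part of Fuseki or Jena. The 
front-end would send the resulting string to the private endpoint itself.

```python
import re

# Hypothetical whitelist: each public request name maps to one fixed template.
# Doubled braces are literal SPARQL braces; {iri} and {limit} are filled in.
QUERY_TEMPLATES = {
    "types_of": "SELECT DISTINCT ?type WHERE {{ <{iri}> a ?type }} LIMIT {limit}",
    "subjects_of_type": "SELECT ?s WHERE {{ ?s a <{iri}> }} LIMIT {limit}",
}

# Assumed upper bound on result size; pick whatever the server can sustain.
MAX_LIMIT = 1000

# Characters that the SPARQL grammar does not allow inside <...> (IRIREF).
_IRI_FORBIDDEN = re.compile(r'[\x00-\x20<>"{}|^`\\]')


def build_query(name: str, iri: str, limit: int = 100) -> str:
    """Translate a whitelisted request into a SPARQL query string."""
    if name not in QUERY_TEMPLATES:
        raise ValueError(f"unknown request: {name!r}")
    if not iri or _IRI_FORBIDDEN.search(iri):
        raise ValueError("invalid IRI")
    # Clamp the limit so a client cannot ask for an unbounded result.
    limit = max(1, min(int(limit), MAX_LIMIT))
    return QUERY_TEMPLATES[name].format(iri=iri, limit=limit)


if __name__ == "__main__":
    print(build_query("types_of", "http://example.org/thing", limit=5))
```

Because users never submit SPARQL text directly, an expensive query like the 
one quoted below cannot be issued, at the cost of only supporting the requests 
you have templated.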

Sent: Tuesday, December 18, 2018 at 11:40 PM
From: "Marco Neumann" <marco.neum...@gmail.com>
To: "Bruno P. Kinoshita" <brunodepau...@yahoo.com.br>, users@jena.apache.org
Subject: Re: blocking IP to prevent malicious sparql queries
It's good to see people using sparql one way or another. It's still an
unusual thing in the wild and needs some explaining why we put open
endpoints on the web without great restrictions. But since this one is
intended to be a sandbox to play with and learn, I do indeed take a positive
view on this incident.

On Tue 18 Dec 2018 at 21:34, Bruno P. Kinoshita
<brunodepau...@yahoo.com.br.invalid> wrote:

> I think Laura's option is the best/easiest one, and good on you for the
> positive point-of-view on these spams Marco! :D
> Bruno
>
> From: Marco Neumann <marco.neum...@gmail.com>
> To: users@jena.apache.org
> Sent: Wednesday, 19 December 2018 8:58 AM
> Subject: Re: blocking IP to prevent malicious sparql queries
>
> Thank you Laura,
>
> I was hoping for a quick fix and something along the lines of a fuseki
> blacklist filter in the shiro.ini
>
> but yes the reverse proxy is probably a more sensible approach at this
> point.
>
> In any event good to see sparql spam like this here, it means that the
> Semantic Web has most certainly arrived in the mainstream ;)
>
>
>
> On Tue, Dec 18, 2018 at 5:35 PM Laura Morales <laure...@mail.com> wrote:
>
> > While I think the correct answer is YES (perhaps by implementing a custom
> > filter), I guess the answer is going to be "use a reverse proxy".
> >
> >
> >
> >
> > Sent: Tuesday, December 18, 2018 at 6:16 PM
> > From: "Marco Neumann" <marco.neum...@gmail.com>
> > To: users@jena.apache.org
> > Subject: blocking IP to prevent malicious sparql queries
> > Is it possible to block individual IPs with the shiro.ini?
> >
> > We receive a number of malicious sparql queries from an IP in France
> > (193.52.210.70) today
> >
> > that continuously issues the following SPARQL query:
> >
> > SELECT ?r (count(*) AS ?count)
> > WHERE{ ?x ?r ?s
> > { SELECT ?s WHERE
> > { ?s a ?o }
> > OFFSET 124639 LIMIT 1000 }
> > } GROUP BY ?s ?r OFFSET 0 LIMIT 10000
> >
> > resulting in:
> >
> > [2018-12-18 18:10:31] AbstractConnector WARN
> > java.lang.OutOfMemoryError: GC overhead limit exceeded
> > [2018-12-18 18:10:34] Fuseki WARN [424] RC = 500 : GC overhead limit
> > exceeded
> > java.lang.OutOfMemoryError: GC overhead limit exceeded
> > [2018-12-18 18:10:34] Fuseki INFO [424] 500 GC overhead limit exceeded
> > (39.946 s)
> >
> > and pushes fuseki offline for a few minutes.
> >
> >
> > --
> >
> >
> > ---
> > Marco Neumann
> > KONA
> >
>
>
> --
>
>
> ---
> Marco Neumann
> KONA
>
>
>

--


---
Marco Neumann
KONA
