> and needs some explaining why we put open endpoints on the web without great
> restrictions

I've always been puzzled by this as well. You never see a publicly reachable PostgreSQL or MariaDB server, or any other database. There is always a layer in between that defines a list of possible requests, and every request is then optimized to retrieve data from the database. With a public endpoint this optimization is not possible, since anybody can write any query. I think the reason is simply that a SPARQL endpoint is supposed to answer any type of query, traversing paths that are not defined a priori. If you only want the server to answer specific kinds of queries, you can put some kind of REST API in front of it and translate every request to a SPARQL query; in that scenario the endpoint doesn't need to be public, but you're limiting the kinds of queries a user can ask.

Sent: Tuesday, December 18, 2018 at 11:40 PM
From: "Marco Neumann" <marco.neum...@gmail.com>
To: "Bruno P. Kinoshita" <brunodepau...@yahoo.com.br>, users@jena.apache.org
Subject: Re: blocking IP to prevent malicious sparql queries

It's good to see people using sparql one way or another. It's still an unusual thing in the wild and needs some explaining why we put open endpoints on the web without great restrictions. But since this one is intended to be a sandbox to play with and learn I take indeed a positive view on this incident.

On Tue 18 Dec 2018 at 21:34, Bruno P. Kinoshita <brunodepau...@yahoo.com.br.invalid> wrote:

> I think Laura's option is the best/easiest one, and good on you for the
> positive point-of-view on these spams Marco!
> :D
>
> Bruno
>
> From: Marco Neumann <marco.neum...@gmail.com>
> To: users@jena.apache.org
> Sent: Wednesday, 19 December 2018 8:58 AM
> Subject: Re: blocking IP to prevent malicious sparql queries
>
> Thank you Laura,
>
> I was hoping for a quick fix and something along the lines of a fuseki
> blacklist filter in the shiro.ini
>
> but yes the reverse proxy is probably a more sensible approach at this
> point.
>
> In any event good to see sparql spam like this here, it means that the
> Semantic Web has most certainly arrived in the mainstream ;)
>
> On Tue, Dec 18, 2018 at 5:35 PM Laura Morales <laure...@mail.com> wrote:
>
> > While I think the correct answer is YES (perhaps by implementing a custom
> > filter), I guess the answer is going to be "use a reverse proxy".
> >
> > Sent: Tuesday, December 18, 2018 at 6:16 PM
> > From: "Marco Neumann" <marco.neum...@gmail.com>
> > To: users@jena.apache.org
> > Subject: blocking IP to prevent malicious sparql queries
> >
> > is it possible to block individual IPs with the shiro.ini?
> >
> > We receive a number of malicious sparql queries from an IP in France
> > (193.52.210.70) today
> >
> > that continuously issues the following SPARQL query:
> >
> > SELECT ?r (count(*) AS ?count)
> > WHERE{ ?x ?r ?s
> > { SELECT ?s WHERE
> > { ?s a ?o }
> > OFFSET 124639 LIMIT 1000 }
> > } GROUP BY ?s ?r OFFSET 0 LIMIT 10000
> >
> > resulting in:
> >
> > [2018-12-18 18:10:31] AbstractConnector WARN
> > java.lang.OutOfMemoryError: GC overhead limit exceeded
> > [2018-12-18 18:10:34] Fuseki WARN [424] RC = 500 : GC overhead limit exceeded
> > java.lang.OutOfMemoryError: GC overhead limit exceeded
> > [2018-12-18 18:10:34] Fuseki INFO [424] 500 GC overhead limit exceeded (39.946 s)
> >
> > and pushes fuseki offline for a few minutes.
> >
> > --
> > ---
> > Marco Neumann
> > KONA
>
> --
> ---
> Marco Neumann
> KONA

--
---
Marco Neumann
KONA
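
For what it's worth, the "REST API in front of the endpoint" idea from the reply at the top of this thread could look roughly like the sketch below. It is only an illustration, not anything Fuseki ships: the request names, the templates, `MAX_LIMIT` and `build_query` are all made up for the example. The point is that callers pick a named request and supply a few checked parameters, so they never send raw SPARQL and the result size stays bounded.

```python
import re

# Hypothetical whitelist: each REST request name maps to one fixed SPARQL template.
# Doubled braces are literal braces after str.format().
QUERY_TEMPLATES = {
    "types-of": "SELECT ?o WHERE {{ <{iri}> a ?o }} LIMIT {limit}",
    "properties-of": "SELECT DISTINCT ?p WHERE {{ <{iri}> ?p ?v }} LIMIT {limit}",
}

# Assumed upper bound on rows per request; pick whatever the server can afford.
MAX_LIMIT = 1000

# Rough check that the value is safe to place between < and > in a query:
# rejects whitespace and the characters SPARQL forbids inside an IRI reference.
IRI_PATTERN = re.compile(r'^[^\s<>"{}|\\^`]+$')


def build_query(name, iri, limit=100):
    """Translate a whitelisted request into a SPARQL query string."""
    if name not in QUERY_TEMPLATES:
        raise ValueError(f"unknown request: {name}")
    if not IRI_PATTERN.match(iri):
        raise ValueError("invalid IRI")
    # Clamp the limit so a caller cannot ask for an unbounded result set.
    limit = max(1, min(int(limit), MAX_LIMIT))
    return QUERY_TEMPLATES[name].format(iri=iri, limit=limit)


if __name__ == "__main__":
    print(build_query("types-of", "http://example.org/resource", limit=50))
```

The resulting string would then be sent to the non-public endpoint by the REST layer. The trade-off is the one already mentioned: users can only ask the questions the templates allow.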