BBlack added a comment.

In https://phabricator.wikimedia.org/T112151#1739918, @Smalyshev wrote:

> > As Andrew said above, why not support this directly in WDQS if you have to 
> > support it at all?
>
>
> Because in Blazegraph, allowing POST means allowing write requests. This is 
> not good for security.


So even in Blazegraph, GET and POST have their standard semantic meanings... why 
is it that clients don't honor this?

> > even though most (all?) SPARQL traffic is readonly and probably can be 
> > cached
>
> I don't think it is a good idea to cache SPARQL queries. They are big, they 
> are rarely repeated as-is, and if they are repeated this is usually because 
> the client expects a new result. At least until we have some setup that 
> repeatedly requests the same data over SPARQL, I don't see much point in 
> caching SPARQL responses.


It would be wiser to give them some (perhaps minimal) cacheability via 
Cache-Control, if nothing else as a buffer against simplistic DoS attacks that 
spam the same query at high rates...
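
To illustrate what "minimal cacheability" could look like, here is a rough 
sketch, not anything wdqs actually runs. The function name `cache_headers` and 
the 60-second default are assumptions made up for this example; the only point 
is that read requests get a short public lifetime and everything else is left 
uncached.

```python
def cache_headers(method, max_age=60):
    """Return a Cache-Control header for a response to `method`.

    Hypothetical helper: the name and the 60-second default are
    illustrative assumptions, not existing wdqs configuration.
    """
    if method.upper() in ("GET", "HEAD"):
        # Read-only request: allow shared caches to reuse it briefly, so a
        # burst of identical queries is absorbed before reaching the backend.
        return {"Cache-Control": "public, max-age=%d" % max_age}
    # Anything else (e.g. POST) may change state and is never cached.
    return {"Cache-Control": "no-store"}
```

Even a lifetime of a few seconds would collapse a flood of identical GETs into 
roughly one backend query per cache per interval.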

> > This also conflicts with our overall strategy for multi-datacenter work,
>
> Given that there is no multi-datacenter setup for wdqs, I'm not sure how it 
> is relevant. Most SPARQL clients for which it is relevant probably wouldn't 
> have cookie storage mechanisms anyway.


Eventually **everything** will be multi-datacenter, so yes, that includes wdqs 
and it is very relevant.  Every service is going to have to account for it in 
the long run in how it manages state.  One of our mechanisms for balancing 
traffic and avoiding state replication lag works like this: a normal GET 
request (no special cookie) is balanced between the DCs based on geography and 
load, but POST requests are always directed to the primary DC only.  
Additionally, once a POST request is seen, it sets a short-duration session 
cookie that maps all of that client's GET requests to the primary DC as well 
(so that they don't suffer lag effects when reading the results of their own 
modifications).  If a read-(only|mostly) service uses POST for all of its 
traffic due to client deficiency, all of that traffic will be stuck on the 
primary DC and will not benefit from read load balancing.
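
The routing rule described above can be sketched roughly as follows. This is 
only an illustration of the idea, not the actual traffic-layer implementation: 
the cookie name `sticky_primary`, the 10-second lifetime, the DC labels and the 
function `route_request` are all hypothetical.

```python
PRIMARY_DC = "primary"            # hypothetical label for the primary DC
STICKY_COOKIE = "sticky_primary"  # hypothetical cookie name
STICKY_TTL_SECONDS = 10           # hypothetical "short duration"


def route_request(method, cookies, nearest_dc):
    """Return (datacenter, cookies_to_set) for one incoming request.

    `cookies` is the set of cookie names sent by the client; `nearest_dc`
    is whatever the geography/load balancing would normally pick.
    """
    if method.upper() == "POST":
        # Writes always go to the primary, and pin the client there briefly.
        return PRIMARY_DC, {STICKY_COOKIE: STICKY_TTL_SECONDS}
    if STICKY_COOKIE in cookies:
        # Recent writer: read from the primary to avoid replication lag.
        return PRIMARY_DC, {}
    # Plain read: balance normally.
    return nearest_dc, {}
```

With this rule, a client that sends every query as POST gets `PRIMARY_DC` on 
every call, which is exactly the problem for a read-mostly service.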


TASK DETAIL
  https://phabricator.wikimedia.org/T112151

To: Smalyshev, BBlack
Cc: JanZerebecki, BBlack, Andrew, Deskana, Joe, gerritbot, nichtich, Jneubert, 
Karima, Aklapper, Smalyshev, JGirault, jkroll, Wikidata-bugs, Jdouglas, aude, 
Manybubbles


