java.net.URLDecoder should do whatever is necessary.
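
A minimal sketch of that decoding step, assuming the queries are UTF-8 and percent-encoded exactly once. The class name DecodeSketch and the helper decode are made up for illustration; only java.net.URLDecoder comes from the JDK. Note that URLDecoder follows form-encoding rules, so it also turns a literal '+' into a space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper class, for illustration only.
final class DecodeSketch {

    // Decode one percent-encoded string, assuming UTF-8.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Fragment taken from the encoded query quoted below.
        String encoded = "FILTER%20%28lang%28%3Fabstract%29%20%3D%20%27en%27%29";
        System.out.println(decode(encoded));
    }
}
```

The decoded string can then be passed to the ARQ parser as ordinary query text.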

        Andy

On 17/05/11 13:47, Svatopluk Šperka wrote:
Hi,

I'm using Jena's ARQ to parse a large number of URI-encoded queries.
I would like to ask if anyone has a tip for the best way to
preprocess the queries before feeding them to ARQ.

The problem is that if I use simple URI decoding, everything gets
decoded, including the %22 escapes inside IRIs.

For example, if I have the encoded query:

%0APREFIX%20foaf%3A%20%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0APREFIX%20p%3A%20%20%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2F%3E%0ASELECT%20%2A%20WHERE%20%7B%20%0A%20%20%20%20%20%20%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FThomas_%22Scotch_Tom%22_Nelson%3E%20p%3Aabstract%20%3Fabstract.%0A%20%20%20%20%20%20%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FThomas_%22Scotch_Tom%22_Nelson%3E%20foaf%3Apage%20%3Fwiki.%20%0A%20%20%20%20%20%20OPTIONAL%20%7B%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FThomas_%22Scotch_Tom%22_Nelson%3E%20foaf%3Adepiction%20%3Fimg.%7D%0A%20%20%20%20%20%20FILTER%20%28lang%28%3Fabstract%29%20%3D%20%27en%27%29%0A%20%20%20%7D%0A

 it turns into

\nPREFIX foaf:<http://xmlns.com/foaf/0.1/>\nPREFIX
p:<http://dbpedia.org/property/>\nSELECT * WHERE {
\n<http://dbpedia.org/resource/Thomas_"Scotch_Tom"_Nelson>
p:abstract
?abstract.\n<http://dbpedia.org/resource/Thomas_"Scotch_Tom"_Nelson>
foaf:page ?wiki. \n      OPTIONAL
{<http://dbpedia.org/resource/Thomas_"Scotch_Tom"_Nelson>
foaf:depiction ?img.}\n      FILTER (lang(?abstract) = 'en')\n
}\n"

which cannot be parsed because of quotes in
http://dbpedia.org/resource/Thomas_"Scotch_Tom"_Nelson.

This is not a simple encoding problem.

Either the original query had " in it, in which case it's bad (it should have had %22), or it did have %22, in which case that gets encoded into %2522.

%25 is % itself, so %2522 is the encoded form of the literal text %22.
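
A small sketch of the difference, assuming UTF-8 and a single decoding pass. The class name DoubleEncodingSketch and its helpers are made up for illustration; URLDecoder and URLEncoder are the JDK classes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper class, for illustration only.
final class DoubleEncodingSketch {

    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // %22 in the transmitted form becomes a raw double quote.
        System.out.println(decode("Thomas_%22Scotch_Tom%22_Nelson"));
        // %2522 in the transmitted form becomes the text %22, which stays escaped.
        System.out.println(decode("Thomas_%2522Scotch_Tom%2522_Nelson"));
        // Encoding the literal text %22 yields %2522.
        System.out.println(encode("%22"));
    }
}
```

So a query whose IRIs contained %22 before encoding keeps those escapes after one decoding pass; a raw quote after decoding means the quote was already unescaped in the original query.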

        Andy

Is there some canonical way to do the preprocessing correctly?

Thank you very much.


Svatopluk Šperka
