java.net.URLDecoder should do whatever is necessary.
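
A minimal sketch of that decoding step, assuming the queries are percent-encoded UTF-8. The class name DecodeQuery and the sample query are made up for illustration. Note that URLDecoder also turns '+' into a space, which matters if the input was not form-encoded.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class DecodeQuery {
    // Decode a percent-encoded string as UTF-8 (hypothetical helper).
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Illustrative query, not the one from the original message.
        String encoded = "SELECT%20%2A%20WHERE%20%7B%20%3Fs%20%3Fp%20%3Fo%20%7D";
        System.out.println(decode(encoded)); // prints: SELECT * WHERE { ?s ?p ?o }
    }
}
```

The decoded string is what would then be handed to ARQ for parsing.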
Andy
On 17/05/11 13:47, Svatopluk Šperka wrote:
Hi,
I'm using Jena's ARQ to parse a large number of URI-encoded queries.
Does anyone have a tip for the best way to preprocess the queries
before feeding them to ARQ?
The problem is that if I use simple URI decoding, everything gets
decoded, including characters that need to stay escaped inside IRIs.
For example, if I have the encoded query:
%0APREFIX%20foaf%3A%20%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0APREFIX%20p%3A%20%20%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2F%3E%0ASELECT%20%2A%20WHERE%20%7B%20%0A%20%20%20%20%20%20%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FThomas_%22Scotch_Tom%22_Nelson%3E%20p%3Aabstract%20%3Fabstract.%0A%20%20%20%20%20%20%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FThomas_%22Scotch_Tom%22_Nelson%3E%20foaf%3Apage%20%3Fwiki.%20%0A%20%20%20%20%20%20OPTIONAL%20%7B%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FThomas_%22Scotch_Tom%22_Nelson%3E%20foaf%3Adepiction%20%3Fimg.%7D%0A%20%20%20%20%20%20FILTER%20%28lang%28%3Fabstract%29%20%3D%20%27en%27%29%0A%20%20%20%7D%0A
it turns into
\nPREFIX foaf:<http://xmlns.com/foaf/0.1/>\nPREFIX
p:<http://dbpedia.org/property/>\nSELECT * WHERE {
\n<http://dbpedia.org/resource/Thomas_"Scotch_Tom"_Nelson>
p:abstract
?abstract.\n<http://dbpedia.org/resource/Thomas_"Scotch_Tom"_Nelson>
foaf:page ?wiki. \n OPTIONAL
{<http://dbpedia.org/resource/Thomas_"Scotch_Tom"_Nelson>
foaf:depiction ?img.}\n FILTER (lang(?abstract) = 'en')\n
}\n"
which cannot be parsed because of the quotes in
http://dbpedia.org/resource/Thomas_"Scotch_Tom"_Nelson.
This is not a simple encoding problem.
Either the original query had a raw " in its IRIs, in which case the
query itself is bad (it should have used %22), or it did use %22, in
which case that should have been encoded as %2522 when the whole
query was encoded.
%25 is % itself, so %2522 is the encoding of %22.
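
The relationship between %22 and %2522 can be checked with a short round trip. This is only a sketch using java.net.URLEncoder and java.net.URLDecoder; the class name DoubleEncoding and its helper methods are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class DoubleEncoding {
    // Percent-encode a string as UTF-8 (hypothetical helper).
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Percent-decode a string as UTF-8 (hypothetical helper).
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A query that already writes the quote as %22 inside the IRI:
        // encoding the whole query escapes the % sign itself.
        System.out.println(encode("%22"));   // prints: %2522
        // One decode gives back %22, which is what the parser should see.
        System.out.println(decode("%2522")); // prints: %22
        // If the wire form only had %22, one decode yields a raw quote,
        // which is the situation in the failing query.
        System.out.println(decode("%22"));   // prints: "
    }
}
```

So if a single decode leaves a raw " inside <...>, the query text was already wrong before it was encoded.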
Andy
Is there a canonical way to do the preprocessing correctly?
Thank you very much.
Svatopluk Šperka