URL encoding in Java is properly done through the java.net.URI class, particularly when using this[1] constructor:

URI(String scheme, String userInfo, String host, int port, String path, String query, String fragment)

With the above constructor, characters that are illegal in the query part (such as spaces) are percent-encoded automatically. So possibly the solution here is to allow GetHTTP to expose the scheme, userInfo, host, port, path, query and fragment parameters as configuration properties? Expression language could still be used on these properties, of course, but running the values through urlEncode wouldn't be necessary.

[1] http://docs.oracle.com/javase/7/docs/api/java/net/URI.html#URI%28java.lang.String,%20java.lang.String,%20java.lang.String,%20int,%20java.lang.String,%20java.lang.String,%20java.lang.String%29

On Tue, Dec 15, 2015 at 5:46 PM, Mark Payne <[email protected]> wrote:

> That looks about right to me. But that seems a little bit painful. Perhaps we can make that easier by offering a new Property that allows you to just tell the Processor that it needs to do the URL Encoding for you. Would that make life easier for you?
>
> Thanks
> -Mark
>
> > On Dec 15, 2015, at 5:43 PM, Igor Kravzov <[email protected]> wrote:
> >
> > Guys, I think I get it. Is this how it should be?
> >
> > http://api.twingly.com/analytics/Analytics.ashx?key=75744154-6ACB-3340-937A-9B5A59FA8F30&searchpattern=${literal('Matt Dupe'):urlEncode()}&xmloutputversion=2&ts=${now():minus(600000):format("yyyy-MM-dd HH:mm:ss")}&tsTo=${now():format("yyyy-MM-dd HH:mm:ss")}
> >
> > On Tue, Dec 15, 2015 at 5:38 PM, Joe Percivall <[email protected]> wrote:
> >
> >> I think you're thinking about it the wrong way.
> >> You aren't actually using any attributes (attributes are on flowfiles, properties on processors), just utilizing the functions of expression language. So you can manually put in everything you would normally as a literal, then reference the expression language functions in the expression using ${*expression*}.
> >>
> >> That being said, since GetHTTP is a source processor you shouldn't be changing that URL all that often, so you could just manually replace the space with "%20" (URL encoding of a space).
> >>
> >> Here is a link to a URL encoder/decoder:
> >> http://meyerweb.com/eric/tools/dencoder/
> >>
> >> Joe
> >> - - - - - -
> >> Joseph Percivall
> >> linkedin.com/in/Percivall
> >> e: [email protected]
> >>
> >> On Tuesday, December 15, 2015 5:32 PM, Igor Kravzov <[email protected]> wrote:
> >> But how do I define the attribute "url" in the GetHTTP processor? I tried to add a property and got an error that the property is not defined or supported.
> >>
> >> On Tue, Dec 15, 2015 at 4:57 PM, Joe Percivall <[email protected]> wrote:
> >>
> >>> Hello Igor,
> >>>
> >>> The URL property for GetHTTP supports expression language. Check out this function:
> >>> https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#urlencode
> >>>
> >>> Hope that helps,
> >>> Joe
> >>>
> >>> - - - - - -
> >>> Joseph Percivall
> >>> linkedin.com/in/Percivall
> >>> e: [email protected]
> >>>
> >>> On Tuesday, December 15, 2015 4:46 PM, Igor Kravzov <[email protected]> wrote:
> >>>
> >>> Hi Joe,
> >>>
> >>> Another quick question. How can I encode some part of the URL? In the example below, the space in "Matt Dupe" after searchpattern= needs to be encoded.
> >>> Otherwise GetHTTP throws an error.
> >>>
> >>> http://api.twingly.com/analytics/Analytics.ashx?key=75744154-6ACB-3340-937A-9B5A59FA8F30&searchpattern=Matt Dupe&xmloutputversion=2&ts=${now():minus(600000):format("yyyy-MM-dd HH:mm:ss")}&tsTo=${now():format("yyyy-MM-dd HH:mm:ss")}
> >>>
> >>> On Mon, Dec 14, 2015 at 6:16 PM, Joe Percivall <[email protected]> wrote:
> >>>
> >>>> Glad I could help and thanks!
> >>>> - - - - - -
> >>>> Joseph Percivall
> >>>> linkedin.com/in/Percivall
> >>>> e: [email protected]
> >>>>
> >>>> On Monday, December 14, 2015 6:09 PM, Igor Kravzov <[email protected]> wrote:
> >>>> Thank you very much Joe. It worked.
> >>>> And congratulations.
> >>>>
> >>>> On Mon, Dec 14, 2015 at 6:00 PM, Joe Percivall <[email protected]> wrote:
> >>>>
> >>>>> Hello Igor,
> >>>>>
> >>>>> You're having trouble because the date format contains a space, which needs to be URL encoded. You can use the EL function "urlEncode" to get a valid expression like this:
> >>>>>
> >>>>> http://api.twingly.com/analytics/Analytics.ashx?key=75744154-6ACB-3340-937A-9B5A59FA8F30&searchpattern=boycott&xmloutputversion=2&ts=${now():minus(600000):format("yyyy-MM-dd HH:mm:ss"):urlEncode()}&tsTo=${now():format("yyyy-MM-dd HH:mm:ss"):urlEncode()}
> >>>>>
> >>>>> Hope that helps,
> >>>>> Joe
> >>>>> - - - - - -
> >>>>> Joseph Percivall
> >>>>> linkedin.com/in/Percivall
> >>>>> e: [email protected]
> >>>>>
> >>>>> On Monday, December 14, 2015 5:31 PM, Igor Kravzov <[email protected]> wrote:
> >>>>> Hi guys,
> >>>>>
> >>>>> Why am I getting the error below?
> >>>>> I am constructing the URL like this:
> >>>>>
> >>>>> http://api.twingly.com/analytics/Analytics.ashx?key=75744154-6ACB-3340-937A-9B5A59FA8F30&searchpattern=boycott&xmloutputversion=2&ts=${now():minus(600000):format("yyyy-MM-dd HH:mm:ss")}&tsTo=${now():format("yyyy-MM-dd HH:mm:ss")}
> >>>>>
> >>>>> Am I missing something? Thanks in advance.
> >>>>>
> >>>>> java.lang.IllegalArgumentException: Illegal character in query at index 143: http://api.twingly.com/analytics/Analytics.ashx?key=75744154-6ACB-3340-937A-9B5A59FA8F30&searchpattern=boycott&xmloutputversion=2&ts=2015-12-14 17:03:46&tsTo=2015-12-14 17:13:46
> >>>>>     at java.net.URI.create(Unknown Source) ~[na:1.8.0_66]
> >>>>>     at org.apache.http.client.methods.HttpGet.<init>(HttpGet.java:69) ~[httpclient-4.4.1.jar:4.4.1]
> >>>>>     at org.apache.nifi.processors.standard.GetHTTP.onTrigger(GetHTTP.java:444) ~[na:na]
> >>>>>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1146) ~[nifi-framework-core-0.4.0.jar:0.4.0]
> >>>>>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:139) [nifi-framework-core-0.4.0.jar:0.4.0]
> >>>>>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:49) [nifi-framework-core-0.4.0.jar:0.4.0]
> >>>>>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:119) [nifi-framework-core-0.4.0.jar:0.4.0]
> >>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [na:1.8.0_66]
> >>>>>     at java.util.concurrent.FutureTask.runAndReset(Unknown Source) [na:1.8.0_66]
> >>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source) [na:1.8.0_66]
> >>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [na:1.8.0_66]
> >>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.8.0_66]
> >>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.8.0_66]
> >>>>>     at java.lang.Thread.run(Unknown Source) [na:1.8.0_66]
> >>>>> Caused by: java.net.URISyntaxException: Illegal character in query at index 143: http://api.twingly.com/analytics/Analytics.ashx?key=75744154-6ACB-3340-937A-9B5A59FA8F30&searchpattern=boycott&xmloutputversion=2&ts=2015-12-14 17:03:46&tsTo=2015-12-14 17:13:46
> >>>>>     at java.net.URI$Parser.fail(Unknown Source) ~[na:1.8.0_66]
> >>>>>     at java.net.URI$Parser.checkChars(Unknown Source) ~[na:1.8.0_66]
> >>>>>     at java.net.URI$Parser.parseHierarchical(Unknown Source) ~[na:1.8.0_66]
> >>>>>     at java.net.URI$Parser.parse(Unknown Source) ~[na:1.8.0_66]
> >>>>>     at java.net.URI.<init>(Unknown Source) ~[na:1.8.0_66]
> >>>>>     ... 14 common frames omitted

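P.S. For anyone who wants to try the multi-argument java.net.URI constructor mentioned at the top of this thread, here is a minimal sketch of how it treats the query from the failing request. The class name UriEncodingSketch and the helper build() are made up for illustration and are not part of NiFi or GetHTTP; the API key is left out. It only relies on java.net.URI behaviour: characters that are illegal in a URI, such as spaces, get percent-encoded, while legal delimiters such as '&' and '=' are left alone, so a value that itself contains '&' would still need separate encoding.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriEncodingSketch {

    // Hypothetical helper: assemble an http URI from its parts.
    // The seven-argument constructor percent-encodes characters that are
    // illegal in a URI (e.g. spaces) but leaves '&' and '=' untouched.
    static URI build(String host, String path, String query) {
        try {
            // No userInfo, default port (-1), no fragment.
            return new URI("http", null, host, -1, path, query, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Query values modelled on the request in this thread (key omitted).
        String query = "searchpattern=Matt Dupe&ts=2015-12-14 17:03:46";

        URI uri = build("api.twingly.com", "/analytics/Analytics.ashx", query);
        System.out.println(uri.getRawQuery());
        // prints: searchpattern=Matt%20Dupe&ts=2015-12-14%2017:03:46

        // Parsing the same unencoded string directly fails with an
        // IllegalArgumentException, as in the stack trace quoted above.
        try {
            URI.create("http://api.twingly.com/analytics/Analytics.ashx?" + query);
        } catch (IllegalArgumentException e) {
            System.out.println("URI.create rejected it: " + e.getMessage());
        }
    }
}
```

If GetHTTP exposed the URL components as separate properties, something along these lines could presumably sit behind them, but that is an assumption about a possible design, not a description of the current processor.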