Re: [PLUG] question on linux tool to clean URLs

2019-02-06 Thread Ben Koenig
I don't mean to argue against sed/awk here, but the thing to remember is that while parameters should start with a '?', they don't have to. This is nothing more than a convention used by most webservers, and not a technical requirement. So the problem with doing a

Re: [PLUG] question on linux tool to clean URLs

2019-02-06 Thread John Sechrest
I am sure this is buildable with a one-line Perl script. Probably with sed as well. It depends on the level of cleaning you want. Likely, you get 90% of the way just by cutting off everything after the ? in the URL ... including the ? On Wed, Feb 6, 2019, 4:52 PM Ben Koenig I don't know of a tool
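A minimal Python sketch of the "cut off everything after the ?" idea, applied to URLs embedded in ordinary text. The names `URL_RE`, `cut_query`, and `clean_text` are made up for illustration, and the URL pattern is a rough assumption (anything starting with http:// or https:// up to the next whitespace), not a full URL grammar. Note that this blunt approach also drops legitimate parameters, fragments, and any punctuation that trails the query string.

```python
import re

# Rough assumption: a URL is "http(s)://" followed by non-whitespace characters.
URL_RE = re.compile(r"https?://\S+")


def cut_query(url: str) -> str:
    """Return the URL with the first '?' and everything after it removed."""
    return url.partition("?")[0]


def clean_text(text: str) -> str:
    """Apply cut_query to every URL-looking token in the text."""
    return URL_RE.sub(lambda m: cut_query(m.group(0)), text)


if __name__ == "__main__":
    print(clean_text("see https://foo/bar/baz?evil_tag=evil for details"))
```

URLs without a '?' pass through unchanged, so the function can be run over a whole file without first selecting the affected lines.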

Re: [PLUG] question on linux tool to clean URLs

2019-02-06 Thread Ben Koenig
I don't know of a tool that does this, but URL formatting is common for a lot of programming tasks. If you know Python, setting up a small script that returns specific pieces of a URL is trivial. https://docs.python.org/3/library/urllib.parse.html#module-urllib.parse Qt5 (and probably GTK too)
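As a rough illustration of the urllib.parse approach, the sketch below splits a URL, drops query parameters that look like tracking tags, and reassembles the rest. The function name `strip_tracking` and the two filter lists are assumptions made for this example: only fbclid is named in the thread, and the `utm_` prefix is added as a commonly seen tracking prefix. A real script would need its own list.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed filter lists: "fbclid" comes from the thread, "utm_" is an illustrative guess.
TRACKING_EXACT = {"fbclid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url: str) -> str:
    """Return the URL with tracking-style query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_EXACT and not key.startswith(TRACKING_PREFIXES)
    ]
    # urlencode re-encodes the surviving pairs, so their escaping may be normalized.
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(strip_tracking("https://example.com/page?id=7&fbclid=abc&utm_source=mail"))
```

Unlike cutting at the '?', this keeps parameters the page may actually need (such as `id` above) and leaves the path and fragment alone.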

Re: [PLUG] question on linux tool to clean URLs

2019-02-05 Thread David Barr
Hey, Randall, To be pedantic, the tracking tags and such all appear after the question mark delimiter in the URL of the HTTP GET request, right? `https://foo/bar/baz?evil_tag=evil` The trick then, is to select only the lines containing question marks, and then delete from the

Re: [PLUG] question on linux tool to clean URLs

2019-02-05 Thread Rich Shepard
On Tue, 5 Feb 2019, logical american wrote: Is there a linux tool which cleans up the URLs in a text file (I believe Western unicode encoding) so that all the tracking tags, fbclid, etc are removed and the pure URL is left in the text? In one recent email I received, there were 28

[PLUG] question on linux tool to clean URLs

2019-02-05 Thread logical american
Hi: Is there a linux tool which cleans up the URLs in a text file (I believe Western unicode encoding) so that all the tracking tags, fbclid, etc are removed and the pure URL is left in the text? In one recent email I received, there were 28 govdelivery.com tags and others embedded inside