I've added some more detail to the KIP [1] around current scenarios that might break in the future. I actually came up with a second limitation that we'd impose on users and also documented this.
Let me know what you think. Kind regards, Sönke [1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-212%3A+Enforce+set+of+legal+characters+for+connector+names On Thu, Nov 16, 2017 at 2:59 PM, Sönke Liebau <soenke.lie...@opencore.com> wrote: > Hi Randall, > > I had mentioned this edge case in the KIP, but will add some further > detail to further clarify all changing scenarios post pull request. > > Kind regards, > Sönke > > > > > > On Thu, Nov 16, 2017 at 2:06 PM, Randall Hauch <rha...@gmail.com> wrote: > >> No, we need to keep the KIP since we want to change/correct the existing >> behavior. But we do need to clarify in the KIP these edge cases that will >> change. >> >> Thanks for the continued work on this, Sönke. >> >> Regards, >> >> Randall >> >> > On Nov 16, 2017, at 1:56 AM, Sönke Liebau <soenke.lie...@opencore.com >> .INVALID> wrote: >> > >> > Hi Randall, >> > >> > zero length definitely works, that's what sent me down this hole in the >> > first place. I had a customer accidentally create a connector without a >> > name in his environment and then be unable to delete it. No connector >> name >> > doesn't work, as this throws a null pointer exception due to KAFKA-4938 >> , >> > but once that is fixed would create a connector named "null" I think. >> Have >> > not retested this, but seen it in the past. >> > >> > Also, it is possible to create connectors with trailing and leading >> > whitespaces, this errors out on the create request (which will be fixed >> > when KAFKA-4827 is merged), but correctly creates the connector and you >> can >> > access it if you percent-escape the curl call. This for me is the main >> > reason why a KIP might be needed, as we are changing public facing >> behavior >> > here. I agree with you, that this will probably not affect anyone or >> hardly >> > anyone, but in principle it is a change that should need a KIP I think. >> > >> > I've retested and documented this for Confluent 3.3.0: >> > https://gist.github.com/soenkeliebau/9363745cff23560fcc234d9b64ac14c4 >> > >> > I am of course happy to withdraw the KIP if you think it is unnecessary, >> > I've also updated the pull request for KAFKA-4930 to reflect the changes >> > stated in the KIP and tested the code with Arjuns pull request for >> > KAFKA-4827 to ensure they don't interfere with each other. >> > >> > Let me know what you think. >> > >> > Kind regards, >> > Sönke >> > >> > ᐧ >> > >> >> On Tue, Nov 14, 2017 at 7:03 PM, Randall Hauch <rha...@gmail.com> >> wrote: >> >> >> >> Thanks for updating the KIP to reflect the current process. However, I >> >> still question whether it is necessary to have a KIP - it depends on >> >> whether it was possible with prior versions to have connectors with >> >> zero-length or blank names. Have you tried both of these cases? >> >> >> >> On Fri, Nov 10, 2017 at 3:52 AM, Sönke Liebau < >> >> soenke.lie...@opencore.com.invalid> wrote: >> >> >> >>> Hi Randall, >> >>> >> >>> I have set aside some time to work on this next week. The fix itself >> is >> >>> quite simple, but I've yet to write tests to properly catch this, >> which >> >>> turns out to be a bit more complex, as it needs a running restserver >> >> which >> >>> is mocked in the tests I've looked at so far. >> >>> >> >>> Should I withdraw the KIP or update it to reflect the documentation >> >> changes >> >>> and enforced rules around trimming and zero length connector names? >> This >> >> is >> >>> a change to existing behavior, even if it is quite small and probably >> >> won't >> >>> even be noticed by many people.. >> >>> >> >>> best regards, >> >>> Sönke >> >>> >> >>>> On Thu, Nov 9, 2017 at 9:10 PM, Randall Hauch <rha...@gmail.com> >> wrote: >> >>>> >> >>>> Any progress on updating the PR and withdrawing KIP-212? >> >>>> >> >>>> On Fri, Oct 27, 2017 at 5:19 PM, Randall Hauch <rha...@gmail.com> >> >> wrote: >> >>>> >> >>>>> Yes, connector names should not be blank or contain just whitespace. >> >> In >> >>>>> fact, I might recommend that we trim whitespace at the front and >> rear >> >>> of >> >>>>> new connector names and then disallowing any zero-length name. >> >> Existing >> >>>>> connectors would remain valid, and this would not break backward >> >>>>> compatibility. That might require a small kip simply to update the >> >>>>> documentation and specify what names are valid. >> >>>>> >> >>>>> WDYT? >> >>>>> >> >>>>> Randall >> >>>>> >> >>>>> On Fri, Oct 27, 2017 at 1:08 PM, Colin McCabe <cmcc...@apache.org> >> >>>> wrote: >> >>>>> >> >>>>>>> On Wed, Oct 25, 2017, at 01:07, Sönke Liebau wrote: >> >>>>>>> I've spent some time looking at this and testing various >> >> characters >> >>>> and >> >>>>>>> it >> >>>>>>> would appear that Randall's suspicion was spot on. I think we can >> >>>>>> support >> >>>>>>> a >> >>>>>>> fairly large set of characters with very minor changes. >> >>>>>>> >> >>>>>>> I was put of by the exceptions that were thrown when creating >> >>>> connectors >> >>>>>>> with certain characters and suspected a larger underlying problem >> >>> when >> >>>>>> in >> >>>>>>> fact the only issue is, that the URL in the rest request used to >> >>>>>> retrieve >> >>>>>>> the response for the create connector request needs to be percent >> >>>>>> encoded >> >>>>>>> [1]. >> >>>>>>> >> >>>>>>> I've fixed this and done some local testing which worked out quite >> >>>>>>> nicely, >> >>>>>>> apart from two special cases, I've not been able to find >> >> characters >> >>>> that >> >>>>>>> created issues, even space and slash work. >> >>>>>>> The mentioned special cases are: >> >>>>>>> \ - if the name contains a backslash that is not the beginning >> >>> of a >> >>>>>>> valid escape sequence the request fails before we ever get it in >> >>>>>>> ConnectorsResource, so a backslash would need to be escaped: \\ >> >>>>>>> " - Quotation marks need to be escaped as well to keep the json >> >>>> body >> >>>>>>> of >> >>>>>>> the request legal: \" >> >>>>>>> In both cases the escape character will be part of the connector >> >>> name >> >>>>>> and >> >>>>>>> need to be specified in the url to retrieve the connector as well, >> >>>> even >> >>>>>>> though we could URL encode it in a legal way without escaping >> >> here. >> >>> So >> >>>>>>> they >> >>>>>>> work, not sure if I'd recommend using those characters, but no >> >> real >> >>>>>>> reason >> >>>>>>> to prohibit people from using them that I can see either. >> >>>>>> >> >>>>>> Good research, Sönke. >> >>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> What I'd do going forward is: >> >>>>>>> - withdraw the KIP, as I don't see a real need for one, since this >> >>> is >> >>>>>> not >> >>>>>>> changing anything, just fixing things. >> >>>>>>> - add a section to the documentation around legal characters, >> >>> specify >> >>>>>> the >> >>>>>>> ones I tested explicitly (url encoded %20 - %7F) and mention that >> >>> most >> >>>>>>> other characters should work as well but no guarantees are given >> >>>>>>> - update the pull request for KAFKA-4930 to allow all characters >> >> but >> >>>>>>> still >> >>>>>>> prohibit creating a connector with an empty name. I'd propose to >> >>> keep >> >>>>>> the >> >>>>>>> validator though as it'll give us a central location to do any >> >>>> checking >> >>>>>>> that might turn out to be necessary later on. >> >>>>>> >> >>>>>> Are empty names currently allowed? That's unfortunate. >> >>>>>> >> >>>>>>> - add some integration tests to check connectors with special >> >>>> characters >> >>>>>>> in >> >>>>>>> their names work >> >>>>>>> - fix the url encoding line in ConnectorsResource >> >>>>>>> >> >>>>>>> Does that sound fair to everybody? >> >>>>>> >> >>>>>> It sounds good to me, but I will let someone more knowledgeable >> >> about >> >>>>>> connect chime in. >> >>>>>> >> >>>>>> best, >> >>>>>> Colin >> >>>>>> >> >>>>>>> >> >>>>>>> Kind regards, >> >>>>>>> Sönke >> >>>>>>> >> >>>>>>> [1] >> >>>>>>> https://github.com/apache/kafka/blob/trunk/connect/runtime/ >> >>>>>> src/main/java/org/apache/kafka/connect/runtime/rest/ >> >>>>>> resources/ConnectorsResource.java#L102 >> >>>>>>> >> >>>>>>> On Tue, Oct 24, 2017 at 8:40 PM, Colin McCabe <cmcc...@apache.org >> >>> >> >>>>>> wrote: >> >>>>>>> >> >>>>>>>>> On Tue, Oct 24, 2017, at 11:28, Sönke Liebau wrote: >> >>>>>>>>> Hi, >> >>>>>>>>> >> >>>>>>>>> after reading your messages I'll grant that I might have >> >> picked >> >>> a >> >>>>>>>>> somewhat >> >>>>>>>>> draconic option to solve these issues. >> >>>>>>>>> >> >>>>>>>>> In general I believe that properly encoding the URLs after >> >>> having >> >>>>>> created >> >>>>>>>>> the connectors should solve a lot of the issues already. For >> >>> some >> >>>>>>>>> characters the rest api returns an error on creating the >> >>> connector >> >>>>>> as >> >>>>>>>>> well, >> >>>>>>>>> so for that URL encoding won't help. However the connectors do >> >>> get >> >>>>>>>>> created >> >>>>>>>>> even though an error is returned, I've never investigated if >> >>> they >> >>>>>> are in >> >>>>>>>>> a >> >>>>>>>>> consistent state tbh - I'll give this another look. >> >>>>>>>>> >> >>>>>>>>> @colin: Entity encoding would allow us to encode a lot of >> >>>>>> characters, >> >>>>>>>>> however I am unsure whether we should prefer it over url >> >>> encoding >> >>>>>> in this >> >>>>>>>>> case, as mostly the end user would have to encode the >> >> characters >> >>>>>> himself. >> >>>>>>>>> And due to entity encoding ending every character with a ; >> >> which >> >>>>>> causes >> >>>>>>>>> the >> >>>>>>>>> embedded jetty server to cut the connector name at that >> >>> character >> >>>>>> we'd >> >>>>>>>>> probably need to encode that character in URL encoding again >> >> for >> >>>>>> that to >> >>>>>>>>> work out - which might get a bit too complex tbh. >> >>>>>>>> >> >>>>>>>> Sorry, I meant to write percent-encoding, not entity refs. >> >>>>>>>> https://en.wikipedia.org/wiki/Percent-encoding >> >>>>>>>> >> >>>>>>>> best, >> >>>>>>>> Colin >> >>>>>>>> >> >>>>>>>> >> >>>>>>>>> I will further investigate which characters the url decoding >> >>> that >> >>>>>> jetty >> >>>>>>>>> brings to the table will let us use and if all of these are >> >>>>>> correctly >> >>>>>>>>> handled during connector creation and report back with a new >> >>> list >> >>>> of >> >>>>>>>>> characters that I think we can support fairly easily. >> >>>>>>>>> >> >>>>>>>>> Kind regards, >> >>>>>>>>> Sönke >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> On Tue, Oct 24, 2017 at 6:42 PM, Colin McCabe < >> >>> cmcc...@apache.org >> >>>>> >> >>>>>>>> wrote: >> >>>>>>>>> >> >>>>>>>>>> It should be possible to use entity references to encode >> >> these >> >>>>>>>>>> characters in URLs. See https://dev.w3.org/html5/html- >> >>>>>> author/charref >> >>>>>>>>>> Maybe I'm misunderstanding the problem, but can we simply >> >>> encode >> >>>>>> the >> >>>>>>>>>> URLs, rather than restricting the names? >> >>>>>>>>>> >> >>>>>>>>>> best, >> >>>>>>>>>> Colin >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>>> On Mon, Oct 23, 2017, at 14:12, Randall Hauch wrote: >> >>>>>>>>>>> Here's the link to KIP-212: >> >>>>>>>>>>> https://cwiki.apache.org/confluence/pages/viewpage. >> >>>>>>>>>> action?pageId=74684586 >> >>>>>>>>>>> >> >>>>>>>>>>> I do think it's worthwhile to define the rules for >> >> connector >> >>>>>> names. >> >>>>>>>>>>> However, I think it would be better to describe the >> >> current >> >>>>>>>> restrictions >> >>>>>>>>>>> for names outside of them appearing within URLs. For >> >>> example, >> >>>>>> if we >> >>>>>>>> can >> >>>>>>>>>>> keep connector names relatively free of constraints but >> >>>> instead >> >>>>>>>> define >> >>>>>>>>>>> how >> >>>>>>>>>>> names should be encoded when used within URLs (e.g., URL >> >>>>>> encoding), >> >>>>>>>> then >> >>>>>>>>>>> we >> >>>>>>>>>>> may not have (m)any backward compatibility issues other >> >> than >> >>>>>> fixing >> >>>>>>>> some >> >>>>>>>>>>> bugs related to proper encoding/decoding. >> >>>>>>>>>>> >> >>>>>>>>>>> Thoughts? >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> On Mon, Oct 23, 2017 at 3:44 PM, Sönke Liebau < >> >>>>>>>>>>> soenke.lie...@opencore.com.invalid> wrote: >> >>>>>>>>>>> >> >>>>>>>>>>>> All, >> >>>>>>>>>>>> >> >>>>>>>>>>>> I've created a KIP to discuss enforcing of rules on what >> >>>>>>>> characters are >> >>>>>>>>>>>> allowed in connector names. >> >>>>>>>>>>>> >> >>>>>>>>>>>> Since this may break api calls that are currently >> >> working >> >>> I >> >>>>>>>> figured a >> >>>>>>>>>> KIP >> >>>>>>>>>>>> is the better way to go than to just create a jira. >> >>>>>>>>>>>> >> >>>>>>>>>>>> I'd love to hear your input on this! >> >>>>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> -- >> >>>>>>>>> Sönke Liebau >> >>>>>>>>> Partner >> >>>>>>>>> Tel. +49 179 7940878 >> >>>>>>>>> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - >> >>>>>> Germany >> >>>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> -- >> >>>>>>> Sönke Liebau >> >>>>>>> Partner >> >>>>>>> Tel. +49 179 7940878 >> >>>>>>> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - >> >>> Germany >> >>>>>> >> >>>>> >> >>>>> >> >>>> >> >>> >> >>> >> >>> >> >>> -- >> >>> Sönke Liebau >> >>> Partner >> >>> Tel. +49 179 7940878 <+49%20179%207940878> >> >>> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany >> >>> >> >> >> > >> > >> > >> > -- >> > Sönke Liebau >> > Partner >> > Tel. +49 179 7940878 <+49%20179%207940878> >> > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany >> > > > > -- > Sönke Liebau > Partner > Tel. +49 179 7940878 <+49%20179%207940878> > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany > -- Sönke Liebau Partner Tel. +49 179 7940878 OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany