[ 
https://issues.apache.org/jira/browse/KAFKA-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448374#comment-13448374
 ] 

Swapnil Ghike edited comment on KAFKA-495 at 9/5/12 12:21 PM:
--------------------------------------------------------------

So I am planning to insert the following regex check.

Unix constraints - 
1. Ban / and \0 (neither is allowed in Unix file names: / is the path name 
component separator and \0 terminates file names).

ZooKeeper constraints - 
1. The null character (\u0000) cannot be part of a path name. (This causes 
problems with the C binding.)
2. The following characters can't be used because they don't display well, or 
render in confusing ways: \u0001 - \u0019 and \u007F - \u009F.
3. The following characters are not allowed: \ud800 -uF8FFF, \uFFF0-uFFFF, 
\uXFFFE - \uXFFFF (where X is a digit 1 - E), \uF0000 - \uFFFFF.
4. "." and ".." are not allowed as file names. ZooKeeper supports only absolute 
file paths.
5. The token "zookeeper" is reserved.

In addition, I suggest the following two - 
1. Ban leading and trailing whitespace (because it confuses GUI users). This 
also takes care of forbidding file names composed only of whitespace.
2. Ban names with a leading "." or ".." because Unix hides those files.
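
As a rough illustration, the constraints listed above could be sketched as 
follows. This is not the code from the attached patch: the class and method 
names are hypothetical, and an explicit code-point loop is used in place of a 
single regex so that each rule stays visible. It also assumes that the range 
written as "\ud800 -uF8FFF" means U+D800 to U+F8FF, that the reserved token 
"zookeeper" is matched exactly and case-sensitively, and that "whitespace" 
means whatever Character.isWhitespace accepts.

```java
// Hypothetical helper; names are illustrative and not taken from the patch.
final class TopicNameCheck {
    private TopicNameCheck() {}

    // Returns true if the topic name passes every check listed in the comment.
    static boolean isValid(String topic) {
        if (topic == null || topic.isEmpty()) {
            return false;
        }
        // Reserved ZooKeeper token (assumed to be an exact match).
        if (topic.equals("zookeeper")) {
            return false;
        }
        // A leading dot covers ".", ".." and names that Unix would hide.
        if (topic.startsWith(".")) {
            return false;
        }
        // Leading or trailing whitespace; this also rejects whitespace-only names.
        if (Character.isWhitespace(topic.charAt(0))
                || Character.isWhitespace(topic.charAt(topic.length() - 1))) {
            return false;
        }
        // Walk the name by code point so supplementary characters are seen whole.
        for (int i = 0; i < topic.length(); ) {
            int cp = topic.codePointAt(i);
            if (isIllegal(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    private static boolean isIllegal(int cp) {
        if (cp == '/' || cp == 0x0000) return true;     // Unix: separator and NUL
        if (cp >= 0x0001 && cp <= 0x0019) return true;  // ZooKeeper: display problems
        if (cp >= 0x007F && cp <= 0x009F) return true;  // ZooKeeper: display problems
        if (cp >= 0xD800 && cp <= 0xF8FF) return true;  // assumed reading of the garbled range
        if (cp >= 0xFFF0 && cp <= 0xFFFF) return true;
        // U+XFFFE and U+XFFFF for planes 1 through E.
        if (cp >= 0x1FFFE && cp <= 0xEFFFF && (cp & 0xFFFE) == 0xFFFE) return true;
        return cp >= 0xF0000 && cp <= 0xFFFFF;
    }
}
```

With this sketch, TopicNameCheck.isValid("foo/foo") is false, so a broker 
could reject such a topic before creating a log directory or a ZooKeeper path 
for it.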

There are other cases where unfortunate errors may happen in ways not related 
to Kafka or ZooKeeper, and programmers would need to take care of them.
                
      was (Author: swapnilghike):
    So I am planning to insert the following regex check.

Unix constraints - 
1. Ban / (both not allowed in unix, / is used as a path name component 
separator and \0 terminates file names in unix.)

ZooKeeper constraints - 
1. The null character (\u0000) cannot be part of a path name. (This causes 
problems with the C binding.)
2. The following characters can't be used because they don't display well, or 
render in confusing ways: \u0001 - \u0019 and \u007F - \u009F.
3. The following characters are not allowed: \ud800 -uF8FFF, \uFFF0-uFFFF, 
\uXFFFE - \uXFFFF (where X is a digit 1 - E), \uF0000 - \uFFFFF.
4. "." and ".." are not allowed as file names. ZooKeeper supports only absolute 
file paths.
5. The token "zookeeper" is reserved.

In addition, I suggest the following two - 
1. Ban leading and trailing whitespace (because it confuses GUI users). This 
also takes care of forbidding file names composed only of whitespace.
2. Ban names with a leading "." or ".." because Unix hides those files.

There are other cases where unfortunate errors may happen in ways not related 
to Kafka or ZooKeeper, and programmers would need to take care of them.
                  
> Handle topic names with "/" on Kafka server
> -------------------------------------------
>
>                 Key: KAFKA-495
>                 URL: https://issues.apache.org/jira/browse/KAFKA-495
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.7, 0.8
>            Reporter: Neha Narkhede
>            Assignee: Swapnil Ghike
>              Labels: bugs
>             Fix For: 0.8, 0.7.1
>
>         Attachments: kafka-495-v1.patch
>
>
> If a producer publishes data to topic "foo/foo", the Kafka server ends up 
> creating an invalid directory structure on the server. This corrupts the 
> zookeeper data structure for the topic - /brokers/topics/foo/foo. This leads 
> to rebalancing failures on the consumer as well as errors on the zookeeper 
> based producer. 
> We need to harden the invalid topic handling on the Kafka server side to 
> avoid this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
