[ 
https://issues.apache.org/jira/browse/KYLIN-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786281#comment-17786281
 ] 

Hongrong Cao edited comment on KYLIN-5706 at 11/15/23 10:05 AM:
----------------------------------------------------------------

h2. *background*

In the current code, there are many scenarios where a cmd needs to be spliced, 
and then executed by ProcessBuilder. The parameters of the spliced cmd may come 
from the interface, and the lack of checking the legitimacy of the parameter 
makes it possible to be maliciously attacked.

When splicing spark commands, the checkCommandInjection method is used to avoid 
injection attacks, but it only avoids injection attacks caused by backquotes 
and $(), such as `rm -rf /` $(rm -rf /), but not other scenarios, such as cat 
nohup.out2 && echo success || echo failed echo failed
h2. *dev design*
h3. Option 1

The org.apache.commons.text.StringEscapeUtils class provides the escapeXSI 
method for escaping special characters on command line parameters

Unify in org.apache.kylin.common.util.CliCommandExecutor#execute method, before 
executing the command, cut the command by space, and then just escape each part 
in turn.

Option 1 is not feasible, in our code logic will also splice some complex 
commands, in the execution of the command is difficult to distinguish whether 
the command is the client injected.
h3. Option 2 (Adoption)

Check the parameters when splicing cmd commands, including the following four 
scenarios:

When typing the diagnostic package, the parameters of the diag.sh script will 
be spliced, such as project, jobId, path, etc., and each parameter will be 
checked in turn, and conforming to ^[a-zA-Z0-9_. /-]+$ is enough
When exporting influxDB data, it will splice the database address and database 
name as the parameter of influx command, the former meets 
[a-zA-Z0-9._-](:[0-9])? and ^[0-9a-zA-Z_-]+$ for the latter.
When fetching yarn's stats, the url of yarn is spliced as an argument to the 
curl command, conforming to ^(http(s)? ://)? [a-zA-Z0-9._-](:[0-9])? 
(/[a-zA-Z0-9._-]+)*/? $ That's it!
When executing the beeline command, the beeline-params in the configuration 
will be spliced into the command. The composition of the beeline-params is more 
complicated, forcing each parameter value to be converted to a string by 
wrapping it with ', such as abc → 'abc', ab'c → 'ab'\''c'
h3. Option 3

The current CliCommandExecutor's execute method accepts a command string and 
then executes it directly, which is not reasonable because it lacks 
verification of the command arguments.

You can do a refactoring, CliCommandExecutor's execute accepts a CMD object, 
the CMD object encapsulates a number of commands to be executed, the parameters 
of each command, the validation method of the parameters, the parameter 
escaping method, etc. The CMD object can be used as an interface class, the 
specific validation method, the escaping method is implemented by the caller of 
the code.


was (Author: JIRAUSER303052):
* *background*

In the current code, there are many scenarios where a cmd needs to be spliced, 
and then executed by ProcessBuilder. The parameters of the spliced cmd may come 
from the interface, and the lack of checking the legitimacy of the parameter 
makes it possible to be maliciously attacked.

When splicing spark commands, the checkCommandInjection method is used to avoid 
injection attacks, but it only avoids injection attacks caused by backquotes 
and $(), such as `rm -rf /` $(rm -rf /), but not other scenarios, such as cat 
nohup.out2 && echo success || echo failed echo failed

 
 * *dev design*
 * Option 1

The org.apache.commons.text.StringEscapeUtils class provides the escapeXSI 
method for escaping special characters on command line parameters

Unify in org.apache.kylin.common.util.CliCommandExecutor#execute method, before 
executing the command, cut the command by space, and then just escape each part 
in turn.

Option 1 is not feasible, in our code logic will also splice some complex 
commands, in the execution of the command is difficult to distinguish whether 
the command is the client injected.
 * Option 2 (Adoption)

Check the parameters when splicing cmd commands, including the following four 
scenarios:

When typing the diagnostic package, the parameters of the diag.sh script will 
be spliced, such as project, jobId, path, etc., and each parameter will be 
checked in turn, and conforming to ^[a-zA-Z0-9_. /-]+$ is enough
When exporting influxDB data, it will splice the database address and database 
name as the parameter of influx command, the former meets 
[a-zA-Z0-9._-](:[0-9])? and ^[0-9a-zA-Z_-]+$ for the latter.
When fetching yarn's stats, the url of yarn is spliced as an argument to the 
curl command, conforming to ^(http(s)? ://)? [a-zA-Z0-9._-](:[0-9])? 
(/[a-zA-Z0-9._-]+)*/? $ That's it!
When executing the beeline command, the beeline-params in the configuration 
will be spliced into the command. The composition of the beeline-params is more 
complicated, forcing each parameter value to be converted to a string by 
wrapping it with ', such as abc → 'abc', ab'c → 'ab'\''c'
 * Option 3

The current CliCommandExecutor's execute method accepts a command string and 
then executes it directly, which is not reasonable because it lacks 
verification of the command arguments.

You can do a refactoring, CliCommandExecutor's execute accepts a CMD object, 
the CMD object encapsulates a number of commands to be executed, the parameters 
of each command, the validation method of the parameters, the parameter 
escaping method, etc. The CMD object can be used as an interface class, the 
specific validation method, the escaping method is implemented by the caller of 
the code.

> Fix the command line injection vulnerability when generating diagnostic 
> packages through scripts
> ------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-5706
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5706
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: 5.0-beta
>            Reporter: Hongrong Cao
>            Assignee: Zhiting Guo
>            Priority: Major
>             Fix For: 5.0.0
>
>
> The diagnostic package will call the command line to execute the shell script 
> through java, and kylin does not escape the input from the user, and directly 
> splices it into the command line, resulting in command line injection.
> Therefore, we need to escape the user input that will be spliced into the cmd.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to