[
https://issues.apache.org/jira/browse/MADLIB-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Frank McQuillan updated MADLIB-1312:
------------------------------------
Fix Version/s: v2.0 (was: v1.16)
> Fix potential SQL injection issues
> ----------------------------------
>
> Key: MADLIB-1312
> URL: https://issues.apache.org/jira/browse/MADLIB-1312
> Project: Apache MADlib
> Issue Type: Improvement
> Components: Deep Learning
> Reporter: Nandish Jayaram
> Priority: Major
> Fix For: v2.0
>
>
> Based on code review comments on PR
> [https://github.com/apache/madlib/pull/355]:
> In the madlib_keras fit function, we use the following prepared statement to
> call the fit transition function:
> {code:java}
> run_training_iteration = plpy.prepare("""
>     SELECT {0}.fit_step(
>         {1}::REAL[],
>         {2}::SMALLINT[],
>         gp_segment_id,
>         {3}::INTEGER,
>         ARRAY{4},
>         ARRAY{5},
>         $MAD${6}$MAD$::TEXT,
>         {7}::TEXT,
>         {8}::TEXT,
>         {9},
>         $1
>     ) AS iteration_result
>     FROM {10}
>     """.format(schema_madlib, independent_varname, dependent_varname,
>                num_classes, seg_nums, total_buffers_per_seg, model_arch,
>                compile_params_to_pass, fit_params_to_pass,
>                use_gpu, source_table), ["bytea"])
> {code}
> The dollar-quoted argument
> {code:java}
> $MAD${6}$MAD$::TEXT
> {code}
> could lead to SQL injection issues.
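To illustrate the concern, here is a minimal sketch in plain Python (no database needed). The template is a hypothetical, cut-down stand-in for the query above that keeps only the dollar-quoted argument; `render`, `literal_seen_by_parser`, and the sample values are made up for illustration. It shows that a value containing the tag `$MAD$` ends the string literal early, so the remainder of the value would be parsed as SQL rather than as data.

```python
# Hypothetical, simplified stand-in for the fit_step query:
# only the dollar-quoted model_arch argument is kept.
TEMPLATE = "SELECT fit_step($MAD${0}$MAD$::TEXT) AS iteration_result FROM {1}"


def render(model_arch, source_table):
    """Build the SQL text by plain string formatting, as the original code does."""
    return TEMPLATE.format(model_arch, source_table)


def literal_seen_by_parser(sql, tag="$MAD$"):
    """Return the text between the first pair of dollar-quote tags.

    A dollar-quoted string ends at the first repetition of its opening tag,
    so this is the part of the value that is still treated as a string.
    """
    start = sql.index(tag) + len(tag)
    end = sql.index(tag, start)
    return sql[start:end]


benign = '{"class_name": "Sequential"}'
# Made-up value that happens to contain the quoting tag.
hostile = '{"a": 1}$MAD$::TEXT, injected_sql_here'

benign_sql = render(benign, "source_table")
hostile_sql = render(hostile, "source_table")

print(literal_seen_by_parser(benign_sql))   # the complete JSON string
print(literal_seen_by_parser(hostile_sql))  # only the text before the embedded tag
print(hostile_sql)                          # the rest now sits outside the literal
```

Choosing a different or random tag only makes a collision less likely; the value is still spliced into the SQL text.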
> Quoting related code review comments on PR #355 by [~dvaldano]:
> {quote}This allows the user to inject arbitrary SQL commands, by including
> the string $madlib$ in the compile or fit params. This isn't as bad as
> allowing them to execute arbitrary python code, but it's something that could
> also happen unintentionally and result in strange errors. We could reduce the
> risk by choosing a random string, but see below for a better way of handling
> this.
> Same issue as above, for model_arch. SQL injection is possible if the user
> passes a {{model_arch}} json string with {{$MAD$}} in it.
> A better way to pass all of these parameters is in the same way we pass the
> weights--label them as $1, $2, $3 in the SELECT statement string, and pass
> their types as additional args to prepare. The values will then get passed to
> execute as additional params. This is the recommended safe way to do SQL
> queries from Python, and will do proper quoting and type casting for all of
> them so we don't have to.
> {quote}
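The suggestion in the quoted review could look roughly like the sketch below. It is not the actual MADlib fix: the helper names (`prepare_fit_step`, `run_fit_step`), the choice to parameterize only the three text arguments plus the weights, and all sample values are assumptions made for illustration. `plpy.prepare(query, argtypes)` and `plpy.execute(plan, args)` are the standard PL/Python calls; `RecordingPlpy` is a made-up stand-in so the sketch runs outside PostgreSQL.

```python
# Values travel as $1..$4; only identifiers and internally generated
# constants are still formatted into the SQL text (assumption of this sketch).
FIT_STEP_SQL = """
    SELECT {schema_madlib}.fit_step(
        {independent_varname}::REAL[],
        {dependent_varname}::SMALLINT[],
        gp_segment_id,
        {num_classes}::INTEGER,
        ARRAY{seg_nums},
        ARRAY{total_buffers_per_seg},
        $1,
        $2,
        $3,
        {use_gpu},
        $4
    ) AS iteration_result
    FROM {source_table}
"""
# Types for model_arch, compile params, fit params, and serialized weights.
ARG_TYPES = ["text", "text", "text", "bytea"]


def prepare_fit_step(plpy, **names_and_constants):
    """Prepare the statement; no user-supplied text value enters the SQL string."""
    return plpy.prepare(FIT_STEP_SQL.format(**names_and_constants), ARG_TYPES)


def run_fit_step(plpy, plan, model_arch, compile_params, fit_params, weights):
    """Pass the values separately so the database handles quoting and casting."""
    return plpy.execute(plan, [model_arch, compile_params, fit_params, weights])


class RecordingPlpy:
    """Hypothetical stand-in for the plpy module; it only records what it is given."""

    def prepare(self, sql, arg_types):
        self.sql, self.arg_types = sql, arg_types
        return "plan"

    def execute(self, plan, args):
        self.args = args
        return []


fake_plpy = RecordingPlpy()
plan = prepare_fit_step(
    fake_plpy, schema_madlib="madlib", independent_varname="x",
    dependent_varname="y", num_classes=3, seg_nums=[0, 1],
    total_buffers_per_seg=[2, 2], use_gpu=False, source_table="train_packed")

tricky_arch = '{"a": 1}$MAD$ anything'
run_fit_step(fake_plpy, plan, tricky_arch, "loss='mse'", "epochs=1", b"\x00\x01")
print(tricky_arch in fake_plpy.sql)  # False: the value never enters the SQL text
```

Placeholders can only stand for values, so the schema, column, and table names are still formatted in here and would need their own validation or identifier quoting.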