[
https://issues.apache.org/jira/browse/MADLIB-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286749#comment-16286749
]
ASF GitHub Bot commented on MADLIB-1186:
----------------------------------------
GitHub user njayaram2 opened a pull request:
https://github.com/apache/madlib/pull/214
Correlation: Fix bug with international characters
JIRA:MADLIB-1186
If the column name of an independent variable passed to
madlib.correlation(...) is quoted, the query fails because the alias
for the column's average is built by plain string concatenation, which
leaves the prefix outside the quotes (avg_"Humidity").
This commit uses add_postfix() instead to build the new name from a
string that contains special characters.
Closes #213
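
For illustration, here is a minimal Python sketch of the difference between
the two approaches. It is not the patch itself: naive_avg_alias and
postfix_alias are hypothetical names, postfix_alias only approximates what a
helper like add_postfix() is described as doing, and the "_avg" suffix is an
assumption rather than the alias the patch actually generates.

```python
def naive_avg_alias(col):
    # Plain concatenation, as in the failing query: for a quoted column
    # the prefix ends up outside the double quotes.
    return "avg_" + col


def postfix_alias(col, postfix):
    # Hypothetical stand-in for a quote-aware helper: when the identifier
    # is double-quoted, keep the postfix inside the closing quote.
    col = col.strip()
    if len(col) >= 2 and col.startswith('"') and col.endswith('"'):
        return col[:-1] + postfix + '"'
    return col + postfix


# Columns taken from the failing call in the issue description.
for col in ["id", "temperature", '"Humidity"']:
    print(naive_avg_alias(col), "->", postfix_alias(col, "_avg"))
```

The naive form reproduces the avg_"Humidity" token reported in the syntax
error, while the quote-aware form yields a single quoted identifier.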
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/njayaram2/madlib bugfix/correlation_international
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/madlib/pull/214.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #214
----
commit d88072988f637789ee6492c8b12e0afaa5e06abf
Author: Nandish Jayaram <[email protected]>
Date: 2017-12-11T22:09:46Z
Correlation: Fix bug with international characters
JIRA:MADLIB-1186
If the column name of an independent variable passed to
madlib.correlation(...) is quoted, the query fails because the alias
for the column's average is built by plain string concatenation, which
leaves the prefix outside the quotes (avg_"Humidity").
This commit uses add_postfix() instead to build the new name from a
string that contains special characters.
Closes #213
----
> Correlation query fails with international chars in column name
> ---------------------------------------------------------------
>
> Key: MADLIB-1186
> URL: https://issues.apache.org/jira/browse/MADLIB-1186
> Project: Apache MADlib
> Issue Type: Bug
> Components: Module: Inferential Statistics
> Reporter: Nandish Jayaram
> Assignee: ssoni
> Fix For: v1.13
>
>
> We tried running the MADlib correlation module on a table that had
> international characters in the schema name and in the column name of
> one of the independent variables. It resulted in the following error:
> {code}
> select
> madlib.correlation(
> '"nåmespace"."dt_gOlf"',
> '"__madlib_temp_26728701_1513000192_9340672___πøA"',
> 'id, temperature, "Humidity"');
> psql:/tmp/build/60928da1/madlib_testsuite/tests/Correlation/sql_CorrelationInternationalCharOutputTestCase/test_correlation_international_char.sql:12:
> ERROR: spiexceptions.SyntaxError: syntax error at or near ""Humidity""
> LINE 10: ...rature, avg_temperature),coalesce("Humidity", avg_"Humidity"...
> ^
> QUERY:
> CREATE TEMP TABLE __madlib_temp_43456465_1513000195_75531635__ AS
> SELECT
> count(*) AS tot_cnt,
> mean,
> madlib.correlation_agg(x, mean) as cor_mat
> FROM
> (
> SELECT ARRAY[ coalesce(id, avg_id),coalesce(temperature,
> avg_temperature),coalesce("Humidity", avg_"Humidity") ] AS x,
> ARRAY [ avg_id,avg_temperature,avg_"Humidity" ] AS
> mean
> FROM "nåmespace"."dt_gOlf",
> (
> SELECT avg(id) AS avg_id,avg(temperature) AS
> avg_temperature,avg("Humidity") AS avg_"Humidity"
> FROM "nåmespace"."dt_gOlf"
> )sub1
> ) sub2
> GROUP BY mean
> CONTEXT: Traceback (most recent call last):
> PL/Python function "correlation", line 23, in <module>
> return correlation.correlation(**globals())
> PL/Python function "correlation", line 71, in correlation
> PL/Python function "correlation", line 205, in _populate_output_table
> PL/Python function "correlation"
> select * from "__madlib_temp_26728701_1513000192_9340672___πøA" order
> by column_position;
> {code}