[ 
https://issues.apache.org/jira/browse/MADLIB-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286749#comment-16286749
 ] 

ASF GitHub Bot commented on MADLIB-1186:
----------------------------------------

GitHub user njayaram2 opened a pull request:

    https://github.com/apache/madlib/pull/214

    Correlation: Fix bug with international characters

    JIRA:MADLIB-1186
    
    If the name of an independent-variable column passed to
    madlib.correlation(...) is double-quoted, the query fails because the
    alias for the column's average is built with plain string
    concatenation, which leaves the prefix outside the quotes
    (avg_"Humidity"). This commit uses add_postfix() instead to build the
    new name from a string that contains special characters.
    
    Closes #213

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/njayaram2/madlib bugfix/correlation_international

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/madlib/pull/214.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #214
    
----
commit d88072988f637789ee6492c8b12e0afaa5e06abf
Author: Nandish Jayaram <[email protected]>
Date:   2017-12-11T22:09:46Z

    Correlation: Fix bug with international characters
    
    JIRA:MADLIB-1186
    
    If the name of an independent-variable column passed to
    madlib.correlation(...) is double-quoted, the query fails because the
    alias for the column's average is built with plain string
    concatenation, which leaves the prefix outside the quotes
    (avg_"Humidity"). This commit uses add_postfix() instead to build the
    new name from a string that contains special characters.
    
    Closes #213

----


> Correlation query fails with international chars in column name
> ---------------------------------------------------------------
>
>                 Key: MADLIB-1186
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1186
>             Project: Apache MADlib
>          Issue Type: Bug
>          Components: Module: Inferential Statistics
>            Reporter: Nandish Jayaram
>            Assignee: ssoni
>             Fix For: v1.13
>
>
> We ran the MADlib correlation module on a table that had international
> characters in its schema name and in the column name of one of the
> independent variables. The query failed with the following error:
> {code}
> select
>       madlib.correlation(
>           '"nåmespace"."dt_gOlf"',
>           '"__madlib_temp_26728701_1513000192_9340672___πøA"',
>           'id, temperature, "Humidity"');
> psql:/tmp/build/60928da1/madlib_testsuite/tests/Correlation/sql_CorrelationInternationalCharOutputTestCase/test_correlation_international_char.sql:12: ERROR:  spiexceptions.SyntaxError: syntax error at or near ""Humidity""
> LINE 10: ...rature, avg_temperature),coalesce("Humidity", avg_"Humidity"...
>                                                               ^
> QUERY:
>             CREATE TEMP TABLE __madlib_temp_43456465_1513000195_75531635__ AS
>             SELECT
>                 count(*) AS tot_cnt,
>                 mean,
>                 madlib.correlation_agg(x, mean) as cor_mat
>             FROM
>             (
>                 SELECT ARRAY[ coalesce(id, avg_id),coalesce(temperature, avg_temperature),coalesce("Humidity", avg_"Humidity") ] AS x,
>                         ARRAY [ avg_id,avg_temperature,avg_"Humidity" ] AS mean
>                 FROM "nåmespace"."dt_gOlf",
>                 (
>                     SELECT avg(id) AS avg_id,avg(temperature) AS avg_temperature,avg("Humidity") AS avg_"Humidity"
>                     FROM "nåmespace"."dt_gOlf"
>                 )sub1
>             ) sub2
>             GROUP BY mean
> CONTEXT:  Traceback (most recent call last):
>   PL/Python function "correlation", line 23, in <module>
>     return correlation.correlation(**globals())
>   PL/Python function "correlation", line 71, in correlation
>   PL/Python function "correlation", line 205, in _populate_output_table
> PL/Python function "correlation"
>       select * from "__madlib_temp_26728701_1513000192_9340672___πøA" order by column_position;
> {code}
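
The failure in the trace above can be reproduced outside the database. The sketch below is only an illustration: naive_avg_alias() mirrors the plain concatenation visible in the failing query, while quoted_postfix() is a hypothetical stand-in for the idea behind add_postfix() (placing the extra text inside the closing quote). It is not MADlib's actual implementation, and the "_avg" suffix is an assumption chosen for the example.

```python
def naive_avg_alias(col):
    # Plain concatenation, as seen in the failing query: for a quoted
    # identifier the prefix lands outside the quotes.
    return "avg_" + col


def quoted_postfix(col, postfix):
    # Hypothetical helper (NOT MADlib's add_postfix implementation):
    # append the postfix inside the closing double quote when the
    # identifier is quoted, otherwise append it directly.
    if len(col) >= 2 and col.startswith('"') and col.endswith('"'):
        return col[:-1] + postfix + '"'
    return col + postfix


cols = ['id', 'temperature', '"Humidity"']

# The third alias is avg_"Humidity", which is not a valid SQL identifier.
naive = [naive_avg_alias(c) for c in cols]

# With the suffix inside the quotes, every alias is a single identifier.
fixed = [quoted_postfix(c, "_avg") for c in cols]

print(naive)
print(fixed)
```

With aliases built the second way, a fragment such as avg("Humidity") AS "Humidity_avg" parses as an ordinary quoted identifier, which is the behaviour the pull request aims for.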



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
