GitHub user iyerr3 opened a pull request:

    https://github.com/apache/madlib/pull/268

    DT: Don't use NULL value to get dep_var type

    JIRA: MADLIB-1233
    
    Function `_is_dep_categorical` is used to obtain the type of the
    dependent variable expression. This function fetches an arbitrary value
    using `LIMIT 1` and checks the type of that value in Python. Since the
    query does not filter out NULL values, `LIMIT 1` can return a NULL,
    which appears as `None` in Python and leads to incorrect results.
    
    This commit updates the type extraction to check the type in the
    database instead of in Python and to filter out NULL values.
    Additionally, it checks that at least one non-NULL value is obtained
    and otherwise throws an appropriate error.
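
The approach described above could look roughly like the following sketch. It is not the code from this pull request: the `run_query` callable, the function signature, and the set of types treated as continuous are all assumptions made for illustration. The only database feature relied on is PostgreSQL's `pg_typeof`.

```python
# Hypothetical set of type names treated as continuous (regression).
# The real rule used by the module may differ.
CONTINUOUS_TYPES = frozenset(["real", "double precision", "numeric"])


def is_dep_categorical(run_query, table_name, dep_expr):
    """Return True if dep_expr has a categorical type in table_name.

    run_query is an assumed helper that takes a SQL string and returns
    a list of dict-like rows (similar in spirit to plpy.execute).
    """
    # Ask the database for the type and skip NULLs, so the result never
    # depends on how a NULL is represented in Python.
    sql = (
        "SELECT pg_typeof({dep})::text AS dep_type "
        "FROM {tbl} WHERE ({dep}) IS NOT NULL LIMIT 1"
    ).format(dep=dep_expr, tbl=table_name)
    rows = run_query(sql)

    # No rows means every value of the dependent variable is NULL
    # (or the table is empty), so there is nothing to train on.
    if not rows:
        raise ValueError(
            "Dependent variable has no non-NULL values: " + dep_expr)

    return rows[0]["dep_type"] not in CONTINUOUS_TYPES
```

Because the type comes from the database, the result is the same regardless of which row `LIMIT 1` happens to pick.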

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/madlib/madlib bugfix/dt_dep_var_type_null

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/madlib/pull/268.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #268
    
----
commit 6570a27278aca301c0c8869899a8fad26c69bb7d
Author: Rahul Iyer <riyer@...>
Date:   2018-05-01T21:24:34Z

    DT: Don't use NULL value to get dep_var type
    

----

