GitHub user naftaliharris commented on the pull request:

    https://github.com/apache/spark/pull/1057#issuecomment-50508305
  
    My point is actually that you get an OverflowError if margin is a large negative number (below roughly -709, where exp(-margin) exceeds the float range), for example:
    ```python
    >>> from math import exp
    >>> margin = -1000
    >>> prob = 1/(1 + exp(-margin))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OverflowError: math range error
    >>>
    ```
    This actually happened to me, which is how I saw this issue.
    
    There are (at least) two ways to solve this. One is to check margin instead of prob; this never calls exp, so it cannot overflow, and it also saves a computation. Even if you allow the user to specify the probability threshold p at some later date, you can just change the margin cutoff from 0 to log(p/(1-p)).
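
A minimal sketch of this first option, not the code in the pull request. The function name `predict_from_margin` and its `threshold` parameter are hypothetical, and it assumes the prediction is 1 exactly when prob would exceed the threshold:

```python
from math import log


def predict_from_margin(margin, threshold=0.5):
    """Classify by comparing margin to the logit of threshold; exp is never called."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    # prob > threshold is equivalent to margin > log(threshold / (1 - threshold)),
    # because the logistic function is strictly increasing.
    cutoff = log(threshold / (1.0 - threshold))  # 0.0 when threshold == 0.5
    return 1 if margin > cutoff else 0
```

With this, `predict_from_margin(-1000)` returns 0 without raising, since the margin is only compared to the cutoff.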
    
    The second solution is to replace math.exp with np.exp, which gives a 
warning rather than an exception:
    
    ```python
    >>> import numpy as np
    >>> margin = -1000
    >>> prob = 1/(1 + np.exp(-margin))
    __main__:1: RuntimeWarning: overflow encountered in exp
    >>> prob
    0.0
    >>>
    ```
    
    For robustness of the code, perhaps you should at least do that.
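
If the RuntimeWarning itself is unwanted, one possible variation is to silence it locally with NumPy's `np.errstate` context manager. This is only a sketch; the function name `prob_from_margin` is hypothetical and not taken from the pull request:

```python
import numpy as np


def prob_from_margin(margin):
    """Logistic probability computed with np.exp; overflow gives inf, not an exception."""
    # Suppress only the overflow warning, and only inside this block.
    with np.errstate(over="ignore"):
        # For very negative margins np.exp(-margin) is inf, so the result is 0.0.
        return 1.0 / (1.0 + np.exp(-margin))
```

Here `prob_from_margin(-1000)` evaluates to 0.0, matching the session above but without the warning.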
    
    Thanks!

