Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18926
  
    > Basically, the change just wants to ensure the type of length is int.
    
    Yes, but to be more correct, I think this makes sure if both are same 
types, `Column` or `int`. I think this throws an exception for a better error 
message, (rather than, for example, "Unexpected types: startPos :<type 'int'> 
length: <class 'pyspark.sql.column.Column'>"). I know this sounds rather 
excessive checking but this is still a valid checking.
    
    Possible way I think is with keeping this checking (which I believe is not 
shorter):
    
    ```python
            if isinstance(startPos, int) and isinstance(length, int):
                jc = self._jc.substr(startPos, length)
            elif isinstance(startPos, Column) and isinstance(length, Column)::
                jc = self._jc.substr(startPos._jc, length._jc)
            else:
                if type(startPos) != type(length):
                    raise TypeError(...)
                else:
                    raise TypeError("Unexpected type: %s" % type(startPos))
    ```
    
    or, with removing this excessive checking:
    
    ```python
            if isinstance(startPos, int) and isinstance(length, int):
                jc = self._jc.substr(startPos, length)
            elif isinstance(startPos, Column) and isinstance(length, Column)::
                jc = self._jc.substr(startPos._jc, length._jc)
            else:
                raise TypeError(...)
    ```
    
    I agree this is excessive (the former) but I wonder if we should remove 
already existing one. Please correct me if I am mistaken here.
    
    `long` does not work already but with unexpected exception message, which 
this PR fixes. 
    
    The current state fixes the JIRA specified here.
    
    I will stop staying against if you guys here feel strongly with fixing 
together but if you are not, could we go without that issue?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to