GitHub user alexbaretta commented on the pull request:

    https://github.com/apache/spark/pull/4039#issuecomment-70019159
  
    @squito This is not new functionality that would warrant a unit test; it is 
a hotfix for a bug. I am completely unfamiliar with this code, but I do 
understand that although Int is a subtype of Any, MutableInt is not a subtype 
of MutableAny. Hence, it is possible to cast a val declared as Any to Int (a 
type-unsafe operation that can fail hard, but succeeds if the payload of the 
Any val is indeed an Int), whereas a cast from MutableAny to MutableInt is 
simply impossible and will always fail, even if the payload of the MutableAny 
is indeed an Int. If you look at the JIRA, you will see this as the cause of 
failure:
    
    Caused by: java.lang.ClassCastException: 
org.apache.spark.sql.catalyst.expressions.MutableAny cannot be cast to 
org.apache.spark.sql.catalyst.expressions.MutableInt
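
    The distinction between the two casts can be reproduced in isolation. The following is a minimal sketch, not the Spark source: the three classes are simplified stand-ins that only assume MutableInt and MutableAny are unrelated siblings under a common parent, and CastDemo and its methods are hypothetical names made up for this illustration.

```scala
import scala.util.Try

// Simplified stand-ins for illustration only; these are NOT the Spark classes.
// Assumption: MutableInt and MutableAny are siblings, neither extends the other.
abstract class MutableValue
final class MutableInt extends MutableValue { var value: Int = 0 }
final class MutableAny extends MutableValue { var value: Any = null }

object CastDemo {
  // Any -> Int: unsafe, but succeeds whenever the payload really is an Int.
  def anyToInt(a: Any): Try[Int] = Try(a.asInstanceOf[Int])

  // MutableValue -> MutableInt: depends only on the runtime class of the
  // container, never on what the container happens to hold.
  def toMutableInt(v: MutableValue): Try[MutableInt] =
    Try(v.asInstanceOf[MutableInt])

  def main(args: Array[String]): Unit = {
    val holder = new MutableAny
    holder.value = 42

    println(anyToInt(holder.value)) // the payload is an Int, so the cast succeeds
    println(toMutableInt(holder))   // the container is a MutableAny, so the cast fails
  }
}
```

    In this sketch the only way to get an Int out of a MutableAny is to read its value field and cast the payload; casting the container itself throws the same kind of ClassCastException as in the trace above.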
    
    Now, a question worth asking the author of this class is why SparkSQL 
relies on this type-casting mechanism to parse Parquet files. I am inclined to 
believe that there is a deeper issue here. That being said, with my patch my 
SQL queries complete successfully against my Parquet dataset instead of 
failing.


