[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4039 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-82687233 I believe this problem has been fixed and we can close this issue.
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-71887315 ok to test
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-71887893

[Test build #26237 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26237/consoleFull) for PR 4039 at commit [`3e42dc3`](https://github.com/apache/spark/commit/3e42dc3547c361e6d7073ed2a7ccb8facef75513).

* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-71900510

[Test build #26237 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26237/consoleFull) for PR 4039 at commit [`3e42dc3`](https://github.com/apache/spark/commit/3e42dc3547c361e6d7073ed2a7ccb8facef75513).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-71900517 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26237/ Test PASSed.
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-71537019

The sole purpose of adding `SpecificMutableRow` is to avoid boxing cost, so using `boxed` here doesn't sound good. For SPARK-5236, a schema mismatch can often produce similar exceptions. @alexbaretta Would you please add a snippet that helps reproduce this issue to the JIRA ticket description? I guess the root cause hides elsewhere.
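The boxing concern raised above can be illustrated with a minimal sketch. `IntSlot`, `AnySlot`, and `BoxingDemo` are hypothetical names invented for this note; they are not Spark's classes and only mimic the idea of a type-specific holder versus a generic one.

```scala
// Hypothetical simplified holders for illustration; NOT Spark's actual classes.
final class IntSlot {
  var value: Int = 0 // stored as a primitive int, so writes do not box
}

final class AnySlot {
  var value: Any = null // an Int written here is boxed to java.lang.Integer
}

object BoxingDemo {
  // Writes and reads go through a primitive field.
  def primitiveSum(n: Int): Long = {
    val slot = new IntSlot
    var sum = 0L
    var i = 0
    while (i < n) {
      slot.value = i
      sum += slot.value
      i += 1
    }
    sum
  }

  // Every write boxes the Int and every read unboxes it again.
  def boxedSum(n: Int): Long = {
    val slot = new AnySlot
    var sum = 0L
    var i = 0
    while (i < n) {
      slot.value = i
      sum += slot.value.asInstanceOf[Int]
      i += 1
    }
    sum
  }
}
```

Both methods compute the same result; the difference is only in whether each write goes through a boxed object, which is the overhead a type-specific row is meant to avoid.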
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user alexbaretta commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-70019159

@squito This is not new functionality for which it would make sense to write a unit test. This is a hotfix for a bug. I am completely unfamiliar with this code, but I understand pretty well that although Int is a subtype of Any, MutableInt is not a subtype of MutableAny. Hence, whereas it is possible to cast a val declared of type Any to an Int (a type-unsafe operation that can fail hard but can also succeed if the payload of the Any val is indeed an Int), a cast from MutableAny to MutableInt is simply impossible and will necessarily fail, even if the payload of the MutableAny is indeed an Int. If you look at the JIRA you will see this as the cause of failure:

Caused by: java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.MutableAny cannot be cast to org.apache.spark.sql.catalyst.expressions.MutableInt

Now, a question worth asking the author of this class is: why does SparkSQL rely on this type-casting mechanism to parse Parquet files? I am inclined to believe that there is a deeper issue here. That being said, my patch does allow my SQL queries to complete successfully against my Parquet dataset instead of failing.
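The difference between casting a payload and casting its holder can be shown with a minimal sketch. The classes below are simplified stand-ins written for this note, not Spark's actual hierarchy; only the names `MutableAny` and `MutableInt` are taken from the stack trace, and `MutableCastDemo` is hypothetical.

```scala
// Simplified stand-ins for illustration; these are NOT Spark's actual classes.
abstract class MutableValue {
  def boxed: Any
}

final class MutableInt extends MutableValue {
  var value: Int = 0
  def boxed: Any = value
}

final class MutableAny extends MutableValue {
  var value: Any = null
  def boxed: Any = value
}

object MutableCastDemo {
  // Casting the payload succeeds when the value really is an Int.
  def payloadCastSucceeds: Boolean = {
    val payload: Any = 1
    payload.asInstanceOf[Int] == 1
  }

  // Casting the holder fails whatever the payload is, because MutableAny
  // and MutableInt are unrelated sibling classes.
  def holderCastFails: Boolean = {
    val any = new MutableAny
    any.value = 1
    val holder: MutableValue = any
    try {
      val asInt = holder.asInstanceOf[MutableInt]
      asInt.value = 2
      false
    } catch {
      case _: ClassCastException => true
    }
  }
}
```

Under these assumptions, the holder cast throws even though the stored value is an `Int`, which matches the shape of the reported exception.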
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-70014846

Can you add a unit test for what this fixes? I don't see how this avoids the exceptions; it just seems to push them down into `MutableValue.update`. A test case would help convince me (admittedly I'm really unfamiliar with this code).
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-70036427

btw, while you're mucking around in there ... it might be nice to change the `SpecificMutableRow` constructor to take varargs. Change this constructor:

```
-  def this(dataTypes: Seq[DataType]) =
+  def this(dataTypes: DataType*) =
```

and then just get rid of the other constructor:

```
-  def this() = this(Seq.empty)
-
```

That way you don't have to wrap those types with a `Seq`, e.g. instead of `new SpecificMutableRow(Seq(StringType [, IntType, ...]))` you can just do `new SpecificMutableRow(StringType [, IntType,...])`
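The varargs suggestion above can be sketched in isolation. `SketchRow`, `VarargsDemo`, and the `DataType` objects below are hypothetical stand-ins defined only for this example; they are not Spark's `SpecificMutableRow` or its type objects.

```scala
// Hypothetical stand-ins for illustration; NOT Spark's actual types.
sealed trait DataType
case object StringType extends DataType
case object IntType extends DataType

// A single varargs constructor covers both the empty and the non-empty case.
class SketchRow(types: DataType*) {
  val dataTypes: Seq[DataType] = types
}

object VarargsDemo {
  // No arguments: replaces a separate zero-argument constructor.
  val empty = new SketchRow()

  // Types passed directly, without wrapping them in a Seq.
  val direct = new SketchRow(StringType, IntType)

  // A caller that already holds a Seq expands it with `: _*`.
  val existing: Seq[DataType] = Seq(StringType, IntType)
  val splatted = new SketchRow(existing: _*)
}
```

One trade-off of this design is that callers who already hold a `Seq` must expand it explicitly with `: _*`, as in the last line.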
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user alexbaretta commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-70036683 Ok, happy to look into this, but I will be out for the next few days, so this isn't going to happen before the weekend.
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-70036236

I think finding and fixing a bug in current behavior is a great reason to add a unit test. Some part of the implementation is confusing enough to have allowed a bug in the first place; a test will help prevent that bug from cropping up again in future changes. From the description you gave me, this seems like a minimal test case:

```
class SpecificMutableRowTest extends FunSuite with Matchers {
  test("update MutableAny") {
    val row = new SpecificMutableRow(Seq(StringType))
    row.update(0, 1)
    row.getInt(0) should be (1)
  }
}
```

I agree that this does seem kinda suspicious and that maybe there is something deeper going on ... why is a field that is supposed to be an int getting assigned a type of `MutableAny` instead of `MutableInt`, though apparently `CatalystPrimitiveConverter` has decided to call the `int` specific methods like `setInt` etc.?
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-70037269

Back to the question of something deeper being wrong ... I think we'll need to wait for input from somebody more familiar w/ this code. @marmbrus ? But one helpful thing you could do would be to provide a minimal example that reproduces this, e.g. some tiny Parquet file and an example query, maybe? [SPARK-5236](https://issues.apache.org/jira/browse/SPARK-5236) is really short on details. (Sorry for splitting into so many comments, keep getting distracted ...)
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
GitHub user alexbaretta opened a pull request:

    https://github.com/apache/spark/pull/4039

    [SPARK-5236] Fix ClassCastException in SpecificMutableRow

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/alexbaretta/spark spark-5236-MutableAny

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4039.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4039

commit 3e42dc3547c361e6d7073ed2a7ccb8facef75513
Author: Alex Baretta alexbare...@gmail.com
Date: 2015-01-14T06:48:56Z

    [SPARK-5236] Fix ClassCastException in SpecificMutableRow
[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-69878601 Can one of the admins verify this patch?