[jira] [Commented] (SPARK-19988) Flaky Test: OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column written by Hive

Kay Ousterhout (JIRA) Thu, 16 Mar 2017 19:15:12 -0700

    [ https://issues.apache.org/jira/browse/SPARK-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929334#comment-15929334 ]


Kay Ousterhout commented on SPARK-19988:
----------------------------------------

With some help from [~joshrosen] I spent some time digging into this and found:

(1) If you look at the failures, they're all from the maven build.  In fact, 
100% of the maven builds shown there fail (and none of the SBT ones).  This is 
weird because the test is also failing on the PR builder, which uses SBT.

(2) The maven build failures are all accompanied by failures in 3 other tests; 
the group of 4 tests seems to consistently fail together.  3 of the 4 tests 
fail with errors similar to this one (saying that some database does not 
exist).  The 4th test, 
org.apache.spark.sql.hive.execution.HiveCatalogedDDLSuite: create temporary 
view using, fails with what looks like a genuine error.  I filed SPARK-19990 
for that issue.

(3) A commit that landed right around the time the tests started failing, 
https://github.com/apache/spark/commit/09829be621f0f9bb5076abb3d832925624699fa9#diff-b7094baa12601424a5d19cb930e3402fR46
, added code to remove all of the databases after each test.  I wonder if that 
cleanup is somehow getting run concurrently or asynchronously in the maven 
build (after the HiveCatalogedDDLSuite test fails), which would explain why 
the error in HiveCatalogedDDLSuite causes the other tests to fail saying that 
a database can't be found.  I have extremely limited knowledge of both (a) how 
the maven tests are executed and (b) the SQL code, so it's possible these are 
totally unrelated issues.
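
To make the hypothesis in (3) concrete, here is a toy Python model of it.  It 
is not Spark or Hive code: SharedCatalog, DatabaseNotFound and run_suite_b are 
hypothetical names made up for illustration, and the model simply assumes that 
two suites share one catalog and that one suite's "drop all databases" cleanup 
can land in the middle of another suite's test.  Whether the maven build 
actually interleaves things this way is exactly the open question.

```python
class DatabaseNotFound(Exception):
    """Stand-in for Hive's 'Database does not exist' SemanticException."""


class SharedCatalog:
    """Hypothetical catalog shared by every suite in the same process."""

    def __init__(self):
        self.databases = {"default"}

    def create_database(self, name):
        self.databases.add(name)

    def use_database(self, name):
        if name not in self.databases:
            raise DatabaseNotFound(f"Database does not exist: {name}")

    def reset(self):
        # Models the after-each cleanup: drop everything except "default".
        self.databases = {"default"}


def run_suite_b(catalog, cleanup_mid_test):
    """Run a test that creates and then uses db2.

    If cleanup_mid_test is True, another suite's cleanup fires between the
    setup and the body of this test (the assumed interleaving).
    """
    catalog.create_database("db2")      # suite B: test setup
    if cleanup_mid_test:
        catalog.reset()                 # suite A: cleanup after its own test
    try:
        catalog.use_database("db2")     # suite B: test body
        return "passed"
    except DatabaseNotFound as e:
        return f"failed: {e}"


if __name__ == "__main__":
    print(run_suite_b(SharedCatalog(), cleanup_mid_test=False))
    print(run_suite_b(SharedCatalog(), cleanup_mid_test=True))
```

In this model the test passes when the cleanup runs strictly between tests and 
fails with "Database does not exist: db2" when the cleanup is interleaved, 
which matches the shape of the error in the stack trace below but says nothing 
about whether that is the real cause.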

None of this explains why the test is failing in the PR builder, where the 
failures have been isolated to this test.

> Flaky Test: OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column 
> written by Hive
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-19988
>                 URL: https://issues.apache.org/jira/browse/SPARK-19988
>             Project: Spark
>          Issue Type: Test
>          Components: SQL, Tests
>    Affects Versions: 2.2.0
>            Reporter: Imran Rashid
>              Labels: flaky-test
>         Attachments: trimmed-unit-test.log
>
>
> "OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column written by 
> Hive" fails a lot -- right now, I see about a 50% pass rate in the last 3 
> days here:
> https://spark-tests.appspot.com/test-details?suite_name=org.apache.spark.sql.hive.orc.OrcSourceSuite&test_name=SPARK-19459%2FSPARK-18220%3A+read+char%2Fvarchar+column+written+by+Hive
> eg. 
> https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74683/testReport/junit/org.apache.spark.sql.hive.orc/OrcSourceSuite/SPARK_19459_SPARK_18220__read_char_varchar_column_written_by_Hive/
> {noformat}
> sbt.ForkMain$ForkError: 
> org.apache.spark.sql.execution.QueryExecutionException: FAILED: 
> SemanticException [Error 10072]: Database does not exist: db2
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:637)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:621)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:288)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:229)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:228)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:271)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl.runHive(HiveClientImpl.scala:621)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl.runSqlHive(HiveClientImpl.scala:611)
>       at 
> org.apache.spark.sql.hive.orc.OrcSuite$$anonfun$7.apply$mcV$sp(OrcSourceSuite.scala:160)
>       at 
> org.apache.spark.sql.hive.orc.OrcSuite$$anonfun$7.apply(OrcSourceSuite.scala:155)
>       at 
> org.apache.spark.sql.hive.orc.OrcSuite$$anonfun$7.apply(OrcSourceSuite.scala:155)
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

