Github user yhuai commented on the issue:
https://github.com/apache/spark/pull/14148
LGTM. Merging to master and branch 2.0
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/14148
Many interesting observations after further investigation. Will post the
findings tonight. Thanks!
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/14148
@rxin @cloud-fan @yhuai Will do more investigation and submit a separate PR
for solution review. Thanks!
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14148
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14148
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62217/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14148
**[Test build #62217 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62217/consoleFull)** for PR 14148 at commit `d92ebcd`.
---
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/14148
It's easy to infer the schema once when we create the table and store it in
the external catalog. However, it's a breaking change, which means users can't
change the underlying data file schema after
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/14148
Tomorrow, I will dig deeper and check whether schema evolution could be an
issue if the schema is fixed when creating tables.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/14148
uh... I see what you mean. Agree.
---
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/14148
I was not talking about caching here. Caching is transient. I want the
behavior to be the same regardless of how many times I'm restarting Spark ...
And this has nothing to do with refresh. For
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/14148
@rxin Currently, we do not run schema inference every time the metadata
cache contains the plan. Based on my understanding, that is the major reason
why we introduced the metadata cache at the v
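The cache-vs-inference trade-off being discussed can be illustrated with a toy model (plain Scala, not Spark's actual catalog code; `inferSchema`, `lookup`, and `refresh` are hypothetical stand-ins):

```scala
import scala.collection.mutable

// Toy model of a table metadata cache: schema inference is expensive,
// so its result is computed once per table and reused until refreshed.
object MetadataCacheSketch {
  var inferenceRuns = 0

  // Hypothetical stand-in for scanning data files to infer a schema.
  def inferSchema(table: String): Seq[(String, String)] = {
    inferenceRuns += 1
    Seq("id" -> "bigint", "name" -> "string")
  }

  private val cache = mutable.Map.empty[String, Seq[(String, String)]]

  // Repeated lookups hit the cache instead of re-running inference
  // (getOrElseUpdate evaluates its second argument only on a miss).
  def lookup(table: String): Seq[(String, String)] =
    cache.getOrElseUpdate(table, inferSchema(table))

  // Analogue of REFRESH TABLE: drop the cached entry so the next
  // lookup re-infers from the current files.
  def refresh(table: String): Unit = cache.remove(table)
}
```

In this model a second `lookup` is a pure cache hit; only `refresh` forces re-inference, and since the cache lives only in memory, a restarted process starts cold and infers again.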
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/14148
@cloud-fan, @gatorsmile, and @yhuai - how difficult would it be to change
Spark so that it runs schema inference during table creation, and saves the
table schema when we create the table?
---
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/14148
Thanks. Just FYI for future changes: when a table is added to the catalog
(regardless of whether it is temporary, non-temp, external, or internal), we
should save its schema. We should not rely on
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14148
**[Test build #62217 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62217/consoleFull)** for PR 14148 at commit `d92ebcd`.
---
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/14148
LGTM, pending jenkins
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/14148
retest this please
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/14148
Also cc @huaiy @cloud-fan @liancheng @marmbrus
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/14148
```
Seq("parquet", "json", "orc").foreach { fileFormat =>
  withTable("t1") {
    withTempPath { dir =>
      val path = dir.getCanonicalPath
      spark.ran
```
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/14148
@rxin The failed test case is interesting! The `REFRESH TABLE` command does
not refresh the metadata stored in the external catalog. When the tables are
data source tables, is that a bug?
Ple
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14148
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14148
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62143/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14148
**[Test build #62143 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62143/consoleFull)** for PR 14148 at commit `473b27d`.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14148
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14148
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62144/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14148
**[Test build #62144 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62144/consoleFull)** for PR 14148 at commit `a05383c`.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14148
**[Test build #62144 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62144/consoleFull)** for PR 14148 at commit `a05383c`.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14148
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62141/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14148
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14148
**[Test build #62141 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62141/consoleFull)** for PR 14148 at commit `d92ebcd`.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14148
**[Test build #62143 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62143/consoleFull)** for PR 14148 at commit `473b27d`.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/14148
Did a quick check. My understanding was wrong: we do run schema inference
when creating the table. Let me fix it. Thanks!
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/14148
@rxin The created table could be empty. Thus, we are unable to cover all the
cases even if we run schema inference when creating tables. You know, this
is just my understanding. No clue about the
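The empty-table concern reduces to a simple observation, sketched here as a toy model (not Spark code; `inferSchema` is a hypothetical stand-in for file-based inference):

```scala
// Toy model: file-based schema inference has no input when a freshly
// created table's directory contains no data files yet.
object EmptyTableSketch {
  // Hypothetical inference: derive a schema from the first data file, if any.
  def inferSchema(files: Seq[String]): Option[Seq[String]] =
    files.headOption.map(_ => Seq("id", "name"))
}
```

`inferSchema(Nil)` yields `None`: at CREATE time there may be nothing to infer from, whereas inference deferred to read time can succeed once data files appear.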
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/14148
Shouldn't schema inference run as soon as the table is created?
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14148
**[Test build #62141 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62141/consoleFull)** for PR 14148 at commit `d92ebcd`.
---