[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-12-05 Thread mallman
Github user mallman commented on the issue:

https://github.com/apache/spark/pull/22905
  
I think I've made my case for this patch as best I can. It does not appear 
this PR has unanimous support, but I continue to believe we should merge it to 
master. So where do we take it from here?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22905
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22905
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98399/
Test PASSed.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22905
  
**[Test build #98399 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98399/testReport)**
 for PR 22905 at commit 
[`7124ffa`](https://github.com/apache/spark/commit/7124ffab65a9eb41cc218fc9b52ce60dd8b4873c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-02 Thread mallman
Github user mallman commented on the issue:

https://github.com/apache/spark/pull/22905
  
> @mallman Could you run EXPLAIN with these new changes and post the output in 
the PR description?

Done.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22905
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4721/
Test PASSed.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22905
  
Build finished. Test PASSed.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22905
  
**[Test build #98399 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98399/testReport)**
 for PR 22905 at commit 
[`7124ffa`](https://github.com/apache/spark/commit/7124ffab65a9eb41cc218fc9b52ce60dd8b4873c).


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22905
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22905
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98397/
Test FAILed.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22905
  
**[Test build #98397 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98397/testReport)**
 for PR 22905 at commit 
[`b4f0584`](https://github.com/apache/spark/commit/b4f05844e3080fa76f8fcda396fe711f7e52c88d).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22905
  
**[Test build #98397 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98397/testReport)**
 for PR 22905 at commit 
[`b4f0584`](https://github.com/apache/spark/commit/b4f05844e3080fa76f8fcda396fe711f7e52c88d).


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22905
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22905
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4720/
Test PASSed.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-11-01 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22905
  
@mallman Could you run EXPLAIN with these new changes and post the output in the 
PR description?


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-10-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22905
  
Thank you for pinging me, @mallman.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22905
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22905
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98316/
Test PASSed.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22905
  
**[Test build #98316 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98316/testReport)**
 for PR 22905 at commit 
[`4aa8d04`](https://github.com/apache/spark/commit/4aa8d0454be723f8318e1d0a3ea4e4c138ed5861).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-10-31 Thread mallman
Github user mallman commented on the issue:

https://github.com/apache/spark/pull/22905
  
> Is there anything blocked by this? I agree this is a good feature, but it 
asks the data source to provide a new ability, which may become a problem when 
migrating file sources to data source v2.

This isn't blocking anything. It's just a contribution that has proven 
very helpful to us in identifying the source of performance problems. 
I think it would be helpful for others, too.

That being said, I don't know enough about what would be involved in 
migrating file sources to data source v2 to say how difficult that would be. 
This implementation (for Parquet) is essentially a one-liner. All the heavy 
lifting is done by `SparkToParquetSchemaConverter`.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22905
  
Is there anything blocked by this? I agree this is a good feature, but it 
asks the data source to provide a new ability, which may become a problem when 
migrating file sources to data source v2.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-10-31 Thread mallman
Github user mallman commented on the issue:

https://github.com/apache/spark/pull/22905
  
@gatorsmile @viirya @cloud-fan @dbtsai, your thoughts?

cc @dongjoon-hyun for the ORC file format perspective.


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22905
  
**[Test build #98316 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98316/testReport)**
 for PR 22905 at commit 
[`4aa8d04`](https://github.com/apache/spark/commit/4aa8d0454be723f8318e1d0a3ea4e4c138ed5861).


---




[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22905
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22905
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4664/
Test PASSed.


---
