[GitHub] spark issue #19113: [SPARK-20978][SQL] Bump up Univocity version to 2.5.4

2017-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/19113
  
We didn't accept Parquet 1.9.0 because it has a known performance 
regression. I think this one is fine. Merging to master, thanks!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19113: [SPARK-20978][SQL] Bump up Univocity version to 2.5.4

2017-09-05 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/19113
  
If we need 2.5.x for the fix, then we need 2.5.x. It's worth picking up an 
update if it solves a real problem. And if we're going to update minor 
versions, it's generally good practice to pick the latest maintenance release 
unless there's a specific reason not to. I don't think we have any general 
policy against using the latest version of something; on the contrary. Parquet 
is more critical and perhaps less reliable about maintaining the exact 
behavior, so maybe deserves more caution, but this change seems fine.





[GitHub] spark issue #19113: [SPARK-20978][SQL] Bump up Univocity version to 2.5.4

2017-09-05 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19113
  
Since our next version, Spark 2.3, is expected at the end of this year, we 
can still revert to 2.2.1 if we find that 2.5.4 introduces new bugs or a 
performance regression. 

I am fine with merging it now. Let @rxin @marmbrus @cloud-fan give the final 
confirmation.






[GitHub] spark issue #19113: [SPARK-20978][SQL] Bump up Univocity version to 2.5.4

2017-09-05 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19113
  
This release of Univocity came out just a few days ago. To me, this sounds 
risky.

We normally do not upgrade to the latest version. This is why we are 
not using Parquet 1.9.0. Instead, we are asking the Parquet community to 
release 1.8.2. 

cc @rxin @marmbrus @cloud-fan 





[GitHub] spark issue #19113: [SPARK-20978][SQL] Bump up Univocity version to 2.5.4

2017-09-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19113
  
With 2.7G of data, I ran a simple Java program with 2.5.4 and 2.2.1 using 
`CsvParser`, plus simple e2e read tests. The elapsed time diff was roughly 
-1.7% ~ +1.2%. I think there is virtually no diff (or 0.5 improvement).
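
For readers who want to reproduce this kind of comparison, here is a minimal sketch of how a relative elapsed-time difference could be computed. It is not the program used for the numbers above. The helper names `time_runs` and `relative_diff_percent` are hypothetical, and the sign convention (negative means the candidate version is faster) is an assumption, since the comment does not state one.

```python
import time
from typing import Callable, List


def time_runs(task: Callable[[], object], runs: int = 5) -> List[float]:
    """Return wall-clock durations in seconds for `runs` executions of `task`."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        task()  # e.g. parse the same CSV file with one library version
        durations.append(time.perf_counter() - start)
    return durations


def relative_diff_percent(baseline: float, candidate: float) -> float:
    """Percent change of `candidate` relative to `baseline`.

    Assumed convention: a negative result means the candidate took less time.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) / baseline * 100.0
```

With this convention, a baseline of 100.0 seconds and a candidate of 98.3 seconds gives roughly -1.7, which would fall at the lower end of the range quoted above.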

I think we generally trust the other communities and libraries we decided to 
add, such as ORC, Parquet, and Jackson, and avoid duplicating such efforts by 
relying on community support. I think we discussed this before.





[GitHub] spark issue #19113: [SPARK-20978][SQL] Bump up Univocity version to 2.5.4

2017-09-03 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19113
  
What about other popular open source projects? Do you know which 
projects are using Univocity 2.5?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---




[GitHub] spark issue #19113: [SPARK-20978][SQL] Bump up Univocity version to 2.5.4

2017-09-03 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19113
  
Are there any performance measurements comparing 2.2 and 2.5?





[GitHub] spark issue #19113: [SPARK-20978][SQL] Bump up Univocity version to 2.5.4

2017-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19113
  
Merged build finished. Test PASSed.





[GitHub] spark issue #19113: [SPARK-20978][SQL] Bump up Univocity version to 2.5.4

2017-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19113
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81368/





[GitHub] spark issue #19113: [SPARK-20978][SQL] Bump up Univocity version to 2.5.4

2017-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19113
  
**[Test build #81368 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81368/testReport)**
for PR 19113 at commit [`fa7eb51`](https://github.com/apache/spark/commit/fa7eb514cfd91bb405fd74680b08d5865911e3f0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #19113: [SPARK-20978][SQL] Bump up Univocity version to 2.5.4

2017-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19113
  
**[Test build #81368 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81368/testReport)**
for PR 19113 at commit [`fa7eb51`](https://github.com/apache/spark/commit/fa7eb514cfd91bb405fd74680b08d5865911e3f0).

