[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-07-04 Thread Fokko
Github user Fokko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r200102648
  
--- Diff: pom.xml ---
@@ -158,8 +158,8 @@
 2.11.12
 2.11
 1.9.13
-2.6.7
-
2.6.7.1
+2.9.6
+
2.9.6
--- End diff --

Good point, I've merged them


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-07-03 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r199803209
  
--- Diff: pom.xml ---
@@ -158,8 +158,8 @@
 2.11.12
 2.11
 1.9.13
-2.6.7
-
2.6.7.1
+2.9.6
+
2.9.6
--- End diff --

I suspect we can collapse these two versions; they were broken out to 
handle the fact that a few 2.6.x Jackson releases didn't publish all artifacts. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-29 Thread Fokko
Github user Fokko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r199141020
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
 ---
@@ -25,8 +25,14 @@ import org.apache.spark.util.{Benchmark, Utils}
 
 /**
  * The benchmarks aims to measure performance of JSON parsing when 
encoding is set and isn't.
- * To run this:
- *  spark-submit --class  --jars 
+ * To run:
+ *  mvn clean package -pl sql/core -DskipTests
+ *  ./dev/make-distribution.sh --name local-dist
+ *  cd dist/
+ *  ./bin/spark-submit bin/spark-submit \
+ *  --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
+ *  --jars ./sql/core/target/spark-sql_*-tests.jar
+ *  open /tmp/output.txt
--- End diff --

I've reverted this


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-28 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r198772042
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
 ---
@@ -25,8 +25,14 @@ import org.apache.spark.util.{Benchmark, Utils}
 
 /**
  * The benchmarks aims to measure performance of JSON parsing when 
encoding is set and isn't.
- * To run this:
- *  spark-submit --class  --jars 
+ * To run:
+ *  mvn clean package -pl sql/core -DskipTests
+ *  ./dev/make-distribution.sh --name local-dist
+ *  cd dist/
+ *  ./bin/spark-submit bin/spark-submit \
+ *  --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
+ *  --jars ./sql/core/target/spark-sql_*-tests.jar
+ *  open /tmp/output.txt
--- End diff --

yea.. similar comment at 
https://github.com/apache/spark/pull/21596#discussion_r197661126


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-28 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r198762319
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
 ---
@@ -25,8 +25,14 @@ import org.apache.spark.util.{Benchmark, Utils}
 
 /**
  * The benchmarks aims to measure performance of JSON parsing when 
encoding is set and isn't.
- * To run this:
- *  spark-submit --class  --jars 
+ * To run:
+ *  mvn clean package -pl sql/core -DskipTests
+ *  ./dev/make-distribution.sh --name local-dist
+ *  cd dist/
+ *  ./bin/spark-submit bin/spark-submit \
+ *  --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
+ *  --jars ./sql/core/target/spark-sql_*-tests.jar
+ *  open /tmp/output.txt
--- End diff --

Do we need to update it like this? I prefer original version...


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r198686913
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
 ---
@@ -25,8 +25,13 @@ import org.apache.spark.util.{Benchmark, Utils}
 
 /**
  * The benchmarks aims to measure performance of JSON parsing when 
encoding is set and isn't.
- * To run this:
- *  spark-submit --class  --jars 
+ * To run:
+ *  mvn clean package -pl sql/core -DskipTests
+ *  ./dev/make-distribution.sh --name local-dist
+ *  cd dist/
+ *  ./bin/spark-submit --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
+ *  ../sql/core/target/spark-sql_2.11-2.4.0-SNAPSHOT-tests.jar > 
/tmp/output.txt
--- End diff --

Let's take out other comments like `make-distribution.sh` and `cd dist/` 
too since they can be varied by how to build.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-26 Thread Fokko
Github user Fokko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r198112472
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
 ---
@@ -25,8 +25,13 @@ import org.apache.spark.util.{Benchmark, Utils}
 
 /**
  * The benchmarks aims to measure performance of JSON parsing when 
encoding is set and isn't.
- * To run this:
- *  spark-submit --class  --jars 
+ * To run:
+ *  mvn clean package -pl sql/core -DskipTests
+ *  ./dev/make-distribution.sh --name local-dist
+ *  cd dist/
+ *  ./bin/spark-submit --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
+ *  ../sql/core/target/spark-sql_2.11-2.4.0-SNAPSHOT-tests.jar > 
/tmp/output.txt
--- End diff --

Oops, forgot to add:
```
MacBook-Pro-van-Fokko:spark fokkodriesprong$ git diff
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
index 3b273f81d7..0598406d33 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
@@ -29,8 +29,9 @@ import org.apache.spark.util.{Benchmark, Utils}
  *  mvn clean package -pl sql/core -DskipTests
  *  ./dev/make-distribution.sh --name local-dist
  *  cd dist/
- *  ./bin/spark-submit --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
- *  ../sql/core/target/spark-sql_2.11-2.4.0-SNAPSHOT-tests.jar > 
/tmp/output.txt
+ *  ./bin/spark-submit bin/spark-submit \
+ *  --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
+ *  --jars ./sql/core/target/spark-sql_*-tests.jar
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-26 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r198102188
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
 ---
@@ -25,8 +25,13 @@ import org.apache.spark.util.{Benchmark, Utils}
 
 /**
  * The benchmarks aims to measure performance of JSON parsing when 
encoding is set and isn't.
- * To run this:
- *  spark-submit --class  --jars 
+ * To run:
+ *  mvn clean package -pl sql/core -DskipTests
+ *  ./dev/make-distribution.sh --name local-dist
+ *  cd dist/
+ *  ./bin/spark-submit --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
+ *  ../sql/core/target/spark-sql_2.11-2.4.0-SNAPSHOT-tests.jar > 
/tmp/output.txt
--- End diff --

Eh, seems missed to update?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-25 Thread Fokko
Github user Fokko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r197715689
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
 ---
@@ -25,8 +25,13 @@ import org.apache.spark.util.{Benchmark, Utils}
 
 /**
  * The benchmarks aims to measure performance of JSON parsing when 
encoding is set and isn't.
- * To run this:
- *  spark-submit --class  --jars 
+ * To run:
+ *  mvn clean package -pl sql/core -DskipTests
+ *  ./dev/make-distribution.sh --name local-dist
+ *  cd dist/
+ *  ./bin/spark-submit --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
+ *  ../sql/core/target/spark-sql_2.11-2.4.0-SNAPSHOT-tests.jar > 
/tmp/output.txt
--- End diff --

Ok, I'm fine with that. I've updated the comments.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-25 Thread Fokko
Github user Fokko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r197714976
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
 ---
@@ -112,12 +118,13 @@ object JSONBenchmarks {
   }
 
   /*
-  Intel(R) Core(TM) i7-7920HQ CPU @ 3.10GHz
+  Java HotSpot(TM) 64-Bit Server VM 1.8.0_172-b11 on Mac OS X 10.13.5
+  Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
 
   JSON per-line parsing:   Best/Avg Time(ms)Rate(M/s)  
 Per Row(ns)   Relative
   

-  No encoding 25947 / 26188  3.9   
  259.5   1.0X
-  UTF-8 is set46319 / 46417  2.2   
  463.2   0.6X
+  No encoding   9936 / 9973 10,1   
   99,4   1,0X
+  UTF-8 is set16018 / 16377  6,2   
  160,2   0,6X
--- End diff --

This is my European computer, I'll change to `.` ;)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-25 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r197713648
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
 ---
@@ -112,12 +118,13 @@ object JSONBenchmarks {
   }
 
   /*
-  Intel(R) Core(TM) i7-7920HQ CPU @ 3.10GHz
+  Java HotSpot(TM) 64-Bit Server VM 1.8.0_172-b11 on Mac OS X 10.13.5
+  Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
 
   JSON per-line parsing:   Best/Avg Time(ms)Rate(M/s)  
 Per Row(ns)   Relative
   

-  No encoding 25947 / 26188  3.9   
  259.5   1.0X
-  UTF-8 is set46319 / 46417  2.2   
  463.2   0.6X
+  No encoding   9936 / 9973 10,1   
   99,4   1,0X
+  UTF-8 is set16018 / 16377  6,2   
  160,2   0,6X
--- End diff --

BTW, was this `,` instead of `.`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-25 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r197712172
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
 ---
@@ -25,8 +25,13 @@ import org.apache.spark.util.{Benchmark, Utils}
 
 /**
  * The benchmarks aims to measure performance of JSON parsing when 
encoding is set and isn't.
- * To run this:
- *  spark-submit --class  --jars 
+ * To run:
+ *  mvn clean package -pl sql/core -DskipTests
+ *  ./dev/make-distribution.sh --name local-dist
+ *  cd dist/
+ *  ./bin/spark-submit --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
+ *  ../sql/core/target/spark-sql_2.11-2.4.0-SNAPSHOT-tests.jar > 
/tmp/output.txt
--- End diff --

But wouldn't this command be invalid if we have another release. I mean, I 
believe we can build by following 
https://spark.apache.org/docs/latest/building-spark.html and run the benchmark.

Let's just say it like `bin/spark-submit --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks --jars 
./sql/core/target/spark-sql_*.jar` or like `spark-submit --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks --jars `

Should better to make the steps independent of other possible ways.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-25 Thread Fokko
Github user Fokko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r197707147
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
 ---
@@ -25,8 +25,13 @@ import org.apache.spark.util.{Benchmark, Utils}
 
 /**
  * The benchmarks aims to measure performance of JSON parsing when 
encoding is set and isn't.
- * To run this:
- *  spark-submit --class  --jars 
+ * To run:
+ *  mvn clean package -pl sql/core -DskipTests
+ *  ./dev/make-distribution.sh --name local-dist
+ *  cd dist/
+ *  ./bin/spark-submit --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
+ *  ../sql/core/target/spark-sql_2.11-2.4.0-SNAPSHOT-tests.jar > 
/tmp/output.txt
--- End diff --

I agree, but running this benchmark is quite specific and should only 
happend when something is changed with the json functionality. Not sure if you 
want to have a specific build step for it.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-24 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r197661126
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
 ---
@@ -25,8 +25,13 @@ import org.apache.spark.util.{Benchmark, Utils}
 
 /**
  * The benchmarks aims to measure performance of JSON parsing when 
encoding is set and isn't.
- * To run this:
- *  spark-submit --class  --jars 
+ * To run:
+ *  mvn clean package -pl sql/core -DskipTests
+ *  ./dev/make-distribution.sh --name local-dist
+ *  cd dist/
+ *  ./bin/spark-submit --class 
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
+ *  ../sql/core/target/spark-sql_2.11-2.4.0-SNAPSHOT-tests.jar > 
/tmp/output.txt
--- End diff --

I think it's better to have more specific commands but let's avoid to have 
"2.4.0-SNAPSHOT" in this. Otherwise, we should change this when we make another 
release .. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-23 Thread Fokko
Github user Fokko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r197604577
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
 ---
@@ -244,6 +244,13 @@ class JsonExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper with
   "1234")
   }
 
+  test("some big value") {
+val value = "x" * 3000
+checkEvaluation(
+  GetJsonObject(NonFoldableLiteral((s"""{"big": "$value"}"""))
+, NonFoldableLiteral("$.big")), value)
--- End diff --

Fixed, much nicer indeed


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21596#discussion_r197604227
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
 ---
@@ -244,6 +244,13 @@ class JsonExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper with
   "1234")
   }
 
+  test("some big value") {
+val value = "x" * 3000
+checkEvaluation(
+  GetJsonObject(NonFoldableLiteral((s"""{"big": "$value"}"""))
+, NonFoldableLiteral("$.big")), value)
--- End diff --

nit: ->

```scala
checkEvaluation(
  GetJsonObject(NonFoldableLiteral((s"""{"big": "$value"}""")),
  NonFoldableLiteral("$.big")), value)
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21596: [SPARK-24601] Bump Jackson version

2018-06-20 Thread Fokko
GitHub user Fokko opened a pull request:

https://github.com/apache/spark/pull/21596

[SPARK-24601] Bump Jackson version

Hi all,

Jackson is incompatible with upstream versions, therefore bump the Jackson 
version to a more recent one. I bumped into some issues with Azure CosmosDB 
that is using a more recent version of Jackson. This can be fixed by adding 
exclusions and then it works without any issues. So no breaking changes in the 
API's.

I would also consider bumping the version of Jackson in Spark. I would 
suggest to keep up to date with the dependencies, since in the future this 
issue will pop up more frequently.

## What changes were proposed in this pull request?

Bump Jackson to 2.9.6

## How was this patch tested?

Compiled and tested it locally to see if anything broke.

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Fokko/spark fd-bump-jackson

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21596.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21596


commit ba454b0e945fcb16a41fbb9eccc1ea2a6941ace5
Author: Fokko Driesprong 
Date:   2018-06-20T08:15:59Z

[SPARK-24601] Bump Jackson version

Jackson is incompatible with upstream versions, therefore bump
the Jackson version to a more recent one.




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org