[GitHub] [spark-website] panbingkun commented on pull request #474: [SPARK-44820][DOCS] Switch languages consistently across docs for all code snippets

2023-09-24 Thread via GitHub


panbingkun commented on PR #474:
URL: https://github.com/apache/spark-website/pull/474#issuecomment-1732813770

   > @panbingkun yes let's update the spark website (this repo) to fix this UI issue for published docs.
   
   Okay, let me fix it.





[spark] branch master updated (f81f51467b8 -> bb0d287114f)

2023-09-24 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from f81f51467b8 [SPARK-45257][CORE][FOLLOWUP] Correct the from version in migration guide
 add bb0d287114f [SPARK-45294][PYTHON][DOCS] Use JDK 17 in Binder integration for PySpark live notebooks

No new revisions were added by this update.

Summary of changes:
 binder/apt.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)





[spark] branch master updated: [SPARK-45257][CORE][FOLLOWUP] Correct the from version in migration guide

2023-09-24 Thread yao
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new f81f51467b8 [SPARK-45257][CORE][FOLLOWUP] Correct the from version in migration guide
f81f51467b8 is described below

commit f81f51467b85779086873860d5bac0d5429c9a29
Author: Cheng Pan 
AuthorDate: Mon Sep 25 09:37:01 2023 +0800

[SPARK-45257][CORE][FOLLOWUP] Correct the from version in migration guide

### What changes were proposed in this pull request?

Correct the from version in migration guide

### Why are the changes needed?

Address comments  
https://github.com/apache/spark/commit/8d599972872225e336467700715b1d4771624efe#r128053622

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Review

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #43072 from pan3793/SPARK-45257-followup.

Authored-by: Cheng Pan 
Signed-off-by: Kent Yao 
---
 docs/core-migration-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/core-migration-guide.md b/docs/core-migration-guide.md
index 765c3494f66..2464d774240 100644
--- a/docs/core-migration-guide.md
+++ b/docs/core-migration-guide.md
@@ -22,7 +22,7 @@ license: |
 * Table of contents
 {:toc}
 
-## Upgrading from Core 3.4 to 4.0
+## Upgrading from Core 3.5 to 4.0
 
 - Since Spark 4.0, Spark will compress event logs. To restore the behavior before Spark 4.0, you can set `spark.eventLog.compress` to `false`.
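
A minimal PySpark sketch of the behavior described by that guide line; the `spark.eventLog.compress` key comes from the diff above, while the application name and the `spark.eventLog.enabled` setting are only illustrative context:

```python
from pyspark.sql import SparkSession

# Keep event logging on but restore the pre-Spark 4.0 behavior of writing
# uncompressed event logs, per the migration guide entry above.
spark = (
    SparkSession.builder
    .appName("event-log-compression-example")     # illustrative name
    .config("spark.eventLog.enabled", "true")     # event logging itself, shown for context
    .config("spark.eventLog.compress", "false")   # the setting named in the guide
    .getOrCreate()
)
```

The same setting can also be passed on the command line, e.g. `--conf spark.eventLog.compress=false` with spark-submit.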
 





[spark] branch master updated: [SPARK-45240][SQL][CONNECT] Implement Error Enrichment for Python Client

2023-09-24 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 913991046c6 [SPARK-45240][SQL][CONNECT] Implement Error Enrichment for Python Client
913991046c6 is described below

commit 913991046c6d2b707eab64bd8ca874f9b9bb6581
Author: Yihong He 
AuthorDate: Mon Sep 25 09:35:06 2023 +0900

[SPARK-45240][SQL][CONNECT] Implement Error Enrichment for Python Client

### What changes were proposed in this pull request?

- Implemented the reconstruction of the exception with un-truncated error messages and full server-side stacktrace (including cause exceptions) based on the responses of the FetchErrorDetails RPC.

Examples:
`./bin/pyspark --remote local`
```python
>>> spark.sql("""select from_json('{"d": "02-29"}', 'd date', map('dateFormat', 'MM-dd'))""").collect()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/yihonghe/Workspace/spark/python/pyspark/sql/connect/session.py", line 556, in sql
    data, properties = self.client.execute_command(cmd.command(self._client))
  File "/Users/yihonghe/Workspace/spark/python/pyspark/sql/connect/client/core.py", line 958, in execute_command
    data, _, _, _, properties = self._execute_and_fetch(req)
  File "/Users/yihonghe/Workspace/spark/python/pyspark/sql/connect/client/core.py", line 1259, in _execute_and_fetch
    for response in self._execute_and_fetch_as_iterator(req):
  File "/Users/yihonghe/Workspace/spark/python/pyspark/sql/connect/client/core.py", line 1240, in _execute_and_fetch_as_iterator
    self._handle_error(error)
  File "/Users/yihonghe/Workspace/spark/python/pyspark/sql/connect/client/core.py", line 1479, in _handle_error
    self._handle_rpc_error(error)
  File "/Users/yihonghe/Workspace/spark/python/pyspark/sql/connect/client/core.py", line 1533, in _handle_rpc_error
    raise convert_exception(
pyspark.errors.exceptions.connect.SparkUpgradeException: [INCONSISTENT_BEHAVIOR_CROSS_VERSION.PARSE_DATETIME_BY_NEW_PARSER] You may get a different result due to the upgrading to Spark >= 3.0:
Fail to parse '02-29' in the new parser. You can set "spark.sql.legacy.timeParserPolicy" to "LEGACY" to restore the behavior before Spark 3.0, or set to "CORRECTED" and treat it as an invalid datetime string.

JVM stacktrace:
org.apache.spark.SparkUpgradeException: [INCONSISTENT_BEHAVIOR_CROSS_VERSION.PARSE_DATETIME_BY_NEW_PARSER] You may get a different result due to the upgrading to Spark >= 3.0:
Fail to parse '02-29' in the new parser. You can set "spark.sql.legacy.timeParserPolicy" to "LEGACY" to restore the behavior before Spark 3.0, or set to "CORRECTED" and treat it as an invalid datetime string.
    at org.apache.spark.sql.errors.ExecutionErrors.failToParseDateTimeInNewParserError(ExecutionErrors.scala:54)
    at org.apache.spark.sql.errors.ExecutionErrors.failToParseDateTimeInNewParserError$(ExecutionErrors.scala:48)
    at org.apache.spark.sql.errors.ExecutionErrors$.failToParseDateTimeInNewParserError(ExecutionErrors.scala:218)
    at org.apache.spark.sql.catalyst.util.DateTimeFormatterHelper$$anonfun$checkParsedDiff$1.applyOrElse(DateTimeFormatterHelper.scala:142)
    at org.apache.spark.sql.catalyst.util.DateTimeFormatterHelper$$anonfun$checkParsedDiff$1.applyOrElse(DateTimeFormatterHelper.scala:135)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:35)
    at org.apache.spark.sql.catalyst.util.Iso8601DateFormatter.parse(DateFormatter.scala:59)
    at org.apache.spark.sql.catalyst.json.JacksonParser$$anonfun$$nestedInanonfun$makeConverter$11$1.applyOrElse(JacksonParser.scala:302)
    at org.apache.spark.sql.catalyst.json.JacksonParser$$anonfun$$nestedInanonfun$makeConverter$11$1.applyOrElse(JacksonParser.scala:299)
    at org.apache.spark.sql.catalyst.json.JacksonParser.parseJsonToken(JacksonParser.scala:404)
    at org.apache.spark.sql.catalyst.json.JacksonParser.$anonfun$makeConverter$11(JacksonParser.scala:299)
    at org.apache.spark.sql.catalyst.json.JacksonParser.org$apache$spark$sql$catalyst$json$JacksonParser$$convertObject(JacksonParser.scala:457)
    at org.apache.spark.sql.catalyst.json.JacksonParser$$anonfun$$nestedInanonfun$makeStructRootConverter$3$1.applyOrElse(JacksonParser.scala:123)
    at org.apache.spark.sql.catalyst.json.JacksonParser$$anonfun$$nestedInanonfun$makeStructRootConverter$3$1.applyOrElse(JacksonParser.scala:122)
    at org.apache.spark.sql.catalyst.json.JacksonParser.parseJsonToken(JacksonParser.scala:404)
    at
```

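A hedged sketch of how a Spark Connect Python client could surface the enriched error shown above. The two conf keys are taken from the Scala-client tests added in SPARK-45207 (below); assuming they also govern enrichment for the Python client, and assuming `spark` is a Connect session such as one started with `./bin/pyspark --remote local`:

```python
from pyspark.errors.exceptions.connect import SparkUpgradeException

# Assumption: these confs (exercised by the SPARK-45207 Scala-client tests) also
# control error enrichment for the Python client.
spark.conf.set("spark.sql.connect.enrichError.enabled", "true")
spark.conf.set("spark.sql.connect.serverStacktrace.enabled", "true")

try:
    spark.sql(
        """select from_json('{"d": "02-29"}', 'd date', map('dateFormat', 'MM-dd'))"""
    ).collect()
except SparkUpgradeException as e:
    # With enrichment enabled, the message carries the un-truncated server-side
    # error and the JVM stacktrace, as in the traceback above.
    print(e)
```
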
[spark] branch master updated: [SPARK-45207][SQL][CONNECT] Implement Error Enrichment for Scala Client

2023-09-24 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 4863be5632f [SPARK-45207][SQL][CONNECT] Implement Error Enrichment for Scala Client
4863be5632f is described below

commit 4863be5632f3165a5699a525235ea118c1e1f7eb
Author: Yihong He 
AuthorDate: Mon Sep 25 09:35:33 2023 +0900

[SPARK-45207][SQL][CONNECT] Implement Error Enrichment for Scala Client

### What changes were proposed in this pull request?

- Implemented the reconstruction of the complete exception (un-truncated error messages, cause exceptions, server-side stacktrace) based on the responses of the FetchErrorDetails RPC.

### Why are the changes needed?

- Cause exceptions play an important role in the current control flow, such as in StreamingQueryException. They are also valuable for debugging.
- Un-truncated error messages are useful for debugging.
- Providing server-side stack traces aids in effectively diagnosing server-related issues.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

- `build/sbt "connect-client-jvm/testOnly *ClientE2ETestSuite"`
- `build/sbt "connect-client-jvm/testOnly *ClientStreamingQuerySuite"`

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #42987 from heyihong/SPARK-45207.

Authored-by: Yihong He 
Signed-off-by: Hyukjin Kwon 
---
 .../org/apache/spark/sql/ClientE2ETestSuite.scala  |  59 ++-
 .../sql/streaming/ClientStreamingQuerySuite.scala  |  41 -
 .../client/CustomSparkConnectBlockingStub.scala|  44 -
 .../connect/client/GrpcExceptionConverter.scala| 192 +
 4 files changed, 292 insertions(+), 44 deletions(-)

diff --git a/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala b/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala
index 21892542eab..ec9b1698a4e 100644
--- a/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala
+++ b/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala
@@ -18,6 +18,7 @@ package org.apache.spark.sql
 
 import java.io.{ByteArrayOutputStream, PrintStream}
 import java.nio.file.Files
+import java.time.DateTimeException
 import java.util.Properties
 
 import scala.collection.JavaConverters._
@@ -29,7 +30,7 @@ import org.apache.commons.lang3.{JavaVersion, SystemUtils}
 import org.scalactic.TolerantNumerics
 import org.scalatest.PrivateMethodTester
 
-import org.apache.spark.{SparkArithmeticException, SparkException}
+import org.apache.spark.{SparkArithmeticException, SparkException, SparkUpgradeException}
 import org.apache.spark.SparkBuildInfo.{spark_version => SPARK_VERSION}
 import org.apache.spark.sql.catalyst.analysis.{NamespaceAlreadyExistsException, NoSuchDatabaseException, NoSuchTableException, TableAlreadyExistsException, TempTableAlreadyExistsException}
 import org.apache.spark.sql.catalyst.encoders.AgnosticEncoders.StringEncoder
@@ -44,6 +45,62 @@ import org.apache.spark.sql.types._
 
 class ClientE2ETestSuite extends RemoteSparkSession with SQLHelper with PrivateMethodTester {
 
+  for (enrichErrorEnabled <- Seq(false, true)) {
+    test(s"cause exception - ${enrichErrorEnabled}") {
+      withSQLConf("spark.sql.connect.enrichError.enabled" -> enrichErrorEnabled.toString) {
+        val ex = intercept[SparkUpgradeException] {
+          spark
+            .sql("""
+                |select from_json(
+                |  '{"d": "02-29"}',
+                |  'd date',
+                |  map('dateFormat', 'MM-dd'))
+                |""".stripMargin)
+            .collect()
+        }
+        if (enrichErrorEnabled) {
+          assert(ex.getCause.isInstanceOf[DateTimeException])
+        } else {
+          assert(ex.getCause == null)
+        }
+      }
+    }
+  }
+
+  test(s"throw SparkException with large cause exception") {
+    withSQLConf("spark.sql.connect.enrichError.enabled" -> "true") {
+      val session = spark
+      import session.implicits._
+
+      val throwException =
+        udf((_: String) => throw new SparkException("test" * 1))
+
+      val ex = intercept[SparkException] {
+        Seq("1").toDS.withColumn("udf_val", throwException($"value")).collect()
+      }
+
+      assert(ex.getCause.isInstanceOf[SparkException])
+      assert(ex.getCause.getMessage.contains("test" * 1))
+    }
+  }
+
+  for (isServerStackTraceEnabled <- Seq(false, true)) {
+    test(s"server-side stack trace is set in exceptions - ${isServerStackTraceEnabled}") {
+      withSQLConf(
+        "spark.sql.connect.serverStacktrace.enabled" -> isServerStackTraceEnabled.toString,
+

[spark] branch master updated: [SPARK-45279][PYTHON][CONNECT] Attach plan_id for all logical plans

2023-09-24 Thread ruifengz
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 609552e19cf [SPARK-45279][PYTHON][CONNECT] Attach plan_id for all logical plans
609552e19cf is described below

commit 609552e19cfe75109b1b4641baadd79360e75443
Author: Ruifeng Zheng 
AuthorDate: Mon Sep 25 08:17:08 2023 +0800

[SPARK-45279][PYTHON][CONNECT] Attach plan_id for all logical plans

### What changes were proposed in this pull request?
Attach plan_id for all logical plans, except `CachedRelation`

### Why are the changes needed?
1. All logical plans should include their plan id in the protos.
2. Catalog plans already carry the plan id in the Scala client, e.g.

https://github.com/apache/spark/blob/05f5dccbd34218c7d399228529853bdb1595f3a2/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala#L63-L67

where the `newDataset` method sets the plan id.

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
CI

### Was this patch authored or co-authored using generative AI tooling?
no

Closes #43055 from zhengruifeng/connect_plan_id.

Authored-by: Ruifeng Zheng 
Signed-off-by: Ruifeng Zheng 
---
 python/pyspark/sql/connect/plan.py | 79 +++---
 1 file changed, 40 insertions(+), 39 deletions(-)

diff --git a/python/pyspark/sql/connect/plan.py b/python/pyspark/sql/connect/plan.py
index 219545cf646..6758b3673f3 100644
--- a/python/pyspark/sql/connect/plan.py
+++ b/python/pyspark/sql/connect/plan.py
@@ -1190,9 +1190,7 @@ class CollectMetrics(LogicalPlan):
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
         assert self._child is not None
-
-        plan = proto.Relation()
-        plan.common.plan_id = self._child._plan_id
+        plan = self._create_proto_relation()
         plan.collect_metrics.input.CopyFrom(self._child.plan(session))
         plan.collect_metrics.name = self._name
         plan.collect_metrics.metrics.extend([self.col_to_expr(x, session) for x in self._exprs])
@@ -1689,7 +1687,9 @@ class CurrentDatabase(LogicalPlan):
         super().__init__(None)
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
-        return proto.Relation(catalog=proto.Catalog(current_database=proto.CurrentDatabase()))
+        plan = self._create_proto_relation()
+        plan.catalog.current_database.SetInParent()
+        return plan
 
 
 class SetCurrentDatabase(LogicalPlan):
@@ -1698,7 +1698,7 @@ class SetCurrentDatabase(LogicalPlan):
         self._db_name = db_name
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
-        plan = proto.Relation()
+        plan = self._create_proto_relation()
         plan.catalog.set_current_database.db_name = self._db_name
         return plan
 
@@ -1709,7 +1709,8 @@ class ListDatabases(LogicalPlan):
         self._pattern = pattern
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
-        plan = proto.Relation(catalog=proto.Catalog(list_databases=proto.ListDatabases()))
+        plan = self._create_proto_relation()
+        plan.catalog.list_databases.SetInParent()
         if self._pattern is not None:
             plan.catalog.list_databases.pattern = self._pattern
         return plan
@@ -1722,7 +1723,8 @@ class ListTables(LogicalPlan):
         self._pattern = pattern
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
-        plan = proto.Relation(catalog=proto.Catalog(list_tables=proto.ListTables()))
+        plan = self._create_proto_relation()
+        plan.catalog.list_tables.SetInParent()
         if self._db_name is not None:
             plan.catalog.list_tables.db_name = self._db_name
         if self._pattern is not None:
@@ -1737,7 +1739,8 @@ class ListFunctions(LogicalPlan):
         self._pattern = pattern
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
-        plan = proto.Relation(catalog=proto.Catalog(list_functions=proto.ListFunctions()))
+        plan = self._create_proto_relation()
+        plan.catalog.list_functions.SetInParent()
         if self._db_name is not None:
             plan.catalog.list_functions.db_name = self._db_name
         if self._pattern is not None:
@@ -1752,7 +1755,7 @@ class ListColumns(LogicalPlan):
         self._db_name = db_name
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
-        plan = proto.Relation(catalog=proto.Catalog(list_columns=proto.ListColumns()))
+        plan = self._create_proto_relation()
         plan.catalog.list_columns.table_name = self._table_name
         if self._db_name is not None:
             plan.catalog.list_columns.db_name = self._db_name
@@ -1765,7 +1768,7 @@ class 
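
For context, a hypothetical sketch of what a helper like `_create_proto_relation` could do, inferred from the inline code it replaces in the hunks above (a relation proto stamped with the plan's own id); it is not copied from plan.py, and the real signatures may differ:

```python
import itertools

import pyspark.sql.connect.proto as proto

# Hypothetical: a process-wide counter handing out a unique plan id per plan instance.
_plan_id_counter = itertools.count()


class LogicalPlan:
    def __init__(self, child: "LogicalPlan | None" = None) -> None:
        self._child = child
        self._plan_id = next(_plan_id_counter)

    def _create_proto_relation(self) -> proto.Relation:
        """Create an empty relation proto stamped with this plan's id."""
        plan = proto.Relation()
        plan.common.plan_id = self._plan_id
        return plan


class CurrentDatabase(LogicalPlan):
    def plan(self, session: object) -> proto.Relation:
        plan = self._create_proto_relation()          # carries plan_id, as in the hunk above
        plan.catalog.current_database.SetInParent()   # mark the catalog variant
        return plan
```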

[spark] branch branch-3.3 updated: [SPARK-45286][DOCS] Add back Matomo analytics

2023-09-24 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new 9a28200f6e4 [SPARK-45286][DOCS] Add back Matomo analytics
9a28200f6e4 is described below

commit 9a28200f6e461c4929dd6e05b6dd55fe984c0924
Author: Sean Owen 
AuthorDate: Sun Sep 24 14:17:55 2023 -0500

[SPARK-45286][DOCS] Add back Matomo analytics

### What changes were proposed in this pull request?

Add analytics to doc pages using the ASF's Matomo service

### Why are the changes needed?

We had previously removed Google Analytics from the website and release docs, per ASF policy: https://github.com/apache/spark/pull/36310

We just restored analytics using the ASF-hosted Matomo service on the website:

https://github.com/apache/spark-website/commit/a1548627b48a62c2e51870d1488ca3e09397bd30

This change would put the same new tracking code back into the release docs. It would let us see what docs and resources are most used, I suppose.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

N/A

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #43063 from srowen/SPARK-45286.

Authored-by: Sean Owen 
Signed-off-by: Sean Owen 
(cherry picked from commit a881438114ea3e8e918d981ef89ed1ab956d6fca)
Signed-off-by: Sean Owen 
---
 docs/_layouts/global.html | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/docs/_layouts/global.html b/docs/_layouts/global.html
index d4463922766..2d139f5e0fb 100755
--- a/docs/_layouts/global.html
+++ b/docs/_layouts/global.html
@@ -33,6 +33,25 @@
     <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.css" />
 
 
+{% production %}
+<!-- Matomo -->
+<script>
+    var _paq = window._paq = window._paq || [];
+    /* tracker methods like "setCustomDimension" should be called before "trackPageView" */
+    _paq.push(["disableCookies"]);
+    _paq.push(['trackPageView']);
+    _paq.push(['enableLinkTracking']);
+    (function() {
+      var u="https://analytics.apache.org/";
+      _paq.push(['setTrackerUrl', u+'matomo.php']);
+      _paq.push(['setSiteId', '40']);
+      var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
+      g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
+    })();
+</script>
+<!-- End Matomo Code -->
+{% endproduction %}
+
 
 
 

[spark] branch branch-3.4 updated: [SPARK-45286][DOCS] Add back Matomo analytics

2023-09-24 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
 new 20924aa581a [SPARK-45286][DOCS] Add back Matomo analytics
20924aa581a is described below

commit 20924aa581a2c5c49ec700689f1888dd7db79e6b
Author: Sean Owen 
AuthorDate: Sun Sep 24 14:17:55 2023 -0500

[SPARK-45286][DOCS] Add back Matomo analytics

### What changes were proposed in this pull request?

Add analytics to doc pages using the ASF's Matomo service

### Why are the changes needed?

We had previously removed Google Analytics from the website and release docs, per ASF policy: https://github.com/apache/spark/pull/36310

We just restored analytics using the ASF-hosted Matomo service on the website:

https://github.com/apache/spark-website/commit/a1548627b48a62c2e51870d1488ca3e09397bd30

This change would put the same new tracking code back into the release docs. It would let us see what docs and resources are most used, I suppose.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

N/A

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #43063 from srowen/SPARK-45286.

Authored-by: Sean Owen 
Signed-off-by: Sean Owen 
(cherry picked from commit a881438114ea3e8e918d981ef89ed1ab956d6fca)
Signed-off-by: Sean Owen 
---
 docs/_layouts/global.html | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/docs/_layouts/global.html b/docs/_layouts/global.html
index d4463922766..2d139f5e0fb 100755
--- a/docs/_layouts/global.html
+++ b/docs/_layouts/global.html
@@ -33,6 +33,25 @@
     <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.css" />
 
 
+{% production %}
+<!-- Matomo -->
+<script>
+    var _paq = window._paq = window._paq || [];
+    /* tracker methods like "setCustomDimension" should be called before "trackPageView" */
+    _paq.push(["disableCookies"]);
+    _paq.push(['trackPageView']);
+    _paq.push(['enableLinkTracking']);
+    (function() {
+      var u="https://analytics.apache.org/";
+      _paq.push(['setTrackerUrl', u+'matomo.php']);
+      _paq.push(['setSiteId', '40']);
+      var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
+      g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
+    })();
+</script>
+<!-- End Matomo Code -->
+{% endproduction %}
+
 
 
 

[spark] branch branch-3.5 updated: [SPARK-45286][DOCS] Add back Matomo analytics

2023-09-24 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.5 by this push:
 new 609306ff5da [SPARK-45286][DOCS] Add back Matomo analytics
609306ff5da is described below

commit 609306ff5daa8ff7c2212088d33c0911ad0f4989
Author: Sean Owen 
AuthorDate: Sun Sep 24 14:17:55 2023 -0500

[SPARK-45286][DOCS] Add back Matomo analytics

### What changes were proposed in this pull request?

Add analytics to doc pages using the ASF's Matomo service

### Why are the changes needed?

We had previously removed Google Analytics from the website and release docs, per ASF policy: https://github.com/apache/spark/pull/36310

We just restored analytics using the ASF-hosted Matomo service on the website:

https://github.com/apache/spark-website/commit/a1548627b48a62c2e51870d1488ca3e09397bd30

This change would put the same new tracking code back into the release docs. It would let us see what docs and resources are most used, I suppose.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

N/A

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #43063 from srowen/SPARK-45286.

Authored-by: Sean Owen 
Signed-off-by: Sean Owen 
(cherry picked from commit a881438114ea3e8e918d981ef89ed1ab956d6fca)
Signed-off-by: Sean Owen 
---
 docs/_layouts/global.html | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/docs/_layouts/global.html b/docs/_layouts/global.html
index 9b7c4692461..8c4435fdf31 100755
--- a/docs/_layouts/global.html
+++ b/docs/_layouts/global.html
@@ -32,6 +32,25 @@
     <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.css" />
 
 
+{% production %}
+<!-- Matomo -->
+<script>
+    var _paq = window._paq = window._paq || [];
+    /* tracker methods like "setCustomDimension" should be called before "trackPageView" */
+    _paq.push(["disableCookies"]);
+    _paq.push(['trackPageView']);
+    _paq.push(['enableLinkTracking']);
+    (function() {
+      var u="https://analytics.apache.org/";
+      _paq.push(['setTrackerUrl', u+'matomo.php']);
+      _paq.push(['setSiteId', '40']);
+      var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
+      g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
+    })();
+</script>
+<!-- End Matomo Code -->
+{% endproduction %}
+
 
 
 

[spark] branch master updated: [SPARK-45286][DOCS] Add back Matomo analytics

2023-09-24 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new a881438114e [SPARK-45286][DOCS] Add back Matomo analytics
a881438114e is described below

commit a881438114ea3e8e918d981ef89ed1ab956d6fca
Author: Sean Owen 
AuthorDate: Sun Sep 24 14:17:55 2023 -0500

[SPARK-45286][DOCS] Add back Matomo analytics

### What changes were proposed in this pull request?

Add analytics to doc pages using the ASF's Matomo service

### Why are the changes needed?

We had previously removed Google Analytics from the website and release docs, per ASF policy: https://github.com/apache/spark/pull/36310

We just restored analytics using the ASF-hosted Matomo service on the website:

https://github.com/apache/spark-website/commit/a1548627b48a62c2e51870d1488ca3e09397bd30

This change would put the same new tracking code back into the release docs. It would let us see what docs and resources are most used, I suppose.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

N/A

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #43063 from srowen/SPARK-45286.

Authored-by: Sean Owen 
Signed-off-by: Sean Owen 
---
 docs/_layouts/global.html | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/docs/_layouts/global.html b/docs/_layouts/global.html
index e857efad6f0..c2f05cfd6bb 100755
--- a/docs/_layouts/global.html
+++ b/docs/_layouts/global.html
@@ -32,6 +32,25 @@
     <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.css" />
 
 
+{% production %}
+<!-- Matomo -->
+<script>
+    var _paq = window._paq = window._paq || [];
+    /* tracker methods like "setCustomDimension" should be called before "trackPageView" */
+    _paq.push(["disableCookies"]);
+    _paq.push(['trackPageView']);
+    _paq.push(['enableLinkTracking']);
+    (function() {
+      var u="https://analytics.apache.org/";
+      _paq.push(['setTrackerUrl', u+'matomo.php']);
+      _paq.push(['setSiteId', '40']);
+      var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
+      g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
+    })();
+</script>
+<!-- End Matomo Code -->
+{% endproduction %}
+