[GitHub] [spark-website] zero323 commented on pull request #358: Fix remotes URLs to point to apache/spark

2021-10-09 Thread GitBox


zero323 commented on pull request #358:
URL: https://github.com/apache/spark-website/pull/358#issuecomment-939361223


   Thanks @srowen!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] srowen closed pull request #358: Fix remotes URLs to point to apache/spark

2021-10-09 Thread GitBox


srowen closed pull request #358:
URL: https://github.com/apache/spark-website/pull/358


   




[spark-website] branch asf-site updated: Fix remotes URLs to point to apache/spark

2021-10-09 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 3cff519  Fix remotes URLs to point to apache/spark
3cff519 is described below

commit 3cff5195eb37664b3f6c4cd7ae664ceb45cf07aa
Author: zero323 
AuthorDate: Sat Oct 9 16:00:47 2021 -0500

Fix remotes URLs to point to apache/spark

The How to Merge a Pull Request section describes the process of working with the main Spark repository. However, the `git remote` listings in that section point to apache/spark-website.



Author: zero323 

Closes #358 from zero323/fix-setting-up-remotes.
---
 committers.md        | 12 ++--
 site/committers.html | 12 ++--
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/committers.md b/committers.md
index ad12fa9..4bb255c 100644
--- a/committers.md
+++ b/committers.md
@@ -179,12 +179,12 @@ After cloning your fork of Spark you already have a remote `origin` pointing the
 contains at least these lines:
 
 ```
-apache g...@github.com:apache/spark-website.git (fetch)
-apache g...@github.com:apache/spark-website.git (push)
-apache-github  g...@github.com:apache/spark-website.git (fetch)
-apache-github  g...@github.com:apache/spark-website.git (push)
-origin g...@github.com:[your username]/spark-website.git (fetch)
-origin g...@github.com:[your username]/spark-website.git (push)
+apache g...@github.com:apache/spark.git (fetch)
+apache g...@github.com:apache/spark.git (push)
+apache-github  g...@github.com:apache/spark.git (fetch)
+apache-github  g...@github.com:apache/spark.git (push)
+origin g...@github.com:[your username]/spark.git (fetch)
+origin g...@github.com:[your username]/spark.git (push)
 ```
 
For the `apache` repo, you will need to set up command-line authentication to GitHub. This may
diff --git a/site/committers.html b/site/committers.html
index 4e93005..93bdd5c 100644
--- a/site/committers.html
+++ b/site/committers.html
@@ -626,12 +626,12 @@ into the official Spark repo just by specifying your fork
in the origin pointing there. So if correct, your git remote -v
 contains at least these lines:
 
-apache g...@github.com:apache/spark-website.git (fetch)
-apache g...@github.com:apache/spark-website.git (push)
-apache-github  g...@github.com:apache/spark-website.git (fetch)
-apache-github  g...@github.com:apache/spark-website.git (push)
-origin g...@github.com:[your username]/spark-website.git (fetch)
-origin g...@github.com:[your username]/spark-website.git (push)
+apache   g...@github.com:apache/spark.git (fetch)
+apache g...@github.com:apache/spark.git (push)
+apache-github  g...@github.com:apache/spark.git (fetch)
+apache-github  g...@github.com:apache/spark.git (push)
+origin g...@github.com:[your username]/spark.git (fetch)
+origin g...@github.com:[your username]/spark.git (push)
 
 
For the apache repo, you will need to set up command-line authentication to GitHub. This may
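
For anyone reproducing the corrected setup from scratch, a minimal sketch of the commands behind the listing above (SSH access assumed; substitute your GitHub username):

```
git clone git@github.com:[your username]/spark.git
cd spark
git remote add apache git@github.com:apache/spark.git
git remote add apache-github git@github.com:apache/spark.git
git remote -v   # should now print the six lines shown in the patch
```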



[GitHub] [spark-website] zero323 commented on pull request #358: Fix remotes URLs to point to apache/spark

2021-10-09 Thread GitBox


zero323 commented on pull request #358:
URL: https://github.com/apache/spark-website/pull/358#issuecomment-939354439


   > Can you generate the HTML too? (or simply edit the HTML files too, as this is so simple)
   
   Sorry, it slipped my mind. Done.




[GitHub] [spark-website] srowen commented on pull request #358: Fix remotes URLs to point to apache/spark

2021-10-09 Thread GitBox


srowen commented on pull request #358:
URL: https://github.com/apache/spark-website/pull/358#issuecomment-939343934


   Oops, right. Can you generate the HTML too? (or simply edit the HTML files too, as this is so simple)




[GitHub] [spark-website] zero323 opened a new pull request #358: Fix remotes URLs to point to apache/spark

2021-10-09 Thread GitBox


zero323 opened a new pull request #358:
URL: https://github.com/apache/spark-website/pull/358


   The How to Merge a Pull Request section describes the process of working with the main Spark repository. However, the `git remote` listings in that section point to apache/spark-website.
   
   
   




[spark] branch master updated: [SPARK-36960][SQL] Pushdown filters with ANSI interval values to ORC

2021-10-09 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new ebfc6bb  [SPARK-36960][SQL] Pushdown filters with ANSI interval values to ORC
ebfc6bb is described below

commit ebfc6bbe0e9200f87ebb52fb71d009b2d71b956d
Author: Kousuke Saruta 
AuthorDate: Sat Oct 9 16:55:59 2021 +0300

[SPARK-36960][SQL] Pushdown filters with ANSI interval values to ORC

### What changes were proposed in this pull request?

This PR proposes to push down filters with ANSI intervals to ORC.

### Why are the changes needed?

After SPARK-36931 (#34184), the V1 and V2 ORC datasources support ANSI intervals, so it is worthwhile to also push down filters with ANSI interval values for better performance.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

New tests.

Closes #34224 from sarutak/orc-ansi-interval-pushdown.

Lead-authored-by: Kousuke Saruta 
Co-authored-by: Kousuke Saruta 
Signed-off-by: Max Gekk 
---
 .../apache/spark/sql/catalyst/dsl/package.scala    |  4 +-
 .../sql/execution/datasources/orc/OrcFilters.scala | 10 ++-
 .../execution/datasources/orc/OrcFilterSuite.scala | 97 ++
 3 files changed, 108 insertions(+), 3 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala
index 4a97a8d..979c280 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala
@@ -18,7 +18,7 @@
 package org.apache.spark.sql.catalyst
 
 import java.sql.{Date, Timestamp}
-import java.time.{Instant, LocalDate}
+import java.time.{Duration, Instant, LocalDate, Period}
 
 import scala.language.implicitConversions
 
@@ -167,6 +167,8 @@ package object dsl {
     implicit def timestampToLiteral(t: Timestamp): Literal = Literal(t)
     implicit def instantToLiteral(i: Instant): Literal = Literal(i)
     implicit def binaryToLiteral(a: Array[Byte]): Literal = Literal(a)
+    implicit def periodToLiteral(p: Period): Literal = Literal(p)
+    implicit def durationToLiteral(d: Duration): Literal = Literal(d)
 
     implicit def symbolToUnresolvedAttribute(s: Symbol): analysis.UnresolvedAttribute =
       analysis.UnresolvedAttribute(s.name)
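
As a hedged illustration of what the two new implicits enable in test DSL code (the column names here are made up, not part of the patch):

```scala
import java.time.{Duration, Period}

import org.apache.spark.sql.catalyst.dsl.expressions._

// Period and Duration values now convert to Literals implicitly, so ANSI
// interval predicates read like the existing date/timestamp ones.
val yearMonthPred = Symbol("ym") > Period.ofMonths(6)     // uses periodToLiteral
val dayTimePred   = Symbol("dt") <= Duration.ofHours(12)  // uses durationToLiteral
```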
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
index 5abfa4c..8e02fc3 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
@@ -17,7 +17,7 @@
 
 package org.apache.spark.sql.execution.datasources.orc
 
-import java.time.{Instant, LocalDate}
+import java.time.{Duration, Instant, LocalDate, Period}
 
 import org.apache.hadoop.hive.common.`type`.HiveDecimal
 import org.apache.hadoop.hive.ql.io.sarg.{PredicateLeaf, SearchArgument}
@@ -26,6 +26,7 @@ import org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory.newBuilder
 import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
 
 import org.apache.spark.sql.catalyst.util.DateTimeUtils.{instantToMicros, localDateToDays, toJavaDate, toJavaTimestamp}
+import org.apache.spark.sql.catalyst.util.IntervalUtils
 import org.apache.spark.sql.errors.QueryExecutionErrors
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.sources.Filter
@@ -140,7 +141,8 @@ private[sql] object OrcFilters extends OrcFiltersBase {
*/
   def getPredicateLeafType(dataType: DataType): PredicateLeaf.Type = dataType match {
     case BooleanType => PredicateLeaf.Type.BOOLEAN
-    case ByteType | ShortType | IntegerType | LongType => PredicateLeaf.Type.LONG
+    case ByteType | ShortType | IntegerType | LongType |
+         _: AnsiIntervalType => PredicateLeaf.Type.LONG
     case FloatType | DoubleType => PredicateLeaf.Type.FLOAT
     case StringType => PredicateLeaf.Type.STRING
     case DateType => PredicateLeaf.Type.DATE
@@ -166,6 +168,10 @@ private[sql] object OrcFilters extends OrcFiltersBase {
         toJavaDate(localDateToDays(value.asInstanceOf[LocalDate]))
       case _: TimestampType if value.isInstanceOf[Instant] =>
         toJavaTimestamp(instantToMicros(value.asInstanceOf[Instant]))
+      case _: YearMonthIntervalType =>
+        IntervalUtils.periodToMonths(value.asInstanceOf[Period]).longValue()
+      case _: DayTimeIntervalType =>
+        IntervalUtils.durationToMicros(value.asInstanceOf[Duration])
       case _ => value
     }
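
To make the effect concrete, a hedged end-to-end sketch (the local session and output path are assumptions, not part of the patch). After this change, the interval comparison below is eligible for pushdown into the ORC reader instead of being evaluated only after the scan:

```scala
import java.time.Period

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// ORC can store ANSI interval columns since SPARK-36931.
Seq(Period.ofMonths(1), Period.ofMonths(13))
  .toDF("ym")
  .write.mode("overwrite").orc("/tmp/ansi-interval-orc")

// The Period literal is converted to a month count (PredicateLeaf.Type.LONG)
// in the ORC SearchArgument, letting the reader skip non-matching stripes.
spark.read.orc("/tmp/ansi-interval-orc")
  .where($"ym" > Period.ofMonths(6))
  .show()
```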
 

[spark] branch master updated (4f825aa -> 7468cd7)

2021-10-09 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


 from 4f825aa  [SPARK-36839][INFRA][FOLLOW-UP] Respect Hadoop version configured in scheduled GitHub Actions jobs
 add 7468cd7  [SPARK-36804][YARN] Support --verbose option in YARN mode

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/deploy/yarn/ClientArguments.scala | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)
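
As a usage note, a hedged sketch of the new flag in action (the example class and jar are placeholders). With this change, `--verbose` prints argument-parsing details in YARN mode as it already does for other masters:

```
$ ./bin/spark-submit --master yarn --deploy-mode cluster --verbose \
    --class org.apache.spark.examples.SparkPi \
    examples/jars/spark-examples.jar 100
```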



[spark] branch master updated: [SPARK-36839][INFRA][FOLLOW-UP] Respect Hadoop version configured in scheduled GitHub Actions jobs

2021-10-09 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 4f825aa  [SPARK-36839][INFRA][FOLLOW-UP] Respect Hadoop version configured in scheduled GitHub Actions jobs
4f825aa is described below

commit 4f825aad65f2650343e7cfbef39465ebb4e403b6
Author: Hyukjin Kwon 
AuthorDate: Sat Oct 9 15:11:43 2021 +0900

[SPARK-36839][INFRA][FOLLOW-UP] Respect Hadoop version configured in scheduled GitHub Actions jobs

### What changes were proposed in this pull request?

This is a follow-up of #34091 and https://github.com/apache/spark/pull/34217. We should respect the default Hadoop 2.7 profile throughout the workflow (not only in Hive's tests).

### Why are the changes needed?

In order to run the daily job with Hadoop 2 for all tests.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

It will be tested once it's merged. It won't break any build in any event.

Closes #34230 from HyukjinKwon/SPARK-36839-followup-hadoop2.

Authored-by: Hyukjin Kwon 
Signed-off-by: Hyukjin Kwon 
---
 .github/workflows/build_and_test.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 1262e65..96451ac 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -91,7 +91,7 @@ jobs:
         java:
           - 8
         hadoop:
-          - hadoop3.2
+          - ${{ needs.configure-jobs.outputs.hadoop }}
         hive:
           - hive2.3
     # TODO(SPARK-32246): We don't test 'streaming-kinesis-asl' for now.
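
For context, a hedged sketch of the producing side of this wiring (the step names are assumptions; the real `configure-jobs` job derives the value from the triggering event, selecting Hadoop 2 for scheduled runs):

```yaml
configure-jobs:
  runs-on: ubuntu-20.04
  outputs:
    hadoop: ${{ steps.set-hadoop.outputs.hadoop }}
  steps:
    - name: Decide Hadoop profile
      id: set-hadoop
      # 2021-era GitHub Actions output syntax; a scheduled run would echo hadoop2.7.
      run: echo "::set-output name=hadoop::hadoop3.2"
```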



[spark] branch master updated: [SPARK-36897][PYTHON] Use NamedTuple with variable type hints instead of namedtuple

2021-10-09 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new c5c6f48  [SPARK-36897][PYTHON] Use NamedTuple with variable type hints instead of namedtuple
c5c6f48 is described below

commit c5c6f48de40a4e012a35ecc78f8db41654d2c8bd
Author: Takuya UESHIN 
AuthorDate: Sat Oct 9 15:00:20 2021 +0900

[SPARK-36897][PYTHON] Use NamedTuple with variable type hints instead of namedtuple

### What changes were proposed in this pull request?

Use `NamedTuple` with variable type hints instead of `namedtuple`.

### Why are the changes needed?

Per the discussion under https://github.com/apache/spark/pull/34133#discussion_r718833451, we wanted to replace `collections.namedtuple()` with `typing.NamedTuple`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests.

Closes #34228 from ueshin/issues/SPARK-36897/namedtuple.

Authored-by: Takuya UESHIN 
Signed-off-by: Hyukjin Kwon 
---
 python/pyspark/sql/catalog.py | 35 +--
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/python/pyspark/sql/catalog.py b/python/pyspark/sql/catalog.py
index 61167fa..29f22e4 100644
--- a/python/pyspark/sql/catalog.py
+++ b/python/pyspark/sql/catalog.py
@@ -17,8 +17,7 @@
 
 import sys
 import warnings
-from collections import namedtuple
-from typing import Any, Callable, List, Optional, TYPE_CHECKING
+from typing import Any, Callable, NamedTuple, List, Optional, TYPE_CHECKING
 
 from pyspark import since
 from pyspark.sql.dataframe import DataFrame
@@ -30,10 +29,34 @@ if TYPE_CHECKING:
 from pyspark.sql.types import DataType
 
 
-Database = namedtuple("Database", "name description locationUri")
-Table = namedtuple("Table", "name database description tableType isTemporary")
-Column = namedtuple("Column", "name description dataType nullable isPartition isBucket")
-Function = namedtuple("Function", "name description className isTemporary")
+class Database(NamedTuple):
+    name: str
+    description: Optional[str]
+    locationUri: str
+
+
+class Table(NamedTuple):
+    name: str
+    database: Optional[str]
+    description: Optional[str]
+    tableType: str
+    isTemporary: bool
+
+
+class Column(NamedTuple):
+    name: str
+    description: Optional[str]
+    dataType: str
+    nullable: bool
+    isPartition: bool
+    isBucket: bool
+
+
+class Function(NamedTuple):
+    name: str
+    description: Optional[str]
+    className: str
+    isTemporary: bool
 
 
 class Catalog(object):
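
As a quick, hedged illustration (values made up) of why this is a drop-in change: `typing.NamedTuple` subclasses keep the tuple behavior of the old `namedtuple` results while adding per-field type hints for static checkers:

```python
from typing import NamedTuple, Optional

class Database(NamedTuple):  # mirrors the class added above
    name: str
    description: Optional[str]
    locationUri: str

db = Database(name="default", description=None, locationUri="file:/tmp/warehouse")
assert db.name == "default"                            # attribute access, as before
name, description, location = db                       # still unpacks like a tuple
assert db == ("default", None, "file:/tmp/warehouse")  # tuple equality preserved
```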
