[GitHub] [spark-website] zero323 commented on pull request #358: Fix remotes URLs to point to apache/spark
zero323 commented on pull request #358:
URL: https://github.com/apache/spark-website/pull/358#issuecomment-939361223

   Thanks @srowen!
[GitHub] [spark-website] srowen closed pull request #358: Fix remotes URLs to point to apache/spark
srowen closed pull request #358:
URL: https://github.com/apache/spark-website/pull/358
[spark-website] branch asf-site updated: Fix remotes URLs to point to apache/spark
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 3cff519  Fix remotes URLs to point to apache/spark
3cff519 is described below

commit 3cff5195eb37664b3f6c4cd7ae664ceb45cf07aa
Author: zero323
AuthorDate: Sat Oct 9 16:00:47 2021 -0500

    Fix remotes URLs to point to apache/spark

    The "How to Merge a Pull Request" section describes the process of working with the main Spark repository. However, the `git remote` listing in that section points to apache/spark-website.

    Author: zero323

    Closes #358 from zero323/fix-setting-up-remotes.
---
 committers.md        | 12 ++--
 site/committers.html | 12 ++--
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/committers.md b/committers.md
index ad12fa9..4bb255c 100644
--- a/committers.md
+++ b/committers.md
@@ -179,12 +179,12 @@ After cloning your fork of Spark you already have a remote `origin` pointing the
 contains at least these lines:

 ```
-apache git@github.com:apache/spark-website.git (fetch)
-apache git@github.com:apache/spark-website.git (push)
-apache-github git@github.com:apache/spark-website.git (fetch)
-apache-github git@github.com:apache/spark-website.git (push)
-origin git@github.com:[your username]/spark-website.git (fetch)
-origin git@github.com:[your username]/spark-website.git (push)
+apache git@github.com:apache/spark.git (fetch)
+apache git@github.com:apache/spark.git (push)
+apache-github git@github.com:apache/spark.git (fetch)
+apache-github git@github.com:apache/spark.git (push)
+origin git@github.com:[your username]/spark.git (fetch)
+origin git@github.com:[your username]/spark.git (push)
 ```

 For the `apache` repo, you will need to set up command-line authentication to GitHub. This may

diff --git a/site/committers.html b/site/committers.html
index 4e93005..93bdd5c 100644
--- a/site/committers.html
+++ b/site/committers.html
@@ -626,12 +626,12 @@ into the official Spark repo just by specifying your fork in the
 origin pointing there. So if correct, your git remote -v contains at least these lines:

-apache git@github.com:apache/spark-website.git (fetch)
-apache git@github.com:apache/spark-website.git (push)
-apache-github git@github.com:apache/spark-website.git (fetch)
-apache-github git@github.com:apache/spark-website.git (push)
-origin git@github.com:[your username]/spark-website.git (fetch)
-origin git@github.com:[your username]/spark-website.git (push)
+apache git@github.com:apache/spark.git (fetch)
+apache git@github.com:apache/spark.git (push)
+apache-github git@github.com:apache/spark.git (fetch)
+apache-github git@github.com:apache/spark.git (push)
+origin git@github.com:[your username]/spark.git (fetch)
+origin git@github.com:[your username]/spark.git (push)

 For the apache repo, you will need to set up command-line authentication to GitHub. This may
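For context, here is a minimal sketch of the commands that would produce the corrected remote listing above. It assumes an SSH-based clone of a personal fork as `origin`; the remote names follow the committers guide, not this patch:

```
git remote add apache git@github.com:apache/spark.git
git remote add apache-github git@github.com:apache/spark.git
git remote -v   # should now show apache, apache-github, and origin as listed above
```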
[GitHub] [spark-website] zero323 commented on pull request #358: Fix remotes URLs to point to apache/spark
zero323 commented on pull request #358:
URL: https://github.com/apache/spark-website/pull/358#issuecomment-939354439

   > Can you generate the HTML too? (or simply edit the HTML files too, as this is so simple)

   Sorry, it slipped my mind. Done.
[GitHub] [spark-website] srowen commented on pull request #358: Fix remotes URLs to point to apache/spark
srowen commented on pull request #358:
URL: https://github.com/apache/spark-website/pull/358#issuecomment-939343934

   Oops, right. Can you generate the HTML too? (or simply edit the HTML files too, as this is so simple)
[GitHub] [spark-website] zero323 opened a new pull request #358: Fix remotes URLs to point to apache/spark
zero323 opened a new pull request #358:
URL: https://github.com/apache/spark-website/pull/358

   The "How to Merge a Pull Request" section describes the process of working with the main Spark repository. However, the `git remote` listing in that section points to apache/spark-website.
[spark] branch master updated: [SPARK-36960][SQL] Pushdown filters with ANSI interval values to ORC
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new ebfc6bb  [SPARK-36960][SQL] Pushdown filters with ANSI interval values to ORC
ebfc6bb is described below

commit ebfc6bbe0e9200f87ebb52fb71d009b2d71b956d
Author: Kousuke Saruta
AuthorDate: Sat Oct 9 16:55:59 2021 +0300

    [SPARK-36960][SQL] Pushdown filters with ANSI interval values to ORC

    ### What changes were proposed in this pull request?
    This PR proposes to push down filters with ANSI intervals to ORC.

    ### Why are the changes needed?
    After SPARK-36931 (#34184), V1 and V2 ORC datasources support ANSI intervals, so it is useful to also push down filters with ANSI interval values for better performance.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    New tests.

    Closes #34224 from sarutak/orc-ansi-interval-pushdown.

    Lead-authored-by: Kousuke Saruta
    Co-authored-by: Kousuke Saruta
    Signed-off-by: Max Gekk
---
 .../apache/spark/sql/catalyst/dsl/package.scala    |  4 +-
 .../sql/execution/datasources/orc/OrcFilters.scala | 10 ++-
 .../execution/datasources/orc/OrcFilterSuite.scala | 97 ++
 3 files changed, 108 insertions(+), 3 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala
index 4a97a8d..979c280 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala
@@ -18,7 +18,7 @@ package org.apache.spark.sql.catalyst

 import java.sql.{Date, Timestamp}
-import java.time.{Instant, LocalDate}
+import java.time.{Duration, Instant, LocalDate, Period}

 import scala.language.implicitConversions

@@ -167,6 +167,8 @@ package object dsl {
     implicit def timestampToLiteral(t: Timestamp): Literal = Literal(t)
     implicit def instantToLiteral(i: Instant): Literal = Literal(i)
     implicit def binaryToLiteral(a: Array[Byte]): Literal = Literal(a)
+    implicit def periodToLiteral(p: Period): Literal = Literal(p)
+    implicit def durationToLiteral(d: Duration): Literal = Literal(d)

     implicit def symbolToUnresolvedAttribute(s: Symbol): analysis.UnresolvedAttribute =
       analysis.UnresolvedAttribute(s.name)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
index 5abfa4c..8e02fc3 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
@@ -17,7 +17,7 @@ package org.apache.spark.sql.execution.datasources.orc

-import java.time.{Instant, LocalDate}
+import java.time.{Duration, Instant, LocalDate, Period}

 import org.apache.hadoop.hive.common.`type`.HiveDecimal
 import org.apache.hadoop.hive.ql.io.sarg.{PredicateLeaf, SearchArgument}
@@ -26,6 +26,7 @@ import org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory.newBuilder
 import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable

 import org.apache.spark.sql.catalyst.util.DateTimeUtils.{instantToMicros, localDateToDays, toJavaDate, toJavaTimestamp}
+import org.apache.spark.sql.catalyst.util.IntervalUtils
 import org.apache.spark.sql.errors.QueryExecutionErrors
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.sources.Filter
@@ -140,7 +141,8 @@ private[sql] object OrcFilters extends OrcFiltersBase {
    */
   def getPredicateLeafType(dataType: DataType): PredicateLeaf.Type = dataType match {
     case BooleanType => PredicateLeaf.Type.BOOLEAN
-    case ByteType | ShortType | IntegerType | LongType => PredicateLeaf.Type.LONG
+    case ByteType | ShortType | IntegerType | LongType |
+         _: AnsiIntervalType => PredicateLeaf.Type.LONG
     case FloatType | DoubleType => PredicateLeaf.Type.FLOAT
     case StringType => PredicateLeaf.Type.STRING
     case DateType => PredicateLeaf.Type.DATE
@@ -166,6 +168,10 @@ private[sql] object OrcFilters extends OrcFiltersBase {
       toJavaDate(localDateToDays(value.asInstanceOf[LocalDate]))
     case _: TimestampType if value.isInstanceOf[Instant] =>
       toJavaTimestamp(instantToMicros(value.asInstanceOf[Instant]))
+    case _: YearMonthIntervalType =>
+      IntervalUtils.periodToMonths(value.asInstanceOf[Period]).longValue()
+    case _: DayTimeIntervalType =>
+      IntervalUtils.durationToMicros(value.asInstanceOf[Duration])
     case _ => value
   }

diff --git
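To illustrate what this change enables on the user side, here is a minimal sketch, not part of the patch. It assumes a Spark build containing this commit, the java.time.Period/Duration encoders used for ANSI intervals, and the default `spark.sql.orc.filterPushdown=true`; the output path is hypothetical:

```scala
import java.time.{Duration, Period}

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object OrcIntervalPushdownSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // Period maps to YearMonthIntervalType, Duration to DayTimeIntervalType.
    val path = "/tmp/ansi-interval-orc"  // hypothetical location
    Seq(
      (Period.ofMonths(6), Duration.ofHours(1)),
      (Period.ofMonths(18), Duration.ofHours(30))
    ).toDF("ym", "dt").write.mode("overwrite").orc(path)

    // With ORC filter pushdown enabled (the default), predicates on ANSI interval
    // columns can now be converted into ORC search arguments instead of being
    // evaluated row by row after reading.
    spark.read.orc(path)
      .filter($"ym" > lit(Period.ofMonths(12)) && $"dt" <= lit(Duration.ofDays(1)))
      .show()

    spark.stop()
  }
}
```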
[spark] branch master updated (4f825aa -> 7468cd7)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 4f825aa  [SPARK-36839][INFRA][FOLLOW-UP] Respect Hadoop version configured in scheduled GitHub Actions jobs
  add 7468cd7  [SPARK-36804][YARN] Support --verbose option in YARN mode

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/deploy/yarn/ClientArguments.scala | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)
[spark] branch master updated: [SPARK-36839][INFRA][FOLLOW-UP] Respect Hadoop version configured in scheduled GitHub Actions jobs
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 4f825aa  [SPARK-36839][INFRA][FOLLOW-UP] Respect Hadoop version configured in scheduled GitHub Actions jobs
4f825aa is described below

commit 4f825aad65f2650343e7cfbef39465ebb4e403b6
Author: Hyukjin Kwon
AuthorDate: Sat Oct 9 15:11:43 2021 +0900

    [SPARK-36839][INFRA][FOLLOW-UP] Respect Hadoop version configured in scheduled GitHub Actions jobs

    ### What changes were proposed in this pull request?
    This is a followup of #34091 and https://github.com/apache/spark/pull/34217. We should respect the default Hadoop 2.7 (not only in Hive's tests).

    ### Why are the changes needed?
    In order to run the daily job with Hadoop 2 for all tests.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    It will be tested once it's merged. It won't break any build in any event.

    Closes #34230 from HyukjinKwon/SPARK-36839-followup-hadoop2.

    Authored-by: Hyukjin Kwon
    Signed-off-by: Hyukjin Kwon
---
 .github/workflows/build_and_test.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 1262e65..96451ac 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -91,7 +91,7 @@ jobs:
         java:
           - 8
         hadoop:
-          - hadoop3.2
+          - ${{ needs.configure-jobs.outputs.hadoop }}
         hive:
           - hive2.3
         # TODO(SPARK-32246): We don't test 'streaming-kinesis-asl' for now.
[spark] branch master updated: [SPARK-36897][PYTHON] Use NamedTuple with variable type hints instead of namedtuple
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new c5c6f48  [SPARK-36897][PYTHON] Use NamedTuple with variable type hints instead of namedtuple
c5c6f48 is described below

commit c5c6f48de40a4e012a35ecc78f8db41654d2c8bd
Author: Takuya UESHIN
AuthorDate: Sat Oct 9 15:00:20 2021 +0900

    [SPARK-36897][PYTHON] Use NamedTuple with variable type hints instead of namedtuple

    ### What changes were proposed in this pull request?
    Use `NamedTuple` with variable type hints instead of `namedtuple`.

    ### Why are the changes needed?
    Per discussion under https://github.com/apache/spark/pull/34133#discussion_r718833451, we wanted to replace `collections.namedtuple()` by `typing.NamedTuple`.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Existing tests.

    Closes #34228 from ueshin/issues/SPARK-36897/namedtuple.

    Authored-by: Takuya UESHIN
    Signed-off-by: Hyukjin Kwon
---
 python/pyspark/sql/catalog.py | 35 +++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/python/pyspark/sql/catalog.py b/python/pyspark/sql/catalog.py
index 61167fa..29f22e4 100644
--- a/python/pyspark/sql/catalog.py
+++ b/python/pyspark/sql/catalog.py
@@ -17,8 +17,7 @@
 import sys
 import warnings
-from collections import namedtuple
-from typing import Any, Callable, List, Optional, TYPE_CHECKING
+from typing import Any, Callable, NamedTuple, List, Optional, TYPE_CHECKING

 from pyspark import since
 from pyspark.sql.dataframe import DataFrame
@@ -30,10 +29,34 @@ if TYPE_CHECKING:
     from pyspark.sql.types import DataType


-Database = namedtuple("Database", "name description locationUri")
-Table = namedtuple("Table", "name database description tableType isTemporary")
-Column = namedtuple("Column", "name description dataType nullable isPartition isBucket")
-Function = namedtuple("Function", "name description className isTemporary")
+class Database(NamedTuple):
+    name: str
+    description: Optional[str]
+    locationUri: str
+
+
+class Table(NamedTuple):
+    name: str
+    database: Optional[str]
+    description: Optional[str]
+    tableType: str
+    isTemporary: bool
+
+
+class Column(NamedTuple):
+    name: str
+    description: Optional[str]
+    dataType: str
+    nullable: bool
+    isPartition: bool
+    isBucket: bool
+
+
+class Function(NamedTuple):
+    name: str
+    description: Optional[str]
+    className: str
+    isTemporary: bool


 class Catalog(object):
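As a usage note (not part of the patch): switching from `collections.namedtuple` to `typing.NamedTuple` keeps the tuple-like runtime behaviour while adding per-field type hints that static checkers such as mypy can verify. A minimal sketch reusing the `Database` shape from the diff above, with made-up field values:

```python
from typing import NamedTuple, Optional


class Database(NamedTuple):
    """Mirrors the shape added in pyspark/sql/catalog.py; values below are illustrative."""
    name: str
    description: Optional[str]
    locationUri: str


db = Database(name="default", description=None, locationUri="file:/tmp/spark-warehouse")

# NamedTuple instances still behave like the old namedtuple results:
assert db[0] == db.name                                # positional and attribute access
assert db._asdict()["locationUri"] == db.locationUri   # namedtuple-style helpers remain
print(db)
```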