[GitHub] [spark] AmplabJenkins removed a comment on issue #28022: [SPARK-31253][SQL] add metrics to shuffle reader
AmplabJenkins removed a comment on issue #28022: [SPARK-31253][SQL] add metrics to shuffle reader URL: https://github.com/apache/spark/pull/28022#issuecomment-604095073 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #28022: [SPARK-31253][SQL] add metrics to shuffle reader
AmplabJenkins removed a comment on issue #28022: [SPARK-31253][SQL] add metrics to shuffle reader URL: https://github.com/apache/spark/pull/28022#issuecomment-604095083 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120366/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28026: [SPARK-18885][SQL] Unify create table syntax (WIP)
AmplabJenkins removed a comment on issue #28026: [SPARK-18885][SQL] Unify create table syntax (WIP) URL: https://github.com/apache/spark/pull/28026#issuecomment-604118631 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28026: [SPARK-18885][SQL] Unify create table syntax (WIP)
AmplabJenkins removed a comment on issue #28026: [SPARK-18885][SQL] Unify create table syntax (WIP) URL: https://github.com/apache/spark/pull/28026#issuecomment-604118640 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25085/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28026: [SPARK-18885][SQL] Unify create table syntax (WIP)
AmplabJenkins commented on issue #28026: [SPARK-18885][SQL] Unify create table syntax (WIP) URL: https://github.com/apache/spark/pull/28026#issuecomment-604118640 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25085/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28026: [SPARK-18885][SQL] Unify create table syntax (WIP)
AmplabJenkins commented on issue #28026: [SPARK-18885][SQL] Unify create table syntax (WIP) URL: https://github.com/apache/spark/pull/28026#issuecomment-604118631 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #28026: [SPARK-18885][SQL] Unify create table syntax (WIP)
SparkQA commented on issue #28026: [SPARK-18885][SQL] Unify create table syntax (WIP) URL: https://github.com/apache/spark/pull/28026#issuecomment-604118010 **[Test build #120376 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120376/testReport)** for PR 28026 at commit [`0522c93`](https://github.com/apache/spark/commit/0522c93d7f34b74faaf0abcfe02428855491a6a4).
[GitHub] [spark] rdblue commented on issue #28027: [SPARK-31255][SQL] Add SupportsMetadataColumns to DSv2 (WIP)
rdblue commented on issue #28027: [SPARK-31255][SQL] Add SupportsMetadataColumns to DSv2 (WIP) URL: https://github.com/apache/spark/pull/28027#issuecomment-604124358 FYI @HeartSaVioR and @brkyvz.
[GitHub] [spark] dongjoon-hyun commented on issue #28016: [SPARK-31238][SQL][test-hive1.2] Rebase dates to/from Julian calendar in write/read for ORC datasource
dongjoon-hyun commented on issue #28016: [SPARK-31238][SQL][test-hive1.2] Rebase dates to/from Julian calendar in write/read for ORC datasource URL: https://github.com/apache/spark/pull/28016#issuecomment-604128330 Retest this please.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28001: [SPARK-31237][SQL][TESTS] Replace 3-letter time zones by zone offsets
AmplabJenkins removed a comment on issue #28001: [SPARK-31237][SQL][TESTS] Replace 3-letter time zones by zone offsets URL: https://github.com/apache/spark/pull/28001#issuecomment-604142520 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28001: [SPARK-31237][SQL][TESTS] Replace 3-letter time zones by zone offsets
AmplabJenkins removed a comment on issue #28001: [SPARK-31237][SQL][TESTS] Replace 3-letter time zones by zone offsets URL: https://github.com/apache/spark/pull/28001#issuecomment-604142525 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120370/ Test PASSed.
[GitHub] [spark] github-actions[bot] closed pull request #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
github-actions[bot] closed pull request #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044
[GitHub] [spark] github-actions[bot] commented on issue #26756: [SPARK-30119][WebUI]Support Pagination for Batch Tables in Streaming Tab
github-actions[bot] commented on issue #26756: [SPARK-30119][WebUI]Support Pagination for Batch Tables in Streaming Tab URL: https://github.com/apache/spark/pull/26756#issuecomment-604154485 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] github-actions[bot] closed pull request #26270: [SPARK-26544][SQL] Escape struct string in spark thriftserver to keep alignment with hive
github-actions[bot] closed pull request #26270: [SPARK-26544][SQL] Escape struct string in spark thriftserver to keep alignment with hive URL: https://github.com/apache/spark/pull/26270
[GitHub] [spark] AmplabJenkins removed a comment on issue #28005: [DO-NOT-MERGE] Check if SPARK-31231 is fixed in setuptools side
AmplabJenkins removed a comment on issue #28005: [DO-NOT-MERGE] Check if SPARK-31231 is fixed in setuptools side URL: https://github.com/apache/spark/pull/28005#issuecomment-604161400 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25093/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications
AmplabJenkins commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications URL: https://github.com/apache/spark/pull/28009#issuecomment-604164985 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120377/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28009: [SPARK-31235][YARN] Separates different categories of applications
AmplabJenkins removed a comment on issue #28009: [SPARK-31235][YARN] Separates different categories of applications URL: https://github.com/apache/spark/pull/28009#issuecomment-604164972 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications
AmplabJenkins commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications URL: https://github.com/apache/spark/pull/28009#issuecomment-604164972 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #28009: [SPARK-31235][YARN] Separates different categories of applications
SparkQA removed a comment on issue #28009: [SPARK-31235][YARN] Separates different categories of applications URL: https://github.com/apache/spark/pull/28009#issuecomment-604120811 **[Test build #120377 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120377/testReport)** for PR 28009 at commit [`ef705dc`](https://github.com/apache/spark/commit/ef705dcc937c07c60dc5ba29a59913b81e05ce23).
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28024: [SPARK-31254][SQL] Use the current session time zone in `HiveResult.toHiveString`
HyukjinKwon commented on a change in pull request #28024: [SPARK-31254][SQL] Use the current session time zone in `HiveResult.toHiveString` URL: https://github.com/apache/spark/pull/28024#discussion_r398264310

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/HiveResult.scala ##
@@ -59,9 +59,9 @@ object HiveResult {
       .map(_.mkString("\t"))
   }

-  private lazy val zoneId = DateTimeUtils.getZoneId(SQLConf.get.sessionLocalTimeZone)
-  private lazy val dateFormatter = DateFormatter(zoneId)
-  private lazy val timestampFormatter = TimestampFormatter.getFractionFormatter(zoneId)
+  private def zoneId = DateTimeUtils.getZoneId(SQLConf.get.sessionLocalTimeZone)
+  private def dateFormatter = DateFormatter(zoneId)
+  private def timestampFormatter = TimestampFormatter.getFractionFormatter(zoneId)

Review comment: @MaxGekk, which code path accesses `SQLConf.get` here? It seems we should clarify in the documentation that other sessions need to be taken into account, since `TimestampFormatter` behaviour can depend on the SQL configuration in effect when the instance is created.
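The `lazy val` to `def` change in this diff comes down to whether the session time zone is read once and cached, or re-read on every use. Below is a rough Python analogue of that distinction, offered only as a sketch: it assumes `functools.cached_property` is a fair stand-in for Scala's `lazy val` and a plain `property` for `def`. `FakeConf`, `CachedFormatter`, and `FreshFormatter` are hypothetical names invented for illustration and are not Spark APIs.

```python
from functools import cached_property


class FakeConf:
    """Hypothetical stand-in for a mutable, session-scoped configuration."""

    def __init__(self, time_zone):
        self.time_zone = time_zone


class CachedFormatter:
    """Analogue of `lazy val`: the zone is read once, on first access."""

    def __init__(self, conf):
        self._conf = conf

    @cached_property
    def zone_id(self):
        return self._conf.time_zone


class FreshFormatter:
    """Analogue of `def`: the zone is re-read on every access."""

    def __init__(self, conf):
        self._conf = conf

    @property
    def zone_id(self):
        return self._conf.time_zone


conf = FakeConf("UTC")
cached, fresh = CachedFormatter(conf), FreshFormatter(conf)
first_cached, first_fresh = cached.zone_id, fresh.zone_id

# The (hypothetical) session setting changes after the first access.
conf.time_zone = "America/Los_Angeles"
later_cached, later_fresh = cached.zone_id, fresh.zone_id
```

In this sketch `later_cached` still holds the zone seen at first access, while `later_fresh` reflects the updated setting, which is the behaviour the PR title asks for.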
[GitHub] [spark] HyukjinKwon commented on issue #28025: [SPARK-31186][PySpark][SQL] toPandas should not fail on duplicate column names
HyukjinKwon commented on issue #28025: [SPARK-31186][PySpark][SQL] toPandas should not fail on duplicate column names URL: https://github.com/apache/spark/pull/28025#issuecomment-604180059 Looks good. A couple of questions.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28025: [SPARK-31186][PySpark][SQL] toPandas should not fail on duplicate column names
HyukjinKwon commented on a change in pull request #28025: [SPARK-31186][PySpark][SQL] toPandas should not fail on duplicate column names URL: https://github.com/apache/spark/pull/28025#discussion_r398270416

## File path: python/pyspark/sql/pandas/conversion.py ##
@@ -132,25 +132,36 @@ def toPandas(self):
         # Below is toPandas without Arrow optimization.
         pdf = pd.DataFrame.from_records(self.collect(), columns=self.columns)

-        dtype = {}
-        for field in self.schema:
+        dtype = [None] * len(self.schema)
+        for fieldIdx in range(len(self.schema)):
+            field = self.schema[fieldIdx]
+            pandas_col = pdf.iloc[:, fieldIdx]
+
             pandas_type = PandasConversionMixin._to_corrected_pandas_type(field.dataType)
             # SPARK-21766: if an integer field is nullable and has null values, it can be
             # inferred by pandas as float column. Once we convert the column with NaN back
             # to integer type e.g., np.int16, we will hit exception. So we use the inferred
             # float type, not the corrected type from the schema in this case.
             if pandas_type is not None and \
                 not(isinstance(field.dataType, IntegralType) and field.nullable and
-                    pdf[field.name].isnull().any()):
-                dtype[field.name] = pandas_type
+                    pandas_col.isnull().any()):
+                dtype[fieldIdx] = pandas_type
             # Ensure we fall back to nullable numpy types, even when whole column is null:
-            if isinstance(field.dataType, IntegralType) and pdf[field.name].isnull().any():
-                dtype[field.name] = np.float64
-            if isinstance(field.dataType, BooleanType) and pdf[field.name].isnull().any():
-                dtype[field.name] = np.object
+            if isinstance(field.dataType, IntegralType) and pandas_col.isnull().any():
+                dtype[fieldIdx] = np.float64
+            if isinstance(field.dataType, BooleanType) and pandas_col.isnull().any():
+                dtype[fieldIdx] = np.object
+
+        df = pd.DataFrame()
+        for index in range(len(dtype)):
+            t = dtype[index]
+            if t is not None:
+                series = pdf.iloc[:, index].astype(t, copy=False)
+            else:
+                series = pdf.iloc[:, index]
+            df.insert(index, self.schema[index].name, series, allow_duplicates=True)

-        for f, t in dtype.items():
-            pdf[f] = pdf[f].astype(t, copy=False)

Review comment: @viirya, out of curiosity, doesn't it work?

```python
for index, t in enumerate(dtype):
    if t is not None:
        pdf.iloc[:, index] = pdf.iloc[:, index].astype(t, copy=False)
```
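The reason the diff replaces the name-keyed `dtype` dict with a position-indexed list can be illustrated without pandas. The following is a minimal sketch, not code from the PR: `field_names`, `corrected_types`, `by_name`, and `by_position` are hypothetical names, and the type strings are placeholders rather than real NumPy dtypes.

```python
# Hypothetical schema with a duplicated column name, the case SPARK-31186 targets.
field_names = ["id", "value", "value"]
# Placeholder "corrected" type for each field, in schema order.
corrected_types = ["int64", "float64", "object"]

# Keyed by name (the old approach): the second "value" entry
# silently overwrites the first, so one column's type is lost.
by_name = {}
for name, t in zip(field_names, corrected_types):
    by_name[name] = t

# Indexed by position (the new approach): one slot per column,
# regardless of whether names repeat.
by_position = [None] * len(field_names)
for idx, t in enumerate(corrected_types):
    by_position[idx] = t
```

Here `by_name` ends up with two entries for three columns, while `by_position` keeps all three, which is why both the diff and the suggested `enumerate` loop address columns through `iloc` rather than by name.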
[GitHub] [spark] AmplabJenkins removed a comment on issue #28028: [SPARK-31259][CORE] Fix log error of curRequestSize in ShuffleBlockFetcherIterator
AmplabJenkins removed a comment on issue #28028: [SPARK-31259][CORE] Fix log error of curRequestSize in ShuffleBlockFetcherIterator URL: https://github.com/apache/spark/pull/28028#issuecomment-604189159 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25095/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28028: [SPARK-31259][CORE] Fix log error of curRequestSize in ShuffleBlockFetcherIterator
AmplabJenkins removed a comment on issue #28028: [SPARK-31259][CORE] Fix log error of curRequestSize in ShuffleBlockFetcherIterator URL: https://github.com/apache/spark/pull/28028#issuecomment-604189153 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28028: [SPARK-31259][CORE] Fix log error of curRequestSize in ShuffleBlockFetcherIterator
AmplabJenkins commented on issue #28028: [SPARK-31259][CORE] Fix log error of curRequestSize in ShuffleBlockFetcherIterator URL: https://github.com/apache/spark/pull/28028#issuecomment-604189153 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28028: [SPARK-31259][CORE] Fix log error of curRequestSize in ShuffleBlockFetcherIterator
AmplabJenkins commented on issue #28028: [SPARK-31259][CORE] Fix log error of curRequestSize in ShuffleBlockFetcherIterator URL: https://github.com/apache/spark/pull/28028#issuecomment-604189159 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25095/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28017: [MINOR][DOCS] Fix some links for python api doc
AmplabJenkins removed a comment on issue #28017: [MINOR][DOCS] Fix some links for python api doc URL: https://github.com/apache/spark/pull/28017#issuecomment-604192138 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc
AmplabJenkins commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc URL: https://github.com/apache/spark/pull/28017#issuecomment-604192138 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc
AmplabJenkins commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc URL: https://github.com/apache/spark/pull/28017#issuecomment-604192145 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120385/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc
SparkQA commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc URL: https://github.com/apache/spark/pull/28017#issuecomment-604191935 **[Test build #120385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120385/testReport)** for PR 28017 at commit [`a1fbefe`](https://github.com/apache/spark/commit/a1fbefe03cc6a3e8eacb92ee0103845d26c491f6).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28017: [MINOR][DOCS] Fix some links for python api doc
AmplabJenkins removed a comment on issue #28017: [MINOR][DOCS] Fix some links for python api doc URL: https://github.com/apache/spark/pull/28017#issuecomment-604192145 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120385/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #28017: [MINOR][DOCS] Fix some links for python api doc
SparkQA removed a comment on issue #28017: [MINOR][DOCS] Fix some links for python api doc URL: https://github.com/apache/spark/pull/28017#issuecomment-604184213 **[Test build #120385 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120385/testReport)** for PR 28017 at commit [`a1fbefe`](https://github.com/apache/spark/commit/a1fbefe03cc6a3e8eacb92ee0103845d26c491f6).
[GitHub] [spark] AmplabJenkins commented on issue #28005: [DO-NOT-MERGE] Check if SPARK-31231 is fixed in setuptools side
AmplabJenkins commented on issue #28005: [DO-NOT-MERGE] Check if SPARK-31231 is fixed in setuptools side URL: https://github.com/apache/spark/pull/28005#issuecomment-604199929 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #28005: [DO-NOT-MERGE] Check if SPARK-31231 is fixed in setuptools side
AmplabJenkins commented on issue #28005: [DO-NOT-MERGE] Check if SPARK-31231 is fixed in setuptools side URL: https://github.com/apache/spark/pull/28005#issuecomment-604199937 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120383/ Test FAILed.
[GitHub] [spark] HyukjinKwon edited a comment on issue #27989: [SPARK-31228][DSTREAMS] Add version information to the configuration of Kafka
HyukjinKwon edited a comment on issue #27989: [SPARK-31228][DSTREAMS] Add version information to the configuration of Kafka URL: https://github.com/apache/spark/pull/27989#issuecomment-604207880 Oh, okay, about options, right? Sure, it's best to add the versions, I guess. Let's do it separately, though: technically, documenting versions in options is orthogonal to documenting them in configurations.
[GitHub] [spark] SparkQA commented on issue #28027: [SPARK-31255][SQL] Add SupportsMetadataColumns to DSv2 (WIP)
SparkQA commented on issue #28027: [SPARK-31255][SQL] Add SupportsMetadataColumns to DSv2 (WIP) URL: https://github.com/apache/spark/pull/28027#issuecomment-604208444 **[Test build #120378 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120378/testReport)** for PR 28027 at commit [`0a5c7ca`](https://github.com/apache/spark/commit/0a5c7caebb807f2d256ce696ba8112acc24292b7). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` implicit class MetadataColumnsHelper(metadata: Array[MetadataColumn]) `
[GitHub] [spark] SparkQA removed a comment on issue #28027: [SPARK-31255][SQL] Add SupportsMetadataColumns to DSv2 (WIP)
SparkQA removed a comment on issue #28027: [SPARK-31255][SQL] Add SupportsMetadataColumns to DSv2 (WIP) URL: https://github.com/apache/spark/pull/28027#issuecomment-604126136 **[Test build #120378 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120378/testReport)** for PR 28027 at commit [`0a5c7ca`](https://github.com/apache/spark/commit/0a5c7caebb807f2d256ce696ba8112acc24292b7).
[GitHub] [spark] HyukjinKwon removed a comment on issue #27989: [SPARK-31228][DSTREAMS] Add version information to the configuration of Kafka
HyukjinKwon removed a comment on issue #27989: [SPARK-31228][DSTREAMS] Add version information to the configuration of Kafka URL: https://github.com/apache/spark/pull/27989#issuecomment-604207880 Oh, okay, about options, right? Sure, it's best to add the versions, I guess. Let's do it separately, though: technically, documenting versions in options is orthogonal to documenting them in configurations.
[GitHub] [spark] dbtsai commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SPARK-31060][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
dbtsai commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SPARK-31060][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet URL: https://github.com/apache/spark/pull/27728#discussion_r398302492

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetTest.scala ##

@@ -62,13 +62,21 @@ private[sql] trait ParquetTest extends FileBasedDataSourceTest {
       (data: Seq[T])
       (f: String => Unit): Unit = withDataSourceFile(data)(f)
 
+  protected def toDF[T <: Product: ClassTag: TypeTag](data: Seq[T]): DataFrame = {

Review comment: The error is

```scala
Error:(45, 20) in trait ParquetTest, multiple overloaded alternatives of method withParquetDataFrame define default arguments.
private[sql] trait ParquetTest extends FileBasedDataSourceTest {
```
[GitHub] [spark] Ngone51 commented on a change in pull request #28028: [SPARK-31259][CORE] Fix log error of curRequestSize in ShuffleBlockFetcherIterator
Ngone51 commented on a change in pull request #28028: [SPARK-31259][CORE] Fix log error of curRequestSize in ShuffleBlockFetcherIterator URL: https://github.com/apache/spark/pull/28028#discussion_r398307088

## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ##

@@ -329,27 +329,25 @@ final class ShuffleBlockFetcherIterator(
   private def createFetchRequest(
       blocks: Seq[FetchBlockInfo],
-      address: BlockManagerId,
-      curRequestSize: Long): FetchRequest = {
-    logDebug(s"Creating fetch request of $curRequestSize at $address "
+      address: BlockManagerId): FetchRequest = {
+    logDebug(s"Creating fetch request of ${blocks.map(_.size).sum} at $address "

Review comment: In non-batch mode, it is. But in batch mode, `blocks` here can be only a part of all the blocks, because they are grouped by `maxBlocksInFlightPerAddress` before `createFetchRequest` is called.
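The point about batch mode can be illustrated with a small sketch. It is plain Python rather than Scala, and `group_blocks`, `BLOCK_SIZES` and the limit of 2 are hypothetical stand-ins, not Spark code. It only assumes, as the review comment says, that blocks are split into groups of at most `maxBlocksInFlightPerAddress` before each request is created.

```python
def group_blocks(sizes, max_blocks_per_request):
    """Split block sizes into consecutive groups of at most max_blocks_per_request."""
    return [sizes[i:i + max_blocks_per_request]
            for i in range(0, len(sizes), max_blocks_per_request)]


# Hypothetical block sizes (in bytes) accumulated for one remote address.
BLOCK_SIZES = [100, 200, 300, 400, 500]

# What a running total kept by the caller would hold for all blocks together.
accumulated_size = sum(BLOCK_SIZES)

# Each group becomes its own fetch request.
requests = group_blocks(BLOCK_SIZES, max_blocks_per_request=2)

# Size every request from its own blocks, as the patched log line does.
request_sizes = [sum(group) for group in requests]

for group, size in zip(requests, request_sizes):
    print(f"Creating fetch request of {size} bytes for {len(group)} blocks")
```

Under these assumptions every request is smaller than the accumulated total, so logging the accumulated value for each request would overstate its size.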
[GitHub] [spark] SparkQA commented on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite
SparkQA commented on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite URL: https://github.com/apache/spark/pull/25748#issuecomment-604222600 **[Test build #120392 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120392/testReport)** for PR 25748 at commit [`fe68184`](https://github.com/apache/spark/commit/fe68184f6decdeca2969d1be48ffaa71fc1acacb).
[GitHub] [spark] turboFei edited a comment on issue #28030: [SPARK-31263][SHUFFLE] Enable yarn shuffle service to close the idle connections
turboFei edited a comment on issue #28030: [SPARK-31263][SHUFFLE] Enable yarn shuffle service to close the idle connections URL: https://github.com/apache/spark/pull/28030#issuecomment-604220710 Just keeping it consistent with: https://github.com/apache/spark/blob/b024a8a69e4ae45c6ded3dd3f9f27e73a0069891/core/src/main/scala/org/apache/spark/deploy/ExternalShuffleService.scala#L107
[GitHub] [spark] AmplabJenkins removed a comment on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite
AmplabJenkins removed a comment on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite URL: https://github.com/apache/spark/pull/25748#issuecomment-604226923 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120392/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite
AmplabJenkins removed a comment on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite URL: https://github.com/apache/spark/pull/25748#issuecomment-604226921 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite
SparkQA commented on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite URL: https://github.com/apache/spark/pull/25748#issuecomment-604226821 **[Test build #120392 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120392/testReport)** for PR 25748 at commit [`fe68184`](https://github.com/apache/spark/commit/fe68184f6decdeca2969d1be48ffaa71fc1acacb). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite
AmplabJenkins commented on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite URL: https://github.com/apache/spark/pull/25748#issuecomment-604226923 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120392/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite
AmplabJenkins commented on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite URL: https://github.com/apache/spark/pull/25748#issuecomment-604226921 Merged build finished. Test PASSed.
[GitHub] [spark] dongjoon-hyun commented on issue #27665: [SPARK-30623][Core] Spark external shuffle allow disable of separate event loop group
dongjoon-hyun commented on issue #27665: [SPARK-30623][Core] Spark external shuffle allow disable of separate event loop group URL: https://github.com/apache/spark/pull/27665#issuecomment-604227335 Hi, @cloud-fan. This does not seem to be in `branch-3.0` yet.
[GitHub] [spark] SparkQA removed a comment on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite
SparkQA removed a comment on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite URL: https://github.com/apache/spark/pull/25748#issuecomment-604222600 **[Test build #120392 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120392/testReport)** for PR 25748 at commit [`fe68184`](https://github.com/apache/spark/commit/fe68184f6decdeca2969d1be48ffaa71fc1acacb).
[GitHub] [spark] SparkQA commented on issue #27969: [SPARK-31170][SQL][test-hive1.2] Spark SQL Cli should respect hive-site.xml and spark.sql.warehouse.dir
SparkQA commented on issue #27969: [SPARK-31170][SQL][test-hive1.2] Spark SQL Cli should respect hive-site.xml and spark.sql.warehouse.dir URL: https://github.com/apache/spark/pull/27969#issuecomment-604236864 **[Test build #120395 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120395/testReport)** for PR 27969 at commit [`7501d2c`](https://github.com/apache/spark/commit/7501d2ce9d5c61e1daccd67077228ba8caf2ef31).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27969: [SPARK-31170][SQL][test-hive1.2] Spark SQL Cli should respect hive-site.xml and spark.sql.warehouse.dir
AmplabJenkins removed a comment on issue #27969: [SPARK-31170][SQL][test-hive1.2] Spark SQL Cli should respect hive-site.xml and spark.sql.warehouse.dir URL: https://github.com/apache/spark/pull/27969#issuecomment-604237223 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25105/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27969: [SPARK-31170][SQL][test-hive1.2] Spark SQL Cli should respect hive-site.xml and spark.sql.warehouse.dir
AmplabJenkins commented on issue #27969: [SPARK-31170][SQL][test-hive1.2] Spark SQL Cli should respect hive-site.xml and spark.sql.warehouse.dir URL: https://github.com/apache/spark/pull/27969#issuecomment-604237223 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25105/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27969: [SPARK-31170][SQL][test-hive1.2] Spark SQL Cli should respect hive-site.xml and spark.sql.warehouse.dir
AmplabJenkins commented on issue #27969: [SPARK-31170][SQL][test-hive1.2] Spark SQL Cli should respect hive-site.xml and spark.sql.warehouse.dir URL: https://github.com/apache/spark/pull/27969#issuecomment-604237221 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27969: [SPARK-31170][SQL][test-hive1.2] Spark SQL Cli should respect hive-site.xml and spark.sql.warehouse.dir
AmplabJenkins removed a comment on issue #27969: [SPARK-31170][SQL][test-hive1.2] Spark SQL Cli should respect hive-site.xml and spark.sql.warehouse.dir URL: https://github.com/apache/spark/pull/27969#issuecomment-604237221 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #28028: [SPARK-31259][CORE] Fix log message about fetch request size in ShuffleBlockFetcherIterator
SparkQA commented on issue #28028: [SPARK-31259][CORE] Fix log message about fetch request size in ShuffleBlockFetcherIterator URL: https://github.com/apache/spark/pull/28028#issuecomment-604240932 **[Test build #120397 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120397/testReport)** for PR 28028 at commit [`5342fd7`](https://github.com/apache/spark/commit/5342fd7f9c02edb9ec8854a9fc03db44ff0c99c8).
[GitHub] [spark] dongjoon-hyun commented on issue #27665: [SPARK-30623][Core] Spark external shuffle allow disable of separate event loop group
dongjoon-hyun commented on issue #27665: [SPARK-30623][Core] Spark external shuffle allow disable of separate event loop group URL: https://github.com/apache/spark/pull/27665#issuecomment-604240855 BTW, @xuanyuanking, could you answer the question in the review comment below? - https://github.com/apache/spark/pull/27665#discussion_r398328411
[GitHub] [spark] AmplabJenkins removed a comment on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite
AmplabJenkins removed a comment on issue #25748: [SPARK-28904][K8S][TESTS] Create mount for PvTestSuite URL: https://github.com/apache/spark/pull/25748#issuecomment-604240730 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25102/ Test FAILed.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27665: [SPARK-30623][Core] Spark external shuffle allow disable of separate event loop group
dongjoon-hyun commented on a change in pull request #27665: [SPARK-30623][Core] Spark external shuffle allow disable of separate event loop group URL: https://github.com/apache/spark/pull/27665#discussion_r398328411

## File path: common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java ##

@@ -339,12 +341,25 @@ public int chunkFetchHandlerThreads() {
       return 0;
     }
     int chunkFetchHandlerThreadsPercent =
-      conf.getInt("spark.shuffle.server.chunkFetchHandlerThreadsPercent", 100);

Review comment: What do you mean by `the config must be set`, @xuanyuanking? What value do you expect by default? This change also appears to revert SPARK-25641 without mentioning it; the PR only mentions SPARK-24355.

> No need to give a default value here, when it comes to here, the config must be set.
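The question about the default can be made concrete with a rough sketch of the two behaviours under discussion. It is plain Python, not the Java in `TransportConf`; the function name, the dictionary-based `conf`, and the rounding-up step are assumptions of this sketch. Only the config key and the default of 100 come from the removed line in the diff.

```python
import math

KEY = "spark.shuffle.server.chunkFetchHandlerThreadsPercent"


def chunk_fetch_handler_threads(conf, server_threads, default_percent=100):
    """Return a handler thread count as a percentage of server_threads.

    conf is a plain dict standing in for the real configuration object.
    Pass default_percent=None to model "the config must be set".
    """
    raw = conf.get(KEY)
    if raw is None:
        if default_percent is None:
            # No fallback: a missing key is an error.
            raise KeyError(f"{KEY} must be set")
        percent = default_percent
    else:
        percent = int(raw)
    # Rounding up is an assumption of this sketch.
    return math.ceil(server_threads * percent / 100)
```

With the fallback, `chunk_fetch_handler_threads({}, 8)` uses all 8 threads; with `default_percent=None` the same call raises, which is the behavioural difference the reviewer is asking about.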
[GitHub] [spark] cloud-fan commented on a change in pull request #28016: [SPARK-31238][SQL][test-hive1.2] Rebase dates to/from Julian calendar in write/read for ORC datasource
cloud-fan commented on a change in pull request #28016: [SPARK-31238][SQL][test-hive1.2] Rebase dates to/from Julian calendar in write/read for ORC datasource URL: https://github.com/apache/spark/pull/28016#discussion_r398329137

## File path: sql/core/v1.2/src/main/java/org/apache/spark/sql/execution/datasources/orc/OrcColumnVector.java ##

@@ -130,7 +138,13 @@ public short getShort(int rowId) {
   @Override
   public int getInt(int rowId) {
-    return (int) longData.vector[getRowIndex(rowId)];
+    int index = getRowIndex(rowId);
+    int value = (int) longData.vector[index];

Review comment: nit: `int value = (int) longData.vector[getRowIndex(rowId)];`
[GitHub] [spark] AmplabJenkins commented on issue #28028: [SPARK-31259][CORE] Fix log message about fetch request size in ShuffleBlockFetcherIterator
AmplabJenkins commented on issue #28028: [SPARK-31259][CORE] Fix log message about fetch request size in ShuffleBlockFetcherIterator URL: https://github.com/apache/spark/pull/28028#issuecomment-604241230 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25106/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28028: [SPARK-31259][CORE] Fix log message about fetch request size in ShuffleBlockFetcherIterator
AmplabJenkins commented on issue #28028: [SPARK-31259][CORE] Fix log message about fetch request size in ShuffleBlockFetcherIterator URL: https://github.com/apache/spark/pull/28028#issuecomment-604241225 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28028: [SPARK-31259][CORE] Fix log message about fetch request size in ShuffleBlockFetcherIterator
AmplabJenkins removed a comment on issue #28028: [SPARK-31259][CORE] Fix log message about fetch request size in ShuffleBlockFetcherIterator URL: https://github.com/apache/spark/pull/28028#issuecomment-604241225 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27695: [SPARK-30949][K8S][CORE] decouple requests and parallelism on kubernetes drivers
SparkQA commented on issue #27695: [SPARK-30949][K8S][CORE] decouple requests and parallelism on kubernetes drivers URL: https://github.com/apache/spark/pull/27695#issuecomment-604240942 **[Test build #120398 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120398/testReport)** for PR 27695 at commit [`2b3ad5b`](https://github.com/apache/spark/commit/2b3ad5bff2db4aa1f0c49503c3bffbb230cb).
[GitHub] [spark] AmplabJenkins removed a comment on issue #28028: [SPARK-31259][CORE] Fix log message about fetch request size in ShuffleBlockFetcherIterator
AmplabJenkins removed a comment on issue #28028: [SPARK-31259][CORE] Fix log message about fetch request size in ShuffleBlockFetcherIterator URL: https://github.com/apache/spark/pull/28028#issuecomment-604241230 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25106/ Test PASSed.
[GitHub] [spark] MaxGekk commented on a change in pull request #28016: [SPARK-31238][SQL][test-hive1.2] Rebase dates to/from Julian calendar in write/read for ORC datasource
MaxGekk commented on a change in pull request #28016: [SPARK-31238][SQL][test-hive1.2] Rebase dates to/from Julian calendar in write/read for ORC datasource URL: https://github.com/apache/spark/pull/28016#discussion_r398331763

## File path: sql/core/v1.2/src/main/java/org/apache/spark/sql/execution/datasources/orc/OrcColumnVector.java ##

@@ -42,6 +43,7 @@
   private DecimalColumnVector decimalData;
   private TimestampColumnVector timestampData;

Review comment: Yes, it does, but:
1. `DateColumnVector` doesn't have a method similar to `asScratchTimestamp` in `TimestampColumnVector`.
2. We don't need to build `java.sql.Date` from serialized days to perform rebasing. It is unnecessary overhead.
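The second point, that rebasing can work directly on the serialized day count, can be sketched with integer arithmetic only. This is a plain-Python illustration, not the Spark implementation: the function names are hypothetical, it uses the standard Julian Day Number formula, it does not treat the ten skipped labels 1582-10-05 to 1582-10-14 specially, and it only accepts days that Python's `datetime.date` can represent (year 1 onwards).

```python
from datetime import date

EPOCH_ORDINAL = date(1970, 1, 1).toordinal()  # proleptic Gregorian ordinal of the epoch
EPOCH_JDN = 2440588                           # Julian Day Number of 1970-01-01
# Epoch day of 1582-10-15, the first day of the Gregorian calendar.
CUTOVER_DAYS = date(1582, 10, 15).toordinal() - EPOCH_ORDINAL


def julian_calendar_to_jdn(year, month, day):
    """Julian Day Number of a date expressed in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def rebase_gregorian_to_julian_days(days):
    """Map proleptic Gregorian epoch days to hybrid-calendar epoch days
    that denote the same local date (year, month, day)."""
    if days >= CUTOVER_DAYS:
        return days  # both calendars agree from 1582-10-15 onwards
    # Read the local date label in the proleptic Gregorian calendar ...
    local = date.fromordinal(EPOCH_ORDINAL + days)
    # ... and re-interpret the same label in the Julian calendar.
    return julian_calendar_to_jdn(local.year, local.month, local.day) - EPOCH_JDN
```

No date object from the storage layer is needed: the input and output are both plain day counts, and the only intermediate value is the (year, month, day) label.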
[GitHub] [spark] HyukjinKwon commented on issue #28015: [SPARK-31244][K8S][TEST] Use Minio instead of Ceph in K8S DepsTestsSuite
HyukjinKwon commented on issue #28015: [SPARK-31244][K8S][TEST] Use Minio instead of Ceph in K8S DepsTestsSuite URL: https://github.com/apache/spark/pull/28015#issuecomment-604172033 I haven't taken a close look, but it looks good. +1
[GitHub] [spark] maropu commented on issue #28018: [SPARK-30758][SQL][TESTS][FOLLOWUP] Fix bug tests imported bracketed comments
maropu commented on issue #28018: [SPARK-30758][SQL][TESTS][FOLLOWUP] Fix bug tests imported bracketed comments URL: https://github.com/apache/spark/pull/28018#issuecomment-604175237 I think this issue is worth filing a new JIRA.
[GitHub] [spark] yaooqinn commented on issue #28020: [SPARK-31258][BUILD] Pin the avro version in SBT
yaooqinn commented on issue #28020: [SPARK-31258][BUILD] Pin the avro version in SBT URL: https://github.com/apache/spark/pull/28020#issuecomment-604178225 @dongjoon-hyun @HyukjinKwon, thanks. I have filed the JIRA and updated sbt-dependencyTree.
[GitHub] [spark] HyukjinKwon commented on issue #28020: [SPARK-31258][BUILD] Pin the avro version in SBT
HyukjinKwon commented on issue #28020: [SPARK-31258][BUILD] Pin the avro version in SBT URL: https://github.com/apache/spark/pull/28020#issuecomment-604180663 Looks good, given that many people have hit this randomly in unidoc. I checked that the avro version is 1.8.2 from branch-2.4 to master.
[GitHub] [spark] viirya commented on a change in pull request #28025: [SPARK-31186][PySpark][SQL] toPandas should not fail on duplicate column names
viirya commented on a change in pull request #28025: [SPARK-31186][PySpark][SQL] toPandas should not fail on duplicate column names URL: https://github.com/apache/spark/pull/28025#discussion_r398271547
## File path: python/pyspark/sql/pandas/conversion.py ##
@@ -132,25 +132,36 @@ def toPandas(self):
         # Below is toPandas without Arrow optimization.
         pdf = pd.DataFrame.from_records(self.collect(), columns=self.columns)

-        dtype = {}
-        for field in self.schema:
+        dtype = [None] * len(self.schema)
+        for fieldIdx in range(len(self.schema)):
+            field = self.schema[fieldIdx]
+            pandas_col = pdf.iloc[:, fieldIdx]
+
             pandas_type = PandasConversionMixin._to_corrected_pandas_type(field.dataType)
             # SPARK-21766: if an integer field is nullable and has null values, it can be
             # inferred by pandas as float column. Once we convert the column with NaN back
             # to integer type e.g., np.int16, we will hit exception. So we use the inferred
             # float type, not the corrected type from the schema in this case.
             if pandas_type is not None and \
                 not(isinstance(field.dataType, IntegralType) and field.nullable and
-                    pdf[field.name].isnull().any()):
-                dtype[field.name] = pandas_type
+                    pandas_col.isnull().any()):
+                dtype[fieldIdx] = pandas_type
             # Ensure we fall back to nullable numpy types, even when whole column is null:
-            if isinstance(field.dataType, IntegralType) and pdf[field.name].isnull().any():
-                dtype[field.name] = np.float64
-            if isinstance(field.dataType, BooleanType) and pdf[field.name].isnull().any():
-                dtype[field.name] = np.object
+            if isinstance(field.dataType, IntegralType) and pandas_col.isnull().any():
+                dtype[fieldIdx] = np.float64
+            if isinstance(field.dataType, BooleanType) and pandas_col.isnull().any():
+                dtype[fieldIdx] = np.object
+
+        df = pd.DataFrame()
+        for index in range(len(dtype)):
+            t = dtype[index]
Review comment: ok
[GitHub] [spark] viirya commented on a change in pull request #28025: [SPARK-31186][PySpark][SQL] toPandas should not fail on duplicate column names
viirya commented on a change in pull request #28025: [SPARK-31186][PySpark][SQL] toPandas should not fail on duplicate column names URL: https://github.com/apache/spark/pull/28025#discussion_r398271448
## File path: python/pyspark/sql/pandas/conversion.py ##
@@ -132,25 +132,36 @@ def toPandas(self):
         # Below is toPandas without Arrow optimization.
         pdf = pd.DataFrame.from_records(self.collect(), columns=self.columns)

-        dtype = {}
-        for field in self.schema:
+        dtype = [None] * len(self.schema)
+        for fieldIdx in range(len(self.schema)):
+            field = self.schema[fieldIdx]
+            pandas_col = pdf.iloc[:, fieldIdx]
+
             pandas_type = PandasConversionMixin._to_corrected_pandas_type(field.dataType)
             # SPARK-21766: if an integer field is nullable and has null values, it can be
             # inferred by pandas as float column. Once we convert the column with NaN back
             # to integer type e.g., np.int16, we will hit exception. So we use the inferred
             # float type, not the corrected type from the schema in this case.
             if pandas_type is not None and \
                 not(isinstance(field.dataType, IntegralType) and field.nullable and
-                    pdf[field.name].isnull().any()):
-                dtype[field.name] = pandas_type
+                    pandas_col.isnull().any()):
+                dtype[fieldIdx] = pandas_type
             # Ensure we fall back to nullable numpy types, even when whole column is null:
-            if isinstance(field.dataType, IntegralType) and pdf[field.name].isnull().any():
-                dtype[field.name] = np.float64
-            if isinstance(field.dataType, BooleanType) and pdf[field.name].isnull().any():
-                dtype[field.name] = np.object
+            if isinstance(field.dataType, IntegralType) and pandas_col.isnull().any():
+                dtype[fieldIdx] = np.float64
+            if isinstance(field.dataType, BooleanType) and pandas_col.isnull().any():
+                dtype[fieldIdx] = np.object
+
+        df = pd.DataFrame()
+        for index in range(len(dtype)):
+            t = dtype[index]
+            if t is not None:
+                series = pdf.iloc[:, index].astype(t, copy=False)
+            else:
+                series = pdf.iloc[:, index]
+            df.insert(index, self.schema[index].name, series, allow_duplicates=True)

-        for f, t in dtype.items():
-            pdf[f] = pdf[f].astype(t, copy=False)
Review comment: Yeah, that does not work; I'm not sure why.
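Note on the diff quoted above: it replaces a `dtype` dict keyed by `field.name` with a list indexed by column position. A plausible reading is that a dict cannot hold two entries for the same column name, so duplicate names collapse into one. The following is only a minimal sketch in plain Python (no pandas or Spark involved); the schema, column names, and type strings are made up for illustration.

```python
# Hypothetical schema with a duplicated column name, as (name, target type) pairs.
schema = [("v", "int64"), ("v", "float64"), ("w", "object")]

# Keyed by name: the second "v" silently overwrites the first entry.
dtype_by_name = {}
for name, target in schema:
    dtype_by_name[name] = target

# Indexed by position: one slot per column, so duplicates stay separate.
dtype_by_index = [None] * len(schema)
for idx, (_, target) in enumerate(schema):
    dtype_by_index[idx] = target

print(dtype_by_name)
print(dtype_by_index)
```

The name-keyed mapping ends up with fewer entries than there are columns, while the positional list keeps one entry per column.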
[GitHub] [spark] HyukjinKwon closed pull request #28020: [SPARK-31258][BUILD] Pin the avro version in SBT
HyukjinKwon closed pull request #28020: [SPARK-31258][BUILD] Pin the avro version in SBT URL: https://github.com/apache/spark/pull/28020
[GitHub] [spark] AmplabJenkins removed a comment on issue #28017: [MINOR][DOCS] Fix some links for python api doc
AmplabJenkins removed a comment on issue #28017: [MINOR][DOCS] Fix some links for python api doc URL: https://github.com/apache/spark/pull/28017#issuecomment-604182369 Merged build finished. Test PASSed.
[GitHub] [spark] manuzhang commented on issue #27998: [SPARK-31219][YARN] Enable closeIdleConnections in YarnShuffleService
manuzhang commented on issue #27998: [SPARK-31219][YARN] Enable closeIdleConnections in YarnShuffleService URL: https://github.com/apache/spark/pull/27998#issuecomment-604182314
@xuanyuanking @tgravescs Here is the detailed timeline of our investigation.
1. We found connections on our clusters building up continuously (> 10k on some nodes). Is that normal? We don't think so.
2. We looked into the connections on one node and found a lot of half-open connections (connections that existed on only one node).
3. We also checked that those connections were very old (> 21 hours). (FYI, https://superuser.com/questions/565991/how-to-determine-the-socket-connection-up-time-on-linux)
4. Looking at the code, `TransportContext` registers an `IdleStateHandler`, which should fire an `IdleStateEvent` on timeout. We took a heap dump of the `YarnShuffleService` and checked the attributes of `IdleStateHandler`. It turned out that `firstAllIdleEvent` was already `false` for many `IdleStateHandler`s, so the `IdleStateEvent` had already been fired.
5. Finally, we realized the `IdleStateEvent` would not be handled, since `closeIdleConnections` is hardcoded to `false` for `YarnShuffleService`.
The above is based on what we've seen and know. Please correct me if anything in my understanding is wrong or inaccurate.
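Note on the timeline above: the behaviour described in steps 4 and 5 (an idle event fires, but the connection stays open because closing is disabled) can be sketched as a small simulation. This is not Spark or Netty code; `Connection`, `sweep_idle`, and all parameter names are hypothetical and only model the reasoning in the comment.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """Hypothetical stand-in for a network channel; not a Netty or Spark class."""
    name: str
    last_activity: float
    open: bool = True
    idle_event_fired: bool = False  # hypothetical flag: set once an idle event has fired


def sweep_idle(connections, now, timeout, close_idle_connections):
    """Fire an idle event for each timed-out connection; close it only if the flag is set."""
    fired = []
    for conn in connections:
        if conn.open and now - conn.last_activity >= timeout:
            conn.idle_event_fired = True
            fired.append(conn.name)
            if close_idle_connections:
                conn.open = False
    return fired


# With closing disabled, the stale connection is detected but never released.
conns = [Connection("stale", last_activity=0.0), Connection("fresh", last_activity=90.0)]
fired = sweep_idle(conns, now=100.0, timeout=60.0, close_idle_connections=False)
print(fired, [c.open for c in conns])
```

In this model, passing `close_idle_connections=True` is the only change needed for the stale connection to be closed, which matches the direction the PR title suggests.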
[GitHub] [spark] AmplabJenkins removed a comment on issue #28017: [MINOR][DOCS] Fix some links for python api doc
AmplabJenkins removed a comment on issue #28017: [MINOR][DOCS] Fix some links for python api doc URL: https://github.com/apache/spark/pull/28017#issuecomment-604182374 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25094/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc
AmplabJenkins commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc URL: https://github.com/apache/spark/pull/28017#issuecomment-604182369 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc
SparkQA commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc URL: https://github.com/apache/spark/pull/28017#issuecomment-604181975 **[Test build #120384 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120384/testReport)** for PR 28017 at commit [`a1fbefe`](https://github.com/apache/spark/commit/a1fbefe03cc6a3e8eacb92ee0103845d26c491f6).
[GitHub] [spark] yaooqinn commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc
yaooqinn commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc URL: https://github.com/apache/spark/pull/28017#issuecomment-604181984 retest this please
[GitHub] [spark] SparkQA commented on issue #28028: [SPARK-31259][CORE] Fix log error of curRequestSize in ShuffleBlockFetcherIterator
SparkQA commented on issue #28028: [SPARK-31259][CORE] Fix log error of curRequestSize in ShuffleBlockFetcherIterator URL: https://github.com/apache/spark/pull/28028#issuecomment-604188651 **[Test build #120386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120386/testReport)** for PR 28028 at commit [`5342fd7`](https://github.com/apache/spark/commit/5342fd7f9c02edb9ec8854a9fc03db44ff0c99c8).
[GitHub] [spark] HyukjinKwon commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc
HyukjinKwon commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc URL: https://github.com/apache/spark/pull/28017#issuecomment-604196006 retest this please
[GitHub] [spark] HyukjinKwon commented on issue #28005: [DO-NOT-MERGE] Check if SPARK-31231 is fixed in setuptools side
HyukjinKwon commented on issue #28005: [DO-NOT-MERGE] Check if SPARK-31231 is fixed in setuptools side URL: https://github.com/apache/spark/pull/28005#issuecomment-604206859 Okay, apparently it's not fixed yet. I am going back to the original fix and will merge it.
[GitHub] [spark] beliefer commented on a change in pull request #28018: [SPARK-31262][SQL][TESTS][FOLLOWUP] Fix bug tests imported bracketed comments
beliefer commented on a change in pull request #28018: [SPARK-31262][SQL][TESTS][FOLLOWUP] Fix bug tests imported bracketed comments URL: https://github.com/apache/spark/pull/28018#discussion_r398296786
## File path: sql/core/src/test/resources/sql-tests/inputs/comments.sql ##
@@ -1,12 +1,5 @@
 -- Test comments.
-
--- the first case of bracketed comment
---QUERY-DELIMITER-START
-/* This is the first example of bracketed comment.
-SELECT 'ommented out content' AS first;
-*/
-SELECT 'selected content' AS first;
---QUERY-DELIMITER-END
+--IMPORT nested-comments.sql
Review comment: OK. I have added the output example.
[GitHub] [spark] beliefer commented on issue #28018: [SPARK-31262][SQL][TESTS][FOLLOWUP] Fix bug tests imported bracketed comments
beliefer commented on issue #28018: [SPARK-31262][SQL][TESTS][FOLLOWUP] Fix bug tests imported bracketed comments URL: https://github.com/apache/spark/pull/28018#issuecomment-604207020
> I think this issue is worth filing a new jira.
OK. I created a new JIRA and updated the title of this PR.
[GitHub] [spark] AmplabJenkins commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications
AmplabJenkins commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications URL: https://github.com/apache/spark/pull/28009#issuecomment-604219243 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28009: [SPARK-31235][YARN] Separates different categories of applications
AmplabJenkins removed a comment on issue #28009: [SPARK-31235][YARN] Separates different categories of applications URL: https://github.com/apache/spark/pull/28009#issuecomment-604219243 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications
AmplabJenkins commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications URL: https://github.com/apache/spark/pull/28009#issuecomment-604219251 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120390/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications
SparkQA commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications URL: https://github.com/apache/spark/pull/28009#issuecomment-604218784 **[Test build #120390 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120390/testReport)** for PR 28009 at commit [`2064ce1`](https://github.com/apache/spark/commit/2064ce167d8a2148dd97d224aabe82c21a52de23).
[GitHub] [spark] AmplabJenkins removed a comment on issue #28009: [SPARK-31235][YARN] Separates different categories of applications
AmplabJenkins removed a comment on issue #28009: [SPARK-31235][YARN] Separates different categories of applications URL: https://github.com/apache/spark/pull/28009#issuecomment-604219173 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25100/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications
AmplabJenkins commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications URL: https://github.com/apache/spark/pull/28009#issuecomment-604219173 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25100/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications
AmplabJenkins commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications URL: https://github.com/apache/spark/pull/28009#issuecomment-604219168 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28009: [SPARK-31235][YARN] Separates different categories of applications
AmplabJenkins removed a comment on issue #28009: [SPARK-31235][YARN] Separates different categories of applications URL: https://github.com/apache/spark/pull/28009#issuecomment-604219168 Merged build finished. Test PASSed.
[GitHub] [spark] wang-zhun commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications
wang-zhun commented on issue #28009: [SPARK-31235][YARN] Separates different categories of applications URL: https://github.com/apache/spark/pull/28009#issuecomment-604219023 @jiangxb1987 @tgravescs Thanks for your responses and suggestions.
[GitHub] [spark] dongjoon-hyun commented on issue #27588: [BACKPORT] Backport of [SPARK-20628][CORE][K8S] Start to improve decommissioning
dongjoon-hyun commented on issue #27588: [BACKPORT] Backport of [SPARK-20628][CORE][K8S] Start to improve decommissioning URL: https://github.com/apache/spark/pull/27588#issuecomment-604221764 Retest this please
[GitHub] [spark] dbtsai commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SPARK-31060][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
dbtsai commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SPARK-31060][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet URL: https://github.com/apache/spark/pull/27728#discussion_r398314267
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ##
@@ -2049,6 +2049,17 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)

+  val NESTED_PREDICATE_PUSHDOWN_ENABLED =
+    buildConf("spark.sql.optimizer.nestedPredicatePushdown.enabled")
+      .internal()
+      .doc("When true, Spark tries to push down predicates for nested columns and or names " +
+        "containing `dots` to data sources. Currently, Parquet implements both optimizations " +
+        "while ORC only supports predicates for names containing `dots`. The other data sources" +
+        "don't support this feature yet.")
+      .version("3.0.0")
+      .booleanConf
+      .createWithDefault(true)
Review comment: Since the filter APIs will be enhanced to support nested columns and column names containing `dots`, it will be nice to introduce this in a major release. It's a good idea! We can turn this feature on for specific data sources in a separate PR. This PR has already grown too big. Thanks!
[GitHub] [spark] cloud-fan commented on issue #28003: [SPARK-31234][SQL] ResetCommand should reset config to sc.conf only
cloud-fan commented on issue #28003: [SPARK-31234][SQL] ResetCommand should reset config to sc.conf only URL: https://github.com/apache/spark/pull/28003#issuecomment-604243019 I think this is clearly a bug. If users set a static SQL config in the config file, this should not be affected by SET or RESET.
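Note on the comment above: the expectation it states (values from the config file survive both SET and RESET) can be sketched with a toy model. This is not Spark's `SQLConf` or `ResetCommand`; `SessionConf`, its methods, and the config keys below are hypothetical and only illustrate the intended semantics as read from the PR title and comment.

```python
class SessionConf:
    """Hypothetical model of a session config; not Spark's SQLConf."""

    def __init__(self, startup, static_keys):
        self._startup = dict(startup)          # values loaded at startup, e.g. from a config file
        self._static = frozenset(static_keys)  # keys that must never change at runtime
        self._current = dict(startup)

    def get(self, key, default=None):
        return self._current.get(key, default)

    def set(self, key, value):
        if key in self._static:
            raise ValueError(f"cannot modify static config: {key}")
        self._current[key] = value

    def reset(self):
        # Restore the startup values instead of clearing everything,
        # so settings from the config file survive a reset.
        self._current = dict(self._startup)


conf = SessionConf({"static.key": "a", "runtime.key": "1"}, static_keys=["static.key"])
conf.set("runtime.key", "2")
conf.set("other.key", "x")
conf.reset()
print(conf.get("static.key"), conf.get("runtime.key"), conf.get("other.key"))
```

The design choice in this sketch is that `reset` copies the startup snapshot rather than emptying the map, which is what keeps file-provided values intact.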
[GitHub] [spark] AmplabJenkins removed a comment on issue #27588: [BACKPORT] Backport of [SPARK-20628][CORE][K8S] Start to improve decommissioning
AmplabJenkins removed a comment on issue #27588: [BACKPORT] Backport of [SPARK-20628][CORE][K8S] Start to improve decommissioning URL: https://github.com/apache/spark/pull/27588#issuecomment-604242125 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25101/ Test FAILed.
[GitHub] [spark] beliefer commented on a change in pull request #28018: [SPARK-30758][SQL][TESTS][FOLLOWUP] Fix bug tests imported bracketed comments
beliefer commented on a change in pull request #28018: [SPARK-30758][SQL][TESTS][FOLLOWUP] Fix bug tests imported bracketed comments URL: https://github.com/apache/spark/pull/28018#discussion_r398261109
## File path: sql/core/src/test/resources/sql-tests/inputs/comments.sql ##
@@ -1,12 +1,5 @@
 -- Test comments.
-
--- the first case of bracketed comment
---QUERY-DELIMITER-START
-/* This is the first example of bracketed comment.
-SELECT 'ommented out content' AS first;
-*/
-SELECT 'selected content' AS first;
---QUERY-DELIMITER-END
+--IMPORT nested-comments.sql
Review comment: I updated the description to `Golden files can't display the bracketed comments in imported test cases`.
[GitHub] [spark] SparkQA commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc
SparkQA commented on issue #28017: [MINOR][DOCS] Fix some links for python api doc URL: https://github.com/apache/spark/pull/28017#issuecomment-604190858 **[Test build #120384 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120384/testReport)** for PR 28017 at commit [`a1fbefe`](https://github.com/apache/spark/commit/a1fbefe03cc6a3e8eacb92ee0103845d26c491f6).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] dbtsai commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SPARK-31060][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
dbtsai commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SPARK-31060][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet URL: https://github.com/apache/spark/pull/27728#discussion_r398280601
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetTest.scala ##
@@ -62,13 +62,21 @@ private[sql] trait ParquetTest extends FileBasedDataSourceTest {
       (data: Seq[T])
       (f: String => Unit): Unit = withDataSourceFile(data)(f)

+  protected def toDF[T <: Product: ClassTag: TypeTag](data: Seq[T]): DataFrame = {
Review comment: I was thinking of doing so, but surprisingly, overloading
```scala
protected def withParquetDataFrame[T <: Product: ClassTag: TypeTag]
    (data: Seq[T], testVectorized: Boolean = true)
    (f: DataFrame => Unit): Unit
```
and
```scala
protected def withParquetDataFrame(df: DataFrame, testVectorized: Boolean = true)
    (f: DataFrame => Unit): Unit
```
is not allowed in Scala.
[GitHub] [spark] SparkQA commented on issue #28029: [SPARK-31261][SQL] Avoid npe when reading bad csv input with `columnNameCorruptRecord` specified
SparkQA commented on issue #28029: [SPARK-31261][SQL] Avoid npe when reading bad csv input with `columnNameCorruptRecord` specified URL: https://github.com/apache/spark/pull/28029#issuecomment-604198014 **[Test build #120387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120387/testReport)** for PR 28029 at commit [`bdc3d77`](https://github.com/apache/spark/commit/bdc3d77fd4d44fd71e9fa3a1e6c85aacba7871a6).