[jira] [Created] (SPARK-23169) Run lintr on the changes of lint-r script and .lintr configuration

2018-01-21 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-23169: Summary: Run lintr on the changes of lint-r script and .lintr configuration Key: SPARK-23169 URL: https://issues.apache.org/jira/browse/SPARK-23169 Project: Spark

[jira] [Assigned] (SPARK-23169) Run lintr on the changes of lint-r script and .lintr configuration

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23169: Assignee: Apache Spark > Run lintr on the changes of lint-r script and .lintr configuratio

[jira] [Assigned] (SPARK-23169) Run lintr on the changes of lint-r script and .lintr configuration

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23169: Assignee: (was: Apache Spark) > Run lintr on the changes of lint-r script and .lintr c

[jira] [Commented] (SPARK-23169) Run lintr on the changes of lint-r script and .lintr configuration

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333427#comment-16333427 ] Apache Spark commented on SPARK-23169: -- User 'HyukjinKwon' has created a pull reques

[jira] [Commented] (SPARK-21293) R document update structured streaming

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333438#comment-16333438 ] Apache Spark commented on SPARK-21293: -- User 'felixcheung' has created a pull reques

[jira] [Assigned] (SPARK-21293) R document update structured streaming

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21293: Assignee: Felix Cheung (was: Apache Spark) > R document update structured streaming > ---

[jira] [Assigned] (SPARK-21293) R document update structured streaming

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21293: Assignee: Apache Spark (was: Felix Cheung) > R document update structured streaming > ---

[jira] [Commented] (SPARK-23050) Structured Streaming with S3 file source duplicates data because of eventual consistency.

2018-01-21 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333477#comment-16333477 ] Steve Loughran commented on SPARK-23050: there's one thing which worries me here:

[jira] [Commented] (SPARK-23167) Update TPCDS queries from v1.4 to v2.7 (latest)

2018-01-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333533#comment-16333533 ] Xiao Li commented on SPARK-23167: - [~maropu] Could you add a new suite for TPC-DS 2.7? Th

[jira] [Commented] (SPARK-23168) Hints for fact tables and unique columns

2018-01-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333534#comment-16333534 ] Xiao Li commented on SPARK-23168: - This is part of https://issues.apache.org/jira/browse/

[jira] [Created] (SPARK-23170) Dump the statistics of effective runs of analyzer and optimizer rules

2018-01-21 Thread Xiao Li (JIRA)
Xiao Li created SPARK-23170: --- Summary: Dump the statistics of effective runs of analyzer and optimizer rules Key: SPARK-23170 URL: https://issues.apache.org/jira/browse/SPARK-23170 Project: Spark

[jira] [Commented] (SPARK-23170) Dump the statistics of effective runs of analyzer and optimizer rules

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333535#comment-16333535 ] Apache Spark commented on SPARK-23170: -- User 'gatorsmile' has created a pull request

[jira] [Assigned] (SPARK-23170) Dump the statistics of effective runs of analyzer and optimizer rules

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23170: Assignee: Xiao Li (was: Apache Spark) > Dump the statistics of effective runs of analyzer

[jira] [Assigned] (SPARK-23170) Dump the statistics of effective runs of analyzer and optimizer rules

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23170: Assignee: Apache Spark (was: Xiao Li) > Dump the statistics of effective runs of analyzer

[jira] [Commented] (SPARK-23167) Update TPCDS queries from v1.4 to v2.7 (latest)

2018-01-21 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333536#comment-16333536 ] Takeshi Yamamuro commented on SPARK-23167: -- ok, will do. > Update TPCDS queries

[jira] [Commented] (SPARK-23171) Reduce the time costs of the rule runs that do not change the plans

2018-01-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333540#comment-16333540 ] Xiao Li commented on SPARK-23171: - cc [~maropu] > Reduce the time costs of the rule runs

[jira] [Created] (SPARK-23171) Reduce the time costs of the rule runs that do not change the plans

2018-01-21 Thread Xiao Li (JIRA)
Xiao Li created SPARK-23171: --- Summary: Reduce the time costs of the rule runs that do not change the plans Key: SPARK-23171 URL: https://issues.apache.org/jira/browse/SPARK-23171 Project: Spark I

[jira] [Comment Edited] (SPARK-23171) Reduce the time costs of the rule runs that do not change the plans

2018-01-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333540#comment-16333540 ] Xiao Li edited comment on SPARK-23171 at 1/21/18 2:24 PM: -- cc [~

[jira] [Updated] (SPARK-23166) Add maxDF Parameter to CountVectorizer

2018-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-23166: -- Priority: Minor (was: Major) Seems fine; open a pull request. > Add maxDF Parameter to CountVectorize

[jira] [Assigned] (SPARK-22119) Add cosine distance to KMeans

2018-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-22119: - Assignee: Marco Gaido > Add cosine distance to KMeans > - > >

[jira] [Resolved] (SPARK-22119) Add cosine distance to KMeans

2018-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-22119. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 19340 [https://github.co

[jira] [Resolved] (SPARK-23156) Code of method "initialize(I)V" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection" grows beyond 64 KB

2018-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-23156. --- Resolution: Duplicate > Code of method "initialize(I)V" of class > "org.apache.spark.sql.catalyst.e

[jira] [Commented] (SPARK-23171) Reduce the time costs of the rule runs that do not change the plans

2018-01-21 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333551#comment-16333551 ] Takeshi Yamamuro commented on SPARK-23171: -- ok, I'll check code based on these m

[jira] [Commented] (SPARK-23167) Update TPCDS queries from v1.4 to v2.7 (latest)

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333563#comment-16333563 ] Apache Spark commented on SPARK-23167: -- User 'maropu' has created a pull request for

[jira] [Assigned] (SPARK-23167) Update TPCDS queries from v1.4 to v2.7 (latest)

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23167: Assignee: (was: Apache Spark) > Update TPCDS queries from v1.4 to v2.7 (latest) >

[jira] [Assigned] (SPARK-23167) Update TPCDS queries from v1.4 to v2.7 (latest)

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23167: Assignee: Apache Spark > Update TPCDS queries from v1.4 to v2.7 (latest) > ---

[jira] [Commented] (SPARK-23168) Hints for fact tables and unique columns

2018-01-21 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333572#comment-16333572 ] Takeshi Yamamuro commented on SPARK-23168: -- ok > Hints for fact tables and uniq

[jira] [Updated] (SPARK-23168) Hints for fact tables and unique columns

2018-01-21 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-23168: - Issue Type: Sub-task (was: New Feature) Parent: SPARK-19842 > Hints for fact tab

[jira] [Commented] (SPARK-19842) Informational Referential Integrity Constraints Support in Spark

2018-01-21 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333576#comment-16333576 ] Takeshi Yamamuro commented on SPARK-19842: -- What's the status of this tickets no

[jira] [Commented] (SPARK-8294) Break down large methods in YARN code

2018-01-21 Thread Sebastian Piu (JIRATEST)
[ https://issues-test.apache.org/jira/browse/SPARK-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16265593#comment-16265593 ] Sebastian Piu commented on SPARK-8294: -- I had a quick look and all those methods

[jira] [Commented] (SPARK-8294) Break down large methods in YARN code

2018-01-21 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333615#comment-16333615 ] Hudson commented on SPARK-8294: --- [ https://issues-test.apache.org/jira/browse/SPARK-8294?pa

[jira] [Created] (SPARK-23172) Respect Project nodes in ReorderJoin

2018-01-21 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-23172: Summary: Respect Project nodes in ReorderJoin Key: SPARK-23172 URL: https://issues.apache.org/jira/browse/SPARK-23172 Project: Spark Issue Type: Impr

[jira] [Commented] (SPARK-23172) Respect Project nodes in ReorderJoin

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333627#comment-16333627 ] Apache Spark commented on SPARK-23172: -- User 'maropu' has created a pull request for

[jira] [Assigned] (SPARK-23172) Respect Project nodes in ReorderJoin

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23172: Assignee: (was: Apache Spark) > Respect Project nodes in ReorderJoin > ---

[jira] [Assigned] (SPARK-23172) Respect Project nodes in ReorderJoin

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23172: Assignee: Apache Spark > Respect Project nodes in ReorderJoin > --

[jira] [Resolved] (SPARK-21293) R document update structured streaming

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-21293. -- Resolution: Fixed Fix Version/s: 2.3.0 Target Version/s: 2.3.0 > R document up

[jira] [Commented] (SPARK-20906) Constrained Logistic Regression for SparkR

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333654#comment-16333654 ] Felix Cheung commented on SPARK-20906: -- [~wm624] would you like to add example of th

[jira] [Commented] (SPARK-22208) Improve percentile_approx by not rounding up targetError and starting from index 0

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333659#comment-16333659 ] Felix Cheung commented on SPARK-22208: -- Is this documented in the SQL programming gu

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333663#comment-16333663 ] Felix Cheung commented on SPARK-20307: -- for SPARK-20307 and SPARK-21381, do you thin

[jira] [Updated] (SPARK-22208) Improve percentile_approx by not rounding up targetError and starting from index 0

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22208: - Labels: releasenotes (was: ) > Improve percentile_approx by not rounding up targetError and star

[jira] [Comment Edited] (SPARK-20906) Constrained Logistic Regression for SparkR

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333654#comment-16333654 ] Felix Cheung edited comment on SPARK-20906 at 1/21/18 8:54 PM:

[jira] [Commented] (SPARK-23115) SparkR 2.3 QA: New R APIs and API docs

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333677#comment-16333677 ] Felix Cheung commented on SPARK-23115: -- Another pass, we should add API doc for SPA

[jira] [Commented] (SPARK-22208) Improve percentile_approx by not rounding up targetError and starting from index 0

2018-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333678#comment-16333678 ] Sean Owen commented on SPARK-22208: --- It's a bug fix, and more of a corner case of behav

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2018-01-21 Thread Joseph Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333682#comment-16333682 ] Joseph Wang commented on SPARK-20307: - Hi Felix, I can do that but I have a family em

[jira] [Commented] (SPARK-19842) Informational Referential Integrity Constraints Support in Spark

2018-01-21 Thread Ioana Delaney (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333703#comment-16333703 ] Ioana Delaney commented on SPARK-19842: --- The benefits of this work is that it opens

[jira] [Commented] (SPARK-23118) SparkR 2.3 QA: Programming guide, migration guide, vignettes updates

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333707#comment-16333707 ] Felix Cheung commented on SPARK-23118: -- for programming guide, perhaps SPARK-20906

[jira] [Resolved] (SPARK-23118) SparkR 2.3 QA: Programming guide, migration guide, vignettes updates

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-23118. -- Resolution: Fixed Assignee: Felix Cheung Fix Version/s: 2.3.0 > SparkR 2.3 QA:

[jira] [Commented] (SPARK-23108) ML, Graph 2.3 QA: API: Experimental, DeveloperApi, final, sealed audit

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333708#comment-16333708 ] Felix Cheung commented on SPARK-23108: -- From reviewing R, it would be good to docum

[jira] [Comment Edited] (SPARK-20906) Constrained Logistic Regression for SparkR

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333654#comment-16333654 ] Felix Cheung edited comment on SPARK-20906 at 1/21/18 10:30 PM: ---

[jira] [Commented] (SPARK-23107) ML, Graph 2.3 QA: API: New Scala APIs, docs

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333712#comment-16333712 ] Felix Cheung commented on SPARK-23107: -- We don't have doc on RFormula but it'll be g

[jira] [Comment Edited] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333682#comment-16333682 ] Felix Cheung edited comment on SPARK-20307 at 1/21/18 10:40 PM: ---

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333716#comment-16333716 ] Felix Cheung commented on SPARK-20307: -- I think [~wm624] if you have the time > Spa

[jira] [Commented] (SPARK-23117) SparkR 2.3 QA: Check for new R APIs requiring example code

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333718#comment-16333718 ] Felix Cheung commented on SPARK-23117: -- I did a pass, I think these could use an exa

[jira] [Commented] (SPARK-23116) SparkR 2.3 QA: Update user guide for new features & APIs

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333717#comment-16333717 ] Felix Cheung commented on SPARK-23116: -- I did a pass. > SparkR 2.3 QA: Update user

[jira] [Resolved] (SPARK-23116) SparkR 2.3 QA: Update user guide for new features & APIs

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-23116. -- Resolution: Fixed Assignee: Felix Cheung Fix Version/s: 2.3.0 > SparkR 2.3 QA:

[jira] [Comment Edited] (SPARK-23117) SparkR 2.3 QA: Check for new R APIs requiring example code

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333718#comment-16333718 ] Felix Cheung edited comment on SPARK-23117 at 1/21/18 10:47 PM: ---

[jira] [Commented] (SPARK-23114) Spark R 2.3 QA umbrella

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333725#comment-16333725 ] Felix Cheung commented on SPARK-23114: -- [~sameerag] Here are some ideas for the rel

[jira] [Commented] (SPARK-23114) Spark R 2.3 QA umbrella

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333730#comment-16333730 ] Felix Cheung commented on SPARK-23114: -- [~falaki] [~hyukjin.kwon] About SPARK-21093

[jira] [Comment Edited] (SPARK-23114) Spark R 2.3 QA umbrella

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333725#comment-16333725 ] Felix Cheung edited comment on SPARK-23114 at 1/21/18 11:02 PM: ---

[jira] [Comment Edited] (SPARK-23114) Spark R 2.3 QA umbrella

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333730#comment-16333730 ] Felix Cheung edited comment on SPARK-23114 at 1/21/18 11:03 PM: ---

[jira] [Commented] (SPARK-21727) Operating on an ArrayType in a SparkR DataFrame throws error

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333733#comment-16333733 ] Felix Cheung commented on SPARK-21727: -- how are we doing? > Operating on an ArrayTy

[jira] [Comment Edited] (SPARK-23107) ML, Graph 2.3 QA: API: New Scala APIs, docs

2018-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333712#comment-16333712 ] Felix Cheung edited comment on SPARK-23107 at 1/21/18 11:08 PM: ---

[jira] [Commented] (SPARK-23114) Spark R 2.3 QA umbrella

2018-01-21 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333755#comment-16333755 ] Hyukjin Kwon commented on SPARK-23114: -- [~felixcheung], I maybe misunderstood but yo

[jira] [Assigned] (SPARK-23169) Run lintr on the changes of lint-r script and .lintr configuration

2018-01-21 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-23169: Assignee: Hyukjin Kwon > Run lintr on the changes of lint-r script and .lintr configuratio

[jira] [Resolved] (SPARK-23169) Run lintr on the changes of lint-r script and .lintr configuration

2018-01-21 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-23169. -- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20339 [https://git

[jira] [Created] (SPARK-23173) from_json can produce nulls for fields which are marked as non-nullable

2018-01-21 Thread Herman van Hovell (JIRA)
Herman van Hovell created SPARK-23173: - Summary: from_json can produce nulls for fields which are marked as non-nullable Key: SPARK-23173 URL: https://issues.apache.org/jira/browse/SPARK-23173 Pro

[jira] [Updated] (SPARK-20947) Encoding/decoding issue in PySpark pipe implementation

2018-01-21 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-20947: - Fix Version/s: 2.4.0 > Encoding/decoding issue in PySpark pipe implementation > -

[jira] [Resolved] (SPARK-20947) Encoding/decoding issue in PySpark pipe implementation

2018-01-21 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-20947. -- Resolution: Fixed Fixed in https://github.com/apache/spark/pull/18277 > Encoding/decoding issu

[jira] [Assigned] (SPARK-20947) Encoding/decoding issue in PySpark pipe implementation

2018-01-21 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-20947: Assignee: Xiaozhe Wang > Encoding/decoding issue in PySpark pipe implementation >

[jira] [Commented] (SPARK-22320) ORC should support VectorUDT/MatrixUDT

2018-01-21 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333784#comment-16333784 ] Dongjoon Hyun commented on SPARK-22320: --- For this one, Parquet saves the original s

[jira] [Commented] (SPARK-22320) ORC should support VectorUDT/MatrixUDT

2018-01-21 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333787#comment-16333787 ] Dongjoon Hyun commented on SPARK-22320: --- With the above workaround, I think this se

[jira] [Updated] (SPARK-22320) ORC should support VectorUDT/MatrixUDT

2018-01-21 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-22320: -- Priority: Minor (was: Major) > ORC should support VectorUDT/MatrixUDT > --

[jira] [Commented] (SPARK-11222) Add style checker rules to validate doc tests aren't included in docs

2018-01-21 Thread Rekha Joshi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333786#comment-16333786 ] Rekha Joshi commented on SPARK-11222: - I have raised the doctest bank line as an [i

[jira] [Comment Edited] (SPARK-22320) ORC should support VectorUDT/MatrixUDT

2018-01-21 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333787#comment-16333787 ] Dongjoon Hyun edited comment on SPARK-22320 at 1/22/18 2:18 AM: ---

[jira] [Created] (SPARK-23174) Fix pep8 to latest official version

2018-01-21 Thread Rekha Joshi (JIRA)
Rekha Joshi created SPARK-23174: --- Summary: Fix pep8 to latest official version Key: SPARK-23174 URL: https://issues.apache.org/jira/browse/SPARK-23174 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-23174) Fix pep8 to latest official version

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23174: Assignee: Apache Spark > Fix pep8 to latest official version > ---

[jira] [Commented] (SPARK-23174) Fix pep8 to latest official version

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333793#comment-16333793 ] Apache Spark commented on SPARK-23174: -- User 'rekhajoshm' has created a pull request

[jira] [Assigned] (SPARK-23174) Fix pep8 to latest official version

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23174: Assignee: (was: Apache Spark) > Fix pep8 to latest official version >

[jira] [Created] (SPARK-23175) Type conversion does not make sense under case like select ’0.1’ = 0

2018-01-21 Thread Shaoquan Zhang (JIRA)
Shaoquan Zhang created SPARK-23175: -- Summary: Type conversion does not make sense under case like select ’0.1’ = 0 Key: SPARK-23175 URL: https://issues.apache.org/jira/browse/SPARK-23175 Project: Spa

[jira] [Updated] (SPARK-23173) from_json can produce nulls for fields which are marked as non-nullable

2018-01-21 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-23173: -- Description: The {{from_json}} function uses a schema to convert a string into a Spark

[jira] [Updated] (SPARK-23173) from_json can produce nulls for fields which are marked as non-nullable

2018-01-21 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-23173: -- Description: The {{from_json}} function uses a schema to convert a string into a Spark

[jira] [Commented] (SPARK-20129) JavaSparkContext should use SparkContext.getOrCreate

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333808#comment-16333808 ] Apache Spark commented on SPARK-20129: -- User 'rekhajoshm' has created a pull request

[jira] [Commented] (SPARK-23050) Structured Streaming with S3 file source duplicates data because of eventual consistency.

2018-01-21 Thread Yash Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333809#comment-16333809 ] Yash Sharma commented on SPARK-23050: - Hi [~ste...@apache.org], Thanks for bringing t

[jira] [Comment Edited] (SPARK-23114) Spark R 2.3 QA umbrella

2018-01-21 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333755#comment-16333755 ] Hyukjin Kwon edited comment on SPARK-23114 at 1/22/18 3:05 AM:

[jira] [Resolved] (SPARK-22808) saveAsTable() should be marked as deprecated

2018-01-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-22808. - Resolution: Duplicate > saveAsTable() should be marked as deprecated > --

[jira] [Commented] (SPARK-23173) from_json can produce nulls for fields which are marked as non-nullable

2018-01-21 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333855#comment-16333855 ] Hyukjin Kwon commented on SPARK-23173: -- I believe this one is related with SPARK-177

[jira] [Commented] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset

2018-01-21 Thread Gaurav (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333857#comment-16333857 ] Gaurav commented on SPARK-18016: In sequence, df .groupBy({color:#008000}"col1"{color})

[jira] [Comment Edited] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset

2018-01-21 Thread Gaurav Garg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333857#comment-16333857 ] Gaurav Garg edited comment on SPARK-18016 at 1/22/18 4:27 AM: -

[jira] [Resolved] (SPARK-23026) Add RegisterUDF to PySpark

2018-01-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23026. - Resolution: Won't Fix > Add RegisterUDF to PySpark > -- > > Key:

[jira] [Commented] (SPARK-23084) Add unboundedPreceding(), unboundedFollowing() and currentRow() to PySpark

2018-01-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333860#comment-16333860 ] Xiao Li commented on SPARK-23084: - Yeah. Please go ahead. > Add unboundedPreceding(), un

[jira] [Resolved] (SPARK-22976) Worker cleanup can remove running driver directories

2018-01-21 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao resolved SPARK-22976. - Resolution: Fixed Fix Version/s: 2.3.0 > Worker cleanup can remove running driver director

[jira] [Commented] (SPARK-23081) Add colRegex API to PySpark

2018-01-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333861#comment-16333861 ] Xiao Li commented on SPARK-23081: - Yeah. Please go ahead. > Add colRegex API to PySpark

[jira] [Assigned] (SPARK-22976) Worker cleanup can remove running driver directories

2018-01-21 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao reassigned SPARK-22976: --- Assignee: Russell Spitzer > Worker cleanup can remove running driver directories > -

[jira] [Resolved] (SPARK-22838) Avoid unnecessary copying of data

2018-01-21 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao resolved SPARK-22838. - Resolution: Invalid > Avoid unnecessary copying of data > - > >

[jira] [Resolved] (SPARK-23000) Flaky test suite DataSourceWithHiveMetastoreCatalogSuite in Spark 2.3

2018-01-21 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal resolved SPARK-23000. Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by [https://github.com/apache/

[jira] [Resolved] (SPARK-23175) Type conversion does not make sense under case like select ’0.1’ = 0

2018-01-21 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang resolved SPARK-23175. - Resolution: Duplicate > Type conversion does not make sense under case like select ’0.1’ = 0 > --

[jira] [Commented] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset

2018-01-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333897#comment-16333897 ] Kazuaki Ishizaki commented on SPARK-18016: -- Thanks, I will look at this. > Code

[jira] [Commented] (SPARK-23122) Deprecate register* for UDFs in SQLContext and Catalog in PySpark

2018-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333911#comment-16333911 ] Apache Spark commented on SPARK-23122: -- User 'gatorsmile' has created a pull request

[jira] [Commented] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset

2018-01-21 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333916#comment-16333916 ] Ruslan Dautkhanov commented on SPARK-18016: --- In Spark 2.2 I have the same issue

[jira] [Updated] (SPARK-23122) Deprecate register* for UDFs in SQLContext and Catalog in PySpark

2018-01-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23122: Description: Deprecate register* for UDFs in SQLContext and Catalog in PySpark Seems we allow many other w
