[jira] [Comment Edited] (SPARK-16917) Spark streaming kafka version compatibility.

2016-08-11 Thread Alexey Zotov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418384#comment-15418384 ] Alexey Zotov edited comment on SPARK-16917 at 8/12/16 5:24 AM: --- [~sowen]

[jira] [Commented] (SPARK-16917) Spark streaming kafka version compatibility.

2016-08-11 Thread Alexey Zotov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418384#comment-15418384 ] Alexey Zotov commented on SPARK-16917: -- [~sowen] [~c...@koeninger.org] It really seems to be

[jira] [Commented] (SPARK-16975) Spark-2.0.0 unable to infer schema for parquet data written by Spark-1.6.2

2016-08-11 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418364#comment-15418364 ] Dongjoon Hyun commented on SPARK-16975: --- Hi, [~rxin]. Could you review this PR? > Spark-2.0.0

[jira] [Assigned] (SPARK-17019) Expose off-heap memory usage in various places

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17019: Assignee: Apache Spark > Expose off-heap memory usage in various places >

[jira] [Assigned] (SPARK-17019) Expose off-heap memory usage in various places

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17019: Assignee: (was: Apache Spark) > Expose off-heap memory usage in various places >

[jira] [Commented] (SPARK-17019) Expose off-heap memory usage in various places

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418305#comment-15418305 ] Apache Spark commented on SPARK-17019: -- User 'jerryshao' has created a pull request for this issue:

[jira] [Updated] (SPARK-16434) Avoid record-per type dispatch in JSON when reading

2016-08-11 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-16434: Assignee: Hyukjin Kwon > Avoid record-per type dispatch in JSON when reading >

[jira] [Resolved] (SPARK-16434) Avoid record-per type dispatch in JSON when reading

2016-08-11 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-16434. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14102

[jira] [Resolved] (SPARK-13081) Allow set pythonExec of driver and executor through configuration

2016-08-11 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-13081. Resolution: Fixed Assignee: Jeff Zhang Fix Version/s: 2.1.0 > Allow set

[jira] [Commented] (SPARK-16955) Using ordinals in ORDER BY causes an analysis error when the query has a GROUP BY clause using ordinals

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418288#comment-15418288 ] Apache Spark commented on SPARK-16955: -- User 'clockfly' has created a pull request for this issue:

[jira] [Commented] (SPARK-6235) Address various 2G limits

2016-08-11 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418225#comment-15418225 ] Guoqiang Li commented on SPARK-6235: I'm doing this work and I'll put the patch in this month. >

[jira] [Commented] (SPARK-17029) Dataset toJSON goes through RDD form instead of transforming dataset itself

2016-08-11 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418149#comment-15418149 ] Andrew Ash commented on SPARK-17029: Note RDD form usage from

[jira] [Assigned] (SPARK-17029) Dataset toJSON goes through RDD form instead of transforming dataset itself

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17029: Assignee: (was: Apache Spark) > Dataset toJSON goes through RDD form instead of

[jira] [Commented] (SPARK-16578) Configurable hostname for RBackend

2016-08-11 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418143#comment-15418143 ] Miao Wang commented on SPARK-16578: --- OK. I will check with Junyang. > Configurable hostname for

[jira] [Commented] (SPARK-17029) Dataset toJSON goes through RDD form instead of transforming dataset itself

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418144#comment-15418144 ] Apache Spark commented on SPARK-17029: -- User 'robert3005' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17029) Dataset toJSON goes through RDD form instead of transforming dataset itself

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17029: Assignee: Apache Spark > Dataset toJSON goes through RDD form instead of transforming

[jira] [Closed] (SPARK-17028) Backport SI-9734 for Scala 2.10

2016-08-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu closed SPARK-17028. Resolution: Won't Fix > Backport SI-9734 for Scala 2.10 > --- > >

[jira] [Assigned] (SPARK-17027) PolynomialExpansion.choose is prone to integer overflow

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17027: Assignee: (was: Apache Spark) > PolynomialExpansion.choose is prone to integer

[jira] [Assigned] (SPARK-17027) PolynomialExpansion.choose is prone to integer overflow

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17027: Assignee: Apache Spark > PolynomialExpansion.choose is prone to integer overflow >

[jira] [Commented] (SPARK-16883) SQL decimal type is not properly cast to number when collecting SparkDataFrame

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418111#comment-15418111 ] Apache Spark commented on SPARK-16883: -- User 'wangmiao1981' has created a pull request for this

[jira] [Assigned] (SPARK-16883) SQL decimal type is not properly cast to number when collecting SparkDataFrame

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16883: Assignee: (was: Apache Spark) > SQL decimal type is not properly cast to number when

[jira] [Assigned] (SPARK-16883) SQL decimal type is not properly cast to number when collecting SparkDataFrame

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16883: Assignee: Apache Spark > SQL decimal type is not properly cast to number when collecting

[jira] [Commented] (SPARK-17027) PolynomialExpansion.choose is prone to integer overflow

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418112#comment-15418112 ] Apache Spark commented on SPARK-17027: -- User 'zero323' has created a pull request for this issue:

[jira] [Resolved] (SPARK-17026) warning msg in MulticlassMetricsSuite

2016-08-11 Thread Xin Ren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xin Ren resolved SPARK-17026. - Resolution: Not A Problem > warning msg in MulticlassMetricsSuite >

[jira] [Commented] (SPARK-16803) SaveAsTable does not work when source DataFrame is built on a Hive Table

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418096#comment-15418096 ] Apache Spark commented on SPARK-16803: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-17027) PolynomialExpansion.choose is prone to integer overflow

2016-08-11 Thread Maciej Szymkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418071#comment-15418071 ] Maciej Szymkiewicz edited comment on SPARK-17027 at 8/11/16 10:38 PM:

[jira] [Assigned] (SPARK-17028) Backport SI-9734 for Scala 2.10

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17028: Assignee: Apache Spark > Backport SI-9734 for Scala 2.10 >

[jira] [Commented] (SPARK-17028) Backport SI-9734 for Scala 2.10

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418068#comment-15418068 ] Apache Spark commented on SPARK-17028: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Resolved] (SPARK-17014) arithmetic.sql

2016-08-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17014. --- Resolution: Invalid Believe this was opened in error as a duplicate > arithmetic.sql >

[jira] [Assigned] (SPARK-17028) Backport SI-9734 for Scala 2.10

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17028: Assignee: (was: Apache Spark) > Backport SI-9734 for Scala 2.10 >

[jira] [Commented] (SPARK-17027) PolynomialExpansion.choose is prone to integer overflow

2016-08-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418065#comment-15418065 ] Sean Owen commented on SPARK-17027: --- Is the problem in the naive calculation of n choose k? {code}

[jira] [Created] (SPARK-17028) Backport SI-9734 for Scala 2.10

2016-08-11 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-17028: Summary: Backport SI-9734 for Scala 2.10 Key: SPARK-17028 URL: https://issues.apache.org/jira/browse/SPARK-17028 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-17027) PolynomialExpansion.choose is prone to integer overflow

2016-08-11 Thread Maciej Szymkiewicz (JIRA)
Maciej Szymkiewicz created SPARK-17027: -- Summary: PolynomialExpansion.choose is prone to integer overflow Key: SPARK-17027 URL: https://issues.apache.org/jira/browse/SPARK-17027 Project: Spark

[jira] [Resolved] (SPARK-17022) Potential deadlock in driver handling message

2016-08-11 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-17022. Resolution: Fixed Assignee: Tao Wang Fix Version/s: 2.1.0

[jira] [Resolved] (SPARK-16868) Executor will be both dead and alive when this executor reregister itself to driver.

2016-08-11 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-16868. Resolution: Fixed Assignee: carlmartin Fix Version/s: 2.1.0 > Executor

[jira] [Resolved] (SPARK-13602) o.a.s.deploy.worker.DriverRunner may leak the driver processes

2016-08-11 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-13602. Resolution: Fixed Assignee: Bryan Cutler Fix Version/s: 2.1.0 >

[jira] [Assigned] (SPARK-17026) warning msg in MulticlassMetricsSuite

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17026: Assignee: Apache Spark > warning msg in MulticlassMetricsSuite >

[jira] [Commented] (SPARK-17026) warning msg in MulticlassMetricsSuite

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417995#comment-15417995 ] Apache Spark commented on SPARK-17026: -- User 'keypointt' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17026) warning msg in MulticlassMetricsSuite

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17026: Assignee: (was: Apache Spark) > warning msg in MulticlassMetricsSuite >

[jira] [Created] (SPARK-17026) warning msg in MulticlassMetricsSuite

2016-08-11 Thread Xin Ren (JIRA)
Xin Ren created SPARK-17026: --- Summary: warning msg in MulticlassMetricsSuite Key: SPARK-17026 URL: https://issues.apache.org/jira/browse/SPARK-17026 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-17013) negative numeric literal parsing

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417941#comment-15417941 ] Apache Spark commented on SPARK-17013: -- User 'petermaxlee' has created a pull request for this

[jira] [Commented] (SPARK-3577) Add task metric to report spill time

2016-08-11 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417929#comment-15417929 ] Kay Ousterhout commented on SPARK-3577: --- I believe spill time will currently be displayed as part of

[jira] [Commented] (SPARK-16905) Support SQL DDL: MSCK REPAIR TABLE

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417928#comment-15417928 ] Apache Spark commented on SPARK-16905: -- User 'davies' has created a pull request for this issue:

[jira] [Resolved] (SPARK-17018) literals.sql for testing literal parsing

2016-08-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17018. - Resolution: Fixed Assignee: Peter Lee Fix Version/s: 2.1.0

[jira] [Commented] (SPARK-3577) Add task metric to report spill time

2016-08-11 Thread Tzach Zohar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417916#comment-15417916 ] Tzach Zohar commented on SPARK-3577: Does this mean that currently, spill time will be displayed as

[jira] [Commented] (SPARK-16784) Configurable log4j settings

2016-08-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417889#comment-15417889 ] Sean Owen commented on SPARK-16784: --- Oh, I really meant {{log4j.configuration}} to specify your own

[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417892#comment-15417892 ] Sean Owen commented on SPARK-16993: --- You would need to show some code or more about the error. >

[jira] [Comment Edited] (SPARK-16784) Configurable log4j settings

2016-08-11 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417857#comment-15417857 ] Michael Gummelt edited comment on SPARK-16784 at 8/11/16 8:11 PM: --

[jira] [Commented] (SPARK-16784) Configurable log4j settings

2016-08-11 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417856#comment-15417856 ] Michael Gummelt commented on SPARK-16784: - `log4j.debug=true` only results in log4j printing its

[jira] [Reopened] (SPARK-16784) Configurable log4j settings

2016-08-11 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Gummelt reopened SPARK-16784: - `log4j.debug=true` only results in log4j printing its debugging messages. It doesn't turn

[jira] [Comment Edited] (SPARK-16784) Configurable log4j settings

2016-08-11 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417857#comment-15417857 ] Michael Gummelt edited comment on SPARK-16784 at 8/11/16 8:10 PM: --

[jira] [Issue Comment Deleted] (SPARK-16784) Configurable log4j settings

2016-08-11 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Gummelt updated SPARK-16784: Comment: was deleted (was: `log4j.debug=true` only results in log4j printing its debugging

[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-11 Thread Dulaj Rajitha (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417847#comment-15417847 ] Dulaj Rajitha commented on SPARK-16993: --- But the thing is if I add a dummy column as the label

[jira] [Resolved] (SPARK-17024) Weird behaviour of the DataFrame when a column name contains dots.

2016-08-11 Thread Iaroslav Zeigerman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Iaroslav Zeigerman resolved SPARK-17024. Resolution: Duplicate > Weird behaviour of the DataFrame when a column name

[jira] [Commented] (SPARK-17024) Weird behaviour of the DataFrame when a column name contains dots.

2016-08-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417829#comment-15417829 ] Sean Owen commented on SPARK-17024: --- There are many issues that sound like this, like

[jira] [Comment Edited] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417788#comment-15417788 ] Nicholas Chammas edited comment on SPARK-17025 at 8/11/16 7:33 PM: --- cc

[jira] [Commented] (SPARK-17024) Weird behaviour of the DataFrame when a column name contains dots.

2016-08-11 Thread Iaroslav Zeigerman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417807#comment-15417807 ] Iaroslav Zeigerman commented on SPARK-17024: If I query this way (with backquotes for

[jira] [Comment Edited] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417788#comment-15417788 ] Nicholas Chammas edited comment on SPARK-17025 at 8/11/16 7:27 PM: --- cc

[jira] [Commented] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417788#comment-15417788 ] Nicholas Chammas commented on SPARK-17025: -- cc [~josephkb] [~mengxr] > Cannot persist PySpark

[jira] [Commented] (SPARK-17024) Weird behaviour of the DataFrame when a column name contains dots.

2016-08-11 Thread Iaroslav Zeigerman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417789#comment-15417789 ] Iaroslav Zeigerman commented on SPARK-17024: If this behaviour is expected, is there a way to

[jira] [Created] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2016-08-11 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-17025: Summary: Cannot persist PySpark ML Pipeline model that includes custom Transformer Key: SPARK-17025 URL: https://issues.apache.org/jira/browse/SPARK-17025

[jira] [Updated] (SPARK-17024) Weird behaviour of the DataFrame when a column name contains dots.

2016-08-11 Thread Iaroslav Zeigerman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Iaroslav Zeigerman updated SPARK-17024: --- Summary: Weird behaviour of the DataFrame when a column name contains dots. (was:

[jira] [Created] (SPARK-17024) Weird behaviour of the DataFrame when the column name contains dots.

2016-08-11 Thread Iaroslav Zeigerman (JIRA)
Iaroslav Zeigerman created SPARK-17024: -- Summary: Weird behaviour of the DataFrame when the column name contains dots. Key: SPARK-17024 URL: https://issues.apache.org/jira/browse/SPARK-17024

[jira] [Updated] (SPARK-17024) Weird behaviour of the DataFrame when a column name contains dots.

2016-08-11 Thread Iaroslav Zeigerman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Iaroslav Zeigerman updated SPARK-17024: --- Description: When a column name contains dots and one of the segments in a name is

[jira] [Resolved] (SPARK-17021) simplify the constructor parameters of QuantileSummaries

2016-08-11 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-17021. -- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14603

[jira] [Updated] (SPARK-17015) group-by-ordinal and order-by-ordinal test cases

2016-08-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17015: Fix Version/s: 2.0.1 > group-by-ordinal and order-by-ordinal test cases >

[jira] [Updated] (SPARK-17016) group-by/order-by ordinal should throw AnalysisException instead of UnresolvedException

2016-08-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17016: Fix Version/s: 2.0.1 > group-by/order-by ordinal should throw AnalysisException instead of >

[jira] [Assigned] (SPARK-17023) Update Kafka connetor to use Kafka 0.10.0.1

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17023: Assignee: (was: Apache Spark) > Update Kafka connetor to use Kafka 0.10.0.1 >

[jira] [Assigned] (SPARK-17023) Update Kafka connetor to use Kafka 0.10.0.1

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17023: Assignee: Apache Spark > Update Kafka connetor to use Kafka 0.10.0.1 >

[jira] [Commented] (SPARK-17023) Update Kafka connetor to use Kafka 0.10.0.1

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417622#comment-15417622 ] Apache Spark commented on SPARK-17023: -- User 'lresende' has created a pull request for this issue:

[jira] [Created] (SPARK-17023) Update Kafka connetor to use Kafka 0.10.0.1

2016-08-11 Thread Luciano Resende (JIRA)
Luciano Resende created SPARK-17023: --- Summary: Update Kafka connetor to use Kafka 0.10.0.1 Key: SPARK-17023 URL: https://issues.apache.org/jira/browse/SPARK-17023 Project: Spark Issue

[jira] [Commented] (SPARK-16577) Add check-cran script to Jenkins

2016-08-11 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417595#comment-15417595 ] Shivaram Venkataraman commented on SPARK-16577: --- Good point - Let me check this with

[jira] [Commented] (SPARK-16519) Handle SparkR RDD generics that create warnings in R CMD check

2016-08-11 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417592#comment-15417592 ] Shivaram Venkataraman commented on SPARK-16519: --- Yeah I think the simplest thing to do is

[jira] [Assigned] (SPARK-17022) Potential deadlock in driver handling message

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17022: Assignee: Apache Spark > Potential deadlock in driver handling message >

[jira] [Commented] (SPARK-17022) Potential deadlock in driver handling message

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417582#comment-15417582 ] Apache Spark commented on SPARK-17022: -- User 'WangTaoTheTonic' has created a pull request for this

[jira] [Assigned] (SPARK-17022) Potential deadlock in driver handling message

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17022: Assignee: (was: Apache Spark) > Potential deadlock in driver handling message >

[jira] [Resolved] (SPARK-16958) Reuse subqueries within single query

2016-08-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-16958. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14548

[jira] [Comment Edited] (SPARK-16519) Handle SparkR RDD generics that create warnings in R CMD check

2016-08-11 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417541#comment-15417541 ] Felix Cheung edited comment on SPARK-16519 at 8/11/16 4:40 PM: --- since we

[jira] [Commented] (SPARK-16519) Handle SparkR RDD generics that create warnings in R CMD check

2016-08-11 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417541#comment-15417541 ] Felix Cheung commented on SPARK-16519: -- since we are undecided on what to export for RDD, should we

[jira] [Comment Edited] (SPARK-16577) Add check-cran script to Jenkins

2016-08-11 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417536#comment-15417536 ] Felix Cheung edited comment on SPARK-16577 at 8/11/16 4:36 PM: --- I found

[jira] [Commented] (SPARK-16577) Add check-cran script to Jenkins

2016-08-11 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417536#comment-15417536 ] Felix Cheung commented on SPARK-16577: -- I found that to run the cran check on PDF it requires these

[jira] [Updated] (SPARK-16831) CrossValidator reports incorrect avgMetrics

2016-08-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16831: -- Fix Version/s: (was: 1.6.3) > CrossValidator reports incorrect avgMetrics >

[jira] [Created] (SPARK-17022) Potential deadlock in driver handling message

2016-08-11 Thread Tao Wang (JIRA)
Tao Wang created SPARK-17022: Summary: Potential deadlock in driver handling message Key: SPARK-17022 URL: https://issues.apache.org/jira/browse/SPARK-17022 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-17020) Materialization of RDD via DataFrame.rdd forces a poor re-distribution of data

2016-08-11 Thread Roi Reshef (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417301#comment-15417301 ] Roi Reshef edited comment on SPARK-17020 at 8/11/16 2:09 PM: - Nevertheless,

[jira] [Commented] (SPARK-17020) Materialization of RDD via DataFrame.rdd forces a poor re-distribution of data

2016-08-11 Thread Roi Reshef (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417301#comment-15417301 ] Roi Reshef commented on SPARK-17020: Nevertheless, any attempt to repartition the resulting RDD also

[jira] [Commented] (SPARK-17020) Materialization of RDD via DataFrame.rdd forces a poor re-distribution of data

2016-08-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417291#comment-15417291 ] Sean Owen commented on SPARK-17020: --- I see, I was asking because you show the results of caching a

[jira] [Commented] (SPARK-17020) Materialization of RDD via DataFrame.rdd forces a poor re-distribution of data

2016-08-11 Thread Roi Reshef (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417288#comment-15417288 ] Roi Reshef commented on SPARK-17020: The problem occurs only when calling **.rdd** on an

[jira] [Commented] (SPARK-17020) Materialization of RDD via DataFrame.rdd forces a poor re-distribution of data

2016-08-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417268#comment-15417268 ] Sean Owen commented on SPARK-17020: --- Yeah, after it's cached and the partitions are established, I'd

[jira] [Commented] (SPARK-17020) Materialization of RDD via DataFrame.rdd forces a poor re-distribution of data

2016-08-11 Thread Roi Reshef (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417254#comment-15417254 ] Roi Reshef commented on SPARK-17020: Also note that I have just called: *data.cache().count()* val

[jira] [Commented] (SPARK-17020) Materialization of RDD via DataFrame.rdd forces a poor re-distribution of data

2016-08-11 Thread Roi Reshef (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417250#comment-15417250 ] Roi Reshef commented on SPARK-17020: val ab = SomeReader.read(...) //some reader function that uses

[jira] [Commented] (SPARK-17020) Materialization of RDD via DataFrame.rdd forces a poor re-distribution of data

2016-08-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417231#comment-15417231 ] Sean Owen commented on SPARK-17020: --- I think that's probably material, yes, as is the operations that

[jira] [Commented] (SPARK-16975) Spark-2.0.0 unable to infer schema for parquet data written by Spark-1.6.2

2016-08-11 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417240#comment-15417240 ] Dongjoon Hyun commented on SPARK-16975: --- Great! Thank you for confirming. > Spark-2.0.0 unable to

[jira] [Updated] (SPARK-17020) Materialization of RDD via DataFrame.rdd forces a poor re-distribution of data

2016-08-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17020: -- Priority: Major (was: Critical) > Materialization of RDD via DataFrame.rdd forces a poor

[jira] [Commented] (SPARK-17021) simplify the constructor parameters of QuantileSummaries

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417230#comment-15417230 ] Apache Spark commented on SPARK-17021: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17021) simplify the constructor parameters of QuantileSummaries

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17021: Assignee: Wenchen Fan (was: Apache Spark) > simplify the constructor parameters of

[jira] [Assigned] (SPARK-17021) simplify the constructor parameters of QuantileSummaries

2016-08-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17021: Assignee: Apache Spark (was: Wenchen Fan) > simplify the constructor parameters of

[jira] [Commented] (SPARK-17020) Materialization of RDD via DataFrame.rdd forces a poor re-distribution of data

2016-08-11 Thread Roi Reshef (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417218#comment-15417218 ] Roi Reshef commented on SPARK-17020: [~srowen] Should there be any effect on this if I cached and

[jira] [Created] (SPARK-17021) simplify the constructor parameters of QuantileSummaries

2016-08-11 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-17021: --- Summary: simplify the constructor parameters of QuantileSummaries Key: SPARK-17021 URL: https://issues.apache.org/jira/browse/SPARK-17021 Project: Spark Issue

[jira] [Comment Edited] (SPARK-17020) Materialization of RDD via DataFrame.rdd forces a poor re-distribution of data

2016-08-11 Thread Roi Reshef (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417204#comment-15417204 ] Roi Reshef edited comment on SPARK-17020 at 8/11/16 1:13 PM: - [~srowen] I
