[jira] [Assigned] (SPARK-18788) Add getNumPartitions() to SparkR

2017-01-20 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung reassigned SPARK-18788: Assignee: Felix Cheung > Add getNumPartitions() to SparkR >

[jira] [Assigned] (SPARK-18788) Add getNumPartitions() to SparkR

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18788: Assignee: (was: Apache Spark) > Add getNumPartitions() to SparkR >

[jira] [Assigned] (SPARK-18788) Add getNumPartitions() to SparkR

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18788: Assignee: Apache Spark > Add getNumPartitions() to SparkR >

[jira] [Commented] (SPARK-18788) Add getNumPartitions() to SparkR

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832873#comment-15832873 ] Apache Spark commented on SPARK-18788: -- User 'felixcheung' has created a pull request for this

[jira] [Commented] (SPARK-19288) Failure (at test_sparkSQL.R#1300): date functions on a DataFrame in R/run-tests.sh

2017-01-20 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832837#comment-15832837 ] Felix Cheung commented on SPARK-19288: -- hmm, that's odd. what system and R version? I'm wondering if

[jira] [Resolved] (SPARK-19305) partitioned table should always put partition columns at the end of table schema

2017-01-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19305. - Resolution: Fixed Issue resolved by pull request 16655

[jira] [Resolved] (SPARK-14536) NPE in JDBCRDD when array column contains nulls (postgresql)

2017-01-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-14536. - Resolution: Fixed Assignee: Suresh Thalamati Fix Version/s: 2.2.0 > NPE in JDBCRDD when

[jira] [Resolved] (SPARK-16101) Refactoring CSV data source to be consistent with JSON data source

2017-01-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-16101. - Resolution: Fixed Assignee: Hyukjin Kwon Fix Version/s: 2.2.0 > Refactoring CSV

[jira] [Created] (SPARK-19321) Support Hive 2.x's metastore

2017-01-20 Thread Yin Huai (JIRA)
Yin Huai created SPARK-19321: Summary: Support Hive 2.x's metastore Key: SPARK-19321 URL: https://issues.apache.org/jira/browse/SPARK-19321 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-19267) Fix a race condition when stopping StateStore

2017-01-20 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-19267. --- Resolution: Fixed Fix Version/s: 3.0.0 2.1.1 Issue resolved by

[jira] [Created] (SPARK-19320) Allow guaranteed amount of GPU to be used when launching jobs

2017-01-20 Thread Timothy Chen (JIRA)
Timothy Chen created SPARK-19320: Summary: Allow guaranteed amount of GPU to be used when launching jobs Key: SPARK-19320 URL: https://issues.apache.org/jira/browse/SPARK-19320 Project: Spark

[jira] [Commented] (SPARK-19316) Spark event logs are huge compared to 1.5.2

2017-01-20 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832710#comment-15832710 ] Jisoo Kim commented on SPARK-19316: --- I suspect this is due to "SparkListenerTaskEnd" event log having

[jira] [Assigned] (SPARK-18750) spark should be able to control the number of executor and should not throw stack overslow

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18750: Assignee: (was: Apache Spark) > spark should be able to control the number of

[jira] [Assigned] (SPARK-18750) spark should be able to control the number of executor and should not throw stack overslow

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18750: Assignee: Apache Spark > spark should be able to control the number of executor and

[jira] [Commented] (SPARK-18750) spark should be able to control the number of executor and should not throw stack overslow

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832703#comment-15832703 ] Apache Spark commented on SPARK-18750: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19319) SparkR Kmeans summary returns error when the cluster size doesn't equal to k

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19319: Assignee: (was: Apache Spark) > SparkR Kmeans summary returns error when the cluster

[jira] [Commented] (SPARK-19319) SparkR Kmeans summary returns error when the cluster size doesn't equal to k

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832698#comment-15832698 ] Apache Spark commented on SPARK-19319: -- User 'wangmiao1981' has created a pull request for this

[jira] [Assigned] (SPARK-19319) SparkR Kmeans summary returns error when the cluster size doesn't equal to k

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19319: Assignee: Apache Spark > SparkR Kmeans summary returns error when the cluster size

[jira] [Created] (SPARK-19319) SparkR Kmeans summary returns error when the cluster size doesn't equal to k

2017-01-20 Thread Miao Wang (JIRA)
Miao Wang created SPARK-19319: - Summary: SparkR Kmeans summary returns error when the cluster size doesn't equal to k Key: SPARK-19319 URL: https://issues.apache.org/jira/browse/SPARK-19319 Project:

[jira] [Issue Comment Deleted] (SPARK-16599) java.util.NoSuchElementException: None.get at at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)

2017-01-20 Thread Drew Robb (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Robb updated SPARK-16599: -- Comment: was deleted (was: I encountered an identical exception when using a singleton spark session.

[jira] [Commented] (SPARK-18750) spark should be able to control the number of executor and should not throw stack overslow

2017-01-20 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832684#comment-15832684 ] Marcelo Vanzin commented on SPARK-18750: Yay, I can reproduce it with a unit test against

[jira] [Commented] (SPARK-19289) UnCache Dataset using Name

2017-01-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832663#comment-15832663 ] Xiao Li commented on SPARK-19289: - Basically, you are creating a view for that dataframe. View name is

[jira] [Commented] (SPARK-19318) Docker test case failure: `SPARK-16625: General data types to be mapped to Oracle`

2017-01-20 Thread Suresh Thalamati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832659#comment-15832659 ] Suresh Thalamati commented on SPARK-19318: -- I am looking into this test failure. > Docker test

[jira] [Commented] (SPARK-19300) Executor is waiting for lock

2017-01-20 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832658#comment-15832658 ] Shixiong Zhu commented on SPARK-19300: -- Could you provide the full thread dump? Looks like there is

[jira] [Updated] (SPARK-18589) persist() resolves "java.lang.RuntimeException: Invalid PythonUDF (...), requires attributes from more than one child"

2017-01-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18589: -- Fix Version/s: 2.2.0 2.1.1 > persist() resolves

[jira] [Resolved] (SPARK-18589) persist() resolves "java.lang.RuntimeException: Invalid PythonUDF (...), requires attributes from more than one child"

2017-01-20 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18589. --- Resolution: Fixed > persist() resolves "java.lang.RuntimeException: Invalid

[jira] [Created] (SPARK-19318) Docker test case failure: `SPARK-16625: General data types to be mapped to Oracle`

2017-01-20 Thread Xiao Li (JIRA)
Xiao Li created SPARK-19318: --- Summary: Docker test case failure: `SPARK-16625: General data types to be mapped to Oracle` Key: SPARK-19318 URL: https://issues.apache.org/jira/browse/SPARK-19318 Project:

[jira] [Comment Edited] (SPARK-17890) scala.ScalaReflectionException

2017-01-20 Thread Dave DeCaprio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832620#comment-15832620 ] Dave DeCaprio edited comment on SPARK-17890 at 1/20/17 11:46 PM: - I'm

[jira] [Created] (SPARK-19317) UnsupportedOperationException: empty.reduceLeft in LinearSeqOptimized

2017-01-20 Thread Barry Becker (JIRA)
Barry Becker created SPARK-19317: Summary: UnsupportedOperationException: empty.reduceLeft in LinearSeqOptimized Key: SPARK-19317 URL: https://issues.apache.org/jira/browse/SPARK-19317 Project: Spark

[jira] [Commented] (SPARK-17890) scala.ScalaReflectionException

2017-01-20 Thread Dave DeCaprio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832620#comment-15832620 ] Dave DeCaprio commented on SPARK-17890: --- I'm running into this also. Naively changing the above

[jira] [Assigned] (SPARK-13478) Fetching delegation tokens for Hive fails when using proxy users

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13478: Assignee: Apache Spark (was: Marcelo Vanzin) > Fetching delegation tokens for Hive fails

[jira] [Commented] (SPARK-13478) Fetching delegation tokens for Hive fails when using proxy users

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832617#comment-15832617 ] Apache Spark commented on SPARK-13478: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-13478) Fetching delegation tokens for Hive fails when using proxy users

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13478: Assignee: Marcelo Vanzin (was: Apache Spark) > Fetching delegation tokens for Hive fails

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-01-20 Thread Jonathan Alvarado (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832598#comment-15832598 ] Jonathan Alvarado commented on SPARK-12837: --- I am seeing this issue on EMR 5.2.0 with Spark

[jira] [Commented] (SPARK-16599) java.util.NoSuchElementException: None.get at at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)

2017-01-20 Thread Drew Robb (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832569#comment-15832569 ] Drew Robb commented on SPARK-16599: --- I encountered an identical exception when using a singleton spark

[jira] [Comment Edited] (SPARK-18859) Catalyst codegen does not mark column as nullable when it should. Causes NPE

2017-01-20 Thread Erik LaBianca (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832516#comment-15832516 ] Erik LaBianca edited comment on SPARK-18859 at 1/20/17 10:33 PM: - Not

[jira] [Comment Edited] (SPARK-18859) Catalyst codegen does not mark column as nullable when it should. Causes NPE

2017-01-20 Thread Erik LaBianca (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832516#comment-15832516 ] Erik LaBianca edited comment on SPARK-18859 at 1/20/17 10:32 PM: - Not

[jira] [Commented] (SPARK-18859) Catalyst codegen does not mark column as nullable when it should. Causes NPE

2017-01-20 Thread Erik LaBianca (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832516#comment-15832516 ] Erik LaBianca commented on SPARK-18859: --- Not quite a repro, but here's explain output. {noformat}

[jira] [Comment Edited] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-01-20 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832487#comment-15832487 ] Yun Ni edited comment on SPARK-18392 at 1/20/17 10:29 PM: -- Yes, comparing if the

[jira] [Commented] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-01-20 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832487#comment-15832487 ] Yun Ni commented on SPARK-18392: Yes, comparing if the hash signature equals is faster than computing the

[jira] [Resolved] (SPARK-19314) Do not allow sort before aggregation in Structured Streaming plan

2017-01-20 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-19314. --- Resolution: Fixed Fix Version/s: 3.0.0 2.0.3

[jira] [Commented] (SPARK-19111) S3 Mesos history upload fails silently if too large

2017-01-20 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832473#comment-15832473 ] Jisoo Kim commented on SPARK-19111: --- Related to https://issues.apache.org/jira/browse/SPARK-19316. >

[jira] [Commented] (SPARK-19316) Spark event logs are huge compared to 1.5.2

2017-01-20 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832470#comment-15832470 ] Jisoo Kim commented on SPARK-19316: --- Related to https://issues.apache.org/jira/browse/SPARK-19111 >

[jira] [Comment Edited] (SPARK-19296) Awkward changes for JdbcUtils.saveTable in Spark 2.1.0

2017-01-20 Thread Paul Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831033#comment-15831033 ] Paul Wu edited comment on SPARK-19296 at 1/20/17 9:52 PM: -- We found this Util is

[jira] [Commented] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-01-20 Thread David S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832464#comment-15832464 ] David S commented on SPARK-18392: - Hi Yun and thanks for the answer, but my question now is, are there

[jira] [Commented] (SPARK-19111) S3 Mesos history upload fails silently if too large

2017-01-20 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832458#comment-15832458 ] Jisoo Kim commented on SPARK-19111: --- Thanks [~ste...@apache.org] for information, using S3a helped with

[jira] [Created] (SPARK-19316) Spark event logs are huge compared to 1.5.2

2017-01-20 Thread Jisoo Kim (JIRA)
Jisoo Kim created SPARK-19316: - Summary: Spark event logs are huge compared to 1.5.2 Key: SPARK-19316 URL: https://issues.apache.org/jira/browse/SPARK-19316 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-18750) spark should be able to control the number of executor and should not throw stack overslow

2017-01-20 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-18750: --- Description: When running Sql queries on large datasets. Job fails with stack overflow

[jira] [Comment Edited] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-01-20 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832404#comment-15832404 ] Yun Ni edited comment on SPARK-18392 at 1/20/17 9:15 PM: - Hi David, Thanks for

[jira] [Commented] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-01-20 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832404#comment-15832404 ] Yun Ni commented on SPARK-18392: Hi David, Thanks for the question. I did not group the records by their

[jira] [Created] (SPARK-19315) StructType should support nested lookup; throws IllegalArgumentException

2017-01-20 Thread Vinay varma (JIRA)
Vinay varma created SPARK-19315: --- Summary: StructType should support nested lookup; throws IllegalArgumentException Key: SPARK-19315 URL: https://issues.apache.org/jira/browse/SPARK-19315 Project:

[jira] [Assigned] (SPARK-18120) QueryExecutionListener method doesnt' get executed for DataFrameWriter methods

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18120: Assignee: (was: Apache Spark) > QueryExecutionListener method doesnt' get executed

[jira] [Assigned] (SPARK-18120) QueryExecutionListener method doesnt' get executed for DataFrameWriter methods

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18120: Assignee: Apache Spark > QueryExecutionListener method doesnt' get executed for

[jira] [Commented] (SPARK-18120) QueryExecutionListener method doesnt' get executed for DataFrameWriter methods

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832362#comment-15832362 ] Apache Spark commented on SPARK-18120: -- User 'salilsurendran' has created a pull request for this

[jira] [Assigned] (SPARK-18823) Assignation by column name variable not available or bug?

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18823: Assignee: (was: Apache Spark) > Assignation by column name variable not available or

[jira] [Assigned] (SPARK-18823) Assignation by column name variable not available or bug?

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18823: Assignee: Apache Spark > Assignation by column name variable not available or bug? >

[jira] [Commented] (SPARK-18823) Assignation by column name variable not available or bug?

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832324#comment-15832324 ] Apache Spark commented on SPARK-18823: -- User 'felixcheung' has created a pull request for this

[jira] [Commented] (SPARK-19314) Do not allow sort before aggregation in Structured Streaming plan

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832299#comment-15832299 ] Apache Spark commented on SPARK-19314: -- User 'tdas' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19314) Do not allow sort before aggregation in Structured Streaming plan

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19314: Assignee: Apache Spark (was: Tathagata Das) > Do not allow sort before aggregation in

[jira] [Assigned] (SPARK-19314) Do not allow sort before aggregation in Structured Streaming plan

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19314: Assignee: Tathagata Das (was: Apache Spark) > Do not allow sort before aggregation in

[jira] [Created] (SPARK-19314) Do not allow sort before aggregation in Structured Streaming plan

2017-01-20 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-19314: - Summary: Do not allow sort before aggregation in Structured Streaming plan Key: SPARK-19314 URL: https://issues.apache.org/jira/browse/SPARK-19314 Project: Spark

[jira] [Assigned] (SPARK-19313) GaussianMixture throws cryptic error when number of features is too high

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19313: Assignee: (was: Apache Spark) > GaussianMixture throws cryptic error when number of

[jira] [Assigned] (SPARK-19313) GaussianMixture throws cryptic error when number of features is too high

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19313: Assignee: Apache Spark > GaussianMixture throws cryptic error when number of features is

[jira] [Commented] (SPARK-19313) GaussianMixture throws cryptic error when number of features is too high

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832290#comment-15832290 ] Apache Spark commented on SPARK-19313: -- User 'sethah' has created a pull request for this issue:

[jira] [Created] (SPARK-19313) GaussianMixture throws cryptic error when number of features is too high

2017-01-20 Thread Seth Hendrickson (JIRA)
Seth Hendrickson created SPARK-19313: Summary: GaussianMixture throws cryptic error when number of features is too high Key: SPARK-19313 URL: https://issues.apache.org/jira/browse/SPARK-19313

[jira] [Commented] (SPARK-18496) java.lang.AssertionError: assertion failed

2017-01-20 Thread John Myers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832214#comment-15832214 ] John Myers commented on SPARK-18496: Upgraded to 2.1.0 using java8, same problem exists, cannot

[jira] [Updated] (SPARK-19299) Nulls in non nullable columns cause data corruption in parquet

2017-01-20 Thread Franklyn Dsouza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franklyn Dsouza updated SPARK-19299: Summary: Nulls in non nullable columns cause data corruption in parquet (was: Nulls in

[jira] [Updated] (SPARK-19299) Nulls in non nullable columns causes data corruption in parquet

2017-01-20 Thread Franklyn Dsouza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franklyn Dsouza updated SPARK-19299: Summary: Nulls in non nullable columns causes data corruption in parquet (was: Nulls in

[jira] [Updated] (SPARK-19299) Nulls in non nullable columns causes data corruption in parquet

2017-01-20 Thread Franklyn Dsouza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franklyn Dsouza updated SPARK-19299: Description: The problem we're seeing is that if a null occurs in a non-nullable field and

[jira] [Updated] (SPARK-19299) Nulls in non nullable columns causes data corruption in parquet

2017-01-20 Thread Franklyn Dsouza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franklyn Dsouza updated SPARK-19299: Description: The problem we're seeing is that if a null occurs in a no-nullable field and

[jira] [Updated] (SPARK-19299) Nulls in non nullable columns causes data corruption in parquet

2017-01-20 Thread Franklyn Dsouza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franklyn Dsouza updated SPARK-19299: Description: The problem we're seeing is that if a null occurs in a non-nullable field and

[jira] [Comment Edited] (SPARK-19299) Nulls in non nullable columns causes data corruption in parquet

2017-01-20 Thread Jason White (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832195#comment-15832195 ] Jason White edited comment on SPARK-19299 at 1/20/17 6:14 PM: -- These seem

[jira] [Updated] (SPARK-19299) Nulls in non nullable columns causes data corruption in parquet

2017-01-20 Thread Franklyn Dsouza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franklyn Dsouza updated SPARK-19299: Priority: Critical (was: Major) > Nulls in non nullable columns causes data corruption in

[jira] [Comment Edited] (SPARK-19299) Nulls in non nullable columns causes data corruption in parquet

2017-01-20 Thread Jason White (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832195#comment-15832195 ] Jason White edited comment on SPARK-19299 at 1/20/17 6:09 PM: -- These seem

[jira] [Commented] (SPARK-19299) Nulls in non nullable columns causes data corruption in parquet

2017-01-20 Thread Jason White (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832195#comment-15832195 ] Jason White commented on SPARK-19299: - These seem like two or three separate issues. - Python long

[jira] [Comment Edited] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-01-20 Thread David S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832157#comment-15832157 ] David S edited comment on SPARK-18392 at 1/20/17 6:02 PM: -- Hi, I have a question

[jira] [Commented] (SPARK-17248) Add native Scala enum support to Dataset Encoders

2017-01-20 Thread Leif Warner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832163#comment-15832163 ] Leif Warner commented on SPARK-17248: - Much more efficient encodings than strings are possible with

[jira] [Commented] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-01-20 Thread David S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832157#comment-15832157 ] David S commented on SPARK-18392: - Hi, I have a question about the approx Nearest Neighbor

[jira] [Commented] (SPARK-19162) UserDefinedFunction constructor should verify that func is callable

2017-01-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832127#comment-15832127 ] Ryan Blue commented on SPARK-19162: --- [~rxin], I think this one is ready for a final review and commit,

[jira] [Comment Edited] (SPARK-19160) Decorator for UDF creation.

2017-01-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832124#comment-15832124 ] Ryan Blue edited comment on SPARK-19160 at 1/20/17 5:14 PM: [~rxin], I think

[jira] [Commented] (SPARK-19160) Decorator for UDF creation.

2017-01-20 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832124#comment-15832124 ] Ryan Blue commented on SPARK-19160: --- @rxin, I think this one is ready to be merged. Who is a good

[jira] [Updated] (SPARK-19069) Expose task 'status' and 'duration' in spark history server REST API.

2017-01-20 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-19069: - Assignee: Parag Chaudhari > Expose task 'status' and 'duration' in spark history server REST

[jira] [Resolved] (SPARK-19069) Expose task 'status' and 'duration' in spark history server REST API.

2017-01-20 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid resolved SPARK-19069. -- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16473

[jira] [Commented] (SPARK-19312) Spark gives wrong error message when failes to create file due to hdfs quota limit.

2017-01-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832072#comment-15832072 ] Sean Owen commented on SPARK-19312: --- Hive on Spark is part of Hive, not Spark. > Spark gives wrong

[jira] [Commented] (SPARK-19299) Nulls in non nullable columns causes data corruption in parquet

2017-01-20 Thread Franklyn Dsouza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832048#comment-15832048 ] Franklyn Dsouza commented on SPARK-19299: - These issues also are very likely reproducible in

[jira] [Commented] (SPARK-19282) RandomForestRegressionModel should expose getMaxDepth

2017-01-20 Thread Xin Ren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832022#comment-15832022 ] Xin Ren commented on SPARK-19282: - Thank you Nick. I'll give it a try to fix it. :) >

[jira] [Commented] (SPARK-19299) Nulls in non nullable columns causes data corruption in parquet

2017-01-20 Thread Jason White (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832006#comment-15832006 ] Jason White commented on SPARK-19299: - Also seeing this same behaviour in Spark 2.0.1 when creating a

[jira] [Commented] (SPARK-17602) PySpark - Performance Optimization Large Size of Broadcast Variable

2017-01-20 Thread Junfeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831964#comment-15831964 ] Junfeng commented on SPARK-17602: - [~davies] the trouble really is the python worker share mode is not

[jira] [Assigned] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19311: Assignee: Apache Spark > UDFs disregard UDT type hierarchy >

[jira] [Assigned] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19311: Assignee: (was: Apache Spark) > UDFs disregard UDT type hierarchy >

[jira] [Commented] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831938#comment-15831938 ] Apache Spark commented on SPARK-19311: -- User 'gmoehler' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831888#comment-15831888 ] Liang-Chi Hsieh edited comment on SPARK-19311 at 1/20/17 3:06 PM: --

[jira] [Commented] (SPARK-19311) UDFs disregard UDT type hierarchy

2017-01-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831888#comment-15831888 ] Liang-Chi Hsieh commented on SPARK-19311: - [~Gregor Moehler] I think you already have the fixing.

[jira] [Created] (SPARK-19312) Spark gives wrong error message when failes to create file due to hdfs quota limit.

2017-01-20 Thread Rivkin Andrey (JIRA)
Rivkin Andrey created SPARK-19312: - Summary: Spark gives wrong error message when failes to create file due to hdfs quota limit. Key: SPARK-19312 URL: https://issues.apache.org/jira/browse/SPARK-19312

[jira] [Commented] (SPARK-16683) Group by does not work after multiple joins of the same dataframe

2017-01-20 Thread Andrew Ray (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831878#comment-15831878 ] Andrew Ray commented on SPARK-16683: I'm working on a solution for this > Group by does not work

[jira] [Assigned] (SPARK-19155) MLlib GeneralizedLinearRegression family and link should case insensitive

2017-01-20 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang reassigned SPARK-19155: --- Assignee: Yanbo Liang > MLlib GeneralizedLinearRegression family and link should case

[jira] [Updated] (SPARK-19155) MLlib GeneralizedLinearRegression family and link should case insensitive

2017-01-20 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-19155: Description: ML {{GeneralizedLinearRegression}} should support both uppercase and lowercase. For

[jira] [Updated] (SPARK-19155) MLlib GeneralizedLinearRegression family and link should case insensitive

2017-01-20 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-19155: Description: ML {{GeneralizedLinearRegression}} only support lowercase input for {{family}} and

[jira] [Updated] (SPARK-19155) MLlib GeneralizedLinearRegression family and link should case insensitive

2017-01-20 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-19155: Summary: MLlib GeneralizedLinearRegression family and link should case insensitive (was: ML GLR
