[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-25 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984194#comment-15984194 ] Liang-Chi Hsieh commented on SPARK-20392: - By disabling

[jira] [Assigned] (SPARK-20465) Throws a proper exception rather than ArrayIndexOutOfBoundsException when temp directories could not be got/created

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20465: Assignee: Apache Spark > Throws a proper exception rather than

[jira] [Assigned] (SPARK-20465) Throws a proper exception rather than ArrayIndexOutOfBoundsException when temp directories could not be got/created

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20465: Assignee: (was: Apache Spark) > Throws a proper exception rather than

[jira] [Commented] (SPARK-20465) Throws a proper exception rather than ArrayIndexOutOfBoundsException when temp directories could not be got/created

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984180#comment-15984180 ] Apache Spark commented on SPARK-20465: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Resolved] (SPARK-20437) R wrappers for rollup and cube

2017-04-25 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-20437. -- Resolution: Fixed Assignee: Maciej Szymkiewicz Fix Version/s: 2.3.0

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-25 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984173#comment-15984173 ] Liang-Chi Hsieh commented on SPARK-20392: - [~barrybecker4] Currently I think the performance

[jira] [Updated] (SPARK-20465) Throws a proper exception rather than ArrayIndexOutOfBoundsException when temp directories could not be got/created

2017-04-25 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-20465: - Component/s: Spark Core > Throws a proper exception rather than ArrayIndexOutOfBoundsException

[jira] [Created] (SPARK-20465) Throws a proper exception rather than ArrayIndexOutOfBoundsException when temp directories could not be got/created

2017-04-25 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-20465: Summary: Throws a proper exception rather than ArrayIndexOutOfBoundsException when temp directories could not be got/created Key: SPARK-20465 URL:

[jira] [Commented] (SPARK-20199) GradientBoostedTreesModel doesn't have Column Sampling Rate Paramenter

2017-04-25 Thread 颜发才
[ https://issues.apache.org/jira/browse/SPARK-20199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984161#comment-15984161 ] Yan Facai (颜发才) commented on SPARK-20199: - The work is easy, however Public method is added and

[jira] [Resolved] (SPARK-16548) java.io.CharConversionException: Invalid UTF-32 character prevents me from querying my data

2017-04-25 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-16548. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.0 >

[jira] [Updated] (SPARK-20439) Catalog.listTables() depends on all libraries used to create tables

2017-04-25 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-20439: Fix Version/s: 2.1.1 > Catalog.listTables() depends on all libraries used to create tables >

[jira] [Updated] (SPARK-20456) Add examples for functions collection for pyspark

2017-04-25 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-20456: - Component/s: PySpark > Add examples for functions collection for pyspark >

[jira] [Updated] (SPARK-20456) Add examples for functions collection for pyspark

2017-04-25 Thread Michael Patterson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Patterson updated SPARK-20456: -- Summary: Add examples for functions collection for pyspark (was: Document major

[jira] [Updated] (SPARK-20456) Add examples for functions collection for pyspark

2017-04-25 Thread Michael Patterson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Patterson updated SPARK-20456: -- Description: Document `sql.functions.py`: 1. Add examples for the common aggregate

[jira] [Resolved] (SPARK-20457) Spark CSV is not able to Override Schema while reading data

2017-04-25 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-20457. -- Resolution: Duplicate Currently, the nullability seems being ignored. I am pretty sure that it

[jira] [Commented] (SPARK-20336) spark.read.csv() with wholeFile=True option fails to read non ASCII unicode characters

2017-04-25 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983896#comment-15983896 ] Hyukjin Kwon commented on SPARK-20336: -- Thank you guys for confirming this. > spark.read.csv() with

[jira] [Commented] (SPARK-20445) pyspark.sql.utils.IllegalArgumentException: u'DecisionTreeClassifier was given input with invalid label column label, without the number of classes specified. See Stri

2017-04-25 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983892#comment-15983892 ] Hyukjin Kwon commented on SPARK-20445: -- I meant the current codebase, latest build. Probably, I

[jira] [Commented] (SPARK-20456) Document major aggregation functions for pyspark

2017-04-25 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983888#comment-15983888 ] Hyukjin Kwon commented on SPARK-20456: -- I simply left the comment above as the current status does

[jira] [Commented] (SPARK-18127) Add hooks and extension points to Spark

2017-04-25 Thread Frederick Reiss (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983872#comment-15983872 ] Frederick Reiss commented on SPARK-18127: - Is there a design document or a public design and

[jira] [Resolved] (SPARK-18127) Add hooks and extension points to Spark

2017-04-25 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-18127. - Resolution: Fixed > Add hooks and extension points to Spark > --- >

[jira] [Updated] (SPARK-18127) Add hooks and extension points to Spark

2017-04-25 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-18127: Fix Version/s: 2.2.0 > Add hooks and extension points to Spark > --- >

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Ismael Juma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983853#comment-15983853 ] Ismael Juma commented on SPARK-18057: - It's worth noting that no-one is working on that ticket at the

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Helena Edelson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983832#comment-15983832 ] Helena Edelson commented on SPARK-18057: It is the timeout. I think waiting is better, will be

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983820#comment-15983820 ] Michael Armbrust commented on SPARK-18057: -- I guess I'd like to understand more about what

[jira] [Resolved] (SPARK-20130) Flaky test: BlockManagerProactiveReplicationSuite

2017-04-25 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-20130. Resolution: Cannot Reproduce Seems a lot more stable now, so closing this until it becomes

[jira] [Assigned] (SPARK-20421) Mark JobProgressListener (and related classes) as deprecated

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20421: Assignee: (was: Apache Spark) > Mark JobProgressListener (and related classes) as

[jira] [Assigned] (SPARK-20421) Mark JobProgressListener (and related classes) as deprecated

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20421: Assignee: Apache Spark > Mark JobProgressListener (and related classes) as deprecated >

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983809#comment-15983809 ] Shixiong Zhu commented on SPARK-18057: -- I prefer to just wait. The user can still use Kafka 0.10.2.0

[jira] [Commented] (SPARK-20421) Mark JobProgressListener (and related classes) as deprecated

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983808#comment-15983808 ] Apache Spark commented on SPARK-20421: -- User 'vanzin' has created a pull request for this issue:

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Helena Edelson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983796#comment-15983796 ] Helena Edelson commented on SPARK-18057: I have a branch off branch-2.2 with the 0.10.2.0 upgrade

[jira] [Assigned] (SPARK-20464) Add a job group and an informative description for streaming queries

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20464: Assignee: Apache Spark > Add a job group and an informative description for streaming

[jira] [Assigned] (SPARK-20464) Add a job group and an informative description for streaming queries

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20464: Assignee: (was: Apache Spark) > Add a job group and an informative description for

[jira] [Commented] (SPARK-20464) Add a job group and an informative description for streaming queries

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983751#comment-15983751 ] Apache Spark commented on SPARK-20464: -- User 'kunalkhamar' has created a pull request for this

[jira] [Updated] (SPARK-20464) Add a job group and an informative description for streaming queries

2017-04-25 Thread Kunal Khamar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khamar updated SPARK-20464: - Summary: Add a job group and an informative description for streaming queries (was: Add a job

[jira] [Updated] (SPARK-20239) Improve HistoryServer ACL mechanism

2017-04-25 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-20239: --- Fix Version/s: 2.1.2 2.0.3 > Improve HistoryServer ACL mechanism >

[jira] [Created] (SPARK-20464) Add a job group and an informative job description for streaming queries

2017-04-25 Thread Kunal Khamar (JIRA)
Kunal Khamar created SPARK-20464: Summary: Add a job group and an informative job description for streaming queries Key: SPARK-20464 URL: https://issues.apache.org/jira/browse/SPARK-20464 Project:

[jira] [Assigned] (SPARK-20463) Expose SPARK SQL <=> operator in PySpark

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20463: Assignee: Apache Spark > Expose SPARK SQL <=> operator in PySpark >

[jira] [Assigned] (SPARK-20463) Expose SPARK SQL <=> operator in PySpark

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20463: Assignee: (was: Apache Spark) > Expose SPARK SQL <=> operator in PySpark >

[jira] [Assigned] (SPARK-20463) Expose SPARK SQL <=> operator in PySpark

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20463: Assignee: Apache Spark > Expose SPARK SQL <=> operator in PySpark >

[jira] [Assigned] (SPARK-20463) Expose SPARK SQL <=> operator in PySpark

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20463: Assignee: (was: Apache Spark) > Expose SPARK SQL <=> operator in PySpark >

[jira] [Commented] (SPARK-20463) Expose SPARK SQL <=> operator in PySpark

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983654#comment-15983654 ] Apache Spark commented on SPARK-20463: -- User 'ptkool' has created a pull request for this issue:

[jira] [Created] (SPARK-20463) Expose SPARK SQL <=> operator in PySpark

2017-04-25 Thread Michael Styles (JIRA)
Michael Styles created SPARK-20463: -- Summary: Expose SPARK SQL <=> operator in PySpark Key: SPARK-20463 URL: https://issues.apache.org/jira/browse/SPARK-20463 Project: Spark Issue Type:

[jira] [Commented] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2017-04-25 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983615#comment-15983615 ] Shixiong Zhu commented on SPARK-13747: -- [~dnaumenko] Unfortunately, Spark uses ThreadLocal variables

[jira] [Assigned] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13747: Assignee: Apache Spark (was: Shixiong Zhu) > Concurrent execution in SQL doesn't work

[jira] [Assigned] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13747: Assignee: Shixiong Zhu (was: Apache Spark) > Concurrent execution in SQL doesn't work

[jira] [Commented] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983612#comment-15983612 ] Apache Spark commented on SPARK-13747: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Updated] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2017-04-25 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-13747: - Fix Version/s: (was: 2.2.0) > Concurrent execution in SQL doesn't work with Scala

[jira] [Reopened] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2017-04-25 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reopened SPARK-13747: -- > Concurrent execution in SQL doesn't work with Scala ForkJoinPool >

[jira] [Created] (SPARK-20462) Spark-Kinesis Direct Connector

2017-04-25 Thread Lauren Moos (JIRA)
Lauren Moos created SPARK-20462: --- Summary: Spark-Kinesis Direct Connector Key: SPARK-20462 URL: https://issues.apache.org/jira/browse/SPARK-20462 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-20456) Document major aggregation functions for pyspark

2017-04-25 Thread Michael Patterson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983590#comment-15983590 ] Michael Patterson commented on SPARK-20456: --- I saw that there are short docstrings for the

[jira] [Commented] (SPARK-9103) Tracking spark's memory usage

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983531#comment-15983531 ] Apache Spark commented on SPARK-9103: - User 'jsoltren' has created a pull request for this issue:

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-25 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983490#comment-15983490 ] Kazuaki Ishizaki commented on SPARK-20392: -- Here are my observations: According to

[jira] [Commented] (SPARK-20427) Issue with Spark interpreting Oracle datatype NUMBER

2017-04-25 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983440#comment-15983440 ] Xiao Li commented on SPARK-20427: - cc [~tsuresh] Are you interested in this? > Issue with Spark

[jira] [Updated] (SPARK-20459) JdbcUtils throws IllegalStateException: Cause already initialized after getting SQLException

2017-04-25 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20459: Target Version/s: 2.2.0 > JdbcUtils throws IllegalStateException: Cause already initialized after >

[jira] [Comment Edited] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983396#comment-15983396 ] Shixiong Zhu edited comment on SPARK-18057 at 4/25/17 6:29 PM: --- [~guozhang]

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983396#comment-15983396 ] Shixiong Zhu commented on SPARK-18057: -- [~guozhang] We have a stress test to test Spark Kafka

[jira] [Assigned] (SPARK-20461) CachedKafkaConsumer may hang forever when it's interrupted

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20461: Assignee: (was: Apache Spark) > CachedKafkaConsumer may hang forever when it's

[jira] [Assigned] (SPARK-20461) CachedKafkaConsumer may hang forever when it's interrupted

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20461: Assignee: Apache Spark > CachedKafkaConsumer may hang forever when it's interrupted >

[jira] [Resolved] (SPARK-5484) Pregel should checkpoint periodically to avoid StackOverflowError

2017-04-25 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-5484. - Resolution: Fixed Assignee: dingding (was: Ankur Dave) Fix Version/s:

[jira] [Commented] (SPARK-20461) CachedKafkaConsumer may hang forever when it's interrupted

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983387#comment-15983387 ] Apache Spark commented on SPARK-20461: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Created] (SPARK-20461) CachedKafkaConsumer may hang forever when it's interrupted

2017-04-25 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-20461: Summary: CachedKafkaConsumer may hang forever when it's interrupted Key: SPARK-20461 URL: https://issues.apache.org/jira/browse/SPARK-20461 Project: Spark

[jira] [Commented] (SPARK-20447) spark mesos scheduler suppress call

2017-04-25 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983314#comment-15983314 ] Michael Gummelt commented on SPARK-20447: - The scheduler doesn't support suppression, no, but it

[jira] [Resolved] (SPARK-20449) Upgrade breeze version to 0.13.1

2017-04-25 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai resolved SPARK-20449. - Resolution: Fixed Fix Version/s: 2.2.0 3.0.0 Issue resolved by pull request

[jira] [Commented] (SPARK-20439) Catalog.listTables() depends on all libraries used to create tables

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983224#comment-15983224 ] Apache Spark commented on SPARK-20439: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Commented] (SPARK-13802) Fields order in Row(**kwargs) is not consistent with Schema.toInternal method

2017-04-25 Thread Furcy Pin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983094#comment-15983094 ] Furcy Pin commented on SPARK-13802: --- Hi, I ran into similar issues and found this Jira, so I would like

[jira] [Commented] (SPARK-20459) JdbcUtils throws IllegalStateException: Cause already initialized after getting SQLException

2017-04-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983092#comment-15983092 ] Sean Owen commented on SPARK-20459: --- Ugh, so there's no actual way to detect whether the exception has

[jira] [Assigned] (SPARK-20460) Make it more consistent to handle column name duplication

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20460: Assignee: (was: Apache Spark) > Make it more consistent to handle column name

[jira] [Assigned] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11968: Assignee: Nick Pentreath (was: Apache Spark) > ALS recommend all methods spend most of

[jira] [Assigned] (SPARK-20460) Make it more consistent to handle column name duplication

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20460: Assignee: Apache Spark > Make it more consistent to handle column name duplication >

[jira] [Assigned] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11968: Assignee: Apache Spark (was: Nick Pentreath) > ALS recommend all methods spend most of

[jira] [Commented] (SPARK-20460) Make it more consistent to handle column name duplication

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983079#comment-15983079 ] Apache Spark commented on SPARK-20460: -- User 'maropu' has created a pull request for this issue:

[jira] [Commented] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-04-25 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983074#comment-15983074 ] Peng Meng commented on SPARK-11968: --- Thanks [~mlnick] , I will post more results here. I latest result

[jira] [Updated] (SPARK-20460) Make it more consistent to handle column name duplication

2017-04-25 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-20460: - Description: In the current master, error handling is different when hitting column name

[jira] [Commented] (SPARK-20445) pyspark.sql.utils.IllegalArgumentException: u'DecisionTreeClassifier was given input with invalid label column label, without the number of classes specified. See Stri

2017-04-25 Thread surya pratap (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983069#comment-15983069 ] surya pratap commented on SPARK-20445: -- Hello Hyukjin Kwon , Thanks for fast reply I am not getting

[jira] [Assigned] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11968: Assignee: Apache Spark (was: Nick Pentreath) > ALS recommend all methods spend most of

[jira] [Assigned] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11968: Assignee: Nick Pentreath (was: Apache Spark) > ALS recommend all methods spend most of

[jira] [Issue Comment Deleted] (SPARK-20445) pyspark.sql.utils.IllegalArgumentException: u'DecisionTreeClassifier was given input with invalid label column label, without the number of classes specifi

2017-04-25 Thread surya pratap (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] surya pratap updated SPARK-20445: - Comment: was deleted (was: Hello Hyukjin Kwon, Thxz for fast reply. You are using which version

[jira] [Commented] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-04-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983066#comment-15983066 ] Apache Spark commented on SPARK-11968: -- User 'mpjlu' has created a pull request for this issue:

[jira] [Created] (SPARK-20460) Make it more consistent to handle column name duplication

2017-04-25 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-20460: Summary: Make it more consistent to handle column name duplication Key: SPARK-20460 URL: https://issues.apache.org/jira/browse/SPARK-20460 Project: Spark

[jira] [Commented] (SPARK-20443) The blockSize of MLLIB ALS should be setting by the User

2017-04-25 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983059#comment-15983059 ] Peng Meng commented on SPARK-20443: --- Yes, based on my current test, I agree. But if the data size is

[jira] [Commented] (SPARK-20443) The blockSize of MLLIB ALS should be setting by the User

2017-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983050#comment-15983050 ] Nick Pentreath commented on SPARK-20443: Your PR for SPARK-20446 / SPARK11968 should largely

[jira] [Commented] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983048#comment-15983048 ] Nick Pentreath commented on SPARK-11968: [~peng.m...@intel.com] would you mind posting your

[jira] [Closed] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-20446. -- Resolution: Duplicate > Optimize the process of MLLIB ALS recommendForAll >

[jira] [Created] (SPARK-20459) JdbcUtils throws IllegalStateException: Cause already initialized after getting SQLException

2017-04-25 Thread Jessie Yu (JIRA)
Jessie Yu created SPARK-20459: - Summary: JdbcUtils throws IllegalStateException: Cause already initialized after getting SQLException Key: SPARK-20459 URL: https://issues.apache.org/jira/browse/SPARK-20459

[jira] [Commented] (SPARK-20445) pyspark.sql.utils.IllegalArgumentException: u'DecisionTreeClassifier was given input with invalid label column label, without the number of classes specified. See Stri

2017-04-25 Thread surya pratap (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983040#comment-15983040 ] surya pratap commented on SPARK-20445: -- Hello Hyukjin Kwon, Thxz for fast reply. You are using which

[jira] [Comment Edited] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-25 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982251#comment-15982251 ] Peng Meng edited comment on SPARK-20446 at 4/25/17 3:06 PM: Yes, I compared

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-25 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983030#comment-15983030 ] Peng Meng commented on SPARK-20446: --- Thanks [~mlnick] , I agree with you. I am ok to close this ticket

[jira] [Commented] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983026#comment-15983026 ] Nick Pentreath commented on SPARK-11968: Note, there is a solution proposed in SPARK-20446. I've

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983018#comment-15983018 ] Nick Pentreath commented on SPARK-20446: By the way when I say it is a duplicate I mean for the

[jira] [Commented] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2017-04-25 Thread Dmitry Naumenko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982949#comment-15982949 ] Dmitry Naumenko commented on SPARK-13747: - [~zsxwing] I did a similar test with join and have the

[jira] [Created] (SPARK-20458) support getting Yarn Tracking URL in code

2017-04-25 Thread PJ Fanning (JIRA)
PJ Fanning created SPARK-20458: -- Summary: support getting Yarn Tracking URL in code Key: SPARK-20458 URL: https://issues.apache.org/jira/browse/SPARK-20458 Project: Spark Issue Type:

[jira] [Commented] (SPARK-20445) pyspark.sql.utils.IllegalArgumentException: u'DecisionTreeClassifier was given input with invalid label column label, without the number of classes specified. See Stri

2017-04-25 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982852#comment-15982852 ] Hyukjin Kwon commented on SPARK-20445: -- Are you maybe able to try this against the current master or

[jira] [Commented] (SPARK-20445) pyspark.sql.utils.IllegalArgumentException: u'DecisionTreeClassifier was given input with invalid label column label, without the number of classes specified. See Stri

2017-04-25 Thread surya pratap (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982830#comment-15982830 ] surya pratap commented on SPARK-20445: -- Hello Hyukjin Kwon Thxz for reply. I tried many times but

[jira] [Updated] (SPARK-20457) Spark CSV is not able to Override Schema while reading data

2017-04-25 Thread Himanshu Gupta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Himanshu Gupta updated SPARK-20457: --- Description: I have a CSV file, test.csv: {code:xml} col 1 2 3 4 {code} When I read it

[jira] [Created] (SPARK-20457) Spark CSV is not able to Override Schema while reading data

2017-04-25 Thread Himanshu Gupta (JIRA)
Himanshu Gupta created SPARK-20457: -- Summary: Spark CSV is not able to Override Schema while reading data Key: SPARK-20457 URL: https://issues.apache.org/jira/browse/SPARK-20457 Project: Spark

[jira] [Closed] (SPARK-20336) spark.read.csv() with wholeFile=True option fails to read non ASCII unicode characters

2017-04-25 Thread HanCheol Cho (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HanCheol Cho closed SPARK-20336. Resolution: Not A Bug > spark.read.csv() with wholeFile=True option fails to read non ASCII

[jira] [Commented] (SPARK-20336) spark.read.csv() with wholeFile=True option fails to read non ASCII unicode characters

2017-04-25 Thread HanCheol Cho (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982702#comment-15982702 ] HanCheol Cho commented on SPARK-20336: -- Thank you for your additional test [~original-brownbear]. I

[jira] [Commented] (SPARK-17922) ClassCastException java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator cannot be cast to org.apache.spark.sql.cata

2017-04-25 Thread kanika dhuria (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982700#comment-15982700 ] kanika dhuria commented on SPARK-17922: --- Hi , I have attached the repro case for this issue. The

[jira] [Updated] (SPARK-17922) ClassCastException java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator cannot be cast to org.apache.spark.sql.cataly

2017-04-25 Thread kanika dhuria (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanika dhuria updated SPARK-17922: -- Attachment: spark_17922.tar.gz Repro case > ClassCastException java.lang.ClassCastException:

[jira] [Commented] (SPARK-13857) Feature parity for ALS ML with MLLIB

2017-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982695#comment-15982695 ] Nick Pentreath commented on SPARK-13857: I'm going to close this as superseded by SPARK-19535.
