[jira] [Updated] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2017-06-22 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-21187: - Description: This is to track adding the remaining type support in Arrow Converters.

[jira] [Assigned] (SPARK-21188) releaseAllLocksForTask should synchronize the whole method

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21188: Assignee: (was: Apache Spark) > releaseAllLocksForTask should synchronize the whole

[jira] [Commented] (SPARK-21188) releaseAllLocksForTask should synchronize the whole method

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060422#comment-16060422 ] Apache Spark commented on SPARK-21188: -- User 'liufengdb' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21188) releaseAllLocksForTask should synchronize the whole method

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21188: Assignee: Apache Spark > releaseAllLocksForTask should synchronize the whole method >

[jira] [Updated] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2017-06-22 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-21187: - Summary: Complete support for remaining Spark data types in Arrow Converters (was: Complete

[jira] [Created] (SPARK-21188) releaseAllLocksForTask should synchronize the whole method

2017-06-22 Thread Feng Liu (JIRA)
Feng Liu created SPARK-21188: Summary: releaseAllLocksForTask should synchronize the whole method Key: SPARK-21188 URL: https://issues.apache.org/jira/browse/SPARK-21188 Project: Spark Issue

[jira] [Commented] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2017-06-22 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060413#comment-16060413 ] Bryan Cutler commented on SPARK-13534: -- That is correct [~rxin], this did not have support for

[jira] [Closed] (SPARK-20738) Option to turn off building docs in sbt build.

2017-06-22 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma closed SPARK-20738. --- Resolution: Won't Fix > Option to turn off building docs in sbt build. >

[jira] [Created] (SPARK-21187) Complete support for remaining Spark data type in Arrow Converters

2017-06-22 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-21187: Summary: Complete support for remaining Spark data type in Arrow Converters Key: SPARK-21187 URL: https://issues.apache.org/jira/browse/SPARK-21187 Project: Spark

[jira] [Commented] (SPARK-21066) LibSVM load just one input file

2017-06-22 Thread 颜发才
[ https://issues.apache.org/jira/browse/SPARK-21066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060381#comment-16060381 ] Yan Facai (颜发才) commented on SPARK-21066: - Downgrade to Trivial since `numFeatures` should work.

[jira] [Updated] (SPARK-21066) LibSVM load just one input file

2017-06-22 Thread 颜发才
[ https://issues.apache.org/jira/browse/SPARK-21066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Facai (颜发才) updated SPARK-21066: Priority: Trivial (was: Major) > LibSVM load just one input file >

[jira] [Updated] (SPARK-21186) PySpark with --packages fails to import library due to lack of pythonpath to .ivy2/jars/*.jar

2017-06-22 Thread HanCheol Cho (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HanCheol Cho updated SPARK-21186: - Description: I experienced "ImportError: No module named sparkdl" exception while trying to use

[jira] [Created] (SPARK-21186) PySpark with --packages fails to import library due to lack of pythonpath to .ivy2/jars/*.jar

2017-06-22 Thread HanCheol Cho (JIRA)
HanCheol Cho created SPARK-21186: Summary: PySpark with --packages fails to import library due to lack of pythonpath to .ivy2/jars/*.jar Key: SPARK-21186 URL: https://issues.apache.org/jira/browse/SPARK-21186

[jira] [Commented] (SPARK-21171) Speculate task scheduling block dirve handle normal task when a job task number more than one hundred thousand

2017-06-22 Thread wangminfeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060348#comment-16060348 ] wangminfeng commented on SPARK-21171: - We have modified some code for this feature, i will add a

[jira] [Commented] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2017-06-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060328#comment-16060328 ] Reynold Xin commented on SPARK-13534: - Was this done? I thought there are still other data types that

[jira] [Assigned] (SPARK-20599) ConsoleSink should work with write (batch)

2017-06-22 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-20599: Assignee: Lubo Zhang > ConsoleSink should work with write (batch) >

[jira] [Commented] (SPARK-21185) Spurious errors in unidoc causing PRs to fail

2017-06-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060296#comment-16060296 ] Hyukjin Kwon commented on SPARK-21185: -- Yea, I believe this is a duplicate of

[jira] [Assigned] (SPARK-21174) Validate sampling fraction in logical operator level

2017-06-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-21174: --- Assignee: Gengliang Wang > Validate sampling fraction in logical operator level >

[jira] [Resolved] (SPARK-21174) Validate sampling fraction in logical operator level

2017-06-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21174. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18387

[jira] [Commented] (SPARK-20923) TaskMetrics._updatedBlockStatuses uses a lot of memory

2017-06-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060283#comment-16060283 ] Wenchen Fan commented on SPARK-20923: - This patch changes the public behavior and we should mention

[jira] [Resolved] (SPARK-20923) TaskMetrics._updatedBlockStatuses uses a lot of memory

2017-06-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20923. - Resolution: Fixed Fix Version/s: 2.3.0 > TaskMetrics._updatedBlockStatuses uses a lot of

[jira] [Updated] (SPARK-20923) TaskMetrics._updatedBlockStatuses uses a lot of memory

2017-06-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-20923: Labels: releasenotes (was: ) > TaskMetrics._updatedBlockStatuses uses a lot of memory >

[jira] [Assigned] (SPARK-20923) TaskMetrics._updatedBlockStatuses uses a lot of memory

2017-06-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-20923: --- Assignee: Thomas Graves > TaskMetrics._updatedBlockStatuses uses a lot of memory >

[jira] [Updated] (SPARK-21185) Spurious errors in unidoc causing PRs to fail

2017-06-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-21185: -- Description: Some PRs are failing because of unidoc throwing random errors. When GenJavaDoc

[jira] [Created] (SPARK-21185) Spurious errors in unidoc causing PRs to fail

2017-06-22 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-21185: - Summary: Spurious errors in unidoc causing PRs to fail Key: SPARK-21185 URL: https://issues.apache.org/jira/browse/SPARK-21185 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2017-06-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-13534: --- Assignee: Bryan Cutler > Implement Apache Arrow serializer for Spark DataFrame for use in

[jira] [Resolved] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2017-06-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-13534. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 15821

[jira] [Commented] (SPARK-21159) Cluster mode, driver throws connection refused exception submitted by SparkLauncher

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060208#comment-16060208 ] Apache Spark commented on SPARK-21159: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21159) Cluster mode, driver throws connection refused exception submitted by SparkLauncher

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21159: Assignee: (was: Apache Spark) > Cluster mode, driver throws connection refused

[jira] [Assigned] (SPARK-21159) Cluster mode, driver throws connection refused exception submitted by SparkLauncher

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21159: Assignee: Apache Spark > Cluster mode, driver throws connection refused exception

[jira] [Commented] (SPARK-18343) FileSystem$Statistics$StatisticsDataReferenceCleaner hangs on s3 write

2017-06-22 Thread Kanagha Pradha (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060203#comment-16060203 ] Kanagha Pradha commented on SPARK-18343: I am getting the same error with spark 2.0.2 - scala

[jira] [Commented] (SPARK-21145) Restarted queries reuse same StateStoreProvider, causing multiple concurrent tasks to update same StateStore

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060196#comment-16060196 ] Apache Spark commented on SPARK-21145: -- User 'tdas' has created a pull request for this issue:

[jira] [Commented] (SPARK-20391) Properly rename the memory related fields in ExecutorSummary REST API

2017-06-22 Thread Jose Soltren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060097#comment-16060097 ] Jose Soltren commented on SPARK-20391: -- So, this is months old now and irrelevant, but since you

[jira] [Commented] (SPARK-20655) In-memory key-value store implementation

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060067#comment-16060067 ] Apache Spark commented on SPARK-20655: -- User 'vanzin' has created a pull request for this issue:

[jira] [Commented] (SPARK-20379) Allow setting SSL-related passwords through env variables

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060065#comment-16060065 ] Apache Spark commented on SPARK-20379: -- User 'vanzin' has created a pull request for this issue:

[jira] [Commented] (SPARK-20342) DAGScheduler sends SparkListenerTaskEnd before updating task's accumulators

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060063#comment-16060063 ] Apache Spark commented on SPARK-20342: -- User 'vanzin' has created a pull request for this issue:

[jira] [Resolved] (SPARK-19937) Collect metrics of block sizes when shuffle.

2017-06-22 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-19937. Resolution: Fixed Assignee: jin xing Fix Version/s: 2.3.0 > Collect

[jira] [Created] (SPARK-21184) QuantileSummaries implementation is wrong and QuantileSummariesSuite fails with larger n

2017-06-22 Thread Andrew Ray (JIRA)
Andrew Ray created SPARK-21184: -- Summary: QuantileSummaries implementation is wrong and QuantileSummariesSuite fails with larger n Key: SPARK-21184 URL: https://issues.apache.org/jira/browse/SPARK-21184

[jira] [Created] (SPARK-21183) Unable to return Google BigQuery INTEGER data type into Spark via google BigQuery JDBC driver: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to

2017-06-22 Thread Matthew Walton (JIRA)
Matthew Walton created SPARK-21183: -- Summary: Unable to return Google BigQuery INTEGER data type into Spark via google BigQuery JDBC driver: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to long. Key:

[jira] [Created] (SPARK-21182) Structured streaming on Spark-shell on windows

2017-06-22 Thread Vijay (JIRA)
Vijay created SPARK-21182: - Summary: Structured streaming on Spark-shell on windows Key: SPARK-21182 URL: https://issues.apache.org/jira/browse/SPARK-21182 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-21179) Unable to return Hive INT data type into Spark via Hive JDBC driver: Caused by: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.

2017-06-22 Thread Matthew Walton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Walton updated SPARK-21179: --- Affects Version/s: 2.1.1 Summary: Unable to return Hive INT data type into

[jira] [Updated] (SPARK-21168) KafkaRDD should always set kafka clientId.

2017-06-22 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-21168: - Component/s: (was: Structured Streaming) DStreams > KafkaRDD should always

[jira] [Commented] (SPARK-21167) Path is not decoded correctly when reading output of FileSink

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059816#comment-16059816 ] Apache Spark commented on SPARK-21167: -- User 'dijingran' has created a pull request for this issue:

[jira] [Issue Comment Deleted] (SPARK-21167) Path is not decoded correctly when reading output of FileSink

2017-06-22 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-21167: - Comment: was deleted (was: User 'dijingran' has created a pull request for this issue:

[jira] [Resolved] (SPARK-20599) ConsoleSink should work with write (batch)

2017-06-22 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-20599. -- Resolution: Fixed Fix Version/s: 2.3.0 > ConsoleSink should work with write (batch) >

[jira] [Updated] (SPARK-21155) Add (? running tasks) into Spark UI progress

2017-06-22 Thread Eric Vandenberg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Vandenberg updated SPARK-21155: Attachment: Screen Shot 2017-06-22 at 9.58.08 AM.png > Add (? running tasks) into Spark UI

[jira] [Assigned] (SPARK-21181) Suppress memory leak errors reported by netty

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21181: Assignee: (was: Apache Spark) > Suppress memory leak errors reported by netty >

[jira] [Commented] (SPARK-21181) Suppress memory leak errors reported by netty

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059671#comment-16059671 ] Apache Spark commented on SPARK-21181: -- User 'dhruve' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21181) Suppress memory leak errors reported by netty

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21181: Assignee: Apache Spark > Suppress memory leak errors reported by netty >

[jira] [Created] (SPARK-21181) Suppress memory leak errors reported by netty

2017-06-22 Thread Dhruve Ashar (JIRA)
Dhruve Ashar created SPARK-21181: Summary: Suppress memory leak errors reported by netty Key: SPARK-21181 URL: https://issues.apache.org/jira/browse/SPARK-21181 Project: Spark Issue Type:

[jira] [Commented] (SPARK-21110) Structs should be usable in inequality filters

2017-06-22 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059536#comment-16059536 ] Nicholas Chammas commented on SPARK-21110: -- cc [~marmbrus] - Assuming this is a valid feature

[jira] [Commented] (SPARK-21180) Remove conf from stats functions since now we have conf in LogicalPlan

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059525#comment-16059525 ] Apache Spark commented on SPARK-21180: -- User 'wzhfy' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21180) Remove conf from stats functions since now we have conf in LogicalPlan

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21180: Assignee: (was: Apache Spark) > Remove conf from stats functions since now we have

[jira] [Assigned] (SPARK-21180) Remove conf from stats functions since now we have conf in LogicalPlan

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21180: Assignee: Apache Spark > Remove conf from stats functions since now we have conf in

[jira] [Updated] (SPARK-21176) Master UI hangs with spark.ui.reverseProxy=true if the master node has many CPUs

2017-06-22 Thread Ingo Schuster (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ingo Schuster updated SPARK-21176: -- Description: In reverse proxy mode, Sparks exhausts the Jetty thread pool if the master node

[jira] [Created] (SPARK-21180) Remove conf from stats functions since now we have conf in LogicalPlan

2017-06-22 Thread Zhenhua Wang (JIRA)
Zhenhua Wang created SPARK-21180: Summary: Remove conf from stats functions since now we have conf in LogicalPlan Key: SPARK-21180 URL: https://issues.apache.org/jira/browse/SPARK-21180 Project:

[jira] [Commented] (SPARK-21166) Automated ML persistence

2017-06-22 Thread Mark Hamilton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059481#comment-16059481 ] Mark Hamilton commented on SPARK-21166: --- The code is currently being developed here:

[jira] [Created] (SPARK-21179) Unable to return Hive INT data type into Spark SQL via Hive JDBC driver: Caused by: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.

2017-06-22 Thread Matthew Walton (JIRA)
Matthew Walton created SPARK-21179: -- Summary: Unable to return Hive INT data type into Spark SQL via Hive JDBC driver: Caused by: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int. Key:

[jira] [Resolved] (SPARK-20832) Standalone master should explicitly inform drivers of worker deaths and invalidate external shuffle service outputs

2017-06-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20832. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18362

[jira] [Assigned] (SPARK-20832) Standalone master should explicitly inform drivers of worker deaths and invalidate external shuffle service outputs

2017-06-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-20832: --- Assignee: Jiang Xingbo > Standalone master should explicitly inform drivers of worker

[jira] [Commented] (SPARK-20295) when spark.sql.adaptive.enabled is enabled, have conflict with Exchange Resue

2017-06-22 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059273#comment-16059273 ] cen yuhai commented on SPARK-20295: --- I hit this bug... > when spark.sql.adaptive.enabled is enabled,

[jira] [Commented] (SPARK-21178) Add support for label specific metrics in MulticlassClassificationEvaluator

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059251#comment-16059251 ] Apache Spark commented on SPARK-21178: -- User 'rawataaryan9' has created a pull request for this

[jira] [Assigned] (SPARK-21178) Add support for label specific metrics in MulticlassClassificationEvaluator

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21178: Assignee: Apache Spark > Add support for label specific metrics in

[jira] [Assigned] (SPARK-21178) Add support for label specific metrics in MulticlassClassificationEvaluator

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21178: Assignee: (was: Apache Spark) > Add support for label specific metrics in

[jira] [Updated] (SPARK-21178) Add support for label specific metrics in MulticlassClassificationEvaluator

2017-06-22 Thread Aman Rawat (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aman Rawat updated SPARK-21178: --- Target Version/s: (was: 2.1.2) > Add support for label specific metrics in

[jira] [Created] (SPARK-21178) Add support for label specific metrics in MulticlassClassificationEvaluator

2017-06-22 Thread Aman Rawat (JIRA)
Aman Rawat created SPARK-21178: -- Summary: Add support for label specific metrics in MulticlassClassificationEvaluator Key: SPARK-21178 URL: https://issues.apache.org/jira/browse/SPARK-21178 Project:

[jira] [Updated] (SPARK-21177) df.saveAsTable slows down linearly, with number of appends

2017-06-22 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma updated SPARK-21177: Summary: df.saveAsTable slows down linearly, with number of appends (was: df.SaveAsTable

[jira] [Updated] (SPARK-21177) df.SaveAsTable slows down linearly, with number of appends.

2017-06-22 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma updated SPARK-21177: Summary: df.SaveAsTable slows down linearly, with number of appends. (was: Append to hive

[jira] [Created] (SPARK-21177) Append to hive slows down linearly, with number of appends.

2017-06-22 Thread Prashant Sharma (JIRA)
Prashant Sharma created SPARK-21177: --- Summary: Append to hive slows down linearly, with number of appends. Key: SPARK-21177 URL: https://issues.apache.org/jira/browse/SPARK-21177 Project: Spark

[jira] [Commented] (SPARK-11373) Add metrics to the History Server and providers

2017-06-22 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059141#comment-16059141 ] Steve Loughran commented on SPARK-11373: metrics might help with understanding the s3 load issues

[jira] [Resolved] (SPARK-21080) Workaround for HDFS delegation token expiry broken with some Hadoop versions

2017-06-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21080. --- Resolution: Not A Problem > Workaround for HDFS delegation token expiry broken with some Hadoop

[jira] [Commented] (SPARK-21080) Workaround for HDFS delegation token expiry broken with some Hadoop versions

2017-06-22 Thread Lukasz Raszka (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059022#comment-16059022 ] Lukasz Raszka commented on SPARK-21080: --- Update: I think it might be a mistake on our side, and it

[jira] [Commented] (SPARK-21167) Path is not decoded correctly when reading output of FileSink

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059019#comment-16059019 ] Apache Spark commented on SPARK-21167: -- User 'dijingran' has created a pull request for this issue:

[jira] [Commented] (SPARK-21171) Speculate task scheduling block dirve handle normal task when a job task number more than one hundred thousand

2017-06-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059018#comment-16059018 ] Sean Owen commented on SPARK-21171: --- There's no real detail here. I'd have to close this. This should

[jira] [Resolved] (SPARK-21161) SparkContext stopped when execute a query on Solr

2017-06-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21161. --- Resolution: Not A Problem > SparkContext stopped when execute a query on Solr >

[jira] [Resolved] (SPARK-21173) There are several configuration about SSL displayed in configuration.md but never be used.

2017-06-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21173. --- Resolution: Not A Problem > There are several configuration about SSL displayed in configuration.md

[jira] [Commented] (SPARK-21080) Workaround for HDFS delegation token expiry broken with some Hadoop versions

2017-06-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058985#comment-16058985 ] Sean Owen commented on SPARK-21080: --- This really isn't my area; maybe [~vanzin]? > Workaround for HDFS

[jira] [Updated] (SPARK-21176) Master UI hangs with spark.ui.reverseProxy=true if the master node has many CPUs

2017-06-22 Thread Ingo Schuster (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ingo Schuster updated SPARK-21176: -- Affects Version/s: 2.2.1 2.2.0 2.1.1 > Master UI

[jira] [Created] (SPARK-21176) Master UI hangs with spark.ui.reverseProxy=true if the master node has many CPUs

2017-06-22 Thread Ingo Schuster (JIRA)
Ingo Schuster created SPARK-21176: - Summary: Master UI hangs with spark.ui.reverseProxy=true if the master node has many CPUs Key: SPARK-21176 URL: https://issues.apache.org/jira/browse/SPARK-21176

[jira] [Commented] (SPARK-14174) Implement the Mini-Batch KMeans

2017-06-22 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058982#comment-16058982 ] zhengruifeng commented on SPARK-14174: -- [~mlnick] I send a new PR for MiniBatch KMeans, and the

[jira] [Commented] (SPARK-14174) Implement the Mini-Batch KMeans

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058975#comment-16058975 ] Apache Spark commented on SPARK-14174: -- User 'zhengruifeng' has created a pull request for this

[jira] [Resolved] (SPARK-21163) DataFrame.toPandas should respect the data type

2017-06-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21163. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18378

[jira] [Updated] (SPARK-14174) Accelerate KMeans via Mini-Batch EM

2017-06-22 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-14174: - Description: The MiniBatchKMeans is a variant of the KMeans algorithm which uses mini-batches

[jira] [Updated] (SPARK-14174) Implement the Mini-Batch KMeans

2017-06-22 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-14174: - Summary: Implement the Mini-Batch KMeans (was: Accelerate KMeans via Mini-Batch EM) >

[jira] [Updated] (SPARK-14174) Implement the Mini-Batch KMeans

2017-06-22 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-14174: - Attachment: MBKM.xlsx > Implement the Mini-Batch KMeans > --- > >

[jira] [Updated] (SPARK-14174) Accelerate KMeans via Mini-Batch EM

2017-06-22 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-14174: - Attachment: (was: MiniBatchKMeans_Performance.pdf) > Accelerate KMeans via Mini-Batch EM >

[jira] [Updated] (SPARK-14174) Accelerate KMeans via Mini-Batch EM

2017-06-22 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-14174: - Attachment: (was: MiniBatchKMeans_Performance_II.pdf) > Accelerate KMeans via Mini-Batch EM

[jira] [Comment Edited] (SPARK-19700) Design an API for pluggable scheduler implementations

2017-06-22 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058930#comment-16058930 ] Andrew Ash edited comment on SPARK-19700 at 6/22/17 7:47 AM: - Public

[jira] [Commented] (SPARK-19700) Design an API for pluggable scheduler implementations

2017-06-22 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058930#comment-16058930 ] Andrew Ash commented on SPARK-19700: Public implementation that's been around a while: Two Sigma's

[jira] [Commented] (SPARK-19700) Design an API for pluggable scheduler implementations

2017-06-22 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058917#comment-16058917 ] Andrew Ash commented on SPARK-19700: Found another potential implementation: Facebook's in-house

[jira] [Assigned] (SPARK-21175) Slow down "open blocks" on shuffle service when memory shortage to avoid OOM.

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21175: Assignee: (was: Apache Spark) > Slow down "open blocks" on shuffle service when

[jira] [Commented] (SPARK-21175) Slow down "open blocks" on shuffle service when memory shortage to avoid OOM.

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058892#comment-16058892 ] Apache Spark commented on SPARK-21175: -- User 'jinxing64' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21175) Slow down "open blocks" on shuffle service when memory shortage to avoid OOM.

2017-06-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21175: Assignee: Apache Spark > Slow down "open blocks" on shuffle service when memory shortage

[jira] [Created] (SPARK-21175) Slow down "open blocks" on shuffle service when memory shortage to avoid OOM.

2017-06-22 Thread jin xing (JIRA)
jin xing created SPARK-21175: Summary: Slow down "open blocks" on shuffle service when memory shortage to avoid OOM. Key: SPARK-21175 URL: https://issues.apache.org/jira/browse/SPARK-21175 Project: Spark

[jira] [Updated] (SPARK-21174) Validate sampling fraction in logical operator level

2017-06-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21174: Component/s: (was: Optimizer) SQL > Validate sampling fraction in logical operator
