[jira] [Created] (SPARK-19383) Spark Sql Fails with Cassandra 3.6 and later PER PARTITION LIMIT option

2017-01-26 Thread Brent Dorsey (JIRA)
Brent Dorsey created SPARK-19383: Summary: Spark Sql Fails with Cassandra 3.6 and later PER PARTITION LIMIT option Key: SPARK-19383 URL: https://issues.apache.org/jira/browse/SPARK-19383 Project: Spa

[jira] [Resolved] (SPARK-18929) Add Tweedie distribution in GLM

2017-01-26 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-18929. - Resolution: Fixed Fix Version/s: 2.2.0 > Add Tweedie distribution in GLM > ---

[jira] [Created] (SPARK-19382) Test sparse vectors in LinearSVCSuite

2017-01-26 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-19382: - Summary: Test sparse vectors in LinearSVCSuite Key: SPARK-19382 URL: https://issues.apache.org/jira/browse/SPARK-19382 Project: Spark Issue Type: T

[jira] [Commented] (SPARK-9215) Implement WAL-free Kinesis receiver that give at-least once guarantee

2017-01-26 Thread Gaurav Shah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15841011#comment-15841011 ] Gaurav Shah commented on SPARK-9215: [~tdas] ping > Implement WAL-free Kinesis receiv

[jira] [Commented] (SPARK-19304) Kinesis checkpoint recovery is 10x slow

2017-01-26 Thread Gaurav Shah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15841009#comment-15841009 ] Gaurav Shah commented on SPARK-19304: - There are two issues in `KinesisSequenceRangeI

[jira] [Resolved] (SPARK-18788) Add getNumPartitions() to SparkR

2017-01-26 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-18788. -- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 Target Versi

[jira] [Commented] (SPARK-18821) Bisecting k-means wrapper in SparkR

2017-01-26 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15841004#comment-15841004 ] Felix Cheung commented on SPARK-18821: -- Need to follow up with programming guide, ex

[jira] [Updated] (SPARK-18821) Bisecting k-means wrapper in SparkR

2017-01-26 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-18821: - Fix Version/s: 2.2.0 > Bisecting k-means wrapper in SparkR > ---

[jira] [Resolved] (SPARK-18821) Bisecting k-means wrapper in SparkR

2017-01-26 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-18821. -- Resolution: Fixed Assignee: Miao Wang > Bisecting k-means wrapper in SparkR > ---

[jira] [Updated] (SPARK-18218) Optimize BlockMatrix multiplication, which may cause OOM and low parallelism usage problem in several cases

2017-01-26 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-18218: Assignee: Weichen Xu > Optimize BlockMatrix multiplication, which may cause OOM and low parallelism

[jira] [Resolved] (SPARK-18218) Optimize BlockMatrix multiplication, which may cause OOM and low parallelism usage problem in several cases

2017-01-26 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-18218. - Resolution: Implemented Fix Version/s: 2.2.0 Resolved by https://github.com/apache/spark/p

[jira] [Commented] (SPARK-14480) Remove meaningless StringIteratorReader for CSV data source for better performance

2017-01-26 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840872#comment-15840872 ] Hyukjin Kwon commented on SPARK-14480: -- removed `StringIteratorReader` concatenates

[jira] [Created] (SPARK-19381) spark 2.1.0 raises unrelated (unhelpful) error for parquet files beginning with '_'

2017-01-26 Thread Paul Pearce (JIRA)
Paul Pearce created SPARK-19381: --- Summary: spark 2.1.0 raises unrelated (unhelpful) error for parquet files beginning with '_' Key: SPARK-19381 URL: https://issues.apache.org/jira/browse/SPARK-19381 Pro

[jira] [Assigned] (SPARK-14480) Remove meaningless StringIteratorReader for CSV data source for better performance

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14480: Assignee: Apache Spark (was: Hyukjin Kwon) > Remove meaningless StringIteratorReader for

[jira] [Assigned] (SPARK-14480) Remove meaningless StringIteratorReader for CSV data source for better performance

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14480: Assignee: Hyukjin Kwon (was: Apache Spark) > Remove meaningless StringIteratorReader for

[jira] [Updated] (SPARK-19381) spark 2.1.0 raises unrelated (unhelpful) error for parquet filenames beginning with '_'

2017-01-26 Thread Paul Pearce (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Pearce updated SPARK-19381: Summary: spark 2.1.0 raises unrelated (unhelpful) error for parquet filenames beginning with '_' (

[jira] [Commented] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2017-01-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840857#comment-15840857 ] Liang-Chi Hsieh commented on SPARK-18539: - [~lian cheng] Yea, I see. The term {{o

[jira] [Reopened] (SPARK-14480) Remove meaningless StringIteratorReader for CSV data source for better performance

2017-01-26 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reopened SPARK-14480: This patch has a regression: a column that has an escaped newline can't be parsed correctly anymore. >

[jira] [Updated] (SPARK-18218) Optimize BlockMatrix multiplication, which may cause OOM and low parallelism usage problem in several cases

2017-01-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18218: -- Shepherd: Burak Yavuz (was: Yanbo Liang) > Optimize BlockMatrix multiplication, which

[jira] [Updated] (SPARK-19380) YARN - Dynamic allocation should use configured number of executors as max number of executors

2017-01-26 Thread Zhe Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated SPARK-19380: -- Description: SPARK-13723 only uses user's number of executors as the initial number of executors when

[jira] [Created] (SPARK-19380) YARN - Dynamic allocation should use configured number of executors as max number of executors

2017-01-26 Thread Zhe Zhang (JIRA)
Zhe Zhang created SPARK-19380: - Summary: YARN - Dynamic allocation should use configured number of executors as max number of executors Key: SPARK-19380 URL: https://issues.apache.org/jira/browse/SPARK-19380

[jira] [Commented] (SPARK-19220) SSL redirect handler only redirects the server's root

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840785#comment-15840785 ] Apache Spark commented on SPARK-19220: -- User 'vanzin' has created a pull request for

[jira] [Updated] (SPARK-19379) SparkAppHandle.getState not registering FAILED state upon Spark app failure in Local mode

2017-01-26 Thread Adam Kramer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Kramer updated SPARK-19379: Summary: SparkAppHandle.getState not registering FAILED state upon Spark app failure in Local mode

[jira] [Updated] (SPARK-19220) SSL redirect handler only redirects the server's root

2017-01-26 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-19220: --- Fix Version/s: 2.1.1 > SSL redirect handler only redirects the server's root > --

[jira] [Updated] (SPARK-19379) SparkAppHandle.getState not registered FAILED state upon Spark app failure in Local mode

2017-01-26 Thread Adam Kramer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Kramer updated SPARK-19379: Summary: SparkAppHandle.getState not registered FAILED state upon Spark app failure in Local mode

[jira] [Updated] (SPARK-19379) SparkAppHandle.getState not registered FAILED state upon Spark app failure in Local mode

2017-01-26 Thread Adam Kramer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Kramer updated SPARK-19379: Affects Version/s: 2.1.0 > SparkAppHandle.getState not registered FAILED state upon Spark app failu

[jira] [Created] (SPARK-19379) SparkAppHandle.getState not registered FAILED state upon Spark app failure

2017-01-26 Thread Adam Kramer (JIRA)
Adam Kramer created SPARK-19379: --- Summary: SparkAppHandle.getState not registered FAILED state upon Spark app failure Key: SPARK-19379 URL: https://issues.apache.org/jira/browse/SPARK-19379 Project: Spa

[jira] [Updated] (SPARK-19378) StateOperator metrics should still return the total number of rows in state even if there was no data for a trigger

2017-01-26 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-19378: Description: If you have a StreamingDataFrame with an aggregation, we report a metric called state

[jira] [Assigned] (SPARK-19378) StateOperator metrics should still return the total number of rows in state even if there was no data for a trigger

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19378: Assignee: Burak Yavuz (was: Apache Spark) > StateOperator metrics should still return the

[jira] [Assigned] (SPARK-19378) StateOperator metrics should still return the total number of rows in state even if there was no data for a trigger

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19378: Assignee: Apache Spark (was: Burak Yavuz) > StateOperator metrics should still return the

[jira] [Commented] (SPARK-19378) StateOperator metrics should still return the total number of rows in state even if there was no data for a trigger

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840757#comment-15840757 ] Apache Spark commented on SPARK-19378: -- User 'brkyvz' has created a pull request for

[jira] [Commented] (SPARK-4638) Spark's MLlib SVM classification to include Kernels like Gaussian / (RBF) to find non linear boundaries

2017-01-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840711#comment-15840711 ] Joseph K. Bradley commented on SPARK-4638: -- Commenting here b/c of the recent dev

[jira] [Created] (SPARK-19378) StateOperator metrics should still return the total number of rows in state even if there was no data for a trigger

2017-01-26 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-19378: --- Summary: StateOperator metrics should still return the total number of rows in state even if there was no data for a trigger Key: SPARK-19378 URL: https://issues.apache.org/jira/bro

[jira] [Updated] (SPARK-18080) Locality Sensitive Hashing (LSH) Python API

2017-01-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18080: -- Assignee: Yun Ni (was: Yanbo Liang) > Locality Sensitive Hashing (LSH) Python API > --

[jira] [Updated] (SPARK-19354) Killed tasks are getting marked as FAILED

2017-01-26 Thread Devaraj K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated SPARK-19354: -- Description: When we enable speculation, we can see there are multiple attempts running for the same t

[jira] [Updated] (SPARK-19377) Killed tasks should have the status as KILLED

2017-01-26 Thread Devaraj K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated SPARK-19377: -- Description: |143|10 |0 |SUCCESS|NODE_LOCAL |6 / x.xx.x.x stdout stderr |2017/

[jira] [Created] (SPARK-19377) Killed tasks should have the status as KILLED

2017-01-26 Thread Devaraj K (JIRA)
Devaraj K created SPARK-19377: - Summary: Killed tasks should have the status as KILLED Key: SPARK-19377 URL: https://issues.apache.org/jira/browse/SPARK-19377 Project: Spark Issue Type: Improveme

[jira] [Assigned] (SPARK-19067) mapWithState Style API

2017-01-26 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das reassigned SPARK-19067: - Assignee: Tathagata Das > mapWithState Style API > -- > >

[jira] [Commented] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-01-26 Thread Thunder Stumpges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840569#comment-15840569 ] Thunder Stumpges commented on SPARK-19371: -- Thanks Sean. I don't think I can (al

[jira] [Resolved] (SPARK-19376) CLONE - CheckAnalysis rejects TPCDS query 32

2017-01-26 Thread Mostafa Shahdadi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Shahdadi resolved SPARK-19376. -- Resolution: Fixed > CLONE - CheckAnalysis rejects TPCDS query 32 >

[jira] [Created] (SPARK-19376) CLONE - CheckAnalysis rejects TPCDS query 32

2017-01-26 Thread Mostafa Shahdadi (JIRA)
Mostafa Shahdadi created SPARK-19376: Summary: CLONE - CheckAnalysis rejects TPCDS query 32 Key: SPARK-19376 URL: https://issues.apache.org/jira/browse/SPARK-19376 Project: Spark Issue Ty

[jira] [Updated] (SPARK-19364) Stream Blocks in Storage Persists Forever when Kinesis Checkpoints are enabled and an exception is thrown

2017-01-26 Thread Andrew Milkowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Milkowski updated SPARK-19364: - Priority: Blocker (was: Major) > Stream Blocks in Storage Persists Forever when Kinesis

[jira] [Updated] (SPARK-19364) Stream Blocks in Storage Persists Forever when Kinesis Checkpoints are enabled and an exception is thrown

2017-01-26 Thread Andrew Milkowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Milkowski updated SPARK-19364: - Description: -- update --- we found that below situation occurs when we encounter "com.a

[jira] [Updated] (SPARK-19364) Stream Blocks in Storage Persists Forever when Kinesis Checkpoints are enabled and an exception is thrown

2017-01-26 Thread Andrew Milkowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Milkowski updated SPARK-19364: - Summary: Stream Blocks in Storage Persists Forever when Kinesis Checkpoints are enabled a

[jira] [Commented] (SPARK-19111) S3 Mesos history upload fails silently if too large

2017-01-26 Thread Charles Allen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840419#comment-15840419 ] Charles Allen commented on SPARK-19111: --- While switching to s3a helped the logs upl

[jira] [Commented] (SPARK-19111) S3 Mesos history upload fails silently if too large

2017-01-26 Thread Charles Allen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840418#comment-15840418 ] Charles Allen commented on SPARK-19111: --- We have a patch https://github.com/apache/

[jira] [Updated] (SPARK-19364) Some Stream Blocks in Storage Persists Forever

2017-01-26 Thread Andrew Milkowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Milkowski updated SPARK-19364: - Description: -- update --- we found that below situation occurs when we encounter "com.a

[jira] [Updated] (SPARK-19364) Some Stream Blocks in Storage Persists Forever

2017-01-26 Thread Andrew Milkowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Milkowski updated SPARK-19364: - Description: *** update *** we found that below situation occurs when we encounter "com.

[jira] [Commented] (SPARK-15505) Explode nested Array in DF Column into Multiple Columns

2017-01-26 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840390#comment-15840390 ] Herman van Hovell commented on SPARK-15505: --- I am closing this as a won't fix.

[jira] [Closed] (SPARK-19375) na.fill() should not change the data type of column

2017-01-26 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-19375. -- Resolution: Duplicate > na.fill() should not change the data type of column > -

[jira] [Closed] (SPARK-15505) Explode nested Array in DF Column into Multiple Columns

2017-01-26 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-15505. - Resolution: Won't Fix > Explode nested Array in DF Column into Multiple Columns > --

[jira] [Commented] (SPARK-18080) Locality Sensitive Hashing (LSH) Python API

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840388#comment-15840388 ] Apache Spark commented on SPARK-18080: -- User 'Yunni' has created a pull request for

[jira] [Updated] (SPARK-19374) java.security.KeyManagementException: Default SSLContext is initialized automatically

2017-01-26 Thread Derek M Miller (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Derek M Miller updated SPARK-19374: --- Description: I am currently getting an SSL error when turning on ssl. I have confirmed nothi

[jira] [Created] (SPARK-19375) na.fill() should not change the data type of column

2017-01-26 Thread Davies Liu (JIRA)
Davies Liu created SPARK-19375: -- Summary: na.fill() should not change the data type of column Key: SPARK-19375 URL: https://issues.apache.org/jira/browse/SPARK-19375 Project: Spark Issue Type: B

[jira] [Created] (SPARK-19374) java.security.KeyManagementException: Default SSLContext is initialized automatically

2017-01-26 Thread Derek M Miller (JIRA)
Derek M Miller created SPARK-19374: -- Summary: java.security.KeyManagementException: Default SSLContext is initialized automatically Key: SPARK-19374 URL: https://issues.apache.org/jira/browse/SPARK-19374

[jira] [Commented] (SPARK-19316) Spark event logs are huge compared to 1.5.2

2017-01-26 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840348#comment-15840348 ] Jisoo Kim commented on SPARK-19316: --- Found the duplicate and made a PR to resolve the i

[jira] [Assigned] (SPARK-16333) Excessive Spark history event/json data size (5GB each)

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16333: Assignee: (was: Apache Spark) > Excessive Spark history event/json data size (5GB each

[jira] [Assigned] (SPARK-16333) Excessive Spark history event/json data size (5GB each)

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16333: Assignee: Apache Spark > Excessive Spark history event/json data size (5GB each) > ---

[jira] [Commented] (SPARK-16333) Excessive Spark history event/json data size (5GB each)

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840340#comment-15840340 ] Apache Spark commented on SPARK-16333: -- User 'jisookim0513' has created a pull reque

[jira] [Commented] (SPARK-15505) Explode nested Array in DF Column into Multiple Columns

2017-01-26 Thread Jorge Machado (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840227#comment-15840227 ] Jorge Machado commented on SPARK-15505: --- Sometimes you read from a database that ha

[jira] [Assigned] (SPARK-18873) New test cases for scalar subquery

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18873: Assignee: Apache Spark > New test cases for scalar subquery >

[jira] [Commented] (SPARK-18873) New test cases for scalar subquery

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840197#comment-15840197 ] Apache Spark commented on SPARK-18873: -- User 'nsyca' has created a pull request for

[jira] [Assigned] (SPARK-18873) New test cases for scalar subquery

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18873: Assignee: (was: Apache Spark) > New test cases for scalar subquery > -

[jira] [Created] (SPARK-19373) Mesos implementation of spark.scheduler.minRegisteredResourcesRatio looks at acquired cores rather than registerd cores

2017-01-26 Thread Michael Gummelt (JIRA)
Michael Gummelt created SPARK-19373: --- Summary: Mesos implementation of spark.scheduler.minRegisteredResourcesRatio looks at acquired cores rather than registerd cores Key: SPARK-19373 URL: https://issues.apache

[jira] [Commented] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2017-01-26 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840186#comment-15840186 ] Cheng Lian commented on SPARK-18539: [~viirya], sorry for the (super) late reply. Wha

[jira] [Commented] (SPARK-17975) EMLDAOptimizer fails with ClassCastException on YARN

2017-01-26 Thread Ilya Matiach (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840151#comment-15840151 ] Ilya Matiach commented on SPARK-17975: -- [~josephkb] I was able to verify that this i

[jira] [Commented] (SPARK-19220) SSL redirect handler only redirects the server's root

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840138#comment-15840138 ] Apache Spark commented on SPARK-19220: -- User 'vanzin' has created a pull request for

[jira] [Resolved] (SPARK-19338) Always Identical Name for UDF in the EXPLAIN output

2017-01-26 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-19338. - Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 > Always Identical Name for UDF in

[jira] [Updated] (SPARK-19338) Always Identical Name for UDF in the EXPLAIN output

2017-01-26 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-19338: Assignee: Takeshi Yamamuro > Always Identical Name for UDF in the EXPLAIN output > ---

[jira] [Commented] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840113#comment-15840113 ] Sean Owen commented on SPARK-19371: --- There's a tension between waiting for locality and

[jira] [Assigned] (SPARK-18872) New test cases for EXISTS subquery

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18872: Assignee: Apache Spark > New test cases for EXISTS subquery >

[jira] [Commented] (SPARK-18872) New test cases for EXISTS subquery

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840098#comment-15840098 ] Apache Spark commented on SPARK-18872: -- User 'dilipbiswal' has created a pull reques

[jira] [Assigned] (SPARK-18872) New test cases for EXISTS subquery

2017-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18872: Assignee: (was: Apache Spark) > New test cases for EXISTS subquery > -

[jira] [Created] (SPARK-19372) Code generation for Filter predicate including many OR conditions exceeds JVM method size limit

2017-01-26 Thread Jay Pranavamurthi (JIRA)
Jay Pranavamurthi created SPARK-19372: - Summary: Code generation for Filter predicate including many OR conditions exceeds JVM method size limit Key: SPARK-19372 URL: https://issues.apache.org/jira/browse/SPA

[jira] [Updated] (SPARK-19372) Code generation for Filter predicate including many OR conditions exceeds JVM method size limit

2017-01-26 Thread Jay Pranavamurthi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Pranavamurthi updated SPARK-19372: -- Attachment: wide400cols.csv > Code generation for Filter predicate including many OR co

[jira] [Created] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-01-26 Thread Thunder Stumpges (JIRA)
Thunder Stumpges created SPARK-19371: Summary: Cannot spread cached partitions evenly across executors Key: SPARK-19371 URL: https://issues.apache.org/jira/browse/SPARK-19371 Project: Spark

[jira] [Resolved] (SPARK-19369) SparkConf not getting properly initialized in PySpark 2.1.0

2017-01-26 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-19369. Resolution: Duplicate Try setting those in the command line for now; this will be fixed in

[jira] [Updated] (SPARK-19370) Flaky test: MetadataCacheSuite

2017-01-26 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-19370: --- Affects Version/s: 2.1.0 > Flaky test: MetadataCacheSuite > -- > >

[jira] [Created] (SPARK-19370) Flaky test: MetadataCacheSuite

2017-01-26 Thread Davies Liu (JIRA)
Davies Liu created SPARK-19370: -- Summary: Flaky test: MetadataCacheSuite Key: SPARK-19370 URL: https://issues.apache.org/jira/browse/SPARK-19370 Project: Spark Issue Type: Test Repor

[jira] [Updated] (SPARK-19370) Flaky test: MetadataCacheSuite

2017-01-26 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-19370: --- Description: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.4/2703/conso

[jira] [Updated] (SPARK-19368) Very bad performance in BlockMatrix.toIndexedRowMatrix()

2017-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-19368: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) Do you see ways to optimize for

[jira] [Commented] (SPARK-15505) Explode nested Array in DF Column into Multiple Columns

2017-01-26 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839908#comment-15839908 ] Herman van Hovell commented on SPARK-15505: --- H This is will be quite ba

[jira] [Created] (SPARK-19369) SparkConf not getting properly initialized in PySpark 2.1.0

2017-01-26 Thread Sidney Feiner (JIRA)
Sidney Feiner created SPARK-19369: - Summary: SparkConf not getting properly initialized in PySpark 2.1.0 Key: SPARK-19369 URL: https://issues.apache.org/jira/browse/SPARK-19369 Project: Spark

[jira] [Updated] (SPARK-19369) SparkConf not getting properly initialized in PySpark 2.1.0

2017-01-26 Thread Sidney Feiner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sidney Feiner updated SPARK-19369: -- Description: Trying to migrate from Spark 1.6 to 2.1, I've stumbled upon a small problem - my

[jira] [Commented] (SPARK-19122) Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order

2017-01-26 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839845#comment-15839845 ] Herman van Hovell commented on SPARK-19122: --- [~tejasp] We should fix this. The

[jira] [Commented] (SPARK-5786) Documentation of Narrow Dependencies

2017-01-26 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839788#comment-15839788 ] Hyukjin Kwon commented on SPARK-5786: - It seems they are documented, at least, in API

[jira] [Resolved] (SPARK-2687) after receving allocated containers,amClient should remove ContainerRequest.

2017-01-26 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-2687. - Resolution: Duplicate I am resolving this per https://github.com/apache/spark/pull/3245#issuecomm

[jira] [Commented] (SPARK-18839) Executor is active on web, but actually is dead

2017-01-26 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839719#comment-15839719 ] Hyukjin Kwon commented on SPARK-18839: -- [~uncleGen] Would you mind elaborating why y

[jira] [Commented] (SPARK-18579) spark-csv strips whitespace (pyspark)

2017-01-26 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839704#comment-15839704 ] Hyukjin Kwon commented on SPARK-18579: -- Can we just strip them within the dataframe/

[jira] [Resolved] (SPARK-19361) kafka.maxRatePerPartition for compacted topic cause exception

2017-01-26 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-19361. Resolution: Duplicate > kafka.maxRatePerPartition for compacted topic cause exception > ---

[jira] [Commented] (SPARK-19361) kafka.maxRatePerPartition for compacted topic cause exception

2017-01-26 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839683#comment-15839683 ] Cody Koeninger commented on SPARK-19361: Compacted topics in general don't work w

[jira] [Commented] (SPARK-19368) Very bad performance in BlockMatrix.toIndexedRowMatrix()

2017-01-26 Thread Ohad Raviv (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839673#comment-15839673 ] Ohad Raviv commented on SPARK-19368: caused by.. > Very bad performance in BlockMatr

[jira] [Updated] (SPARK-19368) Very bad performance in BlockMatrix.toIndexedRowMatrix()

2017-01-26 Thread Ohad Raviv (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ohad Raviv updated SPARK-19368: --- Attachment: profiler snapshot.png > Very bad performance in BlockMatrix.toIndexedRowMatrix() > --

[jira] [Resolved] (SPARK-17734) inner equi-join shorthand that returns Datasets, like DataFrame already has

2017-01-26 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-17734. -- Resolution: Won't Fix We have {{joinWith}} to return {{Dataset}}. Also, we have {{join}} and {

[jira] [Updated] (SPARK-19368) Very bad performance in BlockMatrix.toIndexedRowMatrix()

2017-01-26 Thread Ohad Raviv (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ohad Raviv updated SPARK-19368: --- Description: In SPARK-12869, this function was optimized for the case of dense matrices using Breeze

[jira] [Created] (SPARK-19368) Very bad performance in BlockMatrix.toIndexedRowMatrix()

2017-01-26 Thread Ohad Raviv (JIRA)
Ohad Raviv created SPARK-19368: -- Summary: Very bad performance in BlockMatrix.toIndexedRowMatrix() Key: SPARK-19368 URL: https://issues.apache.org/jira/browse/SPARK-19368 Project: Spark Issue Ty

[jira] [Created] (SPARK-19367) Hive metastore temporary configuration doesn't specify default filesystem

2017-01-26 Thread Jacek Lewandowski (JIRA)
Jacek Lewandowski created SPARK-19367: - Summary: Hive metastore temporary configuration doesn't specify default filesystem Key: SPARK-19367 URL: https://issues.apache.org/jira/browse/SPARK-19367 P
