[jira] [Commented] (SPARK-20679) Let ALS recommend for a subset of users/items

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002360#comment-16002360 ] Nick Pentreath commented on SPARK-20679: I'm working on this > Let ALS recommend for a subset of

[jira] [Commented] (SPARK-1449) Please delete old releases from mirroring system

2017-05-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002351#comment-16002351 ] Sean Owen commented on SPARK-1449: -- [~marmbrus] I think this step got missed in the last few releases. I

[jira] [Commented] (SPARK-10802) Let ALS recommend for subset of data

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002343#comment-16002343 ] Nick Pentreath commented on SPARK-10802: Hey folks - since the {{ALSModel}} in the ML API now

[jira] [Created] (SPARK-20679) Let ALS recommend for a subset of users/items

2017-05-09 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-20679: -- Summary: Let ALS recommend for a subset of users/items Key: SPARK-20679 URL: https://issues.apache.org/jira/browse/SPARK-20679 Project: Spark Issue

[jira] [Comment Edited] (SPARK-10408) Autoencoder

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002326#comment-16002326 ] Nick Pentreath edited comment on SPARK-10408 at 5/9/17 9:00 AM: What is

[jira] [Commented] (SPARK-10408) Autoencoder

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002326#comment-16002326 ] Nick Pentreath commented on SPARK-10408: What is the status here? I think it's fairly safe to say

[jira] [Commented] (SPARK-17685) WholeStageCodegenExec throws IndexOutOfBoundsException

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002320#comment-16002320 ] Apache Spark commented on SPARK-17685: -- User 'wangyum' has created a pull request for this issue:

[jira] [Commented] (SPARK-6323) Large rank matrix factorization with Nonlinear loss and constraints

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002310#comment-16002310 ] Nick Pentreath commented on SPARK-6323: --- I think it is safe to say this will not be feasible to

[jira] [Assigned] (SPARK-20615) SparseVector.argmax throws IndexOutOfBoundsException when the sparse vector has a size greater than zero but no elements defined.

2017-05-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-20615: - Assignee: Jon McLean > SparseVector.argmax throws IndexOutOfBoundsException when the sparse

[jira] [Resolved] (SPARK-20615) SparseVector.argmax throws IndexOutOfBoundsException when the sparse vector has a size greater than zero but no elements defined.

2017-05-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20615. --- Resolution: Fixed Fix Version/s: 2.1.2 2.2.1 Issue resolved by pull

[jira] [Commented] (SPARK-20638) Optimize the CartesianRDD to reduce repeatedly data fetching

2017-05-09 Thread Teng Jiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002299#comment-16002299 ] Teng Jiang commented on SPARK-20638: I think it is just buffered > Optimize the CartesianRDD to

[jira] [Assigned] (SPARK-20677) Clean up ALS recommend all improvement code.

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20677: Assignee: Nick Pentreath (was: Apache Spark) > Clean up ALS recommend all improvement

[jira] [Commented] (SPARK-20677) Clean up ALS recommend all improvement code.

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002297#comment-16002297 ] Apache Spark commented on SPARK-20677: -- User 'MLnick' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20677) Clean up ALS recommend all improvement code.

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20677: Assignee: Apache Spark (was: Nick Pentreath) > Clean up ALS recommend all improvement

[jira] [Updated] (SPARK-20678) Ndv for columns not in filter condition should also be updated

2017-05-09 Thread Zhenhua Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhenhua Wang updated SPARK-20678: - Description: In filter estimation, we update column stats for those columns in filter condition.

[jira] [Commented] (SPARK-20638) Optimize the CartesianRDD to reduce repeatedly data fetching

2017-05-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002295#comment-16002295 ] Sean Owen commented on SPARK-20638: --- I am still not clear why grouped() is better than buffering the

[jira] [Assigned] (SPARK-20678) Ndv for columns not in filter condition should also be updated

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20678: Assignee: Apache Spark > Ndv for columns not in filter condition should also be updated >

[jira] [Assigned] (SPARK-20678) Ndv for columns not in filter condition should also be updated

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20678: Assignee: (was: Apache Spark) > Ndv for columns not in filter condition should also

[jira] [Commented] (SPARK-20678) Ndv for columns not in filter condition should also be updated

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002293#comment-16002293 ] Apache Spark commented on SPARK-20678: -- User 'wzhfy' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20600) KafkaRelation should be pretty printed in web UI (Details for Query)

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20600: Assignee: (was: Apache Spark) > KafkaRelation should be pretty printed in web UI

[jira] [Assigned] (SPARK-20600) KafkaRelation should be pretty printed in web UI (Details for Query)

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20600: Assignee: Apache Spark > KafkaRelation should be pretty printed in web UI (Details for

[jira] [Commented] (SPARK-20600) KafkaRelation should be pretty printed in web UI (Details for Query)

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002289#comment-16002289 ] Apache Spark commented on SPARK-20600: -- User 'jaceklaskowski' has created a pull request for this

[jira] [Comment Edited] (SPARK-18004) DataFrame filter Predicate push-down fails for Oracle Timestamp type columns

2017-05-09 Thread Teng Yutong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002282#comment-16002282 ] Teng Yutong edited comment on SPARK-18004 at 5/9/17 8:32 AM: - a silly but

[jira] [Created] (SPARK-20678) Ndv for columns not in filter condition should also be updated

2017-05-09 Thread Zhenhua Wang (JIRA)
Zhenhua Wang created SPARK-20678: Summary: Ndv for columns not in filter condition should also be updated Key: SPARK-20678 URL: https://issues.apache.org/jira/browse/SPARK-20678 Project: Spark

[jira] [Commented] (SPARK-18004) DataFrame filter Predicate push-down fails for Oracle Timestamp type columns

2017-05-09 Thread Teng Yutong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002282#comment-16002282 ] Teng Yutong commented on SPARK-18004: - a silly but doable workaround: cast TimestampType row to

[jira] [Resolved] (SPARK-20587) Improve performance of ML ALS recommendForAll

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-20587. Resolution: Fixed Fix Version/s: 2.2.1 Issue resolved by pull request 17845

[jira] [Resolved] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-11968. Resolution: Fixed Fix Version/s: 2.2.1 Issue resolved by pull request 17742

[jira] [Assigned] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-11968: -- Assignee: Peng Meng (was: Nick Pentreath) > ALS recommend all methods spend most of

[jira] [Commented] (SPARK-20590) Map default input data source formats to inlined classes

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002225#comment-16002225 ] Apache Spark commented on SPARK-20590: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-20677) Clean up ALS recommend all improvement code.

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-20677: -- Assignee: Nick Pentreath > Clean up ALS recommend all improvement code. >

[jira] [Updated] (SPARK-20677) Clean up ALS recommend all improvement code.

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-20677: --- Description: SPARK-11968 and SPARK-20587 added performance improvements to the "recommend

[jira] [Closed] (SPARK-20673) LDA `optimizer` do not really support case insensitive

2017-05-09 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng closed SPARK-20673. Resolution: Duplicate > LDA `optimizer` do not really support case insensitive >

[jira] [Created] (SPARK-20677) Clean up ALS recommend all improvement code.

2017-05-09 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-20677: -- Summary: Clean up ALS recommend all improvement code. Key: SPARK-20677 URL: https://issues.apache.org/jira/browse/SPARK-20677 Project: Spark Issue

[jira] [Updated] (SPARK-20675) Support Index to skip when retrieval disk structure in CoGroupedRDD

2017-05-09 Thread darion yaphet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] darion yaphet updated SPARK-20675: -- Description: CoGroupedRDD's compute() will retrieve each StreamBuffer (a disk structure

[jira] [Closed] (SPARK-20671) Processing muitple kafka topics with single spark streaming context hangs on batchSubmitted.

2017-05-09 Thread amit kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] amit kumar closed SPARK-20671. -- Resolution: Not A Problem My bad. I configured it wrong. setMaster("local[*]") in place of

[jira] [Created] (SPARK-20676) Upload to PyPi

2017-05-09 Thread holdenk (JIRA)
holdenk created SPARK-20676: --- Summary: Upload to PyPi Key: SPARK-20676 URL: https://issues.apache.org/jira/browse/SPARK-20676 Project: Spark Issue Type: Sub-task Components: PySpark

[jira] [Created] (SPARK-20675) Support Index to skip when retrieval disk structure in CoGroupedRDD

2017-05-09 Thread darion yaphet (JIRA)
darion yaphet created SPARK-20675: - Summary: Support Index to skip when retrieval disk structure in CoGroupedRDD Key: SPARK-20675 URL: https://issues.apache.org/jira/browse/SPARK-20675 Project:

[jira] [Assigned] (SPARK-20674) Support registering UserDefinedFunction as named UDF

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20674: Assignee: Apache Spark (was: Reynold Xin) > Support registering UserDefinedFunction as

[jira] [Assigned] (SPARK-20674) Support registering UserDefinedFunction as named UDF

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20674: Assignee: Reynold Xin (was: Apache Spark) > Support registering UserDefinedFunction as

[jira] [Commented] (SPARK-20674) Support registering UserDefinedFunction as named UDF

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002142#comment-16002142 ] Apache Spark commented on SPARK-20674: -- User 'rxin' has created a pull request for this issue:

[jira] [Created] (SPARK-20674) Support registering UserDefinedFunction as named UDF

2017-05-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-20674: --- Summary: Support registering UserDefinedFunction as named UDF Key: SPARK-20674 URL: https://issues.apache.org/jira/browse/SPARK-20674 Project: Spark Issue

[jira] [Assigned] (SPARK-20673) LDA `optimizer` do not really support case insensitive

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20673: Assignee: (was: Apache Spark) > LDA `optimizer` do not really support case

[jira] [Assigned] (SPARK-20673) LDA `optimizer` do not really support case insensitive

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20673: Assignee: Apache Spark > LDA `optimizer` do not really support case insensitive >

[jira] [Commented] (SPARK-20673) LDA `optimizer` do not really support case insensitive

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002130#comment-16002130 ] Apache Spark commented on SPARK-20673: -- User 'zhengruifeng' has created a pull request for this

[jira] [Assigned] (SPARK-20672) Keep the `isStreaming` property in triggerLogicalPlan in Structured Streaming

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20672: Assignee: (was: Apache Spark) > Keep the `isStreaming` property in triggerLogicalPlan

[jira] [Assigned] (SPARK-20672) Keep the `isStreaming` property in triggerLogicalPlan in Structured Streaming

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20672: Assignee: Apache Spark > Keep the `isStreaming` property in triggerLogicalPlan in

[jira] [Commented] (SPARK-20672) Keep the `isStreaming` property in triggerLogicalPlan in Structured Streaming

2017-05-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002127#comment-16002127 ] Apache Spark commented on SPARK-20672: -- User 'uncleGen' has created a pull request for this issue:

[jira] [Created] (SPARK-20673) LDA `optimizer` do not really support case insensitive

2017-05-09 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-20673: Summary: LDA `optimizer` do not really support case insensitive Key: SPARK-20673 URL: https://issues.apache.org/jira/browse/SPARK-20673 Project: Spark

[jira] [Commented] (SPARK-20608) Standby namenodes should be allowed to included in yarn.spark.access.namenodes to support HDFS HA

2017-05-09 Thread Yuechen Chen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002123#comment-16002123 ] Yuechen Chen commented on SPARK-20608: -- All said, I think it is unreasonable that spark application

[jira] [Created] (SPARK-20672) Keep the `isStreaming` property in triggerLogicalPlan in Structured Streaming

2017-05-09 Thread Genmao Yu (JIRA)
Genmao Yu created SPARK-20672: - Summary: Keep the `isStreaming` property in triggerLogicalPlan in Structured Streaming Key: SPARK-20672 URL: https://issues.apache.org/jira/browse/SPARK-20672 Project:

[jira] [Commented] (SPARK-20608) Standby namenodes should be allowed to included in yarn.spark.access.namenodes to support HDFS HA

2017-05-09 Thread Yuechen Chen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002119#comment-16002119 ] Yuechen Chen commented on SPARK-20608: -- Thanks [~liuml07], [~vanzin]. My colleague in charge of
