[ https://issues.apache.org/jira/browse/SPARK-8997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623292#comment-14623292 ]
Feynman Liang edited comment on SPARK-8997 at 7/12/15 11:43 PM:
Why use PrimitiveKeyOpenHashMap if the keys will be Array[Int] (and later
Array[Array[Item]])? These are not primitive types and will not benefit from
@specialized annotations.
I'm also not clear on what item 3 means: aren't List and Array both eager? Did
you mean to use a Stream (lazy) or an ArrayBuffer (in-place update)? Which part
of the code exactly are you referring to?
was (Author: fliang):
Why use PrimitiveKeyOpenHashMap if the keys will be Array[Int] (and later
Array[Array[Item]])? These are not primitive types and will not benefit from
@specialized annotations.
I'm also not clear on what item 3 means: aren't List and Array both eager? Did
you mean to use a Stream? Which part of the code exactly are you referring to?
Improve LocalPrefixSpan performance
---
Key: SPARK-8997
URL: https://issues.apache.org/jira/browse/SPARK-8997
Project: Spark
Issue Type: Improvement
Components: MLlib
Affects Versions: 1.5.0
Reporter: Xiangrui Meng
Assignee: Feynman Liang
Original Estimate: 24h
Remaining Estimate: 24h
We can improve the performance as follows:
1. run should return an Iterator instead of an Array.
2. The local count shouldn't use groupBy, which creates too many arrays. We can
use PrimitiveKeyOpenHashMap instead.
3. We can use a list to avoid materializing frequent sequences.
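
As a rough illustration of points 1 and 2, here is a minimal sketch, not the
actual LocalPrefixSpan code. It assumes items are plain Int values and uses
scala.collection.mutable.HashMap as a stand-in for PrimitiveKeyOpenHashMap,
which is internal to Spark. The names LocalCountSketch, countItems, and
frequentItems are hypothetical.

```scala
import scala.collection.mutable

object LocalCountSketch {
  // Counts, for each item, the number of sequences that contain it.
  // A single mutable map is updated in place, so no per-key arrays are
  // built the way groupBy would build them.
  def countItems(sequences: Array[Array[Int]]): mutable.HashMap[Int, Long] = {
    val counts = mutable.HashMap.empty[Int, Long]
    sequences.foreach { seq =>
      // distinct ensures an item is counted once per sequence
      seq.distinct.foreach { item =>
        counts(item) = counts.getOrElse(item, 0L) + 1L
      }
    }
    counts
  }

  // Returns the frequent items lazily as an Iterator instead of
  // materializing them into an Array.
  def frequentItems(
      sequences: Array[Array[Int]],
      minCount: Long): Iterator[(Int, Long)] = {
    countItems(sequences).iterator.filter { case (_, count) => count >= minCount }
  }
}
```

The caller can consume the Iterator one element at a time, or call toArray
only when a fully materialized result is actually needed.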