[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738648#comment-14738648
]
Sean Owen commented on SPARK-10493:
---
OK, yes I see now that temp4 is count-ed. I'm out of ideas. I
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738829#comment-14738829
]
Sean Owen commented on SPARK-10493:
---
Maybe union() tides you over; CDH 5.5 = Spark 1.5 is coming in ~2
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738794#comment-14738794
]
Glenn Strycker commented on SPARK-10493:
Unfortunately we don't have anything past 1.3.0. We're
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736879#comment-14736879
]
Sean Owen commented on SPARK-10493:
---
That much should be OK.
zipPartitions only makes sense if you
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736869#comment-14736869
]
Glenn Strycker commented on SPARK-10493:
The RDD I am using has the form ((String, String),
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737001#comment-14737001
]
Glenn Strycker commented on SPARK-10493:
In this example, our RDDs are partitioned with a hash
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737051#comment-14737051
]
Sean Owen commented on SPARK-10493:
---
I think you still have the same issue with zipPartitions, unless
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737050#comment-14737050
]
Sean Owen commented on SPARK-10493:
---
I think you still have the same issue with zipPartitions, unless
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737730#comment-14737730
]
Sean Owen commented on SPARK-10493:
---
checkpoint doesn't materialize the RDD, which is why it occurred
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737727#comment-14737727
]
Glenn Strycker commented on SPARK-10493:
I already have that added in my code that I'm testing...
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737681#comment-14737681
]
Sean Owen commented on SPARK-10493:
---
If the RDD is a result of reduceByKey, I agree that the keys
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737735#comment-14737735
]
Glenn Strycker commented on SPARK-10493:
Of course. I have count statements everywhere in order
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737252#comment-14737252
]
Sean Owen commented on SPARK-10493:
---
What do you mean that it's not collapsing key pairs? The output of
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737296#comment-14737296
]
Glenn Strycker commented on SPARK-10493:
[~srowen], the code I attached did run correctly.
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735626#comment-14735626
]
Glenn Strycker commented on SPARK-10493:
Thanks for the speedy follow-up, [~frosner]!
I'm
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735653#comment-14735653
]
Glenn Strycker commented on SPARK-10493:
Note: this only seems to be occurring "at scale" so far.
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735598#comment-14735598
]
Frank Rosner commented on SPARK-10493:
---
Thanks for submitting the issue, [~glenn.strycker] :)
Can
[
https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735772#comment-14735772
]
Sean Owen commented on SPARK-10493:
---
There are some key pieces of info missing, like what the key and
18 matches