[jira] [Created] (SPARK-7214) Unrolling never evicts blocks when MemoryStore is nearly full
Charles Reiss created SPARK-7214:

Summary: Unrolling never evicts blocks when MemoryStore is nearly full
Key: SPARK-7214
URL: https://issues.apache.org/jira/browse/SPARK-7214
Project: Spark
Issue Type: Bug
Components: Block Manager
Reporter: Charles Reiss
Priority: Minor

When less than spark.storage.unrollMemoryThreshold (1MB by default) is left in the MemoryStore, new blocks that are computed with unrollSafely (e.g. any cached RDD split) will always fail to unroll, even when old blocks could be dropped to accommodate them.
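For illustration, a minimal sketch of a workload that could hit this. It is a sketch only: the sizes and names are invented, and whether the store actually ends up with less than 1MB free depends on the executor's storage memory.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    // Illustrative sketch only: sizes are made up, and reaching the bad
    // state depends on how much storage memory the executor actually has.
    object Spark7214Sketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setMaster("local[1]").setAppName("SPARK-7214-sketch"))

        // Step 1: cache enough blocks that less than 1MB
        // (spark.storage.unrollMemoryThreshold) remains free in the MemoryStore.
        val filler = sc.parallelize(1 to 512, 8)
          .map(_ => new Array[Byte](1024 * 1024)) // ~1MB per element
          .persist(StorageLevel.MEMORY_ONLY)
        filler.count()

        // Step 2: cache one more block. unrollSafely cannot reserve its
        // initial unroll memory, so this block fails to unroll even though
        // `filler` blocks could have been dropped to make room for it.
        val victim = sc.parallelize(1 to 4, 1)
          .map(_ => new Array[Byte](1024 * 1024))
          .persist(StorageLevel.MEMORY_ONLY)
        victim.count()

        sc.stop()
      }
    }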
[jira] [Resolved] (SPARK-4157) Task input statistics incomplete when a task reads from multiple locations
[ https://issues.apache.org/jira/browse/SPARK-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Charles Reiss resolved SPARK-4157.
Resolution: Duplicate

Summary: Task input statistics incomplete when a task reads from multiple locations
Key: SPARK-4157
URL: https://issues.apache.org/jira/browse/SPARK-4157
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 1.1.0
Reporter: Charles Reiss
Priority: Minor

SPARK-1683 introduced tracking of filesystem reads for tasks, but the tracking code assumes that each task reads from exactly one file or cache block, replacing a task's prior InputMetrics object after each read. For example, a task computing a shuffle-less join (one whose input RDDs are pre-partitioned by key) may read blocks of two or more cached dependency RDDs; in that case, the displayed input size reflects only whichever dependency was read last.
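A sketch of the scenario described in the ticket. It is illustrative only: the co-partitioned join below needs no shuffle, so each join task reads one cached block from each parent, and under the behavior described above only the last of those reads would survive in the task's InputMetrics.

    import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

    object Spark4157Sketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setMaster("local[2]").setAppName("SPARK-4157-sketch"))
        val part = new HashPartitioner(4)

        // Two inputs pre-partitioned with the same partitioner, then cached.
        val left = sc.parallelize(1 to 1000).map(k => (k, k.toString))
          .partitionBy(part).cache()
        val right = sc.parallelize(1 to 1000).map(k => (k, k * 2))
          .partitionBy(part).cache()
        left.count(); right.count() // populate the cache

        // Shuffle-less join: each task reads one cached block from `left`
        // and one from `right`, i.e. two reads per task.
        left.join(right).count()

        sc.stop()
      }
    }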
[jira] [Created] (SPARK-4351) Record cacheable RDD reads and display RDD miss rates
Charles Reiss created SPARK-4351:

Summary: Record cacheable RDD reads and display RDD miss rates
Key: SPARK-4351
URL: https://issues.apache.org/jira/browse/SPARK-4351
Project: Spark
Issue Type: Improvement
Reporter: Charles Reiss
Priority: Minor

Currently, when Spark fails to keep an RDD cached, the user gets little visibility into this (beyond the performance effects), especially if they are not reading executor logs. We could expose this information in the Web UI and the event log, as we do for RDD storage information, by reporting RDD reads and their results with task metrics. From this, live computation of RDD miss rates is straightforward, and the information in the event log would enable more complicated post-hoc analyses.
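To make "straightforward" concrete, a hypothetical sketch of the aggregation. None of these names exist in Spark; they stand in for whatever per-task read reporting the proposal would add.

    // Hypothetical per-task record: attempts at reading a cacheable block
    // of an RDD, and how many of those reads were cache hits.
    case class RddReadMetrics(rddId: Int, attempts: Long, hits: Long)

    // Live miss rate per RDD, aggregated from the task-level records.
    def missRates(perTask: Seq[RddReadMetrics]): Map[Int, Double] =
      perTask.groupBy(_.rddId).map { case (rddId, ms) =>
        val attempts = ms.map(_.attempts).sum
        val hits = ms.map(_.hits).sum
        // Miss rate = 1 - hits/attempts; guard against divide-by-zero
        // for RDDs that were never read.
        rddId -> (if (attempts == 0) 0.0 else 1.0 - hits.toDouble / attempts)
      }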
[jira] [Commented] (SPARK-704) ConnectionManager sometimes cannot detect loss of sending connections
[ https://issues.apache.org/jira/browse/SPARK-704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039396#comment-14039396 ]

Charles Reiss commented on SPARK-704:

It's been a while since I reported this issue, so it may have been incidentally fixed. But this problem was with a remote node failing _after_ a message (or several messages) had been successfully sent to that node but before a response was received, so there would be no pending message whose send could fail and expose the dead channel. If there is a corresponding ReceivingConnection, the remote node's death would be detected via a failed read, but I believe the code in ConnectionManager#removeConnection would not reliably trigger the MessageStatuses for the outstanding messages.

Summary: ConnectionManager sometimes cannot detect loss of sending connections
Key: SPARK-704
URL: https://issues.apache.org/jira/browse/SPARK-704
Project: Spark
Issue Type: Bug
Reporter: Charles Reiss
Assignee: Henry Saputra

ConnectionManager currently does not detect when SendingConnections disconnect, except when it is trying to send through them. As a result, a node failure just after a connection is initiated, but before any acknowledgement messages can be sent, may result in a hang. ConnectionManager has code intended to detect this case by noticing the failure of a corresponding ReceivingConnection, but this code assumes that the remote host:port of the ReceivingConnection is the same as the ConnectionManagerId, which is almost never true. Additionally, there does not appear to be any reason to assume a corresponding ReceivingConnection will exist.
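A standalone illustration (plain Java sockets, not Spark code) of why that host:port comparison almost never matches: the connecting side binds an ephemeral local port, so the remote address the receiver observes is not the peer's listening port, which is what a ConnectionManagerId carries.

    import java.net.{InetSocketAddress, ServerSocket, Socket}

    object EphemeralPortDemo {
      def main(args: Array[String]): Unit = {
        // Stands in for the ConnectionManager's listening socket.
        val server = new ServerSocket(0)
        val listeningPort = server.getLocalPort

        // The "remote node" connects; its OS picks an ephemeral source port.
        val client = new Socket("127.0.0.1", listeningPort)
        val accepted = server.accept()
        val remote = accepted.getRemoteSocketAddress.asInstanceOf[InetSocketAddress]

        // remote.getPort is the client's ephemeral port, which will almost
        // never equal the port the client itself listens on (the port a
        // ConnectionManagerId would record).
        println(s"server sees remote port ${remote.getPort}, " +
          "not the client's listening port")

        client.close(); accepted.close(); server.close()
      }
    }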