[jira] [Commented] (SPARK-12196) Store blocks in different speed storage devices by hierarchy way
[ https://issues.apache.org/jira/browse/SPARK-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073386#comment-15073386 ]

wei wu commented on SPARK-12196:

Yes, Hao. The local dir path format ("[SSD]file:///") may not be recognized by the Yarn local dir setting.

Another question: suppose the user has already mounted the devices (in a production cluster) as follows:

    /mnt/c, /mnt/d, /mnt/e/, , /mnt/i

If the user wants to use the new feature in a new Spark version, the user would have to re-mount the disk devices. We think the following configuration may be better:

    spark.local.dir = /mnt/c, /mnt/d, /mnt/e/, , /mnt/i
    spark.storage.hierarchyStore.reserved.quota = SSD 50GB, DISK, SSD 80GB, , DISK

We also suggest the following configuration ideas.

I think we should add a space reserver thread in the block manager to check whether enough space is reserved on each SSD. The reserver addresses the problem of running out of free SSD space when blocks are written concurrently. For example:

    spark.ssd.reserver.interval.ms = 1000

If the SSD capacity is small, the SSD may be used both to cache RDDs and to store shuffle data, so different jobs may compete for the SSD. But the user may want RDD caching to have priority on the SSD. I think we should add a similar configuration flag for enabling SSD storage for shuffle data:

    spark.ssd.shuffle.enabled = false

> Store blocks in different speed storage devices by hierarchy way
>
>
> Key: SPARK-12196
> URL: https://issues.apache.org/jira/browse/SPARK-12196
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Reporter: yucai
>
> *Problem*
> Nowadays, users have both SSDs and HDDs.
> SSDs have great performance, but their capacity is small. HDDs have good capacity, but their performance is x2-x3 lower than that of SSDs.
> How can we get the best of both?
> *Solution*
> Our idea is to build a hierarchy store: use SSDs as cache and HDDs as backup storage.
> When Spark core allocates blocks for an RDD (either shuffle or RDD cache), it gets blocks from SSDs first, and when the SSDs' usable space falls below some threshold, it gets blocks from HDDs.
> In our implementation, we actually go further. We support a way to build a hierarchy store with any number of levels across all storage media (NVM, SSD, HDD, etc.).
> *Performance*
> 1. In the best case, our solution performs the same as all SSDs.
> 2. In the worst case, e.g. when all data is spilled to HDDs, there is no performance regression.
> 3. Compared with all HDDs, the hierarchy store improves performance by more than *_x1.86_* (it could be higher; the CPU was the bottleneck in our test environment).
> 4. Compared with Tachyon, our hierarchy store is still *_x1.3_* faster, because we support both RDD cache and shuffle and need no extra inter-process communication.
> *Usage*
> 1. Set the priority and threshold for each layer in spark.storage.hierarchyStore.
> {code}
> spark.storage.hierarchyStore='nvm 50GB,ssd 80GB'
> {code}
> It builds a 3-layer hierarchy store: the 1st is "nvm", the 2nd is "ssd", and all the rest form the last layer.
> 2. Configure each layer's location: the user just needs to put the keywords specified in step 1, like "nvm" and "ssd", into the local dirs, such as spark.local.dir or yarn.nodemanager.local-dirs.
> {code}
> spark.local.dir=/mnt/nvm1,/mnt/ssd1,/mnt/ssd2,/mnt/ssd3,/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4,/mnt/others
> {code}
> After that, restart your Spark application; it will allocate blocks from nvm first.
> When nvm's usable space is less than 50GB, it starts to allocate from ssd.
> When ssd's usable space is less than 80GB, it starts to allocate from the last layer.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
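The allocation rule in the *Usage* section of the issue description can be illustrated with a minimal Python sketch. This is not the code from the PR: the function names (`parse_hierarchy`, `group_dirs`, `pick_dir`) are hypothetical, matching a directory to a layer by substring is an assumption based on the "put the keyword into local dirs" wording, and checking the threshold per directory (rather than per layer in total) is also an assumption.

```python
import shutil
from typing import Callable, List, Optional, Tuple

# Binary size suffixes; an assumption about how "50GB" is meant to be read.
_UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text: str) -> int:
    """Convert a size such as '50GB' into a number of bytes."""
    text = text.strip().upper()
    for suffix, factor in _UNITS.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)


def parse_hierarchy(spec: str) -> List[Tuple[str, int]]:
    """Parse 'nvm 50GB,ssd 80GB' into [(keyword, threshold_bytes), ...]."""
    layers = []
    for entry in spec.split(","):
        keyword, threshold = entry.split()
        layers.append((keyword.lower(), parse_size(threshold)))
    return layers


def group_dirs(layers: List[Tuple[str, int]], local_dirs: List[str]) -> List[List[str]]:
    """Put each dir into the first layer whose keyword occurs in its path.

    Directories that match no keyword form the extra, last layer.
    """
    groups: List[List[str]] = [[] for _ in layers] + [[]]
    for directory in local_dirs:
        for index, (keyword, _) in enumerate(layers):
            if keyword in directory.lower():
                groups[index].append(directory)
                break
        else:
            groups[-1].append(directory)
    return groups


def default_usable_space(path: str) -> int:
    """Free bytes on the file system that holds `path`."""
    return shutil.disk_usage(path).free


def pick_dir(
    layers: List[Tuple[str, int]],
    groups: List[List[str]],
    usable_space: Callable[[str], int] = default_usable_space,
) -> Optional[str]:
    """Return the first dir, in layer order, whose usable space meets its threshold.

    If every configured layer is below its threshold, fall back to the last
    layer and pick the dir with the most free space (None if it is empty).
    """
    for (_, threshold), dirs in zip(layers, groups):
        for directory in dirs:
            if usable_space(directory) >= threshold:
                return directory
    return max(groups[-1], key=usable_space, default=None)
```

Passing `usable_space` as a parameter keeps the sketch testable without real devices; with the default, it queries the file system each time a directory is chosen.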
[jira] [Commented] (SPARK-12196) Store blocks in different speed storage devices by hierarchy way
[ https://issues.apache.org/jira/browse/SPARK-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15072634#comment-15072634 ]

Cheng Hao commented on SPARK-12196:
---

Thank you, wei wu, for supporting this feature!

However, we're trying to avoid changing the existing configuration format, as that might impact user applications; besides, in Yarn/Mesos, this configuration key will not work anymore.

An updated PR will be submitted soon; you are welcome to join the discussion in the PR.
[jira] [Commented] (SPARK-12196) Store blocks in different speed storage devices by hierarchy way
[ https://issues.apache.org/jira/browse/SPARK-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15072567#comment-15072567 ]

wei wu commented on SPARK-12196:

We have a similar idea about SSD support in the Block Manager and have built a prototype for it. We have also run some performance tests on it. How about we add the following function and API?

We used the benchmark from Databricks: https://github.com/databricks/spark-perf/tree/master/spark-tests, with the following test configuration:
Number of executors: 3; per executor: 4GB memory and 2 cores; data size: 1867MB.

The performance results are:

    Test case           Memory    SSD      HDD
    count               0.259s    3s       6.75s
    count-with-filter   0.56s     3.24s    10s
    aggregate-by-key    2s        4.8s     9s

The prototype uses a configuration similar to the Hadoop data node path configuration:

    spark.local.dir = [DISK]file:///disk0;[SSD]file:///disk1;[DISK]file:///disk2;[SSD]file:///disk3;[DISK]file:///disk4;[DISK]file:///disk5;[DISK]file:///disk6;[DISK]file:///disk7;

or

    spark.local.dir = file:///disk0;[SSD]file:///disk1;file:///disk2;[SSD]file:///disk3;file:///disk4;file:///disk5;file:///disk6;file:///disk7;

or

    spark.local.dir = file:///disk0;file:///disk1;file:///disk2;file:///disk3;file:///disk4;file:///disk5;file:///disk6;file:///disk7;

We add the [SSD] and [DISK] identifiers to the disk paths. [SSD] marks the path as SSD storage; [DISK] marks it as an HDD. If [DISK] is omitted from a path, the path defaults to HDD storage.

We add the related StorageLevel API for SSD:

    StorageLevel.MEMORY_AND_SSD           // cache the block in memory, then SSD
    StorageLevel.SSD_ONLY                 // cache the block only in SSD
    StorageLevel.MEMORY_AND_SSD_AND_DISK  // cache the block in memory, then SSD, then HDD
    StorageLevel.SSD_AND_DISK             // cache the block in SSD, then HDD

For example, the user can use the following API calls to cache block data:

    RDD.persist(StorageLevel.MEMORY_AND_SSD)
    RDD.persist(StorageLevel.SSD_ONLY)
    RDD.persist(StorageLevel.SSD_AND_DISK)
    RDD.persist(StorageLevel.MEMORY_AND_SSD_AND_DISK)
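The tagged spark.local.dir format proposed in this comment could be parsed along the following lines. This is only a sketch of the described rule (optional [SSD] or [DISK] prefix, entries separated by ";", untagged entries treated as HDD); the function name `parse_tagged_dirs` is hypothetical and is not taken from the prototype.

```python
import re
from typing import List, Tuple

# Optional "[TAG]" prefix followed by the directory URI.
_TAGGED_ENTRY = re.compile(r"^(?:\[(?P<tag>[A-Za-z]+)\])?\s*(?P<uri>.+)$")


def parse_tagged_dirs(value: str) -> List[Tuple[str, str]]:
    """Split a ';'-separated local dir list into (storage_type, uri) pairs.

    Entries without a tag default to "DISK", as described in the comment.
    """
    result = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing ';'
        match = _TAGGED_ENTRY.match(entry)
        tag = (match.group("tag") or "DISK").upper()
        if tag not in ("SSD", "DISK"):
            raise ValueError(f"unknown storage tag in entry: {entry!r}")
        result.append((tag, match.group("uri").strip()))
    return result
```

Defaulting to "DISK" keeps an existing, untagged spark.local.dir value valid, which is the compatibility property the third configuration variant relies on.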
[jira] [Commented] (SPARK-12196) Store blocks in different speed storage devices by hierarchy way
[ https://issues.apache.org/jira/browse/SPARK-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073573#comment-15073573 ]

yucai commented on SPARK-12196:
---

Hello Wei, nice to know you :).

Some explanation about our implementation:

1. About the space reserve thread: our previous version did have this kind of daemon, but we finally removed it, because our E2E testing shows almost no overhead in the current implementation. getUseableSpace only reads some file system metadata, which is well cached by the OS most of the time.
Our testing environment is 4 HSW boxes (72 cores, 256GB memory, 10GB NIC, HDDs/SSDs) running a real customer case, NWeight, which computes associations between two vertices that are n hops away (e.g., friend-to-friend or video-to-video relationships for recommendation).

2. Our implementation does support shuffle data as well.
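One rough way to check the "almost no overhead" observation on a given machine is to time the usable-space query directly. The sketch below uses Python's `shutil.disk_usage` as a stand-in for the usable-space call discussed above; the helper name `mean_usable_space_query_seconds` is hypothetical, and the measured value depends on the OS and file system.

```python
import shutil
import tempfile
import time


def mean_usable_space_query_seconds(path: str, repeats: int = 1000) -> float:
    """Average wall-clock time of one usable-space query on `path`."""
    start = time.perf_counter()
    for _ in range(repeats):
        # Reads file system metadata only; no block data is touched.
        shutil.disk_usage(path).free
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    cost = mean_usable_space_query_seconds(tempfile.gettempdir())
    print(f"mean query time: {cost * 1e6:.1f} microseconds")
```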
[jira] [Commented] (SPARK-12196) Store blocks in different speed storage devices by hierarchy way
[ https://issues.apache.org/jira/browse/SPARK-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15071354#comment-15071354 ]

Zhang, Liye commented on SPARK-12196:
-

I am out of office with limited email access from 12/21/2015 to 12/25/2015. Sorry for the slow email response. For any emergency, contact my manager (Cheng, Hao hao.ch...@intel.com).

Thanks