[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer updated MAPREDUCE-5605:
    Labels: BB2015-05-TBR  (was: )

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> 64-bit jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>              Labels: BB2015-05-TBR
>             Fix For: 1.0.1
>
>         Attachments: MAPREDUCE-5605-v1.patch, TR-mammoth-HUST.pdf,
> hadoop-core-1.0.1-mammoth-0.9.0.jar
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. The benchmark results show that it can get impressive improvement
> in typical cases. When a system is relatively short of memory (e.g., HPC,
> small- and medium-size enterprises), the improvement will be even more
> impressive.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment: TR-mammoth-HUST.pdf

The design, evaluation and discoveries are included in the paper.

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> 64-bit jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>             Fix For: 1.0.1
>
>         Attachments: MAPREDUCE-5605-v1.patch, TR-mammoth-HUST.pdf,
> hadoop-core-1.0.1-mammoth-0.9.0.jar
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Tags: memory-centric multi-thread optimization task  (was: memory-centric muluti-thread optimization task)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> 64-bit jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>             Fix For: 1.0.1
>
>         Attachments: MAPREDUCE-5605-v1.patch,
> hadoop-core-1.0.1-mammoth-0.9.0.jar
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment: hadoop-core-1.0.1-mammoth-0.9.0.jar

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> 64-bit jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>             Fix For: 1.0.1
>
>         Attachments: MAPREDUCE-5605-v1.patch,
> hadoop-core-1.0.1-mammoth-0.9.0.jar
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Fix Version/s: 1.0.1

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> 64-bit jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>             Fix For: 1.0.1
>
>         Attachments: MAPREDUCE-5605-v1.patch
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Environment:
x86-64 Linux/Unix
64-bit jdk7 preferred

  was:
x86-64 Linux/Unix
jdk7 preferred

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> 64-bit jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: MAPREDUCE-5605-v1.patch
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Description:
Memory is a very important resource to bridge the gap between CPUs and I/O devices. So the idea is to maximize the usage of memory to solve the problem of I/O bottleneck. We developed a multi-threaded task execution engine, which runs in a single JVM on a node. In the execution engine, we have implemented the algorithm of memory scheduling to realize global memory management, based on which we further developed the techniques such as sequential disk accessing, multi-cache and solved the problem of full garbage collection in the JVM. The benchmark results show that it can get impressive improvement in typical cases. When a system is relatively short of memory (e.g., HPC, small- and medium-size enterprises), the improvement will be even more impressive.

  was:
Memory is a very important resource to bridge the gap between CPUs and I/O devices. So the idea is to maximize the usage of memory to solve the problem of I/O bottleneck. We developed a multi-threaded task execution engine, which runs in a single JVM on a node. In the execution engine, we have implemented the algorithm of memory scheduling to realize global memory management, based on which we further developed the techniques such as sequential disk accessing, multi-cache and solved the problem of full garbage collection in the JVM. We have conducted extensive experiments with comparison against the native Hadoop platform. The results show that the Mammoth system can reduce the job execution time by more than 40% in typical cases, without requiring any modifications of the Hadoop programs. When a system is short of memory, Mammoth can improve the performance by up to 4 times, as observed for I/O intensive applications, such as PageRank.

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: MAPREDUCE-5605-v1.patch
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment: MAPREDUCE-5605-v1.patch

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: MAPREDUCE-5605-v1.patch
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment: release-1.0.1.patch

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment:     (was: release-1.0.1.patch)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment:     (was: RunningJob.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment:     (was: Task.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment:     (was: SequenceFileOutputFormat.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment:     (was: SpillScheduler.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment:     (was: RoundQueue.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment:     (was: ReinitTrackerAction.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment:     (was: OutputFormat.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: ReduceTask.java, ReduceTaskRunner.java,
> ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java,
> RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment:     (was: ReduceTaskStatus.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: ReinitTrackerAction.java, RoundQueue.java,
> RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
    Attachment:     (was: RecordReader.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: ReduceTask.java, ReduceTaskRunner.java,
> ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java,
> RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: Partitioner.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReduceTask.java, ReduceTaskRunner.java, > ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: RamManager.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReduceTask.java, ReduceTaskRunner.java, > ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: RawKeyValueIterator.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReduceTask.java, ReduceTaskRunner.java, > ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: RawHistoryFileServlet.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReduceTask.java, ReduceTaskRunner.java, > ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: TaskInProgress.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReduceTask.java, ReduceTaskRunner.java, > ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: OutputLogFilter.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReduceTask.java, ReduceTaskRunner.java, > ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: RawBufferedOutputStream.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReduceTask.java, ReduceTaskRunner.java, > ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: ReduceRamManager.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReduceTask.java, ReduceTaskRunner.java, > ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: OutputCommitter.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReduceTask.java, ReduceTaskRunner.java, > ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: TaskLog.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReduceTask.java, ReduceTaskRunner.java, > ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: ReduceTaskRunner.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: ReduceTask.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: OutputCollector.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReduceTask.java, ReduceTaskRunner.java, > ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: TaskLogAppender.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: ReduceTask.java, ReduceTaskRunner.java, > ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java, > RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: TaskRunner.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: OutputCollector.java, OutputCommitter.java, > OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java, > RawBufferedOutputStream.java, RawHistoryFileServlet.java, > RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java, > ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java, > ReinitTrackerAction.java, RoundQueue.java, RunningJob.java, > SequenceFileOutputFormat.java, SpillScheduler.java, Task.java, > TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: MergeSorter.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: OutputCollector.java, OutputCommitter.java, > OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java, > RawBufferedOutputStream.java, RawHistoryFileServlet.java, > RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java, > ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java, > ReinitTrackerAction.java, RoundQueue.java, RunningJob.java, > SequenceFileOutputFormat.java, SpillScheduler.java, Task.java, > TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: TaskTracker.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: OutputCollector.java, OutputCommitter.java, > OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java, > RawBufferedOutputStream.java, RawHistoryFileServlet.java, > RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java, > ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java, > ReinitTrackerAction.java, RoundQueue.java, RunningJob.java, > SequenceFileOutputFormat.java, SpillScheduler.java, Task.java, > TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: TaskTrackerInstrumentation.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: OutputCollector.java, OutputCommitter.java,
> OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java,
> RawBufferedOutputStream.java, RawHistoryFileServlet.java,
> RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java,
> ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java,
> ReinitTrackerAction.java, RoundQueue.java, RunningJob.java,
> SequenceFileOutputFormat.java, SpillScheduler.java, Task.java,
> TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: TaskStatus.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: OutputCollector.java, OutputCommitter.java,
> OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java,
> RawBufferedOutputStream.java, RawHistoryFileServlet.java,
> RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java,
> ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java,
> ReinitTrackerAction.java, RoundQueue.java, RunningJob.java,
> SequenceFileOutputFormat.java, SpillScheduler.java, Task.java,
> TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: Operation.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: OutputCollector.java, OutputCommitter.java,
> OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java,
> RawBufferedOutputStream.java, RawHistoryFileServlet.java,
> RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java,
> ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java,
> ReinitTrackerAction.java, RoundQueue.java, RunningJob.java,
> SequenceFileOutputFormat.java, SpillScheduler.java, Task.java,
> TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: TaskLogServlet.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: OutputCollector.java, OutputCommitter.java,
> OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java,
> RawBufferedOutputStream.java, RawHistoryFileServlet.java,
> RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java,
> ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java,
> ReinitTrackerAction.java, RoundQueue.java, RunningJob.java,
> SequenceFileOutputFormat.java, SpillScheduler.java, Task.java,
> TaskInProgress.java, TaskLog.java, TaskLogAppender.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: TaskReport.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: OutputCollector.java, OutputCommitter.java,
> OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java,
> RawBufferedOutputStream.java, RawHistoryFileServlet.java,
> RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java,
> ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java,
> ReinitTrackerAction.java, RoundQueue.java, RunningJob.java,
> SequenceFileOutputFormat.java, SpillScheduler.java, Task.java,
> TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: TaskLogsTruncater.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: OutputCollector.java, OutputCommitter.java,
> OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java,
> RawBufferedOutputStream.java, RawHistoryFileServlet.java,
> RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java,
> ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java,
> ReinitTrackerAction.java, RoundQueue.java, RunningJob.java,
> SequenceFileOutputFormat.java, SpillScheduler.java, Task.java,
> TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: TaskScheduler.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: OutputCollector.java, OutputCommitter.java,
> OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java,
> RawBufferedOutputStream.java, RawHistoryFileServlet.java,
> RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java,
> ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java,
> ReinitTrackerAction.java, RoundQueue.java, RunningJob.java,
> SequenceFileOutputFormat.java, SpillScheduler.java, Task.java,
> TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: TaskMemoryManagerThread.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: OutputCollector.java, OutputCommitter.java,
> OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java,
> RawBufferedOutputStream.java, RawHistoryFileServlet.java,
> RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java,
> ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java,
> ReinitTrackerAction.java, RoundQueue.java, RunningJob.java,
> SequenceFileOutputFormat.java, SpillScheduler.java, Task.java,
> TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: TaskTrackerAction.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: OutputCollector.java, OutputCommitter.java,
> OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java,
> RawBufferedOutputStream.java, RawHistoryFileServlet.java,
> RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java,
> ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java,
> ReinitTrackerAction.java, RoundQueue.java, RunningJob.java,
> SequenceFileOutputFormat.java, SpillScheduler.java, Task.java,
> TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: Merger.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: OutputCollector.java, OutputCommitter.java,
> OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java,
> RawBufferedOutputStream.java, RawHistoryFileServlet.java,
> RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java,
> ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java,
> ReinitTrackerAction.java, RoundQueue.java, RunningJob.java,
> SequenceFileOutputFormat.java, SpillScheduler.java, Task.java,
> TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: MapTaskRunner.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: MapTaskStatus.java, MemoryElement.java,
> MergeSorter.java, Merger.java, Operation.java, OutputCollector.java,
> OutputCommitter.java, OutputFormat.java, OutputLogFilter.java,
> Partitioner.java, RamManager.java, RawBufferedOutputStream.java,
> RawHistoryFileServlet.java, RawKeyValueIterator.java, RecordReader.java,
> ReduceRamManager.java, ReduceTask.java, ReduceTaskRunner.java,
> ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java,
> RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java,
> Task.java, TaskInProgress.java, TaskLog.java, TaskLogAppender.java,
> TaskLogServlet.java, TaskLogsTruncater.java, TaskMemoryManagerThread.java,
> TaskReport.java, TaskRunner.java, TaskScheduler.java, TaskStatus.java,
> TaskTracker.java, TaskTrackerAction.java, TaskTrackerInstrumentation.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: JvmTask.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: MapTaskStatus.java, MemoryElement.java,
> MergeSorter.java, Merger.java, Operation.java, OutputCollector.java,
> OutputCommitter.java, OutputFormat.java, OutputLogFilter.java,
> Partitioner.java, RamManager.java, RawBufferedOutputStream.java,
> RawHistoryFileServlet.java, RawKeyValueIterator.java, RecordReader.java,
> ReduceRamManager.java, ReduceTask.java, ReduceTaskRunner.java,
> ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java,
> RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java,
> Task.java, TaskInProgress.java, TaskLog.java, TaskLogAppender.java,
> TaskLogServlet.java, TaskLogsTruncater.java, TaskMemoryManagerThread.java,
> TaskReport.java, TaskRunner.java, TaskScheduler.java, TaskStatus.java,
> TaskTracker.java, TaskTrackerAction.java, TaskTrackerInstrumentation.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: MapTask.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: MapTaskStatus.java, MemoryElement.java,
> MergeSorter.java, Merger.java, Operation.java, OutputCollector.java,
> OutputCommitter.java, OutputFormat.java, OutputLogFilter.java,
> Partitioner.java, RamManager.java, RawBufferedOutputStream.java,
> RawHistoryFileServlet.java, RawKeyValueIterator.java, RecordReader.java,
> ReduceRamManager.java, ReduceTask.java, ReduceTaskRunner.java,
> ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java,
> RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java,
> Task.java, TaskInProgress.java, TaskLog.java, TaskLogAppender.java,
> TaskLogServlet.java, TaskLogsTruncater.java, TaskMemoryManagerThread.java,
> TaskReport.java, TaskRunner.java, TaskScheduler.java, TaskStatus.java,
> TaskTracker.java, TaskTrackerAction.java, TaskTrackerInstrumentation.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: TaskTrackerStatus.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: MapTaskStatus.java, MemoryElement.java,
> MergeSorter.java, Merger.java, Operation.java, OutputCollector.java,
> OutputCommitter.java, OutputFormat.java, OutputLogFilter.java,
> Partitioner.java, RamManager.java, RawBufferedOutputStream.java,
> RawHistoryFileServlet.java, RawKeyValueIterator.java, RecordReader.java,
> ReduceRamManager.java, ReduceTask.java, ReduceTaskRunner.java,
> ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java,
> RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java,
> Task.java, TaskInProgress.java, TaskLog.java, TaskLogAppender.java,
> TaskLogServlet.java, TaskLogsTruncater.java, TaskMemoryManagerThread.java,
> TaskReport.java, TaskRunner.java, TaskScheduler.java, TaskStatus.java,
> TaskTracker.java, TaskTrackerAction.java, TaskTrackerInstrumentation.java
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed the techniques such as sequential disk
> accessing, multi-cache and solved the problem of full garbage collection in
> the JVM. We have conducted extensive experiments with comparison against the
> native Hadoop platform. The results show that the Mammoth system can reduce
> the job execution time by more than 40% in typical cases, without requiring
> any modifications of the Hadoop programs. When a system is short of memory,
> Mammoth can improve the performance by up to 4 times, as observed for I/O
> intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Chen updated MAPREDUCE-5605:
Attachment: (was: MemoryElement.java)
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Chen updated MAPREDUCE-5605:
Attachment: (was: MapRunner.java)
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Chen updated MAPREDUCE-5605:
Attachment: (was: TextOutputFormat.java)
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Chen updated MAPREDUCE-5605:
Attachment: (was: MapTaskStatus.java)
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Chen updated MAPREDUCE-5605:
Attachment: (was: MapRamManager.java)
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Chen updated MAPREDUCE-5605:
Attachment: (was: JvmManager.java)
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Chen updated MAPREDUCE-5605:
Attachment: (was: MapTaskCompletionEventsUpdate.java)
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Chen updated MAPREDUCE-5605:
Attachment: (was: MapOutputFile.java)
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Chen updated MAPREDUCE-5605:
Attachment: TextOutputFormat.java TaskTrackerStatus.java TaskTrackerInstrumentation.java
TaskTrackerAction.java TaskTracker.java TaskStatus.java TaskScheduler.java
TaskRunner.java TaskReport.java TaskMemoryManagerThread.java TaskLogsTruncater.java
TaskLogServlet.java TaskLogAppender.java TaskLog.java TaskInProgress.java Task.java
SpillScheduler.java SequenceFileOutputFormat.java RunningJob.java RoundQueue.java
ReinitTrackerAction.java ReduceTaskStatus.java ReduceTaskRunner.java ReduceTask.java
ReduceRamManager.java RecordReader.java RawKeyValueIterator.java
RawHistoryFileServlet.java RawBufferedOutputStream.java RamManager.java
Partitioner.java OutputLogFilter.java OutputFormat.java OutputCommitter.java
OutputCollector.java Operation.java MergeSorter.java Merger.java MemoryElement.java
MapTaskStatus.java MapTaskRunner.java MapTaskCompletionEventsUpdate.java MapTask.java
MapRunner.java MapRamManager.java MapOutputFile.java JvmTask.java JvmManager.java
JVMId.java JobTaskRunner.java JobConf.java IFile.java DefaultJvmMemoryManager.java
ChildRamManager.java Child.java CachePool.java CacheOutputStream.java CacheFile.java
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Chen updated MAPREDUCE-5605:
Attachment: (was: ChildRamManager.java)
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Chen updated MAPREDUCE-5605:
Attachment: (was: IFile.java)
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Chen updated MAPREDUCE-5605:
Attachment: (was: DefaultJvmMemoryManager.java)
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: Child.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: JobTaskRunner.java, JvmManager.java, JvmTask.java, > MapOutputFile.java, MapRamManager.java, MapRunner.java, MapTask.java, > MapTaskCompletionEventsUpdate.java, MapTaskRunner.java, MapTaskStatus.java, > MemoryElement.java, MergeSorter.java, Merger.java, Operation.java, > OutputCollector.java, OutputCommitter.java, OutputFormat.java, > OutputLogFilter.java, Partitioner.java, RamManager.java, > RawBufferedOutputStream.java, RawHistoryFileServlet.java, > RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java, > ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java, > ReinitTrackerAction.java, RoundQueue.java, RunningJob.java, > SequenceFileOutputFormat.java, SpillScheduler.java, Task.java, > TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java, > TaskLogsTruncater.java, TaskMemoryManagerThread.java, TaskReport.java, > TaskRunner.java, TaskScheduler.java, TaskStatus.java, TaskTracker.java, > TaskTrackerAction.java, TaskTrackerInstrumentation.java, > TaskTrackerStatus.java, TextOutputFormat.java > > > Memory is a very important resource to bridge the gap between CPUs and I/O > devices. So the idea is to maximize the usage of memory to solve the problem > of I/O bottleneck. We developed a multi-threaded task execution engine, which > runs in a single JVM on a node. 
In the execution engine, we have implemented > the algorithm of memory scheduling to realize global memory management, based > on which we further developed the techniques such as sequential disk > accessing, multi-cache and solved the problem of full garbage collection in > the JVM. We have conducted extensive experiments with comparison against the > native Hadoop platform. The results show that the Mammoth system can reduce > the job execution time by more than 40% in typical cases, without requiring > any modifications of the Hadoop programs. When a system is short of memory, > Mammoth can improve the performance by up to 4 times, as observed for I/O > intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: CacheFile.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: JobTaskRunner.java, JvmManager.java, JvmTask.java, > MapOutputFile.java, MapRamManager.java, MapRunner.java, MapTask.java, > MapTaskCompletionEventsUpdate.java, MapTaskRunner.java, MapTaskStatus.java, > MemoryElement.java, MergeSorter.java, Merger.java, Operation.java, > OutputCollector.java, OutputCommitter.java, OutputFormat.java, > OutputLogFilter.java, Partitioner.java, RamManager.java, > RawBufferedOutputStream.java, RawHistoryFileServlet.java, > RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java, > ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java, > ReinitTrackerAction.java, RoundQueue.java, RunningJob.java, > SequenceFileOutputFormat.java, SpillScheduler.java, Task.java, > TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java, > TaskLogsTruncater.java, TaskMemoryManagerThread.java, TaskReport.java, > TaskRunner.java, TaskScheduler.java, TaskStatus.java, TaskTracker.java, > TaskTrackerAction.java, TaskTrackerInstrumentation.java, > TaskTrackerStatus.java, TextOutputFormat.java > > > Memory is a very important resource to bridge the gap between CPUs and I/O > devices. So the idea is to maximize the usage of memory to solve the problem > of I/O bottleneck. We developed a multi-threaded task execution engine, which > runs in a single JVM on a node. 
In the execution engine, we have implemented > the algorithm of memory scheduling to realize global memory management, based > on which we further developed the techniques such as sequential disk > accessing, multi-cache and solved the problem of full garbage collection in > the JVM. We have conducted extensive experiments with comparison against the > native Hadoop platform. The results show that the Mammoth system can reduce > the job execution time by more than 40% in typical cases, without requiring > any modifications of the Hadoop programs. When a system is short of memory, > Mammoth can improve the performance by up to 4 times, as observed for I/O > intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: CacheOutputStream.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: JobTaskRunner.java, JvmManager.java, JvmTask.java, > MapOutputFile.java, MapRamManager.java, MapRunner.java, MapTask.java, > MapTaskCompletionEventsUpdate.java, MapTaskRunner.java, MapTaskStatus.java, > MemoryElement.java, MergeSorter.java, Merger.java, Operation.java, > OutputCollector.java, OutputCommitter.java, OutputFormat.java, > OutputLogFilter.java, Partitioner.java, RamManager.java, > RawBufferedOutputStream.java, RawHistoryFileServlet.java, > RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java, > ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java, > ReinitTrackerAction.java, RoundQueue.java, RunningJob.java, > SequenceFileOutputFormat.java, SpillScheduler.java, Task.java, > TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java, > TaskLogsTruncater.java, TaskMemoryManagerThread.java, TaskReport.java, > TaskRunner.java, TaskScheduler.java, TaskStatus.java, TaskTracker.java, > TaskTrackerAction.java, TaskTrackerInstrumentation.java, > TaskTrackerStatus.java, TextOutputFormat.java > > > Memory is a very important resource to bridge the gap between CPUs and I/O > devices. So the idea is to maximize the usage of memory to solve the problem > of I/O bottleneck. We developed a multi-threaded task execution engine, which > runs in a single JVM on a node. 
In the execution engine, we have implemented > the algorithm of memory scheduling to realize global memory management, based > on which we further developed the techniques such as sequential disk > accessing, multi-cache and solved the problem of full garbage collection in > the JVM. We have conducted extensive experiments with comparison against the > native Hadoop platform. The results show that the Mammoth system can reduce > the job execution time by more than 40% in typical cases, without requiring > any modifications of the Hadoop programs. When a system is short of memory, > Mammoth can improve the performance by up to 4 times, as observed for I/O > intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: JobTaskRunner.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: JvmManager.java, JvmTask.java, MapOutputFile.java, > MapRamManager.java, MapRunner.java, MapTask.java, > MapTaskCompletionEventsUpdate.java, MapTaskRunner.java, MapTaskStatus.java, > MemoryElement.java, MergeSorter.java, Merger.java, Operation.java, > OutputCollector.java, OutputCommitter.java, OutputFormat.java, > OutputLogFilter.java, Partitioner.java, RamManager.java, > RawBufferedOutputStream.java, RawHistoryFileServlet.java, > RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java, > ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java, > ReinitTrackerAction.java, RoundQueue.java, RunningJob.java, > SequenceFileOutputFormat.java, SpillScheduler.java, Task.java, > TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java, > TaskLogsTruncater.java, TaskMemoryManagerThread.java, TaskReport.java, > TaskRunner.java, TaskScheduler.java, TaskStatus.java, TaskTracker.java, > TaskTrackerAction.java, TaskTrackerInstrumentation.java, > TaskTrackerStatus.java, TextOutputFormat.java > > > Memory is a very important resource to bridge the gap between CPUs and I/O > devices. So the idea is to maximize the usage of memory to solve the problem > of I/O bottleneck. We developed a multi-threaded task execution engine, which > runs in a single JVM on a node. 
In the execution engine, we have implemented > the algorithm of memory scheduling to realize global memory management, based > on which we further developed the techniques such as sequential disk > accessing, multi-cache and solved the problem of full garbage collection in > the JVM. We have conducted extensive experiments with comparison against the > native Hadoop platform. The results show that the Mammoth system can reduce > the job execution time by more than 40% in typical cases, without requiring > any modifications of the Hadoop programs. When a system is short of memory, > Mammoth can improve the performance by up to 4 times, as observed for I/O > intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: CachePool.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: JobTaskRunner.java, JvmManager.java, JvmTask.java, > MapOutputFile.java, MapRamManager.java, MapRunner.java, MapTask.java, > MapTaskCompletionEventsUpdate.java, MapTaskRunner.java, MapTaskStatus.java, > MemoryElement.java, MergeSorter.java, Merger.java, Operation.java, > OutputCollector.java, OutputCommitter.java, OutputFormat.java, > OutputLogFilter.java, Partitioner.java, RamManager.java, > RawBufferedOutputStream.java, RawHistoryFileServlet.java, > RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java, > ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java, > ReinitTrackerAction.java, RoundQueue.java, RunningJob.java, > SequenceFileOutputFormat.java, SpillScheduler.java, Task.java, > TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java, > TaskLogsTruncater.java, TaskMemoryManagerThread.java, TaskReport.java, > TaskRunner.java, TaskScheduler.java, TaskStatus.java, TaskTracker.java, > TaskTrackerAction.java, TaskTrackerInstrumentation.java, > TaskTrackerStatus.java, TextOutputFormat.java > > > Memory is a very important resource to bridge the gap between CPUs and I/O > devices. So the idea is to maximize the usage of memory to solve the problem > of I/O bottleneck. We developed a multi-threaded task execution engine, which > runs in a single JVM on a node. 
In the execution engine, we have implemented > the algorithm of memory scheduling to realize global memory management, based > on which we further developed the techniques such as sequential disk > accessing, multi-cache and solved the problem of full garbage collection in > the JVM. We have conducted extensive experiments with comparison against the > native Hadoop platform. The results show that the Mammoth system can reduce > the job execution time by more than 40% in typical cases, without requiring > any modifications of the Hadoop programs. When a system is short of memory, > Mammoth can improve the performance by up to 4 times, as observed for I/O > intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: JobConf.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: JobTaskRunner.java, JvmManager.java, JvmTask.java, > MapOutputFile.java, MapRamManager.java, MapRunner.java, MapTask.java, > MapTaskCompletionEventsUpdate.java, MapTaskRunner.java, MapTaskStatus.java, > MemoryElement.java, MergeSorter.java, Merger.java, Operation.java, > OutputCollector.java, OutputCommitter.java, OutputFormat.java, > OutputLogFilter.java, Partitioner.java, RamManager.java, > RawBufferedOutputStream.java, RawHistoryFileServlet.java, > RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java, > ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java, > ReinitTrackerAction.java, RoundQueue.java, RunningJob.java, > SequenceFileOutputFormat.java, SpillScheduler.java, Task.java, > TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java, > TaskLogsTruncater.java, TaskMemoryManagerThread.java, TaskReport.java, > TaskRunner.java, TaskScheduler.java, TaskStatus.java, TaskTracker.java, > TaskTrackerAction.java, TaskTrackerInstrumentation.java, > TaskTrackerStatus.java, TextOutputFormat.java > > > Memory is a very important resource to bridge the gap between CPUs and I/O > devices. So the idea is to maximize the usage of memory to solve the problem > of I/O bottleneck. We developed a multi-threaded task execution engine, which > runs in a single JVM on a node. 
In the execution engine, we have implemented > the algorithm of memory scheduling to realize global memory management, based > on which we further developed the techniques such as sequential disk > accessing, multi-cache and solved the problem of full garbage collection in > the JVM. We have conducted extensive experiments with comparison against the > native Hadoop platform. The results show that the Mammoth system can reduce > the job execution time by more than 40% in typical cases, without requiring > any modifications of the Hadoop programs. When a system is short of memory, > Mammoth can improve the performance by up to 4 times, as observed for I/O > intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Attachment: (was: JVMId.java) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > Attachments: JobTaskRunner.java, JvmManager.java, JvmTask.java, > MapOutputFile.java, MapRamManager.java, MapRunner.java, MapTask.java, > MapTaskCompletionEventsUpdate.java, MapTaskRunner.java, MapTaskStatus.java, > MemoryElement.java, MergeSorter.java, Merger.java, Operation.java, > OutputCollector.java, OutputCommitter.java, OutputFormat.java, > OutputLogFilter.java, Partitioner.java, RamManager.java, > RawBufferedOutputStream.java, RawHistoryFileServlet.java, > RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java, > ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java, > ReinitTrackerAction.java, RoundQueue.java, RunningJob.java, > SequenceFileOutputFormat.java, SpillScheduler.java, Task.java, > TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java, > TaskLogsTruncater.java, TaskMemoryManagerThread.java, TaskReport.java, > TaskRunner.java, TaskScheduler.java, TaskStatus.java, TaskTracker.java, > TaskTrackerAction.java, TaskTrackerInstrumentation.java, > TaskTrackerStatus.java, TextOutputFormat.java > > > Memory is a very important resource to bridge the gap between CPUs and I/O > devices. So the idea is to maximize the usage of memory to solve the problem > of I/O bottleneck. We developed a multi-threaded task execution engine, which > runs in a single JVM on a node. 
In the execution engine, we have implemented > the algorithm of memory scheduling to realize global memory management, based > on which we further developed the techniques such as sequential disk > accessing, multi-cache and solved the problem of full garbage collection in > the JVM. We have conducted extensive experiments with comparison against the > native Hadoop platform. The results show that the Mammoth system can reduce > the job execution time by more than 40% in typical cases, without requiring > any modifications of the Hadoop programs. When a system is short of memory, > Mammoth can improve the performance by up to 4 times, as observed for I/O > intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Status: Patch Available (was: In Progress) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > > Memory is a very important resource to bridge the gap between CPUs and I/O > devices. So the idea is to maximize the usage of memory to solve the problem > of I/O bottleneck. We developed a multi-threaded task execution engine, which > runs in a single JVM on a node. In the execution engine, we have implemented > the algorithm of memory scheduling to realize global memory management, based > on which we further developed the techniques such as sequential disk > accessing, multi-cache and solved the problem of full garbage collection in > the JVM. We have conducted extensive experiments with comparison against the > native Hadoop platform. The results show that the Mammoth system can reduce > the job execution time by more than 40% in typical cases, without requiring > any modifications of the Hadoop programs. When a system is short of memory, > Mammoth can improve the performance by up to 4 times, as observed for I/O > intensive applications, such as PageRank.
[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
[ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Chen updated MAPREDUCE-5605: - Description: Memory is a very important resource to bridge the gap between CPUs and I/O devices. So the idea is to maximize the usage of memory to solve the problem of I/O bottleneck. We developed a multi-threaded task execution engine, which runs in a single JVM on a node. In the execution engine, we have implemented the algorithm of memory scheduling to realize global memory management, based on which we further developed the techniques such as sequential disk accessing, multi-cache and solved the problem of full garbage collection in the JVM. We have conducted extensive experiments with comparison against the native Hadoop platform. The results show that the Mammoth system can reduce the job execution time by more than 40% in typical cases, without requiring any modifications of the Hadoop programs. When a system is short of memory, Mammoth can improve the performance by up to 4 times, as observed for I/O intensive applications, such as PageRank. (was: Memory is a very important resource to bridge the gap between CPUs and I/O devices. So the idea is to maximize the usage of memory to solve the problem of I/O bottleneck. We developed a multi-threaded task execution engine, which runs in a single JVM on a node. In the execution engine, we have implemented the algorithm of memory scheduling to realize global memory management, based on which we further developed the techniques such as sequential disk accessing, multi-cache and solved the problem of full garbage collection in the JVM. We have conducted extensive experiments with comparison against the native Hadoop platform. The results show that the Mammoth system can reduce the job execution time by more than 40% in typical cases, without requiring any modifications of the Hadoop programs. 
When a system is short of memory, Mammoth can improve the performance by up to 4 times, as observed for I/O intensive applications, such as PageRank. ) > Memory-centric MapReduce aiming to solve the I/O bottleneck > --- > > Key: MAPREDUCE-5605 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.1 > Environment: x86-64 Linux/Unix > jdk7 preferred >Reporter: Ming Chen >Assignee: Ming Chen > > Memory is a very important resource to bridge the gap between CPUs and I/O > devices. So the idea is to maximize the usage of memory to solve the problem > of I/O bottleneck. We developed a multi-threaded task execution engine, which > runs in a single JVM on a node. In the execution engine, we have implemented > the algorithm of memory scheduling to realize global memory management, based > on which we further developed the techniques such as sequential disk > accessing, multi-cache and solved the problem of full garbage collection in > the JVM. We have conducted extensive experiments with comparison against the > native Hadoop platform. The results show that the Mammoth system can reduce > the job execution time by more than 40% in typical cases, without requiring > any modifications of the Hadoop programs. When a system is short of memory, > Mammoth can improve the performance by up to 4 times, as observed for I/O > intensive applications, such as PageRank.