[jira] [Commented] (IGNITE-12096) Ignite memory metrics incorrect on cache usage contraction

2020-01-14 Thread Colin Cassidy (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015067#comment-17015067
 ] 

Colin Cassidy commented on IGNITE-12096:


 

I have been looking at the Ignite code - in particular, AbstractFreeList and
PagesList - and debugging the occurrence of this error. I believe that I have
some understanding of the problem and also a simple potential fix, but I would
like to verify this.


The problem occurs because fillFactor is calculated in DataRegionMetricsImpl as
fillFactor = (totalAllocated - freeSpace) / totalAllocated. When entries are
removed from the cache in the supplied example, the value of totalAllocated is
not reduced. In Ignite 2.6, freeSpace increased to compensate, but from 2.7
this does not happen.
The reason is that the freeSpace calculation appears to deliberately exclude
the REUSE_BUCKET, which is the final entry (index 255) in the bucket list.
[https://github.com/gridgain/gridgain/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/persistence/freelist/AbstractFreeList.java#L394]


When the test is run in Ignite 2.6, cache entry removal results in a large 
increment to bucket 251. This brings fillFactor close to zero. From Ignite 2.7, 
it is the REUSE_BUCKET that is incremented - but this does not contribute to 
freeSpace.


The difference in behaviour appears to be caused by the following change, which
seems to mark pages for recycling:

[https://github.com/gridgain/gridgain/blame/master/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/persistence/freelist/AbstractFreeList.java#L641]
[https://github.com/gridgain/gridgain/commit/47da5df328a18d0d55ba534b1af541b72df76901]


My proposed fix is to change AbstractFreeList:394 to include the REUSE_BUCKET 
in the freeSpace calculation i.e.


_for (int b = BUCKETS - 1; b > 0; b--) {_

instead of

_for (int b = BUCKETS - 2; b > 0; b--) {_


I can confirm that this fixes the metrics reporting for the test - but would 
like to understand the reasoning behind excluding the REUSE_BUCKET in the first 
place. Is there some reason why pages that are marked for recycling cannot be 
included in the free list? If so, is there a way we can avoid these pages being 
left in an apparently permanent limbo?
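
To make the effect of the loop bound concrete, here is a toy model of the bucket accounting. It is a sketch under stated assumptions, not the Ignite source: the class FreeSpaceModel, the includeReuseBucket flag, the 4 KB page size and the "b << SHIFT bytes free per page" rule are all illustrative, and the fill factor is taken to be the fraction (totalAllocated - freeSpace) / totalAllocated.

```java
/** Toy model of bucket-based free-space accounting; NOT the Ignite implementation. */
public class FreeSpaceModel {
    static final int BUCKETS = 256;
    /** The reuse bucket is the final entry (index 255) in the bucket list. */
    static final int REUSE_BUCKET = BUCKETS - 1;
    /** Assumed page size in bytes. */
    static final int PAGE_SIZE = 4096;
    /** Assumption: a page in bucket b has roughly (b << SHIFT) free bytes. */
    static final int SHIFT = 4;

    /** Number of pages currently tracked in each bucket. */
    final long[] bucketSizes = new long[BUCKETS];

    /** Sums free bytes over the buckets, optionally counting the reuse bucket. */
    long freeSpace(boolean includeReuseBucket) {
        int first = includeReuseBucket ? BUCKETS - 1 : BUCKETS - 2;
        long free = 0;
        for (int b = first; b > 0; b--)
            free += ((long) b << SHIFT) * bucketSizes[b];
        return free;
    }

    /** Fraction of allocated bytes that is not reported as free. */
    static double fillFactor(long totalAllocated, long freeSpace) {
        return totalAllocated == 0 ? 0.0 : (double) (totalAllocated - freeSpace) / totalAllocated;
    }

    public static void main(String[] args) {
        FreeSpaceModel m = new FreeSpaceModel();
        long pages = 12_800;                      // 50 MB worth of 4 KB pages
        long totalAllocated = pages * PAGE_SIZE;  // never shrinks after removal
        m.bucketSizes[REUSE_BUCKET] = pages;      // all pages were recycled

        System.out.println("excluding reuse bucket: " + fillFactor(totalAllocated, m.freeSpace(false)));
        System.out.println("including reuse bucket: " + fillFactor(totalAllocated, m.freeSpace(true)));
    }
}
```

In this model, pages that sit only in the reuse bucket contribute nothing to freeSpace when the loop starts at BUCKETS - 2, so the fill factor stays at 1.0. Starting at BUCKETS - 1 brings it close to zero.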


Do you agree with my proposed change, [~sergey-chugunov] and [~jokser]?


Regards,

Colin.

> Ignite memory metrics incorrect on cache usage contraction
> --
>
> Key: IGNITE-12096
> URL: https://issues.apache.org/jira/browse/IGNITE-12096
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.7
>Reporter: Colin Cassidy
>Priority: Critical
>
> When using the Ignite metrics API to measure available memory, the usage 
> figures appear to be accurate while memory is being consumed - but when 
> memory is freed the metrics do not drop. They appear to report that memory 
> has not been freed up, even though it has.
> A reliable estimate of memory consumption is very important for solutions 
> that don't use native persistence - as this is the only controllable way of 
> avoiding a critical OOM condition.
> Reproducer below. This affects Ignite 2.7+.
> {{}}{{import org.apache.ignite.failure.NoOpFailureHandler; }}
>  {{import org.junit.Test; }}
> {{public class MemoryTest2 { }}
> {{    private static final String CACHE_NAME = "cache"; }}
>  {{    private static final String DEFAULT_MEMORY_REGION = "Default_Region"; 
> }}
>  {{    private static final long MEM_SIZE = 100L * 1024 * 1024; }}
> {{    @Test }}
>  {{    public void testOOM() throws InterruptedException { }}
>  {{        try (Ignite ignite = startIgnite("IgniteMemoryMonitorTest1")) { }}
>  {{            fillDataRegion(ignite); }}
>  {{            CacheConfiguration cfg = new }}
>  {{CacheConfiguration<>(CACHE_NAME); }}
>  {{            cfg.setStatisticsEnabled(true); }}
>  {{            IgniteCache cache = }}
>  {{ignite.getOrCreateCache(cfg); }}
> {{            // Clear all entries from the cache to free up memory }}
>  {{            memUsed(ignite); }}
>  {{            cache.clear(); }}
>  {{            cache.removeAll(); }}
>  {{            cache.put("Key", "Value"); }}
>  {{            memUsed(ignite); }}
>  {{            cache.destroy(); }}
>  {{            Thread.sleep(5000); }}
> {{            // Should now report close to 0% but reports 59% still }}
>  {{            memUsed(ignite); }}
>  {{        } }}
>  {{    } }}
>  {{    }}
>  {{    private Ignite startIgnite(String instanceName) { }}
>  {{        IgniteConfiguration cfg = 

[jira] [Commented] (IGNITE-12096) Ignite memory metrics incorrect on cache usage contraction

2019-08-23 Thread Colin Cassidy (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16914570#comment-16914570
 ] 

Colin Cassidy commented on IGNITE-12096:


getOffheapUsedSize seems to be equivalent to getTotalAllocatedPages() *
getPageSize().

That is to say, it doesn't take account of the fill factor either - so the
value doesn't come down on cache purge, even before 2.7.

I'm happy to multiply by the fill factor - but the problem seems to be that the
fill factor behaves differently from 2.7 onwards.
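
For reference, the two estimates being compared can be written as plain arithmetic. This is a minimal sketch: the class and method names (UsedMemoryEstimate, allocatedBytes, usedBytes, percentOfMax) are hypothetical, and the inputs stand in for the values returned by getTotalAllocatedPages(), getPageSize() and getPagesFillFactor().

```java
/** Hypothetical helper showing the two used-memory estimates as plain arithmetic. */
public class UsedMemoryEstimate {
    /** Allocated bytes: pages * page size. This never drops when entries are removed. */
    static long allocatedBytes(long totalAllocatedPages, int pageSize) {
        return totalAllocatedPages * pageSize;
    }

    /** Used bytes: allocated bytes scaled by the fill factor (a fraction in [0, 1]). */
    static double usedBytes(double fillFactor, long totalAllocatedPages, int pageSize) {
        return fillFactor * allocatedBytes(totalAllocatedPages, pageSize);
    }

    /** Used bytes as a percentage of the region's configured maximum size. */
    static double percentOfMax(double usedBytes, long maxSize) {
        return 100.0 * usedBytes / maxSize;
    }

    public static void main(String[] args) {
        long maxSize = 100L * 1024 * 1024; // 100 MB region, as in the reproducer
        long pages = 12_800;               // 50 MB of 4 KB pages (assumed page size)
        int pageSize = 4096;

        // Fill factor 1.0: every allocated page is reported as full.
        System.out.println(percentOfMax(usedBytes(1.0, pages, pageSize), maxSize) + "%");
        // Fill factor 0.0: pages are still allocated but reported as empty.
        System.out.println(percentOfMax(usedBytes(0.0, pages, pageSize), maxSize) + "%");
    }
}
```

The sketch uses double rather than float so that rounding in the intermediate product does not distort the estimate for large regions.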

Thanks for the link to the discussion - I'll take a look.

> Ignite memory metrics incorrect on cache usage contraction
> --
>
> Key: IGNITE-12096
> URL: https://issues.apache.org/jira/browse/IGNITE-12096
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.7
>Reporter: Colin Cassidy
>Priority: Critical
>
> When using the Ignite metrics API to measure available memory, the usage 
> figures appear to be accurate while memory is being consumed - but when 
> memory is freed the metrics do not drop. They appear to report that memory 
> has not been freed up, even though it has.
> Reproducer below. This affects Ignite 2.7+.
> import org.apache.ignite.DataRegionMetrics;
> import org.apache.ignite.Ignite;
> import org.apache.ignite.IgniteCache;
> import org.apache.ignite.Ignition;
> import org.apache.ignite.configuration.CacheConfiguration;
> import org.apache.ignite.configuration.DataRegionConfiguration;
> import org.apache.ignite.configuration.DataStorageConfiguration;
> import org.apache.ignite.configuration.IgniteConfiguration;
> import org.apache.ignite.failure.NoOpFailureHandler;
> import org.junit.Test;
>
> public class MemoryTest2 {
>     private static final String CACHE_NAME = "cache";
>     private static final String DEFAULT_MEMORY_REGION = "Default_Region";
>     private static final long MEM_SIZE = 100L * 1024 * 1024;
>
>     @Test
>     public void testOOM() throws InterruptedException {
>         try (Ignite ignite = startIgnite("IgniteMemoryMonitorTest1")) {
>             fillDataRegion(ignite);
>             CacheConfiguration<Object, Object> cfg = new CacheConfiguration<>(CACHE_NAME);
>             cfg.setStatisticsEnabled(true);
>             IgniteCache<Object, Object> cache = ignite.getOrCreateCache(cfg);
>             // Clear all entries from the cache to free up memory
>             memUsed(ignite);
>             cache.clear();
>             cache.removeAll();
>             cache.put("Key", "Value");
>             memUsed(ignite);
>             cache.destroy();
>             Thread.sleep(5000);
>             // Should now report close to 0% but reports 59% still
>             memUsed(ignite);
>         }
>     }
>
>     // Starts an in-memory node (no native persistence) with a single data region.
>     private Ignite startIgnite(String instanceName) {
>         IgniteConfiguration cfg = new IgniteConfiguration();
>         cfg.setIgniteInstanceName(instanceName);
>         cfg.setDataStorageConfiguration(createDataStorageConfiguration());
>         cfg.setFailureHandler(new NoOpFailureHandler());
>         return Ignition.start(cfg);
>     }
>
>     // Fixed-size 100 MB default region with metrics enabled.
>     private DataStorageConfiguration createDataStorageConfiguration() {
>         return new DataStorageConfiguration()
>                 .setDefaultDataRegionConfiguration(
>                         new DataRegionConfiguration()
>                                 .setName(DEFAULT_MEMORY_REGION)
>                                 .setInitialSize(MEM_SIZE)
>                                 .setMaxSize(MEM_SIZE)
>                                 .setMetricsEnabled(true));
>     }
>
>     // Writes 50 entries of 1 MB each, printing the usage after every put.
>     private void fillDataRegion(Ignite ignite) {
>         byte[] megabyte = new byte[1024 * 1024];
>         IgniteCache<Object, Object> cache = ignite.getOrCreateCache(CACHE_NAME);
>         for (int i = 0; i < 50; i++) {
>             cache.put(i, megabyte);
>             memUsed(ignite);
>         }
>     }
>
>     // Prints used memory as fill factor * allocated pages * page size, relative to the region maximum.
>     private void memUsed(Ignite ignite) {
>         DataRegionConfiguration defaultDataRegionCfg = ignite.configuration()
>                 .getDataStorageConfiguration()
>                 .getDefaultDataRegionConfiguration();
>         String regionName = defaultDataRegionCfg.getName();
>         DataRegionMetrics metrics = ignite.dataRegionMetrics(regionName);
>         float usedMem = metrics.getPagesFillFactor() *
>                 metrics.getTotalAllocatedPages() * metrics.getPageSize();
>         float pctUsed = 100 * usedMem / defaultDataRegionCfg.getMaxSize();
>         System.out.println("Memory used: " + pctUsed + "%");
>     }
> }



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (IGNITE-12096) Ignite memory metrics incorrect on cache usage contraction

2019-08-23 Thread Denis Magda (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16914551#comment-16914551
 ] 

Denis Magda commented on IGNITE-12096:
--

Colin, how about this metric that is designed to return the actual size of a
region - DataRegionMetricsMXBean.getOffheapUsedSize?

It's available since Ignite 2.5: 
https://issues.apache.org/jira/browse/IGNITE-8078

Plus, we might need to restart a discussion here if the metric doesn't suit 
your needs:
http://apache-ignite-developers.2346864.n4.nabble.com/Memory-usage-per-cache-td28470.html



[jira] [Commented] (IGNITE-12096) Ignite memory metrics incorrect on cache usage contraction

2019-08-23 Thread Colin Cassidy (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16914510#comment-16914510
 ] 

Colin Cassidy commented on IGNITE-12096:


Thanks for the response. Some observations:
 * I'm using the technique recommended to me by GG support - but happy to be 
corrected.
 * The memory usage calculation page recommends using DataStorageMetrics. This 
returns null for me even with setMetricsEnabled(true) - presumably because I am 
not using native persistence.
 * The above code worked fine up to Ignite 2.6 - so I assume there has been 
some change to the purging logic to cause this.
 * If using the DataRegion allocatedSize, it appears not to account for the
fill factor - so this value doesn't drop on cache purge even in Ignite 2.6.
 * My entries are 1MB each - comfortably larger than the page size. So I expect
fragmentation is probably minimal.
 * Memory is not reported as free even when my cache is destroyed.
 * If I remove all entries from the cache and then write them back again, the 
memory usage stays static - it doesn't drop at any point, even if the cache was 
close to full. The memory must be reclaimed at some point before it is reused - 
or is the problem that they are overwritten and never actually purged? Prior to 
2.7, I would see the fill factor drop to near 0 indicating that the pages are 
still allocated but are now considered to be empty.

For many use cases, it's important to have a timely and reasonably accurate 
estimate of memory usage because in a pure in-memory configuration (no native 
persistence) there is no other way to avoid an OOM condition. OOM is considered 
a critical error and causes the node to stop. Although this can be overridden, 
I am told that this is not a good idea.
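
As an illustration of why the estimate matters, an application could gate writes on it with something like the guard below. This is a sketch only: MemoryGuard, its high-water threshold and canAccept are hypothetical names, not part of Ignite, and the guard is only as reliable as the used-memory figure passed in.

```java
/** Hypothetical application-side guard that refuses writes above a high-water mark. */
public class MemoryGuard {
    private final double highWaterPct;

    /** @param highWaterPct maximum tolerated usage, as a percentage in (0, 100]. */
    public MemoryGuard(double highWaterPct) {
        if (highWaterPct <= 0 || highWaterPct > 100)
            throw new IllegalArgumentException("highWaterPct must be in (0, 100]");
        this.highWaterPct = highWaterPct;
    }

    /**
     * Returns true if writing incomingBytes would keep the projected usage at or
     * below the high-water mark for a region of maxSize bytes.
     */
    public boolean canAccept(double usedBytes, long incomingBytes, long maxSize) {
        double projectedPct = 100.0 * (usedBytes + incomingBytes) / maxSize;
        return projectedPct <= highWaterPct;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        MemoryGuard guard = new MemoryGuard(90.0);
        System.out.println(guard.canAccept(80.0 * mb, mb, 100 * mb)); // 81% projected
        System.out.println(guard.canAccept(90.0 * mb, mb, 100 * mb)); // 91% projected
    }
}
```

If the reported usage never drops after entries are removed, a guard like this keeps rejecting writes even though memory is available again.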


[jira] [Commented] (IGNITE-12096) Ignite memory metrics incorrect on cache usage contraction

2019-08-23 Thread Denis Magda (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16914311#comment-16914311
 ] 

Denis Magda commented on IGNITE-12096:
--

I'm not sure that is the recommended way to calculate used space. Memory
cleaning can be deferred until the compaction process kicks off:
https://apacheignite.readme.io/docs/memory-defragmentation

Try to adjust the way you do the calculation and see if there is any change:
https://apacheignite.readme.io/docs/memory-metrics#section-memory-usage-calculation

But I still believe that we need to wait for the next compaction round to purge
deleted entries from memory. [~DmitriyGovorukhin] does that sound correct?
