Correct -- I'll create it now and add you as a watcher.

On Thu, Mar 9, 2017 at 3:31 PM, Mark Payne <[email protected]> wrote:

> Excellent, thanks! Definitely looks like old records are not getting
> evicted. You have not yet created a JIRA for
> this, correct?
>
> Thanks
> -Mark
>
>
> > On Mar 9, 2017, at 10:24 AM, Joe Gresock <[email protected]> wrote:
> >
> > Good instinct -- here's what I get:
> >
> > nifi-app.log:2017-03-09 15:03:00,670 INFO [Distributed Cache Server Communications Thread: ac907dec-49a4-439e-99f5-1558f2358d87] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@40569408 checkpointed with *4262902* Records and 0 Swap Files in 256302 milliseconds (Stop-the-world time = 1378 milliseconds, Clear Edit Logs time = 19 millis), max Transaction ID 4263237
> > Looks like it's over 4.2 million records now.
> >
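For anyone tracking the record count over time, here is a minimal sketch that pulls the number out of a "checkpointed with" log line like the one above. The class and method names are made up for illustration, and the pattern only assumes the wording visible in that log line.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of NiFi or WALI.
class CheckpointLogParser {
    // Matches "checkpointed with N Records"; the optional asterisks allow for
    // the emphasis markers that were added around the number in the email.
    private static final Pattern RECORDS =
            Pattern.compile("checkpointed with \\*?(\\d+)\\*? Records");

    // Returns the record count if the line contains the phrase, else empty.
    static OptionalLong recordCount(String logLine) {
        Matcher m = RECORDS.matcher(logLine);
        return m.find()
                ? OptionalLong.of(Long.parseLong(m.group(1)))
                : OptionalLong.empty();
    }

    public static void main(String[] args) {
        String line = "org.wali.MinimalLockingWriteAheadLog@40569408 checkpointed with "
                + "4262902 Records and 0 Swap Files in 256302 milliseconds";
        recordCount(line).ifPresent(n -> System.out.println("records in snapshot: " + n));
    }
}
```

Since the FlowFile Repository and Local State Management log the same phrase, the thread name in the line would still be needed to tell the DistributedMapCacheServer entries apart.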
> > On Thu, Mar 9, 2017 at 3:13 PM, Mark Payne <[email protected]> wrote:
> >
> >> Joe,
> >>
> >> That definitely sounds like a bug causing the eviction to not happen. Can
> >> you grep your logs for the phrase "checkpointed with"? You should have a
> >> line that tells you how many records were written to the Snapshot.
> >> You will certainly see a few of these types of messages, though, because
> >> you have one for the FlowFile Repository, one for Local State Management,
> >> and another one for the DistributedMapCacheServer. I am curious to see if
> >> you see the log message indicating 3 million+ records also.
> >>
> >> Thanks
> >> -Mark
> >>
> >>
> >>> On Mar 8, 2017, at 7:13 PM, Joe Gresock <[email protected]> wrote:
> >>>
> >>> Looking through the PersistentMapCache and SimpleMapCache, it seems like
> >>> lots of these records should have been evicted by now.  We're up to 3.1
> >>> million records on disk in the snapshot file.  My understanding is that
> >>> when wali.checkpoint() is called, it collapses all the DELETE records in
> >>> the journaled log and removes them before writing the snapshot file.  Is
> >>> that accurate?
> >>>
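To make the understanding described above concrete, here is a small sketch of what "collapsing" a journal into a snapshot would mean. It is only a model of the expectation: the Edit type and snapshot method are invented for illustration and are not taken from the WALI source.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of journal replay; names are hypothetical, not WALI's API.
class JournalCollapse {
    enum Type { UPDATE, DELETE }

    static final class Edit {
        final Type type;
        final String key;
        final String value;

        Edit(Type type, String key, String value) {
            this.type = type;
            this.key = key;
            this.value = value;
        }
    }

    // Replays the journal in order; a DELETE removes the key so that it
    // does not appear in the resulting snapshot at all.
    static Map<String, String> snapshot(List<Edit> journal) {
        Map<String, String> live = new LinkedHashMap<>();
        for (Edit e : journal) {
            if (e.type == Type.DELETE) {
                live.remove(e.key);
            } else {
                live.put(e.key, e.value);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        List<Edit> journal = Arrays.asList(
                new Edit(Type.UPDATE, "a", "1"),
                new Edit(Type.UPDATE, "b", "2"),
                new Edit(Type.DELETE, "a", null));
        System.out.println(snapshot(journal).keySet());
    }
}
```

Under this model the snapshot can only grow without bound if DELETE records for evicted keys never reach the journal, which would be consistent with the eviction itself not being recorded.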
> >>> I feel like something is not going quite right with the eviction process.
> >>> I am using 1.1.1, though, and I have noticed that the PersistentMapCache
> >>> has changed in [1], so I might apply that patch and try some more
> >>> experiments.
> >>>
> >>> Would anyone be willing to try to replicate this behavior in NiFi 1.1.1?
> >>> You should be able to do it as follows:
> >>> Services:
> >>> DistributedMapCacheServer, maximum cache entries = 100,000, FIFO eviction,
> >>> persistence directory specified
> >>> DistributedMapCacheClientService, point to the same host and port
> >>>
> >>> Flow:
> >>> GenerateFlowFile (randomize 1K binary files in batches of 10, schedule 10
> >>> threads) -> HashContent (md5) into hash.value -> DetectDuplicate with
> >>> identifier = ${hash.value}, description = ., no age off, select your cache
> >>> client, cache identifier = true
> >>>
> >>> This should cause the snapshot file to exceed 100,000 keys pretty quickly,
> >>> and as far as I can tell, it never goes back down.  This in itself is not a
> >>> problem, but when the cache gets really big, it tends to crash our cluster
> >>> when NiFi reloads it into memory.
> >>>
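For comparison, this is the behavior one would expect from a FIFO cache capped at a maximum number of entries. It is a generic sketch built on java.util.LinkedHashMap, not NiFi's SimpleMapCache; the FifoCache name is made up, and the cap is shrunk to 100 so it runs quickly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Generic bounded FIFO map; an illustration, not NiFi's implementation.
class FifoCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    FifoCache(int maxEntries) {
        super(16, 0.75f, false); // false = iterate in insertion order (FIFO)
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true drops
    // the oldest inserted entry, so the size never exceeds maxEntries.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        FifoCache<String, String> cache = new FifoCache<>(100);
        for (int i = 0; i < 250; i++) {
            cache.put("key-" + i, ".");
        }
        System.out.println("entries held: " + cache.size());
    }
}
```

If the persisted snapshot tracked the in-memory cache, its key count would level off at the configured maximum in the same way instead of climbing into the millions.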
> >>> [1] https://issues.apache.org/jira/browse/NIFI-3214
> >>>
> >>>
> >>> On Wed, Mar 8, 2017 at 11:06 AM, Joe Gresock <[email protected]> wrote:
> >>>
> >>>> Thanks Bryan, I'll start looking through the PersistentMapCache.  This
> >>>> morning I checked back and the snapshot file now has 2.9 million keys in it.
> >>>>
> >>>> On Tue, Mar 7, 2017 at 4:39 PM, Bryan Bende <[email protected]> wrote:
> >>>>
> >>>>> Joe,
> >>>>>
> >>>>> I'm not that familiar with the persistence part of the DMCS, although
> >>>>> I do know that it uses the write-ahead-log that is also used by the
> >>>>> flow file repo.
> >>>>>
> >>>>> The code for PersistentMapCache is here:
> >>>>> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-services/nifi-distributed-cache-services-bundle/nifi-distributed-cache-server/src/main/java/org/apache/nifi/distributed/cache/server/map/PersistentMapCache.java
> >>>>>
> >>>>> It looks like the WAL is check-pointed during puts here:
> >>>>>
> >>>>> final long modCount = modifications.getAndIncrement();
> >>>>> if ( modCount > 0 && modCount % 100000 == 0 ) {
> >>>>>   wali.checkpoint();
> >>>>> }
> >>>>>
> >>>>> And during deletes here:
> >>>>>
> >>>>> final long modCount = modifications.getAndIncrement();
> >>>>> if (modCount > 0 && modCount % 1000 == 0) {
> >>>>>   wali.checkpoint();
> >>>>> }
> >>>>>
> >>>>> Not sure if it was intentional that put operations checkpoint every
> >>>>> 100k modifications and deletes checkpoint every 1k.
> >>>>>
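To put numbers on the difference between the two intervals, here is a small sketch that counts how often each condition would fire. It reuses the modCount check from the snippets above but is otherwise a standalone simulation; the class and method names are made up, and it assumes each operation type runs alone against its own counter.

```java
import java.util.concurrent.atomic.AtomicLong;

// Standalone simulation of the modulo check; not NiFi code.
class CheckpointCadence {
    // Counts how many times "modCount > 0 && modCount % interval == 0"
    // is true over the given number of operations.
    static long countCheckpoints(long operations, long interval) {
        AtomicLong modifications = new AtomicLong();
        long checkpoints = 0;
        for (long i = 0; i < operations; i++) {
            long modCount = modifications.getAndIncrement();
            if (modCount > 0 && modCount % interval == 0) {
                checkpoints++;
            }
        }
        return checkpoints;
    }

    public static void main(String[] args) {
        long ops = 1_000_000L;
        System.out.println("put-style interval:    " + countCheckpoints(ops, 100_000L));
        System.out.println("delete-style interval: " + countCheckpoints(ops, 1_000L));
    }
}
```

Over one million operations the put-style check fires 9 times and the delete-style check 999 times, so deletes would checkpoint roughly a hundred times more often than puts.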
> >>>>> Maybe Mark or others could shed some light on why the snapshot is
> >>>>> reaching 3GB in size.
> >>>>>
> >>>>> -Bryan
> >>>>>
> >>>>>
> >>>>> On Tue, Mar 7, 2017 at 7:07 AM, Joe Gresock <[email protected]> wrote:
> >>>>>> Hi folks,
> >>>>>>
> >>>>>> Is there a technical description of how the DistributedMapCacheServer
> >>>>>> (DMCS) persistence works?  I've noticed the following on our cluster:
> >>>>>>
> >>>>>> - I have the DMCS configured on port 4557 as FIFO with max 100,000 entries,
> >>>>>> and have specified a persistence directory
> >>>>>> - I am using DetectDuplicate with the DMCS, and the individual key length
> >>>>>> is 80 bytes, with a Description length of 1 byte.  By my count, this should
> >>>>>> result in a pure data size of 7.7MB.
> >>>>>> - I notice that the snapshot file in the persistence directory appears to
> >>>>>> continue growing past the 100,000 limit, though this may be expected
> >>>>>> depending on the implementation.  Since I know that the key will contain
> >>>>>> "json" in it, I can run the following command to count the number of
> >>>>>> possible keys in the snapshot file (though I'm not sure if this is a good
> >>>>>> way of measuring how many keys are actually cached):
> >>>>>> grep -oa json snapshot | wc -l
> >>>>>> - When the snapshot file reaches around 3GB, the DMCS has a hard time
> >>>>>> staying up, and frequently becomes unreachable (netstat -tulpn | grep 4557
> >>>>>> shows nothing).  At this point, in order to restore functionality I delete
> >>>>>> the persistence directory and let it start over.
> >>>>>>
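The 7.7MB figure in the list above can be checked with a few lines. This sketch counts only the raw key and description bytes and ignores any per-record overhead in the snapshot format, which is not described in this thread; the class and method names are made up.

```java
// Back-of-the-envelope size estimate; names are hypothetical.
class CacheSizeEstimate {
    // Raw payload bytes: entries * (key bytes + value bytes).
    static long rawBytes(long entries, int keyBytes, int valueBytes) {
        return entries * (keyBytes + valueBytes);
    }

    // Converts bytes to mebibytes (1 MiB = 1024 * 1024 bytes).
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long bytes = rawBytes(100_000L, 80, 1); // 100,000 entries, 80-byte key, 1-byte description
        System.out.println(bytes + " bytes, about "
                + String.format("%.1f", toMiB(bytes)) + " MiB");
    }
}
```

100,000 entries at 81 bytes each is 8,100,000 bytes, so a 3GB snapshot is several hundred times larger than the raw data a full cache should hold.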
> >>>>>> So my main questions are:
> >>>>>> - How are the snapshot and partition files structured, and how can I
> >>>>>> estimate how many keys are actually cached at a given time?
> >>>>>> - Is the described behavior indicative of the cache exceeding the
> >>>>>> configured max number of keys?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Joe
> >>>>>>
> >>>>>> --
> >>>>>> I know what it is to be in need, and I know what it is to have
> plenty.
> >>>>> I
> >>>>>> have learned the secret of being content in any and every situation,
> >>>>>> whether well fed or hungry, whether living in plenty or in want.  I
> >> can
> >>>>> do
> >>>>>> all this through him who gives me strength.    *-Philippians
> 4:12-13*
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >>
> >
> >
>
>


-- 
I know what it is to be in need, and I know what it is to have plenty.  I
have learned the secret of being content in any and every situation,
whether well fed or hungry, whether living in plenty or in want.  I can do
all this through him who gives me strength.    *-Philippians 4:12-13*
