Re: flush to database?

Eric Berryman Mon, 25 Jun 2018 08:15:02 -0700

So, after restarting the application all the entries up until that past
date were lost. But, thanks to your warning, I was able to back up
everything from a script, and reapply after restarting.
I also upgraded, hoping this doesn't happen again.
Thank you so much!
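
For anyone wondering how to monitor for the stall described further down (node1 kept accepting images while the Journal stopped advancing), here is a minimal sketch of the comparison. It assumes you can sample the journal revision twice (for example from the LOCAL_REVISIONS table mentioned in this thread) and count the application's own writes in between. The class and method names are hypothetical and not part of Jackrabbit, and how the numbers are read is left out.

```java
// Hypothetical helper, not part of Jackrabbit: flags a cluster node that keeps
// accepting writes while the shared journal revision stays flat.
final class JournalStallCheck {

    /**
     * @param revisionBefore journal revision at the first sample
     * @param revisionAfter  journal revision at the second sample
     * @param writesBetween  writes the application performed between the samples
     * @return true when writes happened but the revision did not advance
     */
    static boolean isStalled(long revisionBefore, long revisionAfter, long writesBetween) {
        // With no writes, a flat revision is expected and is not a stall.
        if (writesBetween <= 0) {
            return false;
        }
        return revisionAfter <= revisionBefore;
    }

    /** Number of revisions a node's local revision trails the newest journal revision. */
    static long lag(long newestJournalRevision, long localRevision) {
        return Math.max(0L, newestJournalRevision - localRevision);
    }

    public static void main(String[] args) {
        // Revision 605 is the value reported in this thread; the write count is made up.
        System.out.println(isStalled(605, 605, 10)); // prints "true"
        System.out.println(lag(605, 605));           // prints "0"
    }
}
```

The lag check alone would not have caught this case, since both nodes reported the same revision; it is the flat revision combined with ongoing writes that stands out.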


On Tue, Jun 5, 2018 at 6:47 PM Woonsan Ko <[email protected]> wrote:

> On Tue, Jun 5, 2018 at 3:15 PM, Eric Berryman <[email protected]>
> wrote:
> > Again, thank you so much for walking me through this.
> >
> > Considering your comment that the Journal updates the cache too.
> >
> > Looking at the Journal table in more detail, I see something that at least
> > makes sense.
> > The last entry of meta data in the Journal table is the same as the last
> > image on node2.
> > So, it looks as if node2 is doing what it is supposed to do, but node1 has
> > stopped writing to the Journal.
> > (So the meta data must be in the cache of node1?)
> >
> > At this point, I'm a little worried that if I restart node1, either I will
> > lose data or it will fix everything.
>
> What's the version of Jackrabbit exactly?
> Just as a wild guess (I'm not so sure), if there's a bug (such as a
> deadlock), you might lose some data when restarted.
> If you're not on the latest 2.6.x release
> (http://jackrabbit.apache.org/jcr/downloads.html#v2.6), there's a
> chance of hitting some known issues. But I'm not really sure whether or not
> it's caused by a bug. You can check the JIRA board, e.g.,
> https://issues.apache.org/jira/browse/JCR-3783
>
> >
> > Is this recoverable?  I'm guessing a restart of node1 will cause the
> > Journal to start getting updated again, but how do I get the missing
> > entries?
>
> Hmm. There's a standalone tool for backup
> (http://jackrabbit.apache.org/jcr/standalone-server.html#Backup_and_migration)
> but it's not an option for you unfortunately.
>
> Woonsan
>
> > And can I be sure that after a restart node1 will still have all the
> > entries for the past two weeks (considering it wasn't making Journal
> > entries)?
> >
> > I don't see anything in the log that suggests that something died.  I'm
> > also wondering how to monitor for this.
> >
> > Here is my repository.xml:
> > <Repository>
> >     <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
> >         <param name="path" value="/tmp/jackrabbit-olog/repository"/>
> >     </FileSystem>
> >     <DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">
> >         <param name="driver" value="javax.naming.InitialContext"/>
> >         <param name="url" value="jdbc/jcr"/>
> >         <param name="databaseType" value="mysql"/>
> >         <param name="schemaObjectPrefix" value="jcr_ds_" />
> >     </DataStore>
> >     <Security appName="Jackrabbit">
> >         <SecurityManager class="org.apache.jackrabbit.core.security.simple.SimpleSecurityManager" workspaceName="security">
> >         </SecurityManager>
> >         <AccessManager class="org.apache.jackrabbit.core.security.simple.SimpleAccessManager">
> >         </AccessManager>
> >         <LoginModule class="org.apache.jackrabbit.core.security.simple.SimpleLoginModule">
> >         </LoginModule>
> >     </Security>
> >     <Workspaces rootPath="/tmp/jackrabbit-olog/workspaces" defaultWorkspace="olog"/>
> >     <Workspace name="${wsp.name}">
> >         <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
> >             <param name="path" value="${wsp.home}"/>
> >         </FileSystem>
> >         <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.MySqlPersistenceManager">
> >             <param name="driver" value="javax.naming.InitialContext"/>
> >             <param name="url" value="jdbc/jcr"/>
> >             <param name="schema" value="mysql"/>
> >             <param name="schemaObjectPrefix" value="jcr_${wsp.name}_pm_"/>
> >             <param name="externalBLOBs" value="false"/>
> >         </PersistenceManager>
> >         <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
> >             <param name="path" value="${wsp.home}/index"/>
> >             <param name="supportHighlighting" value="true"/>
> >         </SearchIndex>
> >     </Workspace>
> >     <Versioning rootPath="/tmp/jackrabbit-olog/version">
> >         <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
> >             <param name="path" value="${rep.home}/version" />
> >         </FileSystem>
> >         <PersistenceManager class="org.apache.jackrabbit.core.state.db.SimpleDbPersistenceManager">
> >             <param name="driver" value="javax.naming.InitialContext"/>
> >             <param name="url" value="jdbc/jcr"/>
> >             <param name="schema" value="mysql"/>
> >             <param name="schemaObjectPrefix" value="jcr_pmver_"/>
> >             <param name="externalBLOBs" value="false"/>
> >         </PersistenceManager>
> >     </Versioning>
> >     <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
> >         <param name="path" value="/tmp/jackrabbit-olog/repository/index"/>
> >         <param name="supportHighlighting" value="true"/>
> >     </SearchIndex>
> >     <Cluster id="node3" syncDelay="2000">
> >         <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
> >             <param name="revision" value="${rep.home}/revision.log" />
> >             <param name="driver" value="javax.naming.InitialContext"/>
> >             <param name="url" value="jdbc/jcr"/>
> >             <param name="databaseType" value="mysql"/>
> >         </Journal>
> >     </Cluster>
> > </Repository>
> >
> > Thank you again!
> > Eric
> >
> > On Mon, Jun 4, 2018 at 10:25 AM, Woonsan Ko <[email protected]> wrote:
> >
> >> On Sat, Jun 2, 2018 at 8:39 AM, Eric Berryman <[email protected]>
> >> wrote:
> >> > - Blob data in the table doesn't necessarily mean that the binary data
> >> > is stored in a JCR node's binary property.
> >> >
> >> > - If you have a chance to use javax.jcr.Node#getNode(path) directly to
> >> > retrieve the specific node containing binary property, then I don't
> >> > think that will hit the Lucene index.
> >> >
> >> > Thank you, these are good bits of information.  I just checked, and I do
> >> > have API endpoints that only use getNode.  Node1 returns binary
> >> > properties, while node2 doesn't.  So, from your comment the index has
> >> > nothing to do with my issue.  But, it looks like your first comment puts
> >> > me on the right path.  The table has the blob, but the binary property is
> >> > probably missing in the database.  Is it possible this isn't flushed to
> >> > the database by
> >>
> >> I don't think so. Any changes must be persisted.
> >>
> >> > node1?  It seems to make sense that the large binary gets persisted,
> >> > while the small property might still be in memory?
> >>
> >> "the small property" can be persisted differently from "the larger
> >> binary", depending on "minRecordLength" parameter:
> >> - https://wiki.apache.org/jackrabbit/DataStore
> >>
> >> If the binary property data is not larger than minRecordLength, it's
> >> persisted to the PersistenceManager from the memory, not to the
> >> DataStore.
> >> By the way, do you know that the single DataStore is global across
> >> workspaces? If node1 uses a different workspace from what node2 uses
> >> and if the binary data was smaller than minRecordLength, then the
> >> binary from node1 is stored in its own database or table which might
> >> not be seen by node2 (due to different workspace / DB configuration
> >> possibly).
> >> Perhaps you can check the repository.xml file and workspace.xml
> >> file(s) on each node if there's anything different.
> >>
> >> >
> >> > Another question, is the journal only used for updating the index, or
> >> > does it do more than that?
> >>
> >> I think it should take care of the caching node state manager as well.
> >>
> >> Woonsan
> >>
> >> >
> >> > Thank you again for your help!
> >> > Eric
> >> >
> >> > On Sat, Jun 2, 2018, 00:18 Woonsan Ko <[email protected]> wrote:
> >> >
> >> >> On Fri, Jun 1, 2018 at 9:57 PM, Eric Berryman <[email protected]>
> >> >> wrote:
> >> >> > Node1 looks completely fine, and the application that uses it is in
> >> >> > production.  It's a simple java ee application that uses the jcr to
> >> >> > upload and list past images.
> >> >>
> >> >> Does the application use javax.jcr.query.Query first to retrieve the
> >> >> nodes containing binary properties? If so, it uses Lucene index for
> >> >> the query.
> >> >> If you have a chance to use javax.jcr.Node#getNode(path) directly to
> >> >> retrieve the specific node containing binary property, then I don't
> >> >> think that will hit the Lucene index. It just converts the path to
> >> >> node ids to retrieve node states from database. So, it is worth
> >> >> validating one of the recently added nodes by #getNode(path) on both
> >> >> Node1 and Node2, IMO. If it returns a node but fails to return it by
> >> >> Query, then it is a Lucene index issue. If it returns nothing in both
> >> >> ways on Node2 while it works fine on Node1, then perhaps Node2 is
> >> >> looking at a different database or tables?
> >> >>
> >> >> Regards,
> >> >>
> >> >> Woonsan
> >> >>
> >> >> >
> >> >> > I guess what I don't understand, is that they are looking at the exact
> >> >> > same database.  It seems I should be able to have node2 see it the same
> >> >> > way, and the only difference would be the index, which is in a local
> >> >> > file directory.
> >> >> >
> >> >> > So strange.
> >> >> >
> >> >> > Thank you!
> >> >> >
> >> >> > On Jun 1, 2018 21:44, "Woonsan Ko" <[email protected]> wrote:
> >> >> >
> >> >> > Hi Eric,
> >> >> >
> >> >> >
> >> >> > On Fri, Jun 1, 2018 at 1:29 PM, Eric Berryman <[email protected]>
> >> >> > wrote:
> >> >> >> Hello!
> >> >> >>
> >> >> >> I have an application that uses jackrabbit to save images, using the
> >> >> >> database filestore.
> >> >> >> I have jackrabbit clustered (node1, node2).
> >> >> >> This was working for me fine, but I started seeing an oddity.
> >> >> >> Node1 inserts an image, but node2 doesn't seem to see it when queried
> >> >> >> anymore.
> >> >> >> So, node2 is now missing about the last 2 weeks of images.
> >> >> >> I can see the correct image as a blob in the jcr_ds_DATASTORE table.
> >> >> >
> >> >> > Are you sure you are able to query or find the images in node1?
> >> >> > Blob data in the table doesn't necessarily mean that the binary data
> >> >> > is stored in a JCR node's binary property. The blob data could be
> >> >> > referred to by another node or versioned frozen node or non-existing
> >> >> > node which can be caused by node deletion but the binary data wasn't
> >> >> > garbage-collected.
> >> >> > So, I'd traverse the nodes through simple JCR API and validate if the
> >> >> > nodes really exist even in node1. You might need to ask around about
> >> >> > the paths of the recently added nodes containing the binary data to do
> >> >> > that.
> >> >> >
> >> >> >
> >> >> >>
> >> >> >> And, node2 logged that the journal has been applied.
> >> >> >> The LOCAL_REVISIONS table shows both nodes have a revision id of 605
> >> >> >> (although I do have 1364 images).
> >> >> >>
> >> >> >> I've tried adding enableConsistencyCheck=true and
> >> >> >> forceConsistencyCheck=true to the index part of the repository.xml
> >> >> >> file.
> >> >> >> But, I don't see any errors.  Just, that the consistency check
> >> >> >> happened.
> >> >> >>
> >> >> >> I've also tried clearing the index directory of node2.  Jackrabbit
> >> >> >> recreates the index, applies the 605 journal entries, then ends up in
> >> >> >> the same state without the last two weeks of images.
> >> >> >>
> >> >> >> Are there any ideas to fix what seems to be an index issue?
> >> >> >
> >> >> > I'm kind of suspicious that some of the new nodes in the last two weeks
> >> >> > might have been removed for some reason. You can perhaps rule out
> >> >> > this possibility by inspecting JCR nodes on node1 first.
> >> >> >
> >> >> > Regards,
> >> >> >
> >> >> > Woonsan
> >> >> >
> >> >> >
> >> >> >>
> >> >> >> Any help or ideas to troubleshoot are greatly appreciated.
> >> >> >> (jackrabbit 2.6)
> >> >> >>
> >> >> >> Thank you!
> >> >> >> Eric
> >> >>
> >>
>
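
The troubleshooting advice quoted above (compare javax.jcr.Node#getNode(path) with javax.jcr.query.Query for a recently added node) can be written down as a small decision table. This is only a sketch of that reasoning: it assumes the node is retrievable both ways on Node1, it does not call the JCR API itself, and the class, enum, and method names are hypothetical.

```java
// Hypothetical helper that encodes the checks described in the thread for Node2;
// the two flags are the results of looking up the same recently added node.
final class Node2Diagnosis {

    enum Result { CONSISTENT, LUCENE_INDEX_ISSUE, CHECK_DATABASE_CONFIG, INCONCLUSIVE }

    /**
     * @param foundByPath  whether a direct getNode(path) lookup on Node2 returned the node
     * @param foundByQuery whether a Query on Node2 returned the node
     */
    static Result diagnose(boolean foundByPath, boolean foundByQuery) {
        if (foundByPath && foundByQuery) {
            return Result.CONSISTENT;
        }
        if (foundByPath) {
            // Reachable by path but missing from query results: points at the index.
            return Result.LUCENE_INDEX_ISSUE;
        }
        if (!foundByQuery) {
            // Missing both ways: Node2 may be looking at a different database or tables.
            return Result.CHECK_DATABASE_CONFIG;
        }
        // Found by query but not by path is not covered by the advice in the thread.
        return Result.INCONCLUSIVE;
    }

    public static void main(String[] args) {
        System.out.println(diagnose(true, false));  // prints "LUCENE_INDEX_ISSUE"
        System.out.println(diagnose(false, false)); // prints "CHECK_DATABASE_CONFIG"
    }
}
```

In this thread the getNode-only endpoints also came back without the binary properties on node2, which ruled out the index; with both nodes on the same database, the follow-up then turned to the Journal.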
