On Sat, Jun 2, 2018 at 8:39 AM, Eric Berryman <[email protected]> wrote:
> - Blob data in the table doesn't necessarily mean that the binary data
> is stored in a JCR node's binary property.
>
> - If you have a chance to use javax.jcr.Node#getNode(path) directly to
> retrieve the specific node containing binary property, then I don't think
> that will hit the Lucene index.
>
> Thank you, these are good bits of information. I just checked, and I do
> have API endpoints that only use getNode. Node1 returns binary properties,
> while node2 doesn't. So, from your comment the index has nothing to do
> with my issue. But, it looks like your first comment puts me in the right
> path. The table has the blob, but the binary property is probably missing
> in the database. Is it possible this isn't flushed to the database by
I don't think so. Any changes must be persisted.

> node1? It seems to make sense that the large binary gets persisted, while
> the small property might still be in memory?

"The small property" can be persisted differently from "the larger binary",
depending on the "minRecordLength" parameter:
- https://wiki.apache.org/jackrabbit/DataStore

If the binary property data is not larger than minRecordLength, it is
persisted from memory to the PersistenceManager, not to the DataStore.

By the way, do you know that the single DataStore is global across
workspaces? If node1 uses a different workspace from what node2 uses and if
the binary data was smaller than minRecordLength, then the binary from node1
is stored in its own database or table, which might not be seen by node2
(possibly due to a different workspace / DB configuration).
Perhaps you can check the repository.xml file and workspace.xml file(s) on
each node to see if anything is different.

>
> Another question, is the journal only used for updating the index, or does
> it do more than that?

I think it should take care of the caching node state manager as well.

Woonsan

>
> Thank you again for your help!
> Eric
>
> On Sat, Jun 2, 2018, 00:18 Woonsan Ko <[email protected]> wrote:
>
>> On Fri, Jun 1, 2018 at 9:57 PM, Eric Berryman <[email protected]>
>> wrote:
>> > Node1 looks completely fine, and the application that uses it is in
>> > production. It's a simple java ee application that uses the jcr to upload
>> > and list past images.
>>
>> Does the application use javax.jcr.query.Query first to retrieve the
>> nodes containing binary properties? If so, it uses Lucene index for
>> the query.
>> If you have a chance to use javax.jcr.Node#getNode(path) directly to
>> retrieve the specific node containing binary property, then I don't
>> think that will hit the Lucene index. It just converts the path to
>> node ids to retrieve node states from database.
>> So, it is worth validating one of the recently added nodes by
>> #getNode(path) on both Node1 and Node2, IMO. If it returns a node but
>> fails to return it by Query, then it is a Lucene index issue. If it
>> returns nothing in both ways on Node2 while it works fine on Node1,
>> then perhaps is Node2 looking at a different database or tables?
>>
>> Regards,
>>
>> Woonsan
>>
>> >
>> > I guess what I don't understand, is that they are looking at the exact same
>> > database. It seems I should be able to have node2 see it the same way, and
>> > the only difference would be the index, which is in a local file directory.
>> >
>> > So strange.
>> >
>> > Thank you!
>> >
>> > On Jun 1, 2018 21:44, "Woonsan Ko" <[email protected]> wrote:
>> >
>> > Hi Eric,
>> >
>> >
>> > On Fri, Jun 1, 2018 at 1:29 PM, Eric Berryman <[email protected]>
>> > wrote:
>> >> Hello!
>> >>
>> >> I have an application that uses jackrabbit to save images, using the
>> >> database filestore.
>> >> I have jackrabbit clustered (node1, node2).
>> >> This was working for me fine, but I started seeing an oddity.
>> >> Node1 inserts an image, but node2 doesn't seem to see it when queried
>> >> anymore.
>> >> So, node2 is now missing about the last 2 weeks of images.
>> >> I can see the correct image as a blob in the jcr_ds_DATASTORE table.
>> >
>> > Are you sure you are able to query or find the images in node1?
>> > Blob data in the table doesn't necessarily mean that the binary data
>> > is stored in a JCR node's binary property. The blob data could be
>> > referred by another node or versioned frozen node or non-existing node
>> > which can be caused by node deletion but the binary data wasn't
>> > garbage-collected.
>> > So, I'd traverse the nodes through simple JCR API and validate if the
>> > nodes really exists even in node1. You might need to ask around about
>> > the paths of the recently added nodes containing the binary data to do
>> > that.
>> >
>> >
>> >>
>> >> And, node2 logged that the journal has been applied.
>> >> The LOCAL_REVISIONS table shows both nodes have a revision id of 605
>> >> (although I do have 1364 images).
>> >>
>> >> I've tried adding enableConsistencyCheck=true and
>> >> forceConsistencyCheck=true to the index part of the repository.xml file.
>> >> But, I don't see any errors. Just, that the consistency check happened.
>> >>
>> >> I've also tried clearing the index directory of node2. Jackrabbit
>> >> recreates the index, applies the 605 journal entries, then ends up in the
>> >> same state without the last two weeks of images.
>> >>
>> >> Are there any ideas to fix what seems to be an index issue.
>> >
>> > I'm kind of suspicious that some of the new nodes in last two weeks
>> > might have been removed for some reasons. You can perhaps rule out
>> > this possibility by inspecting JCR nodes on node1 first.
>> >
>> > Regards,
>> >
>> > Woonsan
>> >
>> >
>> >>
>> >> Any help or ideas to troubleshoot are greatly appreciated.
>> >> (jackrabbit 2.6)
>> >>
>> >> Thank you!
>> >> Eric
>>
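
For reference when comparing repository.xml on the two cluster nodes: the
minRecordLength parameter discussed in this thread is set on the DataStore
element. The fragment below is only a rough sketch, assuming a DbDataStore
(suggested by the jcr_ds_DATASTORE table, but not confirmed in the thread).
The URL, credentials, driver, database type and the value 100 are
placeholders for illustration, not the poster's actual settings.

```xml
<!-- Sketch only: every value here is a placeholder, not taken from the thread. -->
<DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">
    <!-- Both cluster nodes should point at the same database. -->
    <param name="url" value="jdbc:postgresql://db.example.invalid/jackrabbit"/>
    <param name="user" value="jcr_user"/>
    <param name="password" value="changeme"/>
    <param name="databaseType" value="postgresql"/>
    <param name="driver" value="org.postgresql.Driver"/>
    <!-- Size threshold in bytes. Binaries below it are stored inline by the
         workspace PersistenceManager instead of in the DataStore table. -->
    <param name="minRecordLength" value="100"/>
    <!-- Assumed prefix, inferred from the table name jcr_ds_DATASTORE. -->
    <param name="schemaObjectPrefix" value="jcr_ds_"/>
</DataStore>
```

If this element is the same on both nodes, the next things to compare are the
PersistenceManager sections in each workspace.xml, since small binaries end
up there rather than in the DataStore.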
