[ https://issues.apache.org/jira/browse/JCR-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292706#comment-13292706 ]
Thomas Mueller commented on JCR-3333:
-------------------------------------

What version of Jackrabbit do you use exactly? Recent versions of Jackrabbit use the data store feature by default, which stores each binary only once:

http://wiki.apache.org/jackrabbit/DataStore

If you use a very old version of Jackrabbit, the data store is not available. In that case the same binary can be stored multiple times, but this isn't a bug; it's just the way it's designed.

> The binary file entities are stored twice in the DB
> ---------------------------------------------------
>
>                 Key: JCR-3333
>                 URL: https://issues.apache.org/jira/browse/JCR-3333
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>          Components: JCR 2.0
>         Environment: Windows 7, Linux
>            Reporter: P.C.Sun
>         Attachments: repository.xml
>
>
> We are using JCR in Liferay to store documents, which means all documents
> are stored in the DB as binaries. Recently we found that the size of the DB
> is increasing very fast, so we ran some SQL queries to get the size of the
> documents:
> 1. select sum(size_) from dlfileentry (the Liferay table that stores file
> metadata such as name and size); -> total size of all documents recorded in
> that table.
> The result is: 43330765874, which means around 40.36 GB
> 2. The DB size report is: around 95.97 GB.
> 3. Within these tables, there are two very big tables:
> j_pm_liferay_binval -> 52.07 GB
> j_v_pm_binval -> 43.65 GB
> So the question is: if the documents themselves are only around 40.36 GB,
> what are those two tables storing? Judging by their names, they are both
> binval tables. Does it mean every document is stored twice? What's inside
> those tables?
> In this case, the DB grew by around 30 GB within 3 months, which is really
> fast. Any suggestion to improve this?
> Liferay replied that the table j_v_pm_binval stores the file versions.
> However, an entry is also stored there for each new document, while we think
> one should be created only when a new version is generated. They also
> mentioned that to solve this we need to change repository.xml; however, we
> don't know how to deal with the old files, or whether they will get lost if
> we change the config file.
> Please let me know whether it is possible to clean them up in the DB.
> Thank you very much; looking forward to your reply.
> Best Regards.
> P.C.(JACK) SUN
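
For readers unfamiliar with the data store feature that Thomas Mueller's comment refers to, the fragment below is a minimal sketch of how a file-based data store is typically declared in repository.xml. The path and the minRecordLength value are illustrative assumptions, not settings taken from the attached repository.xml, and the sketch does not cover what happens to binaries already held in the binval tables.

```xml
<!-- Illustrative repository.xml fragment; path and threshold are assumed values -->
<DataStore class="org.apache.jackrabbit.core.data.FileDataStore">
  <!-- Directory holding the binaries; identical content is stored only once -->
  <param name="path" value="${rep.home}/repository/datastore"/>
  <!-- Binaries smaller than this many bytes are kept inline, not in the data store -->
  <param name="minRecordLength" value="100"/>
</DataStore>
```

With an element like this in place, large binaries are written to the data store and referenced from there, which is why the same file is not stored once per workspace and again per version.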