[ 
https://issues.apache.org/jira/browse/JCR-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292706#comment-13292706
 ] 

Thomas Mueller commented on JCR-3333:
-------------------------------------

What version of Jackrabbit do you use exactly?

Recent versions of Jackrabbit use the data store feature by default, which 
stores each binary only once: http://wiki.apache.org/jackrabbit/DataStore

If you use a very old version of Jackrabbit, the data store is not available. 
In this case, the same binary can be stored multiple times, but this isn't a 
bug - it's just the way it's designed.
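
To illustrate what "store the binary once" means, here is a minimal sketch of 
a content-addressed store. It is not Jackrabbit code: the class and method 
names are hypothetical, the store is in memory only, and using SHA-256 as the 
identifier is an assumption made for the example.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store (hypothetical) that keeps each distinct binary once."""

    def __init__(self):
        # Maps content digest -> binary content.
        self._records = {}

    def add(self, data: bytes) -> str:
        """Store data if it is not already present and return its identifier."""
        # Assumption for this sketch: the identifier is the SHA-256 of the content.
        identifier = hashlib.sha256(data).hexdigest()
        # setdefault leaves an existing record untouched, so identical
        # content is never stored a second time.
        self._records.setdefault(identifier, data)
        return identifier

    def get(self, identifier: str) -> bytes:
        """Return the binary stored under the given identifier."""
        return self._records[identifier]

    def stored_bytes(self) -> int:
        """Total number of bytes actually held by the store."""
        return sum(len(content) for content in self._records.values())


store = ContentAddressedStore()
document = b"quarterly report" * 1000

# The same binary is referenced from the current item and from its version.
current_id = store.add(document)
version_id = store.add(document)
```

In this sketch both references resolve to the same identifier, so the store 
holds a single copy of the document however many items or versions point to it.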
                
> The binary file entities are stored twice in the DB
> ---------------------------------------------------
>
>                 Key: JCR-3333
>                 URL: https://issues.apache.org/jira/browse/JCR-3333
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>          Components: JCR 2.0
>         Environment: Windows 7, Linux
>            Reporter: P.C.Sun
>         Attachments: repository.xml
>
>
> We are using JCR in Liferay to store documents, which means all documents 
> are stored in the DB as binaries. Recently we found that the size of the DB 
> is increasing very fast, so we ran SQL queries to get the size of the documents: 
> 1. select sum(size_) from dlfileentry (the Liferay table that stores file 
> metadata, such as name and size); -> total size of all documents recorded in 
> the dlfileentry table. The result is: 43330765874, which means around 40.36 GB
> 2. The DB size report is: around 95.97 GB. 
> 3. Within these tables, there are two very big tables: 
> j_pm_liferay_binval -> 52.07GB
> j_v_pm_binval -> 43.65 GB
> So the question is: if the documents themselves are only around 40.36 GB, 
> what are those two tables storing? Judging by their names, both are binval 
> tables. Does it mean every document is stored twice? What's 
> inside those tables? 
> The DB has grown by around 30 GB within 3 months, which is really fast. Any 
> suggestions to improve this? 
> Liferay replied that the table j_v_pm_binval stores file versions. However, 
> an entry is also stored there for a new document, whereas we think it 
> should be created only when a new version is generated. They also mentioned 
> that to solve this we need to change repository.xml; however, we don't know 
> how to deal with the old files, or whether they will get lost if we 
> change the config file.
> Please let me know whether it is possible to clean them up in the DB. 
> Thank you very much; we look forward to your reply. 
> Best Regards.
> P.C.(JACK) SUN

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
