[ 
https://issues.apache.org/jira/browse/JENA-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500751#comment-17500751
 ] 

Lorenz Bühmann edited comment on JENA-2225 at 3/3/22, 1:35 PM:
---------------------------------------------------------------

[~andy] I added a minimized stats.opt to this issue. It is sufficient to have 
just a large total count value beyond the integer range in the meta statistics; 
at least, this is the first place where the StatsMatcher already fails. I am 
not sure whether any of the other Items are also handled elsewhere in the 
source code, though. 


was (Author: lorenzb):
[~andy] I added a minimized stats.opt to this issue. It is sufficient to have 
just a large total count value beyond integer range in the meta statistics, at 
least this is where the StatsMatcher already fails. 

> TDB/TDB2 dataset size stat serialized incorrectly for large datasets
> --------------------------------------------------------------------
>
>                 Key: JENA-2225
>                 URL: https://issues.apache.org/jira/browse/JENA-2225
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB, TDB2
>    Affects Versions: Jena 4.3.1
>            Reporter: Lorenz Bühmann
>            Assignee: Andy Seaborne
>            Priority: Minor
>             Fix For: Jena 4.4.0
>
>         Attachments: stats.opt
>
>
> When computing the TDB/TDB2 stats via the CLI, the size is serialized 
> incorrectly for large datasets.
> For example, for the latest Wikidata Truthy we get
> {noformat}
> (count -1983667112)){noformat}
> This happens because, for both TDB and TDB2, the corresponding `Stats.java` 
> class forces an Integer-typed Node even though the value is a long:
> {code:java}
> if ( count >= 0 )
>     addPair(meta.getList(), StatsMatcher.COUNT,
>             NodeFactoryExtra.intToNode((int)count)) ;
> {code}
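
The effect of the `(int)` cast in the quoted snippet can be reproduced in plain Java without any Jena classes. The following is only a sketch: the class and method names are made up for illustration, and the count of 6606267480 is a hypothetical value picked because it wraps to the number reported above, not the actual dataset size.

```java
// Sketch only: plain Java, no Jena classes involved.
// CountNarrowing, narrow and render are hypothetical names for illustration.
class CountNarrowing {
    // Mirrors the (int) cast in the quoted snippet: only the low 32 bits survive.
    static int narrow(long count) {
        return (int) count;
    }

    // Renders the count without narrowing, so the full long value is kept.
    static String render(long count) {
        return "(count " + Long.toString(count) + ")";
    }

    public static void main(String[] args) {
        // Hypothetical count beyond Integer.MAX_VALUE (assumption, not the real size).
        long count = 6_606_267_480L;

        // Prints "(count -1983667112)": the value has silently wrapped around.
        System.out.println("(count " + narrow(count) + ")");

        // Prints "(count 6606267480)": the long value is written as-is.
        System.out.println(render(count));

        // Math.toIntExact throws instead of wrapping silently.
        try {
            Math.toIntExact(count);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```

Any count above 2147483647 (`Integer.MAX_VALUE`) is affected in the same way; whether the result comes out negative or as a smaller positive number depends on the low 32 bits of the value.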



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
