[GitHub] [jena] afs commented on a change in pull request #603: JENA-1756: Dependency updates
afs commented on a change in pull request #603: JENA-1756: Dependency updates
URL: https://github.com/apache/jena/pull/603#discussion_r323472276

## File path: jena-base/src/main/java/org/apache/jena/atlas/lib/DateTimeUtils.java
## @@ -18,29 +18,22 @@
 package org.apache.jena.atlas.lib;
+import java.time.ZonedDateTime;
+import java.time.format.DateTimeFormatter;
 import java.util.Calendar ;
 import java.util.Date ;
 import java.util.GregorianCalendar ;
 import org.apache.commons.lang3.time.FastDateFormat ;
 public class DateTimeUtils {
-
-// Include timezone (even xsd:dates have timezones; Calendars have timezones)
-// NB in SimpleDateFormat != FastDateFormat
-// SimpleDateFormat does not format Calendars.
-// SimpleDateFormat has "X" for ISO format tmezones (+00:00)
-//FastDateFormat uses "ZZ" for this.
-private static final FastDateFormat dateTimeFmt_display = FastDateFormat.getInstance("/MM/dd HH:mm:ss z") ;
-private static final FastDateFormat dateFmt_mmdd= FastDateFormat.getInstance("-MM-ddZZ") ;
-// For milliseconds == 0
-private static final FastDateFormat dateTimeFmt_XSD_ms0 = FastDateFormat.getInstance("-MM-dd'T'HH:mm:ssZZ") ;
-// For milliseconds != 0
-private static final FastDateFormat dateTimeFmt_XSD_ms = FastDateFormat.getInstance("-MM-dd'T'HH:mm:ss.SSSZZ") ;
-// For milliseconds == 0
-private static final FastDateFormat timeFmt_XSD_ms0 = FastDateFormat.getInstance("HH:mm:ssZZ") ;
-// For milliseconds != 0
-private static final FastDateFormat timeFmt_XSD_ms = FastDateFormat.getInstance("HH:mm:ss.SSSZZ") ;
+// Use xxx to get +00:00 format with DateTimeFormatter

Review comment:
Nor did I until a few hours ago!

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [jena] kinow commented on a change in pull request #603: JENA-1756: Dependency updates
kinow commented on a change in pull request #603: JENA-1756: Dependency updates
URL: https://github.com/apache/jena/pull/603#discussion_r323470387

## File path: jena-base/src/main/java/org/apache/jena/atlas/lib/DateTimeUtils.java
## @@ -18,29 +18,22 @@
 package org.apache.jena.atlas.lib;
+import java.time.ZonedDateTime;
+import java.time.format.DateTimeFormatter;
 import java.util.Calendar ;
 import java.util.Date ;
 import java.util.GregorianCalendar ;
 import org.apache.commons.lang3.time.FastDateFormat ;
 public class DateTimeUtils {
-
-// Include timezone (even xsd:dates have timezones; Calendars have timezones)
-// NB in SimpleDateFormat != FastDateFormat
-// SimpleDateFormat does not format Calendars.
-// SimpleDateFormat has "X" for ISO format tmezones (+00:00)
-//FastDateFormat uses "ZZ" for this.
-private static final FastDateFormat dateTimeFmt_display = FastDateFormat.getInstance("/MM/dd HH:mm:ss z") ;
-private static final FastDateFormat dateFmt_mmdd= FastDateFormat.getInstance("-MM-ddZZ") ;
-// For milliseconds == 0
-private static final FastDateFormat dateTimeFmt_XSD_ms0 = FastDateFormat.getInstance("-MM-dd'T'HH:mm:ssZZ") ;
-// For milliseconds != 0
-private static final FastDateFormat dateTimeFmt_XSD_ms = FastDateFormat.getInstance("-MM-dd'T'HH:mm:ss.SSSZZ") ;
-// For milliseconds == 0
-private static final FastDateFormat timeFmt_XSD_ms0 = FastDateFormat.getInstance("HH:mm:ssZZ") ;
-// For milliseconds != 0
-private static final FastDateFormat timeFmt_XSD_ms = FastDateFormat.getInstance("HH:mm:ss.SSSZZ") ;
+// Use xxx to get +00:00 format with DateTimeFormatter

Review comment:
Oh, didn't know about `xxx` for the time-zone offset. Thanks!
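For readers who have not met the `xxx` pattern letter either, here is a minimal sketch of how it differs from `XXX` in `java.time.format.DateTimeFormatter`. The class and method names (`OffsetPatternDemo`, `withColonOffset`, `withZForUtc`) and the full pattern strings are illustrative assumptions, not code from the pull request.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class OffsetPatternDemo {
    // "xxx" always prints a numeric offset with a colon, e.g. +00:00 or +05:30.
    private static final DateTimeFormatter LOWER_X =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssxxx");
    // "XXX" prints the same form, except that a zero offset is written as "Z".
    private static final DateTimeFormatter UPPER_X =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssXXX");

    /** Hypothetical helper: formats with an explicit +hh:mm offset, even for UTC. */
    static String withColonOffset(ZonedDateTime dateTime) {
        return LOWER_X.format(dateTime);
    }

    /** Hypothetical helper: formats with "Z" when the offset is zero. */
    static String withZForUtc(ZonedDateTime dateTime) {
        return UPPER_X.format(dateTime);
    }

    public static void main(String[] args) {
        ZonedDateTime utc = ZonedDateTime.of(2019, 9, 12, 10, 15, 30, 0, ZoneOffset.UTC);
        ZonedDateTime plus0530 = utc.withZoneSameInstant(ZoneOffset.ofHoursMinutes(5, 30));

        System.out.println(withColonOffset(utc));      // 2019-09-12T10:15:30+00:00
        System.out.println(withZForUtc(utc));          // 2019-09-12T10:15:30Z
        System.out.println(withColonOffset(plus0530)); // 2019-09-12T15:45:30+05:30
    }
}
```

The lower-case form is the one that matches the `+00:00` output mentioned in the new code comment in the diff.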
Re: documentation and examples
On 11/09/2019 18:40, ajs6f wrote:
>>> Adding it to the build means that the documented examples should always
>>> stay in step with the code it pulls from the tests, and they must pass.
>>
>> Good idea to have a build step to help keep them up-to-date.
>
> I've used systems like this and they work well, but I think that we should do this after we move to a more graceful documentation build. In JENA-1755 Andy mentions Jekyll (which has come up before here) and some new features from INFRA for managing sites that should make it more automated.
>
> If this is the Jekyll in question:
>
> https://jekyllrb.com/docs/
>
> do we have a good way to do a migration? Bruno, I seem to remember you having some experience with such a migration -- is that right? If so I would be happy to work with you to do this, if we all end up agreeing to it.

I'm hoping the bulk of the conversion work is a Perl script to redo the top of each file; Jekyll has a short header section. Otherwise, the skeleton needs converting (one file) and there are bound to be "others" in small numbers. I'm less clear about styling, but that's because I haven't looked at all.

I've mentioned Jekyll because I've used it (e.g. RDF Delta) and styled sites with it. It is the one GitHub uses, which makes it one of the baseline systems developers have come across. The content is markdown, which is the main point.

Other recommendations? There are a lot of static site generators, most of which look suitable. Picking by implementation language is as good a factor as any! Longevity, stability and maturity are important because we won't want to keep changing the site.

Andy

> ajs6f
>
>> On Sep 7, 2019, at 12:52 PM, Andy Seaborne wrote:
>>
>> On 05/09/2019 11:46, Claude Warren wrote:
>>> There were recently some comments about the lack of query builder
>>> documentation (https://issues.apache.org/jira/browse/JENA-1751), so taking
>>> that to heart I sat down to write some. Then I recalled I had seen a
>>> discussion on one of the other lists about generating examples for the web
>>> from example and test code.
>>> I was wondering
>>> a) if anybody else saw the discussion and if so do you remember where?
>>> b) if we should do something like that in Jena.
>>
>> Not the same thing, but several modules have "src-examples" so that code is
>> available to be linked to. It gives the opportunity of adding them to the local
>> IDE set so that they are compiled.
>>
>>> Adding it to the build means that the documented examples should always
>>> stay in step with the code it pulls from the tests, and they must pass.
>>
>> Good idea to have a build step to help keep them up-to-date.
>>
>> There is the under-used jena-examples.
>> Maybe that could be used.
>>
>>> If there is interest I will see if I can find the other discussion.
>>> Claude
>>
>> Andy
Re: [jira] [Commented] (JENA-1755) Improve documentation of Query Builders
On 11/09/2019 19:42, Claude Warren wrote:
> I was just speaking to infra about this. Will write more in depth later.
> But we should think about how we want to work. I favor adding the specific
> docs to src/site and working from there but I am certain there are lots of
> opinions here.

I don't know what that means. Could you expand that a bit please?

Andy

> Claude
>
> On Wed, Sep 11, 2019, 08:35 Andy Seaborne (Jira) wrote:
>> [ https://issues.apache.org/jira/browse/JENA-1755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927696#comment-16927696 ]
>>
>> Andy Seaborne commented on JENA-1755:
>> -
>> And we need to think about migrating to e.g. Jekyll.
>>
>> I think that new services from INFRA means we can setup a job to automate the website staging.
>>
>> [ https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories ]
>>
>> "automate web site builds using pelican (and other systems)"
>>
>> I haven't had time to dig into the details.
>>
>>> Improve documentation of Query Builders
>>> ---
>>>
>>> Key: JENA-1755
>>> URL: https://issues.apache.org/jira/browse/JENA-1755
>>> Project: Apache Jena
>>> Issue Type: Improvement
>>> Reporter: Jan Martin Keil
>>> Priority: Major
>>>
>>> As discussed in JENA-1751, I propose to improve the documentation of the query builders:
>>> {quote}Unfortunately, I did not find (and I think there isn't) any documentation or tutorial about the query builders explaining more than the very basics. Also the JavaDoc (which is to the best of my knowledge nowhere linked on [https://jena.apache.org/]), is, in my experience, not helpful and makes it often necessary to look into the code to understand what is needed and maybe find out how to get it. If I did not miss a comprehensive documentation somewhere, I think it would be worth, to improve documentation. Even a few words at the builder classes (mentioning e.g. ExprFactory) and small examples at the more complicated methods would help a lot.
>>> {quote}
>>
>> --
>> This message was sent by Atlassian Jira (v8.3.2#803003)
[GitHub] [jena] afs opened a new pull request #603: JENA-1756: Dependency updates
afs opened a new pull request #603: JENA-1756: Dependency updates URL: https://github.com/apache/jena/pull/603
[jira] [Commented] (JENA-1756) Update jena dependencies
[ https://issues.apache.org/jira/browse/JENA-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927930#comment-16927930 ] Andy Seaborne commented on JENA-1756: - Apache Commons Lang3 (3.4 to 3.9) introduces one change in {{DateTimeUtils}} (date and time formatting code). The {{FastDateFormat}} class does not handle timezones correctly (or at least, 3.4 and 3.9 behave differently with 3.9 always outputting as UTC). The solution is to swap to {{java.time.format.DateTimeFormatter}}. > Update jena dependencies > > > Key: JENA-1756 > URL: https://issues.apache.org/jira/browse/JENA-1756 > Project: Apache Jena > Issue Type: Bug >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > Fix For: Jena 3.13.0 > > > The following dependency updates can be done: > jsonldjava – 0.12.5 > commonslang3 -- 3.9 > commonscsv – 1.7 > httpclient – 4.5.10 > micrometer – 1.2.1 > commons-collections4 – 4.4 > -- This message was sent by Atlassian Jira (v8.3.2#803003)
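The swap described in the comment above can be sketched roughly as follows: convert the `Calendar` to a `ZonedDateTime`, which carries the calendar's own time zone, and format it with a `DateTimeFormatter`. This is only an illustration under stated assumptions. The class name `CalendarXsdFormat`, the method `calendarToXsdDateTime` and the pattern strings are hypothetical and are not taken from Jena's `DateTimeUtils`; the split between a zero and a non-zero milliseconds field mirrors the comments visible in the removed `FastDateFormat` code.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class CalendarXsdFormat {
    // Pattern used when the milliseconds field is zero.
    private static final DateTimeFormatter XSD_NO_MILLIS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssxxx");
    // Pattern used when the milliseconds field is non-zero.
    private static final DateTimeFormatter XSD_MILLIS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSxxx");

    /** Hypothetical helper: formats a calendar in its own time zone, not in UTC. */
    static String calendarToXsdDateTime(GregorianCalendar calendar) {
        // toZonedDateTime() keeps the calendar's zone, so the offset is preserved.
        ZonedDateTime dateTime = calendar.toZonedDateTime();
        boolean hasMillis = dateTime.getNano() != 0;
        return (hasMillis ? XSD_MILLIS : XSD_NO_MILLIS).format(dateTime);
    }

    public static void main(String[] args) {
        GregorianCalendar calendar = new GregorianCalendar(TimeZone.getTimeZone("GMT+02:00"));
        calendar.clear();                          // reset all fields, including milliseconds
        calendar.set(2019, 8, 12, 10, 15, 30);     // months are zero-based: 8 is September
        System.out.println(calendarToXsdDateTime(calendar)); // 2019-09-12T10:15:30+02:00

        calendar.set(Calendar.MILLISECOND, 250);
        System.out.println(calendarToXsdDateTime(calendar)); // 2019-09-12T10:15:30.250+02:00
    }
}
```

Because the formatter works on a value that already includes its zone, the output offset does not depend on how the formatting library treats `Calendar` time zones.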
[jira] [Assigned] (JENA-1756) Update jena dependencies
[ https://issues.apache.org/jira/browse/JENA-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne reassigned JENA-1756: --- Assignee: Andy Seaborne > Update jena dependencies > > > Key: JENA-1756 > URL: https://issues.apache.org/jira/browse/JENA-1756 > Project: Apache Jena > Issue Type: Bug >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > > The following dependency updates can be done: > jsonldjava – 0.12.5 > commonslang3 -- 3.9 > commonscsv – 1.7 > httpclient – 4.5.10 > micrometer – 1.2.1 > commons-collections4 – 4.4 > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (JENA-1756) Update jena dependencies
[ https://issues.apache.org/jira/browse/JENA-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne updated JENA-1756: Fix Version/s: Jena 3.13.0 > Update jena dependencies > > > Key: JENA-1756 > URL: https://issues.apache.org/jira/browse/JENA-1756 > Project: Apache Jena > Issue Type: Bug >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > Fix For: Jena 3.13.0 > > > The following dependency updates can be done: > jsonldjava – 0.12.5 > commonslang3 -- 3.9 > commonscsv – 1.7 > httpclient – 4.5.10 > micrometer – 1.2.1 > commons-collections4 – 4.4 > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (JENA-1756) Update jena dependencies
Andy Seaborne created JENA-1756: --- Summary: Update jena dependencies Key: JENA-1756 URL: https://issues.apache.org/jira/browse/JENA-1756 Project: Apache Jena Issue Type: Bug Reporter: Andy Seaborne The following dependency updates can be done: jsonldjava – 0.12.5 commonslang3 -- 3.9 commonscsv – 1.7 httpclient – 4.5.10 micrometer – 1.2.1 commons-collections4 – 4.4 -- This message was sent by Atlassian Jira (v8.3.2#803003)
Re: [jira] [Commented] (JENA-1755) Improve documentation of Query Builders
I was just speaking to infra about this. Will write more in depth later. But we should think about how we want to work. I favor adding the specific docs to src/site and working from there but I am certain there are lots of opinions here. Claude On Wed, Sep 11, 2019, 08:35 Andy Seaborne (Jira) wrote: > > [ > https://issues.apache.org/jira/browse/JENA-1755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927696#comment-16927696 > ] > > Andy Seaborne commented on JENA-1755: > - > > And we need to think about migrating to e.g. Jekyll. > > I think that new services from INFRA means we can setup a job to automate > the website staging. > > [ > https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories > ] > > "automate web site builds using pelican (and other systems) " > > I haven't had time to dig into the details. > > > Improve documentation of Query Builders > > --- > > > > Key: JENA-1755 > > URL: https://issues.apache.org/jira/browse/JENA-1755 > > Project: Apache Jena > > Issue Type: Improvement > >Reporter: Jan Martin Keil > >Priority: Major > > > > As discussed in JENA-1751, I propose to improve the documentation of the > query builders: > > {quote}Unfortunately, I did not find (and I think there isn't) any > documentation or tutorial about the query builders explaining more than the > very basics. Also the JavaDoc (which is to the best of my knowledge nowhere > linked on [https://jena.apache.org/]), is, in my experience, not helpful > and makes it often necessary to look into the code to understand what is > needed and maybe find out how to get it. If I did not miss a comprehensive > documentation somewhere, I think it would be worth, to improve > documentation. Even a few words at the builder classes (mentioning e.g. > ExprFactory) and small examples at the more complicated methods would help > a lot. > > {quote} > > > > -- > This message was sent by Atlassian Jira > (v8.3.2#803003) >
Re: documentation and examples
>> Adding it to the build means that the documented examples should always
>> stay in step with the code it pulls from the tests, and they must pass.
>
> Good idea to have a build step to help keep them up-to-date.

I've used systems like this and they work well, but I think that we should do this after we move to a more graceful documentation build. In JENA-1755 Andy mentions Jekyll (which has come up before here) and some new features from INFRA for managing sites that should make it more automated.

If this is the Jekyll in question:

https://jekyllrb.com/docs/

do we have a good way to do a migration? Bruno, I seem to remember you having some experience with such a migration -- is that right? If so I would be happy to work with you to do this, if we all end up agreeing to it.

ajs6f

> On Sep 7, 2019, at 12:52 PM, Andy Seaborne wrote:
>
> On 05/09/2019 11:46, Claude Warren wrote:
>> There were recently some comments about the lack of query builder
>> documentation (https://issues.apache.org/jira/browse/JENA-1751), so taking
>> that to heart I sat down to write some. Then I recalled I had seen a
>> discussion on one of the other lists about generating examples for the web
>> from example and test code.
>> I was wondering
>> a) if anybody else saw the discussion and if so do you remember where?
>> b) if we should do something like that in Jena.
>
> Not the same thing, but several modules have "src-examples" so that code is
> available to be linked to. It gives the opportunity of adding them to the local
> IDE set so that they are compiled.
>
>> Adding it to the build means that the documented examples should always
>> stay in step with the code it pulls from the tests, and they must pass.
>
> Good idea to have a build step to help keep them up-to-date.
>
> There is the under-used jena-examples.
> Maybe that could be used.
>
>> If there is interest I will see if I can find the other discussion.
>> Claude
>
> Andy
[jira] [Resolved] (JENA-1733) SHACL Engine
[ https://issues.apache.org/jira/browse/JENA-1733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne resolved JENA-1733. - Fix Version/s: Jena 3.13.0 Resolution: Fixed > SHACL Engine > > > Key: JENA-1733 > URL: https://issues.apache.org/jira/browse/JENA-1733 > Project: Apache Jena > Issue Type: Task >Affects Versions: Jena 3.12.0 >Reporter: Andy Seaborne >Assignee: Andy Seaborne >Priority: Major > Fix For: Jena 3.13.0 > > Time Spent: 3h > Remaining Estimate: 0h > > Include SHACL in the distribution. > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Resolved] (JENA-1746) TDB2 rollback method clashes with nodetable cache
[ https://issues.apache.org/jira/browse/JENA-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne resolved JENA-1746.
-
Fix Version/s: Jena 3.13.0
Assignee: Andy Seaborne
Resolution: Fixed

> TDB2 rollback method clashes with nodetable cache
> -
>
> Key: JENA-1746
> URL: https://issues.apache.org/jira/browse/JENA-1746
> Project: Apache Jena
> Issue Type: Bug
> Components: TDB2
> Affects Versions: Jena 3.11.0, Jena 3.12.0
> Environment: Linux 3.16.0-9-amd64 #1 SMP Debian 3.16.68-2 (2019-06-17) x86_64 GNU/Linux
> java version "1.8.0_05"
> Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)
> Reporter: Miklós Győrfi
> Assignee: Andy Seaborne
> Priority: Critical
> Fix For: Jena 3.13.0
> Attachments: jena-test.tgz
> Time Spent: 20m
> Remaining Estimate: 0h
>
> *Issue:* Inserting triples, then rolling back the TDB2 dataset, and then loading nodes again, including some nodes with the same content as before, causes artifacts: some nodes disappear and some nodes are replaced. Moreover, it unrecoverably *corrupts* the database files: accessing triples may then cause a RiotThriftException.
> **org.apache.jena.riot.thrift.RiotThriftException: No conversion to a Node:
>
> *Reproduction*: Create some quads in a non-empty dataset, then roll it back, and create the same triples again in another order, using anonymous and URL nodes simultaneously. Although this method does not guarantee the issue, the probability is high.
>
> *Cause*: My investigation shows that the culprit is the {{NodeTableCache}}. It caches the node - nodeId relation of the backing table ({{NodeTableTRDF}}), but the cache does not react to the rollback (abort) operation. During rollback, the backing table invalidates the node Ids. A node Id is closely related to the position of the node data in the node data file, so new inserts can reuse these invalidated node Ids, or Ids close to them, for other nodes. As the nodes (those remaining in the cache but not written, and the new ones) then overlap each other, reading them back causes Thrift errors, or later causes missing nodes in the index. The data of the cached nodes disappears if they fall out of the cache or the dataset is reopened.
>
> *Possible fix:* None of the NodeTables registers for and reacts to the rollback; only the backing file and index are restored. The best possible solution is _creating an option for these components to react to the restoration_. The cache may then evict cached data, or may track changes in transactions and evict only those. In any case, evicting all the caches is justifiable in rollback situations.
> TransactionCoordinator has collections for shutdownHooks and for transactionsComponents. This is a good pattern for creating another collection for notification interfaces, and calling these back on transactional events. CacheNodeTable (and other objects) can then be a listener to these events, and may evict the cache if necessary.
> Another possibility is to create a callback option in the NodeTable to react to the invalidation, and propagate the invalidation back through the NodeTable hierarchy.
> Another, simpler fix is to propagate the thread-safe storage "version" down into the NodeTables, check it in the cache, and evict.
>
> *Workaround:* Skipping the cache (setting nodeToIdCacheSize and idToNodeCacheSize to -1 in StoreParams) is a good workaround for now, but causes performance issues.

-- This message was sent by Atlassian Jira (v8.3.2#803003)
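The listener idea in the "Possible fix" section can be sketched in plain Java as follows. This is a minimal sketch, assuming a coordinator that notifies registered listeners on commit and abort. Every name here (`AbortAwareCacheSketch`, `TxnListener`, `Coordinator`, `NodeIdCache`) is hypothetical and none of it is the Jena API or the code that was eventually committed. The cache remembers which entries were added during the current transaction and drops only those on abort.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class AbortAwareCacheSketch {

    /** Hypothetical notification interface; an assumption, not Jena's API. */
    interface TxnListener {
        void onCommit();
        void onAbort();
    }

    /** Hypothetical stand-in for a coordinator that calls listeners back. */
    static class Coordinator {
        private final List<TxnListener> listeners = new ArrayList<>();

        void addListener(TxnListener listener) { listeners.add(listener); }
        void commit() { listeners.forEach(TxnListener::onCommit); }
        void abort()  { listeners.forEach(TxnListener::onAbort); }
    }

    /** Node-to-id cache that forgets entries made in an aborted transaction. */
    static class NodeIdCache implements TxnListener {
        private final Map<String, Long> nodeToId = new HashMap<>();
        private final Set<String> addedInTxn = new HashSet<>();

        void put(String node, long id) {
            // Track only entries that are new in this transaction.
            if (nodeToId.putIfAbsent(node, id) == null) {
                addedInTxn.add(node);
            }
        }

        Long get(String node) { return nodeToId.get(node); }

        @Override public void onCommit() {
            addedInTxn.clear();                        // entries are now durable
        }

        @Override public void onAbort() {
            nodeToId.keySet().removeAll(addedInTxn);   // evict uncommitted entries
            addedInTxn.clear();
        }
    }

    public static void main(String[] args) {
        Coordinator coordinator = new Coordinator();
        NodeIdCache cache = new NodeIdCache();
        coordinator.addListener(cache);

        cache.put("<http://example/a>", 1L);
        coordinator.commit();
        cache.put("_:b0", 2L);
        coordinator.abort();

        System.out.println(cache.get("<http://example/a>")); // 1
        System.out.println(cache.get("_:b0"));               // null
    }
}
```

Evicting only the entries from the aborted transaction keeps the committed part of the cache warm; clearing the whole cache on abort would be simpler and, as the report notes, is also justifiable.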
[jira] [Commented] (JENA-1746) TDB2 rollback method clashes with nodetable cache
[ https://issues.apache.org/jira/browse/JENA-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927815#comment-16927815 ] ASF subversion and git services commented on JENA-1746: --- Commit 4eafab1de66c2638cb7348e020af29bbe664616d in jena's branch refs/heads/master from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=4eafab1 ] JENA-1746: BufferingCache for NodeTableCache > TDB2 rollback method clashes with nodetable cache > - > > Key: JENA-1746 > URL: https://issues.apache.org/jira/browse/JENA-1746
[jira] [Commented] (JENA-1746) TDB2 rollback method clashes with nodetable cache
[ https://issues.apache.org/jira/browse/JENA-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927816#comment-16927816 ] ASF subversion and git services commented on JENA-1746: --- Commit c1a84039080a29dd20a02a79c0793724564f7c11 in jena's branch refs/heads/master from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=c1a8403 ] Merge pull request #602 from afs/jena1746-tdb2-abort JENA-1746: TDB2 abort > TDB2 rollback method clashes with nodetable cache > - > > Key: JENA-1746 > URL: https://issues.apache.org/jira/browse/JENA-1746
[GitHub] [jena] afs merged pull request #602: JENA-1746: TDB2 abort
afs merged pull request #602: JENA-1746: TDB2 abort URL: https://github.com/apache/jena/pull/602
[jira] [Commented] (JENA-1746) TDB2 rollback method clashes with nodetable cache
[ https://issues.apache.org/jira/browse/JENA-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927814#comment-16927814 ] ASF subversion and git services commented on JENA-1746: --- Commit 26e96398d8b6d1024ac1456bcfb85d8b4a1319a1 in jena's branch refs/heads/master from Andy Seaborne [ https://gitbox.apache.org/repos/asf?p=jena.git;h=26e9639 ] JENA-1746: TransactionListener > TDB2 rollback method clashes with nodetable cache > - > > Key: JENA-1746 > URL: https://issues.apache.org/jira/browse/JENA-1746
[jira] [Commented] (JENA-1755) Improve documentation of Query Builders
[ https://issues.apache.org/jira/browse/JENA-1755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927696#comment-16927696 ] Andy Seaborne commented on JENA-1755: - And we need to think about migrating to e.g. Jekyll. I think that new services from INFRA means we can setup a job to automate the website staging. [https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories] "automate web site builds using pelican (and other systems) " I haven't had time to dig into the details. > Improve documentation of Query Builders > --- > > Key: JENA-1755 > URL: https://issues.apache.org/jira/browse/JENA-1755 > Project: Apache Jena > Issue Type: Improvement >Reporter: Jan Martin Keil >Priority: Major > > As discussed in JENA-1751, I propose to improve the documentation of the > query builders: > {quote}Unfortunately, I did not find (and I think there isn't) any > documentation or tutorial about the query builders explaining more than the > very basics. Also the JavaDoc (which is to the best of my knowledge nowhere > linked on [https://jena.apache.org/]), is, in my experience, not helpful and > makes it often necessary to look into the code to understand what is needed > and maybe find out how to get it. If I did not miss a comprehensive > documentation somewhere, I think it would be worth, to improve documentation. > Even a few words at the builder classes (mentioning e.g. ExprFactory) and > small examples at the more complicated methods would help a lot. > {quote} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (JENA-1755) Improve documentation of Query Builders
[ https://issues.apache.org/jira/browse/JENA-1755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927652#comment-16927652 ] Claude Warren commented on JENA-1755: - I became aware of this issue during the discussion of JENA-1751 which I then used in my talk at ApacheCon when I was discussing the difficulty in finding documentation. I am very interested in helping to sort out this issue and will both author and review documentation. We may need to revisit how we distribute the documentation across the site. But I think that can wait until we get the documentation for Query builder in good shape. > Improve documentation of Query Builders > --- > > Key: JENA-1755 > URL: https://issues.apache.org/jira/browse/JENA-1755 > Project: Apache Jena > Issue Type: Improvement >Reporter: Jan Martin Keil >Priority: Major > > As discussed in JENA-1751, I propose to improve the documentation of the > query builders: > {quote}Unfortunately, I did not find (and I think there isn't) any > documentation or tutorial about the query builders explaining more than the > very basics. Also the JavaDoc (which is to the best of my knowledge nowhere > linked on [https://jena.apache.org/]), is, in my experience, not helpful and > makes it often necessary to look into the code to understand what is needed > and maybe find out how to get it. If I did not miss a comprehensive > documentation somewhere, I think it would be worth, to improve documentation. > Even a few words at the builder classes (mentioning e.g. ExprFactory) and > small examples at the more complicated methods would help a lot. > {quote} -- This message was sent by Atlassian Jira (v8.3.2#803003)