[GitHub] jena pull request: JENA-1085: Common pattern for transaction lifec...
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/108#discussion_r47445013 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/mem/DatasetGraphInMemory.java --- @@ -55,21 +51,15 @@ private final DatasetPrefixStorage prefixes = new DatasetPrefixStorageInMemory(); +/** This lock imposes the multiple-reader and single-writer policy of transactions */ private final Lock writeLock = new LockMRPlusSW(); -private Lock writeLock() { -return writeLock; -} - -private final ReentrantReadWriteLock commitLock = new ReentrantReadWriteLock(true); - /** - * Commits must be atomic, and because a thread that is committing alters the various indexes one after another, we - * lock out {@link #begin(ReadWrite)} while {@link #commit()} is executing. + * Transaction lifecycle operations must be atomi, especially begin and commit where --- End diff -- This comment is a bit confusing. How about: ``` We have a lock for committing because transaction lifecycle operations must be atomic, especially {@link Transactional#begin} and {@link Transactional#commit}, for which the global state of the (possibly multiple) indexes may be read or written one after another. ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1085: Common pattern for transaction lifec...
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/108#issuecomment-164263376 I'm not sure why you renamed `commitLock`. I would think that `systemLock` would be a better name for the lock supporting `getLock`, which is not the same and is in fact a separate MRSW lock. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1085: Common pattern for transaction lifec...
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/108#discussion_r47445042 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/mem/DatasetGraphInMemory.java --- @@ -125,57 +115,104 @@ public DatasetGraphInMemory(final QuadTable i, final TripleTable t) { @Override public void begin(final ReadWrite readWrite) { -if (isInTransaction()) throw new JenaTransactionException("Transactions cannot be nested!"); -transactionType(readWrite); -isInTransaction(true); -writeLock().enterCriticalSection(readWrite.equals(READ)); // get the dataset write lock, if needed. -commitLock().readLock().lock(); // if a commit is proceeding, wait so that we see a coherent index state -try { +if (isInTransaction()) +throw new JenaTransactionException("Transactions cannot be nested!"); +startTransaction(readWrite) ; +_begin(readWrite) ; +} + +private void _begin(ReadWrite readWrite) { +withLock(systemLock, () ->{ quadsIndex().begin(readWrite); defaultGraph().begin(readWrite); -} finally { -commitLock().readLock().unlock(); -} +}) ; +} + +/** Called transaction start code at most once per transaction. */ +private void startTransaction(ReadWrite mode) { +writeLock.enterCriticalSection(mode.equals(READ)); // get the dataset write lock, if needed. +transactionType(mode); +isInTransaction(true); } +/** Called transaction ending code at most once per transaction. */ +private void finishTransaction() { +isInTransaction.remove(); +transactionType.remove(); +writeLock.leaveCriticalSection(); +} + @Override public void commit() { -if (!isInTransaction()) throw new JenaTransactionException("Tried to commit outside a transaction!"); -commitLock().writeLock().lock(); -try { +if (!isInTransaction()) +throw new JenaTransactionException("Tried to commit outside a transaction!"); +if (transactionType().equals(WRITE)) +_commit(); +finishTransaction(); +} + +private void _commit() { +withLock(systemLock, () -> { quadsIndex().commit(); defaultGraph().commit(); -} finally { -commitLock().writeLock().unlock(); -} -isInTransaction.remove(); -writeLock().leaveCriticalSection(); +quadsIndex().end(); +defaultGraph().end(); +} ) ; } - + @Override public void abort() { -if (!isInTransaction()) throw new JenaTransactionException("Tried to abort outside a transaction!"); -end(); +if (!isInTransaction()) +throw new JenaTransactionException("Tried to abort outside a transaction!"); +if (transactionType().equals(WRITE)) +_abort(); +finishTransaction(); } +private void _abort() { +withLock(systemLock, () -> { +quadsIndex().abort(); +defaultGraph().abort(); +quadsIndex().end(); +defaultGraph().end(); +} ) ; +} + @Override public void close() { -if (isInTransaction()) abort(); +if (isInTransaction()) +abort(); } @Override public void end() { if (isInTransaction()) { -if (transactionType().equals(WRITE)) -log.warn("end() called for WRITE transaction without commit or abort having been called"); +if (transactionType().equals(WRITE)) { +log.warn("end() called for WRITE transaction without commit or abort having been called causing a forced abort"); --- End diff -- This message is a tiny bit ambiguous if you don't already understand that `end` calls `_abort`. Maybe just add a sentence break: `"end() called for WRITE transaction without commit or abort having been called. This causes a forced abort!"` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1085: Common pattern for transaction lifec...
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/108#issuecomment-164263343 I agree about`ReentrantLock`. Lighter for the same value. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1078: Logging warning when transaction is ...
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/105#issuecomment-163937760 I had no idea that throwing WARNINGs in the test suite was any kind of issue. ERRORs, yes, but not WARNINGs. That's a policy that might be recorded somewhere, perhaps [here](https://jena.apache.org/getting_involved/reviewing_contributions.html)? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1083: Refactoring in TupleTable and subtyp...
Github user ajs6f closed the pull request at: https://github.com/apache/jena/pull/106 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1078: Logging warning when transaction is ...
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/105#issuecomment-163938075 In any event, I can review them and eliminate them. No need for you to spend time doing that-- this is my code and I take responsibility for it. Would you reopen the ticket for me? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1085 : Common pattern for passing down tra...
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/107#discussion_r47392305 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/mem/DatasetGraphInMemory.java --- @@ -166,16 +179,31 @@ public void close() { @Override public void end() { if (isInTransaction()) { -if (transactionType().equals(WRITE)) -log.warn("end() called for WRITE transaction without commit or abort having been called"); +if (transactionType().equals(WRITE)) { +log.warn("end() called for WRITE transaction without commit or abort having been called causing a forced abort"); +_abort() ; +} +finishTransaction(); +return ; --- End diff -- Why an "naked" `return` here? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1085 : Common pattern for passing down tra...
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/107#issuecomment-164023587 Yes, I think that renaming `writeLock` makes sense. The alternative would be to actually check the type of transaction and avoid fiddling with that lock if it is a READ. I think either is okay and either is clearer than what is there now. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1085 : Common pattern for passing down tra...
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/107#issuecomment-164023866 I'm fine with pulling out the default impl of `TupleTable::abort`. I myself don't find it any less clear, but if you do, that's evidence enough. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1083: Refactoring in TupleTable and subtyp...
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/106#issuecomment-163739254 There are a lot of whitespace changes in this PR, for which I apologize. They are present because I have switched my IDE to 4-space indents to match the more commons usage in the Jena codebase. If this is too confusing and it would be preferably to go back to the form that I had been using, please say so and I will be happy to so do. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1083: Refactoring in TupleTable and subtyp...
GitHub user ajs6f opened a pull request: https://github.com/apache/jena/pull/106 JENA-1083: Refactoring in TupleTable and subtypes for clarity and concision You can merge this pull request into a Git repository by running: $ git pull https://github.com/ajs6f/jena TupleTableCleanup Alternatively you can review and apply these changes as the patch at: https://github.com/apache/jena/pull/106.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #106 commit 7769970b0124b3a054a20d315e44dce77c33e030 Author: ajs6f <aj...@virginia.edu> Date: 2015-12-10T20:07:18Z Refactoring in TupleTable and subtypes for clarity and concision --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1078: Logging warning when transaction is ...
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/105#discussion_r47086225 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/mem/DatasetGraphInMemory.java --- @@ -161,17 +161,17 @@ public void abort() { @Override public void close() { if (isInTransaction()) abort(); - } @Override public void end() { if (isInTransaction()) { +log.warn("Ending transaction without commit!"); --- End diff -- Fixed with latest commit. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1078: Logging warning when transaction is ...
GitHub user ajs6f opened a pull request: https://github.com/apache/jena/pull/105 JENA-1078: Logging warning when transaction is ended without commiting You can merge this pull request into a Git repository by running: $ git pull https://github.com/ajs6f/jena AddNoCommitWarning Alternatively you can review and apply these changes as the patch at: https://github.com/apache/jena/pull/105.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #105 commit 53975ed362e27c88cea7326789fad5c1d009e98e Author: ajs6f <aj...@virginia.edu> Date: 2015-12-08T16:53:25Z Logging warning when transaction is ended without commiting --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: Using ThreadLocal::remove to clean thread-speci...
GitHub user ajs6f opened a pull request: https://github.com/apache/jena/pull/104 Using ThreadLocal::remove to clean thread-specific state You can merge this pull request into a Git repository by running: $ git pull https://github.com/ajs6f/jena RemoveThreadLocals Alternatively you can review and apply these changes as the patch at: https://github.com/apache/jena/pull/104.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #104 commit 3bdb61814e27714dd1736f52bc03355354a29ed8 Author: ajs6f <aj...@virginia.edu> Date: 2015-12-04T17:00:38Z Using ThreadLocal::remove to clean thread-specific state --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r46154284 --- Diff: jena-base/pom.xml --- @@ -50,21 +50,29 @@ org.apache.commons commons-csv - - + + com.github.andrewoma.dexx dexx-collections - + org.apache.commons commons-lang3 +org.apache.jena --- End diff -- Again, why is `jena-shaded-guava` being pulled in twice, with the wrong version, when it is already pulled in above? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r46153931 --- Diff: jena-arq/pom.xml --- @@ -123,9 +123,18 @@ com.jayway.awaitility awaitility test + --- End diff -- This is totally confusing. Not only does it appear that you are pulling in the same dependency twice, this very dependency is already pulled in at line 76 above! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624: Correction to transaction begin for D...
GitHub user ajs6f opened a pull request: https://github.com/apache/jena/pull/103 JENA-624: Correction to transaction begin for DatasetGraphInMemory Correction to transaction begin for DatasetGraphInMemory to include default graph as well as quad table. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ajs6f/jena CorrectionToDatasetGraphInMemory Alternatively you can review and apply these changes as the patch at: https://github.com/apache/jena/pull/103.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #103 commit c9b292293d307de305700981adb3d2d6d424d732 Author: ajs6f <aj...@virginia.edu> Date: 2015-11-30T18:23:34Z Correction to transaction begin for DatasetGraphInMemory to include default graph as well as quad table. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/95#issuecomment-159014671 Re: `ResultSet`: Yes and, I think you want to avoid using them again even if they are not complete. As soon as a thread has called `::next` even once, they are no longer safe for reuse. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: A few suggestions for @osma
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/101#issuecomment-158935620 Sure, I was just responding to the question about commits. I agree that code must be written in a style that supports maintenance over the long term. I'd make two points about that: code that is written in archaic style is just as hard to maintain as code that is written in ultra-modern style, and it would be really wonderful if Jena could create some formatting standards (e.g. for indentation, etc.) Common formatting can help a lot with readability. I tried to begin that conversation some months ago, but I didn't seem to get much enthusiasm. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r45605114 --- Diff: .travis.yml --- @@ -0,0 +1,22 @@ + # Licensed to the Apache Software Foundation (ASF) under one --- End diff -- I think Travis-CI is great and use it all the time, but I think perhaps there should be some discussion on-list about it first. There is already some CI somewhere (although I don't where and I'm not sure that ordinary mortals have access to it-- @afs ?). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/95#issuecomment-158940056 I'm not usually one to complain about this sort of thing, but there is so much whitespace-change noise in this that it is _really_ hard to figure what has actually been changed where. There are at least several files here that haven't actually been changed at all but have dozens of lines of spurious delta-ing in them. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: A few suggestions for @osma
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/101#issuecomment-158933847 Maybe there's confusion about what this PR is for-- I wasn't intending that it would be merged, but that you would use it as a source of examples, like I wrote. The intention was to show off the various techniques (lambdas, etc.) in context. You would want to make a PR using whatever you like here. For example, I show a few uses of `Optional` as the return type of methods that return a value-or-`null`, but there are many such places. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: A few suggestions for @osma
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/101#issuecomment-158938940 Just to be clear, I don't mean to argue that Jena's code is particularly archaic, just that we should aim for balance between taking advantage of new capabilities and avoiding the bleeding edge. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r45605185 --- Diff: jena-arq/src/main/java/org/apache/jena/atlas/json/io/JSWriter.java --- @@ -67,7 +67,7 @@ public void startOutput() {} public static final String ObjectSep = " ," ; public static final String ObjectPairSep = " : " ; -// Remember whether we are in the first element of a compound --- End diff -- There is no actual change to this file... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r45605299 --- Diff: jena-arq/src/main/java/org/apache/jena/query/ResultSetFormatter.java --- @@ -48,7 +48,7 @@ public class ResultSetFormatter { // See also ResultSetMgr -- this post-dates this code. // Ideally, the operation here should call ResultSetMgr. - --- End diff -- There is no actual change to this file... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r45619712 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/servlets/ResponseOutputStream.java --- @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.fuseki.servlets; + +import java.io.IOException; +import java.io.OutputStream; + +/** + * This class is the data replicator of output + * streams. These streams sit on top of an already existing output + * stream (the underlying output stream) which it uses as its + * basic sink of data, but possibly coping the data along the + * way or providing additional functionality. + * + */ + +public class ResponseOutputStream extends OutputStream { --- End diff -- This is a strange name for a type, the avowed purpose of which is to _replicate_ data. Why is it called "ResponseOutputStream"? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r45619804 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/servlets/ResponseOutputStream.java --- @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.fuseki.servlets; + +import java.io.IOException; +import java.io.OutputStream; + +/** + * This class is the data replicator of output + * streams. These streams sit on top of an already existing output + * stream (the underlying output stream) which it uses as its + * basic sink of data, but possibly coping the data along the + * way or providing additional functionality. + * + */ + +public class ResponseOutputStream extends OutputStream { --- End diff -- Could this not be done without introducing a new type just by using `TeeOutputStream`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r45620222 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/servlets/ResponseResultSet.java --- @@ -309,13 +296,81 @@ private static void output(HttpAction action, String contentType, String charset out.println("## Query cancelled due to timeout during execution ##") ; out.println("## Incomplete results ##") ; out.flush() ; -// No point raising an exception - 200 was sent already. +// No point raising an exception - 200 was sent already. //errorOccurred(ex) ; } -// Includes client gone. -} catch (IOException ex) +// Includes client gone. +} catch (IOException ex) { ServletOps.errorOccurred(ex) ; } // Do not call httpResponse.flushBuffer(); here - Jetty closes the stream if it is a gzip stream -// then the JSON callback closing details can't be added. +// then the JSON callback closing details can't be added. +} + +private static void output(HttpAction action, String contentType, String charset, OutputContent proc, CacheAction cacheAction) +{ +try { +setHttpResponse(action, contentType, charset) ; +action.response.setStatus(HttpSC.OK_200) ; + +OutputStream outServlet = action.response.getOutputStream(); +Cache cache = SPARQL_Query_Cache.getCache(); --- End diff -- Shouldn't this be properly parameterized? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/95#issuecomment-158980337 I'm honestly more and more confused about the design here, using `CacheAction` and `CacheEntry`. It seems that the same work could be accomplished more clearly either by using Jena's extant cache types (properly parameterized) or even just a good `Map` implementation, relying on `Map::computeIfAbsent`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r45620264 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/servlets/SPARQL_Query_Cache.java --- @@ -0,0 +1,249 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.jena.fuseki.servlets; + +import org.apache.jena.atlas.io.IO; +import org.apache.jena.atlas.io.IndentedLineBuffer; +import org.apache.jena.atlas.lib.Cache; +import org.apache.jena.atlas.lib.CacheFactory; +import org.apache.jena.atlas.web.ContentType; +import org.apache.jena.fuseki.DEF; +import org.apache.jena.fuseki.Fuseki; +import org.apache.jena.fuseki.FusekiLib; +import org.apache.jena.fuseki.cache.CacheAction; +import org.apache.jena.fuseki.cache.CacheEntry; +import org.apache.jena.query.*; +import org.apache.jena.riot.web.HttpNames; +import org.apache.jena.sparql.core.Prologue; +import org.apache.jena.sparql.resultset.SPARQLResult; +import org.apache.jena.web.HttpSC; + +import javax.servlet.http.HttpServletRequest; +import javax.servlet.http.HttpServletResponse; +import java.io.IOException; +import java.io.InputStream; +import java.util.*; + +import static java.lang.String.format; +import static org.apache.jena.fuseki.server.CounterName.QueryTimeouts; +import static org.apache.jena.riot.WebContent.*; +import static org.apache.jena.riot.web.HttpNames.*; +import static org.apache.jena.riot.web.HttpNames.paramTimeout; + +public class SPARQL_Query_Cache extends SPARQL_Protocol { + +private static final String QueryParseBase = Fuseki.BaseParserSPARQL ; + +private static Cache cache = null; --- End diff -- Why isn't this properly parameterized? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/95#issuecomment-158987655 @rvesse Sorry, I didn't phrase that well. I meant that commits _can_ get squashed, so there's no need to worry about making extra ones. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/95#issuecomment-158996139 If I understand you correctly, and you are providing `ResultSet` as the cached value in some cases, are you dealing with threading? That is, are you ensuring that you don't supply the same (same object) cached `ResultSet` as a response to two different requests at the same time? It would be nice to see some tests to ensure that this new caching facility is threadsafe in that sense. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: A few suggestions for @osma
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/101#issuecomment-159008507 Sure, @afs . It might be good to get some of this written down in the docs. I'll happily close this PR and leave the one directly to @osma's clone in place for him to do just as he likes with. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/95#issuecomment-159011031 @afs Yes, I work on other projects where that is the case. My feeling is that (irrespective of policy) it's good form to squash down, not all the way to one commit, but down to a reasonable number that properly package the substantive changes at play. But as I said, I didn't mean to imply that Jena has some particular policy and I'm sorry if I caused any confusion. On a semi-related note, there are a lot of merge commits in this PR. @samaitra, it's of course your choice as to how to work, but if you are not familiar with `git rebase`, it can be really helpful when trying to stay up to date with an upstream master as you work. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r45621175 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/servlets/SPARQL_Update.java --- @@ -92,13 +93,15 @@ protected void doOptions(HttpServletRequest request, HttpServletResponse respons response.setHeader(HttpNames.hAllow, "OPTIONS,POST") ; response.setHeader(HttpNames.hContentLengh, "0") ; } - + protected void doOptions(HttpAction action) { doOptions(action.request, action.response) ; } @Override protected void perform(HttpAction action) { +Cache cache = SPARQL_Query_Cache.getCache(); --- End diff -- A more subtle approach would be to clear the cache only if the action goes through. If it fails and no chance is made to the data, there's no need to clear the cache. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r45625411 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/servlets/SPARQL_Query_Cache.java --- @@ -0,0 +1,249 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.jena.fuseki.servlets; + +import org.apache.jena.atlas.io.IO; +import org.apache.jena.atlas.io.IndentedLineBuffer; +import org.apache.jena.atlas.lib.Cache; +import org.apache.jena.atlas.lib.CacheFactory; +import org.apache.jena.atlas.web.ContentType; +import org.apache.jena.fuseki.DEF; +import org.apache.jena.fuseki.Fuseki; +import org.apache.jena.fuseki.FusekiLib; +import org.apache.jena.fuseki.cache.CacheAction; +import org.apache.jena.fuseki.cache.CacheEntry; +import org.apache.jena.query.*; +import org.apache.jena.riot.web.HttpNames; +import org.apache.jena.sparql.core.Prologue; +import org.apache.jena.sparql.resultset.SPARQLResult; +import org.apache.jena.web.HttpSC; + +import javax.servlet.http.HttpServletRequest; +import javax.servlet.http.HttpServletResponse; +import java.io.IOException; +import java.io.InputStream; +import java.util.*; + +import static java.lang.String.format; +import static org.apache.jena.fuseki.server.CounterName.QueryTimeouts; +import static org.apache.jena.riot.WebContent.*; +import static org.apache.jena.riot.web.HttpNames.*; +import static org.apache.jena.riot.web.HttpNames.paramTimeout; + +public class SPARQL_Query_Cache extends SPARQL_Protocol { + +private static final String QueryParseBase = Fuseki.BaseParserSPARQL ; + +private static Cache cache = null; + +private int CACHE_SIZE = 1; + +public SPARQL_Query_Cache() { +super() ; +if(cache == null){ +cache = CacheFactory.createCache(CACHE_SIZE); +} +} + +// All the params we support +protected static List allParams = Arrays.asList(paramQuery, paramDefaultGraphURI, paramNamedGraphURI, +paramQueryRef, paramStyleSheet, paramAccept, paramOutput1, +paramOutput2, paramCallback, paramForceAccept, paramTimeout) ; + +// Choose REST verbs to support. + +// doMethod : Not used with UberServlet dispatch. + +@Override +protected void doPost(HttpServletRequest request, HttpServletResponse response) { +doCommon(request, response) ; +} + +@Override +protected void doGet(HttpServletRequest request, HttpServletResponse response) { +doCommon(request, response) ; +} + +@Override +protected void doOptions(HttpServletRequest request, HttpServletResponse response) { +setCommonHeadersForOptions(response) ; +response.setHeader(HttpNames.hAllow, "GET,OPTIONS,POST") ; +response.setHeader(HttpNames.hContentLengh, "0") ; +} + +protected void doOptions(HttpAction action) { +doOptions(action.request, action.response) ; +} + +@Override +protected void validate(HttpAction action) { +} + +@Override +protected void perform(HttpAction action) { + +// OPTIONS +if ( action.request.getMethod().equals(HttpNames.METHOD_OPTIONS) ) { +// Share with update via SPARQL_Protocol. +doOptions(action) ; +return ; +} + +// GET +if ( action.request.getMethod().equals(HttpNames.METHOD_GET) ) { +executeWithParameter(action) ; +return ; +} + +ContentType ct = FusekiLib.getContentType(action) ; + +// POST application/x-www-form-url +// POST ?query= and no Content-Type +if ( ct == null || isHtmlForm(ct) ) { +// validation c
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/95#issuecomment-158994873 I'm not quite clear what you mean by "I faced error as ResultSet is closed, Query execution error." Do you mean that the `ResultSet` cannot be rewound for repeated use? And you are using the `byte[]` to store a "reusable" form? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r45620403 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/servlets/SPARQL_Query_Cache.java --- @@ -0,0 +1,249 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.jena.fuseki.servlets; + +import org.apache.jena.atlas.io.IO; +import org.apache.jena.atlas.io.IndentedLineBuffer; +import org.apache.jena.atlas.lib.Cache; +import org.apache.jena.atlas.lib.CacheFactory; +import org.apache.jena.atlas.web.ContentType; +import org.apache.jena.fuseki.DEF; +import org.apache.jena.fuseki.Fuseki; +import org.apache.jena.fuseki.FusekiLib; +import org.apache.jena.fuseki.cache.CacheAction; +import org.apache.jena.fuseki.cache.CacheEntry; +import org.apache.jena.query.*; +import org.apache.jena.riot.web.HttpNames; +import org.apache.jena.sparql.core.Prologue; +import org.apache.jena.sparql.resultset.SPARQLResult; +import org.apache.jena.web.HttpSC; + +import javax.servlet.http.HttpServletRequest; +import javax.servlet.http.HttpServletResponse; +import java.io.IOException; +import java.io.InputStream; +import java.util.*; + +import static java.lang.String.format; +import static org.apache.jena.fuseki.server.CounterName.QueryTimeouts; +import static org.apache.jena.riot.WebContent.*; +import static org.apache.jena.riot.web.HttpNames.*; +import static org.apache.jena.riot.web.HttpNames.paramTimeout; + +public class SPARQL_Query_Cache extends SPARQL_Protocol { + +private static final String QueryParseBase = Fuseki.BaseParserSPARQL ; + +private static Cache cache = null; + +private int CACHE_SIZE = 1; + +public SPARQL_Query_Cache() { +super() ; +if(cache == null){ +cache = CacheFactory.createCache(CACHE_SIZE); --- End diff -- Why this logic in the constructor? Why not just initialize in the field declaration? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: A few suggestions for @osma
GitHub user ajs6f opened a pull request: https://github.com/apache/jena/pull/101 A few suggestions for @osma These are a few small suggestions for @osma for using Guava or Java 8 features to slightly tighen code in `jena-text`. They are far from exhaustive, just meant to be examples for features like `Optional` or the Streams API. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ajs6f/jena jena-text-suggestions Alternatively you can review and apply these changes as the patch at: https://github.com/apache/jena/pull/101.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #101 commit cbd83c5aa5ce72d54e49487ac88f9f20fea7469a Author: ajs6f <aj...@virginia.edu> Date: 2015-11-19T16:40:08Z A few suggestions --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: Rename in-memory dataset test classes.
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/100#issuecomment-158076981 And the various `Assembler` subtypes? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: Rename in-memory dataset test classes.
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/100#issuecomment-157728485 I'm totally cool with this renaming, of course. I did want to ask about tests for the Assembler system-- I didn't see any, when I came to write the ones for my little addition to that system. Did I miss them or are they not there? Also, the same goes for the `LockMRPlusSW` I wrote-- I didn't see any tests for the other locks impls. Should I file a ticket to remind us to write those with all the copious spare time we have :) , or did I just miss them? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-156126355 @afs Squashed 'em pretty flat. The first commit is journaling operation, the second is the new in-memory design. I will wait to hear from you, otherwise moving on the docs and trying to understand the ARQ test framework in `/testing`. I think I'm going to have to come to the list on that, because while I think I can correctly read much of the RDF that is setting up tests, I can't figure out for the life of me how it gets executed or how to hook in a section so that my new designs are used. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-156146622 It occurs to me that a better way to explain myself is: `TestDatasetGraphInMemory` looks the way it does (lots of nested classes) because I wanted to keep the usual correspondence of one-class-to-one-test-class and I needed to use the test behavior from many abstract dataset test classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-156149841 Ah, I didn't realize that we were using the automatic "execute-by-naming-convention" test running as well as the explicit style of gathering things up into suites. I'll keep my grubby hands off now and stay out of the way. {grin} --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-155535885 @afs , I've factored a `findInUnionGraph` through the code to provide access to implementation efficiencies. Let me know what you think! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44455660 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/DatasetGraphFactory.java --- @@ -53,45 +54,38 @@ private static void copyOver(DatasetGraph dsgDest, DatasetGraph dsgSrc) * Create a DatasetGraph starting with a single graph. * New graphs must be explicitly added. */ -public static DatasetGraph create(Graph graph) +public static DatasetGraph create(final Graph graph) { -DatasetGraph dsg2 = createMemFixed() ; +final DatasetGraph dsg2 = createMemFixed() ; dsg2.setDefaultGraph(graph) ; return dsg2 ; } - + /** * Create a DatasetGraph which only ever has a single default graph. */ -public static DatasetGraph createOneGraph(Graph graph) { return new DatasetGraphOne(graph) ; } +public static DatasetGraph createOneGraph(final Graph graph) { return new DatasetGraphOne(graph) ; } /** Interface for makign graphs when a dataset needs to add a new graph. * Return null for no graph created. - */ + */ public interface GraphMaker { public Graph create() ; } /** A graph maker that doesn't make graphs */ -public static GraphMaker graphMakerNull = new GraphMaker() { -@Override -public Graph create() -{ -return null ; -} } ; - -private static GraphMaker memGraphMaker = new GraphMaker() -{ -@Override -public Graph create() -{ -return GraphFactory.createDefaultGraph() ; -} -} ; - +public static GraphMaker graphMakerNull = () -> null ; + +private static GraphMaker memGraphMaker = () -> GraphFactory.createDefaultGraph() ; + +/** + * @return a DatasetGraph which features transactional in-memory operation + */ +public static DatasetGraph createTxMem() { return new DatasetGraphInMemory(); } + --- End diff -- Changed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-155581142 Ah, that is exciting! I will begin learning the documentation CMS so as to offer a good set of pointers and explanations to help people try this out. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-155582592 Right, I'll do what I did before and squash to two more-or-less orthogonal commits, one for "Introducing a journaling dataset!" and one for "Introducing a MR+SW in-mem dataset!". I'll add more useful and full commit comments for each. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44411395 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/mem/DatasetGraphInMemory.java --- @@ -0,0 +1,310 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.sparql.core.mem; + +import static java.lang.ThreadLocal.withInitial; +import static org.apache.jena.graph.Node.ANY; +import static org.apache.jena.query.ReadWrite.READ; +import static org.apache.jena.query.ReadWrite.WRITE; +import static org.apache.jena.sparql.core.Quad.isUnionGraph; + +import java.util.HashSet; +import java.util.Iterator; +import java.util.Set; +import java.util.concurrent.locks.ReentrantReadWriteLock; +import java.util.function.Consumer; +import java.util.function.Supplier; + +import org.apache.jena.graph.Graph; +import org.apache.jena.graph.Node; +import org.apache.jena.graph.Triple; +import org.apache.jena.query.ReadWrite; +import org.apache.jena.shared.Lock; +import org.apache.jena.shared.LockMRPlusSW; +import org.apache.jena.sparql.JenaTransactionException; +import org.apache.jena.sparql.core.DatasetGraph; +import org.apache.jena.sparql.core.DatasetGraphTriplesQuads; +import org.apache.jena.sparql.core.DatasetPrefixStorage; +import org.apache.jena.sparql.core.Quad; +import org.apache.jena.sparql.core.Transactional; + +/** + * A {@link DatasetGraph} backed by an {@link QuadTable}. By default, this is a {@link HexTable} designed for high-speed + * in-memory operation. + * + */ +public class DatasetGraphInMemory extends DatasetGraphTriplesQuads implements Transactional { + + private final DatasetPrefixStorage prefixes = new DatasetPrefixStorageInMemory(); + + private final Lock writeLock = new LockMRPlusSW(); + + private Lock writeLock() { + return writeLock; + } + + private final ReentrantReadWriteLock commitLock = new ReentrantReadWriteLock(true); + + /** +* Commits must be atomic, and because a thread that is committing alters the various indexes one after another, we +* lock out {@link #begin(ReadWrite)} while {@link #commit()} is executing. +*/ + private ReentrantReadWriteLock commitLock() { + return commitLock; + } + + private final ThreadLocal isInTransaction = withInitial(() -> false); + + @Override + public boolean isInTransaction() { + return isInTransaction.get(); + } + + protected void isInTransaction(final boolean b) { + isInTransaction.set(b); + } + + private final ThreadLocal transactionType = withInitial(() -> null); + + /** +* @return the type of transaction in progress +*/ + public ReadWrite transactionType() { + return transactionType.get(); + } + + protected void transactionType(final ReadWrite readWrite) { + transactionType.set(readWrite); + } + + private final QuadTable quadsIndex; + + private QuadTable quadsIndex() { + return quadsIndex; + } + + private final TripleTable defaultGraph; + + private TripleTable defaultGraph() { + return defaultGraph; + } + + @Override + public Lock getLock() { + return writeLock(); + } + + /** +* Default constructor. +*/ + public DatasetGraphInMemory() { + this(new HexTable(), new TriTable()); + } + + /** +* @param i a table in which to store quads +* @param t a table in which to store triples +*/ + public DatasetGraphInMemory(final QuadTable i, final TripleTable t) { + this.quadsIndex = i; + this.defaultGraph = t; + } + + @Override + public void begin
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44412440 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/mem/QuadTable.java --- @@ -0,0 +1,59 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.sparql.core.mem; + +import static org.apache.jena.graph.Node.ANY; + +import java.util.stream.Stream; + +import org.apache.jena.graph.Node; +import org.apache.jena.sparql.core.Quad; + +/** + * A simplex or multiplex table of {@link Quad}s. Implementations may wish to override {@link #listGraphNodes()} with a + * more efficient implementation. + * + */ +public interface QuadTable extends TupleTable { + + /** +* Search the table using a pattern of slots. {@link Node#ANY} or null will work as a wildcard. +* +* @param g the graph node of the pattern +* @param s the subject node of the pattern +* @param p the predicate node of the pattern +* @param o the object node of the pattern +* @return an {@link Stream} of matched quads +*/ + Stream find(Node g, Node s, Node p, Node o); + + /** +* Discover the graphs named in the table +* +* @return an {@link Stream} of graph names used in this table +*/ + default Stream listGraphNodes() { + return find(ANY, ANY, ANY, ANY).map(Quad::getGraph).distinct(); + } --- End diff -- Cool. I agree that the "forked impl" is a bit odd, but I did it for what I think are real gains in concision and clarity _within_ those types. The important point is that `QuadTableForm` is an `enum` and it's the six values _within_ `QuadTableForm` that actually impl `QuadTable`. Then `HexTable` binds up all those forms into a structure that acts as _one_ `QuadTable` by selecting the most efficient form(s) for any given operation. I'll add some comments to explain that relationship more fully and clearly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44409220 --- Diff: jena-arq/src/main/java/org/apache/jena/query/DatasetFactory.java --- @@ -16,253 +16,226 @@ * limitations under the License. */ -package org.apache.jena.query ; - -import java.util.List ; - -import org.apache.jena.assembler.Assembler ; -import org.apache.jena.rdf.model.Model ; -import org.apache.jena.rdf.model.Resource ; -import org.apache.jena.sparql.ARQException ; -import org.apache.jena.sparql.core.DatasetGraph ; -import org.apache.jena.sparql.core.DatasetGraphFactory ; -import org.apache.jena.sparql.core.DatasetImpl ; -import org.apache.jena.sparql.core.assembler.DatasetAssembler ; -import org.apache.jena.sparql.util.DatasetUtils ; -import org.apache.jena.sparql.util.graph.GraphUtils ; -import org.apache.jena.util.FileManager ; - -/** Make Datasets and DataSources in various ways */ - +package org.apache.jena.query; + +import java.util.List; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.rdf.model.Model; +import org.apache.jena.rdf.model.Resource; +import org.apache.jena.sparql.ARQException; +import org.apache.jena.sparql.core.DatasetGraph; +import org.apache.jena.sparql.core.DatasetGraphFactory; +import org.apache.jena.sparql.core.DatasetImpl; +import org.apache.jena.sparql.core.assembler.DatasetAssembler; +import org.apache.jena.sparql.util.DatasetUtils; +import org.apache.jena.sparql.util.graph.GraphUtils; +import org.apache.jena.util.FileManager; + +/** + * Makes {@link Dataset}s in various ways. + * + */ public class DatasetFactory { -/** Create an in-memory, modifiable Dataset */ -public static Dataset createMem() { -return create(DatasetGraphFactory.createMem()) ; -} - -/** - * Create an in-memory, modifiable Dataset. New graphs must be explicitly - * added using .addGraph. - */ -public static Dataset createMemFixed() { -return create(DatasetGraphFactory.createMemFixed()) ; -} - -/** - * Create a dataset with the given model as the default graph - * - * @param model - *The model for the default graph - * @return Dataset - */ -public static Dataset create(Model model) { -return new DatasetImpl(model) ; -} - -/** - * Create a dataset: clone the dataset structure of named graohs, and share - * the graphs themselves. - * - * @param dataset - *Dataset to clone structure from. - * @return Dataset - */ -public static Dataset create(Dataset dataset) { -return new DatasetImpl(dataset) ; -} - -/** - * Wrap a datasetgraph to make a mutable dataset - * - * @param dataset - *DatasetGraph - * @return Dataset - */ -public static Dataset create(DatasetGraph dataset) { -return DatasetImpl.wrap(dataset) ; -} - -/** - * Create a dataset based on a list of URIs : these are merged into the - * default graph of the dataset. - * - * @param uriList - *URIs merged to form the default dataset - * @return Dataset - */ - -public static Dataset create(List uriList) { -return create(uriList, null, null) ; -} - -/** - * Create a dataset with a default graph and no named graphs - * - * @param uri - *URIs merged to form the default dataset - * @return Dataset - */ - -public static Dataset create(String uri) { -return create(uri, null, null) ; -} - -/** - * Create a dataset based on a list of URIs : these are merged into the - * default graph of the dataset. - * - * @param uriList - *URIs merged to form the default dataset - * @param fileManager - * @return Dataset - */ - -/** - * Create a named graph container of graphs based on a list of URIs. - * - * @param namedSourceList - * @return Dataset - */ - -public static Dataset createNamed(List namedSourceList) { -return create((List)null, namedSourceList, null) ; -} - -/** - * Create a dataset based on two list of URIs. The first lists is used to - * create the background (unnamed graph) by merging, the second is used to - * create the collection of named graphs. - * - * (Jena calls graphs "M
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44410022 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/mem/DatasetGraphInMemory.java --- @@ -0,0 +1,310 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.sparql.core.mem; + +import static java.lang.ThreadLocal.withInitial; +import static org.apache.jena.graph.Node.ANY; +import static org.apache.jena.query.ReadWrite.READ; +import static org.apache.jena.query.ReadWrite.WRITE; +import static org.apache.jena.sparql.core.Quad.isUnionGraph; + +import java.util.HashSet; +import java.util.Iterator; +import java.util.Set; +import java.util.concurrent.locks.ReentrantReadWriteLock; +import java.util.function.Consumer; +import java.util.function.Supplier; + +import org.apache.jena.graph.Graph; +import org.apache.jena.graph.Node; +import org.apache.jena.graph.Triple; +import org.apache.jena.query.ReadWrite; +import org.apache.jena.shared.Lock; +import org.apache.jena.shared.LockMRPlusSW; +import org.apache.jena.sparql.JenaTransactionException; +import org.apache.jena.sparql.core.DatasetGraph; +import org.apache.jena.sparql.core.DatasetGraphTriplesQuads; +import org.apache.jena.sparql.core.DatasetPrefixStorage; +import org.apache.jena.sparql.core.Quad; +import org.apache.jena.sparql.core.Transactional; + +/** + * A {@link DatasetGraph} backed by an {@link QuadTable}. By default, this is a {@link HexTable} designed for high-speed + * in-memory operation. + * + */ +public class DatasetGraphInMemory extends DatasetGraphTriplesQuads implements Transactional { + + private final DatasetPrefixStorage prefixes = new DatasetPrefixStorageInMemory(); + + private final Lock writeLock = new LockMRPlusSW(); + + private Lock writeLock() { + return writeLock; + } + + private final ReentrantReadWriteLock commitLock = new ReentrantReadWriteLock(true); + + /** +* Commits must be atomic, and because a thread that is committing alters the various indexes one after another, we +* lock out {@link #begin(ReadWrite)} while {@link #commit()} is executing. +*/ + private ReentrantReadWriteLock commitLock() { + return commitLock; + } + + private final ThreadLocal isInTransaction = withInitial(() -> false); + + @Override + public boolean isInTransaction() { + return isInTransaction.get(); + } + + protected void isInTransaction(final boolean b) { + isInTransaction.set(b); + } + + private final ThreadLocal transactionType = withInitial(() -> null); + + /** +* @return the type of transaction in progress +*/ + public ReadWrite transactionType() { + return transactionType.get(); + } + + protected void transactionType(final ReadWrite readWrite) { + transactionType.set(readWrite); + } + + private final QuadTable quadsIndex; + + private QuadTable quadsIndex() { + return quadsIndex; + } + + private final TripleTable defaultGraph; + + private TripleTable defaultGraph() { + return defaultGraph; + } + + @Override + public Lock getLock() { + return writeLock(); + } + + /** +* Default constructor. +*/ + public DatasetGraphInMemory() { + this(new HexTable(), new TriTable()); + } + + /** +* @param i a table in which to store quads +* @param t a table in which to store triples +*/ + public DatasetGraphInMemory(final QuadTable i, final TripleTable t) { + this.quadsIndex = i; + this.defaultGraph = t; + } + + @Override + public void begin
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44410004 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/mem/DatasetGraphInMemory.java --- @@ -0,0 +1,310 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.sparql.core.mem; + +import static java.lang.ThreadLocal.withInitial; +import static org.apache.jena.graph.Node.ANY; +import static org.apache.jena.query.ReadWrite.READ; +import static org.apache.jena.query.ReadWrite.WRITE; +import static org.apache.jena.sparql.core.Quad.isUnionGraph; + +import java.util.HashSet; +import java.util.Iterator; +import java.util.Set; +import java.util.concurrent.locks.ReentrantReadWriteLock; +import java.util.function.Consumer; +import java.util.function.Supplier; + +import org.apache.jena.graph.Graph; +import org.apache.jena.graph.Node; +import org.apache.jena.graph.Triple; +import org.apache.jena.query.ReadWrite; +import org.apache.jena.shared.Lock; +import org.apache.jena.shared.LockMRPlusSW; +import org.apache.jena.sparql.JenaTransactionException; +import org.apache.jena.sparql.core.DatasetGraph; +import org.apache.jena.sparql.core.DatasetGraphTriplesQuads; +import org.apache.jena.sparql.core.DatasetPrefixStorage; +import org.apache.jena.sparql.core.Quad; +import org.apache.jena.sparql.core.Transactional; + +/** + * A {@link DatasetGraph} backed by an {@link QuadTable}. By default, this is a {@link HexTable} designed for high-speed + * in-memory operation. + * + */ +public class DatasetGraphInMemory extends DatasetGraphTriplesQuads implements Transactional { + + private final DatasetPrefixStorage prefixes = new DatasetPrefixStorageInMemory(); + + private final Lock writeLock = new LockMRPlusSW(); + + private Lock writeLock() { + return writeLock; + } + + private final ReentrantReadWriteLock commitLock = new ReentrantReadWriteLock(true); + + /** +* Commits must be atomic, and because a thread that is committing alters the various indexes one after another, we +* lock out {@link #begin(ReadWrite)} while {@link #commit()} is executing. +*/ + private ReentrantReadWriteLock commitLock() { + return commitLock; + } + + private final ThreadLocal isInTransaction = withInitial(() -> false); + + @Override + public boolean isInTransaction() { + return isInTransaction.get(); + } + + protected void isInTransaction(final boolean b) { + isInTransaction.set(b); + } + + private final ThreadLocal transactionType = withInitial(() -> null); + + /** +* @return the type of transaction in progress +*/ + public ReadWrite transactionType() { + return transactionType.get(); + } + + protected void transactionType(final ReadWrite readWrite) { + transactionType.set(readWrite); + } + + private final QuadTable quadsIndex; + + private QuadTable quadsIndex() { + return quadsIndex; + } + + private final TripleTable defaultGraph; + + private TripleTable defaultGraph() { + return defaultGraph; + } + + @Override + public Lock getLock() { + return writeLock(); + } + + /** +* Default constructor. +*/ + public DatasetGraphInMemory() { + this(new HexTable(), new TriTable()); + } + + /** +* @param i a table in which to store quads +* @param t a table in which to store triples +*/ + public DatasetGraphInMemory(final QuadTable i, final TripleTable t) { + this.quadsIndex = i; + this.defaultGraph = t; + } + + @Override + public void begin
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44410251 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/mem/QuadTable.java --- @@ -0,0 +1,59 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.sparql.core.mem; + +import static org.apache.jena.graph.Node.ANY; + +import java.util.stream.Stream; + +import org.apache.jena.graph.Node; +import org.apache.jena.sparql.core.Quad; + +/** + * A simplex or multiplex table of {@link Quad}s. Implementations may wish to override {@link #listGraphNodes()} with a + * more efficient implementation. + * + */ +public interface QuadTable extends TupleTable { + + /** +* Search the table using a pattern of slots. {@link Node#ANY} or null will work as a wildcard. +* +* @param g the graph node of the pattern +* @param s the subject node of the pattern +* @param p the predicate node of the pattern +* @param o the object node of the pattern +* @return an {@link Stream} of matched quads +*/ + Stream find(Node g, Node s, Node p, Node o); + + /** +* Discover the graphs named in the table +* +* @return an {@link Stream} of graph names used in this table +*/ + default Stream listGraphNodes() { + return find(ANY, ANY, ANY, ANY).map(Quad::getGraph).distinct(); + } --- End diff -- See my comment on `DatasetGraphInMemory` as to union graph functionality. It applies here too. This is an interface which has no reason or right to assume anything about adjacency. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44410452 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/mem/QuadTable.java --- @@ -0,0 +1,59 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.sparql.core.mem; + +import static org.apache.jena.graph.Node.ANY; + +import java.util.stream.Stream; + +import org.apache.jena.graph.Node; +import org.apache.jena.sparql.core.Quad; + +/** + * A simplex or multiplex table of {@link Quad}s. Implementations may wish to override {@link #listGraphNodes()} with a + * more efficient implementation. + * + */ +public interface QuadTable extends TupleTable { + + /** +* Search the table using a pattern of slots. {@link Node#ANY} or null will work as a wildcard. +* +* @param g the graph node of the pattern +* @param s the subject node of the pattern +* @param p the predicate node of the pattern +* @param o the object node of the pattern +* @return an {@link Stream} of matched quads +*/ + Stream find(Node g, Node s, Node p, Node o); + + /** +* Discover the graphs named in the table +* +* @return an {@link Stream} of graph names used in this table +*/ + default Stream listGraphNodes() { + return find(ANY, ANY, ANY, ANY).map(Quad::getGraph).distinct(); + } --- End diff -- As to starting with G, that's why this is a `default` method. Look at `HexTable` and you will see that very advantage being taken. But this is a `default` method and must not rely on facts about an implementation. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44409436 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/mem/DatasetGraphInMemory.java --- @@ -0,0 +1,310 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.sparql.core.mem; + +import static java.lang.ThreadLocal.withInitial; +import static org.apache.jena.graph.Node.ANY; +import static org.apache.jena.query.ReadWrite.READ; +import static org.apache.jena.query.ReadWrite.WRITE; +import static org.apache.jena.sparql.core.Quad.isUnionGraph; + +import java.util.HashSet; +import java.util.Iterator; +import java.util.Set; +import java.util.concurrent.locks.ReentrantReadWriteLock; +import java.util.function.Consumer; +import java.util.function.Supplier; + +import org.apache.jena.graph.Graph; +import org.apache.jena.graph.Node; +import org.apache.jena.graph.Triple; +import org.apache.jena.query.ReadWrite; +import org.apache.jena.shared.Lock; +import org.apache.jena.shared.LockMRPlusSW; +import org.apache.jena.sparql.JenaTransactionException; +import org.apache.jena.sparql.core.DatasetGraph; +import org.apache.jena.sparql.core.DatasetGraphTriplesQuads; +import org.apache.jena.sparql.core.DatasetPrefixStorage; +import org.apache.jena.sparql.core.Quad; +import org.apache.jena.sparql.core.Transactional; + +/** + * A {@link DatasetGraph} backed by an {@link QuadTable}. By default, this is a {@link HexTable} designed for high-speed + * in-memory operation. + * + */ +public class DatasetGraphInMemory extends DatasetGraphTriplesQuads implements Transactional { + + private final DatasetPrefixStorage prefixes = new DatasetPrefixStorageInMemory(); + + private final Lock writeLock = new LockMRPlusSW(); + + private Lock writeLock() { + return writeLock; + } + + private final ReentrantReadWriteLock commitLock = new ReentrantReadWriteLock(true); + + /** +* Commits must be atomic, and because a thread that is committing alters the various indexes one after another, we +* lock out {@link #begin(ReadWrite)} while {@link #commit()} is executing. +*/ + private ReentrantReadWriteLock commitLock() { + return commitLock; + } + + private final ThreadLocal isInTransaction = withInitial(() -> false); + + @Override + public boolean isInTransaction() { + return isInTransaction.get(); + } + + protected void isInTransaction(final boolean b) { + isInTransaction.set(b); + } + + private final ThreadLocal transactionType = withInitial(() -> null); + + /** +* @return the type of transaction in progress +*/ + public ReadWrite transactionType() { + return transactionType.get(); + } + + protected void transactionType(final ReadWrite readWrite) { + transactionType.set(readWrite); + } + + private final QuadTable quadsIndex; + + private QuadTable quadsIndex() { + return quadsIndex; + } + + private final TripleTable defaultGraph; + + private TripleTable defaultGraph() { + return defaultGraph; + } + + @Override + public Lock getLock() { + return writeLock(); + } + + /** +* Default constructor. +*/ + public DatasetGraphInMemory() { + this(new HexTable(), new TriTable()); + } + + /** +* @param i a table in which to store quads +* @param t a table in which to store triples +*/ + public DatasetGraphInMemory(final QuadTable i, final TripleTable t) { + this.quadsIndex = i; + this.defaultGraph = t; + } + + @Override + public void begin
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44411038 --- Diff: jena-arq/src/main/java/org/apache/jena/query/DatasetFactory.java --- @@ -16,253 +16,226 @@ * limitations under the License. */ -package org.apache.jena.query ; - --- End diff -- See your comment [here](https://mail-archives.apache.org/mod_mbox/jena-dev/201510.mbox/%3C5610F36D.8090602%40apache.org%3E) about "other clearing up of DatasetFactory ...", although I'm not sure why Git created the diff in this confusing way. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-155581224 @afs Shall I squash again in preparation? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1062: configurable Lucene analyzer for jen...
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/97#discussion_r44015443 --- Diff: jena-text/src/test/java/org/apache/jena/query/text/TestDatasetWithConfigurableAnalyzer.java --- @@ -0,0 +1,63 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.query.text; + +import java.util.Arrays ; +import java.util.HashSet ; +import java.util.Set ; + +import org.apache.jena.atlas.lib.StrUtils ; +import org.junit.Before ; +import org.junit.Test ; + +/** + * This class defines a setup configuration for a dataset that uses an ASCII folding lowercase keyword analyzer with a Lucene index. + */ +public class TestDatasetWithConfigurableAnalyzer extends TestDatasetWithLowerCaseKeywordAnalyzer { +@Override +@Before +public void before() { +init(StrUtils.strjoinNL( +"text:ConfigurableAnalyzer ;", +"text:tokenizer text:KeywordTokenizer ;", +"text:filters (text:ASCIIFoldingFilter text:LowerCaseFilter)" +)); +} + +@Test +public void testConfigurableAnalyzerIsCaseAndAccentInsensitive() { +final String testName = "testConfigurableAnalyzerIsCaseAndAccentInsensitive"; +final String turtle = StrUtils.strjoinNL( +TURTLE_PROLOG, +"<" + RESOURCE_BASE + testName + ">", +" rdfs:label 'Feeling a déjàvu'", +"." +); +String queryString = StrUtils.strjoinNL( +QUERY_PROLOG, +"SELECT ?s", +"WHERE {", +"?s text:query ( rdfs:label '\"feeling ä déja\"*' 10 ) .", +"}" +); +Set expectedURIs = new HashSet<>() ; --- End diff -- Our shaded Guava offers various `Sets::newHashSet` functions for this kind of use, which read a pinch better. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1062: configurable Lucene analyzer for jen...
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/97#discussion_r44015231 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/assembler/ConfigurableAnalyzerAssembler.java --- @@ -0,0 +1,101 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.query.text.assembler; + +import java.util.ArrayList; +import java.util.List; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.assembler.Mode; +import org.apache.jena.assembler.assemblers.AssemblerBase; +import org.apache.jena.query.text.TextIndexException; +import org.apache.jena.query.text.TextIndexLucene; +import org.apache.jena.query.text.analyzer.ConfigurableAnalyzer; +import org.apache.jena.rdf.model.RDFNode; +import org.apache.jena.rdf.model.Resource; +import org.apache.jena.rdf.model.Statement ; +import org.apache.jena.vocabulary.RDF ; +import org.apache.lucene.analysis.Analyzer; + + +/** + * Assembler to create a configurable analyzer. + */ +public class ConfigurableAnalyzerAssembler extends AssemblerBase { +/* +text:map ( + [ text:field "text" ; + text:predicate rdfs:label; + text:analyzer [ + a text:ConfigurableAnalyzer ; + text:tokenizer text:LetterTokenizer ; + text:filters (text:LowerCaseFilter) + ] + ] +. +*/ + + +@Override +public Analyzer open(Assembler a, Resource root, Mode mode) { +if (root.hasProperty(TextVocab.pTokenizer)) { +Resource tokenizerResource = (Resource) root.getProperty(TextVocab.pTokenizer).getObject(); +String tokenizer = tokenizerResource.getLocalName(); +List filters; +if (root.hasProperty(TextVocab.pFilters)) { +Resource filtersResource = (Resource) root.getProperty(TextVocab.pFilters).getObject(); +filters = toFilterList(filtersResource); +} else { +filters = new ArrayList<>(); +} +return new ConfigurableAnalyzer(TextIndexLucene.VER, tokenizer, filters); +} else { +throw new TextIndexException("text:tokenizer setting is required by ConfigurableAnalyzer"); +} +} + +private List toFilterList(Resource list) { +List result = new ArrayList<>(); +Resource current = list; +while (current != null && ! current.equals(RDF.nil)){ +Statement stmt = current.getProperty(RDF.first); +if (stmt == null) { +throw new TextIndexException("filter list not well formed"); +} +RDFNode node = stmt.getObject(); +if (! node.isResource()) { +throw new TextIndexException("filter is not a resource : " + node); +} + +Resource res = (Resource)node; --- End diff -- Isn't `Node::asResource` a little better here and below? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-154127004 Weird. I don't know what that means, but it certainly doesn't seem right. Several of the intermediate commits in that last also came well after 19 July (which is what it says on my page!). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44028911 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/assembler/InMemDatasetAssembler.java --- @@ -0,0 +1,67 @@ +package org.apache.jena.sparql.core.assembler; + +import static org.apache.jena.assembler.JA.MemoryDataset; +import static org.apache.jena.assembler.JA.data; +import static org.apache.jena.query.DatasetFactory.createTxnMem; +import static org.apache.jena.riot.RDFDataMgr.loadModel; +import static org.apache.jena.riot.RDFDataMgr.read; +import static org.apache.jena.sparql.core.assembler.AssemblerUtils.setContext; +import static org.apache.jena.sparql.core.assembler.DatasetAssemblerVocab.*; +import static org.apache.jena.sparql.util.FmtUtils.stringForRDFNode; +import static org.apache.jena.sparql.util.graph.GraphUtils.*; + +import java.util.function.Predicate; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.assembler.Mode; +import org.apache.jena.assembler.assemblers.AssemblerBase; +import org.apache.jena.query.Dataset; +import org.apache.jena.rdf.model.Model; +import org.apache.jena.rdf.model.RDFNode; +import org.apache.jena.rdf.model.Resource; + +/** + * An {@link Assembler} that creates in-memory {@link Dataset}s. + * + */ +public class InMemDatasetAssembler extends AssemblerBase { + + @Override + public Dataset open(final Assembler assembler, final Resource root, final Mode mode) { + checkType(root, MemoryDataset); + final Dataset dataset = createTxnMem(); + setContext(root, dataset.getContext()); + // Default graph can be defined with ja:graph or ja:defaultGraph + final Resource defaultGraphDef = root.hasProperty(pDefaultGraph) ? getResourceValue(root, + pDefaultGraph) : root.hasProperty(pGraph) ? getResourceValue(root, pGraph) : null; --- End diff -- With all respect, that really doesn't seem to be what is happening in DatasetAssembler. A value from either predicate is taken and used at [this](https://github.com/apache/jena/blob/master/jena-arq/src/main/java/org/apache/jena/sparql/core/assembler/DatasetAssembler.java#L64) line to build from a model description. A value on `ja:graph` will be used there, not to load from a file. Is "load-from-a-file" the correct meaning for `ja:graph`? If so, what was the point of introducing `ja:data`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44029048 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/assembler/InMemDatasetAssembler.java --- @@ -0,0 +1,67 @@ +package org.apache.jena.sparql.core.assembler; + +import static org.apache.jena.assembler.JA.MemoryDataset; +import static org.apache.jena.assembler.JA.data; +import static org.apache.jena.query.DatasetFactory.createTxnMem; +import static org.apache.jena.riot.RDFDataMgr.loadModel; +import static org.apache.jena.riot.RDFDataMgr.read; +import static org.apache.jena.sparql.core.assembler.AssemblerUtils.setContext; +import static org.apache.jena.sparql.core.assembler.DatasetAssemblerVocab.*; +import static org.apache.jena.sparql.util.FmtUtils.stringForRDFNode; +import static org.apache.jena.sparql.util.graph.GraphUtils.*; + +import java.util.function.Predicate; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.assembler.Mode; +import org.apache.jena.assembler.assemblers.AssemblerBase; +import org.apache.jena.query.Dataset; +import org.apache.jena.rdf.model.Model; +import org.apache.jena.rdf.model.RDFNode; +import org.apache.jena.rdf.model.Resource; + +/** + * An {@link Assembler} that creates in-memory {@link Dataset}s. + * + */ +public class InMemDatasetAssembler extends AssemblerBase { + + @Override + public Dataset open(final Assembler assembler, final Resource root, final Mode mode) { + checkType(root, MemoryDataset); + final Dataset dataset = createTxnMem(); + setContext(root, dataset.getContext()); + // Default graph can be defined with ja:graph or ja:defaultGraph + final Resource defaultGraphDef = root.hasProperty(pDefaultGraph) ? getResourceValue(root, + pDefaultGraph) : root.hasProperty(pGraph) ? getResourceValue(root, pGraph) : null; --- End diff -- I will add some `if-then` statements to unpack the code a bit. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44030781 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/assembler/InMemDatasetAssembler.java --- @@ -0,0 +1,67 @@ +package org.apache.jena.sparql.core.assembler; + +import static org.apache.jena.assembler.JA.MemoryDataset; +import static org.apache.jena.assembler.JA.data; +import static org.apache.jena.query.DatasetFactory.createTxnMem; +import static org.apache.jena.riot.RDFDataMgr.loadModel; +import static org.apache.jena.riot.RDFDataMgr.read; +import static org.apache.jena.sparql.core.assembler.AssemblerUtils.setContext; +import static org.apache.jena.sparql.core.assembler.DatasetAssemblerVocab.*; +import static org.apache.jena.sparql.util.FmtUtils.stringForRDFNode; +import static org.apache.jena.sparql.util.graph.GraphUtils.*; + +import java.util.function.Predicate; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.assembler.Mode; +import org.apache.jena.assembler.assemblers.AssemblerBase; +import org.apache.jena.query.Dataset; +import org.apache.jena.rdf.model.Model; +import org.apache.jena.rdf.model.RDFNode; +import org.apache.jena.rdf.model.Resource; + +/** + * An {@link Assembler} that creates in-memory {@link Dataset}s. + * + */ +public class InMemDatasetAssembler extends AssemblerBase { + + @Override + public Dataset open(final Assembler assembler, final Resource root, final Mode mode) { + checkType(root, MemoryDataset); + final Dataset dataset = createTxnMem(); + setContext(root, dataset.getContext()); + // Default graph can be defined with ja:graph or ja:defaultGraph + final Resource defaultGraphDef = root.hasProperty(pDefaultGraph) ? getResourceValue(root, + pDefaultGraph) : root.hasProperty(pGraph) ? getResourceValue(root, pGraph) : null; --- End diff -- Okay, I thought it would be cool, but I certainly admit that it's gotten confusing already, and no users have even gotten involved yet! {grin} I will strip out this jazz entirely and take it back to the plain case. To do that, I will use only one predicate, `ja:data`, with the meaning of "load tuples from this URI". --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44025566 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/assembler/InMemDatasetAssembler.java --- @@ -0,0 +1,67 @@ +package org.apache.jena.sparql.core.assembler; + +import static org.apache.jena.assembler.JA.MemoryDataset; +import static org.apache.jena.assembler.JA.data; +import static org.apache.jena.query.DatasetFactory.createTxnMem; +import static org.apache.jena.riot.RDFDataMgr.loadModel; +import static org.apache.jena.riot.RDFDataMgr.read; +import static org.apache.jena.sparql.core.assembler.AssemblerUtils.setContext; +import static org.apache.jena.sparql.core.assembler.DatasetAssemblerVocab.*; +import static org.apache.jena.sparql.util.FmtUtils.stringForRDFNode; +import static org.apache.jena.sparql.util.graph.GraphUtils.*; + +import java.util.function.Predicate; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.assembler.Mode; +import org.apache.jena.assembler.assemblers.AssemblerBase; +import org.apache.jena.query.Dataset; +import org.apache.jena.rdf.model.Model; +import org.apache.jena.rdf.model.RDFNode; +import org.apache.jena.rdf.model.Resource; + +/** + * An {@link Assembler} that creates in-memory {@link Dataset}s. + * + */ +public class InMemDatasetAssembler extends AssemblerBase { + + @Override + public Dataset open(final Assembler assembler, final Resource root, final Mode mode) { + checkType(root, MemoryDataset); + final Dataset dataset = createTxnMem(); + setContext(root, dataset.getContext()); + // Default graph can be defined with ja:graph or ja:defaultGraph + final Resource defaultGraphDef = root.hasProperty(pDefaultGraph) ? getResourceValue(root, + pDefaultGraph) : root.hasProperty(pGraph) ? getResourceValue(root, pGraph) : null; + if (defaultGraphDef != null) dataset.setDefaultModel(retrieve(defaultGraphDef, assembler, mode)); + // or with ja:data + final Predicate isResource = n -> { + if (n.isResource()) return true; + throw new DatasetAssemblerException(root, "Not a resource: " + stringForRDFNode(n)); + }; + multiValue(root, data).parallelStream().filter(isResource) + .forEach(defaultGraphNode -> read(dataset, defaultGraphNode.asResource().getURI())); + --- End diff -- `defaultGraphNode` could be a file, could be a network resource: how about `defaultGraphdocument`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44025441 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/assembler/InMemDatasetAssembler.java --- @@ -0,0 +1,67 @@ +package org.apache.jena.sparql.core.assembler; + +import static org.apache.jena.assembler.JA.MemoryDataset; +import static org.apache.jena.assembler.JA.data; +import static org.apache.jena.query.DatasetFactory.createTxnMem; +import static org.apache.jena.riot.RDFDataMgr.loadModel; +import static org.apache.jena.riot.RDFDataMgr.read; +import static org.apache.jena.sparql.core.assembler.AssemblerUtils.setContext; +import static org.apache.jena.sparql.core.assembler.DatasetAssemblerVocab.*; +import static org.apache.jena.sparql.util.FmtUtils.stringForRDFNode; +import static org.apache.jena.sparql.util.graph.GraphUtils.*; + +import java.util.function.Predicate; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.assembler.Mode; +import org.apache.jena.assembler.assemblers.AssemblerBase; +import org.apache.jena.query.Dataset; +import org.apache.jena.rdf.model.Model; +import org.apache.jena.rdf.model.RDFNode; +import org.apache.jena.rdf.model.Resource; + +/** + * An {@link Assembler} that creates in-memory {@link Dataset}s. + * + */ +public class InMemDatasetAssembler extends AssemblerBase { + + @Override + public Dataset open(final Assembler assembler, final Resource root, final Mode mode) { + checkType(root, MemoryDataset); + final Dataset dataset = createTxnMem(); + setContext(root, dataset.getContext()); + // Default graph can be defined with ja:graph or ja:defaultGraph + final Resource defaultGraphDef = root.hasProperty(pDefaultGraph) ? getResourceValue(root, + pDefaultGraph) : root.hasProperty(pGraph) ? getResourceValue(root, pGraph) : null; + if (defaultGraphDef != null) dataset.setDefaultModel(retrieve(defaultGraphDef, assembler, mode)); + // or with ja:data + final Predicate isResource = n -> { + if (n.isResource()) return true; + throw new DatasetAssemblerException(root, "Not a resource: " + stringForRDFNode(n)); + }; + multiValue(root, data).parallelStream().filter(isResource) + .forEach(defaultGraphNode -> read(dataset, defaultGraphNode.asResource().getURI())); + --- End diff -- The hope from using `parallelStream` was indeed to queue on the lock, but get _reading_ going in parallel. IOW, reading could require network access or filesystem access that could very well be parallelized. But I take your point about the order of error messages. That could be really terribly confusing. I'll pull out the parallelization. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44025646 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/assembler/InMemDatasetAssembler.java --- @@ -0,0 +1,67 @@ +package org.apache.jena.sparql.core.assembler; + +import static org.apache.jena.assembler.JA.MemoryDataset; +import static org.apache.jena.assembler.JA.data; +import static org.apache.jena.query.DatasetFactory.createTxnMem; +import static org.apache.jena.riot.RDFDataMgr.loadModel; +import static org.apache.jena.riot.RDFDataMgr.read; +import static org.apache.jena.sparql.core.assembler.AssemblerUtils.setContext; +import static org.apache.jena.sparql.core.assembler.DatasetAssemblerVocab.*; +import static org.apache.jena.sparql.util.FmtUtils.stringForRDFNode; +import static org.apache.jena.sparql.util.graph.GraphUtils.*; + +import java.util.function.Predicate; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.assembler.Mode; +import org.apache.jena.assembler.assemblers.AssemblerBase; +import org.apache.jena.query.Dataset; +import org.apache.jena.rdf.model.Model; +import org.apache.jena.rdf.model.RDFNode; +import org.apache.jena.rdf.model.Resource; + +/** + * An {@link Assembler} that creates in-memory {@link Dataset}s. + * + */ +public class InMemDatasetAssembler extends AssemblerBase { + + @Override + public Dataset open(final Assembler assembler, final Resource root, final Mode mode) { + checkType(root, MemoryDataset); + final Dataset dataset = createTxnMem(); --- End diff -- Yes, that's a bad oversight. I'll fix that. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44025972 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/assembler/InMemDatasetAssembler.java --- @@ -0,0 +1,67 @@ +package org.apache.jena.sparql.core.assembler; + +import static org.apache.jena.assembler.JA.MemoryDataset; +import static org.apache.jena.assembler.JA.data; +import static org.apache.jena.query.DatasetFactory.createTxnMem; +import static org.apache.jena.riot.RDFDataMgr.loadModel; +import static org.apache.jena.riot.RDFDataMgr.read; +import static org.apache.jena.sparql.core.assembler.AssemblerUtils.setContext; +import static org.apache.jena.sparql.core.assembler.DatasetAssemblerVocab.*; +import static org.apache.jena.sparql.util.FmtUtils.stringForRDFNode; +import static org.apache.jena.sparql.util.graph.GraphUtils.*; + +import java.util.function.Predicate; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.assembler.Mode; +import org.apache.jena.assembler.assemblers.AssemblerBase; +import org.apache.jena.query.Dataset; +import org.apache.jena.rdf.model.Model; +import org.apache.jena.rdf.model.RDFNode; +import org.apache.jena.rdf.model.Resource; + +/** + * An {@link Assembler} that creates in-memory {@link Dataset}s. + * + */ +public class InMemDatasetAssembler extends AssemblerBase { + + @Override + public Dataset open(final Assembler assembler, final Resource root, final Mode mode) { + checkType(root, MemoryDataset); + final Dataset dataset = createTxnMem(); + setContext(root, dataset.getContext()); + // Default graph can be defined with ja:graph or ja:defaultGraph + final Resource defaultGraphDef = root.hasProperty(pDefaultGraph) ? getResourceValue(root, + pDefaultGraph) : root.hasProperty(pGraph) ? getResourceValue(root, pGraph) : null; + if (defaultGraphDef != null) dataset.setDefaultModel(retrieve(defaultGraphDef, assembler, mode)); + // or with ja:data + final Predicate isResource = n -> { + if (n.isResource()) return true; + throw new DatasetAssemblerException(root, "Not a resource: " + stringForRDFNode(n)); + }; + multiValue(root, data).parallelStream().filter(isResource) + .forEach(defaultGraphNode -> read(dataset, defaultGraphNode.asResource().getURI())); + + // Assemble or load named graphs + multiValue(root, pNamedGraph).parallelStream().filter(isResource).forEach(namedGraphNode -> { + final Resource namedGraphResource = namedGraphNode.asResource(); --- End diff -- Yeah, I wondered about that, but the semantics of what I've got here are actually different. If you look at `isResource`, it throws an exception when the thing is a literal (which would make no sense in an assembler construction). That's what the other assemblers seem to do. I'm happy to just ignore any "weird" constructions like that, but I thought I'd better do what the others did and throw an exception. I'll happily simplify this to ignore instead. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/94#discussion_r44027198 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/core/assembler/InMemDatasetAssembler.java --- @@ -0,0 +1,67 @@ +package org.apache.jena.sparql.core.assembler; + +import static org.apache.jena.assembler.JA.MemoryDataset; +import static org.apache.jena.assembler.JA.data; +import static org.apache.jena.query.DatasetFactory.createTxnMem; +import static org.apache.jena.riot.RDFDataMgr.loadModel; +import static org.apache.jena.riot.RDFDataMgr.read; +import static org.apache.jena.sparql.core.assembler.AssemblerUtils.setContext; +import static org.apache.jena.sparql.core.assembler.DatasetAssemblerVocab.*; +import static org.apache.jena.sparql.util.FmtUtils.stringForRDFNode; +import static org.apache.jena.sparql.util.graph.GraphUtils.*; + +import java.util.function.Predicate; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.assembler.Mode; +import org.apache.jena.assembler.assemblers.AssemblerBase; +import org.apache.jena.query.Dataset; +import org.apache.jena.rdf.model.Model; +import org.apache.jena.rdf.model.RDFNode; +import org.apache.jena.rdf.model.Resource; + +/** + * An {@link Assembler} that creates in-memory {@link Dataset}s. + * + */ +public class InMemDatasetAssembler extends AssemblerBase { + + @Override + public Dataset open(final Assembler assembler, final Resource root, final Mode mode) { + checkType(root, MemoryDataset); + final Dataset dataset = createTxnMem(); + setContext(root, dataset.getContext()); + // Default graph can be defined with ja:graph or ja:defaultGraph + final Resource defaultGraphDef = root.hasProperty(pDefaultGraph) ? getResourceValue(root, + pDefaultGraph) : root.hasProperty(pGraph) ? getResourceValue(root, pGraph) : null; --- End diff -- I'm happy to make them do whatever we think is going to be most useful but I'm not yet understanding why we would want them to be different here when in [DatasetAssembler](https://github.com/apache/jena/blob/master/jena-arq/src/main/java/org/apache/jena/sparql/core/assembler/DatasetAssembler.java#L58) they really do seem to be the same? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1062: configurable Lucene analyzer for jen...
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/97#issuecomment-154099939 Sure, there's nothing wrong with what's there. All I meant is that `newHashSet` seems as clear as `new HashSet<>()` and allows elements, so is shorter. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1062: configurable Lucene analyzer for jen...
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/97#issuecomment-154093794 @osma I am a nitpicker of the worst kind! {grin} If you would like me to make a PR against `jena-text` looking for this kind of thing (using Guava or Java 8 idioms to shorten things up) I'm happy to do that sometime in the next week or three. I'm a born "code janitor". --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1062: configurable Lucene analyzer for jen...
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/97#discussion_r44013590 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/assembler/ConfigurableAnalyzerAssembler.java --- @@ -0,0 +1,101 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.query.text.assembler; + +import java.util.ArrayList; +import java.util.List; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.assembler.Mode; +import org.apache.jena.assembler.assemblers.AssemblerBase; +import org.apache.jena.query.text.TextIndexException; +import org.apache.jena.query.text.TextIndexLucene; +import org.apache.jena.query.text.analyzer.ConfigurableAnalyzer; +import org.apache.jena.rdf.model.RDFNode; +import org.apache.jena.rdf.model.Resource; +import org.apache.jena.rdf.model.Statement ; +import org.apache.jena.vocabulary.RDF ; +import org.apache.lucene.analysis.Analyzer; + + +/** + * Assembler to create a configurable analyzer. + */ +public class ConfigurableAnalyzerAssembler extends AssemblerBase { +/* +text:map ( + [ text:field "text" ; + text:predicate rdfs:label; + text:analyzer [ + a text:ConfigurableAnalyzer ; + text:tokenizer text:LetterTokenizer ; + text:filters (text:LowerCaseFilter) + ] + ] +. +*/ + + +@Override +public Analyzer open(Assembler a, Resource root, Mode mode) { +if (root.hasProperty(TextVocab.pTokenizer)) { +Resource tokenizerResource = (Resource) root.getProperty(TextVocab.pTokenizer).getObject(); --- End diff -- Would `Resource::getPropertyResourceValue` be more readable here? (No cast.) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-1062: configurable Lucene analyzer for jen...
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/97#discussion_r44012158 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/analyzer/ConfigurableAnalyzer.java --- @@ -0,0 +1,93 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.query.text.analyzer ; + +import java.io.Reader ; +import java.util.List ; + +import org.apache.jena.query.text.TextIndexException; +import org.apache.lucene.analysis.Analyzer ; +import org.apache.lucene.analysis.TokenFilter ; +import org.apache.lucene.analysis.Tokenizer ; +import org.apache.lucene.analysis.TokenStream ; +import org.apache.lucene.analysis.core.KeywordTokenizer ; +import org.apache.lucene.analysis.core.LetterTokenizer; +import org.apache.lucene.analysis.core.LowerCaseFilter ; +import org.apache.lucene.analysis.core.WhitespaceTokenizer ; +import org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter; +import org.apache.lucene.analysis.standard.StandardFilter; +import org.apache.lucene.analysis.standard.StandardTokenizer; +import org.apache.lucene.util.Version ; + + +/** + * Lucene Analyzer implementation that can be configured with different + * Tokenizer and (optionally) TokenFilter implementations. + */ + +public class ConfigurableAnalyzer extends Analyzer { +private Version version; --- End diff -- Couldn't these files be `final`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-151317139 Makes sense, @afs. I'll head down that road. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-151155300 @afs For "Persistent datastructures: These could go in their own package." are you thinking of a subpackage of the one I'm currently using in `jena-arq`: `org.apache.jena.sparql.core.mem`, or since these structures are really just about arranging `Nodes`, maybe stick 'em in `jena-core`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-151165442 Okay, you mean then to add a `jena-base`:`org.apache.jena.atlas.lib.persistent`? Can do. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-151154190 Got 'em, thanks. I had forgotten to turn on Eclipse's warnings for "empty statements". --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-151139579 @afs I can't find any of the unnecessary semicolon warnings-- can you give me a pointer? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-151194518 @afs On the assembler front, did you want me to contribute a totally new assembler that works for this new dataset impl, or to alter the current `DatasetAssembler` to use this new impl instead of what it now uses? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-151170795 Oh, okay, that's different: so you're thinking `jena-base`:`org.apache.jena.atlas.lib.persistent` or the like? Cool. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-624
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/94#issuecomment-149916761 Thanks for the feedback, @afs! I'll take it point-by-point: - Performance Status: The suspicion to which I was referring is that it's because using the Streams API generates more objects and those objects have more state than using iterators. It should be noted that I have not been able to verify this. I could, by rewriting the PR to avoid the Streams API, but it would require a good bit of work to do that and I think it would be _much_ harder to read. I think the majority of the performance decline is because of the changes in design, not incidental stuff like this. - DatasetFactory: I just hadn't touched that yet to avoid touching anything outside of my code until we are happy with the basic machinery. I will happily put that in as per your proposal in that message. See a forthcoming commit. {grin} - Assembler: I'm still reading through the assembler subsystem and making sure I understand it before extending it. I should be able to supply something soon. - Documentation: Yes, I haven't done anything here. I assume you are talking about documentation in the website, right? Not Javadocs? - Persistent datastructures: Cool, I'll break these out and give them some tests. - Dependency management: Okay, I'll square that away. I didn't realize that $module/DEPENDENCIES even existed. What does that support? - (Not) Mocking in tests: I think that should be doable in most cases. I usually prefer mocking because it makes everything very explicit, but if Jena's habit is otherwise, I'm happy to follow it. - Warnings: Hm, didn't see those (except for the casts). I will make sure everything is clean. Those casts are really weird, because sometimes I have seen compilation fail without them, but they do seem to be unnecessary and the types are infer-able. I will check into it more thoroughly. - Journaling: I think including it could be a good way also to get some feedback. Is there some Jena-standard way to mark something as "slightly experimental" a la Google's `@Beta` annotation? - LockMRPlusSW: I will slap some tests on that. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/95#issuecomment-149213529 There has been some discussion of ETags on the mailing list recently: http://jena.markmail.org/search/%22Fuskei+and+ETags%22 and there is also this older issue: https://issues.apache.org/jira/browse/JENA-388 Actually, I think 626 and 388 may be duplicates... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42413989 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/cache/CacheStore.java --- @@ -0,0 +1,150 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +package org.apache.jena.fuseki.cache; + +import org.apache.jena.fuseki.Fuseki; +import org.apache.jena.fuseki.servlets.ActionLib; +import org.apache.jena.fuseki.servlets.HttpAction; +import org.slf4j.Logger; + +import javax.servlet.http.HttpServletRequest; + +public class CacheStore { + + +/** flag to check if data store was initialized */ +public static boolean initialized = false ; + +private static Logger log = Fuseki.cacheLog ; + +/** execution timeout for any method */ +private int defaultExecutionTimeout = 500 ; + +/** thread pool size for this data store */ +protected int defaultThreadPoolSize = 500 ; + +/** client for interacting with Cache store **/ +private final CacheClient client = new GuavaCacheClient(); + +private static CacheStore instance; + +public static CacheStore getInstance(){ + +if(instance==null){ +instance = new CacheStore(); +return instance; +} +return instance; +} + +/** For testing - reset the places which initialize once */ --- End diff -- Singletons can be tricky, sometimes, because OSGi doesn't offer a single unified classpath to a multi-module application or library. It's not always an issue, but unless you have a really good reason to have a singleton (and I would argue here that you do not-- you are trying to build a single cache per-Fuseki-instance, not per-JVM, and as @afs pointed out, that is probably not the right approach in the end _anyway_) it's better to keep it simple. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42413334 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/cache/CacheEntry.java --- @@ -0,0 +1,57 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.fuseki.cache; + + +import org.apache.jena.sparql.resultset.SPARQLResult; + +public class CacheEntry { --- End diff -- My inclination would actually be to make this type as immutable as possible. If I am right in understanding that `result` must always be set, then make `result` `final`, have the constructor set it, and be done with it. If `cacheBuilder` may or may not be set, leave it mutable, and provide a mutator and a flag to show whether it has been set.Of course, this all deserves reconsideration in light of the advice @afs has given about architectural issues. You may want to have this type contain very different state or even wrap a computation instead of values. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42413696 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/cache/CacheStore.java --- @@ -0,0 +1,150 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +package org.apache.jena.fuseki.cache; + +import org.apache.jena.fuseki.Fuseki; +import org.apache.jena.fuseki.servlets.ActionLib; +import org.apache.jena.fuseki.servlets.HttpAction; +import org.slf4j.Logger; + +import javax.servlet.http.HttpServletRequest; + +public class CacheStore { + + +/** flag to check if data store was initialized */ +public static boolean initialized = false ; + +private static Logger log = Fuseki.cacheLog ; + +/** execution timeout for any method */ +private int defaultExecutionTimeout = 500 ; + +/** thread pool size for this data store */ +protected int defaultThreadPoolSize = 500 ; + +/** client for interacting with Cache store **/ +private final CacheClient client = new GuavaCacheClient(); + +private static CacheStore instance; + +public static CacheStore getInstance(){ + +if(instance==null){ --- End diff -- Yeah, the value of an assignment expression is the value that was assigned. Sometimes that can be a really opaque idiom, but here the context makes clear what's going on. Of course, depending on the architectural plan going forward, you may not be needing a singleton of this type anyway. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42413412 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/cache/CacheStore.java --- @@ -0,0 +1,150 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +package org.apache.jena.fuseki.cache; + +import org.apache.jena.fuseki.Fuseki; +import org.apache.jena.fuseki.servlets.ActionLib; +import org.apache.jena.fuseki.servlets.HttpAction; +import org.slf4j.Logger; + +import javax.servlet.http.HttpServletRequest; + +public class CacheStore { + + +/** flag to check if data store was initialized */ +public static boolean initialized = false ; + +private static Logger log = Fuseki.cacheLog ; + +/** execution timeout for any method */ +private int defaultExecutionTimeout = 500 ; + +/** thread pool size for this data store */ +protected int defaultThreadPoolSize = 500 ; + +/** client for interacting with Cache store **/ +private final CacheClient client = new GuavaCacheClient(); + +private static CacheStore instance; + +public static CacheStore getInstance(){ + +if(instance==null){ +instance = new CacheStore(); +return instance; +} +return instance; +} + +/** For testing - reset the places which initialize once */ +public synchronized static void reset() { +initialized = false ; +CacheStore.initialized = false ; +} + +public synchronized static void init(){ +if ( initialized ) +return ; +initialized = true ; + +} + +/** + * Get cache data from cache store. + * @param key Cache store key + */ +public Object doGet(String key) throws CacheStoreException{ +try{ +Object data = client.get(key); +if(data == null) +return null; +else +return data; --- End diff -- I think tht of the two, you definitely want to use `java.util.Optional`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42413572 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/cache/CacheStore.java --- @@ -0,0 +1,150 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +package org.apache.jena.fuseki.cache; + +import org.apache.jena.fuseki.Fuseki; +import org.apache.jena.fuseki.servlets.ActionLib; +import org.apache.jena.fuseki.servlets.HttpAction; +import org.slf4j.Logger; + +import javax.servlet.http.HttpServletRequest; + +public class CacheStore { --- End diff -- Ah, cool. I made other remarks like this elsewhere, so I guess those should be ignored. I think `Cache` has actually been around for quite some time, but it was maybe tucked into `jena-core` and not as visible? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42329075 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/cache/GuavaCacheClient.java --- @@ -0,0 +1,73 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.fuseki.cache; + +import com.google.common.cache.Cache; +import com.google.common.cache.CacheBuilder; + +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; + +public class GuavaCacheClient extends CacheClient{ --- End diff -- There already exists a Guava-backed client in the form of `CacheGuava`. If you used the standard Jena `Cache` type instead of a new caching contract you could just use that. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42325590 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/system/InitARQ.java --- @@ -23,7 +23,7 @@ /** ARQ initialization. Used by {@code JenaSystem} */ public class InitARQ implements JenaSubsystemLifecycle { - + --- End diff -- Needless whitespace change. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42325746 --- Diff: jena-core/src/main/java/org/apache/jena/system/JenaSubsystemLifecycle.java --- @@ -18,15 +18,15 @@ package org.apache.jena.system; --- End diff -- These are all needless whitespace changes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42325752 --- Diff: jena-fuseki2/jena-fuseki-core/pom.xml --- @@ -108,6 +108,12 @@ + com.google.guava --- End diff -- Guava types should be coming from Jena's shaded Guava artifact, not directly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/95#issuecomment-149066287 I'm a little confused by the fact that for a feature that seems to require purchase only in an HTTP API (in Fuseki itself) this code seems to tangle all the way down into very low-level serialization code and basic types. It seems to me to be possible to do this reusing the extant caching machinery and doing new work only in Fuseki. Perhaps you can explain a little about why you chose to develop new caching machinery and why you integrated it in ARQ's serialization machinery. Perhaps I am misunderstanding the intent here? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42325779 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/cache/CacheAction.java --- @@ -0,0 +1,45 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.fuseki.cache; + +public class CacheAction { + +public CacheAction.Type type; --- End diff -- Possibly better to make this `final`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42325781 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/cache/CacheAction.java --- @@ -0,0 +1,45 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.fuseki.cache; + +public class CacheAction { + +public CacheAction.Type type; + +private String key; --- End diff -- Possibly better to make this `final`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42320566 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/cache/CacheStore.java --- @@ -0,0 +1,150 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +package org.apache.jena.fuseki.cache; + +import org.apache.jena.fuseki.Fuseki; +import org.apache.jena.fuseki.servlets.ActionLib; +import org.apache.jena.fuseki.servlets.HttpAction; +import org.slf4j.Logger; + +import javax.servlet.http.HttpServletRequest; + +public class CacheStore { --- End diff -- Why is this machinery being created independently of `org.apache.jena.atlas.lib.Cache<Key, Value>`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42320708 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/resultset/CSVOutput.java --- @@ -102,6 +102,62 @@ public void format(OutputStream out, ResultSet resultSet) } } +@Override +public void format(OutputStream out, ResultSet resultSet, StringBuilder cacheBuilder) +{ +try { +Writer w = FileUtils.asUTF8(out) ; +NodeToLabelMap bnodes = new NodeToLabelMap() ; +w = new BufferedWriter(w) ; + +String sep = null ; +List varNames = resultSet.getResultVars() ; +List vars = new ArrayList<>(varNames.size()) ; + +// Convert to Vars and output the header line. +for( String v : varNames ) +{ +if ( sep != null ){ --- End diff -- How is `sep` not going to be `null` at this line? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42320704 --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/resultset/CSVOutput.java --- @@ -102,6 +102,62 @@ public void format(OutputStream out, ResultSet resultSet) } } +@Override +public void format(OutputStream out, ResultSet resultSet, StringBuilder cacheBuilder) +{ +try { +Writer w = FileUtils.asUTF8(out) ; +NodeToLabelMap bnodes = new NodeToLabelMap() ; +w = new BufferedWriter(w) ; --- End diff -- Why not use `try-with-resource` here? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42320206 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/cache/CacheEntry.java --- @@ -0,0 +1,57 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.fuseki.cache; + + +import org.apache.jena.sparql.resultset.SPARQLResult; + +public class CacheEntry { --- End diff -- What is the purpose of the initialization machinery in this class? It doesn't check to see that any of the fields are initialized... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: JENA-626 SPARQL Query Caching
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/95#discussion_r42320283 --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/cache/CacheClient.java --- @@ -0,0 +1,30 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.jena.fuseki.cache; + +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeoutException; + +public abstract class CacheClient { --- End diff -- Isn't this better as an interface instead of an `abstract` class? There is no state in it, which is the purpose of an `abstract` class. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---