[jira] [Commented] (JENA-626) SPARQL Query Caching

2015-11-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031916#comment-15031916
 ] 

ASF GitHub Bot commented on JENA-626:
-

Github user ajs6f commented on a diff in the pull request:

https://github.com/apache/jena/pull/95#discussion_r46153931
  
--- Diff: jena-arq/pom.xml ---
@@ -123,9 +123,18 @@
       <groupId>com.jayway.awaitility</groupId>
       <artifactId>awaitility</artifactId>
       <scope>test</scope>
+
--- End diff --

This is totally confusing. Not only does it appear that you are pulling in 
the same dependency twice, this very dependency is already pulled in at line 76 
above!


> SPARQL Query Caching
> --------------------
>
>                 Key: JENA-626
>                 URL: https://issues.apache.org/jira/browse/JENA-626
>             Project: Apache Jena
>          Issue Type: Improvement
>            Reporter: Andy Seaborne
>            Assignee: Saikat Maitra
>              Labels: java, linked_data, rdf, sparql
>
> Add a caching layer to Fuseki to cache the results of SPARQL Query requests.  
> This cache should allow for in-memory and disk-based caching, configuration 
> and cache management, and coordination with data modification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] jena pull request: JENA-626 SPARQL Query Caching

2015-11-30 Thread ajs6f
Github user ajs6f commented on a diff in the pull request:

https://github.com/apache/jena/pull/95#discussion_r46154284
  
--- Diff: jena-base/pom.xml ---
@@ -50,21 +50,29 @@
 
       <groupId>org.apache.commons</groupId>
       <artifactId>commons-csv</artifactId>
- 
-
+
+
 
 
       <groupId>com.github.andrewoma.dexx</groupId>
       <artifactId>dexx-collections</artifactId>
 
-
+
 
 
       <groupId>org.apache.commons</groupId>
       <artifactId>commons-lang3</artifactId>
+      <groupId>org.apache.jena</groupId>
--- End diff --

Again, why is `jena-shaded-guava` being pulled in twice, with the wrong 
version, when it is already pulled in above?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Model locks and Dataset locks

2015-11-30 Thread A. Soroka
I noticed the other day that the ModelCom locking behavior seems completely 
independent of the locking behavior of the dataset that may underlie a Model. 
I’m a little worried about that with the advent of DatasetGraphInMemory, 
because I think the “ergonomics” could be confusing. I have a test case up here:

https://gist.github.com/ajs6f/3e87cd6f78ec3b4e27a1

The problem is that when used without transactions, DatasetGraphInMemory tries 
to “do what the user meant” by auto-wrapping ::add or ::delete operations in 
transactions. But this leads to a case where a thread can get a model from a 
DatasetGraphInMemory-backed dataset, acquire the write lock for that model with 
no trouble, but then block trying to add or delete anything from it because the 
mutation is blocked by some other thread elsewhere that has the dataset lock. 
What would seem less surprising to me would be for the first thread to block 
waiting to get the model write lock.

Does this seem too obscure? Is it worth trying to think about how ModelCom 
could respect the locking of an underlying Dataset, if one exists?
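
A minimal Java sketch of the situation described above. It uses two plain `ReentrantLock`s as hypothetical stand-ins for the model lock and the dataset lock; the class and method names are illustrative only and none of this is Jena code. It assumes, as described, that the two locks know nothing about each other.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class IndependentLocksSketch {

    // Hypothetical stand-ins: one lock for the "model", one for the "dataset".
    static final ReentrantLock modelLock = new ReentrantLock();
    static final ReentrantLock datasetLock = new ReentrantLock();

    /** Returns {modelLockAcquired, mutationSucceeded} as seen by the model-using thread. */
    static boolean[] run() throws InterruptedException {
        CountDownLatch datasetHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Another thread elsewhere holds the dataset lock.
        Thread holder = new Thread(() -> {
            datasetLock.lock();
            try {
                datasetHeld.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                datasetLock.unlock();
            }
        });
        holder.start();
        datasetHeld.await();

        // The model lock is independent, so it is granted with no trouble.
        boolean gotModelLock = modelLock.tryLock();
        boolean mutated = false;
        try {
            // An auto-wrapped mutation needs the dataset lock. A timed tryLock
            // is used here so the sketch terminates instead of blocking.
            if (datasetLock.tryLock(200, TimeUnit.MILLISECONDS)) {
                try {
                    mutated = true;
                } finally {
                    datasetLock.unlock();
                }
            }
        } finally {
            if (gotModelLock) {
                modelLock.unlock();
            }
            release.countDown();
            holder.join();
        }
        return new boolean[] { gotModelLock, mutated };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = run();
        System.out.println("model write lock acquired: " + r[0]);
        System.out.println("mutation went through:     " + r[1]);
    }
}
```

Run as written, the calling thread gets the model lock immediately, but the mutation attempt times out because the dataset lock is still held elsewhere. With a blocking `lock()` in place of the timed `tryLock`, the thread would simply wait while holding the model lock.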

---
A. Soroka
The University of Virginia Library



[jira] [Commented] (JENA-624) Develop a new in-memory RDF Dataset implementation

2015-11-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032211#comment-15032211
 ] 

ASF GitHub Bot commented on JENA-624:
-

GitHub user ajs6f opened a pull request:

https://github.com/apache/jena/pull/103

JENA-624: Correction to transaction begin for DatasetGraphInMemory

Correction to transaction begin for DatasetGraphInMemory to include default 
graph as well as quad table.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ajs6f/jena CorrectionToDatasetGraphInMemory

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/jena/pull/103.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #103


commit c9b292293d307de305700981adb3d2d6d424d732
Author: ajs6f 
Date:   2015-11-30T18:23:34Z

Correction to transaction begin for DatasetGraphInMemory to include default 
graph as well as quad table.




> Develop a new in-memory RDF Dataset implementation
> --------------------------------------------------
>
>                 Key: JENA-624
>                 URL: https://issues.apache.org/jira/browse/JENA-624
>             Project: Apache Jena
>          Issue Type: Improvement
>            Reporter: Andy Seaborne
>            Assignee: A. Soroka
>              Labels: java, linked_data, rdf
>
> The current (Jan 2014) Jena in-memory dataset uses a general purpose 
> container that works for any storage technology for graphs together with 
> in-memory graphs.  
> This project would develop a new implementation design specifically for RDF 
> datasets (triples and quads) and efficient SPARQL execution, for example, 
> using multi-core parallel operations and/or multi-version concurrent 
> data structures to maximise true parallel operation.
> This is a system project suitable for someone interested in database 
> implementation, data structure design and implementation, operating systems or 
> distributed systems.
> Note that TDB can operate in-memory using a simulated disk with 
> copy-in/copy-out semantics for disk-level operations.  It is for faithful 
> testing of TDB infrastructure and is not designed for performance, general 
> in-memory use or use at scale.  While lessons may be learnt from that system, 
> TDB in-memory is not the answer here.





[jira] [Commented] (JENA-626) SPARQL Query Caching

2015-11-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032407#comment-15032407
 ] 

ASF GitHub Bot commented on JENA-626:
-

Github user samaitra commented on a diff in the pull request:

https://github.com/apache/jena/pull/95#discussion_r46198723
  
--- Diff: jena-base/pom.xml ---
@@ -50,21 +50,29 @@
 
       <groupId>org.apache.commons</groupId>
       <artifactId>commons-csv</artifactId>
- 
-
+
+
 
 
       <groupId>com.github.andrewoma.dexx</groupId>
       <artifactId>dexx-collections</artifactId>
 
-
+
 
 
       <groupId>org.apache.commons</groupId>
       <artifactId>commons-lang3</artifactId>
+      <groupId>org.apache.jena</groupId>
--- End diff --

@ajs6f I have corrected the pom files. Thank you for reviewing the changes.




[GitHub] jena pull request: JENA-624: Correction to transaction begin for D...

2015-11-30 Thread afs
Github user afs commented on the pull request:

https://github.com/apache/jena/pull/103#issuecomment-160771514
  
Are there some tests for this?

I noticed that `DatasetGraphInMemory.abort` does not call anything. Note 
the contract for transactions is that `end()` is optional for write 
transactions (see javadoc).






[jira] [Commented] (JENA-624) Develop a new in-memory RDF Dataset implementation

2015-11-30 Thread A. Soroka (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032841#comment-15032841
 ] 

A. Soroka commented on JENA-624:


No, no test, because I thought it pretty obvious, but I can certainly add one, 
later this week.

I'm not sure what you mean by the question about {{::abort}}. 
{{DatasetGraphInMemory::abort}} specifically changes the transaction tracking 
state and calls {{DatasetGraphInMemory::end}}. That's just for DRY-ness. The 
things that need to happen in an abort are local state cleanup (releasing the 
dataset lock and so forth) and passing the message on to the various tuple 
tables, so they can drop whatever state was associated with the transaction (in 
the default impl, the "evolved" references to the data structures).

I understand {{end}} also to possess the semantic of "clean up and release any 
state associated with the transaction". Is that not so? That's why I reused 
{{end}} to do the same work as part of {{abort}}. See 
{{DatasetGraphWithLock::_abort}} for what I think is the same pattern.
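
The pattern described here (abort does its own bookkeeping, then reuses `end` for cleanup) can be sketched roughly as follows. `TxnResource` and its fields are hypothetical and are not the real `DatasetGraphInMemory`. The sketch also assumes `end` is made idempotent, so that a caller may skip it or call it again after a write transaction has finished.

```java
import java.util.concurrent.locks.ReentrantLock;

public class AbortViaEndSketch {

    // Hypothetical transactional resource, for illustration only.
    static class TxnResource {
        private final ReentrantLock writeLock = new ReentrantLock();
        private boolean inTransaction = false;
        private int committed = 0; // state visible outside transactions
        private int working = 0;   // per-transaction ("evolved") state

        void begin() {
            writeLock.lock();
            inTransaction = true;
            working = committed;
        }

        void add(int n) {
            if (!inTransaction) {
                throw new IllegalStateException("no transaction in progress");
            }
            working += n;
        }

        void commit() {
            committed = working; // publish the evolved state
            end();               // then share the cleanup path
        }

        void abort() {
            // Nothing is published; cleanup is delegated to end() to avoid
            // repeating the same steps in two places.
            end();
        }

        void end() {
            if (!inTransaction) {
                return; // already cleaned up by commit() or abort()
            }
            inTransaction = false;
            working = committed; // drop any per-transaction state
            writeLock.unlock();  // release the lock taken in begin()
        }

        int value() {
            return committed;
        }

        boolean isLocked() {
            return writeLock.isLocked();
        }
    }

    public static void main(String[] args) {
        TxnResource r = new TxnResource();
        r.begin();
        r.add(5);
        r.commit();
        r.begin();
        r.add(7);
        r.abort();
        r.end(); // optional extra call, a no-op here
        System.out.println(r.value() + " " + r.isLocked());
    }
}
```

In this sketch the aborted `add(7)` never becomes visible, and the lock is released whether or not the caller follows `abort` with `end`.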



FOSDEM 2016 - take action by 4th of December 2015

2015-11-30 Thread Roman Shaposhnik
As most of you probably know, FOSDEM 2016 (the biggest,
100% free open source developer conference) is right 
around the corner:
   https://fosdem.org/2016/

We hope to have an ASF booth and we would love to see as
many ASF projects as possible present at various tracks
(AKA Developer rooms):
   https://fosdem.org/2016/schedule/#devrooms

This year, for the first time, we are running a dedicated
Big Data and HPC Developer Room and given how much of that
open source development is done at ASF it would be great
to have folks submit talks to:
   https://hpc-bigdata-fosdem16.github.io

The CFPs for different Developer Rooms follow slightly 
different schedules, but if you submit by the end of this week 
you should be fine.

Finally, if you don't want to fish for the CFP submission URL,
here it is:
   https://fosdem.org/submit

If you have any questions, please email me *directly*. I
hope to see as many of you as possible in two months!

Thanks,
Roman.