Hi Ying,

It's the mid-point of the GSoC programme, so it's a good time to assess the state of the project. It looks close to the plan, and I'd like you to (briefly) write up how the project is going. Check that you are getting what you want out of the project as well; it is not just about code production. Is the rest of the plan still looking right?


Looking at the repository, there are a few things I'd like to see:

1/ More tests - tests should be structured so that each one tests a specific thing; then, when/if there are test failures, it's easier to see what might be the root cause.

2/ Examples and documentation

3/ Evaluation:

For example, is the property table specialisation resulting in a smaller storage cost? And, iteratively, can the design be changed to be more compact? Maybe some indexing isn't needed; maybe a different way to index the same access patterns would take less space.



Other:

The code can be packaged under org.apache.jena. We're trying to avoid com.hp.hpl.jena.

A specific question:

Access by subject is an important use case even when the rows are blank nodes. It will matter for SPARQL, since "find by subject column/value then get row by subject" (that is, two graph.find calls) seems a reasonable access pattern.

I could not see that graph.find(subject, ANY, ANY) is using PropertyTable.getRow in the graph.find codepath and I expected it would be. Did I miss something?
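To make the expected behaviour concrete, here is a minimal sketch of the access pattern. It does not use the real Jena or PropertyTable API: the class SubjectLookupSketch, its Table and Triple types, and the counters are all hypothetical names invented for illustration, and each cell holds a single value to keep it short. The only point is that a find with a bound subject goes through the row lookup rather than scanning every row.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT the project's PropertyTable or Jena's Graph.
public class SubjectLookupSketch {

    /** A triple of plain strings; null in a find pattern plays the role of ANY. */
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
        @Override public String toString() { return "(" + s + " " + p + " " + o + ")"; }
    }

    /** One row per subject, one column per property (single-valued for brevity). */
    static final class Table {
        // subject -> (property -> value): the row index keyed by subject
        private final Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        int rowLookups = 0; // how often getRow was used
        int fullScans = 0;  // how often every row had to be visited

        void add(String s, String p, String o) {
            rows.computeIfAbsent(s, k -> new LinkedHashMap<>()).put(p, o);
        }

        /** Direct access to one row by its subject (blank node labels included). */
        Map<String, String> getRow(String s) {
            rowLookups++;
            return rows.getOrDefault(s, Collections.<String, String>emptyMap());
        }

        /** find(s, p, o): a bound subject is answered via getRow, otherwise by scan. */
        List<Triple> find(String s, String p, String o) {
            List<Triple> out = new ArrayList<>();
            if (s != null) {
                collect(s, getRow(s), p, o, out);
            } else {
                fullScans++;
                for (Map.Entry<String, Map<String, String>> e : rows.entrySet()) {
                    collect(e.getKey(), e.getValue(), p, o, out);
                }
            }
            return out;
        }

        // Adds the cells of one row that match the property/value pattern.
        private static void collect(String s, Map<String, String> row,
                                    String p, String o, List<Triple> out) {
            for (Map.Entry<String, String> cell : row.entrySet()) {
                boolean pOk = (p == null) || p.equals(cell.getKey());
                boolean oOk = (o == null) || o.equals(cell.getValue());
                if (pOk && oOk) {
                    out.add(new Triple(s, cell.getKey(), cell.getValue()));
                }
            }
        }
    }

    public static void main(String[] args) {
        Table table = new Table();
        table.add("_:b0", "name", "Ann");
        table.add("_:b0", "city", "Paris");
        table.add("_:b1", "name", "Bob");
        table.add("_:b1", "city", "Oslo");

        // Call 1: find the subject by column/value.
        List<Triple> hits = table.find(null, "city", "Paris");
        // Call 2: get the whole row by subject; this should use getRow.
        for (Triple hit : hits) {
            System.out.println(table.find(hit.s, null, null));
        }
        System.out.println("row lookups: " + table.rowLookups
                + ", full scans: " + table.fullScans);
    }
}
```

In this sketch the second call touches only the one row it needs, which is the behaviour I was expecting to find behind graph.find(subject, ANY, ANY).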

        Andy
