[GitHub] jena pull request: Add (?uri ?score) to jena-text
Github user osma commented on the pull request: https://github.com/apache/jena/pull/72#issuecomment-106794036 I updated my branch to fix merge conflicts caused by recent jena-text commits. Would this be ready for merging? @afs ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: Lucene index synchro on triple deletion (jena-t...
Github user osma commented on the pull request: https://github.com/apache/jena/pull/53#issuecomment-106801397 @amiara514 What's the status of this? Obviously there are now some merge conflicts that need to be resolved, but other than that, do you think it's ready for merging? I for one would like to see (a version of) this merged, because it would help a lot in making efficient text index queries if you could rely on the index not containing stale entries. Looking at the diffs, I see there are some changes to jena-arq that don't seem related to the main purpose of this code. Some of that is just cosmetic changes like adjusting whitespace. Then there is some QueryVisitor code, and some changes to ARQ tests. I doubt these patches are related to jena-text? A couple of notes on the jena-text part: 1. You seem to always use the "uid" field to store the SHA-1 checksum. However, all other field names in jena-text, including entityField to store the URI, have so far been configurable using the assembler or EntityDefinition. Should this field name be made configurable as well? Not having it set could then mean "backwards compatible mode" where checksums are not stored and old entries are thus not deleted. 2. You are using DigestUtils.shaHex() to calculate the checksum, but this seems to be a deprecated method nowadays. I suggest using sha1Hex() or one of the other more explicit variants instead. 3. I'm not sure TextQueryFuncs.getLiteralLanguage is a very useful method (at least in its current form, with no error handling for null values etc), as one could just do node.getLiteral().getLanguage() directly. But this is a minor style issue. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: Lucene index synchro on triple deletion (jena-t...
Github user amiara514 commented on the pull request: https://github.com/apache/jena/pull/53#issuecomment-106848124 @osma The status is still pending. Ok, I will fix minor changes to provide a re-mergeable version. 1. yes maybe the unique ID could be configurable. Hence the feature would be a kind of: "Enabling deletion mode". Agree with that ? 2. no problem for change it. 3. I will take a look --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: Update TextDatasetFactory.java
Github user asfgit closed the pull request at: https://github.com/apache/jena/pull/74 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: Add (?uri ?score) to jena-text
Github user afs commented on the pull request: https://github.com/apache/jena/pull/72#issuecomment-106861440 @osma - I'll find some time to review this but To your points: 1. I dont think that change is a problem and \@Deprecated isn't needed because (a) Jena3 allows us to be a bit more flexibility to do things right and (b) the major use is via SPARQL and that is not affected. I don't recall anyone asking about detailed direct use of the module. 1. IIRC solr works differently and puts in a "score" field in the results. 1. Yes - we already have enough old practices! I'll look at the PR for these. I'd like to proceed getting functionality in and worry about clearing up very soon after. It exposes the changes early for those interested. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
A couple pull requests in the pending tray.
Claude, Rob, I took a look at our outstanding pull requests on github. I was wondering if you have a moment to assess them. It's clearing up suggestions from ajs6f, not bug fixes nor new features. They were part of large sets of changes over several modules but ajs6f very kindly broken them up in units that reflect the module structure of Jena. Claude: "Nonfunctional cleanup in jena-querybuilder" https://github.com/apache/jena/pull/68 the bulk of this is subQuery is known to be instanceof the various cases do the if/instanceof isn't necessary. Your choice on style. Rob: "Nonfunctional cleanup in jena-jdbc" https://github.com/apache/jena/pull/62 Much of this is adjusting the throws clauses on @Override methods. I personally prefer normally to keep the throws even if not actually thrown because they are in the contract of the interface. I didn't apply similar changes elsewhere. Andy
[GitHub] jena pull request: Lucene index synchro on triple deletion (jena-t...
Github user osma commented on the pull request: https://github.com/apache/jena/pull/53#issuecomment-106868586 > yes maybe the unique ID could be configurable. Hence the feature would be a kind of: "Enabling deletion mode". But insertions without this option enabled are unremovable. Agree with that ? Sure, that's how it would work then. Without stored id's deletion is impossible. Unit tests are missing too. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: Jena-text multilingual alternative implementati...
Github user afs commented on the pull request: https://github.com/apache/jena/pull/64#issuecomment-106875094 Yes - adding the version info is more nice than essential. (Really, we ought to undo references to really old stuff.) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: Jena-text multilingual alternative implementati...
Github user amiara514 commented on the pull request: https://github.com/apache/jena/pull/64#issuecomment-106881469 Also, I want to mention that the md version from "Improve this Page" button (on doc web site) does not correspond to the web site. https://svn.apache.org/repos/asf/jena/site/trunk/content/documentation/query/text-query.mdtext seems to be the good one. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request: Add (?uri ?score) to jena-text
Github user osma commented on the pull request: https://github.com/apache/jena/pull/72#issuecomment-106892384 > 1. I dont think that change is a problem and \@Deprecated isn't needed Great, no worries then! > 1. IIRC solr works differently and puts in a "score" field in the results. Yes, I looked at some Solr example code which handled scores and tried to follow that. But as I said I was unable to hook up Solr to current jena-text even without my changes, so I wasn't able to test my code for real. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---