[GitHub] jena pull request: JENA-1062: configurable Lucene analyzer for jen...

2015-11-04 Thread osma
GitHub user osma opened a pull request: https://github.com/apache/jena/pull/97 JENA-1062: configurable Lucene analyzer for jena-text This is a configurable Analyzer implementation for jena-text / Lucene. It is similar to what can be achieved in [Solr configuration](https://wiki.apa

2015-11-04 Thread rvesse
Github user rvesse commented on the pull request: https://github.com/apache/jena/pull/97#issuecomment-153874598 Looks good to me. One open question: how does this interact with past work for language-specific indexing and multilingual indexing in general? It's been a

2015-11-04 Thread osma
Github user osma commented on the pull request: https://github.com/apache/jena/pull/97#issuecomment-153981033 Good questions @rvesse ! Right now (before this PR) one can either use a few generic, non-language-specific Analyzers: StandardAnalyzer, SimpleAnalyzer, KeywordAnalyz

2015-11-05 Thread ajs6f
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/97#discussion_r44012158 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/analyzer/ConfigurableAnalyzer.java --- @@ -0,0 +1,93 @@ +/** + * Licensed to the Apache Softw

2015-11-05 Thread osma
Github user osma commented on the pull request: https://github.com/apache/jena/pull/97#issuecomment-154063083 @ajs6f You're right, I did that now. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not h

2015-11-05 Thread ajs6f
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/97#discussion_r44013590 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/assembler/ConfigurableAnalyzerAssembler.java --- @@ -0,0 +1,101 @@ +/** + * Licensed to the A

2015-11-05 Thread ajs6f
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/97#discussion_r44015443 --- Diff: jena-text/src/test/java/org/apache/jena/query/text/TestDatasetWithConfigurableAnalyzer.java --- @@ -0,0 +1,63 @@ +/** + * Licensed to the Apache

2015-11-05 Thread ajs6f
Github user ajs6f commented on a diff in the pull request: https://github.com/apache/jena/pull/97#discussion_r44015231 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/assembler/ConfigurableAnalyzerAssembler.java --- @@ -0,0 +1,101 @@ +/** + * Licensed to the A

2015-11-05 Thread osma
Github user osma commented on the pull request: https://github.com/apache/jena/pull/97#issuecomment-154092054 Thanks @ajs6f for your detailed notes! I fixed the things you pointed out, but I think the Guava `Sets.newHashSet` pattern in particular could be applied in many other places in the

2015-11-05 Thread ajs6f
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/97#issuecomment-154093794 @osma I am a nitpicker of the worst kind! {grin} If you would like me to make a PR against `jena-text` looking for this kind of thing (using Guava or Java 8 idioms to

2015-11-05 Thread afs
Github user afs commented on the pull request: https://github.com/apache/jena/pull/97#issuecomment-154098822 The test should be "is it clear?". `new HashSet<>()` is common usage.

2015-11-05 Thread ajs6f
Github user ajs6f commented on the pull request: https://github.com/apache/jena/pull/97#issuecomment-154099939 Sure, there's nothing wrong with what's there. All I meant is that `newHashSet` seems as clear as `new HashSet<>()` and allows elements, so is shorter.
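
The two styles being compared can be sketched roughly as follows. This is only an illustration, not code from the pull request: the local `newHashSet` helper is a hypothetical stand-in for Guava's `Sets.newHashSet(E...)` so that the example runs without Guava on the classpath, and the element names are made up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SetConstruction {

    // Hypothetical local stand-in for Guava's Sets.newHashSet(E...),
    // defined here only so the sketch compiles without Guava.
    @SafeVarargs
    static <E> Set<E> newHashSet(E... elements) {
        return new HashSet<>(Arrays.asList(elements));
    }

    // Plain JDK style: create an empty set, then add each element.
    static Set<String> verbose() {
        Set<String> names = new HashSet<>();
        names.add("lowercase");      // made-up element names
        names.add("asciifolding");
        return names;
    }

    // Factory style: the elements are passed at construction time.
    static Set<String> concise() {
        return newHashSet("lowercase", "asciifolding");
    }

    public static void main(String[] args) {
        // Both styles build the same set.
        System.out.println(verbose().equals(concise())); // prints "true"
    }
}
```

Both forms produce equal sets; the factory form mainly saves lines when the elements are known up front, which matches the "is it clear?" test raised above.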

2015-11-06 Thread osma
Github user osma commented on the pull request: https://github.com/apache/jena/pull/97#issuecomment-154353023 @ajs6f : >If you would like me to make a PR against jena-text looking for this kind of thing (using Guava or Java 8 idioms to shorten things up) I'm happy to do that somet

2015-11-06 Thread afs
Github user afs commented on the pull request: https://github.com/apache/jena/pull/97#issuecomment-154376006 Looks good. With documentation modifications to follow? The text index seems to be becoming quite popular.

2015-11-06 Thread osma
Github user osma commented on the pull request: https://github.com/apache/jena/pull/97#issuecomment-154468394 Yes, of course I will also update the jena-text documentation.

2015-11-17 Thread osma
Github user osma commented on the pull request: https://github.com/apache/jena/pull/97#issuecomment-157324689 Rebased on current master and squashed my commits into one, preparing to merge to Apache git

2015-11-17 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/jena/pull/97