LowercaseFilter, preserveOriginal?

2015-01-27 Thread Clemens Wyss DEV
Why does the LowercaseFilter, as opposed to the ASCIIFoldingFilter, have no preserveOriginal argument? I very much like preserveOriginal="true" when applying the ASCIIFoldingFilter for (German) suggestions

AW: LowercaseFilter, preserveOriginal?

2015-01-27 Thread Clemens Wyss DEV
> I very much like preserveOriginal="true" when applying the ASCIIFoldingFilter for (German) suggestions Must revise my statement, as I just noticed that the original token is just appended to the stream/token, e.g. "chamaleon chamäeleon". And suggest returns the two, whereas I'd like to have the origi
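The behaviour described in this message (the folded token and the original both ending up in the stream) can be illustrated with a small plain-Java sketch. It does not use Lucene: `FoldingSketch`, `fold` and `foldPreservingOriginal` are hypothetical names, and the folding is only approximated with `java.text.Normalizer`, which covers far fewer characters than ASCIIFoldingFilter. The emit order (folded form first, original second) simply mirrors the example in the message.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is NOT the Lucene implementation.
public class FoldingSketch {

    // Rough stand-in for ASCII folding: decompose the token (NFD),
    // then strip the combining marks left behind by the decomposition.
    static String fold(String token) {
        String decomposed = Normalizer.normalize(token, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    // Emits the folded form of every token and, when folding changed
    // the token, the original right after it.
    static List<String> foldPreservingOriginal(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String folded = fold(token);
            out.add(folded);
            if (!folded.equals(token)) {
                out.add(token); // original kept next to the folded form
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "cham\u00e4leon" is "chamäleon" written with a Unicode escape.
        System.out.println(
            foldPreservingOriginal(Arrays.asList("cham\u00e4leon", "zoo")));
    }
}
```

Because both forms are emitted, anything built on top of the stream (such as a suggester) sees two candidate terms for one input word, which matches the observation that suggest returns both.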

Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)

2015-01-27 Thread kiwi clive
Hello guys, We currently run with Lucene 3.6 and Java6. In view of the fact that Java7 is soon to be deprecated, we are keen to move to Java8 and also to move to the latest version of Lucene. I understand Lucene 5 is coming although we are happy to move to 4.x as there are lots of goodies there

Re: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)

2015-01-27 Thread Chris Hostetter
: I seem to remember reading that certain versions of lucene were incompatible with some java versions although I cannot find anything to verify this. As we have tens of thousands of large indexes, backwards compatibility without the need to reindex on an upgrade is of prime importance

Re: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)

2015-01-27 Thread kiwi clive
Hi Hoss, Many thanks for the information. This looks very encouraging as the Java7 bug I remember was fixed and as far as I know, we should not be affected by the others. I'll put a few tests together and put my toe in the water :-) Clive From: Chris Hostetter To: "java-user@lucene.apac

RE: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)

2015-01-27 Thread Uwe Schindler
Java 8 update 20 or later is also fine. At the current time, always use the latest update release and you will be fine with Java 7 and Java 8. Don't use older releases and don't use the G1 garbage collector. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.

RE: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)

2015-01-27 Thread McKinley, James T
Why do you say not to use G1GC? We are using Java 7 & G1GC with Lucene 4.8.1 in production. Thanks. Jim From: Uwe Schindler [u...@thetaphi.de] Sent: Tuesday, January 27, 2015 2:49 PM To: java-user@lucene.apache.org; 'kiwi clive' Subject: RE: Lucene Versi

RE: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)

2015-01-27 Thread Uwe Schindler
Hi, About G1GC: We consistently see problems when running the Lucene test suite with G1GC enabled. The people from Elasticsearch concluded: "There is a newer GC called the Garbage First GC (G1GC). This newer GC is designed to minimize pausing even more than CMS, and operate on large heaps. It
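Before acting on the G1GC advice in this thread, it can help to confirm which collectors a given JVM is actually running. A minimal sketch using the standard `java.lang.management` API follows. `GcCheckSketch` and its methods are hypothetical names, and the check assumes HotSpot's naming, where the G1 collector beans are called "G1 Young Generation" and "G1 Old Generation"; other JVMs may name them differently.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; relies on HotSpot's collector naming (an assumption).
public class GcCheckSketch {

    // HotSpot reports G1 as "G1 Young Generation" / "G1 Old Generation".
    static boolean isG1CollectorName(String name) {
        return name.startsWith("G1 ");
    }

    // Names of the collectors registered in the running JVM.
    static List<String> activeCollectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean
                : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    // True if any registered collector looks like G1.
    static boolean g1InUse() {
        for (String name : activeCollectorNames()) {
            if (isG1CollectorName(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("Collectors: " + activeCollectorNames());
        System.out.println("G1 in use: " + g1InUse());
    }
}
```

Running this inside the same JVM that hosts the index (for example from a startup hook) reports the collectors selected by the launch flags, so a G1 configuration can be spotted without inspecting the command line.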

RE: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)

2015-01-27 Thread McKinley, James T
Hi Uwe, OK, thanks for the info. We'll see if we can download the Lucene test suite and check it out. FWIW, we use G1GC in our production runtime (~70 12-16 core Cisco UCS and HP Gen7/Gen8 nodes with 20+ GB heaps using Java 7 and Lucene 4.8.1 with pairs of 30 index partitions with 15M-23M d

Re: AW: LowercaseFilter, preserveOriginal?

2015-01-27 Thread Ahmet Arslan
Hi Clemens, Please see: https://issues.apache.org/jira/browse/LUCENE-5620 Ahmet On Tuesday, January 27, 2015 10:56 AM, Clemens Wyss DEV wrote: > I very much like preserveOriginal="true" when applying the ASCIIFoldingFilter for (German) suggestions Must revise my statement, as I just noticed tha

Can we configure analyzers to not exclude specific characters

2015-01-27 Thread Shivashankar Maddanimath
Hi, I am using Lucene standard and UAX29URLEmailTokenizer. These analysers are excluding some characters like "+" (I can't search for "C++"). Is there any way we can configure analyzers to include specific characters while tokenising? Regards, Shiv -Original Message- From: "
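The question above comes down to which characters the tokenizer treats as part of a token. A plain-Java sketch of that idea follows; it is not Lucene code, and `ExtraCharTokenizerSketch`, `isTokenChar` and `tokenize` are hypothetical names chosen for illustration. In Lucene itself a similar rule could probably be expressed by subclassing `CharTokenizer` and overriding `isTokenChar(int)`, but that route is an assumption here and is not taken from the thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "keep these extra characters in tokens".
public class ExtraCharTokenizerSketch {

    // A character belongs to a token if it is a letter or digit,
    // or if it appears in the caller-supplied extraChars string.
    static boolean isTokenChar(int c, String extraChars) {
        return Character.isLetterOrDigit(c) || extraChars.indexOf(c) >= 0;
    }

    // Splits text on every character that is not a token character.
    static List<String> tokenize(String text, String extraChars) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int c = text.codePointAt(i);
            if (isTokenChar(c, extraChars)) {
                current.appendCodePoint(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString()); // token boundary reached
                current.setLength(0);
            }
            i += Character.charCount(c);
        }
        if (current.length() > 0) {
            tokens.add(current.toString()); // flush the trailing token
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("search C++ and C#", "+#"));
        System.out.println(tokenize("search C++ and C#", ""));
    }
}
```

With "+" and "#" allowed, "C++" and "C#" survive as tokens; with an empty extra set both collapse to "C", which is the effect described in the question. The same rule would have to be applied at both index time and query time for searches to match.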