[
https://issues.apache.org/jira/browse/LUCENE-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805034#action_12805034
]
Stanislaw Osinski commented on LUCENE-2221:
---
I ran the benchmark on a 6
[
https://issues.apache.org/jira/browse/LUCENE-871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521344
]
Stanislaw Osinski commented on LUCENE-871:
--
I've just quickly decompiled the ISOLatin1AccentFilter.
[
https://issues.apache.org/jira/browse/LUCENE-871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521089
]
Stanislaw Osinski commented on LUCENE-871:
--
One possible (and probably large) speed up for this code would
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518636
]
Stanislaw Osinski commented on LUCENE-966:
--
> * I removed StandardAnalyzer.html "grammar doc"
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518521
]
Stanislaw Osinski commented on LUCENE-966:
--
Absolutely -- the header was there only because I used Carrot2
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12517577
]
Stanislaw Osinski commented on LUCENE-966:
--
Good news -- thanks for the test!
> A faster JFlex-ba
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12517515
]
Stanislaw Osinski commented on LUCENE-966:
--
This time I used Tortoise, but it made things worse :) I
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stanislaw Osinski updated LUCENE-966:
-
Attachment: jflex-analyzer-r562378-patch-nodup.txt
One more try -- this time without
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stanislaw Osinski updated LUCENE-966:
-
Attachment: jflex-analyzer-r562378-patch.txt
I've attached another patch
>
> > Mark -- have you tried the jflex-analyzer-r560135-patch.txt patch with
> your wikipedia diff test? That's the early one whose grammar was "dot for
> dot" translated from the original JavaCC spec -- for further patches I did
> some "optimizations", which seem to have broken the compatibility..
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12517371
]
Stanislaw Osinski commented on LUCENE-966:
--
O -- only now I realized I made a really silly mistake
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12517233
]
Stanislaw Osinski commented on LUCENE-966:
--
To be precise -- I'm not 100% sure that this is a bug in J
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12517184
]
Stanislaw Osinski commented on LUCENE-966:
--
Thanks for more test cases. I guess the biggest problem here is
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stanislaw Osinski updated LUCENE-966:
-
Attachment: jflex-analyzer-r561693-compatibility.txt
A patch for better compatibility
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516893
]
Stanislaw Osinski commented on LUCENE-966:
--
When digging deeper into the issues of compatibility with the
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516762
]
Stanislaw Osinski commented on LUCENE-966:
--
Thanks for spotting the differences, I'll add them to the
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stanislaw Osinski updated LUCENE-966:
-
Attachment: jflex-analyzer-r561292-patch.txt
Here is another (this time let's ca
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stanislaw Osinski updated LUCENE-966:
-
Attachment: jflex-analyzer-r560135-patch.txt
Here's another patch (against r5
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stanislaw Osinski updated LUCENE-966:
-
Attachment: AnalyzerBenchmark.java
Here is a very simple benchmark I used to test the
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stanislaw Osinski updated LUCENE-966:
-
Attachment: jflex-analyzer-patch.txt
Here comes a somewhat rough (needing refactorings
Components: Analysis
Reporter: Stanislaw Osinski
Fix For: 2.3
JFlex (http://www.jflex.de/) can be used to generate a faster (up to several
times) replacement for StandardAnalyzer. Will add a patch and simple
benchmark code in a while.
--
Here's a paper on how Lingo performs compared to e.g. STC:
http://citeseer.ist.psu.edu/osinski04conceptual.html
Cheers,
Staszek
--
Stanislaw Osinski, [EMAIL PROTECTED]
http://www.carrot-search.com
---
Lucene.
Please let me know if you need more details or would like to set up a
demo with your data source.
Staszek
--
Stanislaw Osinski, [EMAIL PROTECTED]
http://www.carrot-search.com