[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mike Drob updated LUCENE-6993:
------------------------------
    Attachment: LUCENE-6993.patch

bq. Well this test is already marked @Slow and just took 41.2s on my machine. Were you seeing stuff like that? As far as i know from the original issue, there were tests for this bug that would basically never finish at all.

I left it to run and came back later and saw that it took 50 minutes. But it passed. 40 seconds on your machine sounds great, I won't worry about it, thanks.

bq. Mike, can you please exclude generated files from your patch? The patches here are way big, and reviewers/committers will want to regenerate anyway.

Sure, this makes sense. Steps to generate everything:
{code}
#!/usr/bin/env bash
pushd lucene/analysis/common
ANT_OPTS="-Xmx5g" ant gen-tlds jflex
ant jflex-legacy # For some reason this needs to be run separately from the jflex command. I could never figure out why.
pushd src/test/org/apache/lucene/analysis/standard
rm WordBreakTestUnicode_6_3_0.java
perl generateJavaUnicodeWordBreakTest.pl -v 8.0.0
popd
popd
{code}

> Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6993
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6993
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Mike Drob
>            Assignee: Robert Muir
>             Fix For: 6.0
>
>         Attachments: LUCENE-6993.patch, LUCENE-6993.patch, LUCENE-6993.patch, LUCENE-6993.patch, LUCENE-6993.patch, LUCENE-6993.patch, LUCENE-6993.patch
>
>
> We did this once before in LUCENE-5357, but it might be time to update the list of TLDs again. Comparing our old list with a new list indicates 800+ new domains, so it would be nice to include them.
> Also the JFlex tokenizer grammars should be upgraded to support Unicode 8.0.