[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-05-11 Thread Steve Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281171#comment-15281171 ] Steve Rowe commented on LUCENE-6993: Hi Mike, my review of your latest patch: * All

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-05-04 Thread Steve Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271376#comment-15271376 ] Steve Rowe commented on LUCENE-6993: Thanks for persisting Mike. I (and other JFle

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-05-04 Thread Mike Drob (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271363#comment-15271363 ] Mike Drob commented on LUCENE-6993: [~steve_rowe] - I see no movement coming from the

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-04-08 Thread Steve Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232236#comment-15232236 ] Steve Rowe commented on LUCENE-6993: Hi Mike, sorry I haven't had the bandwidth to en

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-04-08 Thread Mike Drob (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232225#comment-15232225 ] Mike Drob commented on LUCENE-6993: [~steve_rowe] - I pinged the JFlex list about gett

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-03-22 Thread Mike Drob (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15207199#comment-15207199 ] Mike Drob commented on LUCENE-6993: Any updates here? I'm not sure if there is anythin

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-03-14 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193824#comment-15193824 ] Robert Muir commented on LUCENE-6993: My mistake. Thanks for the reminder. I have be

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-03-14 Thread Mike Drob (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193818#comment-15193818 ] Mike Drob commented on LUCENE-6993: [~rcmuir] - did you get a chance to look at this?

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-03-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181701#comment-15181701 ] Robert Muir commented on LUCENE-6993: OK, thanks for the work! I will try to review

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-03-04 Thread Mike Drob (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15180171#comment-15180171 ] Mike Drob commented on LUCENE-6993: There's a clean-jflex-legacy target that takes car

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-03-04 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15180161#comment-15180161 ] Robert Muir commented on LUCENE-6993: Sorry, that just affects the cleaning part. Bu

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-03-04 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15180146#comment-15180146 ] Robert Muir commented on LUCENE-6993: I don't understand this change: {code} -

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-03-03 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179311#comment-15179311 ] Robert Muir commented on LUCENE-6993: If that test really took 50 minutes, there may

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-03-03 Thread Steve Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178886#comment-15178886 ] Steve Rowe commented on LUCENE-6993: Mike, can you please exclude generated files fro

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-03-03 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178765#comment-15178765 ] Robert Muir commented on LUCENE-6993: {quote} Had issues with TestUAX29URLEmailToken

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-03-02 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176520#comment-15176520 ] Robert Muir commented on LUCENE-6993: I wouldn't change any of the ClassicTokenizer r

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-29 Thread Mike Drob (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172324#comment-15172324 ] Mike Drob commented on LUCENE-6993: I think I am getting to a good place here, just a

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-26 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15169658#comment-15169658 ] Robert Muir commented on LUCENE-6993: Yeah, it's tricky. I kinda view classictokenizer

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-26 Thread Steve Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15169638#comment-15169638 ] Steve Rowe commented on LUCENE-6993: {{ClassicTokenizer}} does have direct Unicode ve

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-26 Thread Mike Drob (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15169499#comment-15169499 ] Mike Drob commented on LUCENE-6993: Yea, Uwe understood my question. I wasn't planning

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-26 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15169481#comment-15169481 ] Uwe Schindler commented on LUCENE-6993: bq. Uwe Schindler has written that he stil

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-26 Thread Steve Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168956#comment-15168956 ] Steve Rowe commented on LUCENE-6993: bq. Steve, did you get a chance to look at the b

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-25 Thread Mike Drob (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167852#comment-15167852 ] Mike Drob commented on LUCENE-6993: Steve, did you get a chance to look at the buffer

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-19 Thread Steve Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15155160#comment-15155160 ] Steve Rowe commented on LUCENE-6993: Yeah, the generated code underwent some changes

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-19 Thread Mike Drob (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15155154#comment-15155154 ] Mike Drob commented on LUCENE-6993: Using a newer version of JFlex breaks our existing m

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-19 Thread Steve Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15155062#comment-15155062 ] Steve Rowe commented on LUCENE-6993: +1 > Update UAX29URLEmailTokenizer TLDs to late

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-19 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15155035#comment-15155035 ] Robert Muir commented on LUCENE-6993: I think we should be OK. As far as I understan

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-19 Thread Mike Drob (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154949#comment-15154949 ] Mike Drob commented on LUCENE-6993: bq. I think we need to regenerate still, because t

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-19 Thread Mike Drob (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154936#comment-15154936 ] Mike Drob commented on LUCENE-6993: Question about what is proper behaviour in terms o

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-19 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154933#comment-15154933 ] Robert Muir commented on LUCENE-6993: I think we need to regenerate still, because t

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-18 Thread Steve Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153306#comment-15153306 ] Steve Rowe commented on LUCENE-6993: I think you're right, Mike, I don't see any defa

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-18 Thread Mike Drob (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153258#comment-15153258 ] Mike Drob commented on LUCENE-6993: Looking at http://unicode.org/reports/tr29/#Modifi

[jira] [Commented] (LUCENE-6993) Update UAX29URLEmailTokenizer TLDs to latest list, and upgrade all JFlex-based tokenizers to support Unicode 8.0

2016-02-18 Thread Steve Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153209#comment-15153209 ] Steve Rowe commented on LUCENE-6993: [~mdrob], I haven't looked at your patch yet but