[ 
https://issues.apache.org/jira/browse/LUCENE-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423551#comment-13423551
 ] 

Steven Rowe commented on LUCENE-3747:
-------------------------------------

There was a source generation problem: the string "Picked up 
JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8" got embedded in two of the 
intermediate generated .jflex-macro files.  If the JAVA_TOOL_OPTIONS 
environment variable is set, the JVM treats its value as if it were 
command-line options and then prints that notice, apparently into the same 
stream that one of the source generation processes captures.

I committed a fix to trunk: 
[r1366231|http://svn.apache.org/viewvc?view=revision&revision=1366231].
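
For anyone who hits the same symptom in another generator, here is a minimal 
sketch of one way to keep the notice out of captured output.  It is not 
necessarily what r1366231 does.  The names strip_jvm_banner, run_generator 
and FAKE_GENERATOR are hypothetical, and the child process is a Python 
stand-in for the real generator, so the example runs without a JVM.

```python
import subprocess
import sys
from typing import List

# Prefix of the notice quoted in the comment above.
BANNER_PREFIX = "Picked up JAVA_TOOL_OPTIONS:"


def strip_jvm_banner(captured: str) -> str:
    """Drop every line that starts with the JVM notice; keep the rest."""
    kept = [
        line
        for line in captured.splitlines(keepends=True)
        if not line.startswith(BANNER_PREFIX)
    ]
    return "".join(kept)


def run_generator(cmd: List[str]) -> str:
    """Run a generator command and return its cleaned standard output.

    stderr is captured separately instead of being merged into stdout,
    and stdout is filtered too in case the notice still ends up there.
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return strip_jvm_banner(result.stdout)


# Hypothetical stand-in for the real generator: writes the notice to
# stderr and a single macro-like line to stdout.
FAKE_GENERATOR = [
    sys.executable,
    "-c",
    "import sys; "
    "sys.stderr.write('Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8\\n'); "
    "sys.stdout.write('ExampleMacro = [a-z]\\n')",
]

if __name__ == "__main__":
    print(run_generator(FAKE_GENERATOR), end="")
```

Filtering on the fixed prefix is deliberately narrow: it only removes lines 
that begin with the notice, so ordinary generated content passes through 
unchanged.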
                
> Support Unicode 6.1.0
> ---------------------
>
>                 Key: LUCENE-3747
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3747
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 3.5, 4.0-ALPHA
>            Reporter: Steven Rowe
>            Priority: Minor
>         Attachments: LUCENE-3747.patch, LUCENE-3747.patch
>
>
> Now that Unicode 6.1.0 has been released, Lucene/Solr should support it.
> JFlex trunk now supports Unicode 6.1.0.
> Tasks include:
> * Upgrade ICU4J to v49 (after it's released, on 2012-03-21, according to 
> http://icu-project.org).
> * Use {{icu}} module tools to regenerate the supplementary character 
> additions to JFlex grammars.
> * Version the JFlex grammars: copy the current implementations to 
> {{*Impl3<X>}}; cause the versioning tokenizer wrappers to instantiate this 
> version when the {{Version}} c-tor param is in the range 3.1 to the version 
> in which these changes are released (excluding the range endpoints); then 
> change the specified Unicode version in the non-versioned JFlex grammars from 
> 6.0 to 6.1.
> * Regenerate JFlex scanners, including {{StandardTokenizerImpl}}, 
> {{UAX29URLEmailTokenizerImpl}}, and {{HTMLStripCharFilter}}.
> * Using {{generateJavaUnicodeWordBreakTest.pl}}, generate and then run 
> {{WordBreakTestUnicode_6_1_0.java}} under 
> {{modules/analysis/common/src/test/org/apache/lucene/analysis/core/}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

