[ https://issues.apache.org/jira/browse/SOLR-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865339#action_12865339 ]
Robert Muir commented on SOLR-1865:
-----------------------------------

Hoss Man: it is true that, as bytes, other encodings represent the BOM differently. However, your last statement is the important part: the Reader decodes it to Java characters (UTF-16) for us. So in a String or char context it is always going to be U+FEFF, regardless of which Unicode encoding the file was originally in.

> ignore byte-order markers in SolrResourceLoader
> -----------------------------------------------
>
>                 Key: SOLR-1865
>                 URL: https://issues.apache.org/jira/browse/SOLR-1865
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: SOLR-1865.patch
>
>
> If you create, say, a stopwords list with Windows Notepad or other editors and
> save it as UTF-8, some of these editors will insert a byte-order marker
> (zero-width no-break space) as the first character of the file.
> http://www.lucidimagination.com/search/document/5101871231fc95af/is_this_a_bug_of_the_ressourceloader
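
The point made in the comment above, that a Reader yields U+FEFF no matter which Unicode encoding the bytes were in, can be checked with a small standalone sketch. This is only an illustration and not the code in SOLR-1865.patch: the class name `BomDemo` and the helpers `firstChar` and `readLines` are hypothetical, and it assumes Java 8 or later and the explicit charsets UTF-8, UTF-16BE and UTF-16LE (Java's generic "UTF-16" charset behaves differently, since it consumes the BOM to detect byte order).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Solr.
class BomDemo {
    static final char BOM = '\uFEFF';

    // Decode the bytes through a Reader and return the first char read (or -1 if empty).
    static int firstChar(byte[] data, Charset cs) {
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(data), cs)) {
            return r.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read all lines, dropping a leading U+FEFF from the first line only.
    static List<String> readLines(byte[] data, Charset cs) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), cs))) {
            String line;
            boolean first = true;
            while ((line = r.readLine()) != null) {
                if (first && !line.isEmpty() && line.charAt(0) == BOM) {
                    line = line.substring(1);
                }
                first = false;
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // The same text "the", preceded by a BOM, in three encodings.
        byte[] utf8 = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 't', 'h', 'e'};
        byte[] utf16be = {(byte) 0xFE, (byte) 0xFF, 0, 't', 0, 'h', 0, 'e'};
        byte[] utf16le = {(byte) 0xFF, (byte) 0xFE, 't', 0, 'h', 0, 'e', 0};

        // The byte sequences differ, but the first decoded char is the same each time.
        System.out.println(Integer.toHexString(firstChar(utf8, StandardCharsets.UTF_8)));
        System.out.println(Integer.toHexString(firstChar(utf16be, StandardCharsets.UTF_16BE)));
        System.out.println(Integer.toHexString(firstChar(utf16le, StandardCharsets.UTF_16LE)));

        // With the leading char dropped, the first stopword is read cleanly.
        System.out.println(readLines(utf8, StandardCharsets.UTF_8));
    }
}
```

Because the check is done on the decoded char rather than on raw bytes, a single comparison against U+FEFF on the first line covers all three encodings in this sketch.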