I agree that the number of encodings makes a foolproof transparent solution impossible to implement.
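For the UTF-8 case alone, though, the on-the-fly BOM recognition that people keep reimplementing is only a few lines. Here is a minimal sketch in Java, not a definitive implementation: `BomAwareReader`, `decodeUtf8` and `readUtf8File` are names I made up for illustration, and it assumes the file is small enough to read fully into memory. It relies on the fact that the JDK's UTF-8 decoder does not drop a leading BOM but passes it through as the character U+FEFF.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only; not part of the JDK or Clojure.
final class BomAwareReader {
    // The three bytes that encode U+FEFF (the byte order mark) in UTF-8.
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    private BomAwareReader() {}

    /** Decodes bytes as UTF-8, dropping a leading BOM if one is present. */
    static String decodeUtf8(byte[] bytes) {
        int offset = startsWithBom(bytes) ? UTF8_BOM.length : 0;
        return new String(bytes, offset, bytes.length - offset, StandardCharsets.UTF_8);
    }

    /** Reads a whole file as UTF-8 text, whether or not it starts with a BOM. */
    static String readUtf8File(Path path) throws IOException {
        // Assumption: the file fits comfortably in memory.
        return decodeUtf8(Files.readAllBytes(path));
    }

    private static boolean startsWithBom(byte[] bytes) {
        if (bytes.length < UTF8_BOM.length) {
            return false;
        }
        for (int i = 0; i < UTF8_BOM.length; i++) {
            if (bytes[i] != UTF8_BOM[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With this, `decodeUtf8` gives the same string for the bytes of "hi" with and without the EF BB BF prefix. UTF-16 and every other encoding are deliberately left out, which is exactly where the general problem gets hard.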
I still think that some simpler text file handling should exist out of the box on the JVM to read UTF files. UTF-8 is kind of natural within the JVM. Exposing all this BOM machinery every time you need to read a text file is a pain. Either implement BOM recognition on the fly or make the BOM mandatory in UTF-8 files everywhere. The BOM is required for UTF-16 and above as far as I know. The time spent on stupid issues like this one must be significant given the number of people struggling with this...

Sent from my iPhone

> On Jul 13, 2015, at 18:46, Sungjin Chun <chu...@castlesoft.co.kr> wrote:
>
> Assuming the charset is the same, even in this case there are many types of
> encoding schemes for it, and for portability you have to consider both input
> and output encoding. On Mac OS X or Linux, this is controlled by the locale
> system; on Windows you can either force the encoding system using the control
> panel or change your encoding before output to the console. Here in Korea, we
> do this stuff for internationalized application development. Of course, you
> have to use the correct charset for an i18n application :-)
>
>> On Mon, Jul 13, 2015 at 11:56 PM Luc Préfontaine
>> <lprefonta...@softaddicts.ca> wrote:
>> I cannot remember the details but in 2010 I had a similar problem in a
>> cross-platform project using Clojure. And problems earlier in another
>> cross-platform/cross-language project.
>>
>> So it's the reverse way, no BOM at all...
>>
>> Can't believe we are in 2015 still struggling with character set issues.
>> Having to think about this when saving a file in Notepad... That's
>> depressing.
>> No wonder I now stay away from Windows as much as possible.
>>
>> I can't understand why we cannot get some transparent behavior from the Java
>> runtime.
>> These are human readable text files. Not some unreadable binary format.
>> Googled a bit about this and numerous people face this problem reading
>> Windows generated files.
>> They all ended up having to skip the BOM if present when reading the file.
>>
>> So much for portability. Yuck.
>>
>> > On Mon, Jul 13, 2015 at 2:52 PM, Luc Préfontaine <
>> > lprefonta...@softaddicts.ca> wrote:
>> >
>> > > BG is right on it. I hit this problem a decade ago (roughly :)).
>> > > UTF-8 files with no BOM are not handled properly on Windows.
>> > > It assumes that they are ASCII coded. That works partially (both
>> > > character sets have the same encoding for many characters) but
>> > > eventually fails.
>> > >
>> > > Make sure that the files have a BOM. You can do this on a per file basis
>> > > using an IDE (Eclipse, ...), or you can use bash scripts to do this if
>> > > you have access to a u*x environment.
>> > > I did not find an equivalent native Windows tool but there might be some
>> > > to do this in batch.
>> > >
>> > > Luc P.
>> > >
>> >
>> > Clojure source files are expected to be in UTF-8 and Clojure on Windows
>> > doesn't require a BOM.
>> >
>> > In fact, Clojure files must not contain a BOM because it isn't considered
>> > to be whitespace by the Clojure parser and will cause the error "Unable to
>> > resolve symbol: ? in this context".
>> >
>> > Some software, such as Windows Notepad, uses the presence of a BOM to detect
>> > UTF-8, but that can be overridden in the File | Open dialog. Other than
>> > that, the behaviour of the BOM on Clojure between Linux and Windows should
>> > be the same - this stuff is all handled by Java code in the JDK - not by
>> > the Windows platform.
>> >
>> > --
>> > You received this message because you are subscribed to the Google
>> > Groups "Clojure" group.
>> > To post to this group, send email to clojure@googlegroups.com
>> > Note that posts from new members are moderated - please be patient with
>> > your first post.
>> > To unsubscribe from this group, send email to
>> > clojure+unsubscr...@googlegroups.com
>> > For more options, visit this group at
>> > http://groups.google.com/group/clojure?hl=en
>> > ---
>> > You received this message because you are subscribed to the Google Groups
>> > "Clojure" group.
>> > To unsubscribe from this group and stop receiving emails from it, send an
>> > email to clojure+unsubscr...@googlegroups.com.
>> > For more options, visit https://groups.google.com/d/optout.
>> >
>> --
>> Luc Préfontaine<lprefonta...@softaddicts.ca> sent by ibisMail!