OK, thank you for your reply! In the meantime I figured out why this worked without errors in the first version of my code.
There I had some regex checks before saving each row into the database. That means the first row always got skipped, because the Unicode identifier bytes didn't match the regex.

Now I know where my mistake is, but I don't really know how to solve it. If the source csv is in utf-8 I can of course strip the first three chars. But if it is in another encoding, I would strip off chars that I need.

How can I check which encoding the file has? I tried this, but it always gives me CP850 as the encoding:

    file = File.open("my.csv")
    puts file.external_encoding.name

Or is there a way to transform a file before uploading? I use file.temp for uploading.

Cheers,
Sebastian

On 4 Jul., 18:31, Walter Lee Davis <wa...@wdstudio.com> wrote:
> Unicode uses them to indicate to the application reading the text file
> which order the following bytes are in. Since UTF-8 uses compound
> characters to indicate the scary-high end of the unicode character
> table (two bytes needed to encode some characters) the order that the
> bits arrived in is of critical importance. Text files may be
> little-endian or big-endian, and unless you know what order to expect,
> you can't really know.
>
> Walter
>
> On Jul 4, 2011, at 3:02 AM, Sebastian wrote:
>
> > Thank you for your reply!
> >
> > Stripping the first chars is possible of course, but I don't
> > understand why these chars are there.
> >
> > It was working before! I could just upload the utf-8 csv and everything
> > was working great before. I don't really know what I changed that now
> > these chars are appearing.
> >
> > Sebastian
> >
> > On 1 Jul., 15:12, Frederick Cheung <frederick.che...@gmail.com> wrote:
> >> On Jul 1, 11:48 am, Sebastian <sebastian.go...@googlemail.com> wrote:
> >
> >>> OK,
> >
> >>> it was working perfectly when I just made sure that my csv file is
> >>> in utf-8 encoding format.
> >
> >>> I deleted some of my program, so I had to write a lot of stuff
> >>> again.
> >
> >>> If I now upload a csv file which is in utf-8 format, the first
> >>> three characters of the first row are always: \xEF\xBB\xBF
> >
> >> That's a UTF BOM: a magic unicode character that tells whoever is
> >> reading the stream what the endianness is and also allows telling
> >> UTF-8 apart from UTF-16.
> >> You can safely strip them from the file.
> >
> >>> I read that this is something about unicode and ordering, but I
> >>> don't know where these hex chars come from.
> >
> >>> Also every German special character is shown in this hex code,
> >>> e.g. "k\xC3\xBChler" should be "kühler"
> >
> >> That is probably just an output thing if you are seeing this in a
> >> terminal window: \xC3\xBC is the UTF-8 sequence for ü
> >
> >> Fred
> >
> >>> If I use files in other encodings these three chars are not at
> >>> the beginning, but every special char is "?"
> >
> >>> Has anyone an idea where this comes from?
> >
> >>> Cheers,
> >>> Sebastian
> >
> >>> On 22 Jun., 13:26, Sebastian <sebastian.go...@googlemail.com> wrote:
> >
> >>>> file.temp is an object. I have a form where a csv can be
> >>>> uploaded, but it is never stored. That's why I use tempfile.
> >>>> That means that I probably have no path to use in that method.
> >
> >>>> BUT, the open and foreach methods of the CSV class work with an
> >>>> object whenever I don't have a German special character in my
> >>>> csv file or when my csv file is already in utf-8 encoding format.
> >
> >>>> On 22 Jun., 12:05, Chirag Singhal <chirag.sing...@gmail.com> wrote:
> >
> >>>>> What does file.tempfile return?
> >>>>> If it is a file object, then we have a problem, we need to pass
> >>>>> in file path here.
> >>>>> So call path on the file object and pass that as the first
> >>>>> argument.
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "Ruby on Rails: Talk" group.
> > To post to this group, send email to rubyonrails-
> > t...@googlegroups.com.
> > To unsubscribe from this group, send email to
> > rubyonrails-talk+unsubscr...@googlegroups.com.
> > For more options, visit this group at
> > http://groups.google.com/group/rubyonrails-talk?hl=en.
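
On the encoding question at the top of this message: as far as I know, `external_encoding` does not look at the file's contents. It reports the encoding the file was opened with, or Ruby's default external encoding, which comes from the system locale (hence CP850 on a Windows console). Ruby's core library has no encoding detector, so one possible workaround is to read the upload as raw bytes, strip the UTF-8 BOM only if it is really there, keep the text if it is valid UTF-8, and otherwise transcode from an assumed fallback encoding. In UTF-8 the BOM carries no byte-order information and only acts as a signature, so stripping it loses nothing. The sketch below shows the idea; the helper name `normalize_csv_text` and the ISO-8859-1 fallback are assumptions for illustration, not something from this thread.

```ruby
# Hypothetical helper (not part of Rails or the CSV library).
# The three bytes of the UTF-8 byte order mark, as a binary string.
UTF8_BOM = "\xEF\xBB\xBF".b.freeze

# Returns the given raw bytes as a UTF-8 string.
# fallback is an ASSUMPTION about what non-UTF-8 uploads use; adjust it
# to whatever your users' spreadsheet software actually exports.
def normalize_csv_text(raw, fallback = "ISO-8859-1")
  # Work on a binary copy so no encoding check gets in the way.
  bytes = raw.dup.force_encoding(Encoding::BINARY)

  # Strip the BOM only when it is actually present.
  bytes = bytes.byteslice(3..-1) if bytes.start_with?(UTF8_BOM)

  # If the bytes form valid UTF-8, just label them as such.
  text = bytes.force_encoding(Encoding::UTF_8)
  return text if text.valid_encoding?

  # Otherwise treat them as the fallback encoding and transcode.
  text.force_encoding(fallback).encode(Encoding::UTF_8)
end
```

Usage would look roughly like `text = normalize_csv_text(File.binread(path))` followed by `CSV.parse(text)`, where `path` is the path of the uploaded tempfile. A file that is neither UTF-8 nor the fallback encoding will still come out wrong, so the fallback has to match what the uploads really contain.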