On Sunday, September 2, 2018 at 3:16:30 PM UTC-5, [email protected] wrote:
> May I ask what sort of non-ASCII in the Excel was forcing encoding of the
> CSV? The usual Latin-1 mix of multinational characters, emojis, or
> non-Western scripts?

Just the usual Latin-1 mix: the data includes full names of people, some of which are Hispanic or French and so have the expected diacritics.

> For *some* use-cases, spreadsheets in the cloud can be better for shared
> data than passing them around -- Office 365 or better yet Google Sheets.
> Especially if it's data collection, Google Sheets have data entry forms.
> (Not appropriate for highly sensitive information, even at the "only people
> with URL can see" level, of course.)

This option was considered, but Google Sheets had intolerable performance issues for the larger files. And I had so much hope for V8.

>> This solution was acceptable to everyone, including me,
>
> This is somewhat surprising, I didn't expect to see UTF-16 be useful
> outside of Asian text processing!

It was definitely a surprise, and a relief. For my own purposes it was just a matter of explicitly setting I/O encodings in the project codebase. It took a couple of minutes and yielded zero complaints from the test suite.

>> until the first time I tried to ack through one of the new UTF-16-encoded
>> files.
>
> That you're searching DATA -- that you may have programs processing and so
> are using Ack to peek into the data while debugging (since we debug the
> GIGO data as often as we debug code!) -- will add just a little weight to
> the idea of detecting BOM prefixes and doing the right thing with them
> (decoding to internal). BOM detection and processing *looks* simple to
> _do_, but it is *not* simple to expand the test suite adequately to assure
> it doesn't interact badly elsewhere.
> // Bill

For what it's worth, I'd happily settle for an explicit encoding flag that could be tucked away in a working directory's .ackrc file. Either way, thanks for the useful discussion!

--
Richard Simões
Internet
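
For anyone following the thread, the "detect the BOM and decode to internal" idea discussed above can be sketched roughly as follows. This is only an illustration in Python, not how ack works or would implement it; the names `decode_with_bom` and `grep_lines` are hypothetical, and the Latin-1 fallback for files without a BOM is an assumption made for the example.

```python
import codecs

# Check longer BOMs first: the UTF-32 LE BOM begins with the same two
# bytes as the UTF-16 LE BOM. (A UTF-16 LE file whose first character is
# U+0000 would be misread as UTF-32 LE; this sketch ignores that case.)
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_with_bom(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode raw bytes using the BOM if present, else the fallback encoding."""
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            # Strip the BOM so it does not appear in the decoded text.
            return raw[len(bom):].decode(encoding)
    # No BOM found: the fallback is an assumption, not a detection.
    return raw.decode(fallback)


def grep_lines(raw: bytes, needle: str, fallback: str = "latin-1") -> list[str]:
    """Return the decoded lines that contain needle (a plain substring match)."""
    text = decode_with_bom(raw, fallback)
    return [line for line in text.splitlines() if needle in line]
```

With something like this, a search for `Simões` would match a UTF-16 export without any per-file configuration, while an explicit encoding setting would still be needed for files that carry no BOM at all.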
