On Sunday, September 2, 2018 at 3:16:30 PM UTC-5, [email protected] wrote:
>
>
> May I ask what sort of non-ASCII in the Excel was forcing encoding of the 
> CSV? The usual Latin-1 mix of multinational characters, emojis, or 
> non-Western scripts?
>

Just the usual Latin-1 mix: The data includes full names of people, some of 
which are Hispanic or French, and have the expected diacritics.

> For *some* use-cases, spreadsheets in the cloud can be better for shared 
> data than passing them around -- Office 365 or better yet Google Sheets.  
> Especially if it's data collection, Google Sheets have data entry forms.  
> (Not appropriate for highly sensitive information, even at the "only people 
> with URL can see" level, of course.)
>

This option was considered, but Google Sheets had intolerable performance 
issues for the larger files. And I had so much hope for V8.
 

>> This solution was acceptable to everyone, including me,
>>
>
> This is somewhat surprising, I didn't expect to see UTF-16 be useful 
> outside of Asian text processing!
>

It was definitely a surprise relief. For my own purposes it was just a 
matter of explicitly setting I/O encodings in the project codebase. It took 
a couple of minutes and yielded zero complaints from the test suite.
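
To illustrate what "explicitly setting I/O encodings" can look like, here is 
a minimal sketch in Python. It is not the project's actual code: the helper 
names read_rows and write_rows and the UTF-16 default are assumptions made 
for the example.

```python
import csv

# Assumption for this sketch: the shared CSV files are UTF-16 with a BOM.
ENCODING = "utf-16"


def read_rows(path, encoding=ENCODING):
    """Read a CSV file, decoding it with an explicit encoding."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, encoding=encoding, newline="") as handle:
        return list(csv.reader(handle))


def write_rows(path, rows, encoding=ENCODING):
    """Write rows to a CSV file, encoding them explicitly."""
    # Python's "utf-16" codec writes a BOM at the start of the file.
    with open(path, "w", encoding=encoding, newline="") as handle:
        csv.writer(handle).writerows(rows)
```

Passing the encoding at every open() call keeps the behaviour independent of 
the locale default on whichever machine runs the code.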
 

>
>> until the first time I tried to ack through one of the new UTF-16-encoded 
>> files.
>>
>
> That you're searching DATA -- that you may have programs processing and so 
> are using Ack to peek into the data while debugging (since we debug the 
> GIGO data as often as we debug code!) -- will add just a little weight to 
> the idea of detecting BOM prefixes and doing the right thing with them 
> (decoding to internal).  BOM detection and processing *looks* simple to 
> _do_ but it is *not* simple to expand the test suite adequately to assure it 
> doesn't interact badly elsewhere.
> // Bill
>
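
For concreteness, the BOM idea can be sketched roughly as below. This is 
Python for illustration only and says nothing about how ack itself is or 
would be implemented; detect_encoding, matching_lines and the override 
parameter are hypothetical names, and falling back to UTF-8 when no BOM is 
present is an assumption.

```python
import codecs

# Check the 4-byte BOMs first: the UTF-32-LE BOM starts with the same two
# bytes as the UTF-16-LE BOM, so order matters.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_encoding(head, default="utf-8"):
    """Return a codec name based on a leading BOM, or the default."""
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return default


def matching_lines(data, needle, override=None):
    """Decode raw bytes, then return the lines containing needle."""
    # An explicit override (hypothetical) wins over BOM detection.
    encoding = override or detect_encoding(data[:4])
    # The codecs chosen above consume the BOM while decoding.
    text = data.decode(encoding, errors="replace")
    return [line for line in text.splitlines() if needle in line]
```

The override argument is the cheap part; the detection branch is where the 
extra test coverage would be needed.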

For what it's worth, I'd happily settle for an explicit encoding flag that 
could be tucked away in a working directory's .ackrc file. Either way, 
thanks for the useful discussion!

--
Richard Simões
Internet
