> On 20 Jul 2021, at 12:11, Guillermo Polito <guillermopol...@gmail.com> wrote:
>
>
>
>> El 20 jul 2021, a las 11:45, Sven Van Caekenberghe <s...@stfx.eu> escribió:
>>
>>
>>
>>> On 20 Jul 2021, at 11:03, Sven Van Caekenberghe <s...@stfx.eu> wrote:
>>>
>>> Hi Tim,
>>>
>>> An introduction to this part of the system is in
>>> https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/Zinc-Encoding-Meta/Zinc-Encoding-Meta.html
>>> [Character Encoding and Resource Meta Description] from the "Enterprise
>>> Pharo" book.
>>>
>>> The error means that the file you are trying to read as UTF-8 contains
>>> byte sequences that are invalid with respect to the UTF-8 standard.
>>>
>>> Are you sure the file is in UTF-8? Maybe it is in ASCII, Latin-1 or
>>> something else.
>>>
>>> It is possible to set the encoding to something other than the default
>>> UTF-8. For non-UTF encoders, there is a strict/lenient option to
>>> disallow/allow illegal bytes (in lenient mode they end up in your strings).
>>>
>>> I can show you how to do that if you want.
>>
>> "Default: decode as UTF-8."
>> '/var/log/system.log' asFileReference readStreamDo: [ :in | in upToEnd ].
>>
>> "Decode as ASCII instead."
>> '/var/log/system.log' asFileReference binaryReadStreamDo: [ :in |
>>     (ZnCharacterReadStream on: in encoding: #ascii) upToEnd ].
>>
>> "Decode as ASCII with a lenient encoder, letting illegal bytes through."
>> '/var/log/system.log' asFileReference binaryReadStreamDo: [ :in |
>>     (ZnCharacterReadStream on: in encoding: ZnCharacterEncoder ascii
>>         beLenient) upToEnd ].
>
> There is also readStreamEncoded:[do:], which is a bit more concise but does
> the same :)
Yes indeed !
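
For the archives, here is a minimal playground sketch of that shorter form. The scratch file name and the choice of 'latin1' are assumptions made only for this illustration; the bytes are chosen so that byte 233 is followed by a line feed, which is not a valid UTF-8 continuation byte.

```smalltalk
"Sketch only: the file name and the 'latin1' encoding are assumptions."
| file text |
file := FileLocator temp / 'encoding-demo.log'.
file ensureDelete.

"Write 'caf', then byte 233 (Latin-1 e-acute), then LF.
 Read as UTF-8, byte 233 announces a multi-byte sequence that never comes."
file binaryWriteStreamDo: [ :out |
	out nextPutAll: #[99 97 102 233 10] ].

"Decode with an explicit encoding; the stream is closed when the block returns."
text := file readStreamEncoded: 'latin1' do: [ :in | in upToEnd ].
```

The block's value comes back as the result, so text holds the decoded contents of the file.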
>> HTH
>>
>>> Sven
>>>
>>>> On 20 Jul 2021, at 10:31, Tim Mackinnon <tim@testit.works> wrote:
>>>>
>>>> Hi - I’m doing a bit of log file processing with Pharo - and I’ve hit an
>>>> unexpected error and am wondering what the best way to approach it is.
>>>>
>>>> It seems that I have a log file that has unexpected characters, and so my
>>>> readStream loop that reads lines gets an error: "ZnInvalidUTF8: Illegal
>>>> continuation byte for utf-8 encoding".
>>>>
>>>> For some reason this file (unlike my others) seems to contain characters
>>>> that it shouldn’t, but what is the best way for me to continue
>>>> processing? Should I be opening my files in a different way, or can I
>>>> resume the error somehow? I’m not familiar with this area of Pharo and am
>>>> after a bit of advice.
>>>>
>>>> My code is like this (and I get the error when doing nextLine)
>>>>
>>>>
>>>> parseStream: aFileStream with: aBlock
>>>>     | line items |
>>>>     [ (line := aFileStream nextLine) isNil ]
>>>>         whileFalse: [
>>>>             items := $/ split: line.
>>>>             items size = 3 ifTrue: [ aBlock value: items ] ]
>>>>
>>>> My stream is created like this:
>>>>
>>>> firmEfs := (pathName , '/' , firmName , '_files') asFileReference.
>>>> details parseStream: firmEfs readStream.
>>>>
>>>>
>>>> Should I be opening the stream a bit differently - or can I catch that
>>>> encoding error and resume it with some safe character?
>>>>
>>>> Thanks for any help.
>>>>
>>>> Tim
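
Applied to the line loop from the original question, opening the stream with an explicit encoding might look roughly like the sketch below. It is a self-contained playground version, not the actual application code: the scratch file, its contents, the found collection and the 'latin1' encoding are all assumptions for illustration. The real encoding of the log file would have to be determined first.

```smalltalk
"Sketch only: file name, sample bytes, 'latin1' and found are assumptions."
| file found |
file := FileLocator temp / 'firm-files-demo.log'.
file ensureDelete.

"Three lines: 'a/b/c', 'x/<byte 233>/z' and 'skip', each ended by LF."
file binaryWriteStreamDo: [ :out |
	out nextPutAll: #[97 47 98 47 99 10 120 47 233 47 122 10 115 107 105 112 10] ].

found := OrderedCollection new.
file readStreamEncoded: 'latin1' do: [ :in |
	| line items |
	"Same loop as in the question: nextLine answers nil at end of stream."
	[ (line := in nextLine) isNil ] whileFalse: [
		items := $/ split: line.
		items size = 3 ifTrue: [ found add: items ] ] ].
```

With a byte encoding such as Latin-1, byte 233 decodes to a single character, so the loop can read past the line that trips the UTF-8 decoder. The trade-off is that text which really was UTF-8 would come out as mis-decoded characters instead of raising an error.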