Re: [basex-talk] Encoding hassle ....

2022-06-13 Thread Marco Lettere

Thank you very much.

We'll give it a try ASAP.

M.



Re: [basex-talk] Encoding hassle ....

2022-06-13 Thread Christian Grün
Hi Marco,

If the content of a file is written to another file without
intermediate steps, it is streamed and consumes constant memory. The
implementation for streaming the data was deficient.

The bug has been fixed; a new snapshot is available [1,2].

Thanks and ciao,
Christian

[1] https://github.com/BaseXdb/basex/issues/2117
[2] https://files.basex.org/releases/latest/
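
A streamed copy of this kind can be sketched in Python as follows. This is only an illustration of copying text in fixed-size chunks with constant memory while keeping the encoding intact; it is not BaseX's implementation, and the function name `stream_copy_text` and its parameters are hypothetical.

```python
import codecs
import io

def stream_copy_text(src, dst, src_encoding="utf-8", dst_encoding="utf-8",
                     chunk_size=4096):
    """Copy text from binary stream src to binary stream dst chunk by chunk."""
    decoder = codecs.getincrementaldecoder(src_encoding)()
    encoder = codecs.getincrementalencoder(dst_encoding)()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # The incremental decoder buffers a multi-byte sequence that is
        # split across two chunks instead of emitting a broken character.
        dst.write(encoder.encode(decoder.decode(chunk)))
    # Flush both codecs; this raises if the input ends mid-character.
    dst.write(encoder.encode(decoder.decode(b"", final=True), final=True))

source = io.BytesIO("°\n".encode("utf-8"))
target = io.BytesIO()
# One-byte chunks force the two-byte "°" sequence to be split.
stream_copy_text(source, target, chunk_size=1)
print(target.getvalue() == "°\n".encode("utf-8"))  # True
```

Only one chunk is held in memory at a time, so the cost does not grow with the file size.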




Re: [basex-talk] Encoding hassle ....

2022-05-27 Thread Marco Lettere
Oh yes, thanks. I forgot to mention this: forcing UTF-8 doesn't help.



Re: [basex-talk] Encoding hassle ....

2022-05-27 Thread Christian Grün
Definitely looks like a bug. I’m currently on the road, but I’ll get to the
bottom of this once I’m back.





Re: [basex-talk] Encoding hassle ....

2022-05-27 Thread Bridger Dyson-Smith
Marco -
I'm sorry, but I can only corroborate your findings and confirm that trying
to force UTF-8 by adding the encoding parameter to the functions doesn't
seem to help; e.g.

) ./bin/basex
BaseX 9.7.1 [Standalone]
Try 'help' to get more information.
> xquery file:current-dir()
/usr/home/bridger/bin/basex/
Query executed in 886.62 ms.
> xquery file:write-text("a1.txt", "°" || out:nl(), "UTF-8")

Query executed in 4.32 ms.
> xquery file:read-text("a1.txt")
°

Query executed in 1.99 ms.
> xquery file:write-text("a2.txt", file:read-text("a1.txt", "UTF-8"), "UTF-8")

Query executed in 1.83 ms.
> xquery file:read-text("a2.txt")
[file:io-error] Decoding error: xb0
> xquery file:read-text("a2.txt", "UTF-8")
[file:io-error] Decoding error: xb0
> xquery file:read-text("a2.txt", "ISO-8859-1")
°

Query executed in 2.01 ms.



[basex-talk] Encoding hassle ....

2022-05-27 Thread Marco Lettere

Dear all,

After wrapping our heads around this for hours today, we still don't know
how to get rid of this inconsistency, so I am asking for help.


SSCE:

BaseX 9.6.4 [Standalone]
Try 'help' to get more information.
> xquery file:write-text("a1.txt", "°" || out:nl()) (: Same with codepoints-to-string(176) instead of "°" :)


Query executed in 183.94 ms.
> xquery file:read-text("a1.txt")
°

Query executed in 1.49 ms.
> xquery file:write-text("a2.txt", file:read-text("a1.txt"))
Query executed in 3.4 ms.

> xquery file:read-text("a2.txt")
[file:io-error] Decoding error: xb0

Checking the files with the Linux command-line tool "file" gives this output:

> file a1.txt
a1.txt: Unicode text, UTF-8 text

> file a2.txt
a2.txt: ISO-8859 text

"Copying" the file by reading it and writing the result to a new file seems
to change the encoding. How is this supposed to be handled?


Regards,

Marco.
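
The symptom above is consistent with the degree sign ending up in a2.txt as the single byte 0xB0 (its ISO-8859-1 form) rather than as the two-byte UTF-8 sequence 0xC2 0xB0. The Python sketch below is not BaseX code; it only assumes that a2.txt holds that single byte, and shows why such a file cannot be decoded as UTF-8.

```python
# Illustration only: U+00B0 (degree sign) in two encodings.
degree = "\u00b0"

utf8_bytes = degree.encode("utf-8")         # two bytes: C2 B0
latin1_bytes = degree.encode("iso-8859-1")  # one byte: B0

def can_decode_utf8(data: bytes) -> bool:
    """Return True if data is valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(utf8_bytes.hex(), can_decode_utf8(utf8_bytes))      # c2b0 True
print(latin1_bytes.hex(), can_decode_utf8(latin1_bytes))  # b0 False
print(latin1_bytes.decode("iso-8859-1") == degree)        # True
```

A lone 0xB0 is a UTF-8 continuation byte with no lead byte before it, which is why a UTF-8 reader rejects it while an ISO-8859-1 reader returns "°".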