From the documentation about the CHOP option, I assumed that because
xml:space="preserve" sets CHOP = false for that part of the document,
setting CHOP = false in my configuration file would apply the
xml:space="preserve" behavior to the whole database (I created the database
after setting the option). However, the only way I have ever been able to
get this behavior is to set xml:space="preserve" at the root element. Am I
missing something, or is this a bug? How can I get this behavior by default
in my databases?

I thought this would not warrant a thorough example, given the clear
conditions which cause the above situation, but I was asked for one anyway,
so here it goes:

A little bit of context, although it should not matter: I have had this
issue for years (at least 4) under Arch Linux. At the time I just assumed I
did something wrong and used the xml:space="preserve" workaround
everywhere.

> uname -a
Linux phoenix 4.9.38 #1-NixOS SMP Sat Jul 15 10:17:55 UTC 2017 x86_64
GNU/Linux

The default situation:

No config file (.basex)

> basex
BaseX 8.6.4 [Standalone]

> create db chop-test
Database 'chop-test' created in 123.01 ms.

> open chop-test
Database 'chop-test' was opened in 1.19 ms.

> replace /a <root><a><b>foo</b></a> bar</root>
0 resource(s) replaced in 103.68 ms.

> xquery /root
<root>
  <a>
    <b>foo</b>
  </a>bar</root>
Query executed in 106.05 ms.

I never use the REPL other than to create and drop databases, so I was a
bit surprised that this did not work:
> replace /a <root xml:space="preserve"><a><b>foo</b></a> bar</root>
"a.xml" (Line 1): Open quote is expected for attribute "xml:space"
associated with an  element type  "root".

While this does:
> replace /a <root xml:space='preserve'><a><b>foo</b></a> bar</root>
1 resource(s) replaced in 4.06 ms.

> xquery /root
<root xml:space="preserve"><a><b>foo</b></a> bar</root>
Query executed in 0.95 ms.

> quit
Have fun.

The resource with xml:space="preserve" shows the behavior I want to have
within my database, because all my documents are mixed content.

The wiki (http://docs.basex.org/wiki/Options#CHOP) covers this case as
well.

It explicitly states that in my use case I should set CHOP to false:
"The flag should be turned off if a document contains mixed content."

It also states that setting the xml:space="preserve" attribute is the same
as having CHOP = false:
"If the xml:space="preserve" attribute is attached to an element, chopping
will be turned off for all descendant text nodes."
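
To make that reading concrete, here is a minimal Python sketch of the
semantics as I understand them from the two quotes. It is only an
illustration of the documented behavior, not BaseX's actual parser; the
names chop, text_nodes and chop_option are made up for this example.

```python
from xml.dom import minidom

XML_NS = "http://www.w3.org/XML/1998/namespace"


def chop(node, preserve=False):
    """Chop the text nodes below `node` in place.

    Assumed semantics (taken from the wiki quotes, not from BaseX code):
    strip leading/trailing whitespace, drop whitespace-only text nodes,
    and skip everything below an element with xml:space="preserve".
    """
    if node.nodeType == node.ELEMENT_NODE:
        if node.getAttributeNS(XML_NS, "space") == "preserve":
            preserve = True
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE:
            if preserve:
                continue
            stripped = child.data.strip()
            if stripped:
                child.data = stripped
            else:
                node.removeChild(child)  # whitespace-only node is discarded
        else:
            chop(child, preserve)


def text_nodes(xml, chop_option=True):
    """Parse `xml` and return all text node values in document order.

    `chop_option` is a stand-in for the CHOP setting.
    """
    doc = minidom.parseString(xml)
    if chop_option:
        chop(doc.documentElement)
    found = []

    def walk(node):
        for child in node.childNodes:
            if child.nodeType == child.TEXT_NODE:
                found.append(child.data)
            else:
                walk(child)

    walk(doc.documentElement)
    return found


plain = "<root><a><b>foo</b></a> bar</root>"
marked = "<root xml:space='preserve'><a><b>foo</b></a> bar</root>"

print(text_nodes(plain))                     # ['foo', 'bar']
print(text_nodes(plain, chop_option=False))  # ['foo', ' bar']
print(text_nodes(marked))                    # ['foo', ' bar']
```

Under this reading, the last two calls give the same text nodes, which is
why I expect CHOP = false and xml:space="preserve" on the root element to
be interchangeable.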

So let's do that:

Let us first confirm that the config file is correctly read:
> echo 'FOO = 0' > /some/path/.basex
> BASEX_JVM='-Dorg.basex.path=/some/path' basex
/some/path/.basex: Unknown option 'FOO'.
/some/path/.basex: writing new configuration file.

Now we set the option CHOP = false in our config:
> echo 'CHOP = false' >> /some/path/.basex

So let's see what this changes in the basex REPL:

> BASEX_JVM='-Dorg.basex.path=/some/path' basex
BaseX 8.6.4 [Standalone]

> drop db chop-test
Database 'chop-test' was dropped.

> create db chop-test
Database 'chop-test' created in 106.42 ms.

> open chop-test
Database 'chop-test' was opened in 0.05 ms.

> replace /a <root><a><b>foo</b></a> bar</root>
0 resource(s) replaced in 39.24 ms.

> xquery /root
<root>
  <a>
    <b>foo</b>
  </a> bar</root>
Query executed in 97.09 ms.

> quit
Have fun.

This is not what I expect; it should have been:
> xquery /root
<root><a><b>foo</b></a> bar</root>

And hence my question: Shouldn't CHOP = false make xml:space="preserve" the
default behavior?
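
Comparing the two outputs as parsed XML narrows down what actually differs.
The sketch below uses Python's xml.dom.minidom; the helper names
all_text_nodes and significant are made up for this example, and the two
strings are copied from the transcript above. It only shows that the
observed and the expected output differ in whitespace-only text nodes,
while " bar" keeps its leading space in both. It cannot tell whether those
whitespace-only nodes are stored in the database or only appear when the
result is printed.

```python
from xml.dom import minidom


def all_text_nodes(xml):
    """Return the value of every text node in `xml`, in document order."""
    found = []

    def walk(node):
        for child in node.childNodes:
            if child.nodeType == child.TEXT_NODE:
                found.append(child.data)
            else:
                walk(child)

    walk(minidom.parseString(xml).documentElement)
    return found


def significant(nodes):
    """Keep only the text nodes that contain non-whitespace characters."""
    return [text for text in nodes if text.strip()]


# Output seen with CHOP = false, and the output I expected instead.
observed = "<root>\n  <a>\n    <b>foo</b>\n  </a> bar</root>"
expected = "<root><a><b>foo</b></a> bar</root>"

print(all_text_nodes(observed))  # ['\n  ', '\n    ', 'foo', '\n  ', ' bar']
print(all_text_nodes(expected))  # ['foo', ' bar']
print(significant(all_text_nodes(observed)) == significant(all_text_nodes(expected)))  # True
```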

I even tried this:

> BASEX_JVM='-Dorg.basex.path=/some/path' basex
BaseX 8.6.4 [Standalone]
Try 'help' to get more information.
> open chop-test
Database 'chop-test' was opened in 90.65 ms.
> set chop off
CHOP: false
> replace /a <root><a><b>foo</b></a> bar</root>
1 resource(s) replaced in 45.58 ms.
> xquery /root
<root>
  <a>
    <b>foo</b>
  </a> bar</root>
Query executed in 96.78 ms.
> quit
See you.

Am I making some mistake in the above?
Is the wiki simply outdated and should this be configured differently?
Is having mixed content in BaseX so rare that this bug has gone
unnoticed for years?
