Re: [basex-talk] Fwd: Re: whitespaces in import

2014-03-27 Thread Christian Grün
I have added a GitHub issue for the CHOP option:

  https://github.com/BaseXdb/basex/issues/913

I invite all of you to add comments, report on potential difficulties
caused by the switch, etc.

Christian



On Wed, Mar 26, 2014 at 7:10 PM, Dirk Kirsten d...@basex.org wrote:
 Hi Graydon,

 As already said, we are going to change the default and we are aware of
 the problem.

 But I do think your statement that this is an error and that we are
 preserving errors does not hold. As Liam Quin pointed out at
 https://mailman.uni-konstanz.de/pipermail/basex-talk/2013-April/004983.html
 it is in fact a quirky behavior of BaseX, but it does not contradict the
 spec.

 Also, I'd like to remind everyone that it is in fact quite easy to change
 the CHOP value, so this should also be possible on the client side.

 We do understand the problems with the default value and take them
 seriously. However, there are implications with changing the default
 value and we have to consider them. A client would also certainly
 complain if we change default values all the time...

 Cheers,
 Dirk

 On 26/03/14 18:50, Graydon Saunders wrote:
 Hi Dirk --

 I understand that you don't want to break backward compatibility, but
 a) it never gets easier, only harder, when you must do that, as more
 existing applications accumulate and b) the example given of the type
 of problem is seriously bad code.  (I totally believe this kind of
 code exists in the wild; I can't muster much sympathy for an
 observation that "give me every text node in the document" is likely
 to go wrong, because my experience says it is _certain_ to go wrong;
 you never do that without qualifying which text nodes you mean.)

 And, really, yeah, a commitment to backward compatibility is good; a
 commitment to preserving error is not, and the CHOP default behaviour
 is an error.  Abstractions like backward compatibility don't even get
 into the discussion when the initial client response is "you changed
 our data and broke it".  The client having that reaction wasn't very
 nuanced about it (and didn't immediately understand they were looking
 at a copy) but they weren't wrong.  This really is a serious bug for
 mixed content.

 So, please, could we get a release version commitment for when this
 bug will be fixed?

 Thanks!
 Graydon

 On Wed, Mar 26, 2014 at 1:37 PM, Dirk Kirsten d...@basex.org wrote:
 Hello Graydon,

 As Christian already pointed out, we have had this discussion on the mailing
 list quite frequently. See Christian's answer from some time ago for some
 insight:
 https://mailman.uni-konstanz.de/pipermail/basex-talk/2013-April/004984.html

 So although it is of course easily changed in our code, it does have some
 serious implications in terms of backward compatibility. So you might be
 able to convince your client by stating that we take backward
 compatibility seriously and will take some time and thought to change
 things that break applications :)

 Cheers,
 Dirk

 On 26/03/14 18:24, Graydon Saunders wrote:
 Could we get _which_ future version?  I recall this being said before
 7.8 was released, too, and was feeling hopeful.

 I understand that from a technical perspective this is a completely
 trivial thing -- set the flag! -- but from the perspective of having
 to argue for three months to be able to use BaseX at all at a client
 whose internal tester did a naive load of some narrative-style XML,
 observed the loss of white space around internal-to-the-mixed-content
 tags, and said "this application is banned", it's not trivial at all.
 Especially since it's not technically correct to the XML spec; one is
 forced to argue that something is a minor quirk of an otherwise
 excellent application *after* it's already given senior, stubborn, and
 non-technical content experts metaphorical heart attacks.

 On Wed, Mar 26, 2014 at 1:15 PM, Christian Grün
 christian.gr...@gmail.com wrote:
 True; this discussion has already been going on for quite a while now. The
 default value of CHOP will be changed in a future version of BaseX.


 On Wed, Mar 26, 2014 at 6:14 PM, Graydon Saunders graydon...@gmail.com
 wrote:

 Though I think CHOP defaulting to true is a bug compared to the
 expected behaviour of XML.

 And while it's very useful to have CHOP there for some kinds of data,
 for other kinds it's a severe hazard that it's the default.  People
 have major freakouts when their documentation XML documents look like
 they've lost the white spaces around bold tags and will have nothing
 to do with BaseX thereafter.

 -- Graydon

 On Wed, Mar 26, 2014 at 1:08 PM, Dirk Kirsten d...@basex.org wrote:
 Sorry, mail was supposed to be sent to the mailing list... Information
 is the same as in Leo's mail

 -------- Original Message --------
 Subject: Re: [basex-talk] whitespaces in import
 Date: Wed, 26 Mar 2014 17:59:12 +0100
 From: Dirk Kirsten d...@basex.org
 To: Stefan Sechelmann sec...@math.tu-berlin.de

 Hello Stefan,

 On 26/03/14 17:51, Stefan Sechelmann wrote:
 Is there some kind of 

Re: [basex-talk] Fwd: Re: whitespaces in import

2014-03-27 Thread Fabrice Etanchaud
Dear all (pasted from GitHub),

What about a mechanism like the output:cdata-section-elements option?
That way, one could set the list of elements not to be (recursively) chopped.

For my usage (bibliographic data in the intellectual property domain), chopping by
default is a good thing, because only a few of the document's elements contain
textual mixed content.
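
A rough sketch of how such an exception list could behave, written in Python purely for illustration. It does not use BaseX, and no such option is confirmed to exist there; the helper names `chop_except` and `load_with_exceptions` and the `keep` parameter are hypothetical. It assumes that chopping means stripping leading and trailing whitespace from text nodes and dropping the ones that become empty.

```python
from xml.dom import minidom


def chop_except(node, keep):
    """Chop text nodes below `node`, skipping subtrees whose element name is in `keep`."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE:
            stripped = child.data.strip()
            if stripped:
                child.data = stripped
            else:
                node.removeChild(child)  # whitespace-only node is discarded
        elif child.nodeType == child.ELEMENT_NODE and child.tagName not in keep:
            chop_except(child, keep)
        # Elements listed in `keep` are left untouched, including all descendants.


def load_with_exceptions(xml, keep=frozenset()):
    """Parse `xml`, chop everything outside the `keep` elements, return the result."""
    root = minidom.parseString(xml).documentElement
    if root.tagName not in keep:
        chop_except(root, keep)
    return root.toxml()


# Hypothetical bibliographic record: mostly data, one mixed-content element.
RECORD = (
    "<record>\n"
    "  <id> 42 </id>\n"
    "  <abstract>A <i>mixed</i> abstract.</abstract>\n"
    "</record>"
)

print(load_with_exceptions(RECORD, keep={"abstract"}))
# <record><id>42</id><abstract>A <i>mixed</i> abstract.</abstract></record>
```

With an empty `keep` set, the same record comes out with `A<i>mixed</i>abstract.` inside the abstract, which is the whitespace loss the thread is about.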

Hoping it helps,

Thank you Christian for the great job you do with your team.

-----Original Message-----
From: basex-talk-boun...@mailman.uni-konstanz.de
[mailto:basex-talk-boun...@mailman.uni-konstanz.de] On behalf of Christian Grün
Sent: Thursday, 27 March 2014 14:03
To: Dirk Kirsten
Cc: BaseX
Subject: Re: [basex-talk] Fwd: Re: whitespaces in import

I have added a GitHub issue for the CHOP option:

  https://github.com/BaseXdb/basex/issues/913

I invite all of you to add comments, report on potential difficulties caused by 
the switch, etc.

Christian




[basex-talk] Fwd: Re: whitespaces in import

2014-03-26 Thread Dirk Kirsten
Sorry, mail was supposed to be sent to the mailing list... Information
is the same as in Leo's mail

-------- Original Message --------
Subject: Re: [basex-talk] whitespaces in import
Date: Wed, 26 Mar 2014 17:59:12 +0100
From: Dirk Kirsten d...@basex.org
To: Stefan Sechelmann sec...@math.tu-berlin.de

Hello Stefan,

On 26/03/14 17:51, Stefan Sechelmann wrote:
Is there some kind of whitespace normalization going on during import?

Yes, there is.

Can I set options that influence this behavior or is this a bug?

Yes, you can. Set the CHOP option to false (see
https://docs.basex.org/wiki/Options#CHOP for details) or start BaseX
with the -w flag (which sets CHOP to false).
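
To illustrate what the option changes, here is a minimal Python sketch that only imitates the behavior. It assumes CHOP strips leading and trailing whitespace from every text node and discards nodes that become empty, which matches the symptom reported in this thread. It does not call BaseX; `chop` and `load` are hypothetical helper names.

```python
from xml.dom import minidom


def chop(node):
    """Strip leading/trailing whitespace from text nodes and drop empty ones."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE:
            stripped = child.data.strip()
            if stripped:
                child.data = stripped
            else:
                node.removeChild(child)
        else:
            chop(child)


def load(xml, chop_whitespace=True):
    """Parse `xml` and return it serialized, with or without chopping."""
    root = minidom.parseString(xml).documentElement
    if chop_whitespace:
        chop(root)
    return root.toxml()


SAMPLE = "<p>Some <b>bold</b> text.</p>"

print(load(SAMPLE, chop_whitespace=True))   # <p>Some<b>bold</b>text.</p>
print(load(SAMPLE, chop_whitespace=False))  # <p>Some <b>bold</b> text.</p>
```

The first call shows why mixed content suffers: the spaces around the inline element belong to the neighbouring text nodes, so stripping them joins the words.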

Hope that helps.

Cheers,
Dirk

-- 
Dirk Kirsten, BaseX GmbH, http://basex.org
|-- Registered office: Blarerstrasse 56, 78462 Konstanz
|-- Register court Freiburg, HRB: 708285, Managing directors:
|   Dr. Christian Grün, Dr. Alexander Holupirek, Michael Seiferle
`-- Phone: 0049 7531 28 28 676, Fax: 0049 7531 20 05 22


___
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk


Re: [basex-talk] Fwd: Re: whitespaces in import

2014-03-26 Thread Christian Grün
True; this discussion has already been going on for quite a while now. The
default value of CHOP will be changed in a future version of BaseX.




Re: [basex-talk] Fwd: Re: whitespaces in import

2014-03-26 Thread Graydon Saunders
Could we get _which_ future version?  I recall this being said before
7.8 was released, too, and was feeling hopeful.

I understand that from a technical perspective this is a completely
trivial thing -- set the flag! -- but from the perspective of having
to argue for three months to be able to use BaseX at all at a client
whose internal tester did a naive load of some narrative-style XML,
observed the loss of white space around internal-to-the-mixed-content
tags, and said "this application is banned", it's not trivial at all.
Especially since it's not technically correct to the XML spec; one is
forced to argue that something is a minor quirk of an otherwise
excellent application *after* it's already given senior, stubborn, and
non-technical content experts metaphorical heart attacks.



Re: [basex-talk] Fwd: Re: whitespaces in import

2014-03-26 Thread Graydon Saunders
Hi Dirk --

I understand that you don't want to break backward compatibility, but
a) it never gets easier, only harder, when you must do that, as more
existing applications accumulate and b) the example given of the type
of problem is seriously bad code.  (I totally believe this kind of
code exists in the wild; I can't muster much sympathy for an
observation that "give me every text node in the document" is likely
to go wrong, because my experience says it is _certain_ to go wrong;
you never do that without qualifying which text nodes you mean.)

And, really, yeah, a commitment to backward compatibility is good; a
commitment to preserving error is not, and the CHOP default behaviour
is an error.  Abstractions like backward compatibility don't even get
into the discussion when the initial client response is "you changed
our data and broke it".  The client having that reaction wasn't very
nuanced about it (and didn't immediately understand they were looking
at a copy) but they weren't wrong.  This really is a serious bug for
mixed content.

So, please, could we get a release version commitment for when this
bug will be fixed?

Thanks!
Graydon


Re: [basex-talk] Fwd: Re: whitespaces in import

2014-03-26 Thread Stefan Sechelmann
Hi Graydon,
It is not so easy. For instance: the moment I turned off CHOP during import
of my test database, all of my unit tests involving assert-equals with a
comparison of nodes failed because of extra or different whitespace. So this
does indeed have deep implications for running code.
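
One way such tests could be made tolerant of the switch is to compare nodes after normalizing surrounding whitespace. The following Python sketch is only an illustration of that idea, not the test setup described above; `normalized` and `same_ignoring_whitespace` are hypothetical names.

```python
import xml.etree.ElementTree as ET


def normalized(elem):
    """Return a comparable structure in which surrounding whitespace is ignored."""
    return (
        elem.tag,
        sorted(elem.attrib.items()),
        (elem.text or "").strip(),
        (elem.tail or "").strip(),
        [normalized(child) for child in elem],
    )


def same_ignoring_whitespace(xml_a, xml_b):
    """Compare two XML strings, treating leading/trailing whitespace as insignificant."""
    return normalized(ET.fromstring(xml_a)) == normalized(ET.fromstring(xml_b))


chopped = "<a><b>x</b></a>"
unchopped = "<a>\n  <b>x</b>\n</a>"

print(same_ignoring_whitespace(chopped, unchopped))          # True
print(same_ignoring_whitespace(chopped, "<a><b>y</b></a>"))  # False
```

The trade-off is that this comparison also treats `Some <b>bold</b>` and `Some<b>bold</b>` as equal, so it suits data-oriented documents and would hide exactly the mixed-content differences discussed in this thread.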
Stefan

Stefan Sechelmann
DFG-Forschungszentrum Matheon
Mathematik für Schlüsseltechnologien
Technische Universität Berlin
Sekretariat MA 8-3       Tel. 030/314 29 486
Straße des 17. Juni 136  Fax  030/314 79 282
10623 Berlin             sec...@math.tu-berlin.de
http://www.math.tu-berlin.de/~sechel 


Re: [basex-talk] Fwd: Re: whitespaces in import

2014-03-26 Thread Christian Grün
Graydon,

It's a plain fact that we can't simply switch the default behavior of
BaseX, because there are various applications out there that first need to
be checked in depth before we can make the switch. We simply need
(..enough..) time to become aware of the consequences before we do simple
things like inverting a boolean flag. I think there is no need to tell us
which decision is right or wrong; we know ourselves that there are enough
reasons to change the default.

Please remember that BaseX is a completely free product. I'm glad to hear
that you are using BaseX with customers, and that you believe it's an
(otherwise) excellent choice. If you think that the current default is a
serious show stopper that's worth fixing, you are invited to sponsor
the efforts; this may be more helpful than complaining about the status quo.

Christian

Re: [basex-talk] Fwd: Re: whitespaces in import

2014-03-26 Thread Dirk Kirsten
Hi Graydon,

as already said, we are going to change the default and we are aware of
the problem.

But I do think your statement that this is an error and we are
preserving errors does not hold. As Liam Quin pointed out at
https://mailman.uni-konstanz.de/pipermail/basex-talk/2013-April/004983.html
it is in fact a quirky behavior of BaseX, but it does not contradict the
spec.

Also, I'd like to remind everyone that it is in fact quite easy to change
the CHOP value, so this should also be possible on the client side.
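
To make the effect under discussion concrete, here is a minimal Python sketch.
It is not BaseX code and does not call BaseX. It only emulates the behaviour
described in this thread, under the assumption that CHOP=true strips leading
and trailing whitespace from every text node and drops text nodes that become
empty. The function name chop and the sample document are made up for
illustration.

```python
import xml.etree.ElementTree as ET


def chop(elem):
    """Emulate the assumed CHOP=true behaviour on an ElementTree element.

    Every text node (ElementTree stores them as .text and .tail) has its
    leading and trailing whitespace stripped; nodes left empty become None.
    """
    for node in elem.iter():
        node.text = (node.text or "").strip() or None
        node.tail = (node.tail or "").strip() or None
    return elem


# Hypothetical narrative-style (mixed content) sample.
doc = "<p>Press the <b>red</b> button.</p>"

# Whitespace preserved, as with CHOP set to false.
preserved = ET.tostring(ET.fromstring(doc), encoding="unicode")

# Whitespace chopped, as assumed for CHOP set to true.
chopped = ET.tostring(chop(ET.fromstring(doc)), encoding="unicode")

print(preserved)  # <p>Press the <b>red</b> button.</p>
print(chopped)    # <p>Press the<b>red</b>button.</p>
```

For data-oriented XML, where whitespace between elements is only indentation,
the two results carry the same information. For mixed content like the sample,
the words around the inline element run together, which is the complaint
raised in this thread.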

We do understand the problems with the default value and take them
seriously. However, there are implications with changing the default
value and we have to consider them. A client would also certainly
complain if we change default values all the time...

Cheers,
Dirk

On 26/03/14 18:50, Graydon Saunders wrote:
 Hi Dirk --
 
 I understand that you don't want to break backward compatibility, but
 a) it never gets easier, only harder, when you must do that, as more
 existing applications accumulate and b) the example given of the type
 of problem is seriously bad code.  (I totally believe this kind of
 code exists in the wild; I can't muster much sympathy for an
 observation that "give me every text node in the document" is likely
 to go wrong, because my experience says it is _certain_ to go wrong;
 you never do that without qualifying which text nodes you mean.)
 
 And, really, yeah, a commitment to backward compatibility is good; a
 commitment to preserving error is not, and the CHOP default behaviour
 is an error.  Abstractions like backward compatibility don't even get
 into the discussion when the initial client response is "you changed
 our data and broke it".  The client having that reaction wasn't very
 nuanced about it (and didn't immediately understand they were looking
 at a copy) but they weren't wrong.  This really is a serious bug for
 mixed content.
 
 So, please, could we get a release version commitment for when this
 bug will be fixed?
 
 Thanks!
 Graydon
 
 On Wed, Mar 26, 2014 at 1:37 PM, Dirk Kirsten d...@basex.org wrote:
 Hello Graydon,

 As Christian already pointed out, we had this discussion on the mailing
 list quite frequently. See Christian's answer from some time ago for some
 insight:
 https://mailman.uni-konstanz.de/pipermail/basex-talk/2013-April/004984.html

 So although it is of course easily changed in our code, it does have some
 serious implications in terms of backward compatibility. So you might be
 able to convince your client by stating that we take backward
 compatibility seriously and will take some time and thought to change
 things that break applications :)

 Cheers,
 Dirk

 On 26/03/14 18:24, Graydon Saunders wrote:
 Could we get _which_ future version?  I recall this being said before
 7.8 was released, too, and was feeling hopeful.

 I understand that from a technical perspective this is a completely
 trivial thing -- set the flag! -- but from the perspective of having
 to argue for three months to be able to use BaseX at all at a client
 whose internal tester did a naive load of some narrative-style XML,
 observed the loss of white space around internal-to-the-mixed-content
 tags, and said "this application is banned", it's not trivial at all.
 Especially since it's not technically correct to the XML spec; one is
 forced to argue that something is a minor quirk of an otherwise
 excellent application *after* it's already given senior, stubborn, and
 non-technical content experts metaphorical heart attacks.

 On Wed, Mar 26, 2014 at 1:15 PM, Christian Grün
 christian.gr...@gmail.com wrote:
 True; this discussion is already going on for quite a while now. The
 default value of CHOP will be changed in a future version of BaseX.


 On Wed, Mar 26, 2014 at 6:14 PM, Graydon Saunders graydon...@gmail.com
 wrote:

 Though I think CHOP defaulting to true is a bug compared to the
 expected behaviour of XML.

 And while it's very useful to have CHOP there for some kinds of data,
 for other kinds it's a severe hazard that it's the default.  People
 have major freakouts when their documentation XML documents look like
 they've lost the white spaces around bold tags and will have nothing
 to do with BaseX thereafter.

 -- Graydon

 On Wed, Mar 26, 2014 at 1:08 PM, Dirk Kirsten d...@basex.org wrote:
 Sorry, mail was supposed to be sent to the mailing list... Information
 is the same as in Leo's mail

 -------- Original Message --------
 Subject: Re: [basex-talk] whitespaces in import
 Date: Wed, 26 Mar 2014 17:59:12 +0100
 From: Dirk Kirsten d...@basex.org
 To: Stefan Sechelmann sec...@math.tu-berlin.de

 Hello Stefan,

 On 26/03/14 17:51, Stefan Sechelmann wrote:
 Is there some kind of whitespace normalization
 going on during import?

 Yes, there is.

 Can I set options that influence this behavior or is this a bug?

 Yes, you can. Set the CHOP option to false (see
 https://docs.basex.org/wiki/Options#CHOP for details) or start BaseX
 with the -w flag (which sets CHOP to