Nothing wrong here with BaseX behavior.

If your XML import chops the whitespace, following-sibling::text()[1] will of 
course always match ". -" for the first two word nodes, because the 
whitespace-only text node between them no longer exists.
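
To make the effect visible outside BaseX, here is a minimal sketch in Python 
(standard library only). It is not BaseX code: the chop() helper only roughly 
imitates whitespace chopping at import time, and all function names are 
made up for this illustration. It uses the XML from the message quoted below 
and emulates the axis step by walking the sibling nodes.

```python
from xml.dom import minidom

# XML taken from the quoted message (sent id="242").
XML = (
    '<sent id="242">\n'
    '    <clause>\n'
    '      <word lexmorph="ab|P|">A</word>\n'
    '      <word lexmorph="to^tus|D|NsCbGn">toto</word>. -'
    '<word lexmorph="ab|P|">A</word>\n'
    '      <word lexmorph="substantia|N|NsCb">substantia</word>\n'
    '    </clause>.</sent>'
)


def chop(node):
    """Drop whitespace-only text nodes (rough stand-in for chopping at import)."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE:
            if not child.data.strip():
                node.removeChild(child)
        else:
            chop(child)


def first_following_text(word):
    """Emulates following-sibling::text()[1]: first text node among later siblings."""
    sib = word.nextSibling
    while sib is not None:
        if sib.nodeType == sib.TEXT_NODE:
            return sib.data
        sib = sib.nextSibling
    return ""


def adjacent_text(word):
    """Emulates following-sibling::node()[1][self::text()]: only the very next node."""
    sib = word.nextSibling
    if sib is not None and sib.nodeType == sib.TEXT_NODE:
        return sib.data
    return ""


def render(doc, trailing):
    """Builds the word|POS|trailing-text string joined by '~~~'."""
    clause = doc.getElementsByTagName("clause")[0]
    parts = []
    for w in clause.getElementsByTagName("word"):
        pos = w.getAttribute("lexmorph").split("|")[1]
        text = " ".join(trailing(w).split())  # same effect as normalize-space()
        parts.append(f"{w.firstChild.data}|{pos}|{text}")
    return "~~~".join(parts)


kept = minidom.parseString(XML)      # whitespace text nodes preserved
chopped = minidom.parseString(XML)
chop(chopped)                        # whitespace text nodes removed

with_ws = render(kept, first_following_text)
without_ws = render(chopped, first_following_text)
fixed = render(chopped, adjacent_text)

print(with_ws)     # A|P|~~~toto|D|. -~~~A|P|~~~substantia|N|
print(without_ws)  # A|P|. -~~~toto|D|. -~~~A|P|~~~substantia|N|
print(fixed)       # A|P|~~~toto|D|. -~~~A|P|~~~substantia|N|
```

With the whitespace nodes kept, the first word finds an (empty after 
normalization) text node right behind it. Once they are chopped, the first 
text sibling it can find is ". -". Restricting the step to the directly 
adjacent node, which in XPath could be written as 
following-sibling::node()[1][self::text()], should give the expected string 
in both cases.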

Cheers, Daniel


From: Mark Bordelon [mailto:markcborde...@yahoo.com]
Sent: Friday, March 1, 2019 22:51
To: Michael Seiferle
Cc: BaseX
Subject: Re: [basex-talk] following-sibling axis -- real data example

Gentlemen (especially Michael)!
My follow-up to my original question from a few days ago:


 XML:
<sent id="242">
    <clause>
      <word lexmorph="ab|P|">A</word>
      <word lexmorph="to^tus|D|NsCbGn">toto</word>. -<word lexmorph="ab|P|">A</word>
      <word lexmorph="substantia|N|NsCb">substantia</word>
    </clause>.</sent>

XPATH:
//clause[word and not(word[not(@lexmorph) or @lexmorph='' or contains(@lexmorph,' ')])]
  /string-join(
    word[not(@implicit)]/concat(
      text(),
      '|',
      tokenize(@lexmorph, '\|')[2],
      '|',
      normalize-space(./following-sibling::text()[1])
    ),
    '~~~'
  )


Executing this in https://www.freeformatter.com/xpath-tester.html#ad-output 
returns the (correct) result:
A|P|~~~toto|D|. -~~~A|P|~~~substantia|N|

Executing it as XQuery in BaseX returns the (incorrect) result:
A|P|. -~~~toto|D|. -~~~A|P|~~~substantia|N|

Executing it as XQuery in BaseX with Kristian's [. instance of text()] added 
to the axis step returns the same (incorrect) result:
A|P|. -~~~toto|D|. -~~~A|P|~~~substantia|N|

I have tried both the true and false settings of the -w option, but my 
results are always as above.

Any ideas?





On Feb 27, 2019, at 01:59, Michael Seiferle 
<m...@basex.org<mailto:m...@basex.org>> wrote:

Hi Mark,

as Martin already stated, the '-w' option has to be active at import time; 
otherwise the whitespace will be chopped.

If I were to do it, I'd reindex all data and explicitly mark all elements that 
should preserve whitespace. If that is not an option, I'd reindex all data 
with whitespace chopping turned off.

Looking forward to your example, I am sure we can figure this out :-)

Best
Michael





On Feb 26, 2019, at 18:52, Mark Bordelon 
<markcborde...@yahoo.com<mailto:markcborde...@yahoo.com>> wrote:

A follow-up: starting BaseX with -w does NOT seem to solve my issue completely 
after all. Real data (more complicated than the simplified example) still does 
not query correctly: where a word has no text node directly after it, the text 
node that follows a later element is returned in its place.
I'll try to get a better example, still simplified, that shows this.

