Re: [basex-talk] how to pass raw bytes intact?

2013-01-02 Thread Christian Grün
As Liam indicated (thanks!), XQuery may not be the best choice for processing data at the byte level: XQuery was built to work with Unicode characters as its basic unit, which means that it will never be possible to create illegal UTF-8 sequences with pure XQuery. This also means that the language provides no s
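
To illustrate the byte-level point, here is a minimal Python sketch (not from the thread) that works on raw bytes rather than Unicode characters: it reproduces the kind of damage discussed here and then undoes it before decoding. It assumes the injected tag is `<wbr>`, which is only inferred from the `*:wbr` step in the query quoted in this thread; the previews do not show the tag itself. The names `TAG`, `inject_after_first_byte` and `strip_tag` are hypothetical.

```python
# Assumption: the injected tag is "<wbr>" (inferred from the *:wbr step in
# the thread's query); change TAG if the real marker differs.
TAG = b"<wbr>"


def inject_after_first_byte(text: str) -> bytes:
    """Mimic the damage: insert TAG right after the first non-ASCII byte."""
    raw = text.encode("utf-8")
    for i, byte in enumerate(raw):
        if byte > 0x7F:  # first byte outside the ASCII range
            return raw[: i + 1] + TAG + raw[i + 1 :]
    return raw  # pure ASCII input: nothing to inject


def strip_tag(raw: bytes) -> str:
    """Remove TAG at the byte level, then decode the repaired UTF-8."""
    return raw.replace(TAG, b"").decode("utf-8")


damaged = inject_after_first_byte("你好")

# The damaged bytes are no longer valid UTF-8, so a strict decode fails.
try:
    damaged.decode("utf-8")
    decodes_cleanly = True
except UnicodeDecodeError:
    decodes_cleanly = False

repaired = strip_tag(damaged)
```

The order matters in this sketch: the tag is removed while the data is still a byte string, and decoding happens only afterwards. A tool that decodes first, as a Unicode-based language must, never sees the split character as a single unit.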

Re: [basex-talk] how to pass raw bytes intact?

2012-12-31 Thread jidanni
> "LREQ" == Liam R E Quin writes: LREQ> Treating the individual UTF-8 octets individually? Yes. LREQ> Not in standard XQuery, but that doesn't preclude a BaseX extension... Well no big deal, I was just curious. >> I was just curious if there was a way in basex if I could do s!!!g >> like I can

Re: [basex-talk] how to pass raw bytes intact?

2012-12-31 Thread Liam R E Quin
On Tue, 2013-01-01 at 11:47 +0800, jida...@jidanni.org wrote: > Not exactly after it. 1/3 of the way through it. I.e., shattered UTF-8. Treating the individual UTF-8 octets individually? Not in standard XQuery, but that doesn't preclude a BaseX extension... > I was just curious if there was a w

Re: [basex-talk] how to pass raw bytes intact?

2012-12-31 Thread jidanni
LREQ> Your perl substitution is putting after the first non-ascii LREQ> character on the line, and 你 is for sure not an ascii character, LREQ> so you get after it. Not exactly after it. 1/3 of the way through it. I.e., shattered UTF-8. I was just curious if there was a way in basex if I could do

Re: [basex-talk] how to pass raw bytes intact?

2012-12-31 Thread Liam R E Quin
On Tue, 2013-01-01 at 10:52 +0800, jida...@jidanni.org wrote: > I'm just trying to find a way to remove the injected here, > $ echo '你好'|perl -pwle 's![^[:ascii:]]!$&!'|qprint -e > =E4=BD=A0=E5=A5=BD I don't have a qprint command on my system, so I'm not sure what's going on for you here. Your p

Re: [basex-talk] how to pass raw bytes intact?

2012-12-31 Thread jidanni
> "CG" == Christian Grün writes: CG> Jidanni, >> echo '你好'|perl -pwle 's![^[:ascii:]]!$&!'|basex -q ' >> declare option db:parser "html"; >> declare option output:method "raw"; >> doc("/dev/stdin")//*:wbr/..' CG> If you want help, please try to help, too. Your example is not what I CG> would

Re: [basex-talk] how to pass raw bytes intact?

2012-12-31 Thread Christian Grün
Jidanni, > echo '你好'|perl -pwle 's![^[:ascii:]]!$&!'|basex -q ' > declare option db:parser "html"; > declare option output:method "raw"; > doc("/dev/stdin")//*:wbr/..' If you want help, please try to help, too. Your example is not what I would call very helpful; give us at least

Re: [basex-talk] how to pass raw bytes intact?

2012-12-30 Thread jidanni
Our mission today is to use BaseX to remove tags injected right between the bytes of multibyte UTF-8 characters. http://www.couchsurfing.org/group_read.html?gid=430&post=13986932 > "CG" == Christian Grün writes: CG> Have you tried method=raw, as mentioned in our documentation CG> (http://doc