> Hmm. Is there a way to send other encodings to the server via the remote
> API?
A difficult one for me to answer, because I have never worked with Ruby
before… Maybe there are some other users on the list who can reply to this?

> I'm on my way to Japan for a workshop where we'll be using my system and
> Japanese-language documents are more efficiently stored in UTF-16 so my
> expectation is that users will either already have documents in that
> encoding or will create new ones. Of course, for the workshop we can limit
> ourselves to UTF-8 but I'm trying to make the system as foolproof as
> possible.

Sounds interesting, and absolutely reasonable. Maybe our HTTP services
(e.g. the default REST API) could be an alternative?

Christian

> I think the issue with my script was that I was putting quotes around the
> XML strings, which causes the server to treat it as a file path rather
> than as XML to load. Once I fixed that then I was able to delete and add
> files from my Ruby git hooks.
>
> I'll have to get a better understanding of how Ruby handles arbitrary byte
> sequences (this is where there's a little too much magic for my taste) but
> I would expect that if I provide the remote API with a byte sequence that
> starts with 0xFFFE, 0xFEFF, 0x003C003F, or 0x3C003F00 that it would treat
> it as UTF-16.
>
> Cheers,
>
> E.
> ----
> Eliot Kimber, Owner
> Contrext, LLC
> http://contrext.com
>
> On 2/18/16, 4:58 PM, "Christian Grün" <christian.gr...@gmail.com> wrote:
>
>>Hi Eliot,
>>
>>For most client bindings, files must indeed be sent in UTF-8, so I
>>guess it’s also the case for the Ruby binding. If the sent bytes are
>>correct UTF-8, everything should work fine.
>>
>>Christian
>>
>>
>>On Thu, Feb 18, 2016 at 6:08 PM, Eliot Kimber <ekim...@contrext.com>
>>wrote:
>>> This test document has a non-ASCII character '〺' (\u303A), which I
>>> added to test handling of multi-byte characters.
>>>
>>> Ruby and the BaseX client seem to be handling the UTF-8 correctly but
>>> UTF-16 didn't.
>>> I'm guessing it's Ruby's fault because it's treating the
>>> bytes as a string and of course that's not going to work in a naive way.
>>>
>>> Cheers,
>>>
>>> E.
>>> ----
>>> Eliot Kimber, Owner
>>> Contrext, LLC
>>> http://contrext.com
>>>
>>> On 2/18/16, 11:04 AM, "Eliot Kimber"
>>> <basex-talk-boun...@mailman.uni-konstanz.de on behalf of
>>> ekim...@contrext.com> wrote:
>>>
>>>>I turned my UTF-8 file into a UTF-16 file and tried to commit it to
>>>>BaseX via the Ruby client, but it did not work:
>>>>
>>>>BaseXClient.rb:50:in `execute': Resource "/opt/basex/?" not found.
>>>>(RuntimeError)
>>>>
>>>>Where "?" is some kind of "unrecognized character" indicator.
>>>>
>>>>Cheers,
>>>>
>>>>E.
>>>>
>>>>----
>>>>Eliot Kimber, Owner
>>>>Contrext, LLC
>>>>http://contrext.com
>>>>
>>>>On 2/18/16, 10:26 AM, "Eliot Kimber"
>>>><basex-talk-boun...@mailman.uni-konstanz.de on behalf of
>>>>ekim...@contrext.com> wrote:
>>>>
>>>>>I'm implementing server-side git hooks for use in GitLab under Docker
>>>>>where Java is not available (at least that I can see). The hooks load
>>>>>or delete files from databases in BaseX.
>>>>>
>>>>>I'm trying to implement the hooks in Ruby (which is much more pleasant
>>>>>than bash scripting in any case) and I'm using the BaseXClient.rb from
>>>>>https://github.com/BaseXdb/basex/tree/master/basex-api/src/main/ruby
>>>>>
>>>>>I need to create or replace files by sending the bytes--I'd rather not
>>>>>read the input file into a Ruby string and send that since I don't
>>>>>trust Ruby to not hose up the data (even when it's UTF-8 I still don't
>>>>>trust it, but I only started using Ruby yesterday so maybe my mistrust
>>>>>is misplaced?).
>>>>>
>>>>>Using the AddExample.rb as guide, I'm doing this:
>>>>>
>>>>>(Earlier code to open or create database, which works).
>>>>>
>>>>>file = File.new("../../" + path, "rb")
>>>>>bytes = file.read
>>>>>file.close
>>>>>puts "file=/#{bytes}/"
>>>>>@basex.add(path, "#{bytes}")
>>>>>
>>>>>I also tried:
>>>>>
>>>>>@basex.add(path, bytes)
>>>>>
>>>>>And I get this result (I added some debugging messages to sendCmd()):
>>>>>
>>>>>ensureDatabase(): Checking database "_dfst^metadata^temp^master"...
>>>>>BaseXResult: Database '_dfst^metadata^temp^master' was opened in 1.53 ms.
>>>>>Added or modified file: "test-newname.xml"
>>>>>file=/<test>This is a test 20</test>
>>>>>/
>>>>>
>>>>>*** sendCmd():
>>>>>cmd=
>>>>>arg=test-newname.xml
>>>>>input=<test>This is a test 20</test>
>>>>>BaseXClient.rb:110:in `sendCmd': "test-newname.xml.xml" (Line 1):
>>>>>Premature end of file. (RuntimeError)
>>>>>
>>>>>    from commit-hooks/git/server-side/BaseXClient.rb:64:in `add'
>>>>>    from commit-hooks/git/server-side/post-receive:80:in `block in update'
>>>>>    from commit-hooks/git/server-side/post-receive:74:in `each'
>>>>>    from commit-hooks/git/server-side/post-receive:74:in `update'
>>>>>    from commit-hooks/git/server-side/post-receive:111:in `block in <main>'
>>>>>    from commit-hooks/git/server-side/post-receive:103:in `each'
>>>>>    from commit-hooks/git/server-side/post-receive:103:in `<main>'
>>>>>Eliots-MBP:hooks ekimber$
>>>>>
>>>>>A couple of things here:
>>>>>
>>>>>Where is the extra ".xml" in the target filename coming from?
>>>>>
>>>>>What is causing the premature end of file? It feels like it's trying to
>>>>>interpret the second argument as a filename rather than the data to be
>>>>>loaded.
>>>>>
>>>>>If I use basex.execute("add to #{path} #{bytes}") it works but of
>>>>>course I get duplicate files if I run the command twice.
>>>>>
>>>>>If I try:
>>>>>
>>>>>@basex.execute("replace #{path} #{bytes}")
>>>>>
>>>>>Then I get the same failure.
>>>>>
>>>>>So something is not right.
>>>>>
>>>>>My Docker container is running 8.4.1 beta.
>>>>>
>>>>>What am I missing?
>>>>>
>>>>>Thanks,
>>>>>
>>>>>Eliot
>>>>>----
>>>>>Eliot Kimber, Owner
>>>>>Contrext, LLC
>>>>>http://contrext.com