Re: Phobos Proposal: replace std.xml with kxml.

Andrei Alexandrescu Tue, 04 May 2010 12:00:27 -0700

Graham Fawcett wrote:

On Tue, 04 May 2010 09:09:29 -0700, Andrei Alexandrescu wrote:

Graham Fawcett wrote:

On Mon, 03 May 2010 16:01:30 -0700, Andrei Alexandrescu wrote:

Graham Fawcett wrote:

The fact that libxml2/libxslt support not only XML parsing and DOM
building, but also XSLT, XPath, XPointer, XInclude, RelaxNG, etc.,
means that any homegrown library will be hard-pressed to cover the
same range of tools and features.

There are too many half-baked XML libraries in the world. No
disrespect intended to opticron or anyone else; it just doesn't make
a lot of sense to reinvent such a complex wheel (and believing that
XML processing isn't complex is a sure sign that your homegrown
library's design is incomplete!).

Graham

I think what we need for the standard library is to take a solid XML
library licensed generously and adapt it to work with arbitrary
ranges.

By "adapt" do you mean writing a wrapper for an existing library, or
translating the source code of the library into D?

What constitutes a "generous license" in this context? (For what it's
worth, libxml2 is under the MIT License.)

Graham

We'd need to modify the code. I haven't looked into available xml
libraries so I don't know which would be eligible.


I think I understand your motivations: this is standard library, and
so you want to minimize dependencies. But from a maintenance
perspective, it seems a bad idea to translate a complex library into D
code that few people will actively maintain -- whereas writing a
wrapper (and introducing a library dependency) would keep the codebase
small, let you share maintenance costs with the third-party library's
developers, and (arguably) increase the stability and quality of the
stdlib?

I am not pushing for libxml2 as The Answer. I'm just questioning the
motivation to translate other people's code to D, when the D platform
excels at library integration. (Although I agree with your suggestion
to borrow inspiration/code from Boost for datetime and other features;
that's different, since Boost cannot feasibly be wrapped.)

Best,
Graham

My concern is purely technical - a library we just link to would force a
number of choices, such as input representation (e.g. arrays of char).
Ideally we should be able to change the library to accept any compatible
range of any compatible characters.

As a simple example, consider std.algorithm.levenshteinDistance. There
are plenty of good implementations and initially I just wrote one almost
identical to the Web lore. However, later I needed to compute
Levenshtein distances between strings stored in lists (tries, actually).
Well that doesn't work because the implementation at that time used
random access s[i] and t[i] all over the place. But it wasn't difficult
to change the algorithm to work with forward ranges. So now we have one
of the few Levenshtein distance implementations that work with other
inputs than arrays. In particular, we work correctly with UTF inputs
without needing to copy the input, something that I haven't seen
anywhere else. If you google for ``levenshtein utf'' Google will even
think the query has a typo. Search results include an OCaml
implementation that copies the input
(http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#OCaml)
and a Ruby implementation that also copies the input
(http://rubyforge.org/frs/?group_id=2080&release_id=7389). By using the
range abstraction, we get to support UTF Levenshtein without significant
additional implementation effort - the code is very similar to the one
using indices throughout.
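
To make the point about dropping random access concrete, here is a rough
sketch in Python rather than D. It is not the Phobos implementation; the
names LinkedList and levenshtein_forward are made up for illustration.
The first input is walked once and the second is re-walked once per
element of the first, which is roughly what a forward range permits.
Neither input is ever indexed. Python's str already iterates by code
point, so the sketch does not demonstrate on-the-fly UTF decoding, only
the absence of s[i] and t[j].

```python
class LinkedList:
    """Hypothetical singly linked list: sequential traversal only, no indexing."""

    class _Node:
        __slots__ = ("value", "next")

        def __init__(self, value, next_node):
            self.value = value
            self.next = next_node

    def __init__(self, items):
        # Build back to front so that iteration yields the original order.
        self._head = None
        for item in reversed(list(items)):
            self._head = LinkedList._Node(item, self._head)

    def __iter__(self):
        # Each call starts a fresh traversal from the head (multi-pass).
        node = self._head
        while node is not None:
            yield node.value
            node = node.next


def levenshtein_forward(s, t):
    """Edit distance between two iterables using sequential traversal only.

    s is traversed once; t must be re-iterable because it is traversed
    once per element of s. Only the two internal rows of costs are
    indexed, never the inputs themselves.
    """
    # Row for the empty prefix of s: distance to each prefix of t is its length.
    prev = [0]
    for _ in t:
        prev.append(prev[-1] + 1)

    for i, a in enumerate(s, start=1):
        cur = [i]  # distance from the first i elements of s to the empty prefix of t
        for j, b in enumerate(t, start=1):
            cost = 0 if a == b else 1
            cur.append(min(prev[j] + 1,          # delete a
                           cur[j - 1] + 1,       # insert b
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein_forward(LinkedList("kitten"), LinkedList("sitting")))
    print(levenshtein_forward("naïve", "naive"))
```

The same function accepts plain strings, linked lists, or any other
re-iterable container, because the only thing it asks of its inputs is
"give me the next element".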




Andrei
