Re: [uf-discuss] hCite elevator pitch and my bibliography generator

Henri Sivonen Thu, 22 Mar 2007 13:14:24 -0800

(Sorry about my frustrated tone. I always get frustrated when I try to extract implementation directions from the wiki and fail. This isn't the first time. And I can read specs in general.)


On Mar 10, 2007, at 23:10, Paul Wilkins wrote:

Henri Sivonen wrote:
I needed a .bib-based bibliography generator for XHTML, so I wrote one with help from a friend who had developed a .bib parser. The output of my generator can be seen at http://hsivonen.iki.fi/thesis/html5-conformance-checker.xhtml#references
I've wrapped the values of .bib fields in elements whose class name is the .bib field name. I did it just in case. I don't have any consumer use case for those class names. It was just super-easy to generate them.
My use case (publishing an academic paper with a bibliography) is not mentioned as a use case at http://microformats.org/wiki/citation-brainstorming . More to the point, the wiki has no consumer use case for my publication use case.
Does this mean that hCite is not for me at all?
Not at all. You are using the BibTeX format, which is covered in the citation formats http://microformats.org/wiki/citation-formats

Sure, but considering that I share my .bib, should I expect people to want to scrape my (X)HTML-formatted bibliography?

If hCite is for me, what's the elevator pitch convincing me to put more effort into my generator? What benefits should I expect if I do? Is hCite mature enough to be implemented yet?
The citation microformat is a work in progress at this stage, so it's not mature enough for programs to extract information from it,

I guess this means that I shouldn't try to support hCite on the generator side in my thesis, considering that the document should go final in the first week of April.

Would it be of any use to anyone if I wrapped the name of each author/editor in a <span class='fn'> if I otherwise leave my markup the way it is now?
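
To make the question concrete, here is a minimal sketch of what wrapping each author/editor name in a span with class "fn" could look like on the generator side. It is not the actual generator; the helper names wrap_field and wrap_authors are hypothetical, and it assumes the author field uses the usual BibTeX convention of separating names with " and ".

```python
from html import escape


def wrap_field(class_name, value):
    """Wrap a field value in a span whose class is the given name (hypothetical helper)."""
    return '<span class="{}">{}</span>'.format(
        escape(class_name, quote=True), escape(value)
    )


def wrap_authors(author_field):
    """Split a BibTeX-style author field on ' and ' and mark each name as 'fn'.

    Assumption: names are kept as formatted strings; no attempt is made to
    split them into given/family parts.
    """
    names = [name.strip() for name in author_field.split(" and ")]
    return ", ".join(wrap_field("fn", name) for name in names if name)


if __name__ == "__main__":
    print(wrap_authors("Gavin Thomas Nicol and Henrik Frystyk Nielsen"))
```

The names pass through as formatted strings, so the presentation stays whatever the generator already produces around them.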

The benefits are that people visiting your content with next generation tools will be able to easily extract and use the information in more interesting and useful ways.

So basically, my effort would not be about catering to specific realistic foreseeable use cases. Instead, it would be about putting data out there in case someone figures out a use case later on.

Tantek has a recent presentation about the big picture of microformats at http://tantek.com/presentations/2007/02/microformats/

I think I know the base theory. I am interested in practical use cases and implementability in this particular case.

Moreover, is it even possible to generate hCite from my source data (http://hsivonen.iki.fi/thesis/dippa.bib) without sacrificing the presentation that I want and without potentially generating bogus markup for personal names?
One of the big ideas behind the use of microformats is that it will allow you to mark up the content on your page without modifying the presentation of it.

Somehow, I was under the impression that hCite required bibliography items as <li>s instead of <dt>/<dd> pairs (which is what I use and what W3C and WHATWG specs use).

For example, my source data does not explicitly encode the given name, the family name and other stuff that isn't quite either. As far as I can tell, it is impossible to tell heuristically that the middle token in these two names is semantically different:
Gavin Thomas Nicol
Henrik Frystyk Nielsen
Those issues haven't yet been covered for the citation microformat.

What I'm trying to say is that I think hCite should allow names to be marked up as formatted names, tossing the deformatting problem to the consumer. After all, one of the most popular bibliography data formats, BibTeX, stores formatted names.

It may be possible for a generator to parse through them and extract the appropriate information though. For example, honorific-prefix and honorific-suffix are a rather short list. Then after those, the given name, family name and additional name could be extracted in that particular order.

Using heuristics in the generator to make explicit metadata statements is generally a bad idea. If the result is wrong, it still pretends to be authoritative. If heuristics are involved, the input to the heuristic should be sent and consumers should be able to compete on how good their heuristics are.
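
As a rough illustration of the concern, the sketch below applies a purely positional heuristic (first token is the given name, last token is the family name, anything between is an additional name) to the two names quoted above. The function names naive_split and formatted_only are hypothetical, and the heuristic is an assumption made for illustration, not something the hCite draft prescribes.

```python
def naive_split(formatted_name):
    """Positional heuristic: first token given, last token family, rest additional."""
    tokens = formatted_name.split()
    if not tokens:
        return {"given-name": "", "additional-name": "", "family-name": ""}
    return {
        "given-name": tokens[0],
        "additional-name": " ".join(tokens[1:-1]),
        "family-name": tokens[-1] if len(tokens) > 1 else "",
    }


def formatted_only(formatted_name):
    """Alternative: publish only the formatted name and leave parsing to the consumer."""
    return {"fn": formatted_name}


if __name__ == "__main__":
    for name in ("Gavin Thomas Nicol", "Henrik Frystyk Nielsen"):
        # Both middle tokens end up labelled "additional-name", whether or
        # not that is what the token actually is.
        print(naive_split(name))
        print(formatted_only(name))
```

The heuristic gives both middle tokens the same label, so if they really are semantically different, at least one of the generated statements is wrong while still looking authoritative. Publishing only the formatted name avoids making that claim.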


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/


_______________________________________________
microformats-discuss mailing list
microformats-discuss@microformats.org
http://microformats.org/mailman/listinfo/microformats-discuss
