> It is a pity that you dedicated all that time working on that
> obsolete pdf_obj* code: quite probably we won't use it at all.
For the basic type, something like pdf_obj_t is needed by the tokeniser
anyway, even if it's never exported publicly. Could we use something
like 'pdf_token_t'?

The more I think about moving the tokeniser to the base layer, the more
convinced I am that it is a good idea:

- We could use it in the type 4 functions parser in the fp module.
- We could let the user of the library use the tokeniser, publishing it
  as a module in the base layer: some applications may find it useful.

> But it is also my fault to keep that old obsolete code in the
> repository. Sorry about that.

That's no problem. I realize some of that may be thrown away, but it
could still be useful for ideas.

Yep, I agree.

> We wouldn't necessarily lose all the benefits of opaque pointers. Only
> the size of the structure needs to be public in order to allocate it on
> the stack. The contents don't need to be documented, and we could leave
> some padding for future use as recommended in
> http://people.redhat.com/drepper/goodpractice.pdf
>
> Good point! How would that be done for our public data types
> implemented as structures? Can you provide an example?

The paper gives this example:

  struct the_struct
  {
    int foo;
    // ...and more fields
    uintptr_t filler[8];
  };

But we could make the whole thing filler and avoid declaring any fields.
We'd then need to cast it to the proper internal type when using it (we
should verify sizeof(public_struct) >= sizeof(actual_struct) as a sanity
check).

That is like a "fat" opaque pointer :) I think that we can use that
approach when publishing small opaque structures (such as Cartesian
points, list iterators, etc).

BTW, another paper some people may find useful is
http://people.redhat.com/drepper/dsohowto.pdf

Maybe we should link those papers from the web site or the hacker's
guide.

Good idea. I knew about the dsohowto but not about goodpractice.pdf.
I will drop a note in the hacker's guide about this.

> These are mainly lexical issues... maybe we could think about moving
> the lexer module to the base layer. In that way the fp module could
> use it in the little type 4 functions parser.
>
> What do you (and people) think about that? If you agree I will open
> some NEXT tasks to create the new module (and its tests, etc) and will
> mark it as a dependency for the error reporting in type 4 functions.

I don't have a problem with it, but it will need something like
pdf_obj_t to store these types: int, real, string, name, comment and
keyword, as well as the valueless types corresponding to "{", "}",
"<<", ">>", "[" and "]".

And the parser will want to put these objects inside dicts and arrays,
though it could convert them or wrap them if necessary.

What about the 'pdf_token_t' that I mentioned above? The parser in the
object layer would still be able to use pdf_token_t if needed.
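
To make the pdf_token_t idea a bit more concrete, here is a minimal
sketch of what the tokeniser's token type could look like. All of the
names below (pdf_token_type_t, struct pdf_token_s, the PDF_TOKEN_*
constants) are illustrative guesses, not an agreed interface:

  #include <stddef.h>

  /* Kinds of tokens produced by the tokeniser, covering the valued
     types listed above plus the valueless delimiter tokens.  */
  typedef enum
  {
    PDF_TOKEN_INTEGER,
    PDF_TOKEN_REAL,
    PDF_TOKEN_STRING,
    PDF_TOKEN_NAME,
    PDF_TOKEN_COMMENT,
    PDF_TOKEN_KEYWORD,
    PDF_TOKEN_PROC_START,    /* "{"  */
    PDF_TOKEN_PROC_END,      /* "}"  */
    PDF_TOKEN_DICT_START,    /* "<<" */
    PDF_TOKEN_DICT_END,      /* ">>" */
    PDF_TOKEN_ARRAY_START,   /* "["  */
    PDF_TOKEN_ARRAY_END      /* "]"  */
  } pdf_token_type_t;

  struct pdf_token_s
  {
    pdf_token_type_t type;
    union
    {
      int integer;           /* PDF_TOKEN_INTEGER */
      double real;           /* PDF_TOKEN_REAL */
      struct                 /* STRING, NAME, COMMENT, KEYWORD */
      {
        char *data;
        size_t size;
      } buffer;
    } value;
  };

  typedef struct pdf_token_s *pdf_token_t;

The parser in the object layer could then switch on the type field and
build the corresponding object, converting or wrapping the token value
as necessary.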

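Regarding the "make the whole thing filler" idea above, here is a
minimal sketch of how it could look, including the sizeof sanity check.
The pdf_point_* names and the filler size are hypothetical, and a real
implementation would also have to think about alignment and strict
aliasing:

  #include <stdint.h>
  #include <assert.h>

  /* Public header: only the size is exposed, no real fields.  */
  struct pdf_point_s
  {
    uintptr_t filler[4];
  };
  typedef struct pdf_point_s pdf_point_t;

  /* Private implementation, visible only inside the library.  */
  struct pdf_point_impl_s
  {
    double x;
    double y;
  };

  void
  pdf_point_init (pdf_point_t *point, double x, double y)
  {
    struct pdf_point_impl_s *impl;

    /* Sanity check: the public shell must be large enough to hold
       the real structure (a compile-time check would be better).  */
    assert (sizeof (struct pdf_point_s)
            >= sizeof (struct pdf_point_impl_s));

    /* Cast the public shell to the proper internal type.  */
    impl = (struct pdf_point_impl_s *) point;
    impl->x = x;
    impl->y = y;
  }

With this scheme the caller can declare a pdf_point_t on the stack
while the real fields stay private to the library.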