On Tue, Feb 17, 2009 at 19:56:40 -0800, [email protected] wrote:
> > Date: Tue, 17 Feb 2009 16:32:39 -0500
> > From: Michael Gold <[email protected]>
> >
> > > - What's the point of header_string?
> > > - The client probably doesn't care about the header -- they just
> > >   want the library to open a valid PDF.
> > >
> > > The user may want to open non-pdf file such as a FDF file, that uses
> > > the "%FDF-" header. Also, we may want to introduce new headers such as
> > > "%GNUPDF-".
> >
> > Why would we introduce a new header?
>
> Who is the client here ?
>
> BTW, don't forget there is an upper 'document' layer which could provide a
> wrapper for all this 'low-level' details (e.g if client==PDFviewer)

Right, the client can be internal (the document layer) or external.  It
seems unlikely that we'd want a nonstandard header in either case.

> > For the standard headers, the client could check using the
> > pdf_obj_doc_get_header you proposed, but that still seems like a
> > low-level detail they shouldn't have to deal with.
> >
> > We could have separate functions for opening PDF or FDF, or a function
> > that returns the type (PDF or FDF).
>
> Here again, client is ambiguous to me. Though the procedure you mention,
> IMHO, is in the correct level of abstraction.
>
> Moreover the phrase 'low-level' depends on where you're looking at it
> from, and although a procedure is defined on some API it's not enough
> reason for the user to call it.

True, but I'd still prefer to hide details when it's not useful to
expose them, and to keep them in the proper layer.

I don't think passing a header string on open/save provides a real
benefit.  It may seem like a generic way to handle different types of
files, but in reality the header won't be the only difference:
  - the encoding of names varies depending on the PDF version
  - the xref table is optional for FDF files
  - FDF files can't use indirect /Length values for streams

So the object layer needs to know the file type anyway (not just the
header string); and given this, it can choose an appropriate header when
saving, or detect an appropriate header when opening.

> > I don't really like the idea of the library creating temporary files on
> > its own.  Opening a file in a library can cause security issues, for
> > example:
> >   http://udrepper.livejournal.com/20407.html
> > (Linux 2.6.27 is needed to protect against this, and I'm sure there are
> > operating systems without this feature.)
>
> Interesting security risk, but if we make all memory-based, how much memory
> will we need, on average, to edit a document ?
> Maybe we could provide this as an optional feature for the poor user with
> few MB of RAM.

I definitely wouldn't want to require all data to be stored in RAM.

The callback idea I suggested in my first mail would be OK for
low-memory systems, and would work as follows:
  - when creating a stream object (pdf_obj_stream_new), the client would
    provide a callback function instead of a pdf_stm_t
  - when saving a file, the object layer would execute this callback
    when it needed to write a stream
  - the callback would return an open pdf_stm_t
  - the object layer would read this until EOF, writing the filtered
    data to the output file; then it would close the stream

The client could provide a callback that reads from a file (possibly a
temp file) if it wanted to.

-- Michael
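
For illustration only, the callback flow described above could look
roughly like the sketch below.  None of this is the library's actual
API: a plain FILE * stands in for pdf_stm_t, and stream_open_fn,
stream_obj, write_stream_body, open_from_path and open_from_memory are
all hypothetical names.  Filtering is left out; the data is copied
through unchanged.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: returns an open stream positioned at the
 * start of the data.  FILE * is a stand-in for pdf_stm_t. */
typedef FILE *(*stream_open_fn)(void *user_data);

/* Hypothetical stream object: stores the callback, not the data. */
typedef struct {
    stream_open_fn open;   /* invoked only when the file is saved */
    void *user_data;       /* passed back to the callback untouched */
} stream_obj;

/* What the object layer might do at save time: run the callback, copy
 * the stream to the output until EOF, then close the stream.
 * Returns the number of bytes written, or -1 on error. */
long write_stream_body(const stream_obj *s, FILE *out)
{
    assert(s != NULL && s->open != NULL && out != NULL);

    FILE *in = s->open(s->user_data);
    if (in == NULL)
        return -1;

    char buf[4096];
    long total = 0;
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        /* a real implementation would apply the stream filters here */
        if (fwrite(buf, 1, n, out) != n) {
            fclose(in);
            return -1;
        }
        total += (long)n;
    }

    int failed = ferror(in);
    fclose(in);
    return failed ? -1 : total;
}

/* Example client callback: the data lives in a file named by user_data. */
FILE *open_from_path(void *user_data)
{
    return fopen((const char *)user_data, "rb");
}

/* Example client callback: the client (not the library) chooses to
 * spool a string through a temp file and hand back the open stream. */
FILE *open_from_memory(void *user_data)
{
    const char *text = user_data;
    FILE *f = tmpfile();
    if (f == NULL)
        return NULL;

    size_t len = strlen(text);
    if (fwrite(text, 1, len, f) != len) {
        fclose(f);
        return NULL;
    }
    rewind(f);
    return f;
}
```

With this shape the library never opens a file itself; whether a temp
file is involved is entirely the client's decision, and only one
buffer's worth of stream data is held in memory at a time.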
