Hi Michael, First, please note that most of your comments are for already existing code, so we should ignore my previous patch on this discussion.
Now to the comments. > Date: Mon, 18 May 2009 19:30:27 -0400 > From: Michael Gold <[email protected]> > > On Mon, May 18, 2009 at 16:44:15 -0300, gerel wrote: > >=20 > > Hi, > >=20 > > After modifying the Fsys API I noticed that stream backends considered an= > EOF > > and an error as the same thing (i.e. zero-length write/read). I attach a= > patch > > to distinguish one from the other. > =2E.. > > pdf_stm_be_cfile_read (pdf_stm_be_t be, > =2E.. > > + *read_bytes =3D fread (buffer,=20 > > + 1, bytes, > > + be->data.cfile.file); > > =20 > > - return readed_bytes; > > + if (feof(be->data.cfile.file)) > > + { > > + ret =3D PDF_EEOF; > > + } > > + else if (ferror(be->data.cfile.file)) > > + { > > + ret =3D PDF_ERROR; > > + } > > + return ret; > > I think this code can return PDF_EEOF on a successful read (i.e. when > you read some bytes at the end of a file). PDF_OK should be returned if > any bytes were read, even if you're now at the end of the file. > So you mean it should return PDF_EEOF only if 0 bytes where read and the EOF is reached ?, if you mean that then I agree. > =2E.. > > =3D=3D=3D modified file 'src/base/pdf-stm-filter.c' > =2E.. > > + ret =3D pdf_stm_be_read (filter->backend, > > + filter->in->data, > > + filter->in->size, > > + &read_bytes); > > + > > + if (ret !=3D PDF_ERROR) > > + { =20 > > + filter->in->wp =3D read_bytes; > > + if (read_bytes < filter->in->size) > > + { > > + ret =3D PDF_EEOF; > > + } > > The condition (read_bytes < filter->in->size) has no documented > significance, i.e. it's not documented to mean EOF (or error). > In addition, it has the same bug I mentioned above. > > This check should be removed, and the backend should return PDF_EEOF > when necessary. Same answer as above. > > =2E.. > > +pdf_stm_read (pdf_stm_t stm, > =2E.. > > + if ((*read_bytes =3D=3D bytes) && > > + (ret =3D=3D PDF_EEOF)) > > + { > > + /* Avoid a false PDF_EEOF in the current operation */ > > + ret =3D PDF_OK; > > + } > > This check (*read_bytes =3D=3D bytes) doesn't make sense to me. If you get > any data at all, PDF_OK should be returned. > > Also, the text "If this value is less than bytes then an end of file > condition occurred" should be removed from this function's > documentation (and EAGAIN added as an error code). A short read can > occur before EAGAIN, but this doesn't mean anything. I agree. > For this function and all similar ones, PDF_EEOF should only be returned > when 0 bytes were read, and the end of file was reached (we don't need > to set *read_bytes in this case, unless that's documented to happen for > all error types). > It happens that now you answered my first question so we're on the same path. > =2E.. > > +pdf_stm_write (pdf_stm_t stm, > =2E.. > > + if ((*written_bytes =3D=3D bytes) && > > + (ret =3D=3D PDF_EEOF)) > > + { > > + /* Avoid a false PDF_EEOF in the current operation */ > > + ret =3D PDF_OK; > > + } > > Does PDF_EEOF make sense when writing a file? In any case, I think this > check is invalid as above. > The PDF_EEOF happens to mean disk full when writting, _very_ odd code. :-) > =2E.. > > +pdf_stm_flush (pdf_stm_t stm, > =2E.. > > + if (written_bytes !=3D cache_size) > > + { > > + /* Could not write all the contents of the cache buffer into > > + the backend */ > > + stm->cache->rp +=3D written_bytes; > > + ret =3D PDF_EEOF; > > + break; > > This function documentation says PDF_EEOF means "A disk full condition > occurred", which > - is properly represented by PDF_ENOSPC > - is not what's detected above > > The condition (written_bytes !=3D cache_size) has no documented meaning > (for pdf_stm_be_write). If pdf_stm_be_write returns PDF_OK, > pdf_stm_flush should keep trying to write more data until an error is > returned (or all data is flushed). Note that a short write can > legitimately occur before EAGAIN, and shouldn't be treated as an error. Yeah, sure. > > =2E.. > > +pdf_stm_read_peek_char (pdf_stm_t stm, > =2E.. > > + /* Is the cache empty? */ > > + ret =3D PDF_OK; > > + if (pdf_buffer_eob_p (stm->cache)) > > + { > > + ret =3D pdf_stm_filter_apply (stm->filter, PDF_FALSE); > > + } > > + > > + if (pdf_buffer_eob_p (stm->cache)) > > + { > > + ret =3D PDF_EEOF; > > + } > > + else > > + { > > + /* Read a character from the cache */ > > + *c =3D=20 > > + (pdf_u32_t) stm->cache->data[stm->cache->rp]; > > + > > + if (!peek_p) > > + { > > + stm->cache->rp++; > > + } > > + > > + if (ret =3D=3D PDF_EEOF) > > + { > > + /* Avoid a false PDF_EEOF */ > > + ret =3D PDF_OK; > > + } > > + } > =2E.. > > If pdf_stm_filter_apply returned PDF_EEOF, pdf_buffer_eob_p should be > true, which makes that final check redundant (assuming the other changes > I've suggested are made). > Sure, it all comes down to one thing: error != eof != no_read/written data. I believe the current code was meant to work mainly for memory backends. So, the set of status code for Stm module read/write operations should be: - PDF_OK - PDF_EEOF (read) - PDF_EAGAIN (read/write) - PDF_ENOSPC (write) - PDF_EINVOP (read for write mode and viceversa) - PDF_ERROR (other error) If everyone agrees, I'll work on these issues and send a patch for the Stm module. cheers, -gerel
