> On 8 Jan 2015, at 12:36, Andreas Lehmkuehler <[email protected]> wrote:
>
> On 08.01.2015 at 21:03, John Hewson wrote:
>>
>>> On 8 Jan 2015, at 06:54, Leonard Rosenthol <[email protected]> wrote:
>>>
>>>> That's the benefit of the current approach that after the deep cloning
>>>> the imported page is independent from the source page.
>>>
>>> And that’s what you (IMO) want to maintain.
>>>
>>> Once you copy the object(s) to the new document they are, in fact, new
>>> objects and should be treated that way. They may still point back to data
>>> in the original file (since you don’t need to copy the stream until write
>>> time (aka COW model)) but the object itself is now part of the tree of the
>>> new document.
>>
>> This is not the Java way of doing things, if I add an object to a list I can
>> share that object among other lists and mutate it at a later point.
>>
>> Likewise, I can add a COSDictionary to a COSDocument and share that
>> among other COSDocuments and mutate it at a later point.
>>
>> That’s exactly what a Java developer would expect - if they are to produce
>> a copy of an object, then they expect to call some sort of clone() API to
>> do so. There’s no real way around this in Java - it’s fundamental.
>>
>>> One thing you might want to think about is shallow vs. deep copies. In
>>> many of the other libraries, the CosObjCopy() method (or equivalent
>>> thereof) offers the option to copy the full tree/structure or just the top
>>> level - this distinction is useful in a variety of operations. Perhaps
>>> that might help here as well.
>>
>> Yes, we’re doing a deep copy and there’s no real need for it, a shallow
>> copy would work just as well - that’s what I’m trying to advocate.
>>
>> I’m going a bit further and pointing out that in Java there’s no need for
>> any copy. If the user wants to share objects between documents, then
>> Java enables that by default and we’d have to work very hard to stop
>> them doing something natural.
>>
>> It sounds like we should probably add a copy() API to our COS objects
>> rather than having it in an easily-overlooked utility class, and also
>> deepCopy() for COSStreams. But there’s no need to use them, unless
>> the user actually desires a copy - it’s perfectly valid to share COS objects
>> in Java.
>
> I have the impression that somehow the focus got lost. In your first post you
> wrote about users who want to copy pages and IMO those people don't care
> about shallow or deep copies. They just want to get the job done. They won't
> be aware of the fact that a resulting pdf is connected to one or more source
> pdfs if we use shared objects. That might lead to broken/unwanted altered
> pdfs if those users add/remove/alter objects in one of the involved.

I’d argue the opposite - in Java one expects objects to be shared unless there
is an explicit call to clone(). For example, can you think of an example from
the Java standard library where explicit copying occurs? I can’t. There’s just
no way to fight this in Java, it’s a fact of life. It’s not like C++ where
there are copy semantics.

But I’m not suggesting that we get rid of cloning from LayerUtility. I’m
pointing out that there already exists a far simpler way of using PDFBox’s
API, but that it is hampered by the behaviour of COSDocument’s close() method
due to various historical misunderstandings about GC in Java.

> I already asked this sooner, but I'm happy to repeat my question about
> concurrent editing.
> How do we ensure that the whole stuff is "foolproof", so that people who
> don't have a clue about the internals can use it without breaking their pdfs
> by accident?

You can’t. Objects in Java are passed by reference, and there’s nothing we can
do about it. Today you can use the PDFBox API to take a COSDictionary from one
document and insert it directly into another document and *it’s fine*, it
works. Except that COSDictionary’s close() method clobbers its objects, which
causes silent failures, for no reason other than there seems to have been a
historical misunderstanding about GC.

In other words, the current API allows users to shoot themselves in the foot
because it corrupts COS objects from closed documents. All I’m proposing is to
fix that by not clearing the memory of the COS objects when closing their
parent document. For COSStreams this won’t work, as the underlying stream has
been closed, but by adding an isClosed() method we can throw a friendly error
explaining the problem. Currently the user gets an NPE.

It’s really very simple.

>> -- John
>
> BR
> Andreas Lehmkühler
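
To illustrate the two points made in the reply above, here is a minimal sketch in plain Java. It does not use PDFBox at all: the "documents" are ordinary lists, the "dictionary" is a HashMap, and SketchStream is a hypothetical stand-in invented for this example, not PDFBox's COSStream. The first half shows that adding one object to two containers shares it rather than copying it; the second half shows how an isClosed() check could turn an NPE into a descriptive error.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedObjectsSketch {

    /** Hypothetical stand-in for a stream-backed object; NOT PDFBox's COSStream. */
    static final class SketchStream {
        private byte[] data;
        private boolean closed;

        SketchStream(byte[] data) {
            this.data = data.clone();
        }

        boolean isClosed() {
            return closed;
        }

        void close() {
            closed = true;
            data = null; // backing storage is released when the stream is closed
        }

        byte[] read() {
            if (closed) {
                // Descriptive error instead of a NullPointerException on 'data'.
                throw new IllegalStateException(
                        "Stream was closed together with its source document; "
                        + "copy it before closing the source");
            }
            return data.clone();
        }
    }

    public static void main(String[] args) {
        // Two "documents", modelled here as plain lists of dictionaries.
        List<Map<String, Object>> docA = new ArrayList<>();
        List<Map<String, Object>> docB = new ArrayList<>();

        Map<String, Object> dict = new HashMap<>();
        dict.put("Type", "Page");
        docA.add(dict);
        docB.add(dict); // no copy is made: both lists hold the same object

        dict.put("Rotate", 90); // the change is visible through both lists
        System.out.println("Same instance: " + (docA.get(0) == docB.get(0)));
        System.out.println("Rotate seen via docB: " + docB.get(0).get("Rotate"));

        SketchStream stream = new SketchStream(new byte[] {1, 2, 3});
        stream.close();
        try {
            stream.read();
        } catch (IllegalStateException e) {
            System.out.println("Friendly error: " + e.getMessage());
        }
    }
}
```

Whether the real fix would use IllegalStateException, IOException, or something else is an open design choice; the sketch only shows the shape of the guard.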
