I’ve formalized this issue here: https://issues.apache.org/jira/browse/TIKA-1726
Please take the time to share your opinion on the new method names, so I can go ahead and provide some patches.

*From:* Yaniv Kunda [mailto:yaniv.ku...@answers.com]
*Sent:* Monday, August 31, 2015 18:51
*To:* dev@tika.apache.org
*Subject:* Re: Adding API support for Java 7's java.nio.file.Path

I've already done that, I'm just waiting for the group's opinions on names for the new methods, especially the two that I've added to augment
org.apache.tika.io.TemporaryResources#createTemporaryFile
and
org.apache.tika.io.TikaInputStream#getFile
as described below.

On Aug 31, 2015 3:26 PM, "Konstantin Gribov" <gros...@gmail.com> wrote:

My two cents: we can migrate to Files.copy, Files.newBufferedReader etc. in places where it can replace commons-io and Tika's internal copy of it.

Sat, Aug 29, 2015 at 19:48, Ken Krugler <kkrugler_li...@transpac.com> wrote:

> > From: Yaniv Kunda
> > Sent: August 29, 2015 2:21:23am PDT
> > To: dev@tika.apache.org
> > Subject: RE: Adding API support for Java 7's java.nio.file.Path
> >
> > In addition to the discussion I've raised about the methods returning a
> > File, I have another problem:
> > Some of the methods that accept a File throw a FileNotFoundException.
> > This exception is thrown by FIS/FOS/RAF constructors in response to
> > anything - from a file that's actually not there to access denied.
> > The NIO API methods usually declare to throw an IOException, which can be
> > a subclass representing a more accurate reason - NoSuchFileException or
> > AccessDeniedException.
> >
> > When adding the overloaded methods accepting a Path, I initially thought
> > to delegate the old methods to the new ones, but the new ones declare an
> > IOException while the old declare a FileNotFoundException.
> >
> > I have three options:
> > 1) Leave the old methods with their own code -
> > this means essentially duplicate code, but complete backward
> > compatibility.
>
> +1
>
> I don't feel strongly, but I think we get max bang for the development
> buck by doing the simplest thing here.
>
> And it doesn't feel like it'll be that long before Tika 2.0, when the old
> method code can be removed.
>
> -- Ken
>
> > 2) Delegate the old methods to the new ones, but catch the IOException
> > and wrap it in a FileNotFoundException -
> > this will remain backward compatible, unless someone catching a
> > FileNotFoundException does text analysis on the exception message.
> > 3) Delegate the old methods to the new ones, and change the signature
> > accordingly to throw an IOException instead of a FileNotFoundException -
> > this will break backward compatibility, only in cases where a
> > FileNotFoundException was caught explicitly.
> >
> > What do you think?
> >
> > -----Original Message-----
> > From: Yaniv Kunda [mailto:yaniv.ku...@answers.com]
> > Sent: Friday, August 28, 2015 03:33
> > To: dev@tika.apache.org
> > Subject: RE: Adding API support for Java 7's java.nio.file.Path
> >
> > Thanks, I just like to move things forward :-)
> >
> > Regarding my proposed API additions -
> > since adding new methods will make them a part of a new API, this is a
> > chance to make their names more meaningful/concise/correct: replacing File
> > with Path in the method name might be awkward.
> >
> > I'd like to gather alternatives for the changes/additions to methods that
> > return a File.
> > I found a total of 4 methods that return a java.io.File and are public, in
> > public non-test classes and not in tika-example (I assume the rest can be
> > changed without breaking anything).
> > For each method I will provide my suggestion/s, which will be either "Add
> > newName", "Replace with newName" or "Keep":
> >
> > tika-batch:
> > - org.apache.tika.batch.fs.FSUtil#getOutputFile
> > + Keep
> > - org.apache.tika.util.PropsUtil#getFile
> > + Keep
> >
> > tika-core:
> > - org.apache.tika.io.TemporaryResources#createTemporaryFile
> > + Add addTemporaryFile
> >   Add addTempFile
> >   Add createTempFile
> > - org.apache.tika.io.TikaInputStream#getFile
> > + Add asFile
> >   Add toPath
> >   Add getPath
> >
> > I've added a '+' to the left of my preference - please add yours to your
> > preference or add a new suggestion.
> >
> > Regarding added methods - I really think that the old methods should be
> > deprecated.
> > IMO a typo or a simple name change is a good enough reason for deprecating
> > a method - so returning a legacy class makes it even more welcome.
> >
> > -----Original Message-----
> > From: Allison, Timothy B. [mailto:talli...@mitre.org]
> > Sent: Thursday, August 27, 2015 17:36
> > To: dev@tika.apache.org
> > Subject: RE: Adding API support for Java 7's java.nio.file.Path
> >
> > +1
> >
> > Thank you, Yaniv, for leading this effort.
> >
> > I have a small preference for getting rid of File entirely eventually
> > (2.0?) as Lucene and Hadoop seem to have done (?).
> >
> > -----Original Message-----
> > From: Yaniv Kunda [mailto:yaniv.ku...@answers.com]
> > Sent: Wednesday, August 26, 2015 5:31 PM
> > To: dev@tika.apache.org
> > Subject: RE: Adding API support for Java 7's java.nio.file.Path
> >
> > I can point out several benefits of supporting the new API, in no
> > particular order:
> > - Exception handling: operations like File.delete return a boolean which
> > provides less useful information if the operation failed than the
> > exception thrown by Files.delete() (or a Minion...)
> > - Performance: The new API delegates more parts of I/O operations to the
> > OS, resulting in better usage of resources.
> > In independent testing I've done (considering big files, cache warmup and
> > randomized order) I've achieved 30% faster reads when using Files.copy() or
> > FileChannel.transferTo()
> > - Adoption: Java 7, in which the new API appeared, is already EOL.
> > Supporting this API, considering that java.io is considered legacy, is good
> > for keeping us with the times, and even better for our users as it offers
> > them an incentive for moving forward as well.
> >
> > More can be found here:
> > http://docs.oracle.com/javase/tutorial/essential/io/legacy.html
> >
> > I believe that the library <-> user relationship must have a balance
> > between compatibility and progress, as if libraries are stuck at
> > compatibility - the users are sometimes stuck without progress...
> > If we can have progress without breaking compatibility - we have a winner.
> >
> > I propose to add support for and make the most of the new (4 y/o) API
> > without breaking compatibility, which means:
> > - Public methods accepting a File will not be changed; overloaded versions
> > will be added.
> > - Public methods returning a File will not be changed; methods with
> > different names will be added.
> > - Non-public methods accepting or returning a File will be changed
> > - Internal uses of the legacy I/O will be updated to use the new API where
> > easy
> >
> > Regarding deprecation, I suggest that:
> > 1) Methods accepting a File will not be deprecated - they will probably be
> > used as long as File itself is not deprecated (forever?)
> > 2) Methods returning a File will be deprecated - progressive users can use
> > the new methods easily, less progressive can use the new methods adding
> > .toFile() to the result, and the rest can still use the deprecated methods
> > (which will most likely call the new methods internally anyway).
> > To summarize: overloading = convenience, methods with the same operation
> > but different name and return value = confusing.
> >
> > If this seems like a decent proposal, I can separate this work into several
> > JIRA issues and patches, so that reviewing the changes is easier.
> >
> > -----Original Message-----
> > From: Nick Burch [mailto:apa...@gagravarr.org]
> > Sent: Wednesday, August 26, 2015 13:27
> > To: dev@tika.apache.org
> > Subject: Re: Adding API support for Java 7's java.nio.file.Path
> >
> > On Wed, 26 Aug 2015, Yaniv Kunda wrote:
> >> I would like to propose adding support for Java 7’s java.nio.file.Path
> >> as an alternative to those methods in the API that deal with a
> >> java.io.File.
> >
> > Any chance you could briefly summarise what advantages this would give to
> > us and/or our users?
> >
> >> 1) What can we do with methods returning a File? e.g.
> >> TemporaryResources.createTemporaryFile, TikaInputStream.getFile.
> >> Should we break compatibility and encourage (=force) users to change
> >> their code (Note that since they all use Java 7 now, the change is
> >> minimal by adding .toFile() to the result), or create new methods with
> >> different names (confusing)?
> >
> > Breaking compatibility outside of a 2.0 release is a big no-no, sorry.
> >
> > TemporaryResources.createTemporaryPath and TikaInputStream.getPath could
> > work as naming
> >
> >> 2) Should we deprecate the old methods accepting a File, or delete
> >> them?
> >
> > Deleting would break compatibility, so shouldn't be done. Deprecating could
> > be done, if there's a strong reason to encourage people off them
> >
> > https://wiki.apache.org/tika/Tika2_0RoadMap is where we're tracking
> > proposed API-breaking changes for 2.0
> >
> > Nick
> >
> > --
> >
> > This email communication (including any attachments) contains information
> > from Answers Corporation or its affiliates that is confidential and may be
> > privileged. The information contained herein is intended only for the use
> > of the addressee(s) named above.
> > If you are not the intended recipient (or the agent responsible to deliver
> > it to the intended recipient), you are hereby notified that any
> > dissemination, distribution, use, or copying of this communication is
> > strictly prohibited. If you have received this email in error, please
> > immediately reply to sender, delete the message and destroy all copies of
> > it. If you have questions, please email le...@answers.com.
> >
> > If you wish to unsubscribe to commercial emails from Answers and its
> > affiliates, please go to the Answers Subscription Center
> > http://campaigns.answers.com/subscriptions to opt out. Thank you.
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Cassandra & Solr

--
Best regards,
Konstantin Gribov
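
For reference, option 2 from the thread (delegate the old File method to the new Path method and wrap the IOException) might look roughly like the sketch below. This is only an illustration under assumptions: the class name PathDelegationSketch and the method name open are hypothetical and are not part of Tika's API; the only real calls used are Files.newInputStream, File.toPath and Throwable.initCause from the JDK.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, for illustration only; not a Tika class.
final class PathDelegationSketch {

    private PathDelegationSketch() {
    }

    // New overload: accepts a Path and declares the broader IOException,
    // so callers can see NoSuchFileException, AccessDeniedException, etc.
    static InputStream open(Path path) throws IOException {
        return Files.newInputStream(path);
    }

    // Legacy overload: keeps its FileNotFoundException signature but
    // delegates to the Path-based method instead of duplicating code.
    static InputStream open(File file) throws FileNotFoundException {
        try {
            return open(file.toPath());
        } catch (IOException e) {
            // FileNotFoundException has no (String, Throwable) constructor,
            // so the more specific NIO exception is attached via initCause.
            FileNotFoundException wrapped = new FileNotFoundException(e.getMessage());
            wrapped.initCause(e);
            throw wrapped;
        }
    }
}
```

With this shape, existing callers that catch FileNotFoundException keep compiling and behaving as before, while the original NIO exception stays reachable through getCause(). The exception message may differ from what a FileInputStream constructor would have produced, which is the text-analysis caveat raised for option 2.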