Thanks, I just like to move things forward :-)

Regarding my proposed API additions -
since adding new methods will make them part of a new API, this is a
chance to make their names more meaningful/concise/correct: simply replacing
File with Path in the method name might be awkward.

I'd like to gather alternatives for the changes/additions to methods that
return a File.
I found a total of 4 methods that return a java.io.File and are public, in
public non-test classes and not in tika-example (I assume the rest can be
changed without breaking anything).
For each method I will provide my suggestion/s, which will be either "Add
newName", "Replace with newName" or "Keep":

tika-batch:
- org.apache.tika.batch.fs.FSUtil#getOutputFile
  + Keep
- org.apache.tika.util.PropsUtil#getFile
  + Keep

tika-core:
- org.apache.tika.io.TemporaryResources#createTemporaryFile
  + Add addTemporaryFile
    Add addTempFile
    Add createTempFile
- org.apache.tika.io.TikaInputStream#getFile
  + Add asFile
    Add toPath
    Add getPath

I've added a '+' to the left of my preference -
please add yours to your preference or add a new suggestion.

Regarding added methods - I really think that the old methods should be
deprecated.
IMO a typo or a simple name change is a good enough reason for deprecating a
method - so returning a legacy class makes it even more welcome.

-----Original Message-----
From: Allison, Timothy B. [mailto:talli...@mitre.org]
Sent: Thursday, August 27, 2015 17:36
To: dev@tika.apache.org
Subject: RE: Adding API support for Java 7's java.nio.file.Path

+1

Thank you, Yaniv, for leading this effort.

I have a small preference for getting rid of File entirely eventually (2.0?)
as Lucene and Hadoop seem to have done (?).

-----Original Message-----
From: Yaniv Kunda [mailto:yaniv.ku...@answers.com]
Sent: Wednesday, August 26, 2015 5:31 PM
To: dev@tika.apache.org
Subject: RE: Adding API support for Java 7's java.nio.file.Path

I can point out several benefits of supporting the new API, in no particular
order:
- Exception handling: operations like File.delete return a boolean which
provides less useful information if the operation failed than the exception
thrown by Files.delete() (or a Minion...)
- Performance: The new API delegates more parts of I/O operations to the OS,
resulting in better usage of resources.
In independent testing I've done (considering big files, cache warmup and
randomized order) I've achieved 30% faster reads when using Files.copy() or
FileChannel.transferTo()
- Adoption: Java 7, in which the new API appeared, is already EOL.
Supporting this API, given that java.io is considered legacy, is good
for keeping up with the times, and even better for our users as it offers
them an incentive to move forward as well.
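
To illustrate the exception-handling point, here is a minimal sketch comparing the two delete calls on a file that does not exist. The class and method names (DeleteComparison, deleteLegacy, deleteNio) are made up for this example and are not part of Tika.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical demo class, not part of Tika.
public class DeleteComparison {

    // Legacy API: a failure is reported only as "false", with no reason attached.
    static boolean deleteLegacy(File file) {
        return file.delete();
    }

    // NIO.2 API: a failure surfaces as a specific IOException subclass.
    static String deleteNio(Path path) {
        try {
            Files.delete(path);
            return "deleted";
        } catch (NoSuchFileException e) {
            return "no such file: " + e.getFile();
        } catch (IOException e) {
            return "I/O error: " + e;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("demo");
        Path missing = dir.resolve("missing.txt");

        // The legacy call cannot say why it failed.
        System.out.println(deleteLegacy(missing.toFile()));
        // The NIO.2 call names the cause.
        System.out.println(deleteNio(missing));

        Files.delete(dir); // clean up the empty temporary directory
    }
}
```

With File.delete() the caller has to guess whether the file was missing, locked, or a non-empty directory; with Files.delete() each of those cases arrives as a distinct exception type.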

More can be found here:
http://docs.oracle.com/javase/tutorial/essential/io/legacy.html

I believe that the library <-> user relationship must balance compatibility
and progress: if libraries are stuck on compatibility, their users are
sometimes stuck without progress...
If we can have progress without breaking compatibility - we have a winner.

I propose to add support for and make the most of the new (4 y/o) API
without breaking compatibility, which means:
- Public methods accepting a File will not be changed; overloaded versions
will be added.
- Public methods returning a File will not be changed; methods with
different names will be added.
- Non-public methods accepting or returning a File will be changed
- Internal uses of the legacy I/O will be updated to use the new API where
easy
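
The first two rules above could look roughly like this. It is only a sketch: DocumentSource is a hypothetical class invented for the example, not an existing Tika type, and the real classes may well be structured differently.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class, not part of Tika; it only illustrates the pattern.
public class DocumentSource {
    private final Path path;

    // New overload: accepts the NIO.2 type.
    public DocumentSource(Path path) {
        this.path = path;
    }

    // Existing signature is kept as-is; it simply delegates to the new overload.
    public DocumentSource(File file) {
        this(file.toPath());
    }

    // Existing method returning a File: signature unchanged.
    public File getFile() {
        return path.toFile();
    }

    // New method with a different name, returning a Path.
    public Path getPath() {
        return path;
    }

    // Internal use of the new API instead of File.length().
    public long size() throws IOException {
        return Files.size(path);
    }
}
```

Accepting a Path can be done with a plain overload, but Java does not allow two methods that differ only in return type, which is why the methods returning a Path need new names.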

Regarding deprecation, I suggest that:
1) Methods accepting a File will not be deprecated - they will probably be
used as long as File itself is not deprecated (forever?)
2) Methods returning a File will be deprecated - progressive users can use
the new methods easily, less progressive ones can use the new methods by
adding .toFile() to the result, and the rest can still use the deprecated
methods (which will most likely call the new methods internally anyway).
To summarize: overloading = convenience, methods with the same operation but
different name and return value = confusing.
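
As a rough sketch of point 2, assuming a made-up class TempStore with made-up method names (the actual names are still open for discussion), the deprecated method would just wrap the new one:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class for illustration; the names are assumptions, not Tika's API.
public class TempStore {

    // New method: returns the NIO.2 type.
    public Path createTempPath() throws IOException {
        Path p = Files.createTempFile("demo-", ".tmp");
        p.toFile().deleteOnExit();
        return p;
    }

    /**
     * Old method, kept for compatibility.
     *
     * @deprecated use {@link #createTempPath()} instead.
     */
    @Deprecated
    public File createTempFile() throws IOException {
        // Delegates to the new method, so there is a single implementation.
        return createTempPath().toFile();
    }

    public static void main(String[] args) throws IOException {
        TempStore store = new TempStore();
        Path modern = store.createTempPath();           // progressive users
        File bridged = store.createTempPath().toFile(); // new method plus .toFile()
        File legacy = store.createTempFile();           // still works, with a deprecation warning
        System.out.println(Files.exists(modern) && bridged.exists() && legacy.exists());
    }
}
```

All three call styles end up in the same code path, so deprecating the File-returning method costs existing users nothing but a compiler warning.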

If this seems like a decent proposal, I can separate this work into several
JIRA issues and patches, so that reviewing the changes is easier.

-----Original Message-----
From: Nick Burch [mailto:apa...@gagravarr.org]
Sent: Wednesday, August 26, 2015 13:27
To: dev@tika.apache.org
Subject: Re: Adding API support for Java 7's java.nio.file.Path

On Wed, 26 Aug 2015, Yaniv Kunda wrote:
> I would like to propose adding support for Java 7’s java.nio.file.Path
> as an alternative to those methods in the API that deal with a
> java.io.File.

Any chance you could briefly summarise what advantages this would give to us
and/or our users?

> 1) What can we do with methods returning a File? e.g.
> TemporaryResources.createTemporaryFile, TikaInputStream.getFile.
> Should we break compatibility and encourage (=force) users to change
> their code (Note that since they all use Java 7 now, the change is
> minimal by adding .toFile() to the result), or create new methods with
> different names (confusing)?

Breaking compatibility outside of a 2.0 release is a big no-no, sorry.

TemporaryResources.createTemporaryPath and TikaInputStream.getPath could
work as naming

> 2) Should we deprecate the old methods accepting a File, or delete
> them?

Deleting would break compatibility, so shouldn't be done. Deprecating could
be done, if there's a strong reason to encourage people off them


https://wiki.apache.org/tika/Tika2_0RoadMap is where we're tracking proposed
API-breaking changes for 2.0

Nick

-- 


This email communication (including any attachments) contains information
from Answers Corporation or its affiliates that is confidential and may be
privileged. The information contained herein is intended only for the use of
the addressee(s) named above. If you are not the intended recipient (or the
agent responsible to deliver it to the intended recipient), you are hereby
notified that any dissemination, distribution, use, or copying of this
communication is strictly prohibited. If you have received this email in
error, please immediately reply to sender, delete the message and destroy
all copies of it. If you have questions, please email le...@answers.com.

If you wish to unsubscribe to commercial emails from Answers and its
affiliates, please go to the Answers Subscription Center
http://campaigns.answers.com/subscriptions to opt out.  Thank you.
