[ https://issues.apache.org/jira/browse/HADOOP-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641174#action_12641174 ]
Sanjay Radia commented on HADOOP-4044:
--------------------------------------
> I would like to avoid a design that incurs the overhead of an additional RPC
> every time a link is traversed.
> +1. This will affect not only NNBench but all benchmarks, including DFSIO and
> especially NNThroughputBenchmark.
> GridMix and Sort will probably be less affected, but will suffer too.
+1. I would also like to avoid an extra RPC, since avoiding one is
straightforward.
Doug > What did you think about my suggestion above that we might use a cache
to avoid this? First, we implement the naive approach, benchmark it, and, if
it's too slow, optimize it with a pre-fetch cache of block locations.
Clearly your cache solution deals with the extra RPC issue.
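To make the cache idea concrete, here is a minimal client-side sketch (all names are illustrative, not actual Hadoop APIs): memoize a link's resolved target the first time it is fetched, so the extra RPC is paid only on the first traversal of each link.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the pre-fetch/cache approach. The resolveRpc
// function stands in for the extra round trip to the NameNode; the cache
// ensures repeated traversals of the same link skip that round trip.
class LinkCache {
    private final Map<String, String> targets = new HashMap<>();
    private final Function<String, String> resolveRpc; // the "extra RPC"
    private int rpcCount = 0;                          // for illustration only

    LinkCache(Function<String, String> resolveRpc) {
        this.resolveRpc = resolveRpc;
    }

    // Returns the cached target if present; otherwise performs the RPC
    // once and remembers the result.
    String resolve(String path) {
        return targets.computeIfAbsent(path, p -> {
            rpcCount++;
            return resolveRpc.apply(p);
        });
    }

    int rpcCount() { return rpcCount; }
}
```

The sketch also illustrates the drawback discussed here: the cache is an optimization bolted onto the naive design, and correctness questions (staleness when a link's target changes) still have to be answered separately.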
Generally I see a cache as a way of improving the performance of an already
good design or algorithm. I don't like using a cache as part of a design to
make an algorithm work when alternative good designs are available that don't
need one. Would we have come up with this design if we hadn't had such an
emotionally charged discussion on exceptions?
We have a good design: if resolution fails due to a symlink, we return this
information to the caller. It does not require a cache.
We are divided over how to return this information: via the return status or
via an exception.
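For concreteness, a minimal sketch of the two alternatives under discussion (all names hypothetical, not actual Hadoop APIs):

```java
// Alternative 1: return status. The RPC returns a result object that the
// caller must inspect before using the data.
class ResolveResult {
    final boolean isLink;    // true if resolution stopped at a symlink
    final String linkTarget; // target path to re-resolve, when isLink
    final byte[] data;       // payload, when !isLink

    ResolveResult(boolean isLink, String linkTarget, byte[] data) {
        this.isLink = isLink;
        this.linkTarget = linkTarget;
        this.data = data;
    }
}

// Alternative 2: exception. The RPC throws when it hits a symlink, and
// the client library catches and retries with the link's target.
class UnresolvedLinkException extends Exception {
    final String linkTarget;

    UnresolvedLinkException(String linkTarget) {
        this.linkTarget = linkTarget;
    }
}
```

With the status variant every call site must check `isLink` before touching the payload; with the exception variant the common non-link path is unchanged and only the retry loop in the client library handles the catch.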
The cache solution is a way to avoid making this painfully emotionally charged
decision for the Hadoop community.
I don't want to have to explain to Hadoop developers, again and again down the
road, why we use the cache.
We should not avoid the decision, but make it.
A couple of weeks ago I was confident that a compromise vote would pass. I am
hoping that the same is true now.
> Create symbolic links in HDFS
> -----------------------------
>
> Key: HADOOP-4044
> URL: https://issues.apache.org/jira/browse/HADOOP-4044
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: HADOOP-4044-strawman.patch, symLink1.patch,
> symLink1.patch, symLink4.patch, symLink5.patch, symLink6.patch,
> symLink8.patch, symLink9.patch
>
>
> HDFS should support symbolic links. A symbolic link is a special type of file
> that contains a reference to another file or directory in the form of an
> absolute or relative path and that affects pathname resolution. Programs
> which read or write to files named by a symbolic link will behave as if
> operating directly on the target file. However, archiving utilities can
> handle symbolic links specially and manipulate them directly.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.