On 9 January 2016 at 03:48, Stefan Fuhrmann <stef...@apache.org> wrote: > > Hi Ivan & interested parties, > Hi Stefan!
See my response inline. > Excellent of you starting with the API work! For all > who don't know yet: Ivan and I talked about this idea > in Berlin and agreed that it sounds like a good idea. > I had played with some prototypic implementation FSX > and learned a bit about the its pitfalls. > > Below there is a list of questions that I had when > reviewing the first few commits. There are also sections > on what I have in mind and partly in code for the new API. > > -- Stefan^2. > > > Questions / Feedback to fs-node-api as of r1723819 > -------------------------------------------------- > > [Not commenting on the implementation since that depends > on points in the other sections.] > > * BRANCH-README: "The branch is NOT going to be merged to > trunk; I'm going to re-implement everything in trunk ..." > > - Why implementing it a second time (assuming the new > API turns out to be useful)? > - Do you expect the code to be "too prototypic" to be > used for production? If so, any specific indication > for that? > - I think having the feature developed on a branch makes > sense because the switch-over to the new API will be > very noisy. Otherwise, it could be done on /trunk > directly - and be reverted again if necessary. > I don't like developing new features in branches, so I consider this branch as experiment to explore the FS node API idea and make important decisions. It's not a big problem to re-implement all this on trunk using multiple incremental changes with detailed log messages and proper docstrings. > * svn_fs.h > > - Maybe, move the new API functions to new section. > They will not always be 1:1 replacements. > Sure, but for simplicity I just revving existing functions to accept svn_fs_node_t instead ROOT+PATH. > * "svn_fs_node_t does not depends on svn_fs_root object created from" > > - I'm fine with that. Any specific reason behind this > other than that node_t should be be self-contained > within fs_t and result_pool? > libsvn_repos/reporter.c open and closes svn_fs_root_t dynamically and it will be very challenging to rewrite to maintain dependency between svn_fs_root_t and svn_fs_node_t. > * "svn_fs_node_t always reference existing node in tree (?)" > > - Within a transaction, nodes of type svn_node_none > make sense, e.g. after a deletion. Footnote: in-txn > nodes do have a number of update / invalidation issues, > which makes having separate vtables instances for > in-txn and in-rev nodes a good idea. > - I'm not too sure we should allow creating nodes for > any random path but I guess we must at least have > the option to do so b/c of symmetry. > There are two possible design approaches: 1. svn_fs_node_t is just combination of ROOT+PATH and may point to some never existent node. 2. svn_fs_node_t is always point to some existing node in tree. Yes, we may end up with dangling node in case of TXN root nodes, but it's out of scope. I decided to go with (2) at this time. > * svn_fs_open_node > > - Could we name this svn_fs_node or svn_fs_get_node? > "open" suggests that there are resources involved > which require an implicit and sometimes explicit > "close". > This is intentional name: svn_fs_open_node() actually do some work to check whether node exists. > * svn_fs_file_length > > - I was thinking of naming that svn_fs_node_length. > It could error out on non-files as it does today > or return some informative value like the representation > size. Caller logic needs to handle the "not a file" > case anyway, so it could use the new info on its own > discretion. > Makes sense, but for the branch I wanted to keep API like similar as possible for easy switching callers. > * svn_fs_dir_entries2 > > - Again, this could simply be svn_fs_sub_nodes. I kept existing semantic just to making porting to new API easier. We can discuss this improvement later. > - The result should be an array of node_t *, sorted > by their names. This is what the storage layer > delivers in FS* and it is what the repos layer > would like to scan during reporting. I think it's completely ortogonal to FS node API, so decided to move this task out of scope of the branch. Btw current update reporter implementation uses hashes very intensively. > Additional Goals > ---------------- > > * Use 2-pool paradigm. > Yes, I already using 2-pool paradigm in all revv functions. > * Where appropriate, make the API more orthogonal and > consistent (e.g. svn_fs_node_make instead of > svn_fs_node_make_file and svn_fs_node_make_dir). > I think it's of scope at least for experimental branch. Personally I'm fine with current naming and would like to avoid just flavoring changes. > > Observations in my Experiments > ------------------------------ > > * Despite calling it "node API" it actually concerned with > efficiently navigating directories. So, always represent > the node using 2 structs: the actual node plus a (usually > shared) struct containing all the context info - representing > the directory. > > * The parameters for svn_fs_node_make, svn_fs_node_copy etc. > should take pure paths to describe the target instead of > parent-node + local-name. > > * Some function API function names clash with existing ones, > i.e. need to be revved (svn_fs_node_history3, > svn_fs_node_relation2 and everything prop). > > * Txn nodes and revision nodes should use 2 different > instances of the vtable. Error handling for operations > that either apply to txn nodes or revision nodes only > can then be moved to lib_fs by checking for NULL pointers > in the vtable. > What is purpose of different vtables for txn and revision nodes? > > Proposal for an Implementation Strategy > --------------------------------------- > > * node_compat.* > - Provide a default / fallback implementation in lib_fs > for backends that don't have native node API support. > - Blueprint for backend code, i.e. ensure we can have > a clean structure > - Implement API (roughly) one function at a time. > > * Ensure semantic equivalence to old API and high test > coverage from the start > - Implement node_compat in terms of the old backend API > - Implement the old FS API in terms of the new API, > i.e. route all calls through node_compat.* > - Have #define that controls whether the old API is run > natively or gets diverted through node_compat.* > - If the backend vtable entry is NULL for an old API > function, divert it through node_compat.* I've already added '#if 0' code for that. See svn_fs_open_node(). > > * Switch API users over to the new API > - This is what makes the branch worthwhile. > - Deprecate the old API only after migrating most callers > > * Implement the node API in FSFS > - Take node_compat and only replace the vtable calls > with direct function calls. > - Add direct addressing info (noderev IDs) to the structs. I don't understand this point. Why we cannot use pointer to dag_node_t for svn_fs_node_t implemention in FSFS? > - Switch functions over to "native" code. > - NULL the old API vtable entries to enforce the new > backend logic. > > * Final review and merger into /trunk > - Report processing as in "diff --summary" is a good > performance indicator. My first goal is to switch 'svn ls -v' to new API and see how it works. -- Ivan Zhakov