I'd like to rationalize this set of internal libsvn_wc functions: svn_wc__get_pristine_contents(path) svn_wc__get_ultimate_base_contents(path)
svn_wc__get_pristine_text_status(path) svn_wc__text_base_path_to_read(path) svn_wc__ultimate_base_text_path(path) svn_wc__ultimate_base_text_path_to_read(path) svn_wc__get_ultimate_base_checksums(path) svn_wc__get_working_checksums(path) svn_wc__get_actual_props(path) svn_wc__get_pristine_props(path) They're a really rather random set of getter functions over the possible space of information getting, and they have inconsistent names. I have contributed to the mess when I added functions into the set. Another problem with them is that each one retrieves only a tiny isolated bit of information about the node, and yet requires the WC to first determine whether a versioned node exists at that path and which WC it belongs to, open the DB, convert abspath to relpath, and so on. This situation could be improved in one of two directions. Either have single-datum getters like this but have each one take a reference to an internal "versioned node" object rather than (db context, abspath): svn_wc__get_props(versioned_node_object) or have each function return a set of data about the versioned node it finds: svn_wc__get_base_node_info(path) I think the best thing to do at this point is the latter, like with the svn_wc__db_read_info except without so many parameters :-) Have a small group of functions to access the state of the base node at a given path, maybe like this: read_base_node(path) => kind, props read_base_file(path) => text-stream, checksum, size read_base_dir(path) => children, depth read_base_link(path) => target and an identical set of functions that access the whole state of the topmost pristine version[2]: read_pris_node(path) => kind, props read_pris_file(path) => text-stream, checksum, size read_pris_dir(path) => children, depth read_pris_link(path) => target and some similar functions that access the state of the current working node: read_curr_node(path) => kind, props but not read_curr_file/dir/link(), for reasons mentioned below. The important principles here are: * The function names follow a consistent pattern. The actual names are subject to bikeshedding of course [1]. * Each function returns a group of closely related information, and each output parameter is optional. When we read "a node", a single function should fetch everything the caller wants to know about the generic node. When we read "a file", a single function should fetch everything the caller wants to know about the file, and so on. The exact set of info returned by each class of function is to be decided. For example, we would need to decide whether read_*_file() should return both SHA-1 and MD-5 checksums or just the SHA-1, and the decision would depend partly on how commonly a caller wants the MD-5. * Reading the topmost pristine version is semantically identical to reading the base version, so the set of APIs is identical. (If we were to try to push this principle outside the realm of simple getter functions, we would find some semantic differences, of course.) * Reading a (base or topmost) pristine node is semantically similar to reading the current working version. However, here the practical differences are enough that we may not wish to provide the same APIs. For example, "get the checksum" would be a slow or deferred operation with the working file because it's not pre-calculated, and the disk state can be different from the metadata state, and reading the file/dir/link can be done by the caller without going through the WC API, and so on. Thoughts? - Julian [1] We've so overloaded the word "working" that I think in this case it's better to use some other word such as "actual" or "current". And we've overloaded the word "base" so that I felt compelled to invent this horrible "ultimate base" phrase to distinguish it from existing functions that used "base" to mean "the topmost pristine version". We can get rid of that "ultimate base" phrase now. [2] Just to make sure we're understanding each other, the "topmost" pristine version is the "base" version in simple cases but is different when there are scheduled copies and so on. It's what was often called "base" or "the normal base" in WC-1. ### Maybe rationalize these too. svn_wc__has_magic_property svn_wc_has_binary_prop svn_wc__internal_propget svn_wc__internal_propdiff svn_wc__internal_transmit_prop_deltas svn_wc_get_prop_diffs2 svn_wc_prop_get2 svn_wc_props_modified_p2 svn_wc_transmit_prop_deltas2