2009/8/14 Matthew Toseland <toad at amphibian.dyndns.org>:
> Sorry if I haven't kept up! Git revisions with issues, then a rather chaotic
> changelog. This is aimed at both infinity0 and platy.
>
> In testing, currently XML searches don't show progress - or perhaps they
> don't start at all. Searching for "test" (no quotes) stays at "Unstarted",
> "Operation length unknown" apparently indefinitely.
>
> First, IMHO we should have a specific version of SnakeYAML to download with a
> checksum. Would we have to host this ourselves?
>
> Does the current code require the form password to start a search? I guess
> not for purposes of inlining searches from freesites - but we could add it in
> the filter with a slightly more sophisticated form-filtering callback.
No, it's not required. The difference it currently makes (finished today) is
that if the request involves creating a new index bookmark and the form
password is not given (i.e. the search was made from outside), authorisation
is requested from the user before the bookmark is added.
Currently inline searches will work:
USK@Cx45290sdWAn7bS9-7ZsUPLfcvqcPNJjvZtt4glhyOU,fEUAxgcb1O0lDiS4UabmSAtLcfeqSPx2e2iMG2e59ko,AQACAAE/searchtest/5/
The top one should work for anyone with up-to-date Library, the bottom
only if you are running fred-staging with commit
da919c24cdcba1bc657a8932d59eddd38b9a6781 which allows '/library/'
actions through the form filter.
>
> Now on to per-commit issues. Some of these may be out of date.
>
>
> Search.java (originally cde966c7353a87bac8f84eca192c0fc99cb91b57):
>
> accesses to subsearches and allsearches need to be synchronized???
Thanks, I've synchronized operations on allsearches and made subsearches
unmodifiable after creation.
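Roughly what that ends up looking like (a sketch with assumed names, not the exact Search.java code):

import java.util.*;

// Sketch only -- field and method names here are assumptions, not the real Search.java.
class SearchSketch {
	private static final Map<String, SearchSketch> allsearches = new HashMap<String, SearchSketch>();
	private final List<SearchSketch> subsearches;

	SearchSketch(List<SearchSketch> subs) {
		// wrapped once at construction and never modified afterwards, so readers need no lock
		this.subsearches = Collections.unmodifiableList(new ArrayList<SearchSketch>(subs));
	}

	static void register(String key, SearchSketch search) {
		synchronized (allsearches) { allsearches.put(key, search); }
	}

	static SearchSketch lookup(String key) {
		synchronized (allsearches) { return allsearches.get(key); }
	}
}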
>
> 7f356226e4a0a175fcf5ccb16f36059327f352b6
>
> MapCollectionTranslator assumes the backing is what later becomes a SortedMapSet
> - a map of elements each to itself. Should it maybe be private?
>
> 570872c82603d75e1922d061c51fd77f0d8d7183
>
> CAPITALS should really only be used for constants (this commit is gone but
> other places do this).
>
> bd50ba2b048e153d34893dc0384a1b63d41ee6a8
>
> is this wise? it will be different for every SSK under the same routing key,
> as the routingkey is constructed from the pubkey hash and e(h(docname)) ...
> otoh it will work for KSKs, unlike the old code.
>
> 740498979f511e4aafd710a448dae20586336304
>
> containsValue() broken - the value will not equal the SkeletonValue.
>
> 74603b79fbce24725751cdfbfe7bcd15df704213
>
> TermPageEntry: will the title be used? It says "here for back compatibility"??
It is used.
>
> cd84583f555ea5b57daed98a97ef00d9dedb9728 (platy)
>
> LibrarianHandler:
> setting description to "not available" has localisation issues - can we use
> "" ? I suppose we could use the localised version of "not available" ...
> old bug, easy solution?
Fixed in dc2b7cb0f4bb231f41120339157d1ded92c21c3c; it uses the obvious choice: null.
>
> b18414028a2ce5eadbd9220eeca2b845aa13bb67 (platy)
>
> ResultSet:
> private HashMap<TermEntry, TermEntry> internal; - huh? why not a set?
This is only used in unite; I will do it differently. The point is that if it
contains entry A and is being united with another set that contains entry B
where A.equals(B), B could potentially have more metadata on the entry, for
instance if it came from a different index. As a Map, A can be retrieved with
internal.get(B).
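In code, the trick looks roughly like this (a sketch with a stub TermEntry; the real class carries more metadata than a relevance):

import java.util.*;

// Sketch of the unite trick -- TermEntry here is a stub, not the real plugins.Library class.
class UniteSketch {
	static class TermEntry {
		final String subject;
		final float relevance;
		TermEntry(String subject, float relevance) { this.subject = subject; this.relevance = relevance; }
		@Override public boolean equals(Object o) {
			return o instanceof TermEntry && ((TermEntry) o).subject.equals(subject);
		}
		@Override public int hashCode() { return subject.hashCode(); }
	}

	private final Map<TermEntry, TermEntry> internal = new HashMap<TermEntry, TermEntry>();

	void unite(Collection<TermEntry> other) {
		for (TermEntry b : other) {
			TermEntry a = internal.get(b);	// finds A even though we only hold B, because A.equals(B)
			TermEntry richer = (a == null || b.relevance > a.relevance) ? b : a;
			internal.put(b, richer);	// keep whichever entry carries more information
		}
	}
}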
>
> intersect: the internal hashset is unnecessary, just count them. And it will
> break if any of them have an equal subject - do we check for this?
Good point. Merging will only need to occur between different indexes, so it
will only ever be a union; I don't need to worry about merging in any other
operation.
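For reference, one way to write intersect by counting rather than via the internal HashSet (a sketch, not the actual ResultSet code, and it assumes equal entries really are the same subject - the corner case flagged above):

import java.util.*;

// Sketch: intersect by counting how many of the input sets contain each entry.
class IntersectSketch {
	static <E> Set<E> intersect(List<Set<E>> sets) {
		Map<E, Integer> counts = new HashMap<E, Integer>();
		for (Set<E> s : sets) {
			for (E e : s) {
				Integer c = counts.get(e);
				counts.put(e, c == null ? 1 : c + 1);
			}
		}
		Set<E> result = new HashSet<E>();
		for (Map.Entry<E, Integer> entry : counts.entrySet()) {
			if (entry.getValue() == sets.size()) result.add(entry.getKey());
		}
		return result;
	}
}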
>
> 99080a11562b4fee3eeed617bebd8dddaa0d080a
>
> ResultSet again:
> phrase: this could also be written with a counter, keeping the TermEntry
> from the last iteration for the term positions, avoiding a list.
Again, yes, this can be changed.
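Something along these lines, keeping only the previous term's reachable positions around (a sketch; it assumes each term's positions on the page are available as a Set<Integer>):

import java.util.*;

// Sketch of phrase matching that keeps just the positions reachable so far,
// rather than accumulating a list of candidate runs.
class PhraseSketch {
	static boolean containsPhrase(List<Set<Integer>> positionsPerTerm) {
		Set<Integer> reachable = null;	// positions where the phrase-so-far ends
		for (Set<Integer> current : positionsPerTerm) {
			if (reachable == null) {
				reachable = new HashSet<Integer>(current);
			} else {
				Set<Integer> next = new HashSet<Integer>();
				for (Integer p : reachable) {
					if (current.contains(p + 1)) next.add(p + 1);
				}
				reachable = next;
			}
			if (reachable.isEmpty()) return false;
		}
		return reachable != null && !reachable.isEmpty();
	}
}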
>
> bcb80c457089f2114c0b31e85127c0b379786533
>
> Breadth-first parallel inflate: this will likely make a lot of threads! I
> think this may have been made pooled later?
>
> Also you need to use NativeThreads to set the priority class, which means you
> probably need to use the node's executor (to increase priority)... we can
> deal with this later if need be.
>
> 4000b1b88d14c1eda47e18fc45283c3d9a561a52
>
> ThreadPoolExecutor may be a problem because of NativeThread, not an
> immediately urgent issue though. We could use a ThreadFactory to solve this.
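If we go the ThreadFactory route, the shape is roughly this (a sketch; where it creates a plain Thread, the real version would construct a freenet.support.io.NativeThread, or hand the work to the node's executor, so the priority class actually takes effect - I haven't checked the exact constructor):

import java.util.concurrent.*;

// Sketch: give the pool a ThreadFactory so thread creation is under our control.
class PooledSerialiserSketch {
	private final ExecutorService executor = Executors.newFixedThreadPool(8, new ThreadFactory() {
		public Thread newThread(Runnable r) {
			// Stand-in: swap this for a NativeThread (or delegate to the node's executor)
			// to get a real priority class rather than a plain Java priority.
			Thread t = new Thread(r, "Library-serialiser");
			t.setDaemon(true);
			t.setPriority(Thread.MIN_PRIORITY);
			return t;
		}
	});

	void submit(Runnable task) { executor.execute(task); }
}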
>
> 9a1855c7ba183a38ccd19f273a4fdb32edff7a83
>
> Can we be certain that a freenet index can never cause us to search a file
> index?
>
> 5394f13e60ca206370aeba722db261a897436fb2
>
> FreenetArchiver.pushLive():
> note that the bucket will in some cases be limited to expected_bytes, so you
> should pass in a big number if you are not sure!
>
> b880788bbe1f6966c0d887d369bfc840d602c39a
>
> Library: singletons: seems to make sense, what about a high level regression
> test using multiple nodes and a Library plugin on each? Maybe far fetched...
> I have heard you can do this with classloader tricks, maybe that's better...
> Not a problem if these objects are reusable and don't have any significant
> state...?
>
> b5a597cfd57dbbb1eebf1eefebc5fac4cafc8cbf
>
> ProtoIndexSerialiser:
> Can we be sure here that we never leave the insert ID in the metadata and
> have it get serialised? It should be external, no? Or at least, it should be
> turned from an insert URI into a request URI?
>
> 83491ddd786d84e080a467e989384916afd56416
>
> Library.getType* etc:
> Getting the type from the filename means that if the user (or the bookmark)
> passes the *full* URI with the /index.xml or /index.yml, we only need to
> fetch it once. We should try to make sure this happens as it's far more
> efficient! First step would be to change the bookmark URLs...
Yeah doing this.
>
> caefcb5034a1a1aa4595391bc4203797ac47941e
>
> Interesting - when you do a search from a freesite it can optionally add the
> bookmark, so you confirm it. How about assuming that you *might* want to
> bookmark it and just having an explanation and a button on every search using
> a nonstandard index? Also, it should be a button, a POST request, not a link.
Yeah that's a good idea.
>
> 8702a6745fee793fe77af537b7c347adadcd2fc0
>
> FreenetArchiver:
> Fetch can go over 100% on newly inserted data??? If so this is a bug, but it
> is prudent to detect and handle it. Certainly it can on old data.
>
> 8702a6745fee793fe77af537b7c347adadcd2fc0
>
> SimpleProgress:
> +	@Override public ProgressParts getParts() throws TaskAbortException {
> +		// updates are made such that the ProgressParts contract isn't broken even
> +		// in mid-update so we don't need to synchronize here.
> +		if (abort != null) { throw abort; }
> +		return new ProgressParts(pdone, known, known, estimate);
> 	}
>
> Are you sure? Write order is not guaranteed if we don't synchronize.
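The cheap fix is probably to take the lock on both sides (a sketch; field names follow the quoted diff, and the mutators are stand-ins for whatever methods actually update the counters):

// Sketch: synchronize reads against updates so a reader never sees a half-applied update.
class SimpleProgressSketch {
	private int pdone, known, estimate;

	synchronized void addPartDone() {
		pdone++;
	}

	synchronized void addPartKnown(int parts, boolean finalizedTotal) {
		known += parts;
		if (finalizedTotal) estimate = known;
	}

	synchronized int[] partsSnapshot() {
		// the real class would build a ProgressParts here; an int[] keeps the sketch self-contained
		return new int[] { pdone, known, known, estimate };
	}
}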
>
> ed0e629e2ba514ce23d33a286d380851e0eef8f5 (platy)
>
> 	/**
> 	 * Construct a LibrarianHandler to look for many terms
> 	 * @param requests the requests wanting to be resolved by this LibrarianHandler, results are written back to them
> 	 * @param wantPositions should the positions be recorded? this has a performance impact but allows for phrase searching
> 	 * @throws java.lang.Exception
> 	 */
> 	public LibrarianHandler(List<FindRequest> requests) {
> 		this.requests = new ArrayList(requests);
> 		for (FindRequest r : requests){
> 			r.setResult(new HashSet<TermPageEntry>());
> -//			wantPositions = wantPositions || r.shouldGetPositions();
> 		}
> -		wantPositions = true;
> 	}
>
> Are we always getting the positions now for the relevance? We don't have to
> keep them in memory after we have calculated relevance...
Yes, wantPositions as an option has been removed. I realised the reason it
slowed indexing considerably was that it was parsing all the positions in the
file and then ignoring them; now it only parses the positions of the terms
being looked for, and its impact doesn't seem significant. The positions are
still needed for phrase searching, which has actually been working quite well
over the last week (fingers crossed).
>
> cafcea9db4c0007e9153eaa93bf4cebca450a07d
>
> Library:
> +
> +	public Object getKeyTypeFromString(String indexuri) {
> +		try {
> +			return KeyExplorerUtils.sanitizeURI(new ArrayList<String>(), indexuri);
> +		} catch (MalformedURLException e) {
> +			File file = new File(indexuri);
> +			if (file.canRead()) {
> +				return file;
> +			} else {
> +				throw new UnsupportedOperationException("Could not recognise index type from string: " + indexuri);
> +			}
> +		}
> 	}
>
>
> Confusing function name.
>
> cd058a3293e037da9428091ded0b09aa712c5e1e
>
> ProgressParts:
> 	/**
> 	** Returns the parts done as a fraction of the estimated total parts.
> 	** Note: this will return negative (ie. invalid) if there is no estimate.
> 	*/
> 	final public float getEstimatedFractionDone() {
> -		return totalest == 0? 0.0f: (float)done / totalest;
> +		return totalest == 0? 0.0f: totalest == TOTAL_FINALIZED? 1.0f: (float)done / totalest;
> 	}
>
> Huh???? These don't make sense - finalized is surely not the same as done?
>
> 	public static ProgressParts getSubParts(Iterable<? extends Progress> subprogress, boolean try_to_be_smart) throws TaskAbortException {
> 		int d = 0, s = 0, k = 0, t = 0;
> 		int num = 0, unknown = 0;
> +		boolean totalfinalized = true;
> 		for (Progress p: subprogress) {
> 			++num;
> 			if (p == null) { ++unknown; continue; }
> 			ProgressParts parts = p.getParts();
> 			d += parts.done;
> 			s += parts.started;
> 			k += parts.known;
> -			if (parts.totalest < 0) { ++unknown; }
> -			else { t += parts.totalest; }
> +			switch (parts.totalest) {
> +			case ESTIMATE_UNKNOWN:
> +				++unknown;
> +				totalfinalized = false;
> +				break;
> +			case TOTAL_FINALIZED:
> +				t += parts.totalest;
>
> This can't possibly be right. += -1 ??
>
> +				break;
> +			default:
> +				totalfinalized = false;
> +				t += parts.totalest;
> +			}
> 		}
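For what it's worth, if TOTAL_FINALIZED is a negative sentinel meaning "the total is final and equal to known", then presumably that branch wants the real count rather than the sentinel - a sketch with assumed sentinel values:

// Sketch only; the sentinel values and the intended semantics are assumptions to be confirmed.
class SubPartsSketch {
	static final int ESTIMATE_UNKNOWN = -2;	// assumed values, for illustration
	static final int TOTAL_FINALIZED = -1;

	static int addToTotal(int t, int totalest, int known) {
		switch (totalest) {
		case ESTIMATE_UNKNOWN:
			return t;		// caller counts this subprogress as unknown
		case TOTAL_FINALIZED:
			return t + known;	// add the real count, not the -1 sentinel
		default:
			return t + totalest;
		}
	}
}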
>
> 66a26efb893d9491f72de59e5f3730933ebfd12c
>
> If it completes, check the MIME type on the FetchResult before throwing.
>
> Also you should consider accepting an index on the basis of its default
> filename fetching successfully even if it is inserted as
> application/octet-stream? I dunno, you are expecting the new spider to insert
> indexes itself? I guess it's more important that we detect the filename if it
> is specified, and encourage users to specify it.
>
> Also, SnoopMetadata allows you to do equivalent things (directory listing) to
> KeyExplorer without depending on a plugin; obviously this isn't a priority.
>
> d1e0f212e51b2cbeb45ffcf087e44b9fbd569952
>
> FindRequest:
>
> 			}else
> -				throw new UnsupportedOperationException("Not supported yet. : "+ce.getClass().getName());
> +				throw new UnsupportedOperationException("Not supported yet. This ClientEvent hasn't been identified as being needed yet : "+ce.getClass().getName());	// Haven't come across any other ClientEvents arriving here yet
>
> Eeek, do you really have to throw? We are likely to introduce new events, you
> should just ignore them, or log them.
This was mainly to be able to identify whether any others would come up.
Ignoring or logging them would, I suppose, mean progress wouldn't work, but
searches would still complete. Switched to logging and ignoring in
e0c5c16e900390fb7065a7b7ee5f67826dd220b6.
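The shape of the change is roughly (a sketch, not the exact diff in that commit):

import freenet.support.Logger;

// Sketch: log and ignore ClientEvents we don't recognise instead of throwing.
// Progress reporting may miss such an event, but the search itself still completes.
class EventHandlingSketch {
	void onUnhandledEvent(Object ce) {
		Logger.normal(this, "Ignoring unhandled ClientEvent: " + ce.getClass().getName());
	}
}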
>
> d1e0f212e51b2cbeb45ffcf087e44b9fbd569952
>
> XMLIndex:
>
> 	/** Called on failed/canceled fetch */
> 	public void onFailure(FetchException e, ClientGetter state, ObjectContainer container){
> +		if(e.newURI!=null){
> +			try {
> +				indexuri = e.newURI.setMetaString(null).toString();
> +				startFetch();
> +				return;
> +			} catch (FetchException ex) {
> +				Logger.error(this, "what?", ex);
> +			} catch (MalformedURLException ex) {
> +				Logger.error(this, "what?", ex);
> +			}
> +		}
> 		fetchStatus = FetchStatus.FAILED;
> +		Logger.error(this, "Fetch failed on "+toString() + " -- state = "+state, e);
> 	}
>
>
> Maybe there should be a counter? Infinite loop *should* be impossible, but
> bugs might do bad things...
Put a limit of 20 in 34d086d45a515222e243d7d35bd2aebd555ff3d9; I don't know
whether that's a good number.
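The counter amounts to something like this (a sketch; types and names are simplified, the real change is in that commit):

// Sketch of the redirect cap: follow e.newURI at most MAX_REDIRECTS times before
// giving up, so a redirect loop caused by a bug can't spin forever.
class RedirectCapSketch {
	private static final int MAX_REDIRECTS = 20;
	private int redirectsFollowed = 0;
	private String indexuri;

	void onRedirect(String newUri) {
		if (++redirectsFollowed > MAX_REDIRECTS) {
			fail("Too many redirects fetching index: " + indexuri);
			return;
		}
		indexuri = newUri;
		startFetch();
	}

	void startFetch() { /* kick off the fetch again */ }
	void fail(String why) { /* mark FetchStatus.FAILED and log */ }
}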
>
> 2884d18dd7d2472677a363caf38989e2d4c69d52
>
> MainPage:
> Do we actually forget the index after the user clicks no? I still think it
> might be easier to just have a button shown when it's not a standard index...
Yeah, I think this is a better idea.
>
> ec328e158b438bd15894c7380efa7b95b32cb714 (platy)
>
> is it necessary to use java/xml logging? The 200ms sleep is messy, should add
> a TODO; not entirely sure what the problem is...
That was a mistake; Netbeans adds that by default in catch {} blocks. The wait
was for something, but I don't think it's a problem; both removed in
22a5278c72ea3dc92c802ea5de7224114250d512.
>
> Changelog (rather schizophrenic; you don't have to read it, I was reviewing
> the code anyway, but it has gone in the tag so you might want to look at it):
>
> Library:
> - Base on XMLLibrarian up to 3ed7453c0f70a61e21839bb5c0b4d11aab6c5c69 *AND*
> Interdex at rev 0860091a26323360ff1a77c9d48077fc17b0d3e2. (minor build.xml
> conflicts).
> - Rebrand to Library. This will have UI and FCP for both old and new
> (interdex) indexes. Move from src/plugins/XMLLibrarian/ to
> src/plugins/Library/ and update build.xml (in some places using
> $(packagebase)). Move some packages around, e.g. Interdex into
> index/Interdex, some stuff into util or search, xmlindex/ into index/xml.
> - findTerm(), getIndex(), will presumably be exposed to FCP.
> - Delete some unused code.
> - Experiments with RPC-over-FCP code, all gone now.
> - Add GPL headers.
> - Javadocs: packages.
> - Javadocs.
> - Comments.
> - Move plugin stuff to Main class. Library for index stuff.
> - Logging.
> - Index interface.
> - WritableIndex interface.
> - Move some older Index stuff into XMLIndex, make it implement Index.
> - Generics.
> - Factor out AbstractRequest, base of FindRequest.
> - Simplify build-dep instructions.
> - Move Interdex stuff to index/, or util/ or serial/, delete duplicated or
> irrelevant files, move many files around.
> - Indenting.
> - Deprecate PrefixTreeMap.
> - Try to merge the progress (Request) interfaces from both plugins. This
> affects a remarkable amount of code, involving much refactoring. delete the
> ERROR completion code, will throw in getResult().
> - TODO file editing.
> - bookmarks: bookmark:name(uri) bookmarks an index.
> - Library.addBookmark(). Add one by searching ~name(url).
> - catch RuntimeException's and display in the web UI.
> - ProtoIndex.getTermEntries.getResult() returns an immutable collection.
> - change serialised representation of btree root.
> - Style changes, centralising style.
> - better ui for searching multiple indexes, showing which you are using and
> showing checkboxes for ones you could use.
> - build.xml: fetch SnakeYAML and verify checksum.
> - *Packer -> *SplitPacker. Deprecate *SplitPacker.
> - New Packer: abstract class, a few methods must be filled in, but provides
> configurable packing algorithm. Best fit decreasing with some options. Can
> fit stuff into bins even if they already have data in them. Can repack. Pulls
> data in to repack where necessary. Old Packer would split items, this greatly
> complicates matters; new Packer does not.
> - Toadlet interface for search. Searches can only be started from POSTs.
> POSTs require the formPassword, thus neutralising any threat from malicious
> external sites. Lots of bug chasing...
> - BTreeSet. SkeletonBTreeSet.
> - Interdex: error handling work.
> - Interdex: Where a BTree has the data internally, there are no external
> dependencies (as opposed to e.g. btree roots packed into a series of
> buckets), which means that serialisation is tricky, because we expect to push
> the dependencies as ghosts, at which point isBare() will be true and we can
> push the node itself. If there is only internal data, copy it to the metadata
> so we can be isBare() and then copy it when we serialise the node. In future
> there will be a more efficient solution.
> - Interdex: error checking.
> - Interdex: put TokenEntry's within a keyword into a BTreeSet, not a TreeSet.
> Pack the roots rather than the whole sets.
> - Interdex: BinInfo, no longer an Object[2].
> - Interdex: many corner cases.
> - Interdex: abstract out SortedMapSet from BTreeSet, implements subSet etc,
> cleans up code.
> - Interdex: Fix BTreeMap iterator hasNext().
> - Interdex: Extend unit tests.
> - Interdex: Scale knows its Packer.
> - Interdex: delete old code.
> - Interdex: progress tracking support in Packer. Reinstate progress tracking
> code, tweak job names for logging purposes.
> - Interdex: fix some inflate bugs.
> - Interdex: fix packer bugs related to completion.
> - Interdex: support accessing by index, cache size of tree below each node
> (_size or treeSize()), rename size() to nodeSize().
> - Interdex: Get routing key from the NodeKey for URIKeys.
> - Interdex: implement utab (uri -> uri entries). Implement translators and
> packers for utab and ttab, BIndexSerialiser should now be functional.
> - Interdex: make packer more generic/configurable, Inventory contains most
> functions of Scale and proxies actual weighing to Scale
> - Interdex: rename MAGIC to serialVersionUID.
> - Interdex: proper hashCode, equals, etc for Token*Entry.
> - Interdex: rename TokenEntry -> TermEntry, TokenURIEntry -> TermPageEntry.
> - Interdex: set dumper options so dumps are readable for debugging purposes.
> - Interdex: custom yaml format for BinInfo.
> - Interdex: new SkeletonTreeMap, uses a wrapper for values with loaded status
> and metadata, cleaner code, better iterators etc, in particular support for
> iterating over unloaded data and it throws when you try to get the values,
> partial support for subMap/headMap/tailMap, can load data while iterating
> (will throw, caller can then load on the map, then go back to iterator),
> fixes some bugs.
> - Interdex: generics.
> - Remove progress trackables in ProtoIndex, replace with a stack of
> trackables, show progress while getting the root
> - SkeletonBTreeMap: deflate/inflate on right hand side not left hand side.
> - Interdex: Lots of minor refactoring.
> - Don't depend on FredPluginAPI. It is unlikely to work any time soon, we
> will use FCP instead.
> - Eliminate URIWrapper in favour of TermPageEntry, which now includes title.
> Remove code related to uri being a string on URIWrapper, since it's a
> FreenetURI on TermPageEntry.
> - Sanity checking.
> - Interdex: refactor makeProgressIterable etc.
> - CompositeRequest interface for requests with subrequests.
> - TermEntry.equalsTarget(), rel.combine().
> - Rewrite/refactor ResultSet to handle TermEntry's better, move combination
> code out of Search. No phrase search yet, was turned off anyway. Is
> immutable, operations happen in constructor. Split DIFFERENTINDEXES mode out
> of UNION, currently handled the same, add SINGLE mode. Intersection may be
> slightly more efficient.
> - Refactor web interface for TermEntry's and CompositeRequest.
> - Delete some old code.
> - Combine relevances in Term*Entry, not in TermEntry, and make TermEntry
> immutable, create a new one in combine().
> - More refactoring, optimising merging modes in ResultSet.
> - Working phrase search.
> - Re-enable positions support in XML index code. Only use it if we need to.
> - XML index: Calculate a simple relevance score based on word count and file
> count.
> - Fix comparisons (left/right/middle, different null handling in each) in
> btreemap, order Nodes for breadth-first search.
> - Breadth-first search for inflation. ****** likely to use a lot of threads...
> - AbstractRequest: record time taken.
> - Interdex: TaskAbortException thrown in some places, setAbort -> abort (which
> throws it after setting it).
> - Interdex: Improve progress text.
> - Use a ThreadPoolExecutor in ParallelSerialiser. TaskHandler handles tasks
> on a thread (in serial mode), shifts them to the executor if they are already
> running, keeps progress data up to date; in parallel mode, run directly on
> executor (caller must create progress).
> - Interdex: ProgressTracker now uses a WeakHashMap of tasks (not data).
> get*Progress just gets the progress; add*Progress throws a
> TaskInProgressException if the task is already running, otherwise adds it.
> - Interdex: Parallel inflation on BTreeMap.
> - Interdex: Various refactoring.
> - Remove wanna19 from default indexes.
> - Remove FredPluginHTTP interface.
> - Refactor web interface, new class ResultNodeGenerator, delete WebUI.
> ResultNodeGenerator splits up results, will properly handle TermTermEntry's
> and TermIndexEntry's, and less code duplication than old code.
> - Split YamlArchiver into FileArchiver and YamlReaderWriter, use
> ObjectStreamReader/Writer (simpler than ObjectInput/Output) interfaces.
> - Freenet archiver. Initially write-only.
> - Web interface access to Tester, on-freenet tests.
> - Tester creates a FreenetArchiver, puts a load of test data, and pushes the
> data, showing progress; progress test tests a map of ints to ints, push test
> tests a smaller but more complex map of string to bunch of fake CHKs.
> - SimpleProgress changes: track parts done, parts known, estimated parts,
> last status string, addTotal -> addPartKnown. if estimated parts == total,
> then isTotalFinal() is true. enteredSerialiser/exitingSerialiser must be
> called.
> - Imports.
> - Option to group USKs. Sort by relevance either way.
> - Show title (tooltip) of links as the url for the direct link and the [ USK
> ].
> - Show relevance in tooltip for the main link.
> - Put search toadlet on the Browse Freenet menu.
> - Refetch on demand after a failure.
> - Make sure the search is removed after a TaskAbortException.
> - Factor out combine() from TermEntry into a method in ResultSet.
> - Remove WebUI references, reinstate FredPluginHTTP. **** merge fix hack
> - CSS fixes.
> - Detect when trying to do a phrase search on an index with no position info.
> - Only show the full error trace with a big box if the error is other than an
> InvalidSearchException.
> - Automatically determine index types by partially fetching the URI.
> Initially implemented using KeyExplorer, detect via either mime type or
> existence of the base filename for the index type. This is dubious, SEE BELOW.
> - UI for adding an index.
> - POST password check works a different way, can do a search from a POST but
> not add an index.
> - Use L10n (even though it doesn't work).
> - Singleton-ise Library.
> - Split BIndexSerialiser into ProtoIndexComponentSerialiser and
> ProtoIndexSerialiser.
> - Static map of one ProtoIndexComponentSerialiser per file format. Make many
> members non-static. ProtoIndexSerialiser handles the overall ProtoIndex,
> ProtoIndexComponentSerialiser helps to serialise the individual maps.
> - Add a testing yaml index to the test for the auto-detect code.
> - Pass the insert or fetch URI in to a push task / pull task via the
> metadata, just as we pass in the filenames.
> - Remove most DataFormatException constructors to avoid misuse.
> - Static singleton ProtoIndexSerialiser in Library.
> - Pass a FreenetURI into getIndexType, not a String.
> - Wire in support for yaml indexes in Library.
> - Yaml indexes support for pulling from Freenet
> - Catch exceptions from getIndexType in parent method getIndex.
> - Server-side state based authorisation/confirmation scheme for adding
> bookmarks (this makes sense because we can add a bookmark by changing the
> name of the index).
> - Progress: ProgressParts object contains done, total, known, can presumably
> be got atomically so avoids races. Remove RequestState from AbstractRequest.
> Lots of related work. Request extends ChainedProgress, remove unneeded
> methods. Eliminate AtomicProgress. New class BaseCompositeProgress, base
> composite progress tracker for parallel tasks. ChainedProgress: series of
> tasks.
> - BaseCompositeProgress.join() can see nulls because it is fed an iterable
> over the tasks, which then looks up the progresses, which can be null before
> the progress has started. After it has finished it will however still be
> present as the iterator has a strong reference to the progress, and the
> weakhashmap is weak on the keys - the tasks. Updates to SimpleProgress, lots
> of code.
> - Various minor refactoring related to Progress, use the Progress interfaces
> where possible, track what is needed for them in e.g. FindRequest.
> - More work on progress, refactoring.
> - System.currentTimeMillis, not new Date().getTime().
> - Always get the termpositions if only for relevance.
> - Avoid NPE displaying search page, show failures better.
> - Better auto-detect of URI vs filename for indexes.
> - Index type test: test the SSK URIs, not the USKs.
> - Refactor: ProtoIndexSerialiser.forIndex() not new ProtoIndexSerialiser(),
> have a static map of serialisers by class, change
> FreenetArchiver/FileArchiver parameters, minor constants refactoring, minor
> refactoring in FileArchiver.
> - Bugfix in ParallelSerialiser.isActive(), no idea of the consequences, probably
> none. Bugfix in SkeletonNode.rev().
> - Handle over 100% in splitfile progress better in Interdex index fetching.
> - Another fix to FreenetArchiver progress from splitfile code (derive
> percentage from minSuccessful not total).
> - getTermEntriesHandler: combine metas and trackers into a LinkedHashMap,
> track current tracker and meta.
> - Refactor Progress estimate type. UNKNOWN or FINALIZED (= known, but only
> this enables finalizedTotal() and thus completion), or a specific value.
> Finalized total must be set before the request completes. This is often done
> automatically by SimpleProgress.addPartKnown(,true). Set total finalised in
> other places.
> - Rename rootSize -> sizeRoot, add nodeMin, entMax, heightEstimate to BTree*.
> - Progress with number of btree levels lookup estimate on
> getTermEntriesHandler.
> - Less efficient index type detection not using KeyExplorer (it was having to
> download a non-staging binary, IMHO this was bad). Based on MIME type,
> fetches [key]/, [key]/index.xml, [key]/index.yml.
> - Refactor FindRequest (XML index progress), use an enum not a string for
> stages, keep an internal Progress impl up to date for the current stage, use
> the index of the enum for the current stage to set the stage for
> ProgressParts purposes. Use the enums when setting progress externally.
> - Handle failure with a new URI - USK@ updates, not enough path components
> etc - for XML indexes.
> - Logging.
> - Improve progress bars a little, fix them for new Progress.
> - More CSS changes for progress bar.
> - Minor fixes in web interface, better error catching/handling/displaying.
> - Refactor web interface, now there is only MainPage.
> - Beginnings of Interdex distributed search implementation. Interdex,
> InterdexQuery, IndexQuery. Document algorithms a bit more in comments.
> - Make fields private or final as needed.
> - AbstractRequest: implement join().
> - Web interface bugfix adding bookmarks.
> - Style fixes, use style in POST handler.
> - Search progress: don't return as composites, stick it all together, unless
> it is different indexes being fetched. Probably looks better on search
> progress. Implement status functions.
> - Fix some NPEs, simpler progress bar.
> - Synchronization in parsing.
> - Temporary hack, put toadlet on /plugins/plugin.Library.FreesiteSearch - we
> need to let /library/ through the filter.
>
>
> platy
> infinity0
> sdiz
>