Re: [freenet-dev] RFC: WOT event-notifications FCP API

xor Tue, 29 Oct 2013 20:48:26 -0700

[re-arranged part of the quotes for better structure of the reply, I don't 
want to answer the same question in three different places]

On Monday, October 28, 2013 12:29:51 AM Steve Dougherty wrote:
> >> Short summary of what event-notifications provides: Before
> >> event-notifications, a WOT client would be implemented by periodically
> >> polling WOT for identities / trusts / scores. This caused a lot of
> >> database queries and made everything slow and laggy.
> >> With event-notifications, clients of WOT subscribe to types of objects
> >> which they are interested in: Identities / Trusts / Scores / and in the
> >> future introduction puzzles. WOT will then send all known objects at
> >> time of subscription once, and afterwards keep the client up to date by
> >> only sending single objects as they are changed (= as an event happens).
> >> This causes database
> >> and network traffic only to happen when there is an actual need for it.
> 
> This seems concerning to me. It is a very odd client application that
> wants to pull in WoT's entire view of the web of trust. I assume this
> means the initial state of all things matching a query, and
> notifications are sent whenever the results change?

Yes, when something changes, you only get notified about the object which has 
changed, not about objects of the same type which were not changed.
For example if you subscribe to the set of all identities, a change-
notification will only contain the single identity which has changed.

> Having the ability to subscribe to everything or an entire category of
> things - though very odd - I am okay with, if only because it's
> convenient for debugging. A condition of "*" or something would be fine.

> [...]

> >> Reply:
> >> The reply consists of two separate FCP messages:
> >> The first message is "Message" = "Identities" | "Trusts" | "Scores".
> >> It contains the full dataset of the type you have subscribed to.
> >> By storing this dataset, your client is completely synchronized with WOT.
> >> Upon changes of anything, WOT will only have to send
> >> the single {@link Identity}/{@link Trust}/{@link Score} object which has
> >> changed for your client to be fully synchronized again.
> 
> WoT is synchronizing entire categories of stuff? Always sending
> everything is insanity. I really think WoT should allow much more
> selectiveness. The concept of keeping state up to date with an initial
> synchronization and subsequent notifications is great, but this
> returning too much information in a way that requires a lot of
> additional processing to be useful. Requiring such processing invites
> another wave of duplicate implementations of WoT-wrangling code in clients.
> 
> Were I writing a client I think I'd be interested in WoT sending me
> updates on things like "identities with a Sone context that have a score
> higher than 0 from the perspective of local identity XYZ."

The assumption that sending the full dataset at the start of a subscription is 
bad will become the #1 FAQ entry about event-notifications once I am able to 
edit the wiki :) - I've already had this discussion for an hour on IRC with 
someone (I think digger3), I hope that I'm able to explain it better now.
So please read carefully:

It is NOT only for debugging purposes, and it is not insane. Forget about WOT 
for a moment and think of a database as a set of objects, entities, files, 
things, whatever. 
If you have a set of things, and you want to keep synchronized copies of that 
set among different machines, you CAN keep it up-to-date by only sending single 
elements of the set as they change - BUT you can only do that if you knew 
what the initial state was, i.e. the FULL initial set. This is not an 
implementation detail. It's logically impossible to build a full view of 
something if you ONLY get notified about changes. You HAVE TO know the initial 
state for the change-only notifications to be able to provide you with a full 
view of everything.
Let's look at this from a different example which you have worked with at 
length: Version control, for example Git.
If you update a git repository by "git pull", you WILL only download changes, 
the "diffs" of each commit. But that only works if you did a "git clone" 
before. And what does git clone do? It copies the WHOLE repository. 
Would it be possible to compile the code WITHOUT a clone, just by having the 
diffs? No way. Only having random pieces of modifications of a bunch of source 
code produces random garbage which is not equal to the full source code.
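
The clone-then-pull argument can be sketched in a few lines of Java. This is 
only an illustration: MirrorSketch and its methods are made-up names and are 
not part of WOT. Objects are reduced to strings keyed by an ID, and a null 
value is assumed to mean "deleted".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical client-side mirror of a set of objects, keyed by ID. */
class MirrorSketch {
    private final Map<String, String> objects = new HashMap<>();
    private boolean hasInitialState = false;

    /** Initial synchronization: the FULL set replaces whatever we had. */
    void synchronize(Map<String, String> fullSet) {
        objects.clear();
        objects.putAll(fullSet);
        hasInitialState = true;
    }

    /** Change notification: only the single changed object is sent. */
    void applyChange(String id, String newValue) {
        if (!hasInitialState)
            throw new IllegalStateException("Changes alone cannot build a full view");
        if (newValue == null) objects.remove(id); // assumption: null = deleted
        else objects.put(id, newValue);
    }

    Map<String, String> view() { return new HashMap<>(objects); }

    public static void main(String[] args) {
        Map<String, String> server = new HashMap<>();
        server.put("A", "nick=alice");
        server.put("B", "nick=bob");

        MirrorSketch mirror = new MirrorSketch();
        mirror.synchronize(server);             // the "git clone"
        server.put("B", "nick=robert");
        mirror.applyChange("B", "nick=robert"); // the "git pull"
        System.out.println(mirror.view().equals(server)); // prints "true"
    }
}
```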

What CAN be optimized about the process of initial synchronization:
1) Re-defining what "the WHOLE dataset" means by introducing filters on 
subscriptions (not implemented yet) as you suggested. The filter will cut down 
the size of the dataset, while still having it be a logical unit which is 
atomically meaningful.
In version control terms this might mean that you only clone 
certain branches of a repository. A branch will compile cleanly even though it 
is only part of a project - because it is a meaningful logical unit.
WOT example: A publish-only client like FlogHelper only needs to know about 
the user's own identities. A typical WOT instance will only have a handful of 
user-owned identities, which is completely acceptable to be transferred as a 
whole. So the filter would be "Only OwnIdentity objects".
Another WOT example: Clients which plan to fetch data from remote identities 
will only want to know about the identities which WOT actually considers as 
trustworthy based upon its core goal of computing a rating ("score"). So there 
will be a filter for receiving only identities with a positive score.
2) In the near future, I should implement "persistent" subscriptions. 
Persistent in our case means that you can restart WOT or the client which 
filed the subscription WITHOUT re-creating a fresh subscription, and therefore 
WITHOUT a new synchronization of the database content. 
While WOT is running but the client is not connected, WOT will just continue 
storing events for it, up to a certain timeout of let's say 1 hour, after 
which the subscription will be killed. So when the client reconnects, 
synchronization of the initial dataset does not have to happen again. WOT can 
instead just send out all events which happened while the client was offline.
This should be sufficient to guarantee that in normal operation of a node, 
there will only be ONE synchronization of the database per plugin - when 
loading it for the first time. Downtime of plugins typically only happens when 
the node is restarted or a plugin is updated, and it shouldn't be more than an 
hour I guess. The bugtracker entry for that is here: 
https://bugs.freenetproject.org/view.php?id=6113
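
A rough sketch of how such a persistent subscription might behave while the 
client is offline. Persistent subscriptions are not implemented, so everything 
here is hypothetical: the class, the method names and the timeout constant are 
invented for illustration, and timestamps are passed in explicitly to keep the 
example deterministic.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical server-side state of one persistent subscription. */
class PersistentSubscriptionSketch {
    /** Assumed timeout; the text above only says "let's say 1 hour". */
    static final long TIMEOUT_MILLIS = 60L * 60L * 1000L;

    private final List<String> storedEvents = new ArrayList<>();
    private long disconnectedAt = -1; // -1 means: client is connected
    private boolean killed = false;

    void clientDisconnected(long now) { disconnectedAt = now; }

    private void killIfTimedOut(long now) {
        if (disconnectedAt >= 0 && now - disconnectedAt > TIMEOUT_MILLIS) {
            killed = true;
            storedEvents.clear(); // no point in storing events for a dead client
        }
    }

    /** Called for every event which happens while the client is offline. */
    void storeEvent(String event, long now) {
        killIfTimedOut(now);
        if (!killed) storedEvents.add(event);
    }

    /**
     * Returns the missed events in order, or null if the subscription was
     * killed and a fresh subscription with full synchronization is needed.
     */
    List<String> clientReconnected(long now) {
        killIfTimedOut(now);
        if (killed) return null;
        disconnectedAt = -1;
        List<String> missed = new ArrayList<>(storedEvents);
        storedEvents.clear();
        return missed;
    }
}
```

A client which comes back within the timeout only receives the stored events. 
One which comes back too late gets null here and has to subscribe again.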

One more thing: You could argue that introducing "filters" on the initial 
dataset is semantically identical to just allowing the client to request WOT 
to NOT send any initial dataset: You could set a filter which filters by date 
and filters out EVERYTHING before the date of subscription.
And thereby, you could ask me to just implement a flag "No initial 
synchronization please" to allow the client to request no synchronization. But 
I do think that it is wise to make synchronization mandatory and require 
clients to use "filters" instead: If someone has not understood that the 
fundamental nature of synchronizing data between machines via diffs requires 
some kind of initial state synchronization, he should not yet be writing a 
client which is driven by change-notifications. 
So having the synchronization be mandatory induces client authors to notice 
their misconception about synchronization which I just elaborated a lot about. 
Hopefully they will wind up reading a FAQ entry then :)
Further, the "filter out EVERYTHING before start of subscription" filter is 
just a special case of synchronization where the full dataset at start of the 
subscription is *empty*. So having filters in addition to initial 
synchronization is the most generic implementation possible.
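
To illustrate the filter idea: filters are not implemented yet, so the 
following is only a guess at how they could look. IdentitySketch is a 
simplified stand-in for a WOT identity, and the three predicates are invented 
names for the "only own identities", "only positive score" and "empty initial 
dataset" cases discussed above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class SubscriptionFilterSketch {
    /** Hypothetical, heavily simplified stand-in for a WOT identity. */
    static final class IdentitySketch {
        final String id;
        final boolean own;
        final int score;

        IdentitySketch(String id, boolean own, int score) {
            this.id = id;
            this.own = own;
            this.score = score;
        }
    }

    /** What a publish-only client such as FlogHelper would need. */
    static final Predicate<IdentitySketch> ONLY_OWN = i -> i.own;
    /** What a client fetching data from remote identities would need. */
    static final Predicate<IdentitySketch> POSITIVE_SCORE = i -> i.score > 0;
    /** The special case where the initial dataset is empty. */
    static final Predicate<IdentitySketch> NOTHING = i -> false;

    /** The initial synchronization is always sent; the filter only decides how big it is. */
    static List<IdentitySketch> initialDataset(List<IdentitySketch> all,
                                               Predicate<IdentitySketch> filter) {
        List<IdentitySketch> result = new ArrayList<>();
        for (IdentitySketch identity : all) {
            if (filter.test(identity)) result.add(identity);
        }
        return result;
    }
}
```

With this shape there is no separate "no synchronization" code path: the 
NOTHING filter simply yields an empty, but still well-defined, initial state.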

> > The initial state could be huge. Please don't send it as part of the FCP
> > message's SimpleFieldSet. Either split it up into multiple messages
> > (e.g. one per identity; but then make sure there is some way of telling
> > that you got all of them), or (probably better) put it in a Bucket i.e.
> > a data field.
> 
> I will go further and say I think it very rarely makes sense for a
> client application to have to pull in (and keep up-to-date with!) WoT's
> entire state in the first place. I would expect queries and subscribing
> to changes in their results.

Forum systems such as Freetalk *do* want to give every identity which is 
trusted a chance to post messages, so they need to know about them. That will 
apply to most identities in the database. So synchronizing almost the full set 
of identities will happen.

> >> I had a better idea though: The end of this  mail will contain a dump of
> >> the FCP traffic between a client of event-notifications and WOT. So you
> >> can have a look at how the communication typically looks.
> >> Before that, there will be a copy-pasta of the JavaDoc of the "Subscribe"
> >> FCP function - this is the initial function which enables
> >> event-notifications. Its JavaDoc will give you a rough idea of how to
> >> interpret the following FCP dump.
> >> 
> >> NOTICE: Even though by this mail significant effort is being spent on
> >> making FCP easy to use by client authors, you should only implement an
> >> actual FCP client as a last resort: I've done the job of implementing a
> >> Java class
> >> "FCPClientReferenceImplementation" which serves the purpose of providing
> >> a
> >> reference client to event-notifications. This took around 3 weeks, is
> >> ~1000
> >> lines and very thoroughly tested. Please use it.
> 
> Very nice! I think this is an excellent layer of separation to have.

Hooray :)

> Client applications should not always have to deal with parsing text to
> do WoT things. Maybe this interface could even be cleanly replaced in
> the future with something more direct where appropriate like OSGi?

What's visible of it from the outside of FCPClientReferenceImplementation is 
very primitive, basically
- start() / stop()
- subscribe() / unsubscribe()
- Generic single-function interface for receiving the initial synchronization, 
the function is "handleSubscriptionSynchronization(Collection<T> )". T is the 
type to which you subscribe()
- Generic single-function interface for receiving change notifications. 
handleSubscribedObjectChanged(ChangeSet<T>). T again is the type to which you 
subscribed. The ChangeSet is a pair which provides a copy of the changed 
object before and after the change.

So it doesn't touch any specifics of FCP, almost not even those of WOT - the 
chances of being able to use a different backend than FCP are good.
Also, the backend implementation of event-notifications, i.e. the "server", 
isn't hardcoded to only work with FCP, the type of client is already 
abstracted into an enum, and FCP is just one of the types.
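
The two callbacks listed above could look roughly like this from a client 
author's point of view. The method names are taken from the list, but the 
interface and classes below are simplified stand-ins written for this 
example. The real declarations in FCPClientReferenceImplementation may differ. 
It is also only an assumption here that a null "before" means a new object and 
a null "after" means a deleted one.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Stand-in for the ChangeSet pair: the object before and after the change. */
final class ChangeSetSketch<T> {
    final T beforeChange; // assumption: null if the object is new
    final T afterChange;  // assumption: null if the object was deleted

    ChangeSetSketch(T beforeChange, T afterChange) {
        this.beforeChange = beforeChange;
        this.afterChange = afterChange;
    }
}

/** Stand-in for the two generic single-function interfaces, merged into one. */
interface SubscriptionHandlerSketch<T> {
    void handleSubscriptionSynchronization(Collection<T> allObjects);
    void handleSubscribedObjectChanged(ChangeSetSketch<T> changeSet);
}

/** Toy client which mirrors a list of nicknames. */
class NicknameListSketch implements SubscriptionHandlerSketch<String> {
    final List<String> nicknames = new ArrayList<>();

    @Override
    public void handleSubscriptionSynchronization(Collection<String> allObjects) {
        nicknames.clear();
        nicknames.addAll(allObjects);
    }

    @Override
    public void handleSubscribedObjectChanged(ChangeSetSketch<String> changeSet) {
        if (changeSet.beforeChange != null) nicknames.remove(changeSet.beforeChange);
        if (changeSet.afterChange != null) nicknames.add(changeSet.afterChange);
    }
}
```

Nothing in these signatures mentions FCP, which is what makes a different 
transport plausible.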

> >> FCP "Subscribe" JavaDoc (removed some uninteresting parts):
> >> -------------------------------------------------------------------------
> >> ----- Processes the "Subscribe" FCP message, filing a {@link
> >> Subscription} to event-{@link Notification}s via {@link
> >> SubscriptionManager}.
> 
> So far so good.
> 
> >> Required fields:
> >> "To" = "Identities" | "Trusts" | "Scores" - chooses among {@link
> >> IdentitiesSubscription} / {@link TrustsSubscription} /
> >> {@link ScoresSubscription}.
> 
> It took me a few minutes to understand that this means that the "To"
> field can be one of the values "Identities", "Trusts", or "Scores". It'd
> be helpful if it was clearer somehow. Maybe: To =
> [Identities|Trusts|Scores] ? Maybe just English.

I've fixed the JavaDoc to use "or". Thank you.

> Is my understanding correct that this is an internal WoT function for
> parsing FCP from clients, and not a part of the reference event
> notifications client? It's just for understanding the FCP fields in use?

It is an internal function of the FCP server, yes. Currently, we have no 
separate place for documenting the API of FCP calls. Instead, people who want 
to write an FCP client are told to read the JavaDoc of the internal FCP server 
functions which handle the FCP calls which they want to use.

> >> This message is sent via the synchronous FCP-API: You can signal that
> >> processing it failed by returning an error in the FCP message processor.
> >> This allows your client to be programmed in a transactional style: If
> >> part of the transaction which
> >> stores the dataset fails, you can just roll it back and signal the error
> >> to
> >> WOT. It will rollback the subscription then and
> >> send an "Error" message, indicating that subscribing failed. You must
> >> file
> >> another subscription attempt then.
> 
> This sounds like it's just for debugging purposes. Is there a way in
> which failure to apply an incremental change to the state is not a bug?

No, it's NOT just for debugging.
Transactional programming means that you assume that the contents of your 
database are always 100% valid in terms of logically making sense. 
You are free to do partial changes to the database which cause a nonsensical 
state for as long as after the transaction the database is 100% semantically 
correct again. Disruptive changes are always wrapped in a transaction, and the 
transaction is either committed fully or thrown away completely. 
So transactions are atomic.
Leaving out a single part of an event-notifications subscription chain means 
that the semantic integrity of the client would be damaged: The data does not 
match the actual WOT database contents anymore if you leave out the initial 
synchronization or an event in the event-chain!

The good thing about transactions is that they can guard against failure of 
LOTS of things in a complex process. If a minor, very low level detail of low 
level code fails, you just throw an exception to the high level code. It 
will rollback() the transaction, and event-notifications will cause it to 
happen again. While programming transactionally, you basically assume that any 
code can fail, even if you cannot come up with an example of WHY it might 
fail.
The core point I'm trying to make is that the "top level" code which actually 
initiates the transactions is what makes that assumption. It considers the 
actual workers which do the processing of the transaction as little magic 
black boxes which can fail for weird reasons, and are free to do so. It might 
be out of memory, or whatever. You throw one worker away, and let the next try 
to do the transaction again. The top level code in our case is the event-
notifications server of WOT: It allows event-processors to fail, and can resend 
the event then.
A simple example of something which can and will happen in practice: Shutdown. 
At shutdown, there might be threads running to process event-notifications. 
Some of them might be in "waiting" state, waiting for a lock on the database 
where they store identities/trusts/scores. The database might be terminated in 
between, so the waiting event-processors cannot process the event. As event-
notifications might be persistent in the future, it is necessary that WOT 
resends the aborted event-notification when the client plugin is started again.
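
A minimal sketch of what "transactional style" means on the client side. No 
real database is involved: the "transaction" is modelled as a working copy 
which is either swapped in completely or thrown away. All names are invented, 
and the simulateFailure flag only stands in for "some worker failed for 
whatever reason".

```java
import java.util.HashMap;
import java.util.Map;

class TransactionalHandlerSketch {
    private Map<String, String> committed = new HashMap<>();

    /**
     * Processes one notification. Returns true if it was committed.
     * Returning false corresponds to signalling failure to the sender,
     * which is then expected to send the same notification again.
     */
    boolean handleNotification(String id, String newValue, boolean simulateFailure) {
        Map<String, String> working = new HashMap<>(committed); // begin
        try {
            working.put(id, newValue);
            if (simulateFailure)
                throw new RuntimeException("worker failed, reason does not matter");
            committed = working; // commit: all changes become visible at once
            return true;
        } catch (RuntimeException e) {
            // rollback: the working copy is dropped, 'committed' is untouched
            return false;
        }
    }

    Map<String, String> snapshot() { return new HashMap<>(committed); }

    public static void main(String[] args) {
        TransactionalHandlerSketch client = new TransactionalHandlerSketch();
        boolean ok = client.handleNotification("A", "v1", true);
        if (!ok) ok = client.handleNotification("A", "v1", false); // the resend
        System.out.println(ok + " " + client.snapshot()); // prints "true {A=v1}"
    }
}
```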

> >> The second message is formatted as:
> >> "Message" = "Subscribed"
> >> "SubscriptionID" = Random {@link UUID} of the Subscription.
> >> "To" = Same as the "To" field of your original message.
> 
> I'm curious why this ID is not in the first message? Why two messages?

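If you indicate failure of processing the initial synchronization message, you 
are not subscribed because you don't have the full dataset available and it is 
impossible to obtain it without the initial synchronization. So the 
"Subscribed" message with the ID is only sent after you have processed the 
synchronization successfully.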

> >> Errors:
> >> If you are already subscribed to the selected type, you will only receive
> >> a
> >> message:
> >> "Message" = "Error"
> >> "Description" =
> >> "plugins.WebOfTrust.SubscriptionManager$SubscriptionExistsAlreadyExceptio
> >> n"
> >> "SubscriptionID" = Same as in the original "Subscribed" message
> >> "To" = Same as you requested
> >> "OriginalMessage" = "Subscribe"
> 
> Would it be appropriate to have an error code so that some error types
> can be machine-readable? I remember running into difficulties with this
> sort of thing with my Infocalypse patches.

The "Description" is the actual class name of the exception, so it should be 
safe to assume that the same literal value is always used.
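
For example, a client could match on that literal like this. The field names 
and the class name string are the ones from the JavaDoc quoted above. The 
helper class itself, and representing a message as a plain Map, are 
assumptions made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

class ErrorMessageSketch {
    /** Class name of the exception as it appears in the "Description" field. */
    static final String SUBSCRIPTION_EXISTS =
        "plugins.WebOfTrust.SubscriptionManager$SubscriptionExistsAlreadyException";

    /** True if the message says that we are already subscribed to this type. */
    static boolean isAlreadySubscribed(Map<String, String> message) {
        return "Error".equals(message.get("Message"))
            && SUBSCRIPTION_EXISTS.equals(message.get("Description"));
    }

    public static void main(String[] args) {
        Map<String, String> message = new HashMap<>();
        message.put("Message", "Error");
        message.put("Description", SUBSCRIPTION_EXISTS);
        message.put("To", "Identities");
        message.put("OriginalMessage", "Subscribe");
        System.out.println(isAlreadySubscribed(message)); // prints "true"
    }
}
```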

> >> {@link Notification}s:
> >> Further  messages will be sent at any time in the future if an {@link
> >> Identity} / {@link Trust} / {@link Score}
> >> object has changed. They will contain the version of the object before
> >> the
> >> change and after the change.
> 
> Is there a reason to contain the before? Is it to relieve the client of
> having to store all values to see how much the change was?

I didn't have any specific use in mind, I just thought that it was easy to 
implement and nice to have.
One example for its use might be: It allows the client to easily monitor 
special event types which it has invented on its own and which are constituted 
by a certain member value of identity/trust/score having changed. For example 
you might be interested in a certain "property" of an identity having changed. 
Remember: Identity "properties" are key/value pairs which are intended as 
primitive storage for client applications. For example FlogHelper stores which 
flogs an identity publishes in those k/v pairs. A client might want to notify 
you once your favorite identities publish a new flog.

Also, for debugging purposes, it allows you to check your own database 
contents with every message. This *is* implemented by DebugFCPClient.
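
A sketch of such a client-defined event: comparing one property in the 
"before" and "after" versions of an identity. The properties are modelled as 
plain Maps, and the property key "Flogs" is made up for the example. It is not 
claimed to be the key which FlogHelper actually uses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class PropertyChangeSketch {
    /** True if the given property was added, removed or got a different value. */
    static boolean propertyChanged(Map<String, String> propertiesBefore,
                                   Map<String, String> propertiesAfter,
                                   String key) {
        return !Objects.equals(propertiesBefore.get(key), propertiesAfter.get(key));
    }

    public static void main(String[] args) {
        Map<String, String> before = new HashMap<>();
        before.put("Flogs", "1"); // hypothetical property key and value
        Map<String, String> after = new HashMap<>(before);
        after.put("Flogs", "2");

        if (propertyChanged(before, after, "Flogs"))
            System.out.println("Identity changed its flog property");
    }
}
```

Without the "before" copy, the client would have to look up the old value in 
its own storage first to make the same decision.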

> >> These messages are also sent with the synchronous FCP API. In opposite to
> >> the initial synchronization message, by replying with failure to the
> >> synchronous FCP call, you can signal that you want to receive the same
> >> notification again.
> This is inconsistent,

It is, and that annoyed me while implementing it. I would rather have both the 
synchronization and the event-notifications be automatically resent. But 
automatic resending of the initial synchronization would be difficult to 
implement. Also, it is a LARGE dataset and therefore we should avoid sending 
it too often.

> and it is confusing to me. If there is not an
> error during transmission what could resending a message achieve? The
> client already got the message.

As explained above, the goal is to allow transactional programming in the 
client. A transactional, and as such in this case especially database-based 
client ONLY keeps its transactional database as storage: One of the points of 
using a database is to be able to store data which is so large that it doesn't 
fit into memory. Therefore, a truly database-based client doesn't keep ANYTHING 
in memory permanently. If an event causes the need for it to do something, it 
only queries from the database what is needed to process the event - which 
will fit into memory. After processing finishes, all in-memory objects are 
flushed. For example consider forum systems such as Freetalk: If the user wants 
to display the threads in a forum, you only query the first 50 threads in the 
forum and display them as page 1. You do NOT load the other 1000000 pages into 
memory, they wouldn't fit and the user isn't viewing them now anyway.

So if the transaction aimed at storing it to the database fails, there is no 
other place where the client can store it to retry the transaction, because 
conceptually it shall not keep ANYTHING in memory permanently. If it did keep 
a queue of failed event-notifications in memory, what happens if 100000 of them 
fail in a row? Out of memory! What happens if they had initially failed due to 
out of memory? Out of memory ^ 2.
The source of information which triggered the transaction must re-trigger it 
if it failed because it has stored its existence anyway. In our case the 
source is WOT.

> >> After a typical delay of {@link
> >> SubscriptionManager#PROCESS_NOTIFICATIONS_DELAY}, it will be re-sent.
> >> There is a maximal amount of {@link
> >> SubscriptionManager#DISCONNECT_CLIENT_AFTER_FAILURE_COUNT} failures per
> >> FCP- Client.
> >> If you exceed this limit, your subscriptions will be terminated. You will
> >> receive an "Unsubscribed" message then as long as your client has not
> >> terminated the FCP connection. See {@link
> >> #handleUnsubscribe(SimpleFieldSet)}. The fact that you can request a
> >> notification to be re-sent may also be used to program your client in a
> >> transactional style.
> >> If the transaction which processes an event-notification fails, you can
> >> indicate failure to the synchronous FCP sender and
> >> WOT will then re-send the notification, causing the transaction to be
> >> retried.
> The notion of resending enabling transactional state changes seems at
> odds with being unsubscribed after hitting a limit on them. 

This is necessary to guard against bugs which cause WOT not to notice that a 
client has disconnected. If the client has disconnected, resending its 
notifications will fail continuously and WOT will just disconnect it due to 
that.
If there was no such mechanism, ghost clients would cause the database to grow 
indefinitely from notifications which cannot be deployed.
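
The failure limit can be pictured as a per-client counter. This sketch 
borrows the constant name from the JavaDoc quoted above, but its value is 
invented, the resend delay is left out, and whether the real counter is ever 
reset on success is not specified here. The client is modelled as a Predicate 
which returns false when processing a notification failed.

```java
import java.util.function.Predicate;

class FailureCounterSketch {
    /** Name as in the JavaDoc; the value 5 is made up for this example. */
    static final int DISCONNECT_CLIENT_AFTER_FAILURE_COUNT = 5;

    private int failures = 0;
    private boolean subscribed = true;

    /**
     * One delivery attempt. Returns true if the client processed the
     * notification, so it can be deleted. Returns false if it has to stay
     * queued for a resend, or if the client was already unsubscribed.
     */
    boolean attempt(String notification, Predicate<String> client) {
        if (!subscribed) return false;
        if (client.test(notification)) return true;
        // Exceeding the limit terminates the subscription ("ghost client").
        if (++failures > DISCONNECT_CLIENT_AFTER_FAILURE_COUNT) subscribed = false;
        return false;
    }

    boolean isSubscribed() { return subscribed; }
}
```

A client which fails a few transactions and then recovers stays subscribed. 
Only one which keeps failing, such as a client which silently went away, 
crosses the limit.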

> >> If your client is shutting down or not interested in the subscription
> >> anymore, you should send an "Unsubscribe" message.
> >> See {@link #handleUnsubscribe(SimpleFieldSet)}. This will make sure that
> >> WOT stops gathering data for your subscription,
> >> which would be expensive to do if its not even needed. But if you cannot
> >> send the message anymore due to a dropped connection,
> >> the subscription will be terminated automatically after some time due to
> >> notification-deployment failing. Nevertheless,
> >> please always unsubscribe when possible.
> 
> Does WoT not have visibility on when FCP connections close?
> How does WoT
> handle pushing messages over FCP? I had to jump through hoops to do that
> with Infocalypse.

What it keeps as a "connection" for being able to send to the client in a 
pushing manner is WeakReference<PluginReplySender> . PluginReplySender is the 
object which you get when a plugin uses a PluginTalker to send a message to 
you as a server. In other words: The client uses PluginTalker to send the 
original message to the server. The server's message handler gets called by 
the node, and the node gives it a PluginReplySender for being able to answer. 
By keeping a WeakReference<PluginReplySender>, WOT can keep the sender in 
memory for being able to deploy notifications in the future. Because it is a 
WeakReference, it will get garbage-collected if the client drops its 
PluginTalker. WOT also monitors a ReferenceQueue on the WeakReference objects, 
which allows it to notice if one of their pointers got GC'ed, and purge the 
WeakReference object itself.
However, this is not immediate, and it is not a strong mechanism - GC might 
happen very far in the future. 
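
The WeakReference / ReferenceQueue bookkeeping can be sketched with the plain 
java.lang.ref API. This is not WOT's code: the registry class is invented, and 
the generic type S merely stands in for PluginReplySender.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical registry of weakly referenced reply senders. */
class WeakSenderRegistrySketch<S> {
    private final ReferenceQueue<S> queue = new ReferenceQueue<>();
    private final Set<Reference<? extends S>> senders = new HashSet<>();

    /** Keeps the sender reachable for us without preventing its collection. */
    WeakReference<S> register(S sender) {
        WeakReference<S> ref = new WeakReference<>(sender, queue);
        senders.add(ref);
        return ref;
    }

    /**
     * Drops all references whose sender has been garbage-collected.
     * The GC puts such references into the queue at some point after the
     * client dropped its side; there is no guarantee about when.
     */
    int purgeCollected() {
        int purged = 0;
        Reference<? extends S> dead;
        while ((dead = queue.poll()) != null) {
            if (senders.remove(dead)) purged++;
        }
        return purged;
    }

    int size() { return senders.size(); }
}
```

Since the purge only happens after the garbage collector has run, this alone 
cannot tell promptly that a client is gone.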

So to answer the original question: The PluginTalker API just does not support 
explicit disconnection. I decided it would be easier to just deal with it as 
is than changing it.
Further, network connections can randomly drop dead, so you need to assume 
that both graceful disconnection as well as random death (= timeout) can 
happen. The "Unsubscribe" message is the graceful disconnection. The exceeding 
of the event-notification failure counter deals with random death.
Graceful disconnection is usually implemented in addition to timeout for 
performance reasons: The quicker we get told that the client is disconnected, 
the quicker we can stop gathering data for it. So that's why you are also able 
to unsubscribe.

> 
> >> -------------------------------------------------------------------------
> >> -----
> >> 
> >> FCP dump of a typical connection which subscribes to all types of
> >> objects::
> >> (Notice that the duplicate fields are for backwards compatibility with
> >> old
> >> clients.
> >> They are present even in event-notifications because the functions for
> >> generating
> >> FCP data are re-used in different areas of code.)
> >> -------------------------------------------------------------------------
> >> ----- ---------------- Fri Oct 25 02:14:07 CEST 2013 Connected.
> >> ---------------- ---------------- Fri Oct 25 02:14:07 CEST 2013 Sent:
> >> ---------------- Message=Subscribe
> >> To=Identities
> >> End
> 
> I guess it makes sense that it needs no more information if there are
> three predetermined things a client can subscribe to.

?

> >> ---------------- Fri Oct 25 02:14:08 CEST 2013 Received: ----------------
> >> Message=Identities
> >> Identities.Amount=5
> >> Identities.0.CurrentEditionFetchState=NotFetched
> >> Identities.0.ID=QeTBVWTwBldfI-lrF~xf0nqFVDdQoSUghT~PvhyJ1NE
> >> Identities.0.Identity=QeTBVWTwBldfI-lrF~xf0nqFVDdQoSUghT~PvhyJ1NE
> >> Identities.0.PublishesTrustList=true
> >> Identities.0.RequestURI=USK@QeTBVWTwBldfI-
> >> lrF~xf0nqFVDdQoSUghT~PvhyJ1NE,OjEywGD063La2H-
> >> IihD7iYtZm3rC0BP6UTvvwyF5Zh4,AQACAAE/WebOfTrust/1344
> >> Identities.0.Type=Identity
> >> Identities.0.Contexts.Amount=0
> >> Identities.0.Properties.Amount=0
> >> Identities.1.CurrentEditionFetchState=NotFetched
> >> Identities.1.ID=D3MrAR-AVMqKJRjXnpKW2guW9z1mw5GZ9BB15mYVkVc
> >> Identities.1.Identity=D3MrAR-AVMqKJRjXnpKW2guW9z1mw5GZ9BB15mYVkVc
> >> Identities.1.PublishesTrustList=true
> >> Identities.1.RequestURI=USK@D3MrAR-
> >> AVMqKJRjXnpKW2guW9z1mw5GZ9BB15mYVkVc,xgddjFHx2S~5U6PeFkwqO5V~1gZngFLoM-
> >> xaoMKSBI8,AQACAAE/WebOfTrust/4959
> >> Identities.1.Type=Identity
> >> Identities.1.Contexts.Amount=0
> >> Identities.1.Properties.Amount=0
> >> Identities.2.CurrentEditionFetchState=NotFetched
> >> Identities.2.ID=s88mAwLB3OW6mYlZ43XaHDM1K6QXosZ4QTt2UX-hq6s
> >> Identities.2.Identity=s88mAwLB3OW6mYlZ43XaHDM1K6QXosZ4QTt2UX-hq6s
> >> Identities.2.PublishesTrustList=true
> >> Identities.2.RequestURI=USK@s88mAwLB3OW6mYlZ43XaHDM1K6QXosZ4QTt2UX-
> >> hq6s,555tpw1TUReXUixAMDQD3RcD6gUKwOBCDQ6Dot2v6qg,AQACAAE/WebOfTrust/5
> >> Identities.2.Type=Identity
> >> Identities.2.Contexts.Amount=0
> >> Identities.2.Properties.Amount=0
> >> Identities.3.CurrentEditionFetchState=NotFetched
> >> Identities.3.ID=z9dv7wqsxIBCiFLW7VijMGXD9Gl-EXAqBAwzQ4aq26s
> >> Identities.3.Identity=z9dv7wqsxIBCiFLW7VijMGXD9Gl-EXAqBAwzQ4aq26s
> >> Identities.3.PublishesTrustList=true
> >> Identities.3.RequestURI=USK@z9dv7wqsxIBCiFLW7VijMGXD9Gl-
> >> EXAqBAwzQ4aq26s,4Uvc~Fjw3i9toGeQuBkDARUV5mF7OTKoAhqOA9LpNdo,AQACAAE/WebOf
> >> Trust/1270 Identities.3.Type=Identity
> >> Identities.3.Contexts.Amount=0
> >> Identities.3.Properties.Amount=0
> >> Identities.4.CurrentEditionFetchState=NotFetched
> >> Identities.4.ID=o2~q8EMoBkCNEgzLUL97hLPdddco9ix1oAnEa~VzZtg
> >> Identities.4.Identity=o2~q8EMoBkCNEgzLUL97hLPdddco9ix1oAnEa~VzZtg
> >> Identities.4.PublishesTrustList=true
> >> Identities.4.RequestURI=USK@o2~q8EMoBkCNEgzLUL97hLPdddco9ix1oAnEa~VzZtg,X
> >> ~vTpL2LSyKvwQoYBx~eleI2RF6QzYJpzuenfcKDKBM,AQACAAE/WebOfTrust/9379
> >> Identities.4.Type=Identity
> >> Identities.4.Contexts.Amount=0
> >> Identities.4.Properties.Amount=0
> >> End
> 
> It seems evident that actual responses will be orders of magnitude larger.

Yes.

> Why are both Identity and ID specified as the same thing? I was under
> the impression is is an identity ID - would just ID be okay? 
> IIRC it's already not going to be backwards-compatible with existing clients 
> due to the full stops between key segments.

Yes it is for backwards compatibility even though it still doesn't match what 
existing clients expect:
The low-level backwards-compatible code which produces the stuff after the dot 
does not know about the high-level stuff which adds the dot.
Basically, there is a function "addIdentityFields(..., String prefix, String 
suffix)". Of course I could have chosen to have the function contain some weird 
logic which only adds certain fields for certain prefix/suffix combinations but 
I think that's ugly because it doesn't cleanly separate code into different 
areas of concern. Arguably you already did this with part of the function, but 
I really didn't want it to get even more complicated.

>  What are Contexts.Amount and Properties.Amount for? The number of each
> that are given? I think "Count" might be a clearer name.

I have been using "amount" as a synonym for "count" for years :( 
Is it really not a synonym?
I'm not a native English speaker, sorry.

> >> ---------------- Fri Oct 25 02:15:20 CEST 2013 Received: ----------------
> >> Message=IdentityChangedNotification
> >> AfterChange.Context0=Introduction
> 
> I think a key of AfterChange.Context.0 would be preferable for
> consistent separation of the number. Why isn't there an
> AfterChange.Context (without the trailing 0) like the other fields?

Legacy as well. Sorry. I should have stripped all legacy stuff from the 
original message. The most recent code path produces the 
following (which you can see below is also part of the original message):
"AfterChange.Identities.0.Contexts.0.Name=Introduction"

> >> AfterChange.CurrentEditionFetchState=Fetched
> >> AfterChange.CurrentEditionFetchState0=Fetched
> 
> I'm confused why these and others are duplicated save for a 0 on the end
> of the key. 

Both those fields are also legacy. We have 3 syntaxes of which 2 are legacy. 
The one without 0 is from the oldest code path. Then someone suggested to 
number everything so parsers can eat both versions (you?). Then it was 
suggested to split everything with a dot.

> Where are the meanings of the possible values for fields
> like this documented?

I was reluctant to bloat the "Subscribe" documentation with full documentation 
of the Identity/Trust/Score syntax. So there currently is no documentation :| 
You could get me to write it at the underlying functions which generate 
the SFS data by asking me to do it.

But it can be clearly seen what is the most recent syntax by looking at the 
reference parser implementations IdentityParser / TrustParser / ScoreParser 
which are member classes of FCPClientReferenceImplementation:
https://github.com/freenet/plugin-WoT-staging/blob/master/src/plugins/WebOfTrust/ui/fcp/FCPClientReferenceImplementation.java
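
For orientation, reading the dot-separated syntax from the dump boils down to 
something like the following. This is a simplified sketch written for this 
mail, not the IdentityParser from the reference implementation. It treats a 
message as a plain Map of keys to values and only extracts the IDs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IdentityFieldsSketch {
    /** Reads "Identities.Amount", then "Identities.N.ID" for each index N. */
    static List<String> parseIdentityIDs(Map<String, String> fields) {
        int amount = Integer.parseInt(fields.get("Identities.Amount"));
        List<String> ids = new ArrayList<>(amount);
        for (int i = 0; i < amount; i++) {
            String key = "Identities." + i + ".ID";
            String id = fields.get(key);
            if (id == null) throw new IllegalArgumentException("Missing field: " + key);
            ids.add(id);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Two of the identities from the dump above, reduced to the ID field.
        Map<String, String> fields = new HashMap<>();
        fields.put("Identities.Amount", "2");
        fields.put("Identities.0.ID", "QeTBVWTwBldfI-lrF~xf0nqFVDdQoSUghT~PvhyJ1NE");
        fields.put("Identities.1.ID", "D3MrAR-AVMqKJRjXnpKW2guW9z1mw5GZ9BB15mYVkVc");
        System.out.println(parseIdentityIDs(fields).size()); // prints "2"
    }
}
```

The legacy keys without a dot, or with a trailing 0, are simply ignored by a 
parser of this shape.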

> >> AfterChange.ID=WNOyZsnZtpFjwmwfVBqC1PhSeg-hErXHlkrR43h0tiU
> >> AfterChange.ID0=WNOyZsnZtpFjwmwfVBqC1PhSeg-hErXHlkrR43h0tiU
> >> AfterChange.Identity=WNOyZsnZtpFjwmwfVBqC1PhSeg-hErXHlkrR43h0tiU
> >> AfterChange.Identity0=WNOyZsnZtpFjwmwfVBqC1PhSeg-hErXHlkrR43h0tiU
> >> AfterChange.InsertURI=USK@AILRi~9nfD2pesTkeDvwZxe3cRmkY7Q00CUxQyUOVW-
> >> H,GzEIwcFQ78J7-RCzxgvY4Pfq0T8Lm4v0BazjMtkqT~8,AQECAAE/WebOfTrust/0
> >> AfterChange.InsertURI0=USK@AILRi~9nfD2pesTkeDvwZxe3cRmkY7Q00CUxQyUOVW-
> >> H,GzEIwcFQ78J7-RCzxgvY4Pfq0T8Lm4v0BazjMtkqT~8,AQECAAE/WebOfTrust/0
> >> AfterChange.Nickname=Alexandre_Umpleby
> >> AfterChange.Nickname0=Alexandre_Umpleby
> >> AfterChange.PublishesTrustList=true
> >> AfterChange.PublishesTrustList0=true
> >> AfterChange.RequestURI=USK@WNOyZsnZtpFjwmwfVBqC1PhSeg-
> >> hErXHlkrR43h0tiU,GzEIwcFQ78J7-
> >> RCzxgvY4Pfq0T8Lm4v0BazjMtkqT~8,AQACAAE/WebOfTrust/0
> >> AfterChange.RequestURI0=USK@WNOyZsnZtpFjwmwfVBqC1PhSeg-
> >> hErXHlkrR43h0tiU,GzEIwcFQ78J7-
> >> RCzxgvY4Pfq0T8Lm4v0BazjMtkqT~8,AQACAAE/WebOfTrust/0
> >> AfterChange.Type=OwnIdentity
> >> AfterChange.Type0=OwnIdentity
> 
> Would this API change be a place to change the name to LocalIdentity?

$ grep -R OwnIdentity WebOfTrust/src | wc -l
508

I don't think that we can/should ever change it. Are you 100% sure that it 
doesn't make any sense in terms of the English language?

> >> AfterChange.Contexts.Amount=1
> >> AfterChange.Contexts.0.Name=Introduction
> >> AfterChange.Contexts0.Amount=1
> >> AfterChange.Contexts0.Context0=Introduction
> 
> I notice - for example - "Contexts" is plural and in the initial sync
> message "Identity" is singular.

I cannot find the singular you are talking about.

> It'd make sense to me to make the
> components consistently one way or another. I have a preference for
> singular because it's shorter.

It is plural because:
- the code contains something like 
"subscribeTo(SubscriptionType.Identities)", I wanted the enum names to 
match the string literals, and "subscribeTo(Identity)" makes less sense
- it DOES contain multiple identities.
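
As a rough sketch of the enum/string matching described above (Python for 
brevity; SubscriptionType here is an illustrative stand-in, not WOT's actual 
Java enum, and type_from_message_name is a hypothetical helper):

```python
from enum import Enum


class SubscriptionType(Enum):
    # Member names deliberately equal the plural strings used in messages.
    Identities = "Identities"
    Trusts = "Trusts"
    Scores = "Scores"


def type_from_message_name(message_name: str) -> SubscriptionType:
    # Enum lookup by member name; raises KeyError for unknown names
    # such as the singular "Identity".
    return SubscriptionType[message_name]
```

Because the member name and the string literal are identical, no separate 
mapping table between the two is needed.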

> >> AfterChange.Identities.Amount=1
> >> AfterChange.Identities.0.Context0=Introduction
> >> AfterChange.Identities.0.CurrentEditionFetchState=Fetched
> >> AfterChange.Identities.0.ID=WNOyZsnZtpFjwmwfVBqC1PhSeg-hErXHlkrR43h0tiU
> >> AfterChange.Identities.0.Identity=WNOyZsnZtpFjwmwfVBqC1PhSeg-hErXHlkrR43h
> >> 0tiU
> >> AfterChange.Identities.0.InsertURI=USK@AILRi~9nfD2pesTkeDvwZxe3cRmkY7Q00
> >> CUxQyUOVW-
> >> H,GzEIwcFQ78J7-RCzxgvY4Pfq0T8Lm4v0BazjMtkqT~8,AQECAAE/WebOfTrust/0
> >> AfterChange.Identities.0.Nickname=Alexandre_Umpleby
> >> AfterChange.Identities.0.PublishesTrustList=true
> >> AfterChange.Identities.0.RequestURI=USK@WNOyZsnZtpFjwmwfVBqC1PhSeg-
> >> hErXHlkrR43h0tiU,GzEIwcFQ78J7-
> >> RCzxgvY4Pfq0T8Lm4v0BazjMtkqT~8,AQACAAE/WebOfTrust/0
> >> AfterChange.Identities.0.Type=OwnIdentity
> >> AfterChange.Identities.0.Contexts.Amount=1
> >> AfterChange.Identities.0.Contexts.0.Name=Introduction
> 
> The ".Name" part seems unnecessary to me.

Mmh, I think that is too small a change to justify touching the code. And 
contexts might receive additional attributes once more complex stuff is 
implemented, such as per-context trust.
[Per-context trust might allow you to rate the behavior of identities only for 
certain client applications. While it would be an optimization in some areas, 
it would probably be a full rewrite of most of WOT, and is unlikely to 
happen. It has been demanded by Matthew rather often though, because of the 
performance impact. I think it would overcomplicate things. Further, either 
someone is a spammer or he isn't. If a person spams in one client application, 
why would you want to trust him in another?]

> >> AfterChange.Identities.0.Properties.Amount=1
> >> AfterChange.Identities.0.Properties.0.Name=IntroductionPuzzleCount
> >> AfterChange.Identities.0.Properties.0.Value=10
> 
> How about AfterChange.Identity.0.Property.IntroductionPuzzleCount=10 ?

Mmh, nice catch. Properties are key/value pairs, and so is SFS, so you suggest 
using the property key as the SFS key and the property value as the SFS value. 
Good idea.

But it is susceptible to a mismatch between the strings allowed in the 
key-space of properties and those allowed in the key-space of SFS. 
Typically, key/value implementations choose well-defined, small allowed spaces 
for keys, while allowing arbitrary crap in the values. So the as-is code 
guards against SFS being more restrictive in key-space than properties by only 
using the well-defined keys "Name" / "Value", while shoving all user-defined 
data into the values only. Remember, properties are downloaded from the 
network, and their keys are chosen arbitrarily by the users. There is some 
limiting, but I would still have to review it.
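
To illustrate the key-space argument, here is a minimal sketch (Python, using 
a plain dict in place of an SFS; encode_properties and decode_properties are 
hypothetical helpers, not WOT code). The field layout follows the 
"Properties.Amount" / "Properties.N.Name" / "Properties.N.Value" lines quoted 
in this thread. It does not attempt any escaping of values.

```python
def encode_properties(properties, prefix="Properties"):
    """Flatten user-defined properties into fields with fixed, known keys."""
    fields = {f"{prefix}.Amount": str(len(properties))}
    for index, (name, value) in enumerate(properties.items()):
        # The user-chosen property name only ever appears as a *value*.
        fields[f"{prefix}.{index}.Name"] = name
        fields[f"{prefix}.{index}.Value"] = value
    return fields


def decode_properties(fields, prefix="Properties"):
    """Rebuild the property dict from the fixed-key fields."""
    amount = int(fields[f"{prefix}.Amount"])
    return {
        fields[f"{prefix}.{index}.Name"]: fields[f"{prefix}.{index}.Value"]
        for index in range(amount)
    }


if __name__ == "__main__":
    fields = encode_properties({"IntroductionPuzzleCount": "10"})
    for key, value in fields.items():
        print(f"{key}={value}")
```

With this layout, a property named e.g. "odd.key=name" round-trips unchanged, 
whereas using it directly as a field key would collide with the "." and "=" 
separators.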

> >> AfterChange.Identities.0.Property0.Name=IntroductionPuzzleCount
> >> AfterChange.Identities.0.Property0.Value=10
> >> AfterChange.Properties.Amount=1
> >> AfterChange.Properties.0.Name=IntroductionPuzzleCount
> >> AfterChange.Properties.0.Value=10
> >> AfterChange.Properties0.Amount=1
> >> AfterChange.Properties0.Property0.Name=IntroductionPuzzleCount
> >> AfterChange.Properties0.Property0.Value=10
> >> AfterChange.Property0.Name=IntroductionPuzzleCount
> >> AfterChange.Property0.Value=10
> >> BeforeChange.Type=Inexistent
> >> BeforeChange.Type0=Inexistent
> >> BeforeChange.Identities.0.Type=Inexistent
> 

> Why isn't there a BeforeChange for every AfterChange?

Because "Inexistent" means that the whole object did not exist, not that a 
member value did not exist. It means that there was no such identity / trust / 
score before (or after) the change.
So I chose one arbitrary field to indicate that the whole object didn't exist. 
I used the one which sounded the most synonymous with the whole object while 
also being a field which can never be null for an existing object. Identity 
objects always have a "Type", trusts and scores always have a "Value", so I 
chose those for indicating inexistence.
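
A client-side check for this convention could look roughly like the following 
sketch (Python, with the message as a plain dict; object_existed is a 
hypothetical helper, and the caller supplies the key prefix because only the 
identity layout "BeforeChange.Identities.0.*" is shown in this thread):

```python
INEXISTENT = "Inexistent"

# Which field carries the marker for each kind of object, as described above.
MARKER_FIELD = {"Identities": "Type", "Trusts": "Value", "Scores": "Value"}


def object_existed(fields, kind, prefix):
    """Return False if the object under `prefix` is flagged as inexistent.

    Raises KeyError if the marker field is missing, since an existing
    object is expected to always carry it.
    """
    marker = MARKER_FIELD[kind]
    return fields[f"{prefix}.{marker}"] != INEXISTENT
```

For example, a message containing "BeforeChange.Identities.0.Type=Inexistent" 
would make object_existed(fields, "Identities", "BeforeChange.Identities.0") 
return False.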

If you can come up with a field name which isn't used for any actual data and 
is good for indicating whether it exists or not, suggest it please. Before you 
think about it, read my reply to that please:

> It looks like "inexistent" is actually a word, but it's not one that I
> knew. I'd have expected "nonexistent."
> This may be a place where
> including the "before" value can lead to odd corner cases, and I wonder
> if it's worth including. Instead of a word, would it make sense to have
> an empty string?

Maybe I should just set "BeforeChange.Identities.Amount=0" to make inexistence 
very clear, eliminating the whole need for any attribute fields 
BeforeChange.Identities.0.* of the nonexistent identity?
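
Under that proposal (which is only a suggestion here, not the current format), 
a client could classify a notification from the two Amount fields alone. A 
minimal sketch in Python, with the message as a plain dict and classify_change 
as a hypothetical helper; it assumes "AfterChange.<kind>.Amount=0" would 
likewise signal a deleted object:

```python
def classify_change(fields, kind="Identities"):
    """Classify a change notification as created / deleted / changed."""
    before = int(fields[f"BeforeChange.{kind}.Amount"])
    after = int(fields[f"AfterChange.{kind}.Amount"])
    if before == 0 and after == 0:
        raise ValueError("object exists neither before nor after the change")
    if before == 0:
        return "created"
    if after == 0:
        return "deleted"
    return "changed"
```

No sentinel string such as "Inexistent" is needed then, and no 
BeforeChange.<kind>.0.* fields have to be sent for an object that did not 
exist.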

Thanks for your long review and for reading my very long reply :)
_______________________________________________
Devl mailing list
[email protected]
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
