There are two considerations: how many typos are likely, and how the local filtering is done.
If the local result filtering is not relaxed about typos of the sort "Woh", then it would make no sense at all to sort the consonants, since non-matching results would get filtered out anyway. If the local filtering can handle those typos, it is still a question of cost vs. gain, and that decision will be left to your gut. One should consider, though, that most typos will probably happen during search input rather than when inserting a file. And I must say, if a program is smart enough to handle my search typos, I am likely to be very pleased.

You have a much better idea about the impact on the net, so I can't really say anything about that.

regards leo

ps: I am wondering if you have an opinion about the matters that I am trying to discuss in the forum.

On Wed, Jun 24, 2009 at 8:15 PM, Christian Grothoff <[email protected]> wrote:

> I like this idea (at least as an option that should likely be the default)
> and have added it to the list of things to change for 0.9.x. What I wonder
> is whether sorting the consonants should be omitted or not. Some statistics
> on bad collisions with and without sorting would probably be nice to
> have...
>
> Christian
>
> On Tuesday 23 June 2009 07:27:17 leo stone wrote:
> > I believe the biggest factor in how we judge a system for future
> > usability is how many results we get if we are looking for "something"
> > like "something".
> >
> > Imagine a shoe shop with only two pairs of shoes in it, and one with a
> > few hundred.
> >
> > The result in the end might be the same, you leave both shops not
> > finding what you want, but most people will consider the shop with a
> > hundred pairs more promising and worth spending time in next time they
> > try to find some shoes.
> >
> > So making sure people are getting results from their searches is
> > probably one of the more important issues, after my doubts about how
> > the routing is handled.
> > Even though it might mean some significant overhead, I would consider
> > doing something like normalizing keywords. If it must be, per language,
> > but in the beginning English should be enough.
> >
> > So if I wanted to share the following file, and I would like it public
> > so people can find it, why not store it like this:
> >
> > "Woh_the.fuck_is ALICe(2008).divx.avi.WMV" => { HW, HT, CFK, S, CL,
> > 2008, DVX, V, MVW }
> >
> > Put the file under the hashes of those nine "keywords".
> >
> > When I now search for "fuck alice" => { CFK, CL }
> >
> > a search for h(CFK) AND h(CL) will return a lot of wrong but similar
> > results, which one can then filter locally in a more elaborate way.
> >
> > It might even be more selective than a search for h(video/x-msvideo).
> >
> > At least it returns results, whereas "Woh_the.fuck_is
> > ALICe(2008).divx.avi.WMV" as a keyword is very unlikely to be anything
> > anyone would think to search for, and would therefore never be found
> > and never be spread ....., except by chance of course.
> >
> > regards leo
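The normalization rule quoted above can be sketched as follows. This is a minimal illustration, not GNUnet code (GNUnet itself would do this in C); it assumes tokenization on non-alphanumeric characters and that purely numeric tokens such as years are kept verbatim. The sorting of the consonants is the step Christian suggests measuring for bad collisions.

```python
import re

VOWELS = set("AEIOU")

def normalize(name):
    """Turn a filename into a set of search keys, per the proposal:
    split on non-alphanumeric characters, uppercase, drop vowels,
    and sort the remaining consonants.  Numeric tokens are kept
    as-is (an assumption; the original mail only shows "2008")."""
    keys = set()
    for token in re.split(r"[^A-Za-z0-9]+", name):
        if not token:
            continue
        if token.isdigit():
            keys.add(token)
            continue
        consonants = sorted(c for c in token.upper() if c not in VOWELS)
        if consonants:
            keys.add("".join(consonants))
    return keys

print(normalize("Woh_the.fuck_is ALICe(2008).divx.avi.WMV"))
# the nine keys from the mail: HW, HT, CFK, S, CL, 2008, DVX, V, MVW
print(normalize("fuck alice"))
# {'CFK', 'CL'}
```

Note how the sorting step is what lets the typo "Woh" (→ HW) collide with the intended "who" (→ HW) and hence still match; the flip side, as Christian points out, is that unrelated words like "how" collapse onto the same key too.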
_______________________________________________
GNUnet-developers mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/gnunet-developers
