Hi,

A previous e-mail raised the question:

> how to deal with lexicon that is necessarily changing over time. For
> instance, indexing documents based on author. How could FASD deal with
> an author that has a first or last name not in the lexicon?

The standard technique for a term that is not in the lexicon is to
assign it the highest possible discrimination value.  Because the term
never appeared in the representative document collection, it is likely
a good differentiator.  An author's name would therefore appear close
to the top of a metadata key.
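
To make this concrete, here is a minimal Python sketch of the idea. It
is only an illustration under assumptions: it uses an IDF-style weight
as a stand-in for the discrimination value, and the names
build_lexicon, term_weight and metadata_key are hypothetical, not part
of FASD.

```python
import math
from collections import Counter


def build_lexicon(docs):
    """Map each term in the representative collection to a weight.

    Assumption: an IDF-style weight, log(N / document frequency), stands in
    for the discrimination value. The real measure may differ.
    """
    n_docs = len(docs)
    doc_freq = Counter(t for doc in docs for t in set(doc.lower().split()))
    return {term: math.log(n_docs / count) for term, count in doc_freq.items()}


def term_weight(term, lexicon, max_weight):
    """Return the lexicon weight, or max_weight for an out-of-lexicon term."""
    return lexicon.get(term.lower(), max_weight)


def metadata_key(text, lexicon, n_docs, top_k=3):
    """Rank the terms of `text` by weight, highest first (hypothetical helper)."""
    # The largest in-lexicon weight is log(n_docs) (a term seen in one
    # document), so this value is strictly above every known term.
    max_weight = math.log(n_docs) + 1.0
    terms = set(text.lower().split())
    ranked = sorted(terms, key=lambda t: (-term_weight(t, lexicon, max_weight), t))
    return ranked[:top_k]


docs = [
    "freenet search engine",
    "distributed search protocol",
    "freenet routing protocol",
]
lexicon = build_lexicon(docs)
# "kronfol" is not in the lexicon, so it is ranked ahead of the known terms.
print(metadata_key("search engine kronfol", lexicon, len(docs)))
```

The only design choice that matters here is that the fallback weight is
strictly larger than any weight the lexicon can produce, so an unseen
name always sorts ahead of known terms.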

Regards,

Amr


***************************************************
Amr Z. Kronfol                        
Home: 609-986-9907
Cell: 206-856-7471
Fax:  443-339-2518
 
11 Frist Campus Center #1128
Princeton University
Princeton NJ 08544
***************************************************

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> On Behalf Of [EMAIL PROTECTED]
> Sent: Thursday, April 18, 2002 3:01 PM
> To: [EMAIL PROTECTED]
> Subject: freenet-tech digest, Vol 1 #286 - 3 msgs
> 
> Today's Topics:
> 
>    1. Re: Fuzzy Search (Spike Gronim)
>    2. 60 Million Emails inc. 600,000 UK $30, 30 Euros, £20 . chngb
>       (ybie.BulkEmailCd)
>    3. Freenet-like networks for distribution of public-key data
>       (Nick Johnson)
> 
> --__--__--
> 
> Message: 1
> Date: Wed, 17 Apr 2002 21:05:23 -0400
> From: Spike Gronim <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: Re: [freenet-tech] Fuzzy Search
> Reply-To: [EMAIL PROTECTED]
> 
> On Fri, Apr 12, 2002 at 08:21:12AM -0400, Amr Kronfol wrote:
> > Hello all,
> >
> > I am an undergraduate computer science student at Princeton.  I just
> > completed the first draft of my thesis paper in which I describe FASD, a
> > fault-tolerant adaptive, scalable, distributed search engine inspired by
> > Freenet and designed with Freenet in mind.  I propose a protocol for
> > decentralized fuzzy search and then analyse the results from my
> > simulation.  The draft is somewhat long and a little rough on the edges
> > but I would love to hear some feedback.  It is available at
> >
> > http://www.cs.princeton.edu/~akronfol/kronfol_thesis_draft.ps
> >
> 
>       A practical question that comes to mind is how to deal with lexicon
> that is necessarily changing over time. For instance, indexing documents
> based on author. How could FASD deal with an author that has a first or
> last name not in the lexicon?
> 
> >
> > Warm regards,
> >
> > Amr Kronfol
> >
> > _______________________________________________
> > freenet-tech mailing list
> > [EMAIL PROTECTED]
> > http://lists.freenetproject.org/mailman/listinfo/tech
> 
> --
> 
> 
>       --Spike Gronim
>         [EMAIL PROTECTED]
> 
> 
> --__--__--
> 
> Message: 3
> From: "Nick Johnson" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Date: Thu, 18 Apr 2002 22:51:05 +1200
> Subject: [freenet-tech] Freenet-like networks for distribution of
> public-key data
> Reply-To: [EMAIL PROTECTED]
> 
> Pondering the whole public-key integrity & man in the middle attack
> problem, a thought occurred to me:-
> It should be possible to devise a peer-based system similar to freenet for
> the distribution of public-key data in an attack resistant manner.
> Essentially, such a system could use encrypted links between each peer in
> the system, where the shared-secret for each link is initially exchanged
> using public-key crypto. To prevent the attack being shifted to the
> replacement of the public-keys of the nodes, the public key of the first
> node any new node connects to could be transferred over a trusted channel,
> or the fingerprint of the key could be confirmed. Once the first few
> trustworthy keys are established, the link can then be used to transfer
> the public keys of other nodes in the network the client wishes to connect
> to, and the network can then be used to request any public-key that has
> been placed on the network (perhaps indexed by email address, so plugins
> could be made for major mail clients to securely retrieve public keys of
> any person you wish to send data to).
> Each client in the network can store, along with the key itself, data on
> the trustworthiness of that public key, based on what sources it was
> obtained from, and whenever a key is requested, the trustworthiness value
> depends on how many channels the key was received on, and the
> trustworthiness value of each. This system could be refined further to
> give an accurate idea of how trustworthy a given key is.
> Since the keys are delivered over multiple different links through the
> network, and the actual links are encrypted (with the links directly or
> indirectly verified over a secure channel such as a telephone conversation
> or physical meeting), replacing or corrupting a key would require that at
> least one node on every path from datastore to requester be malicious, a
> feat that, in any reasonably sized and well-connected network, should be
> next to impossible. Naturally, interfering would be easier, as any
> malicious node could return a key of its own, but this is certain to be
> detected since multiple different keys would be returned for a request.
> I realise this system will not give perfect trustworthiness, but I think
> it could be a massive improvement on systems such as http-requests to
> retrieve public keys.
> Questions, comments? Does anyone see this as a practical or desirable
> scheme? Does anyone see obvious flaws or reasons this would not work?
> 
> Thanks,
> 
> Nick Johnson
> 
> --Crossposted to sci.crypt and the freenet-tech mailing list--
> 
> 
> 
> 
> --__--__--
> 
> End of freenet-tech Digest
