Re: Best Practice: emails and file-attachments

2006-08-16 Thread John Haxby
Oh rats. Thunderbird ate the indenting. The two examples should be: multipart/alternative text/plain multipart/related text/html image/gif image/gif application/msword and multipart/related text/html image/

Re: Best Practice: emails and file-attachments

2006-08-16 Thread John Haxby
lude wrote: You also mentioned indexing each bodypart ("attachment") separately. Why? To my mind, there is no use case where it makes sense to search a particular bodypart I will give you the use case: [snip] 3.) The result list would show this: 1. mail-1 'subject' 'Abstract of the messa

Re: Best Practice: emails and file-attachments

2006-08-16 Thread lude
Hi Johan, thanks again for the many words and explanations! You also mentioned indexing each bodypart ("attachment") separately. Why? To my mind, there is no use case where it makes sense to search a particular bodypart I will give you the use case: 1.) User searches for "abcd" 2.) Luc

Re: Best Practice: emails and file-attachments

2006-08-16 Thread John Haxby
lude wrote: Hi John, thanks for the detailed answer. You wrote: If you're indexing a multipart/alternative bodypart then index all the MIME headers, but only index the content of the *first* bodypart. Does this mean you index just the first file-attachment? What do you advise, if you have to
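
The rule quoted above (index every bodypart's MIME headers, but for multipart/alternative index only the content of the first bodypart) could be sketched roughly as follows. This is not code from the thread and it does not touch Lucene; it uses only Python's standard `email` package, and the function name `collect` and the tuple layout are assumptions made for illustration.

```python
from email.message import EmailMessage


def collect(part, index_content=True, out=None):
    """Walk a MIME tree and return a list of (content_type, headers, text).

    `text` is None when the content is skipped: for containers, for
    non-text leaves, and for every multipart/alternative child after
    the first. Headers are always collected.
    """
    if out is None:
        out = []
    headers = dict(part.items())  # hypothetical layout: plain dict of headers
    ctype = part.get_content_type()
    if part.is_multipart():
        out.append((ctype, headers, None))
        for i, child in enumerate(part.get_payload()):
            # Only the first alternative contributes content; the others
            # are treated as the same text in a different rendering.
            skip = ctype == "multipart/alternative" and i > 0
            collect(child, index_content and not skip, out)
    else:
        text = None
        if index_content and part.get_content_maintype() == "text":
            text = part.get_content()
        out.append((ctype, headers, text))
    return out
```

For a message built with `set_content("plain body")` followed by `add_alternative("<p>html body</p>", subtype="html")`, the text/plain part keeps its text while the text/html part is returned with headers only.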

Re: Best Practice: emails and file-attachments

2006-08-16 Thread lude
essage- From: lude [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 15, 2006 10:29 AM To: java-user@lucene.apache.org Subject: Best Practice: emails and file-attachments Hello, does anybody have an idea what is the best design approach for realizing the following: The goal is to index

Re: Best Practice: emails and file-attachments

2006-08-16 Thread lude
Hi John, thanks for the detailed answer. You wrote: If you're indexing a multipart/alternative bodypart then index all the MIME headers, but only index the content of the *first* bodypart. Does this mean you index just the first file-attachment? What do you advise, if you have to index multip

RE: Best Practice: emails and file-attachments

2006-08-15 Thread Dejan Nenov
l. Dejan -Original Message- From: lude [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 15, 2006 10:29 AM To: java-user@lucene.apache.org Subject: Best Practice: emails and file-attachments Hello, does anybody have an idea what is the best design approach for realizing the following: The goal i

Re: Best Practice: emails and file-attachments

2006-08-15 Thread John Haxby
lude wrote: does anybody have an idea what is the best design approach for realizing the following: The goal is to index emails and their corresponding file attachments. One email could contain for example: I put a fair amount of thought into this when I was doing the design for our mail server -

Best Practice: emails and file-attachments

2006-08-15 Thread lude
Hello, does anybody have an idea what is the best design approach for realizing the following: The goal is to index emails and their corresponding file attachments. One email could contain for example: 1 x subject 1 x sender-address 1 x to-addresses 1 x message-text 0..n x file-attachments (each
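
As a rough illustration of the layout listed in the question (one subject, one sender, one recipient list, one message text, and zero or more attachments), a parsed message could be flattened into a record before it is handed to an indexer. This is only a sketch using Python's standard `email` package, not Lucene code; the function name `email_to_record` and the dictionary keys are assumptions, not anything proposed in the thread.

```python
from email.message import EmailMessage


def email_to_record(msg):
    """Flatten one parsed message into the fields named in the question."""
    # Prefer the plain-text body; get_body returns None if there is none.
    body = msg.get_body(preferencelist=("plain",))
    return {
        "subject": msg.get("Subject", ""),
        "sender": msg.get("From", ""),
        "to": msg.get("To", ""),
        "text": body.get_content() if body is not None else "",
        # 0..n attachments; content is bytes for binary types, str for text/*.
        "attachments": [
            {
                "filename": att.get_filename(),
                "content_type": att.get_content_type(),
                "data": att.get_content(),
            }
            for att in msg.iter_attachments()
        ],
    }
```

Whether each attachment then becomes its own indexed document or a field of the mail's document is the design question the thread goes on to discuss; the record above leaves that choice open.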