Re: [expert] Mail formats revisited... and xml?

2003-06-09 Thread Stephlub
Thank you for your answer and explanations.
I think the problem is harder than it looked to me.

I suggested XML because it is universal, but anything easy to manipulate and
transform would be fine.
My problem is how to manage the hundreds and hundreds of emails I need to
archive and access quickly, and how to convert old Outlook archives to the
best format I can (fast and compact). Mbox or maildir plus tar/gzip could be
what I'm looking for ;-)
The KMail converter crashes when I try to convert these old archives.

I'll try grepmail (time for me to learn a bit of Perl).
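
A minimal sketch of the "mbox plus gzip" idea, assuming the old mail has
already been converted to mbox files. The function names (archive_mbox,
count_messages, search_archive) are made up for illustration; only gzip,
grep, mkdir and basename are real tools. Counting "From " lines is an
approximation that relies on the mbox writer escaping body lines that start
with "From ".

```shell
#!/bin/sh
# Sketch only: keep mbox files compressed and still search them.
# All function names below are hypothetical helpers, not existing tools.

# archive_mbox SRC DEST_DIR: store a compressed copy of an mbox file.
archive_mbox() {
    src=$1
    dest_dir=$2
    mkdir -p "$dest_dir" &&
    gzip -c "$src" > "$dest_dir/$(basename "$src").gz"
}

# count_messages ARCHIVE: each mbox message starts with a "From " line.
count_messages() {
    gzip -dc "$1" | grep -c '^From '
}

# search_archive PATTERN ARCHIVE: print matching lines (case-insensitive)
# by decompressing to a pipe, without unpacking the archive to disk.
search_archive() {
    pattern=$1
    archive=$2
    gzip -dc "$archive" | grep -i -e "$pattern"
}
```

This only prints matching lines; to get whole messages back, the decompressed
stream would still have to be handed to a mail-aware tool.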



Want to buy your Pack or Services from MandrakeSoft? 
Go to http://www.mandrakestore.com


Re: [expert] Mail formats revisited... and xml?

2003-06-09 Thread Jack Coates
Not likely -- XML would not be a good fit. XML is a great solution for
getting structured data out of one system into another system, because
it allows you to define the metadata attributes right there with the
raw data. XML is a lousy solution for storing unstructured data that you
don't intend to send to a foreign system. Email has some loose structure
in the header, but the part you probably care about, body and
attachments, has practically no structure at all and can only be
indexed/searched with brute force. Since it'll be brute forced anyway,
why add the bulk of XML?
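
To make the "bulk" point concrete, here is a rough sketch that wraps a single
message file in XML. The element names (message, header, body) and the
function names are invented for this example and do not come from any real
converter. It assumes one message per file, with no folded (multi-line)
headers.

```shell
#!/bin/sh
# Sketch only: wrap one mail message in a made-up XML layout.

# xml_escape: escape the characters that are special in XML text.
xml_escape() {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# message_to_xml FILE: headers become <header> elements, the body stays
# one opaque blob of text. Hypothetical element names throughout.
message_to_xml() {
    echo '<message>'
    # Header lines run up to the first blank line.
    sed -n '1,/^$/p' "$1" | grep -v '^$' | while IFS= read -r line; do
        name=${line%%:*}      # text before the first colon
        value=${line#*: }     # text after the first ": "
        printf '  <header name="%s">%s</header>\n' \
            "$name" "$(printf '%s' "$value" | xml_escape)"
    done
    echo '  <body>'
    # Everything after the first blank line is the body.
    sed '1,/^$/d' "$1" | xml_escape
    echo '  </body>'
    echo '</message>'
}
```

The headers gain some structure, but the body is still just text inside one
element, so a search still has to read all of it, and the file is larger
than the original.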

Using a SQL database might make a little more sense, until you start to
think about how to build the tables and realize that this is putting raw
data into a system designed to hold metadata. Now building a SQL
database that indexes the metadata and spits out pointers to the raw
data would make more sense, if you can think of a way to extract useful
metadata from body and attachments without just throwing the whole
damned mess into the database.
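
The "index the metadata, point at the raw data" idea can be sketched without
a database at all. In this example a flat tab-separated listing stands in for
the SQL table, and the "pointer" is simply the line number where a message
starts in the mbox file. The function names build_index and fetch_message are
hypothetical, and the sketch assumes header names are spelled exactly "From:"
and "Subject:" and that body lines starting with "From " have been escaped.

```shell
#!/bin/sh
# Sketch only: a flat-file metadata index over an mbox file.

# build_index MBOX: print one row per message:
#   start_line <TAB> From: value <TAB> Subject: value
build_index() {
    awk '
        function emit() {
            if (start) printf "%d\t%s\t%s\n", start, from, subject
        }
        # A "From " separator line starts a new message.
        /^From / { emit(); start = NR; from = ""; subject = ""; in_header = 1; next }
        # The first blank line ends the header block.
        in_header && /^$/ { in_header = 0; next }
        in_header && /^From: / { from = substr($0, 7) }
        in_header && /^Subject: / { subject = substr($0, 10) }
        END { emit() }
    ' "$1"
}

# fetch_message MBOX START_LINE: print the raw message that begins at
# START_LINE, stopping before the next "From " separator.
fetch_message() {
    awk -v start="$2" '
        NR < start { next }
        NR > start && /^From / { exit }
        { print }
    ' "$1"
}
```

A search then runs against the small index (for example with grep and cut),
and only the matching messages are pulled from the mbox by line number. The
same rows could be loaded into a real SQL table later if needed.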

There are two solutions that make sense to me:

1) Leave everything in plain text mbox or maildir on a hard disk. When
you want to find something, use Unix tools. For instance,
#!/bin/sh
# This is a wrapper to the grepmail Perl script which searches mail.
# The wrapper takes one or more regexps from the command line, recurses
# through a mail folder, then puts the results for each regexp into a
# new box: "results.<regexp>".
# If no parameters, show proper usage and fail.
if [ $# -lt 1 ] ; then
    echo "Usage: grepmymail \"singleterm\"" >&2
    echo "Usage: grepmymail \"(1term|2terms|3terms)\"" >&2
    exit 2
fi
# Options: -R is recursive, -m adds a header line showing the mailbox
# the message was found in, -M skips MIME attachments, and -b searches
# bodies, not headers.
# The loop variable is not named TERM, because that would overwrite the
# terminal type. Quoting keeps each regexp intact as a single argument.
for pattern in "$@"; do
    # Write outside the mail folder first, so grepmail does not end up
    # searching its own output, then move the finished box into place.
    grepmail -RmMb "$pattern" "$HOME/mail" > "/tmp/results.$pattern"
    mv "/tmp/results.$pattern" "$HOME/mail/"
done

2) Use Evolution and create vfolders when you want to look for
something. Note that these are not mutually exclusive, as Evolution keeps
everything in plain text formats anyway.

Jack

-- 
Jack Coates
Monkeynoodle: A Scientific Venture...
http://www.monkeynoodle.org/resume.html




Re: [expert] Mail formats revisited... and xml?

2003-06-09 Thread Stephlub
Can anybody help me?




[expert] Mail formats revisited... and xml?

2003-06-08 Thread Stephlub
...and about an xml converter
mbox2xml maildir2xml anyproprietary2xml
(mailxml2mbox mailxml2maildir mailxml2sql...)

1 I think it would be very useful to convert and archive mail as XML for
cataloging reasons, from KMail
2 less important: make KMail (and others) able to read this XML

This could be very interesting for sorting, cataloging and, first of all,
archiving, and for getting the best access to archives.
I can't figure out how to archive my emails and access them like a database.
Even a small search seems impossible with KMail to me.
I used Outlook, and even if it's hard to manage archives there, I could do a
recursive search through mail two years old... beyond that, true database
management would be great!

This feature may already exist. Just tell me.

