[
https://issues.apache.org/jira/browse/TIKA-295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765410#action_12765410
]
Alex Baranov edited comment on TIKA-295 at 10/13/09 11:12 PM:
--------------------------------------------------------------
I guess that since Tika is a subproject of Lucene, you should use the same
format as the other Lucene projects:
http://wiki.apache.org/lucene-java/HowToContribute
http://wiki.apache.org/solr/HowToContribute
(at the end of the pages).
[Edited: well, it turned out that they use a different coding style on the Tika
project. At least the indent is 4 spaces instead of 2...]
One question about the parser: are you still working on it? Any progress since
the first draft?
was (Author: alexb):
I guess that since Tika is a subproject of Lucene, you should use the same
format as the other Lucene projects:
http://wiki.apache.org/lucene-java/HowToContribute
http://wiki.apache.org/solr/HowToContribute
(at the end of the pages).
One question about the parser: are you still working on it? Any progress since
the first draft?
> Rough cut of mbox parser
> ------------------------
>
> Key: TIKA-295
> URL: https://issues.apache.org/jira/browse/TIKA-295
> Project: Tika
> Issue Type: New Feature
> Affects Versions: 0.4
> Reporter: Ken Krugler
> Assignee: Jukka Zitting
> Fix For: 0.5
>
> Attachments: tika-295.patch
>
>
> Attached is a patch for a first cut of a parser that handles mailbox (.mbox,
> application/mbox) files.
> * The first email's headers are used to fill in metadata. Headers of
> subsequent emails are discarded.
> * Charset handling needs to be fixed up. It's unclear (not spec'd) whether
> each email uses the charset specified in its own headers, or whether the
> entire file should be re-encoded (with the encoding sent in the response
> header or auto-detected).
> * Multi-part emails won't be handled properly, though it's unclear what
> should be done in that case (if anything).
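
To make the first bullet of the issue description concrete, here is a minimal
Java sketch of "use the first email's headers, discard the rest". It is not the
code in tika-295.patch and does not use the Tika API: the class name
MboxFirstHeaders and the method firstMessageHeaders are hypothetical, and the
sketch assumes that every message starts with a "From " separator line and that
its header block ends at the first blank line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not the parser from the attached patch. */
final class MboxFirstHeaders {

    /**
     * Returns the headers of the first message in an mbox stream.
     * Reading stops at the end of the first header block, so the
     * headers of all later messages are never looked at.
     * A repeated header name keeps only its last value.
     */
    static Map<String, String> firstMessageHeaders(BufferedReader reader)
            throws IOException {
        Map<String, String> headers = new LinkedHashMap<>();
        boolean inFirstMessage = false;
        String lastName = null;
        String line;
        while ((line = reader.readLine()) != null) {
            if (!inFirstMessage) {
                // Assumption: a message begins with a "From " separator line.
                if (line.startsWith("From ")) {
                    inFirstMessage = true;
                }
                continue;
            }
            if (line.isEmpty()) {
                break; // a blank line ends the header block
            }
            boolean folded = line.startsWith(" ") || line.startsWith("\t");
            if (folded && lastName != null) {
                // Folded header: append the continuation to the previous value.
                headers.put(lastName, headers.get(lastName) + " " + line.trim());
                continue;
            }
            int colon = line.indexOf(':');
            if (colon > 0) {
                lastName = line.substring(0, colon).trim();
                headers.put(lastName, line.substring(colon + 1).trim());
            }
        }
        return headers;
    }

    public static void main(String[] args) throws IOException {
        String mbox = "From alice@example.com Tue Oct 13 23:12:00 2009\n"
                + "Subject: first\n"
                + "Content-Type: text/plain;\n"
                + "\tcharset=ISO-8859-1\n"
                + "\n"
                + "body one\n"
                + "\n"
                + "From bob@example.com Wed Oct 14 08:00:00 2009\n"
                + "Subject: second\n"
                + "\n"
                + "body two\n";
        Map<String, String> headers =
                firstMessageHeaders(new BufferedReader(new StringReader(mbox)));
        System.out.println(headers);
    }
}
```

The Content-Type value collected this way is also where a per-message charset
would be found, which is the open question in the second bullet.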