[ http://issues.apache.org/jira/browse/NUTCH-140?page=all ]
     
Jerome Charron closed NUTCH-140:
--------------------------------

    Fix Version: 0.8-dev
     Resolution: Fixed

I have committed the patch provided by Chris with some modifications:
(http://svn.apache.org/viewcvs.cgi?rev=379403&view=rev)

* Some minor code reformatting
* An extension id can be used directly in the parse-plugins.xml file without
any alias definition (this will help in a transitional phase when we get an
admin GUI)
* The API provides the ability to retrieve a parser from its extension id or
its alias (getParserByExtensionId)
* The deprecated methods have been removed
* parse-mp3 and parse-rtf now make use of the new APIs
Thanks Chris


> Add alias capability in parse-plugins.xml file that allows 
> mimeType->extensionId mapping
> ----------------------------------------------------------------------------------------
>
>          Key: NUTCH-140
>          URL: http://issues.apache.org/jira/browse/NUTCH-140
>      Project: Nutch
>         Type: Improvement
>   Components: fetcher
>  Environment:  Power Mac OS X 10.4, Dual Processor G5 2.0 GHz, 1.5 GB RAM, 
> although the issue is independent of environment
>     Reporter: Chris A. Mattmann
>     Assignee: Chris A. Mattmann
>     Priority: Minor
>      Fix For: 0.8-dev
>  Attachments: NUTCH-140.20051502.patch.txt
>
>  Jerome and I have been talking about an idea to address the current issue 
> raised by Stefan G. about having a mapping of mimeType->list of pluginIds 
> rather than mimeType->list of extensionIds in the parse-plugins.xml file. 
> We've come up with the following proposed update that would seemingly fix 
> this problem.
>   We propose to have the concept of "aliases" in the parse-plugins.xml file, 
> defined at the end of the file, something like:
>  <parse-plugins>
>    ....
>    <mimeType name="text/html">
>      <plugin id="parse-html"/>
>    </mimeType>
>    ....
>
>    <aliases>
>      <alias name="parse-html"
>             extension-point="org.apache.nutch.parse.html.HtmlParser"/>
>      ....
>      <alias name="parse-html2" extension-point="my.other.html.Parser"/>
>      ....
>    </aliases>
>  </parse-plugins>
> What do you guys think? This approach would be flexible enough to allow the 
> mapping of extensionIds to mimeTypes, but without impacting the current 
> "pluginId" concept.
> Comments welcome. 

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira



_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
