cache.jsp does not recognize encoding conversion from content different to UTF-8
--------------------------------------------------------------------------------

                 Key: NUTCH-946
                 URL: https://issues.apache.org/jira/browse/NUTCH-946
             Project: Nutch
          Issue Type: Bug
          Components: web gui
    Affects Versions: 1.2
         Environment: Server version: Apache Tomcat/6.0.29
Server built:   July 19 2010 1458
Server number:  6.0.0.29
OS Name:        Linux
OS Version:     2.6.18-128.7.1.el5
Architecture:   i386
JVM Version:    1.6.0_22-b04
JVM Vendor:     Sun Microsystems Inc.
            Reporter: Enrique Berlanga
            Priority: Minor


The cache view does not apply the encoding conversion needed to properly display
page content stored in a segment.

The problem is that it looks for the "CharEncodingForConversion" entry in the
content metadata, but that value is stored in the parse metadata.
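
To make the mismatch concrete, here is a minimal standalone sketch. It uses plain
java.util.Map objects as stand-ins for Nutch's Metadata class; the class name
EncodingLookup, the helper findEncoding, and the sample values are hypothetical
and are not part of Nutch or of the patch. The fallback to content metadata is
an extra assumption of this sketch; the patch itself reads parse metadata only.

```java
import java.util.HashMap;
import java.util.Map;

public class EncodingLookup {
    // Metadata key named in this report.
    static final String KEY = "CharEncodingForConversion";

    // Hypothetical helper: read the key from parse metadata first and fall
    // back to content metadata (the fallback is an assumption of this sketch).
    static String findEncoding(Map<String, String> contentMeta,
                               Map<String, String> parseMeta) {
        String encoding = parseMeta.get(KEY);
        return encoding != null ? encoding : contentMeta.get(KEY);
    }

    public static void main(String[] args) {
        // Stand-in for content metadata: holds the content type only.
        Map<String, String> contentMeta = new HashMap<String, String>();
        contentMeta.put("Content-Type", "text/html");

        // Stand-in for parse metadata: this is where the encoding lives.
        Map<String, String> parseMeta = new HashMap<String, String>();
        parseMeta.put(KEY, "ISO-8859-1");

        // Looking only in content metadata misses the value.
        System.out.println("content metadata only: " + contentMeta.get(KEY));
        // Looking in parse metadata finds it.
        System.out.println("parse metadata first:  "
                + findEncoding(contentMeta, parseMeta));
    }
}
```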

Here is the patch I've generated for the fixed version:

### Eclipse Workspace Patch 1.0
#P branch-1.2
Index: src/web/jsp/cached.jsp
===================================================================
--- src/web/jsp/cached.jsp      (revision 1027060)
+++ src/web/jsp/cached.jsp      (working copy)
@@ -39,17 +39,18 @@
     ResourceBundle.getBundle("org.nutch.jsp.cached", request.getLocale())
     .getLocale().getLanguage();
 
-  Metadata metaData = bean.getParseData(details).getContentMeta();
+  Metadata contentMetaData = bean.getParseData(details).getContentMeta();
+  Metadata parseMetaData = bean.getParseData(details).getParseMeta();
 
   String content = null;
-  String contentType = (String) metaData.get(Metadata.CONTENT_TYPE);
+  String contentType = (String) contentMetaData.get(Metadata.CONTENT_TYPE);
   if (contentType.startsWith("text/html")) {
     // FIXME : it's better to emit the original 'byte' sequence 
     // with 'charset' set to the value of 'CharEncoding',
     // but I don't know how to emit 'byte sequence' in JSP.
     // out.getOutputStream().write(bean.getContent(details)) may work, 
     // but I'm not sure.
-    String encoding = (String) metaData.get("CharEncodingForConversion"); 
+    String encoding = (String) parseMetaData.get("CharEncodingForConversion"); 
     if (encoding != null) {
       try {
         content = new String(bean.getContent(details), encoding);
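
The conversion step at the end of the excerpt above can be sketched in isolation
as follows. This is an illustration only, not the code in cached.jsp: the class
CachedContentDecoder and the method decode are hypothetical names, and the UTF-8
fallback for a missing or unsupported encoding is an assumption of this sketch.

```java
import java.nio.charset.Charset;

public class CachedContentDecoder {
    // Assumed default when no usable encoding is recorded.
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Hypothetical helper: decode raw page bytes with the recorded encoding,
    // falling back to UTF-8 if the name is null, illegal, or unsupported.
    static String decode(byte[] raw, String encoding) {
        if (encoding != null) {
            try {
                if (Charset.isSupported(encoding)) {
                    return new String(raw, Charset.forName(encoding));
                }
            } catch (IllegalArgumentException e) {
                // Illegal charset name: fall through to the default below.
            }
        }
        return new String(raw, UTF8);
    }

    public static void main(String[] args) {
        // "café" encoded as ISO-8859-1: the final byte 0xE9 is not valid UTF-8.
        byte[] raw = "caf\u00e9".getBytes(Charset.forName("ISO-8859-1"));

        // With the recorded encoding, the text round-trips.
        System.out.println("caf\u00e9".equals(decode(raw, "ISO-8859-1")));
        // Without it, the UTF-8 default garbles the last character.
        System.out.println("caf\u00e9".equals(decode(raw, null)));
    }
}
```

When the encoding is not found (the situation described in this report), the
bytes are decoded with the wrong charset and non-ASCII characters are garbled.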

