[ https://issues.apache.org/jira/browse/NUTCH-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700010#comment-13700010 ]
Markus Jelsma commented on NUTCH-1602:
--------------------------------------

I think newlines would mess up the CSV output of the reader tool.

> improve the readability of metadata in readdb dump normal
> ----------------------------------------------------------
>
>                 Key: NUTCH-1602
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1602
>             Project: Nutch
>          Issue Type: Improvement
>          Components: crawldb
>    Affects Versions: 1.7
>            Reporter: lufeng
>            Assignee: lufeng
>            Priority: Minor
>             Fix For: 1.8
>
>         Attachments: NUTCH-1602.patch
>
>
> the dumped metadata format is not readable.
> {code:xml}
> $bin/nutch readdb crawldb/ -dump dir
> http://www.baidu.com/ Version: 7
> Status: 3 (db_gone)
> Fetch time: Sat Aug 17 22:35:37 CST 2013
> Modified time: Thu Jan 01 08:00:00 CST 1970
> Retries since fetch: 0
> Retry interval: 3888000 seconds (45 days)
> Score: 1.0
> Signature: null
> Metadata: m1: v22m3: v3m2: v2m5: v5m4: m4_pst_: robots_denied(18),
> lastModified=0m6: v6
> {code}
> so I improve the Metadata format to this
> {code:xml}
> Metadata: m1=v22;m3=v3;m2=v2;m5=v5;m4=m4;_pst_=robots_denied(18),
> lastModified=0;m6=v6;
> {code}
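
For illustration only, the single-line `key=value;` format proposed in the issue could be produced roughly as in the Java sketch below. It is not the code from NUTCH-1602.patch: the class name `MetadataDumpFormat` and the method `format` are hypothetical, and the sketch assumes the metadata is available as a plain `java.util.Map<String, String>` rather than whatever type the CrawlDb reader actually uses. The helper itself adds no newlines, so the pairs stay on one line unless a key or value already contains a line break.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Nutch.
final class MetadataDumpFormat {

    // Joins every entry as "key=value;" in the map's iteration order.
    // The helper inserts no newline characters of its own.
    static String format(Map<String, String> metadata) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            sb.append(entry.getKey())
              .append('=')
              .append(entry.getValue())
              .append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output is predictable.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("m1", "v22");
        metadata.put("_pst_", "robots_denied(18), lastModified=0");
        metadata.put("m6", "v6");

        // Prints: Metadata: m1=v22;_pst_=robots_denied(18), lastModified=0;m6=v6;
        System.out.println("Metadata: " + format(metadata));
    }
}
```

Note that a value such as `robots_denied(18), lastModified=0` contains both a comma and an equals sign, so a consumer that splits the line on `,` or `=` would still need to treat the value with care.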