[ https://issues.apache.org/jira/browse/NUTCH-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381358#comment-14381358 ]
Lewis John McGibbney commented on NUTCH-1959:
---------------------------------------------

[~gostep]'s [comment on NUTCH-1974|https://issues.apache.org/jira/browse/NUTCH-1974?focusedCommentId=14378396&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14378396] describes the state of affairs:

bq. I need the changes reported on NUTCH-1959, but not yet committed, so I think this patch can absorb NUTCH-1959.

> Improving CommonCrawlFormat implementations
> -------------------------------------------
>
>                 Key: NUTCH-1959
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1959
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.9
>            Reporter: Giuseppe Totaro
>            Assignee: Chris A. Mattmann
>            Priority: Minor
>             Fix For: 1.10
>
>         Attachments: NUTCH-1959.patch, NUTCH-1959.v02.patch
>
>
> {{CommonCrawlFormat}} is an interface for Java classes that implement methods
> for writing data in Common Crawl format. {{AbstractCommonCrawlFormat}} is
> an abstract class that implements {{CommonCrawlFormat}} and provides abstract
> methods for "CommonCrawl formatter" classes.
> The attached patch includes some improvements for
> {{CommonCrawlFormat}}-based classes:
> * {{CommonCrawlFormat}} and {{AbstractCommonCrawlFormat}} now provide only
> the {{getJsonData()}} method, which returns the JSON data.
> * {{AbstractCommonCrawlFormat}} also provides the abstract methods that each
> subclass has to implement in order to handle JSON objects.
> * {{CommonCrawlFormatSimple}} is a {{StringBuilder}}-based formatter that now
> also escapes JSON string values.
> This patch aims to provide a better interface for implementing/extending
> {{CommonCrawlFormat}} classes.
> I would really appreciate your feedback.
> Thanks a lot,
> Giuseppe

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
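
For readers who do not have the patch at hand, the design described in the quoted issue can be sketched roughly as follows. This is only an illustration, not the code in NUTCH-1959.patch: the classes are simplified stand-ins that reuse the names from the issue, and the hook methods ({{startObject}}, {{writeKeyValue}}, {{closeObject}}, {{generateJson}}), the constructor arguments and the {{escape}} helper are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-ins for the classes named in the issue; the real Nutch
// signatures may differ. Only getJsonData() is taken from the description.
interface CommonCrawlFormat {
    String getJsonData();
}

abstract class AbstractCommonCrawlFormat implements CommonCrawlFormat {
    // Hypothetical record fields; the real formatter carries more metadata.
    protected final Map<String, String> fields = new LinkedHashMap<>();

    AbstractCommonCrawlFormat(String url, String content) {
        fields.put("url", url);
        fields.put("content", content);
    }

    // Hooks each subclass implements to handle JSON objects (assumed names).
    protected abstract void startObject();
    protected abstract void writeKeyValue(String key, String value);
    protected abstract void closeObject();
    protected abstract String generateJson();

    @Override
    public String getJsonData() {
        startObject();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            writeKeyValue(e.getKey(), e.getValue());
        }
        closeObject();
        return generateJson();
    }
}

// StringBuilder-based formatter that escapes JSON string values.
class CommonCrawlFormatSimple extends AbstractCommonCrawlFormat {
    private final StringBuilder sb = new StringBuilder();
    private boolean first = true;

    CommonCrawlFormatSimple(String url, String content) {
        super(url, content);
    }

    @Override
    protected void startObject() {
        sb.setLength(0); // reset so getJsonData() can be called repeatedly
        first = true;
        sb.append('{');
    }

    @Override
    protected void writeKeyValue(String key, String value) {
        if (!first) {
            sb.append(',');
        }
        first = false;
        sb.append('"').append(escape(key)).append("\":\"")
          .append(escape(value)).append('"');
    }

    @Override
    protected void closeObject() {
        sb.append('}');
    }

    @Override
    protected String generateJson() {
        return sb.toString();
    }

    // Escapes quotes, backslashes and control characters in a JSON string.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\b': out.append("\\b"); break;
                case '\f': out.append("\\f"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }
}
```

In this sketch callers depend only on {{CommonCrawlFormat.getJsonData()}}, so a formatter backed by a JSON library could replace the {{StringBuilder}} one without touching calling code, which seems to be the kind of separation the issue is after.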