[ https://issues.apache.org/jira/browse/CONNECTORS-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092695#comment-14092695 ]
Karl Wright commented on CONNECTORS-1006:
-----------------------------------------

r1617258

> Google native documents are not crawled
> ---------------------------------------
>
>                 Key: CONNECTORS-1006
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1006
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: GoogleDrive connector
>    Affects Versions: ManifoldCF 1.4.1
>            Reporter: Shigeki Kobayashi
>
> I use MCF 1.4.1 and try to crawl Google native documents such as spreadsheets,
> then index them to Solr.
> It seems that MCF does not extract the contents. Maybe MCF does not export
> the spreadsheet to PDF.
> The Simple History reports the result of the crawl as "NO LENGTH".
>
> The documents are saved as Google Spreadsheets in Google Docs, which are also
> managed in Google Drive.
> As the MCF documentation says "native Google documents such as spreadsheets and
> word documents are exported to PDF and then ingested", those Google
> Spreadsheets should be crawled and indexed.

--
This message was sent by Atlassian JIRA
(v6.2#6252)