That information isn't being recorded in manifoldcf.log unfortunately -- I included all that was there. And there are no exceptions in elasticsearch.log either...
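One way to surface the HTTP status code that the log leaves out might be to post a small test document to ElasticSearch by hand and print whatever comes back. The following is only a minimal sketch under stated assumptions: `StatusProbe`, its methods, and the `mcf-test/doc` index/type path are made up for illustration, and only `http://127.0.0.1:9200/` comes from this thread. The request the connector actually sends may differ, and its own `checkResultCode()` may not use the same success rule.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper; not part of ManifoldCF.
final class StatusProbe {

    // Assumption: any 2xx code counts as success. The connector's own
    // checkResultCode() may apply a different rule.
    public static boolean isSuccess(int status) {
        return status >= 200 && status < 300;
    }

    // Builds the kind of description the bare exception in the log lacks.
    public static String describe(int status, String body) {
        String text = (body == null || body.isEmpty()) ? "<empty body>" : body;
        return "HTTP " + status + ": " + text;
    }

    // Reads a stream fully as UTF-8; returns "" when the stream is null.
    public static String readAll(InputStream in) throws IOException {
        if (in == null) {
            return "";
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // POSTs a JSON document and returns the status code plus response body.
    public static String postAndDescribe(String url, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        try (OutputStream os = conn.getOutputStream()) {
            os.write(json.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        // Error responses are exposed on the error stream, not the input stream.
        InputStream in = isSuccess(status) ? conn.getInputStream() : conn.getErrorStream();
        try {
            return describe(status, readAll(in));
        } finally {
            if (in != null) {
                in.close();
            }
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical index and type names; substitute the ones configured
        // in the output connection.
        String url = args.length > 0 ? args[0] : "http://127.0.0.1:9200/mcf-test/doc";
        System.out.println(postAndDescribe(url, "{\"title\":\"probe\"}"));
    }
}
```

Running it against the configured server URL should print the status code together with any error body ElasticSearch returns, which is the detail missing from the stack trace.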
I'll try running wireshark to see if I can follow the TCP stream.

On 30 January 2013 14:16, Karl Wright <daddy...@gmail.com> wrote:
> Ok, ElasticSearch is not happy about something when the document is
> being posted. The connector is seeing a non-200 HTTP response, and
> throwing an exception as a result:
>
>     if (!checkResultCode(method.getStatusCode()))
>       throw new ManifoldCFException(getResultDescription());
>
> Presumably the exception message in the log tells us what that HTTP
> code is, but you did not include that key info.
>
> Karl
>
> On Wed, Jan 30, 2013 at 9:06 AM, Andrew Clegg <andrew.cl...@gmail.com> wrote:
>> Thanks for all your help Karl!
>>
>> It's 1.0.1 from the binary distro.
>>
>> And yes, it says "Connection working" when I view it.
>>
>> On 30 January 2013 14:03, Karl Wright <daddy...@gmail.com> wrote:
>>> Ok, so let's back up a bit.
>>>
>>> First, which version of ManifoldCF is this? I need to know that
>>> before I can interpret the stack trace.
>>>
>>> Second, what do you see when you view the connection in the crawler
>>> UI? Does it say "Connection working", or something else, and if so,
>>> what?
>>>
>>> I've created a ticket for better error reporting in this connector -
>>> it was a contribution and AFAIK the error handling is not very robust
>>> at this point, but I can fix that quickly with your help. ;-)
>>>
>>> Karl
>>>
>>> On Wed, Jan 30, 2013 at 8:55 AM, Andrew Clegg <andrew.cl...@gmail.com>
>>> wrote:
>>>> On 30 January 2013 13:33, Karl Wright <daddy...@gmail.com> wrote:
>>>>
>>>>> So you saw events in the history which correspond to these documents
>>>>> and which are of type "Indexation" that say "success"? If that is the
>>>>> case, then the ElasticSearch connector thinks it handed the documents
>>>>> successfully to the ElasticSearch server.
>>>>
>>>> Ah, no, the activity is fetch rather than indexation. e.g.
>>>>
>>>> 01-30-2013 13:08:16.217 fetch 09026205800698a9 Success 549541 361
>>>>
>>>> I don't see any history entries relating to indexing as a specific
>>>> activity in its own right. Sorry, that was probably a red herring, I
>>>> don't think it's getting that far.
>>>>
>>>> I just noticed that above all the "service interruption reported"
>>>> warnings are some errors like this:
>>>>
>>>> ERROR 2013-01-30 13:44:15,356 (Worker thread '45') - Exception tossed:
>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException:
>>>>         at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:97)
>>>>         at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchIndex.<init>(ElasticSearchIndex.java:138)
>>>>         at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.addOrReplaceDocument(ElasticSearchConnector.java:322)
>>>>         at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1579)
>>>>         at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:504)
>>>>         at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:370)
>>>>         at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1652)
>>>>         at org.apache.manifoldcf.crawler.connectors.DCTM.DCTM.processDocuments(DCTM.java:1820)
>>>>         at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>>>         at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:551)
>>>>
>>>> Sadly there's no description, just a stacktrace.
>>>>
>>>> I know the ES server is visible from the MCF server -- actually
>>>> they're the same machine, and it's configured to use
>>>> http://127.0.0.1:9200/ as the server URL. And I can go to the command
>>>> line on that server and curl that URL successfully.
>>
>>
>>
>> --
>>
>> http://tinyurl.com/andrew-clegg-linkedin | http://twitter.com/andrew_clegg

--

http://tinyurl.com/andrew-clegg-linkedin | http://twitter.com/andrew_clegg