FWIW, the newest version of the Solr connector now has configurable timeout values. But my original comment still stands; you really should not find yourself in a position to need this.
Karl

On Wed, Dec 26, 2012 at 6:19 AM, Karl Wright <daddy...@gmail.com> wrote:
> Hi Shigeki,
>
> While timeout values for Solr could theoretically be configured as
> connection parameters, the timeout values for jCIFS are currently only
> settable globally. Therefore, to make changes configurable by
> connection, the jCIFS library needs to change. I've already
> approached the jCIFS developer about changes of this kind, and he was
> unreceptive to this request. Part of the reason is the nature of the
> CIFS protocol, which multiplexes many simultaneous requests using the
> same connection. So this cannot be solved in the manner you suggest,
> in any case.
>
> Furthermore, on a properly-set-up system, it should be unnecessary to
> adjust either jCIFS timeout parameters or Solr timeout parameters. If
> you are consistently getting timeouts from jCIFS, it is a strong sign
> you are overloading the Windows servers you are trying to crawl, and
> you should take steps immediately to reduce the maximum number of
> connections you are trying to crawl with. Similarly, chronically
> exceeding the Solr timeout parameters indicates you are pushing
> documents into a Solr that is either insufficiently powered, or has
> too few available threads. Cutting back on the max number of
> connections is indicated here as well.
>
> Since ManifoldCF retries failures, occasional failures due to other
> loads on either the Windows servers or on Solr are expected and will
> not cause problems. But chronic failures indicate serious
> configuration problems, for which increasing the timeouts is the wrong
> solution. So I hesitate to add features of the kind you request,
> unless you can convince me that there is a fundamental reason why it
> should be necessary to change these parameters.
>
> Thanks,
> Karl
>
>
> On Wed, Dec 26, 2012 at 2:18 AM, Shigeki Kobayashi
> <shigeki.kobayas...@g.softbank.co.jp> wrote:
>>
>>
>> Hi.
>>
>> In my use of MCF so far, I have hit timeout errors many times while
>> crawling and indexing files to Solr.
>> I would like to propose making the following timeout values configurable in
>> properties.xml.
>>
>> Timeout errors often occur depending on the files and environments (machines), so
>> it would be nice to be able to change
>> the timeout values without rebuilding the whole source.
>>
>>
>> $MCF_HOME\connectors\solr\connector\src\main\java\org\apache\manifoldcf\agents\output\solr\HttpPoster.java
>>
>> int responseRetries = 9000; // Long basic wait: 3 minutes. This will also be added to by a term based on the size of the request.
>>
>> $MCF_HOME\connectors\jcifs\connector\src\main\java\org\apache\manifoldcf\crawler\connectors\sharedrive\SharedDriveConnector.java
>> System.setProperty("jcifs.smb.client.soTimeout","150000");
>> System.setProperty("jcifs.smb.client.responseTimeout","120000");
>>
>>
>> Regards,
>>
>>
>> Shigeki
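
For anyone reading this thread later, here is a rough sketch of what the jCIFS half of the proposal above could look like if the two values were read from a configuration source instead of being hardcoded. The class JcifsTimeoutConfig and its methods are hypothetical and are not part of ManifoldCF or jCIFS; only the two property names and their defaults come from the SharedDriveConnector lines quoted above. Because the values end up in System properties, they still apply to the whole JVM and not to one connection, which is the limitation described earlier in the thread, and raising them does not fix an overloaded server.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of ManifoldCF or jCIFS.
class JcifsTimeoutConfig {
    // Property names and defaults as quoted from SharedDriveConnector.java.
    static final String SO_TIMEOUT_KEY = "jcifs.smb.client.soTimeout";
    static final String RESPONSE_TIMEOUT_KEY = "jcifs.smb.client.responseTimeout";
    static final int DEFAULT_SO_TIMEOUT_MS = 150000;
    static final int DEFAULT_RESPONSE_TIMEOUT_MS = 120000;

    // Returns the positive integer stored under key, or fallback when the
    // value is missing, not a number, or not positive.
    static int resolve(Properties overrides, String key, int fallback) {
        String raw = overrides.getProperty(key);
        if (raw == null) {
            return fallback;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Publishes the resolved values as JVM-wide system properties.
    // These are global: every jCIFS connection in the process shares them.
    static void apply(Properties overrides) {
        System.setProperty(SO_TIMEOUT_KEY,
            Integer.toString(resolve(overrides, SO_TIMEOUT_KEY, DEFAULT_SO_TIMEOUT_MS)));
        System.setProperty(RESPONSE_TIMEOUT_KEY,
            Integer.toString(resolve(overrides, RESPONSE_TIMEOUT_KEY, DEFAULT_RESPONSE_TIMEOUT_MS)));
    }

    public static void main(String[] args) {
        // Assumed override source; a real deployment would load this from its own config.
        Properties overrides = new Properties();
        overrides.setProperty(RESPONSE_TIMEOUT_KEY, "180000");
        apply(overrides);
        System.out.println(SO_TIMEOUT_KEY + "=" + System.getProperty(SO_TIMEOUT_KEY));
        System.out.println(RESPONSE_TIMEOUT_KEY + "=" + System.getProperty(RESPONSE_TIMEOUT_KEY));
    }
}
```

Something like apply() would presumably have to run before the jCIFS classes are first used, since the settings are process-wide. Invalid or missing overrides fall back to the hardcoded defaults, so a bad configuration entry would not leave a timeout unset.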