Repeated service interruptions - failure processing document: null

2013-01-14 Thread Ahmet Arslan
Hello, I am indexing a SharePoint 2010 instance using mcf-trunk (at revision 1432907). There is no problem with a document library that contains Word, Excel, etc. files. However, I receive the following errors with a document library that has *.aspx files in it. Status of Jobs = Error: Repeated service

Re: Repeated service interruptions - failure processing document: null

2013-01-14 Thread Karl Wright
Hi Ahmet, The exception that seems to be causing the abort is a socket exception coming from a socket write: Caused by: java.net.SocketException: Broken pipe. This makes sense in light of the http code returned from Solr, which was 413: http://www.checkupdown.com/status/E413.html . So there
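
A minimal sketch of the distinction raised in this message, assuming hypothetical names (`IngestStatus`, `Outcome`, `classify`); it is not the ManifoldCF Solr connector's actual code. The idea is that an HTTP 413 from the output server describes one oversized request, so it could be treated as a rejection of that document rather than as a service interruption to retry. Only the 413 case comes from the thread; the other rules are assumptions added to make the example complete.

```java
public class IngestStatus {
    /** Hypothetical outcomes for posting one document to the output server. */
    public enum Outcome { ACCEPTED, REJECTED, RETRY_LATER }

    /**
     * Maps an HTTP status code to an outcome. Only the handling of 413
     * reflects the thread; the 2xx, 5xx and fallback rules are assumptions.
     */
    public static Outcome classify(int httpStatus) {
        if (httpStatus >= 200 && httpStatus < 300) {
            return Outcome.ACCEPTED;
        }
        if (httpStatus == 413) {
            // Request Entity Too Large: resending the same document cannot succeed.
            return Outcome.REJECTED;
        }
        if (httpStatus >= 500) {
            // Assumed: server-side errors may be transient, so try again later.
            return Outcome.RETRY_LATER;
        }
        // Assumed: any other status is a problem with this particular request.
        return Outcome.REJECTED;
    }

    public static void main(String[] args) {
        System.out.println(classify(413));
    }
}
```

With a rule like this, a single oversized document would be skipped instead of counting toward "Repeated service interruptions" and aborting the job.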

[jira] [Created] (CONNECTORS-609) Solr connector does not handle http code 413 properly

2013-01-14 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-609: Summary: Solr connector does not handle http code 413 properly Key: CONNECTORS-609 URL: https://issues.apache.org/jira/browse/CONNECTORS-609 Project: ManifoldCF

Re: Repeated service interruptions - failure processing document: null

2013-01-14 Thread Karl Wright
CONNECTORS-609 Karl On Mon, Jan 14, 2013 at 8:30 AM, Karl Wright daddy...@gmail.com wrote: Hi Ahmet, The exception that seems to be causing the abort is a socket exception coming from a socket write: Caused by: java.net.SocketException: Broken pipe. This makes sense in light of the http

[jira] [Commented] (CONNECTORS-609) Solr connector does not handle http code 413 properly

2013-01-14 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552657#comment-13552657 ] Karl Wright commented on CONNECTORS-609: r1432919 Solr

[jira] [Resolved] (CONNECTORS-609) Solr connector does not handle http code 413 properly

2013-01-14 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-609. Resolution: Fixed Solr connector does not handle http code 413 properly

Re: Repeated service interruptions - failure processing document: null

2013-01-14 Thread Ahmet Arslan
Hi Karl, Thanks for the quick fix. I am still seeing the following error after 'svn up' and 'ant build': ERROR 2013-01-14 17:09:41,949 (Worker thread '6') - Exception tossed: Repeated service interruptions - failure processing document: null

Re: Repeated service interruptions - failure processing document: null

2013-01-14 Thread Ahmet Arslan
Hi, If I enable security (Active Directory), the job seems to hang and I get this too: FATAL 2013-01-14 17:13:46,871 (Worker thread '15') - Error tossed: null java.lang.NullPointerException at

[jira] [Created] (CONNECTORS-610) Solr connector should disable httpclient retries, or errors will always cause a job abort

2013-01-14 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-610: Summary: Solr connector should disable httpclient retries, or errors will always cause a job abort Key: CONNECTORS-610 URL:

[jira] [Created] (CONNECTORS-611) SharePoint connector throws NPE getting permissions in some cases

2013-01-14 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-611: Summary: SharePoint connector throws NPE getting permissions in some cases Key: CONNECTORS-611 URL: https://issues.apache.org/jira/browse/CONNECTORS-611 Project:

Re: Repeated service interruptions - failure processing document: null

2013-01-14 Thread Karl Wright
Hmm, this makes no sense. The code is this: com.microsoft.sharepoint.webpartpages.GetPermissionCollectionResponseGetPermissionCollectionResult aclResult = aclCall.getPermissionCollection( encodedRelativePath, Item ); org.apache.axis.message.MessageElement[] aclList =

[jira] [Resolved] (CONNECTORS-610) Solr connector should disable httpclient retries, or errors will always cause a job abort

2013-01-14 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-610. Resolution: Fixed Solr connector should disable httpclient retries, or errors

[jira] [Commented] (CONNECTORS-610) Solr connector should disable httpclient retries, or errors will always cause a job abort

2013-01-14 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552796#comment-13552796 ] Karl Wright commented on CONNECTORS-610: r1432958. Solr

Re: Repeated service interruptions - failure processing document: null

2013-01-14 Thread Karl Wright
I checked in a fix for this ticket on trunk. Please let me know if it resolves this issue. Karl On Mon, Jan 14, 2013 at 10:20 AM, Karl Wright daddy...@gmail.com wrote: This is because httpclient retries on error three times by default. This has to be disabled in the Solr connector,

[jira] [Updated] (CONNECTORS-611) SharePoint connector throws NPE getting permissions in some cases

2013-01-14 Thread Ahmet Arslan (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated CONNECTORS-611: Attachment: Archive.zip I added these two lines to the logging.ini file {code}

Re: Repeated service interruptions - failure processing document: null

2013-01-14 Thread Ahmet Arslan
Hi Karl, Now 39 aspx files (out of 130) are indexed. The job didn't get killed. No exceptions in the log. I increased the maximum POST size of solr/jetty, but that number (39) didn't increase. I will check the size of the remaining 130 - 39 *.aspx files. Actually I am mapping extracted content of

Re: Repeated service interruptions - failure processing document: null

2013-01-14 Thread Karl Wright
Hi Ahmet, We could treat .aspx files specially, so that they are considered to never have any content. But are there cases where someone might want to index any content that these URLs might return? Specifically, what do .aspx files typically contain, when found in a SharePoint

[jira] [Commented] (CONNECTORS-611) SharePoint connector throws NPE getting permissions in some cases

2013-01-14 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552952#comment-13552952 ] Karl Wright commented on CONNECTORS-611: It looks like the basic problem is

[jira] [Resolved] (CONNECTORS-611) SharePoint connector throws NPE getting permissions in some cases

2013-01-14 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-611. Resolution: Fixed SharePoint connector throws NPE getting permissions in some

Re: Repeated service interruptions - failure processing document: null

2013-01-14 Thread Karl Wright
It's also possible that the getACLs() problem has to do with these files. Apparently you can't get the permissions for them. In that case, if security is on, we can't index them, because we can't get valid ACLs. Karl On Mon, Jan 14, 2013 at 11:46 AM, Karl Wright daddy...@gmail.com wrote: Hi
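
A minimal sketch of the rule stated in this message, assuming hypothetical names (`AclGate`, `canIndex`) and assuming that a failed permission lookup is represented as `null`; it is not the SharePoint connector's actual code. When security is on, a document without valid ACLs is not indexed; when security is off, ACLs are not needed.

```java
public class AclGate {
    /**
     * Decides whether a document may be indexed.
     *
     * @param securityEnabled whether the job enforces document security
     * @param acls            access tokens for the document, or null when the
     *                        permission lookup failed (assumed convention)
     */
    public static boolean canIndex(boolean securityEnabled, String[] acls) {
        if (!securityEnabled) {
            // Without security, missing ACLs do not block indexing.
            return true;
        }
        // With security on, indexing without valid ACLs could expose content.
        return acls != null;
    }

    public static void main(String[] args) {
        System.out.println(canIndex(true, null));
        System.out.println(canIndex(false, null));
    }
}
```

Checking for the missing result explicitly, rather than dereferencing it, is also what avoids a NullPointerException in the worker thread.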

Re: Repeated service interruptions - failure processing document: null

2013-01-14 Thread Ahmet Arslan
Hi Karl, I think people may want to index the content of aspx files, so treating them specially may not be a good solution. In our environment, aspx files are used to construct a web site that is used internally. In my understanding this is one of the use cases of SharePoint. In our case content of aspx

Re: Repeated service interruptions - failure processing document: null

2013-01-14 Thread Ahmet Arslan
Hi Karl, I tracked the problem down to this: a metadata field is causing it. If I select only the ID metadata (normally I select all of these: Created, FileLeafRef, ID, IKAccessGroup, IKContentType, IKDocuments, IKExpertise, IKExplanation, IKFAQ, IKImportant, Modified, Title) all aspx files are indexed

Re: Repeated service interruptions - failure processing document: null

2013-01-14 Thread Ahmet Arslan
Hello, I increased these settings of jetty/solr: <Set type="java.lang.Integer" name="requestHeaderSize">147483647</Set> <Set type="java.lang.Integer" name="requestBufferSize">147483647</Set> Now I can index all 130 aspx files (with all metadata) with security disabled. Thanks, Ahmet --- On Mon, 1/14/13,

[jira] [Commented] (CONNECTORS-608) Solr connector gets socket timeouts on slow documents

2013-01-14 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13553205#comment-13553205 ] Karl Wright commented on CONNECTORS-608: Please synch up again. Another fix

Re: Repeated service interruptions - failure processing document: null

2013-01-14 Thread Karl Wright
Ok, it seems like ManifoldCF is working correctly here, no? Karl On Mon, Jan 14, 2013 at 5:24 PM, Ahmet Arslan iori...@yahoo.com wrote: Hello, I increased these settings of jetty/solr: <Set type="java.lang.Integer" name="requestHeaderSize">147483647</Set> <Set type="java.lang.Integer"