Re: Cannot connect to SharePoint 2010 instance
I've seen situations where a SharePoint site is configured to perform a redirection, and this messes things up internally. Does your connection's server name, etc. match precisely the URL you see when you are in the SharePoint user interface?

Karl

On Tue, Nov 6, 2012 at 8:47 AM, Iannetti, Robert robert.ianne...@novartis.com wrote:

Karl,

After further review it appears that MCPermissions.asmx was installed globally in SharePoint. I am able to access it from within my SharePoint site as well as all other valid SharePoint sub-sites. So this connection, http://server/sitepath/_vti_bin, works with any valid site in sitepath, including the previously mentioned _admin site.

That said, do you have any thoughts on why I would be getting the 404 error?

Thanks
Bob

-----Original Message-----
From: Karl Wright [mailto:daddy...@gmail.com]
Sent: Monday, November 05, 2012 2:45 PM
To: user@manifoldcf.apache.org
Subject: Re: Cannot connect to SharePoint 2010 instance

The 404 error indicates that your MCPermissions service is not properly deployed. The _admin in your path is a clue that something might not be right. The place you want to see MCPermissions.asmx is the following location:

http[s]://server/sitepath/_vti_bin

... where server is your server name, and sitepath is your site path. The best way to get this is to enter the SharePoint UI (NOT the admin UI, but the SharePoint end-user UI), and log into the root site. Then make note of the URL in your browser.

If the MCPermissions.asmx service appears under that URL, look at your IIS settings and make sure that the MCPermissions.asmx service can be executed. Also, this may be of some help:

https://cwiki.apache.org/confluence/display/CONNECTORS/Debugging+Connections

The end user documentation is also extremely helpful in describing how to properly set up connections.

You can uninstall the MCPermissions.asmx service using the .bat files that are included with the plugin. When you re-install, please make sure that you are logged in as a user with full admin privileges, or the service will not work properly.

Thanks,
Karl

On Mon, Nov 5, 2012 at 2:33 PM, Iannetti, Robert robert.ianne...@novartis.com wrote:

Hello,

I have installed apache-manifoldcf-1.0.1 on my Linux server and apache-manifoldcf-sharepoint-2010-plugin-0.1-bin on my SharePoint 2010 server.

On my SharePoint server I can see the Permissions Page when I enter http://x:x/_admin/_vti_bin/MCPermissions.asmx in my browser.

When I try to make a SharePoint Services 4.0 (2010) connection to my SharePoint 2010 server in the ManifoldCF interface I get this error:

Got an unknown remote exception accessing site - axis fault = Client, detail = The request failed with HTTP status 404: Not Found.

I can connect using SharePoint Services 2.0 (2003), but when I try a crawl it does not work properly and aborts. The SharePoint Services 3.0 (2007) connection fails the same way as the above 2010 connection.

Can you please give some direction on how best to resolve this issue?

Thanks
Bob

Robert P. Iannetti
Application Architect
Novartis Institute for BioMedical Research
186 Massachusetts Avenue
Cambridge, MA 02139
Phone: +1 (617) 871-5414
robert.ianne...@novartis.com
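
The URL rule described in this thread (the service should appear under http[s]://server/sitepath/_vti_bin, and an _admin segment is a warning sign) can be sketched in Python. This is only an illustration: the function names are hypothetical, and the input is assumed to be the site root URL copied from the end-user SharePoint UI, not the URL of an individual page.

```python
from urllib.parse import urlsplit


def mcpermissions_url(site_url: str) -> str:
    """Build the URL where MCPermissions.asmx is expected for a site root URL.

    Hypothetical helper: it only follows the pattern given in the thread,
    http[s]://server/sitepath/_vti_bin, and appends the service name.
    """
    parts = urlsplit(site_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an http(s) URL: {site_url!r}")
    site_path = parts.path.rstrip("/")  # "" for the root site
    return f"{parts.scheme}://{parts.netloc}{site_path}/_vti_bin/MCPermissions.asmx"


def looks_like_admin_path(site_url: str) -> bool:
    """Return True if the path contains an _admin segment (the clue noted above)."""
    segments = [s.lower() for s in urlsplit(site_url).path.split("/") if s]
    return "_admin" in segments


if __name__ == "__main__":
    for url in ("http://server/sitepath", "http://server/_admin"):
        print(url, "->", mcpermissions_url(url),
              "(admin path)" if looks_like_admin_path(url) else "")
```

Opening the resulting URL in a browser while logged in as an ordinary SharePoint user is then a quick way to confirm that the service is reachable at the path the connection will use.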
RE: Cannot connect to SharePoint 2010 instance
Yes, The URL and what I enter in the ManifoldCF interface are a match.
Re: The Schedulars are not starting automatically
Hi Anupam,

I'm having difficulty understanding what you posted here, but I will try to explain the difference between "Rescan documents dynamically" and "Scan every document once". You may also find more help in ManifoldCF in Action, at http://www.manning.com/wright .

The first option causes your job to run forever. The job runs only in the schedule windows allotted for it. It periodically discovers new documents, and (depending on the crawling model of the connector) may check for existence or modification of an already-crawled document. Each document has its own schedule for doing this.

The second option is more likely to be what you want. Each job starts, runs, and completes, being sure to run only in the scheduling windows you provide. You then run it again, and again (or your job schedule makes that happen). It will do the minimal work to keep your index up to date.

There are significant differences between how you would set up a job using one model vs. the other. I strongly suggest you read at least the first few chapters of the book.

Karl

On Tue, Nov 6, 2012 at 12:35 PM, Anupam Bhattacharya anupam...@gmail.com wrote:

My incremental indexing was working previously, but I have messed up a few settings, due to which the documents indexed on the previous day get deleted and only the new ones show up. I suspect that it is due to the setting in List all Jobs > Edit selected job > Scheduling > Schedule type: Rescan documents dynamically OR Scan every document once? Please let me know the appropriate settings to index only the new documents in the repository.

After deleting the Solr index data folder and clearing the table records in jobqueue, repohistory, and ingeststatus, I found that ManifoldCF scans only the remaining new documents, until I go to List Output Connections, click View for a Solr connection, and click OK on "Re-ingest all associated documents".

How does it keep track of which documents were ingested previously, so that it fetches only the new documents?

Regards
Anupam

On Tue, Aug 14, 2012 at 10:01 AM, Anupam Bhattacharya anupam...@gmail.com wrote:

Thanks. There is an option to set the Start Method on the Connection tab in the job settings. I changed it to "Start when the Schedule window starts" and the problem got resolved.

Regards
Anupam

On Thu, Aug 2, 2012 at 10:59 PM, Karl Wright daddy...@gmail.com wrote:

Incremental crawling will work the same whether the job is run manually or started automatically. If you have added the appropriate schedule record to your job, you also have to select the "run job automatically" radio button on one of the other job tabs for automatic runs to take place. I suspect that is what you are missing.

Karl

On Thu, Aug 2, 2012 at 1:12 PM, Anupam Bhattacharya anupam...@gmail.com wrote:

I have a job which indexes properly, including incremental indexing, if it is run manually. However, even after adding a specific time to run the scheduler process, the job is not starting on its own.

What is the ideal configuration for a job which runs automatically every day at 12 am and does an incremental re-indexing (looking only for those documents which are new OR modified after the last crawl) of the repository?

Is it necessary to give the total run time details when adding a specific schedule time?

Regards
Anupam

--
Thanks & Regards
Anupam Bhattacharya
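
The question of how a crawler can know which documents were ingested before, and so do only "the minimal work", can be pictured with a toy model. The sketch below is not ManifoldCF's implementation; it assumes, purely for illustration, that a version string is remembered per document from the previous run (the thread names the jobqueue and ingeststatus tables but does not describe what they hold). All names are hypothetical.

```python
from typing import Dict, List, Tuple


def plan_incremental_run(
    previously_indexed: Dict[str, str],
    repository_now: Dict[str, str],
) -> Tuple[List[str], List[str], List[str]]:
    """Compare remembered version strings with the repository's current ones.

    Both arguments map document ID -> version string.
    Returns (to_index, to_delete, unchanged) as sorted lists of document IDs.
    """
    # New documents, and documents whose version string changed, need indexing.
    to_index = sorted(
        doc_id for doc_id, version in repository_now.items()
        if previously_indexed.get(doc_id) != version
    )
    # Documents remembered from last time but no longer present are removed.
    to_delete = sorted(
        doc_id for doc_id in previously_indexed if doc_id not in repository_now
    )
    # Everything else can be skipped.
    unchanged = sorted(
        doc_id for doc_id, version in repository_now.items()
        if previously_indexed.get(doc_id) == version
    )
    return to_index, to_delete, unchanged


if __name__ == "__main__":
    before = {"a": "v1", "b": "v1", "c": "v1"}
    now = {"a": "v1", "b": "v2", "d": "v1"}
    print(plan_incremental_run(before, now))
```

In this model, wiping the remembered state (the `previously_indexed` mapping) makes every document look new on the next run, while wiping only the search index leaves the crawler believing nothing has changed. That is at least consistent with the behaviour described above, where a forced re-ingest was needed after the Solr data folder was deleted.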
Re: Cannot connect to SharePoint 2010 instance
Yes, this can be somewhat tricky. There are a lot of potential configurations that could affect this.

First, you want to verify that your IIS is using NTLM authentication, and that all the web services directories are executable. This is critical.

Second, the credentials, in the form of domain\user, may be sensitive to whether you use a fully-qualified domain name or a shortcut domain name, e.g. mydomain.novartis.com or just mydomain. I suggest you try some combinations.

The other thing you may want to check is whether the machine you are running ManifoldCF on is known by your domain controller; you may not be able to authenticate if it is not.

If this doesn't help, and you want to eliminate ManifoldCF's NTLM implementation from the list of possibilities, I suggest downloading the curl utility and trying to fetch a web service listing or WSDL using it (specifying NTLM, of course, as the authentication method). If that also doesn't work, it's a server-side configuration problem of some kind.

You can also refer to the server-side IIS logs for some additional info, but I've found these are not very helpful for authentication issues.

Let me know if you are still stuck after this; there are other diagnostics available but they start to get ugly.

Karl

On Tue, Nov 6, 2012 at 2:35 PM, Iannetti, Robert robert.ianne...@novartis.com wrote:

Karl,

I turned on the additional debugging and was able to resolve the 404 issue. Now I am getting:

Crawl user did not authenticate properly, or has insufficient permissions to access http://.xxx.xxx: (401)Unauthorized

I can log into the SharePoint site from the browser using the same credentials. Any thoughts?

Thanks
Bob

-----Original Message-----
From: Karl Wright [mailto:daddy...@gmail.com]
Sent: Tuesday, November 06, 2012 10:05 AM
To: user@manifoldcf.apache.org
Subject: Re: Cannot connect to SharePoint 2010 instance

Well, you can turn on httpclient wire debugging, as I believe is described in the article URL I sent you before, and then you can see precisely what URL the connector is trying to reach when it accesses the MCPermissions service. There's no magic here. If the connector gets a 404 error back from IIS, either its URL is wrong, or IIS has decided it's not going to serve that page to the client.

Karl
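
The curl check suggested in this thread, combined with trying both domain spellings, can be sketched as a small Python script that prints the commands to run. The assumptions are labeled in the comments: the helper names are hypothetical, the server, site path, domain, and user are placeholders, and Lists.asmx is used only as an example of a standard SharePoint web service whose WSDL can be requested. The curl options `--ntlm`, `--user`, and `--include` are real; with no password after the user name, curl prompts for one.

```python
import shlex
from typing import List


def domain_variants(short_domain: str, fq_domain: str, user: str) -> List[str]:
    """Return domain\\user in both the short and the fully-qualified form."""
    return [f"{short_domain}\\{user}", f"{fq_domain}\\{user}"]


def curl_ntlm_command(wsdl_url: str, domain_user: str) -> List[str]:
    """Build a curl argument list that fetches a WSDL using NTLM authentication."""
    return ["curl", "--ntlm", "--user", domain_user, "--include", wsdl_url]


if __name__ == "__main__":
    # Placeholder URL: substitute your own server and site path.
    url = "http://server/sitepath/_vti_bin/Lists.asmx?WSDL"
    # Placeholder domain and user names, following the example in the thread.
    for credential in domain_variants("mydomain", "mydomain.novartis.com", "crawluser"):
        # Print the command instead of running it, so it can be reviewed first.
        print(shlex.join(curl_ntlm_command(url, credential)))
```

If one spelling of the domain returns the WSDL and the other returns 401, that points at the credential form; if both fail from the ManifoldCF machine, the problem is more likely on the server side, as noted above.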
Re: Cannot connect to SharePoint 2010 instance
No, Kerberos is not supported. This is a limitation of the Apache commons-httpclient library that we use for communicating with SharePoint.

It is possible to set up IIS to serve a different port, with different authentication, that goes to the same SharePoint instance but is NTLM protected, not Kerberos protected. Perhaps you can do this and limit access to that port to only the ManifoldCF machine.

Karl

On Tue, Nov 6, 2012 at 3:03 PM, Iannetti, Robert robert.ianne...@novartis.com wrote:

Karl,

Our SharePoint sites use Kerberos authentication. Is this supported in ManifoldCF?

Thanks
Bob
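
Before or after setting up a separate NTLM-protected port, it may help to see which authentication schemes a given URL advertises. The sketch below assumes the server answers an unauthenticated request with 401 and one or more WWW-Authenticate headers, which is standard HTTP behaviour; the function names are hypothetical. A `negotiate` entry typically indicates SPNEGO (Kerberos, possibly with NTLM fallback), while `ntlm` indicates that plain NTLM is offered.

```python
import urllib.error
import urllib.request
from typing import Iterable, Set


def offered_schemes(www_authenticate_values: Iterable[str]) -> Set[str]:
    """Extract lowercase scheme names from WWW-Authenticate header values."""
    schemes = set()
    for value in www_authenticate_values:
        # The scheme is the first token, e.g. "NTLM" or 'Basic realm="x"'.
        token = value.strip().split(" ", 1)[0]
        if token:
            schemes.add(token.lower())
    return schemes


def probe(url: str) -> Set[str]:
    """Send one unauthenticated GET and report the schemes offered on a 401.

    Returns an empty set if the server does not demand authentication.
    """
    try:
        urllib.request.urlopen(url, timeout=10)
    except urllib.error.HTTPError as err:
        if err.code == 401:
            return offered_schemes(err.headers.get_all("WWW-Authenticate") or [])
        raise
    return set()
```

Calling `probe()` against the original site URL and against the alternative port would show whether the second binding really offers NTLM. The parsing is deliberately simple and does not handle several challenges combined into a single header value.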
RE: Cannot connect to SharePoint 2010 instance
Karl,

If this is not possible can you recommend any other products to crawl SharePoint content and index it in Solr?

Thanks
Bob
Re: Cannot connect to SharePoint 2010 instance
Hi Bob,

The only products I know of have similar limitations. The only one I know of is the SharePoint Google appliance connector, which when I last looked had exactly the same restriction. It also has other limitations, some severe, such as limiting the number of documents you can crawl to no more than 5000 per library.

We are willing to do a reasonable amount of work to upgrade ManifoldCF to be able to support Kerberos. Here's a link which describes the situation:

http://old.nabble.com/Support-for-Kerberos-SPNEGO-td14564857.html

We currently use a significantly patched version of 3.1, which supplied the NTLM implementation for 4.0 that is currently in use. Our issue is the same as the commons-httpclient team's: we have no good way of testing all of this, and none of us are security protocol experts. If you have (or know somebody with) such expertise, who would be willing and able to donate their time, I think this problem could be tackled without too much pain. Then at least httpclient, given the right tickets, would be able to connect.

The other issue with Kerberos auth is that I believe it will require a significant amount of work to allow anything using it to obtain the tickets from the AD domain controller. This would obviously require UI work for all connectors that would support Kerberos. But that is something I am willing to attempt if everything else is in place.

Karl
RE: Cannot connect to SharePoint 2010 instance
Karl, On another topic is there a roadmap for supporting SharePoint 2013 ? We are in the process of migrating and were wondering when your ManifoldCF product would be available to support it. Thanks Bob -Original Message- From: Karl Wright [mailto:daddy...@gmail.com] Sent: Tuesday, November 06, 2012 3:34 PM To: user@manifoldcf.apache.org Subject: Re: Cannot connect to SharePoint 2010 instance Hi Bob, The only products I know have a similar limitations. The only one I know is the SharePoint google appliance connector, which when I looked last had exactly the same restriction. It also has other limitations, some severe, such as limiting the number of documents you can crawl to no more than 5000 per library. We are willing to do a reasonable amount of work to upgrade ManifoldCF to be able to support Kerberos. Here's a link which describes the situation: http://old.nabble.com/Support-for-Kerberos-SPNEGO-td14564857.html We currently use a significantly-patched version of 3.1, which supplied the NTLM implementation for 4.0 that is currently in use. Our issue is similar to the commons-httpclient team's, which is we have no good way of testing all of this, and none of us are security protocol experts. If you have (or know somebody with) such expertise, who would be willing/able to donate their time, this problem could be tackled I think without too much pain. So at least httpclient, given the right tickets, would be able to connect. The other issue with Kerberos auth is that I believe it will require a significant amount of work to allow anything using it to obtain the tickets from the AD domain controller. This would obviously require UI work for all connectors that would support Kerberos. But that is something I am willing to attempt if everything else is in place. Karl On Tue, Nov 6, 2012 at 3:11 PM, Iannetti, Robert robert.ianne...@novartis.com wrote: Karl, If this is not possible can you recommend any other products to crawl SharePoint content and index it in Solr? 
Thanks Bob -Original Message- From: Karl Wright [mailto:daddy...@gmail.com] Sent: Tuesday, November 06, 2012 3:10 PM To: user@manifoldcf.apache.org Subject: Re: Cannot connect to SharePoint 2010 instance No, Kerberos is not supported. This is a limitation of the Apache commons-httpclient library that we use for communicating with SharePoint. It is possible to set up IIS to serve a different port with different authentication that goes to the same SharePoint instance but is NTLM protected, not Kerberos protected. Perhaps you can do this and limit access to that port to only the ManifoldCF machine. Karl On Tue, Nov 6, 2012 at 3:03 PM, Iannetti, Robert robert.ianne...@novartis.com wrote: Karl, Our SharePoint sites use Kerberos authentication is this supported in ManifoldCF? Thanks Bob -Original Message- From: Karl Wright [mailto:daddy...@gmail.com] Sent: Tuesday, November 06, 2012 2:50 PM To: user@manifoldcf.apache.org Subject: Re: Cannot connect to SharePoint 2010 instance Yes, this can be somewhat tricky. There are a lot of potential configurations that could affect this. First, you want to verify that your IIS is using NTLM authentication, and that all the web services directories are executable. This is critical. Second, the credentials, in the form of domain\user, may be sensitive to whether you use a fully-qualified domain name or a shortcut domain name, e.g. mydomain.novartis.com or just mydomain. I suggest you try some combinations. The other thing you may want to check is whether the machine you are running ManifoldCF on is known by your domain controller; you may not be able to authenticate if it is not. If this doesn't help, and you want to eliminate ManifoldCF's NTLM implementation from the list of possibilities, I suggest downloading the curl utility, and trying to fetch a web service listing or wsdl using it (specifying NTLM of course as the authentication method). If that also doesn't work, it's a server-side configuration problem of some kind. 
You can also refer to the server-side IIS logs for some additional info. But I've found these are not very helpful for authentication issues. Let me know if you are still stuck after this; there are other diagnostics available but they start to get ugly. Karl On Tue, Nov 6, 2012 at 2:35 PM, Iannetti, Robert robert.ianne...@novartis.com wrote: Karl, I turned on the additional debugging and was able to resolve the 404 issue. Now I am getting: Crawl user did not authenticate properly, or has insufficient permissions to access http://.xxx.xxx: (401)Unauthorized I can log into the SharePoint site from the browser using the same credentials. Any Thoughts? Thanks Bob -Original Message- From: Karl Wright [mailto:daddy...@gmail.com] Sent: Tuesday, November 06, 2012 10:05 AM To: user@manifoldcf.apache.org
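Editor's sketch of the curl check suggested in the quoted 2:50 PM message, for readers who want to rule out ManifoldCF's NTLM implementation. This is not from the thread itself: the SERVER, SITEPATH and CRAWL_USER variables are hypothetical placeholders, and the URL simply follows the http[s]://server/sitepath/_vti_bin pattern described earlier in the thread. It assumes a curl build with NTLM support.

```shell
#!/bin/sh
# Sketch only: SERVER, SITEPATH and CRAWL_USER are placeholders,
# not values taken from this thread.
SERVER="${SERVER:-server}"
SITEPATH="${SITEPATH:-sitepath}"

# Appending ?WSDL to an .asmx endpoint asks IIS for the service description.
URL="http://${SERVER}/${SITEPATH}/_vti_bin/MCPermissions.asmx?WSDL"
echo "Target: ${URL}"

# Only contact the server when a crawl account is supplied, e.g.
#   CRAWL_USER='mydomain\crawluser' sh check_ntlm.sh
# curl prompts for the password because none follows the user name.
if [ -n "${CRAWL_USER:-}" ]; then
  # --ntlm forces NTLM authentication; -w prints only the HTTP status code.
  curl --ntlm -u "${CRAWL_USER}" -s -o /dev/null -w '%{http_code}\n' "${URL}"
fi
```

Roughly speaking, a 200 would suggest NTLM and the credentials work from that machine, a 401 points at credentials or server-side authentication settings, and a 404 points back at the path or service deployment. It may be worth repeating the call with both the short and the fully-qualified domain name in CRAWL_USER.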
Re: Cannot connect to SharePoint 2010 instance
Hi Bob, That depends very strongly on whether SharePoint 2013 continues the Microsoft tradition of breaking web services that used to work. :-) Seriously, we need three things to develop a SharePoint 2013 solution: (1) A stable release (a beta is not sufficient because Microsoft is famous for changing things in a major way between beta and release); (2) a benevolent client with sufficient patience to try things out that we develop in their environment, and (3) enough time so that we're not on the bleeding edge and that other people have run into most of the sticky problems first. We're volunteers here and we all have day jobs, so we mostly can't afford to be pounding away at brick walls on our own. It could be the case that everything just works, in which case the development is trivial. We'll have to see. Karl
RE: Cannot connect to SharePoint 2010 instance
Karl, That sounds reasonable. I am having my SP Admin set up the NTLM-protected SharePoint instance you described; I will let you know how it works. BTW SP 2013 RTM has been released so we can cross #1 off the list :) Thanks Bob
Re: Cannot connect to SharePoint 2010 instance
If you want, we can create a ticket to cover SharePoint 2013 work. If you want to attempt a sanity check, email me (personally, to daddy...@gmail.com) the Microsoft.SharePoint.dll and I can set up a ManifoldCF-Sharepoint-2013 plugin. If I can build that, then the next step would be just trying it all out and seeing where it fails. Karl
RE: Cannot connect to SharePoint 2010 instance
Karl, Let's try to get the 2010 connection working first before we proceed to the 2013. Thanks Bob -Original Message- From: Karl Wright [mailto:daddy...@gmail.com] Sent: Tuesday, November 06, 2012 3:59 PM To: user@manifoldcf.apache.org Subject: Re: Cannot connect to SharePoint 2010 instance If you want, we can create a ticket to cover SharePoint 2013 work. If you want to attempt a sanity check, if you email me (personally, to daddy...@gmail.com) the Microsoft.SharePoint.dll I can set up a ManifoldCF-Sharepoint-2013 plugin. If I can build that, then the next step would be just trying it all out and seeing where it fails. Karl On Tue, Nov 6, 2012 at 3:49 PM, Iannetti, Robert robert.ianne...@novartis.com wrote: Karl, That sounds reasonable. I am having my SP Admin set up the NTLM SharePoint instance described below I will let you know how it works. BTW SP 2013 RTM has been released so we can cross #1 off the list :) Thanks Bob -Original Message- From: Karl Wright [mailto:daddy...@gmail.com] Sent: Tuesday, November 06, 2012 3:47 PM To: user@manifoldcf.apache.org Subject: Re: Cannot connect to SharePoint 2010 instance Hi Bob, That depends very strongly on whether SharePoint 2013 continues the Microsoft tradition of breaking web services that used to work. :-) Seriously, we need three things to develop a SharePoint 2013 solution: (1) A stable release (a beta is not sufficient because Microsoft is famous for changing things in a major way between beta and release); (2) a benevolent client with sufficient patience to try things out that we develop in their environment, and (3) enough time so that we're not on the bleeding edge and that other people have run into most of the sticky problems first. We're volunteers here and we all have day jobs, so we mostly can't afford to be pounding away at brick walls on our own. It could be the case that everything just works, in which case the development is trivial. We'll have to see.
Karl On Tue, Nov 6, 2012 at 3:37 PM, Iannetti, Robert robert.ianne...@novartis.com wrote: Karl, On another topic, is there a roadmap for supporting SharePoint 2013? We are in the process of migrating and were wondering when your ManifoldCF product would be available to support it. Thanks Bob -Original Message- From: Karl Wright [mailto:daddy...@gmail.com] Sent: Tuesday, November 06, 2012 3:34 PM To: user@manifoldcf.apache.org Subject: Re: Cannot connect to SharePoint 2010 instance Hi Bob, The only products I know of have similar limitations. The only one I know is the SharePoint google appliance connector, which when I looked last had exactly the same restriction. It also has other limitations, some severe, such as limiting the number of documents you can crawl to no more than 5000 per library. We are willing to do a reasonable amount of work to upgrade ManifoldCF to be able to support Kerberos. Here's a link which describes the situation: http://old.nabble.com/Support-for-Kerberos-SPNEGO-td14564857.html We currently use a significantly-patched version of 3.1, which supplied the NTLM implementation for 4.0 that is currently in use. Our issue is similar to the commons-httpclient team's, which is that we have no good way of testing all of this, and none of us are security protocol experts. If you have (or know somebody with) such expertise, who would be willing/able to donate their time, this problem could be tackled I think without too much pain. So at least httpclient, given the right tickets, would be able to connect. The other issue with Kerberos auth is that I believe it will require a significant amount of work to allow anything using it to obtain the tickets from the AD domain controller. This would obviously require UI work for all connectors that would support Kerberos. But that is something I am willing to attempt if everything else is in place.
Karl On Tue, Nov 6, 2012 at 3:11 PM, Iannetti, Robert robert.ianne...@novartis.com wrote: Karl, If this is not possible, can you recommend any other products to crawl SharePoint content and index it in Solr? Thanks Bob -Original Message- From: Karl Wright [mailto:daddy...@gmail.com] Sent: Tuesday, November 06, 2012 3:10 PM To: user@manifoldcf.apache.org Subject: Re: Cannot connect to SharePoint 2010 instance No, Kerberos is not supported. This is a limitation of the Apache commons-httpclient library that we use for communicating with SharePoint. It is possible to set up IIS to serve a different port with different authentication that goes to the same SharePoint instance but is NTLM protected, not Kerberos protected. Perhaps you can do this and limit access to that port to only the ManifoldCF machine. Karl On Tue, Nov 6, 2012 at 3:03 PM, Iannetti, Robert robert.ianne...@novartis.com wrote: Karl, Our SharePoint sites use Kerberos authentication. Is this