Re: Cannot connect to SharePoint 2010 instance

2012-11-06 Thread Karl Wright
I've seen situations where a SharePoint site is configured to perform
a redirection, and this messes things up internally.  Do your
connection's server name, etc. match precisely the URL you see when
you are in the SharePoint user interface?
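
To make that comparison concrete, here is a minimal Python sketch (not part
of ManifoldCF).  It assumes the connection is described by a protocol, server
name, port and site path, and that you paste in the site root URL from the
browser.  The function names service_url and mismatches are made up for
illustration; the _vti_bin/MCPermissions.asmx location is the one described
further down this thread.

```python
from urllib.parse import urlsplit


def service_url(protocol, server, port, site_path, service="MCPermissions.asmx"):
    """Build http[s]://server/sitepath/_vti_bin/<service> from connection fields."""
    default_port = 443 if protocol == "https" else 80
    # Only spell out the port when it is not the default for the protocol.
    host = server if port == default_port else f"{server}:{port}"
    trimmed = site_path.strip("/")
    path = f"/{trimmed}" if trimmed else ""
    return f"{protocol}://{host}{path}/_vti_bin/{service}"


def mismatches(protocol, server, port, site_path, site_root_url):
    """Return the names of connection fields that differ from the browser URL."""
    parts = urlsplit(site_root_url)
    default_port = 443 if parts.scheme == "https" else 80
    seen = {
        "protocol": parts.scheme,
        "server": parts.hostname or "",
        "port": parts.port or default_port,
        "site_path": parts.path.strip("/"),
    }
    configured = {
        "protocol": protocol.lower(),
        "server": server.lower(),
        "port": port,
        "site_path": site_path.strip("/"),
    }
    return [name for name in configured if configured[name] != seen[name]]


# Hypothetical values; substitute what you see in your own browser.
print(service_url("http", "server", 80, "/sitepath"))
print(mismatches("http", "server", 80, "/sitepath", "http://server/sitepath/"))
```

An empty list from mismatches() only means the fields agree with the browser
URL; it says nothing about server-side redirects, which still need to be
checked separately.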

Karl

On Tue, Nov 6, 2012 at 8:47 AM, Iannetti, Robert
robert.ianne...@novartis.com wrote:
 Karl,

 After further review, it appears that MCPermissions.asmx was installed globally
 in SharePoint. I am able to access it from within my SharePoint site as well
 as from all other valid SharePoint sub-sites.
 So the connection http://server/sitepath/_vti_bin works with any valid
 site in sitepath, including the previously mentioned _admin site.

 That said, do you have any thoughts on why I would be getting the 404 error?

 Thanks
 Bob


 -Original Message-
 From: Karl Wright [mailto:daddy...@gmail.com]
 Sent: Monday, November 05, 2012 2:45 PM
 To: user@manifoldcf.apache.org
 Subject: Re: Cannot connect to SharePoint 2010 instance

 The 404 error indicates that your MCPermissions service is not properly 
 deployed.  The _admin in your path is a clue that something might not be 
 right.  The place you want to see the MCPermissions.asmx is in the following 
 location:

 http[s]://server/sitepath/_vti_bin

 ... where the server is your server name, and the sitepath is your site 
 path.  The best way to get this is to enter the SharePoint UI (NOT the admin 
 UI, but the SharePoint end-user UI), and log into the root site.  Then make 
 note of the URL in your browser.

 If the MCPermissions.asmx service appears under that URL, look at your IIS 
 settings and make sure that the MCPermissions.asmx service can be executed.

 Also, this may be of some help:
 https://cwiki.apache.org/confluence/display/CONNECTORS/Debugging+Connections

 The end user documentation is also extremely helpful in describing how to 
 properly set up connections.

 You can uninstall the MCPermissions.asmx service using the .bat files that 
 are included with the plugin.  When you re-install, please make sure that you 
 are logged in as a user with full admin privileges, or the service will not 
 work properly.

 Thanks,
 Karl

 On Mon, Nov 5, 2012 at 2:33 PM, Iannetti, Robert 
 robert.ianne...@novartis.com wrote:
 Hello,

 I have installed apache-manifoldcf-1.0.1 on my Linux server and
 apache-manifoldcf-sharepoint-2010-plugin-0.1-bin on my SharePoint 2010
 server.

 On my SharePoint server I can see the Permissions Page when I enter
 http://x:x/_admin/_vti_bin/MCPermissions.asmx in my browser.

 When I try to make a SharePoint Services 4.0 (2010) connection to my
 SharePoint 2010 server in the ManifoldCF interface, I get this error:

 Got an unknown remote exception accessing site - axis fault = Client,
 detail = The request failed with HTTP status 404: Not Found.

 I can connect using SharePoint Services 2.0 (2003), but when I try a
 crawl it does not work properly and aborts.

 The SharePoint Services 3.0 (2007) connection fails the same way as the
 2010 connection above.

 Can you please give some direction on how best to resolve this issue?

 Thanks
 Bob

 Robert P. Iannetti
 Application Architect
 Novartis Institute for BioMedical Research
 186 Massachusetts Avenue
 Cambridge, MA 02139
 Phone: +1 (617) 871-5414
 robert.ianne...@novartis.com

RE: Cannot connect to SharePoint 2010 instance

2012-11-06 Thread Iannetti, Robert
Yes, the URL and what I enter in the ManifoldCF interface match.


Re: The Schedulers are not starting automatically

2012-11-06 Thread Karl Wright
Hi Anupam,

I'm having difficulty understanding what you posted here, but I will
try to explain the difference between "Rescan documents dynamically"
and "Scan every document once".  You may also find more help in
ManifoldCF in Action, at http://www.manning.com/wright .

The first option causes your job to run forever.  The job runs only in
the schedule windows allotted for it.  It periodically discovers new
documents, and (depending on the crawling model of the connector) may
check for existence or modification of an already-crawled document.
Each document has its own schedule for doing this.

The second option is more likely to be what you want.  Each job
starts, runs, and completes, being sure to run only in the scheduling
windows you provide.  You then run it again, and again (or your job
schedule makes that happen).  It will do the minimal work to keep your
index up to date.

There are significant differences between how you would set up a job
using one model vs. the other.  I strongly suggest you read at least
the first few chapters of the book.

Karl

On Tue, Nov 6, 2012 at 12:35 PM, Anupam Bhattacharya
anupam...@gmail.com wrote:
 My incremental indexing was working previously, but I have messed up a few
 settings, due to which the documents indexed on the previous day get
 deleted and only the new ones show up. I suspect that it is due to the
 setting in List all Jobs > Edit selected job > Scheduling > Schedule type:
 "Rescan documents dynamically" OR "Scan every document once"? Please let me
 know the appropriate settings to index only the new documents in the
 repository.

 After deleting the SOLR index data folder and clearing the table records
 in jobqueue, repohistory, ingeststatus, I found that ManifoldCF scans only
 the remaining new documents, until I go to List Output Connections, click
 View for a SOLR connection, click "Re-ingest all associated documents"
 and confirm with OK. How does it keep track of which documents were ingested
 previously and then fetch only the list of new documents?

 Regards
 Anupam


 On Tue, Aug 14, 2012 at 10:01 AM, Anupam Bhattacharya anupam...@gmail.com
 wrote:

 Thanks..

 There is an option to set the Start Method in the Connection tab of the Job
 settings. I changed it to "Start when the schedule window starts" and
 the problem was resolved.

 Regards
 Anupam


 On Thu, Aug 2, 2012 at 10:59 PM, Karl Wright daddy...@gmail.com wrote:

 The incremental will work the same whether the job is run manually or
 started automatically.

 If you have added the appropriate schedule record to your job, you
 also have to select the run job automatically radio button on one of
 the other job tabs for automatic runs to take place.  I suspect that
 is what you are missing.

 Karl

 On Thu, Aug 2, 2012 at 1:12 PM, Anupam Bhattacharya anupam...@gmail.com
 wrote:
  I have a job which indexes properly, including incremental indexing, if
  initiated/run manually. However, even after adding a specific time to run
  the scheduler process, the job does not start on its own.
 
  What is the ideal configuration for a job which runs automatically
  every day at 12 am and does an incremental re-indexing (only looking for
  those documents which are new OR modified after the last crawl) of the
  repository?
 
  Is it necessary to give the total run time details when adding a
  specific schedule time?
 
  Regards
  Anupam





 --
 Thanks & Regards
 Anupam Bhattacharya




Re: Cannot connect to SharePoint 2010 instance

2012-11-06 Thread Karl Wright
Yes, this can be somewhat tricky.  There are a lot of potential
configurations that could affect this.

First, you want to verify that your IIS is using NTLM authentication,
and that all the web services directories are executable.  This is
critical.

Second, the credentials, in the form of domain\user, may be sensitive
to whether you use a fully-qualified domain name or a shortcut domain
name, e.g. mydomain.novartis.com or just mydomain.  I suggest you try
some combinations.  The other thing you may want to check is whether
the machine you are running ManifoldCF on is known by your domain
controller; you may not be able to authenticate if it is not.

If this doesn't help, and you want to eliminate ManifoldCF's NTLM
implementation from the list of possibilities, I suggest downloading
the curl utility, and trying to fetch a web service listing or wsdl
using it (specifying NTLM of course as the authentication method).  If
that also doesn't work, it's a server-side configuration problem of
some kind.
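
A minimal Python sketch of that curl check (illustrative only): it lists the
two domain\user spellings to try and builds a curl command line for each.
The helper names, the sample account "crawler", the server/sitepath
placeholders and the "?WSDL" suffix on the service URL are assumptions;
--ntlm, --user and --verbose are standard curl options, and because no
password is given on the command line curl prompts for it.

```python
import shlex


def credential_variants(user, short_domain, fq_domain):
    """Return the domain\\user spellings worth trying, short domain name first."""
    return [f"{short_domain}\\{user}", f"{fq_domain}\\{user}"]


def curl_ntlm_command(url, credential):
    """Build the argument list for a verbose curl fetch that uses NTLM.

    The password is deliberately left out so that curl asks for it.
    """
    return ["curl", "--verbose", "--ntlm", "--user", credential, url]


# Hypothetical values; replace with your own server, site path and account.
wsdl_url = "http://server/sitepath/_vti_bin/MCPermissions.asmx?WSDL"
for cred in credential_variants("crawler", "mydomain", "mydomain.novartis.com"):
    # shlex.join quotes the backslash and '?' so the line can be pasted into a shell.
    print(shlex.join(curl_ntlm_command(wsdl_url, cred)))
```

Run the printed commands from the ManifoldCF machine itself, so the test
goes through the same network path and domain membership as the crawler.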

You can also refer to the server-side IIS logs for some additional
info.  But I've found these are not very helpful for authentication
issues.

Let me know if you are still stuck after this; there are other
diagnostics available but they start to get ugly.

Karl

On Tue, Nov 6, 2012 at 2:35 PM, Iannetti, Robert
robert.ianne...@novartis.com wrote:
 Karl,

 I turned on the additional debugging and was able to resolve the 404 issue.

 Now I am getting:
 Crawl user did not authenticate properly, or has insufficient permissions to 
 access http://.xxx.xxx: (401)Unauthorized

 I can log into the SharePoint site from the browser using the same 
 credentials.


 Any Thoughts?

 Thanks
 Bob

 -Original Message-
 From: Karl Wright [mailto:daddy...@gmail.com]
 Sent: Tuesday, November 06, 2012 10:05 AM
 To: user@manifoldcf.apache.org
 Subject: Re: Cannot connect to SharePoint 2010 instance

 Well, you can turn on httpclient wire debugging, as I believe is described in 
 the article URL I sent you before, and then you can see precisely what URL 
 the connector is trying to reach when it accesses the MCPermissions service.

 There's no magic here.  If the connector gets a 404 error back from IIS, 
 either its URL is wrong, or IIS has decided it's not going to serve that page 
 to the client.

 Karl



Re: Cannot connect to SharePoint 2010 instance

2012-11-06 Thread Karl Wright
No, Kerberos is not supported.  This is a limitation of the Apache
commons-httpclient library that we use for communicating with
SharePoint.

It is possible to set up IIS to serve a different port with different
authentication that goes to the same SharePoint instance but is NTLM
protected, not Kerberos protected.  Perhaps you can do this and limit
access to that port to only the ManifoldCF machine.

Karl

On Tue, Nov 6, 2012 at 3:03 PM, Iannetti, Robert
robert.ianne...@novartis.com wrote:
 Karl,

 Our SharePoint sites use Kerberos authentication. Is this supported in
 ManifoldCF?

 Thanks
 Bob



RE: Cannot connect to SharePoint 2010 instance

2012-11-06 Thread Iannetti, Robert
Karl,

If this is not possible, can you recommend any other products to crawl
SharePoint content and index it in Solr?

Thanks
Bob



Re: Cannot connect to SharePoint 2010 instance

2012-11-06 Thread Karl Wright
Hi Bob,

The only products I know of have similar limitations.  The only one I
know of is the SharePoint Google appliance connector, which when I looked
last had exactly the same restriction.  It also has other limitations,
some severe, such as limiting the number of documents you can crawl to
no more than 5000 per library.

We are willing to do a reasonable amount of work to upgrade ManifoldCF
to be able to support Kerberos.  Here's a link which describes the
situation:

http://old.nabble.com/Support-for-Kerberos-SPNEGO-td14564857.html

We currently use a significantly patched version of commons-httpclient 3.1,
which supplied the NTLM implementation for 4.0 that is currently in use.
Our issue is similar to the commons-httpclient team's: we have no good
way of testing all of this, and none of us are security protocol experts.
If you have (or know somebody with) such expertise, and they would be
willing and able to donate their time, I think this problem could be
tackled without too much pain.  Then at least httpclient, given
the right tickets, would be able to connect.

The other issue with Kerberos auth is that I believe it will require a
significant amount of work to allow anything using it to obtain the
tickets from the AD domain controller.  This would obviously require
UI work for all connectors that would support Kerberos.  But that is
something I am willing to attempt if everything else is in place.

Karl



RE: Cannot connect to SharePoint 2010 instance

2012-11-06 Thread Iannetti, Robert
Karl,

On another topic, is there a roadmap for supporting SharePoint 2013?
We are in the process of migrating and were wondering when your ManifoldCF
product would be available to support it.

Thanks
Bob


-Original Message-
From: Karl Wright [mailto:daddy...@gmail.com] 
Sent: Tuesday, November 06, 2012 3:34 PM
To: user@manifoldcf.apache.org
Subject: Re: Cannot connect to SharePoint 2010 instance

Hi Bob,

The only products I know have a similar limitations.  The only one I know is 
the SharePoint google appliance connector, which when I looked last had exactly 
the same restriction.  It also has other limitations, some severe, such as 
limiting the number of documents you can crawl to no more than 5000 per library.

We are willing to do a reasonable amount of work to upgrade ManifoldCF to be 
able to support Kerberos.  Here's a link which describes the
situation:

http://old.nabble.com/Support-for-Kerberos-SPNEGO-td14564857.html

We currently use a significantly patched version of commons-httpclient 3.1, 
which supplied the NTLM implementation for 4.0 that is currently in use.
Our issue is similar to the commons-httpclient team's, which is that we have no 
good way of testing all of this, and none of us are security protocol experts.  
If you have (or know somebody with) such expertise, who would be willing/able to 
donate their time, I think this problem could be tackled without too much pain.  
So at least httpclient, given the right tickets, would be able to connect.

The other issue with Kerberos auth is that I believe it will require a 
significant amount of work to allow anything using it to obtain the tickets 
from the AD domain controller.  This would obviously require UI work for all 
connectors that would support Kerberos.  But that is something I am willing to 
attempt if everything else is in place.

Karl


On Tue, Nov 6, 2012 at 3:11 PM, Iannetti, Robert robert.ianne...@novartis.com 
wrote:
 Karl,

 If this is not possible, can you recommend any other products to crawl 
 SharePoint content and index it in Solr?

 Thanks
 Bob


 -Original Message-
 From: Karl Wright [mailto:daddy...@gmail.com]
 Sent: Tuesday, November 06, 2012 3:10 PM
 To: user@manifoldcf.apache.org
 Subject: Re: Cannot connect to SharePoint 2010 instance

 No, Kerberos is not supported.  This is a limitation of the Apache 
 commons-httpclient library that we use for communicating with SharePoint.

 It is possible to set up IIS to serve a different port with different 
 authentication that goes to the same SharePoint instance but is NTLM 
 protected, not Kerberos protected.  Perhaps you can do this and limit access 
 to that port to only the ManifoldCF machine.

 Karl

 On Tue, Nov 6, 2012 at 3:03 PM, Iannetti, Robert 
 robert.ianne...@novartis.com wrote:
 Karl,

 Our SharePoint sites use Kerberos authentication.  Is this supported in 
 ManifoldCF?

 Thanks
 Bob


 -Original Message-
 From: Karl Wright [mailto:daddy...@gmail.com]
 Sent: Tuesday, November 06, 2012 2:50 PM
 To: user@manifoldcf.apache.org
 Subject: Re: Cannot connect to SharePoint 2010 instance

 Yes, this can be somewhat tricky.  There are a lot of potential 
 configurations that could affect this.

 First, you want to verify that your IIS is using NTLM authentication, and 
 that all the web services directories are executable.  This is critical.

 Second, the credentials, in the form of domain\user, may be sensitive to 
 whether you use a fully-qualified domain name or a shortcut domain name, 
 e.g. mydomain.novartis.com or just mydomain.  I suggest you try some 
 combinations.  The other thing you may want to check is whether the machine 
 you are running ManifoldCF on is known by your domain controller; you may 
 not be able to authenticate if it is not.
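
 The "try some combinations" advice above can be made systematic.  What 
 follows is only a minimal sketch: the helper credential_variants and the 
 user name crawluser are hypothetical, and the domain names are the examples 
 from the message.  It lists the domain\user spellings to try in the 
 connection's credentials.

```python
def credential_variants(user, short_domain, fq_domain):
    """Return the domain\\user spellings worth trying, short form first."""
    # One candidate per domain spelling: the shortcut name and the
    # fully-qualified name, each joined to the user with a backslash.
    return [f"{domain}\\{user}" for domain in (short_domain, fq_domain)]


# "crawluser" is a made-up account name used only for illustration.
for candidate in credential_variants(
    "crawluser", "mydomain", "mydomain.novartis.com"
):
    print(candidate)
```

 Each candidate would be entered as the user name in turn, keeping the 
 password unchanged, until one authenticates.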

 If this doesn't help, and you want to eliminate ManifoldCF's NTLM 
 implementation from the list of possibilities, I suggest downloading the 
 curl utility, and trying to fetch a web service listing or wsdl using it 
 (specifying NTLM of course as the authentication method).  If that also 
 doesn't work, it's a server-side configuration problem of some kind.
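
 A curl check along these lines might look like the sketch below.  It is an 
 illustration only: the URL reuses the placeholder server/sitepath form from 
 earlier in this thread, the ?WSDL suffix assumes the usual .asmx behavior, 
 and the helper ntlm_curl_command and the account mydomain\crawluser are 
 hypothetical.  The curl options used (--ntlm, -u, -v) are standard.

```python
import shlex


def ntlm_curl_command(url, domain_user):
    """Build a curl argument list that requests `url` with NTLM authentication."""
    # --ntlm selects NTLM authentication; -u with a user name and no
    # password makes curl prompt for the password; -v prints the request
    # and response headers so the status code is visible.
    return ["curl", "--ntlm", "-v", "-u", domain_user, url]


# Placeholder URL: substitute the real server name and site path.
wsdl_url = "http://server/sitepath/_vti_bin/MCPermissions.asmx?WSDL"

# Print a shell-quoted command line that can be pasted into a terminal.
print(shlex.join(ntlm_curl_command(wsdl_url, "mydomain\\crawluser")))
```

 If curl also gets a 401 with credentials that work in the browser, that 
 points at the server-side configuration rather than at ManifoldCF.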

 You can also refer to the server-side IIS logs for some additional info.  
 But I've found these are not very helpful for authentication issues.

 Let me know if you are still stuck after this; there are other diagnostics 
 available but they start to get ugly.

 Karl

 On Tue, Nov 6, 2012 at 2:35 PM, Iannetti, Robert 
 robert.ianne...@novartis.com wrote:
 Karl,

 I turned on the additional debugging and was able to resolve the 404 issue.

 Now I am getting:
 Crawl user did not authenticate properly, or has insufficient 
 permissions to access http://.xxx.xxx: (401)Unauthorized

 I can log into the SharePoint site from the browser using the same 
 credentials.


 Any Thoughts?

 Thanks
 Bob

 -Original Message-
 From: Karl Wright [mailto:daddy...@gmail.com]
 Sent: Tuesday, November 06, 2012 10:05 AM
 To: user@manifoldcf.apache.org
 

Re: Cannot connect to SharePoint 2010 instance

2012-11-06 Thread Karl Wright
Hi Bob,

That depends very strongly on whether SharePoint 2013 continues the
Microsoft tradition of breaking web services that used to work. :-)

Seriously, we need three things to develop a SharePoint 2013 solution:
(1) A stable release (a beta is not sufficient because Microsoft is
famous for changing things in a major way between beta and release);
(2) a benevolent client with sufficient patience to try things out
that we develop in their environment, and (3) enough time so that
we're not on the bleeding edge and that other people have run into
most of the sticky problems first.  We're volunteers here and we all
have day jobs, so we mostly can't afford to be pounding away at brick
walls on our own.

It could be the case that everything just works, in which case the
development is trivial.  We'll have to see.

Karl

On Tue, Nov 6, 2012 at 3:37 PM, Iannetti, Robert
robert.ianne...@novartis.com wrote:
 Karl,

 On another topic, is there a roadmap for supporting SharePoint 2013?
 We are in the process of migrating and were wondering when your ManifoldCF 
 product would be available to support it.

 Thanks
 Bob


RE: Cannot connect to SharePoint 2010 instance

2012-11-06 Thread Iannetti, Robert
Karl,

That sounds reasonable. I am having my SP Admin set up the NTLM SharePoint 
instance described below.  I will let you know how it works.

BTW SP 2013 RTM has been released so we can cross #1 off the list :)

Thanks
Bob

-Original Message-
From: Karl Wright [mailto:daddy...@gmail.com] 
Sent: Tuesday, November 06, 2012 3:47 PM
To: user@manifoldcf.apache.org
Subject: Re: Cannot connect to SharePoint 2010 instance

Hi Bob,

That depends very strongly on whether SharePoint 2013 continues the Microsoft 
tradition of breaking web services that used to work. :-)

Seriously, we need three things to develop a SharePoint 2013 solution:
(1) A stable release (a beta is not sufficient because Microsoft is famous for 
changing things in a major way between beta and release);
(2) a benevolent client with sufficient patience to try things out that we 
develop in their environment, and (3) enough time so that we're not on the 
bleeding edge and that other people have run into most of the sticky problems 
first.  We're volunteers here and we all have day jobs, so we mostly can't 
afford to be pounding away at brick walls on our own.

It could be the case that everything just works, in which case the development 
is trivial.  We'll have to see.

Karl

Re: Cannot connect to SharePoint 2010 instance

2012-11-06 Thread Karl Wright
If you want, we can create a ticket to cover SharePoint 2013 work.  If
you want to attempt a sanity check, email me (personally, at
daddy...@gmail.com) the Microsoft.SharePoint.dll, and I can set up a
ManifoldCF-Sharepoint-2013 plugin.  If I can build that, then the next
step would be just trying it all out and seeing where it fails.

Karl

On Tue, Nov 6, 2012 at 3:49 PM, Iannetti, Robert
robert.ianne...@novartis.com wrote:
 Karl,

 That sounds reasonable. I am having my SP Admin set up the NTLM SharePoint 
 instance described below.  I will let you know how it works.

 BTW SP 2013 RTM has been released so we can cross #1 off the list :)

 Thanks
 Bob

RE: Cannot connect to SharePoint 2010 instance

2012-11-06 Thread Iannetti, Robert
Karl,

Let's try to get the 2010 connection working first before we proceed to 2013.

Thanks
Bob

-Original Message-
From: Karl Wright [mailto:daddy...@gmail.com] 
Sent: Tuesday, November 06, 2012 3:59 PM
To: user@manifoldcf.apache.org
Subject: Re: Cannot connect to SharePoint 2010 instance

If you want, we can create a ticket to cover SharePoint 2013 work.  If you want 
to attempt a sanity check, email me (personally, at
daddy...@gmail.com) the Microsoft.SharePoint.dll, and I can set up a
ManifoldCF-Sharepoint-2013 plugin.  If I can build that, then the next step 
would be just trying it all out and seeing where it fails.

Karl

 On Tue, Nov 6, 2012 at 3:03 PM, Iannetti, Robert 
 robert.ianne...@novartis.com wrote:
 Karl,

 Our SharePoint sites use Kerberos authentication is this