Hi Daniel,

If you are seeing fetches in the Simple History that include the wiki
URLs you are trying to capture, the SharePoint job is likely correct.
Are you seeing "Document ingest" activities for the same documents?
If so, they are being sent to Solr, and you'd have to look into the
Solr configuration to figure out why they aren't being indexed.
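
To check the Solr side directly, you could ask the index whether it
holds one of the wiki URLs you saw in the Simple History. The sketch
below is only a starting point, not something I have run against your
setup: "solr-host" and the wiki URL are placeholders, the port, web
application name, and core come from the output connection settings
you posted earlier, and it assumes the document URL ends up in the
"id" field and that the JSON response writer is enabled.

```python
import json
import urllib.parse
import urllib.request


def build_select_url(server, port, webapp, core, document_url=None):
    """Build a Solr select URL that only asks for a hit count (rows=0)."""
    if document_url is None:
        # Match-all query: useful for comparing totals before and after a crawl.
        query = "*:*"
    else:
        # Assumption: the document URL sent by ManifoldCF is stored in "id".
        escaped = document_url.replace("\\", "\\\\").replace('"', '\\"')
        query = 'id:"{}"'.format(escaped)
    params = urllib.parse.urlencode({"q": query, "rows": 0, "wt": "json"})
    return "http://{}:{}/{}/{}/select?{}".format(server, port, webapp, core, params)


def count_matches(select_url):
    """Fetch the URL and return numFound (needs a reachable Solr instance)."""
    with urllib.request.urlopen(select_url) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]["numFound"]


# Placeholder host and wiki URL; only the URL is printed here, no request is made.
print(build_select_url("solr-host", 8080, "solr", "collection1",
                       "http://sharepoint-host/Wiki/Home.aspx"))
```

Calling count_matches() on that URL from a machine that can reach
Solr, once before and once after a crawl, should show whether the
wiki pages are arriving at all.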

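On the "*Wiki*" path question: the wildcard behavior described in the
end-user documentation quoted further down ("*" matches any run of
characters, "?" matches exactly one) can be tried out with a few lines
of Python. This is only an approximation for experimenting with
patterns. It is not the connector's actual matcher, it ignores rule
types and implied rules, it assumes case-sensitive matching, and the
function names are made up for the example.

```python
import re


def rule_to_regex(rule_path):
    """Translate a path rule into a regular expression."""
    parts = []
    for ch in rule_path:
        if ch == "*":
            parts.append(".*")   # any number of any kind of characters
        elif ch == "?":
            parts.append(".")    # exactly one character of any kind
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def rule_matches(rule_path, sharepoint_path):
    """Return True when the whole SharePoint path matches the rule path."""
    return rule_to_regex(rule_path).fullmatch(sharepoint_path) is not None


print(rule_matches("/MySite/MyLibrary/*", "/MySite/MyLibrary/MyFile"))
print(rule_matches("/MySite/Wiki?/Home.aspx", "/MySite/Wiki1/Home.aspx"))
print(rule_matches("/Shared Documents", "/Shared Documents/MyFile"))
```

The last call prints False because a path with no wildcard has to
match exactly, which is why a bare library path does not cover the
files underneath it.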

On Sun, Feb 12, 2012 at 11:37 AM, Silvia, Daniel [USA]
<silvia_dan...@bah.com> wrote:
> Hi Karl
> Quick question regarding SharePoint Wikis and ingesting them into Solr.
> I have been trying to get the Wikis, created in SharePoint, to be ingested
> into Solr. I am able to see the Wikis in the logging where the SharePoint
> Connector pulls everything from the site; however, I do not see the Wiki
> content in the Solr instance. When creating a job to run, do I need to
> indicate a path similar to "*Wiki*" for the entire site, or do I need to
> configure the Solr metadata in the job to capture the "WikiField" element
> in the XML being passed to the Solr connector?
> Thanks for your help.
> Dan
> ________________________________________
> From: Karl Wright [daddy...@gmail.com]
> Sent: Tuesday, January 31, 2012 10:52 AM
> To: Silvia, Daniel [USA]
> Cc: connectors-user@incubator.apache.org
> Subject: Re: ManifoldCF's dist/shapoint-integration dir
> It's been a while since I've set up a SharePoint job but I think what
> you are missing is a file rule (instead of just a library rule).
> Here's what the end-user documentation says on the matter:
> "Each rule consists of a path, a rule type, and an action. The actions
> are "Include" and "Exclude". The rule type tells the connection what
> kind of SharePoint entity it is allowed to exactly match. For example,
> a "File" rule will only exactly match SharePoint paths that represent
> files - it cannot exactly match sites or libraries. The path itself is
> just a sequence of characters, where the "*" character has the special
> meaning of being able to match any number of any kind of characters,
> and the "?" character matches exactly one character of any kind.
> The rule matcher extends strict, exact matching by introducing a
> concept of implicit inclusion rules. If your rule action is "Include",
> and you specify (say) a "File" rule, the matcher presumes implicit
> inclusion rules for the corresponding site and library. So, if you
> create an "Include File" rule that matches (for example)
> "/MySite/MyLibrary/MyFile", there is an implied "Site Include" rule
> for "/MySite", and an implied "Library Include" rule for
> "/MySite/MyLibrary". Similarly, if you create a "Library Include"
> rule, there is an implied "Site Include" rule that corresponds to it.
> Note that these shortcuts only apply to "Include" rules - there are
> no corresponding implied "Exclude" rules."
> What this means is that you should probably be declaring file rules
> with "*" as the file name for each library, rather than a library
> rule.  You might want to just try this.  If you still have trouble,
> you can try setting the "org.apache.manifoldcf.connectors" property to
> "DEBUG" in the properties.xml file and restarting ManifoldCF before
> your next crawl.  The manifoldcf.log file will then have output
> describing the decisions the SharePoint connector made about each
> site, library, file, or folder it encountered.
> Thanks,
> Karl
> On Tue, Jan 31, 2012 at 10:27 AM, Silvia, Daniel [USA]
> <silvia_dan...@bah.com> wrote:
>> Hi Karl
>> The Path Rules are :
>> Path Match: /Shared Documents
>> Type: library
>> Action: include
>> Path Match: /IDD/Shared Documents
>> Type: library
>> Action: include
>> Path Match: /IDD/Documents
>> Type: library
>> Action: include
>> Path Match: /manifoldcf/Shared Documents
>> Type: library
>> Action: include
>> I hope this helps.
>> I really appreciate your help.
>> ________________________________________
>> From: Karl Wright [daddy...@gmail.com]
>> Sent: Tuesday, January 31, 2012 10:01 AM
>> To: Silvia, Daniel [USA]
>> Cc: connectors-user@incubator.apache.org
>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>> "When I select only the fetch activity, I don't see anything in the
>> events, when I select the Document Ingest activity, I don't see
>> anything in the events."
>> So either you've already run the job and the documents were accessed
>> the first time (and won't be accessed again until they change), or the
>> problem is likely that your SharePoint Path Rules are not including
>> any documents.  It would be very helpful at this point to include a
>> screen shot of the job you've created.  Since you are not on the net,
>> perhaps you can jot down your SharePoint path rules for me to have a
>> look at, as they are displayed when you view the job.
>> Thanks,
>> Karl
>> On Tue, Jan 31, 2012 at 9:44 AM, Silvia, Daniel [USA]
>> <silvia_dan...@bah.com> wrote:
>>> Hi Karl
>>> Ok, I have created a new job and ran the job and went to the Simple History
>>> Report.
>>> I see the Events. If all the Activities in the Simple History Report,
>>> Document Deletion(SolrPipeline), Document Ingest(SolrPipeline), and Fetch,
>>> are selected, I see a start job and end job for events. When I get to the
>>> Simple History Report I can select the "Connection", but I don't have an
>>> option to select the Activities until I run the report first.
>>> When I select only the fetch activity, I don't see anything in the events;
>>> when I select the Document Ingest activity, I don't see anything in the
>>> events.
>>> My solr output connection has the following information:
>>> Protocol: http
>>> Server: "the server name"
>>> Port: 8080 (we are running Solr on JBoss port 8080)
>>> Web Application Name: solr
>>> Core Name: collection1
>>> Update Handler: update/extract
>>> Remove Handler: /update
>>> Status Handler: /admin/ping
>>> ________________________________________
>>> From: Karl Wright [daddy...@gmail.com]
>>> Sent: Tuesday, January 31, 2012 9:00 AM
>>> To: Silvia, Daniel [USA]; connectors-user@incubator.apache.org
>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>> Ok, let's do one thing at a time.
>>> First:
>>> "For the Path tab where there are Path Rules, are these the paths we
>>> want ManifoldCF to follow? Each site, and each Library like Documents
>>> and Shared Documents. And in the Metadata tab, this is the tab where
>>> you indicate for each "Site" and "Library" you want to include
>>> specific metadata or include all metadata?"
>>> For SharePoint, there are Path Rules and Metadata Rules.  The Path
>>> Rules describe what documents you want to include or exclude.  The
>>> Metadata Rules describe what metadata you want to include or exclude.
>>> For right now I would ignore the Metadata Rules and just make sure you
>>> have Path Rules that mean that you have included documents.
>>> "As I run the report, I see "Documents", "Active", and "Processed"
>>> where the numbers change under the "Active" column as well as the
>>> "Document" and "Processed" column (these just get larger, where Active
>>> changes). "
>>> This "report" we actually call the Job Status screen.  The fact that
>>> the numbers get larger and the job doesn't just end indicates that you
>>> are successfully crawling your SharePoint, and you have set up the job
>>> to include at least some documents.  This is good news.  However, this
>>> is NOT the "Simple History" report I was alluding to earlier.  To get
>>> to that report, click on the "Simple History" link on the left-hand
>>> navigation area.  This report will show the events of your choice
>>> (default - ALL recorded events) over a given time window (default: the
>>> last hour).  If you've done this right you should at least see a "Job
>>> start" event.  The events you are most interested in are the "fetch"
>>> (which describes all attempts to fetch documents from SharePoint) and
>>> "document ingest", which describes attempts to get documents into Solr.
>>> You can refresh the displayed events by clicking the "Go" button in
>>> the middle of the screen whenever you wish.
>>> I'd like you to delete your job, create it again, and start it.  Then,
>>> while it is running, I'd like you to go to the "Simple History"
>>> screen, and select the appropriate connection (your SharePoint
>>> repository connection), and click the "Go" button.  So as not to skip
>>> anything basic:
>>> (1) What event types do you see?
>>> (2) Are there "fetch" events?
>>> (3) Are there "document ingest" events?
>>> If you see no "fetch" events, that implies you have either not
>>> specified any documents to include in your job, OR your Solr
>>> connection is configured to reject too many document types so they are
>>> all getting filtered out.
>>> If you see "document ingest" events, but those have errors, it implies
>>> that the configuration of your Solr connection is incorrect and does
>>> not match the way your Solr is configured.  If you send me a specific
>>> error code and/or text I can help you figure out what is happening.
>>> If you see "document ingest" events with NO errors, but the Solr
>>> instance is not getting documents, you are describing an impossible
>>> situation.  While your Solr instance may not be configured to have the
>>> Extracting Update Handler active, or it may be at a different URL than
>>> what you pointed at, that would definitely yield errors or
>>> notifications in the Simple History.
>>> Please let me know what you actually see.
>>> Karl
>>> On Tue, Jan 31, 2012 at 7:53 AM, Silvia, Daniel [USA]
>>> <silvia_dan...@bah.com> wrote:
>>>> Hi Karl
>>>> I am trying to figure out why I can't see anything being indexed into our 
>>>> Solr index. I was looking at another post where you were working with 
>>>> "Martijn" and that individual was not able to see info getting into Solr. 
>>>> In the report that I have set up, I have included all metadata associated
>>>> to each site, Shared Documents, and Documents. In the Solr Field Mapping, I
>>>> am associating metadata fields that are indicated in the MetaData tab to 
>>>> fields that exist in our solr index.
>>>> For the Path tab where there are Path Rules, are these the paths we want 
>>>> ManifoldCF to follow? Each site, and each Library like Documents and 
>>>> Shared Documents. And in the Metadata tab, this is the tab where you 
>>>> indicate for each "Site" and "Library" you want to include specific 
>>>> metadata or include all metadata?
>>>> As I run the report, I see "Documents", "Active", and "Processed" where the
>>>> numbers change under the "Active" column as well as the "Document" and 
>>>> "Processed" column (these just get larger, where Active changes). While I 
>>>> was researching why I may not be seeing something over on the Solr side, I 
>>>> saw your communication with another individual indicating that I should 
>>>> see something like literal.xxx=yyy in the Solr log. This is an older post 
>>>> so there may be something else I should see. But the only thing I see when
>>>> I look at the Solr log is "[ ] webapp=/solr path=/update/extract 
>>>> params={commit=true} status=0 QTime=0".
>>>> Any ideas?
>>>> Thanks
>>>> ________________________________________
>>>> From: Karl Wright [daddy...@gmail.com]
>>>> Sent: Monday, January 30, 2012 10:40 AM
>>>> To: Silvia, Daniel [USA]
>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>> The default time range for the Simple History is the last hour.  I
>>>> suspect you are unaware of that.  If you want a different time range
>>>> you will have to modify the start and end time pulldowns accordingly.
>>>> Karl
>>>> On Mon, Jan 30, 2012 at 10:34 AM, Silvia, Daniel [USA]
>>>> <silvia_dan...@bah.com> wrote:
>>>>> Hi Karl
>>>>> I am looking at the Simple History in the UI and there isn't much to see,
>>>>> unless I am not getting what I am supposed to.  I see the Start Time,
>>>>> Activity, Identifier, Bytes, and Time columns, but I don't get anything
>>>>> for Result Code or Result Description. I looked in the documentation and
>>>>> we should be getting something in those fields, I believe.
>>>>> Anyway, I will look through the mail list to see what I can find.
>>>>> Thanks for the help.
>>>>> Dan
>>>>> ________________________________________
>>>>> From: Karl Wright [daddy...@gmail.com]
>>>>> Sent: Monday, January 30, 2012 8:24 AM
>>>>> To: Silvia, Daniel [USA]
>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>> So just to be clear, I'm NOT talking about the ManifoldCF logging.
>>>>> For the Solr connector you probably won't need to turn that on; it's
>>>>> pretty simple and you can look at the Simple History in the UI to see
>>>>> what the request and response look like from Solr.  I was talking
>>>>> instead about Solr logging - when you run the Solr Webapp, by default
>>>>> all requests against the Extracting Update Handler are logged to
>>>>> standard error, so you will see them appear in the process window in
>>>>> which Solr is running.
>>>>> My suggestion to you is to first have a look at the Simple History for
>>>>> the job you are trying to run.  If you are getting back 500 errors
>>>>> from Solr, that means you have not set up Solr properly to work with
>>>>> ManifoldCF.  In recent versions of Solr, the example works fine out of
>>>>> the box, but when you try to deploy any other way you are often
>>>>> missing the jar that contains the extracting update handler, so of
>>>>> course nothing works.  Several people on the connectors-user list have
>>>>> run into this and if you search the list (go to the ManifoldCF site
>>>>> and click through to the mailing list page and there are links at the
>>>>> bottom for this purpose) you will find posts that describe exactly
>>>>> what is wrong and how to fix it.
>>>>> Hope this helps.
>>>>> Karl
>>>>> On Sun, Jan 29, 2012 at 2:30 PM, Silvia, Daniel [USA]
>>>>> <silvia_dan...@bah.com> wrote:
>>>>>> Yeah, but for some reason the logging isn't coming through. The logging is
>>>>>> set for info and I will have to change the logging level to DEBUG.
>>>>>> Thanks again for your help.
>>>>>> ________________________________________
>>>>>> From: Karl Wright [daddy...@gmail.com]
>>>>>> Sent: Friday, January 27, 2012 5:06 PM
>>>>>> To: Silvia, Daniel [USA]
>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>> Actually, the best thing for debugging the Solr connection is looking
>>>>>> at standard-output on the Solr instance.  You will see all the posts
>>>>>> that are made and what the arguments were.  Also, this is the kind of
>>>>>> question you'd get a lot of benefit from posting to the list.  The
>>>>>> end-user documentation I pointed you at before describes some of this
>>>>>> but the Solr connector has grown beyond the doc to some extent at this
>>>>>> point.
>>>>>> Karl
>>>>>> On Fri, Jan 27, 2012 at 9:51 AM, Silvia, Daniel [USA]
>>>>>> <silvia_dan...@bah.com> wrote:
>>>>>>> Hi Karl
>>>>>>> Is there a log level other than Wire-level debugging to view log
>>>>>>> statements for trying to send output to a Solr instance in the Jobs
>>>>>>> List/Creation section? We are having an issue getting content to Solr. 
>>>>>>> Is there a document anywhere which defines the fields for the Jobs 
>>>>>>> sections for the Solr Field Mapping tab and the Paths and MetaData tabs?
>>>>>>> Thanks
>>>>>>> Dan
>>>>>>> ________________________________________
>>>>>>> From: Karl Wright [daddy...@gmail.com]
>>>>>>> Sent: Thursday, January 26, 2012 10:44 AM
>>>>>>> To: Silvia, Daniel [USA]
>>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>> I am afraid I don't know the answer to that.  I'm sure it's infinitely
>>>>>>> configurable but it's not clear what the SharePoint web services need
>>>>>>> to do under the hood, so anything I tell you would be just a guess.
>>>>>>> Karl
>>>>>>> On Thu, Jan 26, 2012 at 10:43 AM, Silvia, Daniel [USA]
>>>>>>> <silvia_dan...@bah.com> wrote:
>>>>>>>> Hi Karl
>>>>>>>> One more question. Do you know the minimum permissions needed to crawl
>>>>>>>> the SharePoint instance and all sites under the instance? The
>>>>>>>> individual who set my permissions set me up as the "site collection
>>>>>>>> admin" for the topmost site. Is there a specific admin role that
>>>>>>>> would work, so that the user crawling the SharePoint instance does
>>>>>>>> not have to be set up as "Farm Admin"?
>>>>>>>> Thanks
>>>>>>>> ________________________________________
>>>>>>>> From: Karl Wright [daddy...@gmail.com]
>>>>>>>> Sent: Thursday, January 26, 2012 9:53 AM
>>>>>>>> To: Silvia, Daniel [USA]
>>>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>>> Good news!  Please keep in touch; we'd like to hear how things work
>>>>>>>> for you (it helps keep the software fresh ;-) ).
>>>>>>>> Karl
>>>>>>>> On Thu, Jan 26, 2012 at 9:48 AM, Silvia, Daniel [USA]
>>>>>>>> <silvia_dan...@bah.com> wrote:
>>>>>>>>> Hey Karl
>>>>>>>>> (1) was the issue. When requesting access to the SharePoint instance 
>>>>>>>>> I indicated that I needed to be able to crawl SharePoint, I guess the 
>>>>>>>>> problem was on my end indicating that I also needed privileges to 
>>>>>>>>> crawl the site.
>>>>>>>>> Anyway, thank you for your help. When I change the SharePoint version 
>>>>>>>>> to v 3 I get a message indicating "Connection Working".
>>>>>>>>> Appreciate the help.
>>>>>>>>> Dan
>>>>>>>>> ________________________________________
>>>>>>>>> From: Karl Wright [daddy...@gmail.com]
>>>>>>>>> Sent: Thursday, January 26, 2012 9:19 AM
>>>>>>>>> To: Silvia, Daniel [USA]
>>>>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>>>> The error message "axisFault=Server, detail=Server was unable to
>>>>>>>>> process request --> Requested Registry access is not allowed" is Axis
>>>>>>>>> interpreting an error message from SharePoint.  What it is saying is
>>>>>>>>> that the user you are trying to crawl with is unable to read the
>>>>>>>>> SharePoint machine's registry but needs to.  There are two possible
>>>>>>>>> causes for this:
>>>>>>>>> (1) The user you gave doesn't have enough permissions to crawl 
>>>>>>>>> SharePoint
>>>>>>>>> (2) When you installed the SharePoint MCPermissions plugin, you
>>>>>>>>> installed it logged in as a user that did not have enough permissions
>>>>>>>>> to do what it needs to do.
>>>>>>>>> You can tell the difference between the two by selecting "SharePoint
>>>>>>>>> 2.0" in the SharePoint version pulldown.  If a connection saved in
>>>>>>>>> this way says "Connection working", it means that the MCPermissions
>>>>>>>>> plugin has the permission problem, not your user.
>>>>>>>>> Karl
>>>>>>>>> On Thu, Jan 26, 2012 at 9:14 AM, Silvia, Daniel [USA]
>>>>>>>>> <silvia_dan...@bah.com> wrote:
>>>>>>>>>> Hi Karl
>>>>>>>>>> When I try to use option (1) and don't put anything in the Site 
>>>>>>>>>> field, I get an error message "axisFault=Server, detail=Server was 
>>>>>>>>>> unable to process request --> Requested Registry access is not 
>>>>>>>>>> allowed" and when I put a "/" in the site field I get a GUI error
>>>>>>>>>> indicating that the site field can't end with a "/".
>>>>>>>>>> Anyway, do you have any ideas? Or maybe the SharePoint instance is
>>>>>>>>>> not configured properly for us to crawl?
>>>>>>>>>> Thanks
>>>>>>>>>> ________________________________________
>>>>>>>>>> From: Karl Wright [daddy...@gmail.com]
>>>>>>>>>> Sent: Thursday, January 26, 2012 8:52 AM
>>>>>>>>>> To: Silvia, Daniel [USA]
>>>>>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>>>>> SharePoint has two kinds of site:
>>>>>>>>>> (1) the root site, which can be reached by the path http://server:port
>>>>>>>>>> (2) a number of sites under the 'virtual path', with URLs of the form:
>>>>>>>>>> http://server:port/something/sitename
>>>>>>>>>> The "something" is, by default, the string "site", so
>>>>>>>>>> http://server:port/site/xyz might be the URL of one such virtual site.
>>>>>>>>>> The form of the "site" field in the SharePoint connection for the
>>>>>>>>>> first is either blank or "/" (can't remember which right now), and the
>>>>>>>>>> form of the "site" field for the second is "/site/xyz".  On no account
>>>>>>>>>> does the connector expect to see default.aspx attached to that path,
>>>>>>>>>> so you should not do this; it cannot work.
>>>>>>>>>> FWIW, my recommendation to try setting the connection type to
>>>>>>>>>> "SharePoint 2.0" was to rule out any possible installation issue with
>>>>>>>>>> the ManifoldCF sharepoint plugin.  The connection check for 2.0 does
>>>>>>>>>> not look for it; only the connection check for 3.0 does.
>>>>>>>>>> Karl
>>>>>>>>>> On Thu, Jan 26, 2012 at 8:41 AM, Silvia, Daniel [USA]
>>>>>>>>>> <silvia_dan...@bah.com> wrote:
>>>>>>>>>>> Hey Karl
>>>>>>>>>>> I am also getting an "HTTP Error 401.2: Unauthorized: Access is 
>>>>>>>>>>> denied due to server configuration" when setting the Site field to 
>>>>>>>>>>> /default.aspx. Do most Sharepoint instances have the urls set to 
>>>>>>>>>>> something like http://server:port/sites/...... instead of 
>>>>>>>>>>> http://server:port/? When I use the "/default.aspx" I see in the 
>>>>>>>>>>> log files that ManifoldCF is trying to go to the Lists.asmx service 
>>>>>>>>>>> with the url http://server:port/default.aspx/_vti_bin/Lists.asmx, 
>>>>>>>>>>> where nothing is found.
>>>>>>>>>>> As you can tell I am not much of a SharePoint user or installer.
>>>>>>>>>>> Also, I don't think the issue is with the connector in ManifoldCF, 
>>>>>>>>>>> I am just trying to
>>>>>>>>>>> ________________________________________
>>>>>>>>>>> From: Silvia, Daniel [USA]
>>>>>>>>>>> Sent: Thursday, January 26, 2012 7:23 AM
>>>>>>>>>>> To: Karl Wright
>>>>>>>>>>> Subject: RE: ManifoldCF's dist/shapoint-integration dir
>>>>>>>>>>> Hey Karl
>>>>>>>>>>> The issue I am having is that the Sharepoint instance url is 
>>>>>>>>>>> something like http://server:port/default.aspx. If I don't put 
>>>>>>>>>>> anything in the site field I get a message indicating "Requested 
>>>>>>>>>>> Registry Access is not allowed". I was putting "/default.aspx" as
>>>>>>>>>>> my Site field which I believe may have been the issue. However, 
>>>>>>>>>>> what do you put in your Site field when the site is the top most 
>>>>>>>>>>> site, as in http://server:port/default.aspx?
>>>>>>>>>>> I would love to send you the log messages, but I am working on a 
>>>>>>>>>>> network which is not connected to the outside.
>>>>>>>>>>> Thanks for your help.
>>>>>>>>>>> ________________________________________
>>>>>>>>>>> From: Karl Wright [daddy...@gmail.com]
>>>>>>>>>>> Sent: Wednesday, January 25, 2012 6:12 PM
>>>>>>>>>>> To: Silvia, Daniel [USA]
>>>>>>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>>>>>> Daniel,
>>>>>>>>>>> FWIW, I can help you diagnose the issue, but to do so you really need
>>>>>>>>>>> to give me some concrete data.  I'm happy to grovel over the whole
>>>>>>>>>>> wire log if you feel you can send it to me; something that may not
>>>>>>>>>>> seem important to you will likely stand out strongly to me.  I can,
>>>>>>>>>>> for example, see whether you are getting back HTML because of an
>>>>>>>>>>> authentication error.  And if you ARE getting back valid
>>>>>>>>>>> SOAP, I would then be sure that something was wrong with the Axis
>>>>>>>>>>> client configuration, and I could pursue that here with the data
>>>>>>>>>>> provided.  The problem with software like SharePoint running on IIS is
>>>>>>>>>>> that it can be configured a nearly infinite number of ways, so
>>>>>>>>>>> diagnosis is more of an art than a science.  I strongly suspect that
>>>>>>>>>>> you're laboring under a pretty straightforward misconception which is
>>>>>>>>>>> likely blocking progress, rather than there being an issue with the
>>>>>>>>>>> SharePoint connector itself.  But I can't tell that without more
>>>>>>>>>>> detailed communication.
>>>>>>>>>>> Also, you mentioned that the Lists.asmx service was right where you
>>>>>>>>>>> expected it to be.  Have you read the SharePoint Connector part of the
>>>>>>>>>>> end-user documentation?  To wit:
>>>>>>>>>>> "Select the server protocol, and enter the server name and port, based
>>>>>>>>>>> on what you recorded from the URL for your SharePoint site. For the
>>>>>>>>>>> "Site path" field, type in the portion of the root site URL that
>>>>>>>>>>> includes everything after the server and port, except for the final
>>>>>>>>>>> "aspx" file. For example, if the SharePoint URL is
>>>>>>>>>>> "http://myserver:81/sites/somewhere/index.asp", the site path would be
>>>>>>>>>>> "/sites/somewhere"."  The Lists.asmx service in this example would be
>>>>>>>>>>> expected to be found at
>>>>>>>>>>> "http://myserver:81/sites/somewhere/_vti_bin/Lists.asmx".  And the URL
>>>>>>>>>>> you would start with would be the URL you see in the browser when you
>>>>>>>>>>> log into the SharePoint web client and go to the site you wish to
>>>>>>>>>>> crawl.  Is this what you are doing?
>>>>>>>>>>> Thanks again,
>>>>>>>>>>> Karl
>>>>>>>>>>> On Wed, Jan 25, 2012 at 12:33 PM, Karl Wright <daddy...@gmail.com> 
>>>>>>>>>>> wrote:
>>>>>>>>>>>> The code that parses the SOAP response is Apache Axis.  This hasn't
>>>>>>>>>>>> changed in several years.
>>>>>>>>>>>> Can you answer the following questions:
>>>>>>>>>>>> (1) When the SharePoint connector makes a request to SharePoint, is
>>>>>>>>>>>> the response HTML, or is it XML?  Does it have an XML header which
>>>>>>>>>>>> describes a Microsoft XML namespace?  It sure sounds like it is
>>>>>>>>>>>> responding with HTML.  The SharePoint connector is expecting to
>>>>>>>>>>>> communicate using SOAP.  Is the response valid SOAP?
>>>>>>>>>>>> (2) What version of SharePoint are you trying to connect to?  Is it
>>>>>>>>>>>> SharePoint 2007?  SharePoint 2010?
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Karl
>>>>>>>>>>>> On Wed, Jan 25, 2012 at 12:26 PM, Silvia, Daniel [USA]
>>>>>>>>>>>> <silvia_dan...@bah.com> wrote:
>>>>>>>>>>>>> Hi Karl
>>>>>>>>>>>>> I have added the specific log4j lines for Http Client wire and I
>>>>>>>>>>>>> restarted the ManifoldCF instance. I was also able to see the web
>>>>>>>>>>>>> service Lists.asmx through IE. When reviewing the log files I was able
>>>>>>>>>>>>> to see some of the content that resides in the SharePoint instance
>>>>>>>>>>>>> in the content coming back from the request. However, I am still
>>>>>>>>>>>>> seeing the error messages in the ManifoldCF GUI as well as in the
>>>>>>>>>>>>> log file indicating "Bad Envelope: HTML", "No service named
>>>>>>>>>>>>> ListsSoap is available" and "No service named
>>>>>>>>>>>>> http://schemas.microsoft.com/sharepoint/soap/GetListCollection is
>>>>>>>>>>>>> available".
>>>>>>>>>>>>> Could there be something going on with the way the services are 
>>>>>>>>>>>>> being built on the client side?
>>>>>>>>>>>>> Appreciate your help.
>>>>>>>>>>>>> Dan
>>>>>>>>>>>>> ________________________________________
>>>>>>>>>>>>> From: Karl Wright [daddy...@gmail.com]
>>>>>>>>>>>>> Sent: Tuesday, January 24, 2012 4:52 PM
>>>>>>>>>>>>> To: Silvia, Daniel [USA]; connectors-user@incubator.apache.org
>>>>>>>>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>>>>>>>> I have not seen this exact problem before.
>>>>>>>>>>>>> The "Bad envelope tag: HTML" indicates that the SOAP request the
>>>>>>>>>>>>> SharePoint connector is attempting to perform is, in fact, returning
>>>>>>>>>>>>> an HTML response.  This usually indicates that the server or path
>>>>>>>>>>>>> parameters you've used to set up the connection are not set correctly,
>>>>>>>>>>>>> and SharePoint is not actually being engaged.
>>>>>>>>>>>>> But usually when that happens I don't recall a ConfigurationException
>>>>>>>>>>>>> logged, unless it's what Axis does in response to the HTML.
>>>>>>>>>>>>> The best thing to do at this point is turn on Http Client wire
>>>>>>>>>>>>> logging, restart ManifoldCF, and view the connection.  The log will
>>>>>>>>>>>>> then contain a record of the exact SOAP requests and the responses,
>>>>>>>>>>>>> and we can see what's wrong.  The technique is described here:
>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/CONNECTORS/Debugging+Connections
>>>>>>>>>>>>> You can also confirm that the right SharePoint web services are
>>>>>>>>>>>>> functioning on the machine in question by trying to access them
>>>>>>>>>>>>> directly.  For the Lists web service, which is the one it sounds like
>>>>>>>>>>>>> it was complaining about, try using IE (not Firefox etc because you
>>>>>>>>>>>>> want NTLM support) to go to the url where you think the web service
>>>>>>>>>>>>> lives.  This will be http: or https:, plus the server, plus the port,
>>>>>>>>>>>>> plus the path, plus "_vti_bin/Lists.asmx".  You should see an
>>>>>>>>>>>>> unequivocal SharePoint response.  For an example from the Microsoft
>>>>>>>>>>>>> demo service, try http://www.wssdemo.com/_vti_bin/Lists.asmx.
>>>>>>>>>>>>> Please let me know how it goes, and cc the dev list (as I have) so a
>>>>>>>>>>>>> record of what you're encountering can be made available to others.
>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>> Karl
>>>>>>>>>>>>> On Tue, Jan 24, 2012 at 1:52 PM, Silvia, Daniel [USA]
>>>>>>>>>>>>> <silvia_dan...@bah.com> wrote:
>>>>>>>>>>>>>> Hi Karl
>>>>>>>>>>>>>> I have downloaded the newest version of ManifoldCF, v0.4, and have
>>>>>>>>>>>>>> run the necessary ant scripts to download dependencies and then
>>>>>>>>>>>>>> built the entire project. I have also had the SharePoint
>>>>>>>>>>>>>> web service MetaCarta.SharePoint.MCPermissionsService.wsp deployed
>>>>>>>>>>>>>> on the SharePoint instance due to running version 3 of
>>>>>>>>>>>>>> SharePoint (SharePoint 2007). When I try to create a Repository
>>>>>>>>>>>>>> Connection and select "Save" I get a message on the ManifoldCF
>>>>>>>>>>>>>> front end of "org.xml.sax.SAXException Bad envelope tag: HTML".
>>>>>>>>>>>>>> When I look at the log file I see an error message
>>>>>>>>>>>>>> "org.apache.axis.ConfigurationException: No service named
>>>>>>>>>>>>>> ListsSoap is available".
>>>>>>>>>>>>>> Can you tell me if you have seen this issue before and what may 
>>>>>>>>>>>>>> be causing this issue?
>>>>>>>>>>>>>> Thanks for your help.
>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>> ________________________________________
>>>>>>>>>>>>>> From: Karl Wright [daddy...@gmail.com]
>>>>>>>>>>>>>> Sent: Friday, January 20, 2012 7:31 AM
>>>>>>>>>>>>>> To: Silvia, Daniel [USA]
>>>>>>>>>>>>>> Subject: Re: ManifoldCF's dist/shapoint-integration dir
>>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>> In order for the SharePoint connector to build, you need to have the
>>>>>>>>>>>>>> wsdls in place in the right area.  We cannot ship those because of
>>>>>>>>>>>>>> potential copyright issues.  The easiest way to obtain the right
>>>>>>>>>>>>>> dependencies is:
>>>>>>>>>>>>>> ant download-dependencies
>>>>>>>>>>>>>> Then, just build normally:
>>>>>>>>>>>>>> ant build
>>>>>>>>>>>>>> This will only work for ManifoldCF-0.4-incubating, or trunk.
>>>>>>>>>>>>>> 0.4-incubating is still in the process of being signed off by the
>>>>>>>>>>>>>> incubator, but you can find the release candidate here:
>>>>>>>>>>>>>> http://people.apache.org/~kwright
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>> On Fri, Jan 20, 2012 at 7:02 AM, Silvia, Daniel [USA]
>>>>>>>>>>>>>> <silvia_dan...@bah.com> wrote:
>>>>>>>>>>>>>>> Hi Karl
>>>>>>>>>>>>>>> I work with Matt Parker and we are in the process of developing a
>>>>>>>>>>>>>>> pipeline that uses ManifoldCF at the beginning. I just subscribed to
>>>>>>>>>>>>>>> the connectors-user-subscr...@incubator.apache.org group yesterday
>>>>>>>>>>>>>>> and submitted an e-mail question to the group. Can you help us with
>>>>>>>>>>>>>>> the below issue?
>>>>>>>>>>>>>>> I downloaded MCF and started playing with the default setup under
>>>>>>>>>>>>>>> Jetty and Derby. It starts up without any issue. I am trying to
>>>>>>>>>>>>>>> configure a SharePoint connector, connecting to SharePoint Service 3.
>>>>>>>>>>>>>>> I have been following the instructions and I am at the point of
>>>>>>>>>>>>>>> deploying the custom SharePoint web service to the SharePoint
>>>>>>>>>>>>>>> instance. The instructions indicate that I should get the web service
>>>>>>>>>>>>>>> from dist/sharepoint-integration after building MCF. However, after
>>>>>>>>>>>>>>> looking through the entire directory structure, I am unable to find
>>>>>>>>>>>>>>> the service to deploy.
>>>>>>>>>>>>>>> Can someone tell me where to find this service?
>>>>>>>>>>>>>>> Thanks for your help.
>>>>>>>>>>>>>>> Daniel Silvia
