[jira] [Commented] (CONNECTORS-1449) Add support for respecting the NoCrawl flag in Sharepoint

2019-04-12 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816578#comment-16816578
 ] 

Karl Wright commented on CONNECTORS-1449:
-

Hi, the call currently used to fetch a document's metadata over SOAP is the 
following:

{code}
metadataValues = proxy.getFieldValues( sortedMetadataFields,
  encodePath(sitePath), listID,
  "/Lists/" + decodedItemPath.substring(cutoff+1),
  dspStsWorks );
{code}

This calls:

{code}
  {
    // SharePoint 2010: Get field values some other way
    // Sharepoint 2010; use Lists service instead
    ListsWS lservice = new ListsWS(baseUrl + site, userName, password,
      configuration, httpClient );
    ListsSoapStub stub1 = (ListsSoapStub)lservice.getListsSoapHandler();

    String sitePlusDocId = serverLocation + site + docId;
    if (sitePlusDocId.startsWith("/"))
      sitePlusDocId = sitePlusDocId.substring(1);

    GetListItemsQuery q = buildMatchQuery("FileRef","Text",sitePlusDocId);
    GetListItemsViewFields viewFields = buildViewFields(fieldNames);

    GetListItemsResponseGetListItemsResult items =
      stub1.getListItems(docLibrary, "", q, viewFields, "1",
        buildNonPagingQueryOptions(), null);
    if (items == null)
      return result;

    MessageElement[] list = items.get_any();

    final String xmlResponse = list[0].toString();
    if (Logging.connectors.isDebugEnabled()){
      Logging.connectors.debug("SharePoint: getListItems FileRef value '"+sitePlusDocId+"', xml response: '" + xmlResponse + "'");
    }
    // (excerpt ends here; the enclosing block continues)
{code}

So it is calling the Lists service to do this right now (SharePoint 2010 and 
higher).  For SharePoint 2003, it used the dspsts service, but that's been 
broken for a while, and I see no need to support this feature for that version 
of SharePoint.

If you introduce a new service or method, I will also need a configuration 
switch that enables the code that calls it, or backwards compatibility will not 
be maintained.
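
A minimal sketch of what such a switch could look like, assuming the connection parameters are available as a simple string map. The parameter name "respectNoCrawl" and the class below are hypothetical and are not part of the existing connector; the point is only that an absent parameter leaves today's behaviour unchanged.

```java
import java.util.Map;

// Hypothetical helper: none of these names exist in ManifoldCF today.
final class NoCrawlSwitch {
  // Assumed parameter name for the configuration switch.
  static final String PARAM_RESPECT_NOCRAWL = "respectNoCrawl";

  // True only when the switch is explicitly "true" (case-insensitive).
  // A missing or malformed value yields false, so existing connections
  // keep calling the current code path.
  static boolean isEnabled(Map<String, String> connectionParams) {
    return Boolean.parseBoolean(connectionParams.get(PARAM_RESPECT_NOCRAWL));
  }

  // Picks which metadata path to use; the labels are placeholders only.
  static String chooseMetadataPath(Map<String, String> connectionParams) {
    return isEnabled(connectionParams) ? "nocrawl-aware" : "legacy";
  }
}
```

With this shape, a connection saved before the feature existed has no "respectNoCrawl" entry and therefore stays on the legacy path.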




> Add support for respecting the NoCrawl flag in Sharepoint
> -
>
> Key: CONNECTORS-1449
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1449
> Project: ManifoldCF
>  Issue Type: New Feature
>  Components: SharePoint connector
>Reporter: Markus Schuch
>Assignee: Markus Schuch
>Priority: Major
> Fix For: ManifoldCF next
>
>
> There is a flag {{NoCrawl}} in sharepoint that indicates whether an object 
> should be crawled or not:
> Lists
> https://msdn.microsoft.com/en-us/library/office/microsoft.sharepoint.splist.nocrawl.aspx
> Web
> https://msdn.microsoft.com/en-us/library/office/microsoft.sharepoint.spweb.nocrawl.aspx
> Field
> https://msdn.microsoft.com/en-us/library/office/microsoft.sharepoint.spfield.nocrawl.aspx
> Wouldn't it be nice to respect that flag?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CONNECTORS-1449) Add support for respecting the NoCrawl flag in Sharepoint

2019-04-12 Thread Drai (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816568#comment-16816568
 ] 

Drai commented on CONNECTORS-1449:
--

Karl,
Regarding your recommendation:
"I would propose (if either the dspsts, webs, or versions services do not
handle this themselves) that we either add a new MCPermissions service that
wraps whatever is currently used to obtain document metadata with one that
also adds the "NoCrawl" flag to the result,"

I can have the MCPermissions C# code changed to add a new method for this
purpose, then deploy and test it.
Could you give a specification for this method?
Name, input, output, etc.
I am not clear on whether you need the "NoCrawl" flag along with the list items
returned by the getListItems method, or a new method that returns document
metadata by ID plus the NoCrawl flag.
Please advise.
Once this is done, what would it take for you to modify the SharePoint 2013
repository connector to recognize and respect this flag?

Regards





-- 
*Durai Kalaiselvan*
Founder, Cumilisys LLC
Office: 408 940-5135 Mobile: 408 835 0309



This email and (any accompanying attachments) may contain confidential
information belonging to the sender which is legally protected. The
information is intended only for the use of the individual or entity to
whom it is addressed, and others who have been specifically authorized by
the addressee to receive it. If you are not an intended recipient, you are
hereby notified that any disclosure, copying, or distribution of, or the
taking of any action in reliance on, this communication or the information
contained herein is strictly prohibited. If you have received this
communication in error, please notify us immediately by email or telephone.
Thank you for your cooperation.




[jira] [Commented] (CONNECTORS-1449) Add support for respecting the NoCrawl flag in Sharepoint

2019-04-12 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816164#comment-16816164
 ] 

Karl Wright commented on CONNECTORS-1449:
-

The MCPermissions plugin at present furnishes two services: one to get 
permissions for users, and the other to list documents without restrictions 
imposed by SharePoint.  I would propose (if either the dspsts, webs, or 
versions services do not handle this themselves) that we either add a new 
MCPermissions service that wraps whatever is currently used to obtain document 
metadata with one that also adds the "NoCrawl" flag to the result, OR we put it 
in the existing Lists service wrapper we currently have.

Note that the problem isn't going to be adequately addressed unless we can get 
this information on a per-document basis, somehow.  We need to be able to tell 
the framework to delete the document when the connector looks at it.  Doing 
this in a transformation connector won't work for that very same reason: the 
document won't be sent to the transformer unless it's noticed to have been 
changed in some way.  So the repository connector really has to handle this.
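
The per-document decision described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the Index class is a stand-in for the framework's bookkeeping, not a ManifoldCF type, and the NoCrawl value per document is assumed to have been obtained already by whatever service ends up providing it.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; not connector code.
final class NoCrawlPass {
  // Stand-in for "what is currently indexed".
  static final class Index {
    final Set<String> docs = new TreeSet<>();
    void ingest(String id) { docs.add(id); }
    void delete(String id) { docs.remove(id); }
  }

  // One crawl pass: every document the connector looks at is either
  // (re)ingested or removed, depending on its NoCrawl flag.
  static void processAll(Map<String, Boolean> noCrawlById, Index index) {
    for (Map.Entry<String, Boolean> e : noCrawlById.entrySet()) {
      if (Boolean.TRUE.equals(e.getValue())) {
        index.delete(e.getKey());   // flag set: tell the framework to drop it
      } else {
        index.ingest(e.getKey());   // flag clear or unknown: index as usual
      }
    }
  }
}
```

Because the decision is made each time the repository connector visits the document, flipping the flag in either direction takes effect on the next pass, even when the document content itself has not changed.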




[jira] [Commented] (CONNECTORS-1449) Add support for respecting the NoCrawl flag in Sharepoint

2019-04-12 Thread Drai (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816106#comment-16816106
 ] 

Drai commented on CONNECTORS-1449:
--

Crawling needs to be avoided because content authors/managers mark a
library/list as 'non-crawlable' by using the 'NoCrawl' setting. In the current
use case, fetching is fine because the crawl user has access to the content
anyway.
They do not want the items stored in these libraries to show up in search
results.
When the flag is set back to false, those items need to be indexed again.
Would inserting a transformer between the SharePoint repository connection and
the Solr output connection achieve this?
That way, depending on whether the flag switches from true to false or vice
versa, content would be pushed to Solr or ignored.

Regarding the modified Lists service:
I will take a look at it.
If we modify the Lists service, should it be deployed to SP 2013 in the same
way as MCPermissions.asmx?

Thanks








[jira] [Commented] (CONNECTORS-1449) Add support for respecting the NoCrawl flag in Sharepoint

2019-04-12 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816017#comment-16816017
 ] 

Karl Wright commented on CONNECTORS-1449:
-

It depends on why you want to avoid crawling something.  If it's to prevent 
fetching it then you can't do it at the transformer level.

But the right solution is to look for it in the SOAP response.
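
As a rough sketch of what looking for it in the response could involve, assuming the response is the usual rowset XML with z:row elements: the attribute name "ows_NoCrawl" below is purely an assumption, and it is not confirmed that the Lists service returns any such attribute. That would need to be checked against a real response (for example the debug-logged XML).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of the connector.
final class NoCrawlFromXml {
  // Assumed attribute name; verify against an actual getListItems response.
  static final String ATTR = "ows_NoCrawl";

  // Returns true when the first z:row carries the assumed attribute with
  // a value of "1" or "true"; returns false when it is absent.
  static boolean firstRowNoCrawl(String xml) {
    try {
      NodeList rows = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)))
          .getElementsByTagName("z:row");
      if (rows.getLength() == 0) {
        return false;
      }
      // getAttribute returns "" when the attribute is missing.
      String value = ((Element) rows.item(0)).getAttribute(ATTR);
      return "1".equals(value) || "true".equalsIgnoreCase(value);
    } catch (Exception e) {
      throw new IllegalArgumentException("Unparseable response XML", e);
    }
  }
}
```

If no such attribute turns up in the response, this approach does not apply and the flag would have to come from somewhere else.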

There is another solution, which is to modify the ManifoldCF SharePoint plugin 
for SharePoint 2013 to return it from the modified Lists service.  That would 
involve C# code changes, but would definitely allow us access to the flag in 
the connector.  The code is checked in under 
https://svn.apache.org/repos/asf/manifoldcf/integration/sharepoint-2013/trunk .
Have a look.


