[jira] [Commented] (CONNECTORS-1620) Accept Sitemaps with content type application/xml

2022-01-21 Thread DK (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17480352#comment-17480352
 ] 

DK commented on CONNECTORS-1620:


I tested this, and the web crawler connector does not seem to recognize 
sitemap.xml for the MIME types text/xml and application/xml. Version 2.17. Are 
there any specifics that need to be considered when configuring the repository 
or the job with a Solr output connector?

Thanks

> Accept Sitemaps with content type application/xml
> -
>
> Key: CONNECTORS-1620
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1620
> Project: ManifoldCF
> Issue Type: Improvement
> Components: Web connector
> Reporter: Markus Schuch
> Assignee: Markus Schuch
> Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> Given an Output Connection that does not accept the MIME type 
> {{application/xml}} for ingestion, it is currently not possible to crawl a 
> sitemap.xml when the web server returns {{application/xml}} as the content 
> type for the sitemap.
> The sitemap is discarded before the links are extracted, because the MIME 
> type {{application/xml}} is not listed in the {{interestingMimeTypeArray}}.
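
For readers following along, here is a minimal sketch of the kind of check the
description refers to: deciding whether a response is worth parsing for links
based on its Content-Type header. This is not the web connector's actual code.
The class name, the method names, and the contents of the MIME type set are
assumptions made up for illustration; only the idea of a list of "interesting"
MIME types comes from the issue.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical illustration; not the actual ManifoldCF web connector code. */
final class SitemapMimeTypeSketch {
  // Assumed contents, for illustration only; the real interestingMimeTypeArray may differ.
  static final Set<String> LINK_BEARING_MIME_TYPES = new HashSet<>(Arrays.asList(
      "text/html", "application/xhtml+xml", "text/xml", "application/xml"));

  /** Strips parameters such as "; charset=UTF-8" and lower-cases the media type. */
  static String normalize(String contentTypeHeader) {
    if (contentTypeHeader == null) {
      return "";
    }
    int semi = contentTypeHeader.indexOf(';');
    String bare = semi >= 0 ? contentTypeHeader.substring(0, semi) : contentTypeHeader;
    return bare.trim().toLowerCase(Locale.ROOT);
  }

  /** True if the document should be scanned for links, whatever the output connection accepts. */
  static boolean shouldExtractLinks(String contentTypeHeader) {
    return LINK_BEARING_MIME_TYPES.contains(normalize(contentTypeHeader));
  }

  public static void main(String[] args) {
    System.out.println(shouldExtractLinks("application/xml; charset=UTF-8")); // true
    System.out.println(shouldExtractLinks("image/png"));                      // false
  }
}
```

The point of the sketch is that link extraction is decided separately from
ingestion: a sitemap can be parsed for URLs even when the output connection
would refuse to index the sitemap document itself.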



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (CONNECTORS-1604) Documentation to run Manifold over SSL

2022-01-21 Thread DK (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830499#comment-16830499
 ] 

DK edited comment on CONNECTORS-1604 at 1/22/22, 6:32 AM:
--

Hi Karl,
Not only the UI, but mcf-authority-service as well.
I tried to follow the documentation at
[https://www.eclipse.org/jetty/documentation/9.2.22.v20170531/configuring-ssl.html]
since ManifoldCF runs on Jetty 9.2.x.
However, since start.jar under the ManifoldCF example directory (and in other
directories) runs ManifoldJettyRunner, it does not respect start.ini etc.,
which is where loading jetty-ssl.xml and jetty-https.xml is specified.
Basically, I could not load the SSL-related Jetty config files with the
ManifoldCF start.jar.
From the Jetty doc:

The generated SSL certificates held in the key store are configured on
Jetty by injecting an instance of the SslContextFactory
<[http://download.eclipse.org/jetty/stable-9/apidocs/org/eclipse/jetty/util/ssl/SslContextFactory.html]>
object and passing it to the connector's SslConnectionFactory, which is done in
the Jetty distribution in both jetty-https.xml
<[http://git.eclipse.org/c/jetty/org.eclipse.jetty.project.git/plain/jetty-server/src/main/config/etc/jetty-https.xml]>
and jetty-spdy.xml
<[http://git.eclipse.org/c/jetty/org.eclipse.jetty.project.git/plain/jetty-spdy/spdy-example-webapp/src/main/config/etc/jetty-spdy.xml]>.
Since SPDY is able to handle HTTPS also, typically you will configure Jetty
to use one of these configuration files or the other, which can be done
either on the command line or by editing the start.ini file.
Thanks
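
As a rough illustration of what "SSL certificates held in the key store" means
at the JDK level, the sketch below builds a javax.net.ssl.SSLContext from a
java.security.KeyStore using only standard JDK classes. It deliberately does
not use Jetty or ManifoldJettyRunner, so it is not a recipe for enabling SSL in
ManifoldCF. The class name, the method name, and the "changeit" password are
assumptions for the example, and the key store is an empty in-memory one so
that the code runs without any files.

```java
import java.security.KeyStore;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;

/** Hypothetical JDK-only illustration; not Jetty or ManifoldCF code. */
final class KeyStoreSslContextSketch {
  /** Builds a TLS context whose key material comes from the given key store. */
  static SSLContext fromKeyStore(KeyStore keyStore, char[] keyPassword) throws Exception {
    KeyManagerFactory kmf =
        KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
    kmf.init(keyStore, keyPassword);
    SSLContext context = SSLContext.getInstance("TLS");
    // Passing null selects the JDK's default trust managers and secure random source.
    context.init(kmf.getKeyManagers(), null, null);
    return context;
  }

  public static void main(String[] args) throws Exception {
    // Empty in-memory key store so the sketch runs as-is; a real setup would
    // load an existing keystore file containing the server certificate.
    KeyStore keyStore = KeyStore.getInstance(KeyStore.getDefaultType());
    keyStore.load(null, null);
    SSLContext context = fromKeyStore(keyStore, "changeit".toCharArray());
    System.out.println(context.getProtocol());
  }
}
```

A server-side connector would then be handed a context like this one. How
ManifoldJettyRunner could be made to do so is exactly the open question in this
issue.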

 






-- 
*Durai Kalaiselvan*
Founder, Cumilisys LLC
Office: 408 940-5135 Mobile: 408 835 0309





> Documentation to run Manifold over SSL
> --
>
> Key: CONNECTORS-1604
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1604
> Project: ManifoldCF
> Issue Type: Improvement
> Components: Documentation
> Reporter: DK
> Priority: Major
>
> The documentation does not talk about running ManifoldCF over SSL anywhere. I 
> am sure a lot of users who deployed ManifoldCF in production have configured 
> SSL. I tried to configure SSL without success. Can someone please point me to 
> instructions for running ManifoldCF over SSL?
> Thanks





[jira] [Assigned] (CONNECTORS-1665) WebConnector: Add activity records for excluded URLs

2022-01-21 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1665:
---

Assignee: Julien Massiera

> WebConnector: Add activity records for excluded URLs 
> -
>
> Key: CONNECTORS-1665
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1665
> Project: ManifoldCF
> Issue Type: Improvement
> Components: Web connector
> Affects Versions: ManifoldCF 2.18
> Reporter: Julien Massiera
> Assignee: Julien Massiera
> Priority: Trivial
> Fix For: ManifoldCF 2.19
>
> Attachments: patch-CONNECTORS-1665
>
>
> It would be interesting to add activity records in the WebConnector to keep 
> track of excluded URLs that match an exclude filter
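
To make the request concrete, here is a small sketch of the idea: when a URL
matches an exclude filter, record an activity entry explaining why it was
skipped instead of dropping it silently. This is not the attached patch and not
the WebConnector's real API. The ActivityRecorder interface, the "exclude"
activity type, and the "EXCLUDED" result code are hypothetical names invented
for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical illustration; not the patch attached to this issue. */
final class ExcludedUrlActivitySketch {
  /** Made-up stand-in for the connector's activity/history interface. */
  interface ActivityRecorder {
    void record(String activityType, String url, String resultCode, String description);
  }

  /** Returns true if the URL may be crawled; otherwise records why it was skipped. */
  static boolean acceptUrl(String url, List<Pattern> excludeFilters, ActivityRecorder recorder) {
    for (Pattern filter : excludeFilters) {
      if (filter.matcher(url).find()) {
        recorder.record("exclude", url, "EXCLUDED",
            "Matched exclude filter: " + filter.pattern());
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<Pattern> filters = new ArrayList<>();
    filters.add(Pattern.compile("\\.pdf$"));
    List<String> history = new ArrayList<>();
    ActivityRecorder recorder =
        (type, url, code, desc) -> history.add(type + " " + url + " " + code + " " + desc);

    acceptUrl("https://example.com/report.pdf", filters, recorder); // excluded, recorded
    acceptUrl("https://example.com/index.html", filters, recorder); // accepted, not recorded
    history.forEach(System.out::println);
  }
}
```

With a record per excluded URL, someone reading the history can tell which
filter removed a page from the crawl.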





[jira] [Commented] (CONNECTORS-1665) WebConnector: Add activity records for excluded URLs

2022-01-21 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17480238#comment-17480238
 ] 

Karl Wright commented on CONNECTORS-1665:
-

Please go ahead and commit this.






[jira] [Commented] (CONNECTORS-1665) WebConnector: Add activity records for excluded URLs

2022-01-21 Thread Julien Massiera (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17480170#comment-17480170
 ] 

Julien Massiera commented on CONNECTORS-1665:
-

The patch has never been reviewed. [~kwri...@metacarta.com], can you please take 
a look and tell me whether it can be integrated into trunk?

> WebConnector: Add activity records for excluded URLs 
> -
>
> Key: CONNECTORS-1665
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1665
> Project: ManifoldCF
> Issue Type: Improvement
> Components: Web connector
> Affects Versions: ManifoldCF 2.18
> Reporter: Julien Massiera
> Priority: Trivial
> Fix For: ManifoldCF 2.19
>
> Attachments: patch-CONNECTORS-1665
>
>
> It would be interesting to add activity records in the WebConnector to keep 
> track of excluded URLs that match an exclude filter


