[jira] [Commented] (CONNECTORS-1584) regex documentation
[ https://issues.apache.org/jira/browse/CONNECTORS-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774309#comment-16774309 ]

Karl Wright commented on CONNECTORS-1584:
-----------------------------------------

Have you subscribed to the list? Instructions are in the documentation under "contact us". You send mail to: user-subscr...@manifoldcf.apache.org

> regex documentation
> -------------------
>
> Key: CONNECTORS-1584
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1584
> Project: ManifoldCF
> Issue Type: Improvement
> Components: Web connector
> Affects Versions: ManifoldCF 2.12
> Reporter: Tim Steenbeke
> Priority: Minor
>
> What type of regexes do the ManifoldCF include and exclude settings support, and what is the regex support in general?
> At the moment I'm using a web repository connection and an Elastic output connection.
> I'm trying to exclude URLs that link to documents,
> e.g. website.com/document/path/this.pdf and website.com/document/path/other.PDF
> The issue I'm having is that the regexes I have found so far do not match case-insensitively, so for every possible case I have to add a new line.
> e.g.:
> {code:java}
> .*.pdf$ and .*.PDF$ and .*.Pdf and ... .{code}
> Is it possible to add documentation on what type of regex can be used, or maybe a tool to test a regex and see whether it is supported by ManifoldCF?
> I tried mailing this question to [u...@manifoldcf.apache.org|mailto:u...@manifoldcf.apache.org] but this mail address returns a failure notice.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1584) regex documentation
[ https://issues.apache.org/jira/browse/CONNECTORS-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774147#comment-16774147 ]

Tim Steenbeke commented on CONNECTORS-1584:
-------------------------------------------

Three colleagues and I tried mailing the address and we all get the same response. So it is the right address then; I thought we had made a mistake.
[jira] [Commented] (CONNECTORS-1584) regex documentation
[ https://issues.apache.org/jira/browse/CONNECTORS-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774146#comment-16774146 ]

Tim Steenbeke commented on CONNECTORS-1584:
-------------------------------------------

{panel:title=Failure notice sent by mailer-dae...@apache.org}
Hi. This is the qmail-send program at apache.org.
I'm afraid I wasn't able to deliver your message to the following addresses.
This is a permanent error; I've given up. Sorry it didn't work out.

: Must be sent from an @apache.org address or a subscriber address or an address in LDAP.

--- Below this line is a copy of the message.

From: Steenbeke Tim
To: "u...@manifoldcf.apache.org"
Subject: Regex support
Date: Mon, 18 Feb 2019 10:35:40 +
{panel}
[jira] [Commented] (CONNECTORS-1584) regex documentation
[ https://issues.apache.org/jira/browse/CONNECTORS-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774105#comment-16774105 ]

Karl Wright commented on CONNECTORS-1584:
-----------------------------------------

Actually, it *is* user@ but so many people get mixed up with that that I got it backwards myself.

What failure notice did you get when you mailed to user@? I receive email from this list a dozen times a day or more so I am not sure why you'd be having trouble.
[jira] [Commented] (CONNECTORS-1584) regex documentation
[ https://issues.apache.org/jira/browse/CONNECTORS-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774033#comment-16774033 ]

Tim Steenbeke commented on CONNECTORS-1584:
-------------------------------------------

If the address is users@, I think the site should be updated, because the address mentioned in the FAQ is user@.
[https://manifoldcf.apache.org/release/release-2.12/en_US/faq.html]

Also, thanks for responding.
[jira] [Commented] (CONNECTORS-1584) regex documentation
[ https://issues.apache.org/jira/browse/CONNECTORS-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16771462#comment-16771462 ]

Karl Wright commented on CONNECTORS-1584:
-----------------------------------------

The mailing list is us...@manifoldcf.apache.org.

The regular expressions are standard Java regular expressions. The documentation is widely available. You can also experiment with regular expressions in a java applet online at: https://www.cis.upenn.edu/~matuszek/General/RegexTester/regex-tester.html
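
Since the expressions are standard Java regular expressions, the case variants listed in the issue can probably be collapsed into a single pattern with the inline flag `(?i)`, which is part of the `java.util.regex.Pattern` syntax. The sketch below is illustrative only: the class `CaseInsensitiveExclude` and the helper `isExcluded` are hypothetical names, and it assumes the pattern is applied with `find()` semantics (a match anywhere in the URL), which this thread does not confirm. It also escapes the dot, because an unescaped `.` matches any character.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of ManifoldCF.
final class CaseInsensitiveExclude {

    // (?i) switches on case-insensitive matching for the whole pattern.
    // \. matches a literal dot; $ anchors the match at the end of the input.
    static final Pattern PDF_URL = Pattern.compile("(?i)\\.pdf$");

    // Hypothetical helper: true when the URL ends in .pdf in any letter case.
    // Assumption: find() semantics, i.e. the pattern may match any part of the URL.
    static boolean isExcluded(String url) {
        return PDF_URL.matcher(url).find();
    }

    public static void main(String[] args) {
        String[] urls = {
            "website.com/document/path/this.pdf",
            "website.com/document/path/other.PDF",
            "website.com/document/path/page.html"
        };
        for (String url : urls) {
            System.out.println(url + " -> " + isExcluded(url));
        }
    }
}
```

If the whole URL has to match instead, the form `(?i).*\.pdf$` should behave the same way under either assumption.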